Thank you so much. I thought I would start with a very short public service announcement. At UCLA we have this math institute called IPAM, the Institute for Pure and Applied Mathematics, and next spring, from mid-March until mid-June, we will have a program which is for some reason named Quantitative Linear Algebra. As part of that program I'm hoping to learn what the graphics on the side actually mean. If you look carefully at the description, you will see that it includes a lot of things that are of interest to people here, including random matrices. The idea is to look at situations where you are interested in large-n limits of various things: you might be interested in what happens in the limit, which is what free probability does, but you might also be interested in a quantitative understanding of the convergence to that limit. I have no idea how many people we will be able to accommodate in the end, and I don't quite know how the whole thing works, but there is a button there to apply, so if you're interested in coming, you should press the button. Okay. So this is IPAM, the Institute for Pure and Applied Mathematics.

All right. In my third lecture I will finally get to random matrices in full force. I want to start right away with two theorems that concern asymptotics of random matrices and freeness; both theorems will basically tell you that somehow things become freely independent. There is a lot of language on the slide — it looks even more intimidating than I thought — but the idea is this. First of all you have a bunch of matrices U_1^(n), ..., U_q^(n); here q is just an integer, not a quantum integer or anything like that. These are what are called Haar unitaries: you sample them from the n-by-n unitary matrices with respect to the Haar measure of the underlying Lie group. Then they have these friends B_1^(n), ..., B_d^(n). You can take these to be random, but you can even take them to be non-random: deterministic n-by-n matrices. They're not varying — I mean, they vary with n, but they are otherwise fixed. About these you make one assumption, namely that their joint law converges to the law μ of some variables b_1, ..., b_d, and also that their norms do not get too large. Their operator norms — for a matrix, the operator norm is simply the largest singular value, or the square root of the largest singular value, depending on how you define a singular value — stay bounded: sup_n ||B_j^(n)|| < ∞. The claim is that if you put them together — if you look at the joint law, at finite n, of your unitaries, their adjoints, and the B's — all of this converges to an asymptotic limit; I'm using lowercase letters for the variables u_1, ..., u_q, b_1, ..., b_d that represent the limit. The b's among themselves you know: they're just the limit of the B's, that's what we've assumed. The key point is that the u's are free from the b's, and the u's look like generators of a free group: they are free Haar unitaries, so each unitary has all moments zero, τ(u_j^k) = 0, except for the zeroth one.
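Since Haar unitaries are the main actors here, it may help to see how one can sample them numerically. A minimal sketch in Python (the function name haar_unitary and the NumPy setup are my choices, not from the lecture): QR-decompose a complex Ginibre matrix and fix the phases of R's diagonal, which is the standard recipe for making Q exactly Haar-distributed.

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample an n-by-n Haar-distributed unitary matrix.

    QR of a complex Ginibre matrix; normalizing the phases of the diagonal
    of R makes Q exactly Haar-distributed (the standard recipe).
    """
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases  # multiplies column j of q by phases[j]

# sanity check: tr(u^k)/n should be near 0 for k != 0, as for a Haar unitary
rng = np.random.default_rng(0)
u = haar_unitary(300, rng)
print([abs(np.trace(np.linalg.matrix_power(u, k)) / 300) for k in (1, 2, 3)])
```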
So these are like generators of the free group: q generators and their inverses. And then there are these b's, from which they are free. That's the first statement.

Let me give you a little lemma — you are probably very familiar with this. Take a group Γ, and inside Γ take a subgroup Λ, and then take an element y which is free from Λ. This is just a group situation, so free simply means that there are no relations between y and the elements of Λ. In that case it's very easy to check that Λ and yΛy⁻¹ are free. Just a moment's thought: how could there be a relation between something here and something there? If there were such a relation, you would quickly see that there would have to be a relation between something that involves y's and something that doesn't involve y's, and that can't be. In fact, if y has infinite order in the group, so that all of its powers are different, then you can continue: Λ, yΛy⁻¹, y²Λy⁻², and so on are all free from each other. If you conjugate a subgroup by a free element many times, you just keep producing free copies. That's very easy to check.

Well, the same phenomenology happens in free probability. Suppose (M, τ) is your non-commutative probability space and u ∈ M is a unitary, u*u = uu* = 1. I will assume the analogue of having infinite order: all the moments τ(u^k) are 0 for all k — except, of course, for the zeroth power. And say I have a subalgebra A inside M such that {u, u*} is freely independent from A. From that you can quickly deduce that A, uAu*, u²Au*², and so on are all freely independent. The proof is not very hard: you just write down the freeness condition and it's almost immediate.

If you use this in conjunction with that theorem, here's one thing you can deduce — I actually wrote it as a corollary at the end, but let me do an example. Take a single B^(n), so d = 1 and q = 1, and as my B^(n) take the diagonal n-by-n matrix with half its entries equal to 1 and half equal to −1 (if n is odd, there's one extra entry; just pick it either way). Then if U^(n) is one of my Haar unitaries, U^(n) B^(n) U^(n)* is going to be asymptotically free from B^(n): by the theorem, asymptotically b is free from u and u*, and therefore I can apply the little lemma. So in particular, suppose I add B^(n) itself to this rotation of B^(n), that is, I look at B^(n) + U^(n) B^(n) U^(n)*. What will the law of this converge to? Let's first verify that the hypotheses of the theorem are satisfied. What's the operator norm of B^(n)? One, right — it's in fact a unitary. Does it converge in law? What is its distribution of eigenvalues? Half the eigenvalues are 1 and the other half are −1, so its law is a delta mass at −1 with weight one half plus a delta mass at 1 with weight one half, which converges to (1/2)δ₋₁ + (1/2)δ₁.
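Before computing the limit, here is a minimal numerical sketch of this example, reusing the hypothetical haar_unitary sampler from above; a histogram of the eigenvalues approximates the limiting law computed next.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
b = np.diag(np.concatenate([np.ones(n // 2), -np.ones(n // 2)]))  # half +1, half -1
u = haar_unitary(n, rng)            # sampler sketched above
m = b + u @ b @ u.conj().T          # B + U B U*, self-adjoint
eigs = np.linalg.eigvalsh(m)        # real eigenvalues, all in [-2, 2]
# a histogram of `eigs` approximates the limiting spectral law discussed next
```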
So what does the law of B^(n) + U^(n) B^(n) U^(n)* converge to? Well, U^(n) B^(n) U^(n)* is something that asymptotically — in fact at every n — has the same law as B^(n), so its limiting law is again (1/2)δ₋₁ + (1/2)δ₁, and the two terms are going to be freely independent. You've heard the words "free convolution": what you're going to get is simply the result of freely convolving this measure with that measure. One of the exercises in the notes is to do this computation; in fact, you get the arcsine law. And just to check that we haven't cheated, here's a plot. The yellow below is a histogram of this thing — I actually took a matrix B of size several hundred, I forget the exact size, something like 300 by 300, took a random unitary, and did this — and the green is the arcsine density. Unfortunately, somebody should explain to me how to get Mathematica to actually plot functions fully: since the arcsine law has singularities at the two endpoints, Mathematica doesn't plot them, so we'll just have to live with that. In any case, you see that there's a fairly good fit. Any questions about this? Okay, I won't prove this right away; I will keep on stating things and then we'll try to prove as little as possible.

Oh, one amusing corollary here. In this case B had disconnected support: two point masses. But suppose we chose a different B whose limiting measure has connected support. Then anything you can concoct using B and this unitary — something like that, or whatever else you want — will end up having connected support. This is just the statement I told you before: when you start with free things whose supports are connected, whatever you build from them polynomially, or even with continuous functions — anything in the C*-algebra they generate — will have connected support. Kind of fun.

All right, then there is the following statement, which let me again go through. The first part I've more or less said already; in fact, if you take Ioana Dumitriu's combinatorial proof from the very first lecture of the very first day and just do several matrices rather than one, keeping track of colors in all the diagrams, you get convergence for the GUE ensemble. Just to remind you, these are self-adjoint matrices all of whose entries are Gaussian and maximally independent. The joint law of such a family converges to the joint law of a d-tuple of free semicircular elements. Now this theorem actually has an upgrade: I can throw in deterministic matrices. Again, say I throw in fixed B_j^(n); these are deterministic, and I wrote them as diagonal, but actually that's not necessary. Suppose the B's jointly converge to some limit; then the joint law of everything — your GUE matrices together with these B's — converges as well. And there's one computation you can get from this, of the very typical kind that you do in population statistics and so forth.
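As a quick numerical illustration of this asymptotic freeness, here is a minimal sketch; the gue sampler and its normalization (E|entry|² = 1/n, semicircle on [−2, 2]) are my assumptions. For two independent GUE matrices, the mixed moment tr(A₁A₂A₁A₂)/n should be near τ(s₁s₂s₁s₂) = 0, the value for free centered semicircular variables, whereas classically independent commuting variables of the same variance would give 1.

```python
import numpy as np

def gue(n, rng):
    """Sample an n-by-n GUE matrix normalized so the spectrum approaches
    the semicircle law on [-2, 2]."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (g + g.conj().T) / np.sqrt(4 * n)

rng = np.random.default_rng(2)
n = 500
a1, a2 = gue(n, rng), gue(n, rng)
# freeness prediction: tau(s1 s2 s1 s2) = 0; classical independence of
# commuting variables with the same variance would give 1 instead
print(np.trace(a1 @ a2 @ a1 @ a2).real / n)   # small, O(1/n)
```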
If you take these B^(n)'s — again, take a single matrix B^(n) — and multiply it on both sides by a random GUE matrix, you can compute what A^(n) B^(n) A^(n) converges to. It converges to — oops, that's a typo, this should of course be ⊠, the multiplicative free convolution, sorry about that — the multiplicative free convolution μ ⊠ π, where μ is the limit law of your B's, whatever that ends up being, and π is a free Poisson law, also known as a Marchenko–Pastur law. And of course there are versions of this when the matrices are not square; you can get at that by putting in additional diagonal matrices that cut down the size of the matrix on the outside, for instance.

Okay, one last thing I wanted to say about convergence. The way I've written it here, this is weak convergence: convergence against any test function, say any polynomial test function. But you can upgrade it in various ways. First of all, as Ioana already explained in her talk, you can upgrade this to almost sure convergence. Moreover, as Roland mentioned, you can even make certain that not only does the normalized trace of a polynomial in these matrices converge to the corresponding trace in the limit, but the operator norm does too. This phenomenon is called strong convergence: if you take a polynomial P in these matrices — let me write it in vector notation as P(A^(n), B^(n)) — then its operator norm converges to the operator norm of the corresponding polynomial P(a, b) in your a's and b's. That tells you, for instance, that the largest and smallest singular values and eigenvalues behave the right way, among various other things. Any questions about this?

All right. So this is the theorem I want to concentrate on in terms of ideas of the proof. First of all, let me explain how the first theorem — the one about unitary matrices — can be deduced from this statement. This statement is about GUE: a_1, ..., a_d are GUE here, and b_1, ..., b_k are fixed deterministic matrices. So how do we get the unitary statement? The observation, essentially — and I think this was mentioned in Percy's talk, if I'm not mistaken — is that the easiest way to sample your unitary matrices is to take U_j^(n) to be the polar part in the polar decomposition of a non-self-adjoint complex Gaussian matrix. So what you would do is take something like Z_j = A_{2j} + i A_{2j+1}; remember, the A's are self-adjoint matrices with complex entries, and if you think about it a little, you see that what you get is a non-self-adjoint matrix with complex Gaussian entries. Now any matrix can be decomposed as Z_j = U_j |Z_j|, a unitary times a positive matrix. And this positive matrix — if you look at the spectral distribution of |Z_j|, you will find that in the limit it's a quarter-circle law. Now, if we were allowed to assume that zero isn't exactly where it is — which, by the way, would solve lots of problems in mathematics — if zero were over here, then the quarter-circle distribution would be boundedly invertible, so Z_j would have a bounded inverse, and we would be in great shape: we could just solve the equation and write U_j = Z_j |Z_j|⁻¹.
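This polar-part construction is easy to mimic numerically; here is a minimal sketch via the SVD, reusing the hypothetical gue sampler above. The 1/√2 is my normalization, chosen so that the limit of |Z| is the quarter-circle law on [0, 2].

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
# z = w @ diag(s) @ vh is the SVD; the polar part is u = w @ vh,
# and |z| = vh^* @ diag(s) @ vh
z = (gue(n, rng) + 1j * gue(n, rng)) / np.sqrt(2)   # gue() from the sketch above
w, s, vh = np.linalg.svd(z)
u = w @ vh   # unitary polar part; Haar-distributed by the invariance argument below
# `s` holds the singular values of z, i.e. the spectrum of |z|;
# their histogram approximates the quarter-circle law on [0, 2]
```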
By the way, |X| here means the square root of X*X. And then it would be essentially immediate, from everything that's been done, that these U_j's converge, because they're only a little worse than polynomials in the Z_j's. Notice also that U_j is a Haar unitary. This is clear: because the entries of Z_j are complex Gaussian, Z_j has the same law as V Z_j for any fixed unitary V, and therefore its polar part satisfies that U_j has the same law as V U_j. By the uniqueness of Haar measure on the unitary group, U_j must be Haar distributed. So if we could solve like that, we would see that the joint law of U_1^(n), ..., U_q^(n) together with your B's converges to the joint law of the u's together with the b's, where u_j is the polar part of x_j + i y_j with x_j, y_j free semicircular variables. That combination is called a circular variable, and it's a little combinatorial verification if you like — there are various ways to do it, but in any case it's not very hard to check — that the polar part of a circular variable is actually a free Haar unitary. You get the statement right away. The only problem is that of course zero is where it is, so the inverse of Z is not quite bounded. But it's okay: it turns out the problem is not so substantial, and you just have to regularize a little bit — you add a little ε here, prove it for that, and then remove the ε. So not as big a problem as you might worry. Basically, once you know the statement about GUE matrices and constant matrices, the statement with the unitaries follows. Incidentally, if you're interested in the orthogonal case as opposed to the unitary case, the same thing works: you start with the A's being real symmetric Gaussian matrices, prove the analogue of this theorem, which is not much harder, and then use the same kind of trickery to get the orthogonal polar part and so forth.

All right, so my point is that this is really the statement to prove. To prove it I want to use the same bag of tricks that I've used in probably almost every lecture, namely this magic derivative that we've run into several times. Finally I'll give it a name: it was introduced by Voiculescu in this context, and he called it the free difference quotient, or the free difference quotient derivation. Let me say a few words about it, because it turns out to be a very useful tool — about as useful as differentiation is in the classical story. The setup is that you have some algebra D, the algebra of coefficients; they don't really play a role, e.g. D = ℂ or some other unital algebra. We also have variables x_1, ..., x_n, and I write D⟨x_1, ..., x_n⟩ for the algebra of non-commutative polynomials in x_1, ..., x_n whose coefficients are from D. A typical element is a monomial like d_0 x_{i_1} d_1 x_{i_2} ⋯ x_{i_p} d_p. If I call this algebra A, the difference quotient derivations ∂_i go from A to A ⊗ A, and the rule is that you differentiate the way you learned to differentiate in calculus, except that every time you differentiate an x, you put a tensor sign in its place: you remember where that x was and mark the location with a tensor sign. So let me give you an example.
Suppose I have something like d_0 x_1 d_1 x_3 d_2 x_1 d_3 x_1 d_4 — let's do one more x_1 here, then d_4 — and let's compute the derivative of this with respect to x_1. As I said, the rule is that you differentiate like you do in calculus. The d's are constants, so they don't get differentiated at all. So first I write d_0; then I encounter this first x_1, and what do I do? I put a tensor sign in its place, and I'm done differentiating for this term. Then I keep on differentiating. The next variable I encounter is an x_3, but I'm doing the partial derivative with respect to x_1, so I don't get to differentiate it — or if I do, I just get a zero term. So the next interesting term comes from the second x_1, and finally one more from the last x_1:

∂_1(d_0 x_1 d_1 x_3 d_2 x_1 d_3 x_1 d_4) = d_0 ⊗ d_1 x_3 d_2 x_1 d_3 x_1 d_4 + d_0 x_1 d_1 x_3 d_2 ⊗ d_3 x_1 d_4 + d_0 x_1 d_1 x_3 d_2 x_1 d_3 ⊗ d_4.

So this is a derivation: it obeys the Leibniz rule, ∂_i(PQ) = ∂_i(P)·Q + P·∂_i(Q) — when you differentiate a product, you differentiate the first factor times the second, plus the first times the derivative of the second. Now what do I mean by ∂_i(P)·Q? The derivative lives in A ⊗ A, and A ⊗ A is in a natural way a bimodule over A — in fact in several different ways, but what I'm using is left multiplication on the left leg and right multiplication on the right leg: a·(p ⊗ q)·b = ap ⊗ qb. That's what I've written here. Actually, the Leibniz rule, together with the facts that ∂_j(x_i) = δ_{ij} 1 ⊗ 1 and that the constants get killed, ∂_j(d) = 0, uniquely determines the whole thing. It's very easy to check; it's just a recursive argument.

All right. Two quick comments. If the algebra D is just the complex scalars and n = 1, so we only have one variable, then the algebra of non-commutative polynomials in one variable with coefficients in ℂ is of course extremely commutative: I can identify it with the ordinary algebra of polynomials in one variable. Its tensor square, which is the destination of my derivative, I can identify with the tensor product of polynomials with themselves, which I can think of as polynomials in two variables: I think of p ⊗ q as p(s)q(t). And if you trace through these identifications, the difference quotient derivation becomes ∂p(s, t) = (p(s) − p(t))/(s − t). This is why it's called the difference quotient.

All right. So now the point is that this derivation can be very useful in characterizing certain laws. While you're reading this, let me write down the analogue from the classical story. Suppose μ is a probability measure, and you happen to know that the expected value under μ of x·f(x) is the same as the expected value under μ of f'(x). Then μ is Gaussian. To see that this property holds for the Gaussian measure is very easy: take ∫ x f(x) e^{−x²/2} dx — there's a normalization factor, but it comes out in the wash; it's the same on both sides — view it as an integration by parts, and rewrite it as ∫ f'(x) e^{−x²/2} dx, which is exactly this identity.
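A quick Monte Carlo sanity check of this integration-by-parts identity, E[X f(X)] = E[f'(X)] for X standard Gaussian, with f(t) = t³ as an arbitrary polynomial test function (this snippet is just my illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(10**6)      # samples of X ~ N(0, 1)
f = lambda t: t**3
fprime = lambda t: 3 * t**2
# both sides approach E[X^4] = 3 for the standard Gaussian
print(np.mean(x * f(x)), np.mean(fprime(x)))
```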
But you can actually check that this identity determines the measure. For instance, you can check easily that it gives you an equation for the Fourier transform of the measure, and you're done. Now, what I claim is that the same thing happens in free probability. Stein's method is exactly the method of inventing some differential operator — in this case f ↦ f'(x) − x f(x) — with the property that under your measure the expectation of the result of this operator is always zero. That's a way to characterize a measure, and the idea is that if you want to prove that some measure is close to your measure, you first check that this operator is close to zero on it, and that gives you conditions for convergence.

So what I claim is that there is a corresponding statement in free probability. Namely, there is this equation — τ(x_i P) = (τ ⊗ τ)(∂_i P) — and I claim that it actually characterizes two things. One is that x_1, ..., x_d are free semicircular variables; and secondly, that they are free from the coefficient algebra D. Now — oops, what am I doing — to prove this there are two steps. One is uniqueness: you can prove that if this were to hold, then the law is determined completely. Well, we sort of already did this. Imagine D = ℂ for simplicity. If I take τ(x_i q) and write it as (τ ⊗ τ)(∂_i q), and the degree of q is, say, r, then x_i q is an arbitrary polynomial of degree r + 1; so if I know how to compute the right-hand side, I know how to compute the expected value of any polynomial of degree r + 1. On the other hand, when you differentiate q, you get a sum of tensor products, and all the terms in it involve degrees at most r − 1, because you've differentiated away an x. So you have a recursive relation between expected values. I mean, look: if I want τ(x_i), I write it as τ(x_i · 1) = (τ ⊗ τ)(∂_i 1) = 0. If I want τ(x_i x_j), I write it as (τ ⊗ τ)(∂_i x_j), and that's δ_{ij}.
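To make the recursion concrete in the one-variable, D = ℂ case: applying τ(x·x^{k−1}) = (τ ⊗ τ)(∂x^{k−1}) with ∂x^{k−1} = Σ_{a+b=k−2} x^a ⊗ x^b gives m_k = Σ_{a+b=k−2} m_a m_b, which is the Catalan recursion, so the moments come out as those of the semicircle law. A minimal sketch (my own illustration, not from the lecture):

```python
from math import comb

def moments_from_recursion(kmax):
    # m_k = sum over a+b=k-2 of m_a * m_b,
    # coming from tau(x * x^{k-1}) = (tau ⊗ tau)(d x^{k-1})
    m = [1.0]  # m_0 = tau(1) = 1
    for k in range(1, kmax + 1):
        m.append(sum(m[a] * m[k - 2 - a] for a in range(k - 1)))
    return m

print(moments_from_recursion(8))
# [1.0, 0, 1.0, 0.0, 2.0, 0.0, 5.0, 0.0, 14.0]
# even moments are the Catalan numbers, i.e. the semicircle moments:
print([comb(2 * j, j) // (j + 1) for j in range(5)])   # [1, 1, 2, 5, 14]
```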