I'm very happy to be here, and I thank Dima for inviting me. My talk is maybe a little bit off the main topic of this workshop; I would have liked to talk about networked economies, which would have been a little more appropriate, but that work is not finished yet. So I'd like to talk about something we have worked on for the last few years with Allez, who was a PhD student, and Marc Potters, and I hope you'll find it interesting; the topic is pretty general, has a lot of different applications, and I'll show one application to financial data at the end. In this talk I'm going to look at random matrices that are constructed from something I'll generically call a signal and something I'm going to call noise, so these are randomly perturbed matrices, and I'll be more precise about these two components later. The questions I'm going to address are not about eigenvalues, which have been studied a lot over the last 60 or 70 years, but about eigenvectors. In particular I'm going to ask how similar the eigenvectors of the pure matrix, the signal C, are to those of the noisy observation of C, which is M: the similarity of the eigenvectors. Another question which, as you will see, is very interesting is how similar the eigenvectors of two independent noisy observations of C are: the C will be the same, but the noise will be drawn twice, independently, from the same ensemble, and you can ask whether the eigenvectors of these two noisy observations are or are not similar. And then I'll end with some applications. In order to be slightly more precise about the type of random matrices I'm going to consider: the first case I've already written, and I'm going to call it the free additive noise case. In this generic case I consider C to be an arbitrary matrix, of course of large size (all these results will in principle be valid when the size of these matrices goes to infinity), and the noise component is made of a certain diagonal matrix B with an arbitrary spectrum, conjugated by a random rotation, O B O transpose. In a very loose way this is what defines freeness between matrices: two matrices are free if, in a sense, the bases in which they are diagonal are randomly rotated with respect to one another. Another case which, as you'll see, is very important is what I'm going to call free multiplicative noise: here the matrix C is again taken to be positive definite, and M is now the square root of C, times the same O B O transpose, times the square root of C; the signal is here and this is the noise. There are many examples of such problems. One is an inference problem: you observe M and you would like to reconstruct C as best you can. A very usual case in this respect is when B is a Wigner matrix, in which case O B O transpose is also a Wigner matrix (Wigner matrix means that all the elements are Gaussian and independent), so this is the most natural noisy case, where you add random noise on all entries to a certain deterministic matrix. You can think of this perturbation as in quantum mechanics, like adding a kind of time-dependent random perturbation to the system; this is considered in models of quantum dissipation and things like that. And then there's the well-known Dyson Brownian motion, where O B O transpose is actually a Brownian noise, and in this case you study the stochastic evolution of the eigenvalues.
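To fix notation, here is a compact, schematic summary of the two noise models just described: C is the signal, B a diagonal matrix with a given spectrum, and O a random (Haar-distributed) rotation, all of size N.

```latex
% Free additive noise: the noise is rotated into a random basis and added to C.
M \;=\; C \;+\; O\,B\,O^{\mathsf{T}} .
% Free multiplicative noise: C is positive definite and gets "dressed" by the noise.
M \;=\; \sqrt{C}\;O\,B\,O^{\mathsf{T}}\sqrt{C} .
```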
That's the typical problem Dyson looked at, but it's also interesting to look at the eigenvectors. The archetypal example in the multiplicative case is when C is a covariance matrix. You have observables, for example N time series of length T; as I'm going to show later for financial markets, imagine you have N stocks, say N equal to 500 if you look at the S&P 500, and a certain number of years of data, typically 250 trading days per year, so if you have 10 years that's 2500 entries. What you're looking at are the returns of these stocks, and you're trying to infer the covariance matrix from these returns. Of course what you have is a certain realization of these returns, drawn from a putative true covariance matrix, and you try to reconstruct this covariance matrix from observation. So in this case what you observe is the empirical covariance matrix; the true one you don't know, and this is usually modelled as what's called a Wishart noise, where O B O transpose is X X transpose, with X an N by T white-noise matrix, maybe Gaussian, maybe not. As I said, the problem of knowing the eigenvalue spectrum of these matrices has been beaten to death; something that's less studied is the overlap problem. So what I'm going to focus on is this object Phi of lambda i and c j, which is defined as N times (and I'm going to discuss why this normalization in a second) the expectation (and again I'm going to discuss why we need an expectation here) of the dot product, or in bra-ket quantum-mechanics notation u i with v j, where u i is the eigenvector of M, the noisy matrix, associated to eigenvalue lambda i, and v j is the eigenvector of C associated to eigenvalue c j. I'm going to consider very large matrices, as I've said, N much larger than one, and if you really look at this quantity for a given lambda and a given c, without averaging, it doesn't converge to anything in the large-N limit, so it's better to actually average these overlaps. I forgot the square here: it's the overlap squared that I'm going to call an overlap, by abuse of language. If you average over small intervals, say around a given value of lambda, you can fix c j and average over a small interval of lambda, of very small width but still much larger than 1/N, so that there are still a lot of eigenvalues in this small interval; then this converges to something that we're going to compute. The factor N here means that the overlaps are very quickly of order 1/N: as soon as you perturb a large matrix, the overlap with the initial direction gets lost very rapidly, it becomes of order 1/N, and you have to look at this scale 1/N to see something non-trivial. The reason why these overlaps become small so quickly can be intuited using the Dyson Brownian motion picture that I alluded to, because if you look at the evolution of an eigenvector when you add a small perturbation to the matrix (that's very easy to derive, it's just second-order perturbation theory), there's a term which is deterministic but also a random term, and you see there's a lambda i minus lambda j in the denominator which hybridizes the eigenvector with all the other eigenvectors of different indices. So this is a very efficient way to mix and hybridize the eigenvectors: very quickly, at each near-collision of these eigenvalues, you lose some coherence.
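In formulas, the object being studied and the perturbative mechanism just described read schematically as follows (the normalisation of the Brownian increments dW_{ij} depends on conventions and is only indicative here):

```latex
% Rescaled mean squared overlap between eigenvectors of the noisy matrix M and of C.
\Phi(\lambda_i, c_j) \;=\; N\,\mathbb{E}\big[\langle u_i \,|\, v_j\rangle^{2}\big] .
% Second-order perturbation / Dyson picture: an eigenvector of M hybridizes with the
% others, the more strongly the closer the corresponding eigenvalues.
d|u_i\rangle \;\simeq\; \sum_{j\neq i}\frac{dW_{ij}}{\lambda_i-\lambda_j}\,|u_j\rangle
\;-\;\frac{1}{2}\sum_{j\neq i}\frac{\mathbb{E}\!\left[dW_{ij}^{2}\right]}{(\lambda_i-\lambda_j)^{2}}\,|u_i\rangle .
```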
The structure of the eigenvector then becomes spread over a finite interval, and hence over of order N eigenvectors, and that's really why you need the factor N here. Okay, so before telling you the results that we obtained, I need to introduce some pretty standard tools; most of you probably know these objects, but some of them are maybe less well known, so I'm going to spend a little time on them. [Audience question] Yes, I'm averaging over a certain interval, I'm not discretizing. [Audience question: can lambda i and lambda j come arbitrarily close?] Sorry, what I'm going to compute is this object here; in the last slide, this was just an illustration of the mechanism by which you lose coherence. It is an exact result, but if you look at the equation for lambda i, what you find is that collisions are avoided, that's the repulsion of levels, but eigenvalues can come very close together, and when they come close together you hybridize with the rest. Okay, so the most used tool in random matrix theory is the so-called resolvent (there are many names for this object); the resolvent is just the inverse, so M is again our noisy matrix, z is a complex number, and you consider the matrix z times the identity minus M, to the minus one, so the inverse of z times the identity minus M. From the knowledge of this object you can, for example, recover the spectral density, or the eigenvalue distribution, depending on the background, physics or math, that you prefer: if you take the imaginary part of the normalized trace of this resolvent computed at lambda minus a small imaginary part i eta, and you take the limit where eta goes to zero, this gives you pi times the density of eigenvalues around lambda. And interestingly, if you are interested in the overlaps, it's very easy to show a similar formula: if you bracket the imaginary part of the full matrix between v i and v i, where again v i is the eigenvector associated to eigenvalue c i, then when you take eta to zero this quantity becomes pi times the density of states times the normalized overlap that I talked about. In practice it's very important to understand that eta is the resolution scale at which you look at your eigenvalue spectrum, so in order for all of this to make sense, and in particular in order to automatically compute these averages over small intervals around lambda, you should take eta to be small, formally going to zero, but keeping in mind that it must be much larger than 1/N. These objects are in a way microscopes that zoom in around some lambda but with a finite resolution, and this resolution should not be too fine: you shouldn't be able to look at individual eigenvalues, but only at some kind of coarse-grained version of them. Okay, once you have the Stieltjes transform, which is again the normalized trace (G is often also called a Green function), you can define more objects. For example, Zee introduced a name for the functional inverse of G, the Blue function: the Blue function composed with the Green function is the identity. And with the Blue function you can construct what's called the R transform, which is the Blue function minus one over z, and you'll see why this is an interesting object; for example, if you compute the R transform of a Wigner matrix it's very simple, it's just a linear function of z.
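Written out, the standard objects just introduced are (with the resolution eta kept small but much larger than 1/N, and g_M denoting the normalized trace of G_M):

```latex
G_M(z) \;=\; \big(z\,\mathbb{1}-M\big)^{-1}, \qquad
\rho_M(\lambda) \;=\; \frac{1}{\pi}\lim_{\eta\to 0^{+}}
   \operatorname{Im}\,\frac{1}{N}\operatorname{Tr} G_M(\lambda-\mathrm{i}\eta) .
% Overlap with a fixed eigenvector v_i of C, with eigenvalue c_i:
\lim_{\eta\to 0^{+}} \operatorname{Im}\,\langle v_i|\, G_M(\lambda-\mathrm{i}\eta)\,|v_i\rangle
 \;=\; \pi\,\rho_M(\lambda)\,\Phi(\lambda, c_i) .
% Blue function and R transform; for a Wigner matrix of width sigma, R is linear:
\mathcal{B}\big(g_M(z)\big) = z, \qquad
R(z) = \mathcal{B}(z) - \frac{1}{z}, \qquad
R_{\mathrm{Wigner}}(z) = \sigma^{2} z .
```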
You can also introduce another transform, called the S transform. Again it looks very abstract at this point, but you'll see why these objects are interesting; it's a little bit more involved, you have to define this quantity, the T transform, and then from the T transform there's a formula for the S transform. You remember I talked about Wishart matrices: Wishart matrices are defined, if you want, as the empirical covariance matrix you would measure on white noise. If you have N time series of length capital T, in principle the true covariance matrix is the identity, there's no structure, but if you measure this covariance matrix on a finite data set you'll find something different from the identity, and what you find is a member of the Wishart ensemble. The S transform of a Wishart matrix is given by one over one plus q z, where q is a very important quantity that will come back later: it is the ratio of the size of the matrix to the length of the time series. So for example when T goes to infinity at finite N, it means you have a lot more data than you have time series, and in this limit you recover the fact that the empirical covariance matrix is the identity, which is the true covariance matrix; so q going to zero means you remove the noise, and when q is large there's a lot of noise. Okay, so the main theoretical result that we got with the collaborators I alluded to in my first slide, published in IEEE Transactions on Information Theory in 2016, is the following: you can actually compute, in the large-N limit, the matrix Green function, the resolvent, of this noisy matrix M, and express it in terms of the resolvent of the pure matrix C. To do that, actually, you can forget about the brackets, the expectation, because in the large-N limit this becomes self-averaging. The point is that you're not computing these objects at the same point in the complex plane, and this is really what makes the whole story interesting, as you'll see in a minute. For example in the additive case, you remember, the one described by a certain diagonal matrix B and random rotations, capital Z of small z is given by a self-consistent formula: capital Z of z is z minus the R transform of the B matrix computed at the point g M of z. Of course g M of z is itself the normalized trace of this matrix, so this is a self-consistent equation defining capital Z of z. In the multiplicative noise case there's a similar formula that involves the S transform of the matrix B. How do we get these results? Well, you can use what physicists like to do, a replica calculation. I won't give you the details, but if you use the replica representation of the resolvent, which is a kind of Gaussian integral with zero replica dimensions, and use the Harish-Chandra-Itzykson-Zuber formula in the low-rank case, then you can obtain these formulas pretty easily. What's easy to check is that if you take the trace of these matrix identities, you recover what are called the free convolution rules, and maybe some of you are familiar with this quite amazing result: if you have the sum of two free matrices, then the spectrum obeys a convolution rule, namely the R transform of the sum is the sum of the R transforms, and similarly in the multiplicative case the S transform of the product is the product of the S transforms. So this is just a consequence of this more general result, which holds true in a matrix sense.
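Schematically, the subordination relation in the additive case and the scalar rules it reduces to under the trace are (g_M is the normalized trace of G_M; the multiplicative case has an analogous relation involving the S transform of B):

```latex
% Matrix subordination relation, free additive noise case (self-consistent in Z):
G_M(z) \;=\; G_C\big(Z(z)\big), \qquad
Z(z) \;=\; z \;-\; R_B\big(g_M(z)\big) .
% Taking the normalized trace recovers the free convolution rules:
R_{A+B}(z) \;=\; R_A(z) + R_B(z), \qquad
S_{AB}(z) \;=\; S_A(z)\,S_B(z), \qquad
S_{\mathrm{Wishart}}(z) \;=\; \frac{1}{1+qz}, \quad q=\frac{N}{T} .
```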
Okay, so now we have all the ingredients we need to go back to the main topic of my talk, which is computing this object, and as I've said this object is actually hidden in the dot product of G v i with v i itself. So you turn the crank and you get explicit formulas for these functions Phi of lambda and c, which are the coarse-grained averages of the squared overlaps of the corresponding eigenvectors, keeping in mind again the factor N that I emphasized at the beginning: all these curves, you should think of them as actually being very small, of order 1/N, but blown up by N so you can see the structure. For example, if you have a matrix to which you add a random term, say B has a Wigner spectrum, meaning the noise you add to C has independent entries with a finite second moment (they don't necessarily need to be Gaussian), then there's a lot of universality in this problem, and what you get is that the overlap is given by this expression. What you see is that there's a (c minus lambda) squared here, so this is a Cauchy-like formula, for a given lambda as a function of c, or for a given c as a function of lambda, with a certain width given by this object. So this is the shape of these functions, say for a given lambda as a function of c: it peaks around the unperturbed eigenvalue, but there's a dispersion; the blue lines are the exact formulas and the dots are numerical simulations as a check. You can check, for example, what happens when sigma goes to zero, when there's no noise: you should expect this Phi of lambda, c to become a delta function at lambda equal to c, because in this case you don't perturb the eigenvectors at all, and that's indeed what you recover; when sigma goes to zero you have a Cauchy distribution of width zero, and the shift is also zero, so the Cauchy distribution becomes a delta function. But the Cauchy distribution also has this interesting property that it decays very slowly as a function of the distance between c and lambda, in fact it decays as a power law, so the initial eigenvector actually spreads out very far, with a very slowly decaying tail, and this is again related to the kind of Dyson mixing that I alluded to before. This formula, as I said, is true for all Wigner-like matrices: you don't necessarily have to perturb with a Gaussian noise, provided the noise has a finite second moment; it is a universal result, which is interesting. There's a similar result for multiplicative noise, and for the case of empirical covariance matrices, which is the most relevant example, you again get an explicit formula where the factor q appears explicitly. You remember q is the ratio of the size of the matrix, the number of time series you want to correlate, to the number of time steps over which you measure them, and again what you see is that when q goes to zero, which means you have a lot more data points than time series, this tends to a delta function, which is what you expect. This explicit result was actually obtained in 2011 in a beautiful paper by Ledoit and Péché. [Audience question: what are your assumptions on the matrix C?] No assumption, no assumption. Well, for the identity there's a degeneracy, so there's a problem there, but it works for a generic matrix with a non-degenerate spectrum.
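As an aside, here is a minimal numerical sketch (not the code behind the slides) of the kind of check just mentioned, dots against theory, in the additive Wigner case; the matrix size, noise strength sigma and averaging window are arbitrary illustrative choices.

```python
# Sketch: build M = C + (Wigner noise) and measure the rescaled squared overlaps
# N <u_i|v_j>^2 between eigenvectors of M and of C, averaged over a small window
# of eigenvalues lambda around a target value, as described in the talk.
import numpy as np

rng = np.random.default_rng(0)
N, sigma, n_samples = 500, 0.5, 20

# "Signal": work in the eigenbasis of C, so its eigenvectors v_j are the basis vectors.
c = np.linspace(1.0, 3.0, N)          # eigenvalues of C
C = np.diag(c)

target = 2.0                          # value of lambda around which we average
window = 0.05                         # window half-width: >> 1/N but << spectrum width
overlaps = np.zeros(N)
counts = 0

for _ in range(n_samples):
    X = rng.normal(size=(N, N))
    W = sigma * (X + X.T) / np.sqrt(2 * N)   # GOE noise, semicircle of radius 2*sigma
    lam, U = np.linalg.eigh(C + W)           # eigenvalues / eigenvectors of the noisy M
    mask = np.abs(lam - target) < window     # eigenvalues of M close to the target
    # column i of U is u_i; component j of u_i is its overlap with v_j
    overlaps += N * (U[:, mask] ** 2).sum(axis=1)
    counts += mask.sum()

phi = overlaps / counts                      # estimate of Phi(lambda = target, c_j)
print(c[::50])
print(phi[::50])   # peaks for c_j near the target, with heavy (Cauchy-like) tails
```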
[Audience question: does the result on the slide just before concern the resolvent itself, not just the trace?] Exactly, this is a matrix identity from which you can get the trace result; the trace results are these ones, the traces give you the spectrum, but what's interesting is that if you keep the full matrix structure you get information on the eigenvectors. Okay, so there's an example, maybe a little bit related to your question, which is interesting: the case where C is actually of rank one, so there's a unique non-zero eigenvalue gamma and all the other ones are zero, and you perturb this with a Wigner, in fact a Brownian, matrix noise W of t. What's interesting is that, strictly speaking, all these results are true in the N goes to infinity limit, but they still make sense to order 1/N, so you can still use this formalism, and if you are careful keeping terms to order 1/N you get exact results that had been obtained before using different techniques. One of them that I want to emphasize is the so-called BBP (Baik, Ben Arous, Péché) transition, which is what happens in this case. Again, you start from the spectrum where there are N minus one eigenvalues equal to zero and one equal to gamma, which is equal to five here, at time t equal to zero if you want, and then you run this Brownian noise W of t and you watch how the spectrum evolves. If you do that without the so-called outlier, it's clear that what you have is zero plus a Wigner matrix, so a semicircle law appears and the width of the semicircle grows with time. This is what you see; you actually see lines, this is really a numerical simulation: all these lines repel each other and they occupy a parabola, and inside the parabola the spectrum, for each time, is a Wigner semicircle of radius 2 sigma square root of t. And there's something happening to the isolated eigenvalue as well, which you can again grab using the previous results to order 1/N: you find that the outlier has its own dynamics, it evolves as gamma plus sigma squared t over gamma, so it's more or less a straight line, and this is true as long as t is smaller than a certain critical time, which in this case is gamma squared over sigma squared. What happens when t reaches t star is this star here, where the red line crosses the boundary of the Wigner semicircle: at that point the outlier just disappears, it's eaten by the Wigner sea, and it's in a sense a true phase transition, which is now called the BBP transition after Baik, Ben Arous and Péché. For t greater than t star there's no isolated eigenvalue anymore, just the semicircle. You can also ask, and this, as I said, you get from the R transform formalism to order 1/N, what happens to the overlaps, and for the overlaps what you find is that the overlap of this isolated eigenvector with what it was at time t equal to zero is equal to one minus t over t star. So when t is zero the overlap is one, because you haven't done anything, and when t goes to t star that's the last moment where the isolated eigenvector remembers the structure it had at time zero; after that it's completely lost in the Wigner sea, and in an information-theoretic sense there's actually no way of recovering the information contained in the initial eigenvector after this time, or when the amount of noise is beyond some critical value.
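With the conventions used here (rank-one signal gamma, semicircle of radius 2 sigma square root of t; prefactors depend on the normalisation of the noise), the picture just described can be summarized schematically as:

```latex
\lambda_{\mathrm{outlier}}(t) \;\simeq\; \gamma + \frac{\sigma^{2} t}{\gamma}
\quad (t < t^{*}), \qquad
\lambda_{\mathrm{edge}}(t) \;=\; 2\sigma\sqrt{t}, \qquad
t^{*} \;=\; \frac{\gamma^{2}}{\sigma^{2}} ,
% overlap of the outlier eigenvector with its direction at t = 0:
\big|\langle u_{\mathrm{out}}(t)\,\big|\,u_{\mathrm{out}}(0)\rangle\big|^{2}
\;\simeq\; 1-\frac{t}{t^{*}} \quad (t < t^{*}),
\qquad \simeq 0 \quad (t > t^{*}) .
```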
What happens exactly at t star is, to my knowledge, not known, or at least not written down anywhere; if you do hand-waving arguments you can conjecture that precisely at t star the overlap goes down to zero as N to the minus one third. Okay, so now I'm going to try to show how you can use these results for something concrete, and the question I'm going to ask is an inference problem: you observe a random realization of this noisy matrix M, and what you would like is the best estimate of C, of the true signal, for example the true covariance matrix, or the true matrix to which you added some noise, knowing M. What I'm going to assume here is that the problem is rotationally invariant, in the sense that you have no information that allows you to privilege some directions in space: you have no idea where the true eigenvectors of C are pointing. In this case the best you can do is to assume that your estimator has the same eigenvectors as your noisy observation, because that observation is the only thing that breaks the rotational invariance; you're stuck with these directions, if you want, there's nothing else you can do. Of course you might have some information about the problem that tells you that some directions should exist for symmetry reasons, but here we're assuming that your prior contains no information about directions in the N-dimensional space. So the best you can do, the only thing you can do, is to write that your best estimator is the sum over the eigenvectors of the noisy observation (again, that's the only thing you can choose) times some pseudo, or dressed, eigenvalues xi i that you have to determine. Okay, so now if you try to minimize the L2 distance between Xi and C, that is, how close you can get to C using this representation, you find conditions that fix these xi i, and they are given by this equation: a weighted average of the true eigenvalues of C, multiplied by the overlaps. But you stare at this formula and you see there's something completely stupid about it, which is that I want to know C, and I'm expressing xi hat as a function of things I don't know. This is what's called the oracle estimator; you've done nothing useful, because you want this best estimator xi hat as a function of things that, if you knew them, you would not need to estimate. So this formula looks empty, and what I'm going to show you is that, for some kind of miraculous reason, in the large-N limit you can estimate these objects without knowing C, just knowing what you observe. Okay, so here there's a little bit of manipulation: you first re-express this formula, in the large-N limit, as an integral rather than a sum, so these overlaps appear, like the objects I've defined before, and it so happens that you can re-express this integral like this, and by using the rules that I've written here you end up with a result where C has disappeared and only M appears.
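To recap the construction just described: with the eigenvectors u_i of M held fixed, minimizing the squared (Frobenius) distance to C over the dressed eigenvalues gives the oracle values,

```latex
\Xi \;=\; \sum_{i=1}^{N} \xi_i \, u_i u_i^{\mathsf{T}}, \qquad
\xi_i^{\mathrm{ora.}} \;=\;
\operatorname*{arg\,min}_{\xi_i}\,\big\|\Xi - C\big\|_{F}^{2}
\;=\; \langle u_i |\, C \,| u_i \rangle
\;=\; \sum_{j=1}^{N} \langle u_i | v_j \rangle^{2}\, c_j ,
```

which is exactly the weighted average of the true eigenvalues by the overlaps; the point of the talk is that in the large-N limit this can be rewritten in terms of M alone.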
So for example in the additive case, what you should do is take the empirical eigenvalue lambda i you've observed, the eigenvalue of M, and transform it using a function F1 in order to get this dressed eigenvalue xi hat, and this function only depends on objects that are known once you've observed M and you know the structure of the noise. For example, here is the real part of the Stieltjes transform, the Hilbert transform of the spectrum, and alpha 1 and beta 1 are kind of complicated functions, but they only involve M and the noise; R B is again the R transform of the noise matrix. So everything only depends on M, and of course you have to know something about the noise. If everything is Gaussian, what you get is the well-known formula for one-dimensional Gaussian variables: if you know you're observing a Gaussian variable with noise, the best estimator of the variable itself is the observed value times signal divided by signal plus noise; this complicated formula is, if you want, the matrix version of that in the Gaussian case. In the multiplicative case there's a similar formula that again only involves M and the structure of the noise, where gamma B and omega B are the equivalents of these transforms, and in the empirical covariance matrix case you get this explicit formula F2 of lambda, how you should renormalize the empirical eigenvalue to get the estimator, which is lambda divided by this object. Again you see that when q goes to zero you don't have to do anything, because you observe the matrix perfectly, and if q is non-zero you know what to do. This is an example of this function F2: this axis is the observed lambda, and this is how you should change lambda to get the best estimator. It looks very close to a straight line, which in this context is called linear shrinkage: for a long time people have proposed to transform the eigenvalues in a very simple, linear fashion, but you see that in general it's actually more complicated. Okay, now what's maybe even more interesting for some applications is that you can extend the techniques I've explained to another problem, which is not the overlap of the eigenvectors of M with those of C, but the overlap of the eigenvectors of M with those of another realization of this random matrix problem, M prime. Here W and W prime are two independent realizations of the noise, and what I'm looking at is this different overlap Phi between the eigenvector close to the eigenvalue lambda of the first realization and the one close to the eigenvalue lambda tilde of the second realization. I think it's pretty explicit, and again, using the kind of techniques I've told you about, you can get an explicit, if rather cumbersome, formula (I haven't even dared to write the full expressions) for this Phi of lambda, lambda tilde, both in the additive case and in the multiplicative case. For example, even though I haven't shown alpha and beta, there's a function Phi of q, q tilde; the two matrices may not even have the same signal-to-noise ratio, the same N over T ratio, in the multiplicative case, so there's a formula which depends on the parameters of the problem. If you want to see what these formulas look like, this is, for a fixed lambda tilde, the overlap of the eigenvectors of one matrix with those of the second one: it peaks around lambda tilde, but again with a kind of Cauchy shape; the red line is again the theoretical formula and the dots are numerical experiments.
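Going back for a moment to the covariance case: a standard form of this kind of cleaning (the Ledoit-Péché type nonlinear shrinkage, which I believe is what the F2 formula above corresponds to) replaces each empirical eigenvalue lambda_i by lambda_i / |1 - q + q lambda_i g_M(lambda_i - i eta)|^2, with g_M the empirical Stieltjes transform and eta of order one over square root of N. A minimal sketch, assuming Gaussian (Wishart) noise and ignoring edge and finite-size corrections:

```python
import numpy as np

def rie_clean(E, T):
    """Rotationally invariant 'cleaning' of a sample covariance matrix E measured
    over T observations (sketch; Wishart-noise assumption, no edge corrections)."""
    N = E.shape[0]
    q = N / T
    lam, U = np.linalg.eigh(E)                   # empirical eigenvalues / eigenvectors
    eta = 1.0 / np.sqrt(N)                       # resolution scale: 1/N << eta << 1
    z = lam - 1j * eta
    # empirical Stieltjes transform g(z) = (1/N) sum_k 1/(z - lam_k), just below the real axis
    g = np.mean(1.0 / (z[:, None] - lam[None, :]), axis=1)
    xi = lam / np.abs(1.0 - q + q * lam * g) ** 2   # "dressed" eigenvalues
    return (U * xi) @ U.T                        # same eigenvectors, shifted eigenvalues

# toy usage: the true C is the identity, so cleaned eigenvalues should move back towards 1
rng = np.random.default_rng(1)
N, T = 300, 900
X = rng.normal(size=(N, T))
E = X @ X.T / T
print(np.linalg.eigvalsh(E)[[0, -1]])                 # raw: spread out by measurement noise
print(np.linalg.eigvalsh(rie_clean(E, T))[[0, -1]])   # cleaned: much closer to 1
```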
But what's really cute, I think, is that this formula, this Phi of q, q tilde in the multiplicative case, or the analogous formula in the additive case, doesn't depend on C: you just need to know M and M prime, which you observe, and from the knowledge of their spectra you can calculate what you expect the overlap between the eigenvectors corresponding to lambda and lambda tilde to be, assuming that they both come from the same ensemble. Assuming they come from the same ensemble, you are able to compute this function without knowing C. So it's a kind of strange result: you're trying to test whether M and M prime come from the same ensemble, but you don't know this ensemble and you don't know the generating matrix C; still, it can be used to test whether your two observations originate or not from the same unknown C. And the formulas you get are interesting because they are universal, you don't have to assume much about the noise. For example, if everything were Gaussian you could have other ways of testing this hypothesis, but what makes the result quite interesting, I think, is that you don't need much knowledge about the structure of the noise to get these universal formulas in the large-N limit. So it's interesting to apply this idea to financial time series, because it's very hard to imagine a priori that financial markets are the same between, say, 1990 and 2000 and between 2000 and 2010. So it's interesting to see: if you measure the covariance matrix empirically on, say, 10 years and then on the 10 subsequent years, can you test whether the true covariance matrix, which you don't know, you only know the observations, has changed? And by the way, this quantity q, which determines the amount of noise, the lack of knowledge you have about your problem, it's very difficult to reach small values of q even in this present era of big data: q going to zero is the perfect-signal case, but if you consider, as I said, a portfolio of 500 stocks in the S&P 500 and 10 years of daily data, that's 2500 points, so the ratio is 0.2, which is not very small; you're clearly in a high-noise regime, because you have a lot of objects. So this is the overlap I just talked about, at the same value of lambda: in principle you could make tests for different values of lambda and lambda tilde, but I'm specializing the formula to the case where lambda equals lambda tilde and q equals q tilde. In this case, if you zoom in on the part of the spectrum where most eigenvalues of the covariance matrix lie, you get the empirical points, the green points, and the prediction; okay, don't worry about the fact that there are two lines, there's a slight subtlety here that I can tell you about if you're interested, but look at the red line: the red line is not bad, it's not exactly on the data points, but as I said there's a little bit of ambiguity here in the value of q that you should use, again for reasons I can explain. But if you zoom out and look at the largest eigenvalues of the covariance matrix, then the theoretical prediction, under the hypothesis that the covariance is the same in the two time periods (I don't remember exactly, it's probably 2000 to 2010 and 2010 onwards), shows a very strong discrepancy: the overlap is much smaller than what it should be if the true covariance matrix hadn't changed.
So the conclusion is that the large eigenvectors are unstable: in a sense it's not a problem of noise, it's not a problem of observation, it's really the structure of the underlying problem that makes the eigenvectors evolve with time, and this is a conclusion we had already obtained, using different methods, with Allez. What you have to realize is that this is extremely important for portfolio optimization. The reason is that these large eigenvalues correspond, in terms of risk, to portfolios: portfolios define directions in space, these directions are the most risky ones, and the eigenvalue is the risk associated with the corresponding portfolio. So if you think you're neutral with respect to the directions of largest risk, but these directions shift with time, then you're exposed to unknown risks that were not in your risk model. So here, you know, I've shown you a kind of eyeballing test, it's just by eye, but it would be interesting to turn this into a true statistical test; this we haven't done, and in particular there are subtleties about the hypotheses: the ambiguity in the value of q comes from the fact that it's not a very good model to assume that the data is completely white noise. Okay, so I told you that this overlap function between two realizations from the same ensemble is an ugly formula that I haven't written, but there's a simple interpretation of it that you'll probably understand intuitively. Remember what I now call Phi 0: Phi 0 is the overlap between M and C, these are the objects considered in the first part of the talk, while Phi is the overlap between two different realizations. What you can write, by definition, is that the eigenvectors of the noisy matrix are a superposition of the eigenvectors of the underlying pure matrix, the v mu, with the square root of the overlaps, which gives you the weight of u lambda on v mu, and random sign factors epsilon mu lambda, and then you have to integrate over all mu with the corresponding eigenvalue density. If you compute the overlap between two eigenvectors corresponding to different realizations of the same problem you get something like this, and now if you make what's called in this field an ergodic hypothesis, namely that all these sign factors are completely random and independent as soon as mu and lambda are different, then by taking the average you get this kind of triangular convolution formula: the overlap between eigenvectors at lambda and lambda tilde, for the problem of M and M prime, is a convolution of the overlap of one of them with the pure matrix and the overlap of the other with the pure matrix, a kind of triangle identity that I think is intuitive. But if you use this formula, it appears to depend on C, because for example there's this explicit density of eigenvalues rho C; yet, as I've shown you, in the end the dependence on C disappears for Phi.
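Written out with the talk's notation, the ergodic-hypothesis argument is, schematically (Phi_0 the overlap with C, Phi the overlap between the two noisy samples, epsilon independent random signs):

```latex
|u_\lambda\rangle \;\simeq\; \sum_{\mu}
   \epsilon_{\mu\lambda}\,\sqrt{\frac{\Phi_0(\lambda, c_\mu)}{N}}\;|v_\mu\rangle ,
\qquad
\Phi(\lambda, \tilde\lambda) \;\simeq\;
   \int \mathrm{d}c \,\rho_C(c)\,\Phi_0(\lambda, c)\,\Phi_0(\tilde\lambda, c) ,
```

and although rho_C appears explicitly in this intermediate form, the combination can in the end be re-expressed in terms of the observed spectra of M and M prime alone, as stated above.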
Okay, maybe I should check how long I have; okay, including questions, okay, so I have a bit of time. So this formula has an interesting consequence for applications, which is the following. I'm going to reconsider this problem of the estimation of my random matrix, coming back to this problem here and this formula here, and what I'm going to show is that using this formula you get a representation that's very useful empirically and that, again, makes a lot of sense. The idea is to consider a first realization of your problem, with eigenvectors u i, and then a second, independent realization (I'm sorry, I should have called it M prime here, an independent realization of the covariance matrix), and what I'm defining are these objects nu i of q, which are the projections of this independent realization onto the eigenvectors of the first one: M prime bracketed between u i and u i itself. All of this can be measured empirically, and what you get, using the convolution formula, is that this nu i of q exactly recovers xi i hat. So this is a kind of cross-validation, or out-of-sample, estimator of xi hat (there's a small formula summarizing it below), where in a sense you don't even need any analytical formula: you just compute this with an out-of-sample, or cross-validation, period to determine the covariance matrix C. And this is what it looks like if you do the experiment numerically, on real financial data: all these points here are point-wise estimations of this formula, but when you average what we call here the oracle estimator you get the blue triangles, and the analytical formula that I told you about is the green one. So this is an easy way, if you want, to estimate these xi hat that you need in your L2 estimator of the covariance matrix.
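The small formula promised above: with u_i the eigenvectors of the first sample M, and M prime an independent second sample from the same ensemble, the out-of-sample (cross-validation) quantity

```latex
\nu_i \;\equiv\; \langle u_i \,|\, M' \,|\, u_i \rangle
\;\;\xrightarrow[\;N\to\infty\;]{}\;\; \hat\xi_i^{\mathrm{ora.}} ,
```

so the oracle eigenvalues can be estimated directly from data, without any analytical formula, by projecting a second, independent period onto the eigenvectors of the first.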
Okay, so let me conclude and mention a few problems that, at least, I find interesting. I have tried to tell you that the free random matrix results on the R transform and the S transform for the Stieltjes transform can be extended to the full resolvent matrix, and this gives access to overlaps. These overlap formulas lead to what I've called large-dimensional miracles: the so-called oracle estimator, which when you write it down gives you the impression of writing something silly, because you're expressing things you would like to know in terms of things you don't know, can in fact be estimated in the large-N limit; and in particular the hypothesis that two large matrices are generated from the same underlying matrix C can be tested without knowing C. As I told you, this is very useful: in some cases you have a physical model, so you may have some information about C, but in the case of financial markets, for example, there's nothing really you can use except the data itself. As I said, it's an eyeballing test at this stage; I think it would be interesting to turn it into a true statistical test at large N. Something interesting we've done recently with Florent Benaych-Georges and Marc Potters is to extend this to the SVD of more general correlation matrices: in the present case I've only considered square symmetric matrices, where you correlate N objects with themselves, but it's clear that in many empirical problems you want to correlate N objects with M other objects that may have absolutely nothing to do with them. Maybe you've heard about the sunspot problem, where stock markets are supposed to be correlated with the activity of the Sun; you could correlate anything with anything else. In general, cross-correlations have N rows and M columns, and it's interesting to ask the same questions when N and M go to infinity, and in particular how to clean these cross-correlation matrices. Since these are rectangular matrices, the notions of eigenvalues and eigenvectors don't make sense directly, so you have to use an SVD representation; but the rotation invariant estimator I told you about, which I said was on its way, we've actually now done: we have an explicit formula that generalizes the F1 and F2 formulas, these ones, to the singular values of more general correlation matrices. What's also interesting is that I've insisted on the fact that there's a Dyson Brownian motion description of the additive case; we believe there's also a Dyson Brownian motion description of the multiplicative case, but for the moment, I mean, there are leads, but we haven't completed that project yet. Something I find interesting as well, though maybe there's no nice solution, and there are already results showing that some of the free algebra cannot be generalized, is that there might be interesting interpolation schemes between the trivial case, where you add two matrices that commute, in which case it's obvious that you don't change the eigenvectors and you just add the eigenvalues, and the free case, where you have another rule for addition, more complex but explicit. I've tried to insist on the fact that freeness means that the directions of the eigenbases are completely independent from one another; maybe there are ways to introduce some correlations between these directions that make some generalizations interesting, but although we have very preliminary ideas I don't have much to say, I just think it's an interesting question. An obvious thing you can also ask is that in some cases there is prior information on the eigenvectors: how can one use it to go beyond this rotationally invariant estimator? For example, in the case of financial markets it is clear, and actually Matteo worked on that a long time ago, that the largest eigenvalue, which on this scale is actually out of the window here, corresponds to what people call the market mode, that is, a portfolio made of, say, one over square root of N on each stock, so you take an exposure to the market with equal weights. So there is a prior, which is just one one one one: the eigenvector of the largest eigenvalue should be around this direction. And in those here, you can read some economic structure: they are related to economic sectors, for example energy or banks or things like that. So it's not true that we have no prior information, but the way to marry this prior information with the idea of rotationally invariant estimators is, I think, still to be worked out, and on this I'll finish, thank you. [Questions] Yes, okay, so for example something you can do is to extend the Wishart ensemble to cases where you have power-law distributions, and it's pretty obvious that the covariance can be dominated: if the returns are very large on some days, then even if there's no correlation, but these two objects happen to move in the same direction on the day when things move a lot, you're going to find some spurious correlation, because everything becomes dominated by that day. So there are priors you can use to account for the tails, or for the fact that the volatility of the markets is itself changing in time.
There are periods where the volatility is large and periods where it's smaller, but you can extend these Wishart ensembles to account for this non-stationarity, if you want. [Audience question: you use estimators for the true eigenvalues of C whose eigenvectors exactly coincide with the eigenvectors of M, yet you show that there is a huge hybridization; why, given this hybridization, do you still get a non-trivial result?] Yes, and this is exactly what, in the end, tells you that the best estimator is not the empirical matrix you've observed: you keep only its eigenvectors, but you have to shift the eigenvalues by some quantity, precisely because of this mixing. [So essentially this mixing tells you that the eigenvalues satisfy some law of large numbers?] Yes, yes, absolutely. [But still, you are choosing essentially the same eigenvectors.] Yes, because there's no prior, because you can't do anything else; that's the hypothesis of the rotation invariant prior, what else can you do? The Bayesian approach tells you that the only thing you can do is to take the observed eigenvectors as your estimate for the eigenvectors of C; again, there's nothing else you can do, the only thing you can play with is not the eigenvectors but the eigenvalues, because of the rotation invariance assumption. And that's what I was saying at the end: if you have more information, then the problem becomes maybe more interesting but also more complicated, because you break the rotational symmetry, but that's exactly the point. And you're right, there's a law of large numbers at work; what's really nice is that M, the noisy version of C, contains enough information to compute these overlaps even if you don't know C, but that's because all these results are true in the large-N limit, so you actually have a lot of information. [Audience question: couldn't you modify your prior a bit, I mean this formula for the estimator on the previous page, by mixing a little the eigenvectors u i which are close in eigenvalue? This estimator is a diagonal matrix in the basis of the u i; you could include off-diagonal terms, but only between eigenvectors that are close in energy.] Well, I think that in this case it means you have a prior that is not rotation invariant; that's what I said, if you have a rotation invariant prior then the maximum likelihood estimator is diagonal in this basis. I'm not saying that in concrete cases this couldn't be a good idea, but I think that's exactly what's interesting, in my opinion, about what I said at the end: trying to extend this to non-rotationally-invariant ensembles and see what you can do. [Question about the choice of eta] So here there's been a little bit of numerical experimentation to optimize this; what we found is that, as I said, it should be large compared to 1/N but small, and one over the square root of N seems to be a good compromise. [Can we expect that each of the components converges to some distribution, to a Gaussian, as it does for the eigenvectors of the GUE?] I think in this rotation invariant case, yes.