So today I'm going to talk about this recent work we did with some people in our group in Paris, Bruno, Florent and Lenka, on phase retrieval, which recently appeared on the arXiv. First let me describe a broader class of models to which phase retrieval belongs: the generalized linear models. The most generic model we're going to consider is of this type: you have a signal in n dimensions that you're trying to recover from observations y_μ, which are noisy observations of the matrix-vector product of a sensing matrix and the signal. So you take your signal, you multiply it by a random sensing matrix, the result goes through a probabilistic observation channel, and that's what you observe when you try to infer the vector x* from these observations. In this setup we have n dimensions for the signal and m observations. The most generic way to state that you are looking at a phase retrieval problem is to say that the measurements should only depend on the modulus, the absolute value in the real case, of this matrix-vector product. Even in the noiseless case, which corresponds to y_μ directly equal to the modulus squared, the problem is highly non-trivial, and many algorithms and techniques have already been developed from very different points of view: semidefinite programming, spectral methods, non-convex optimization, and so on. Here we take a quite specific point of view: we try to understand the fundamental limits of phase retrieval in a random setting. You consider a random matrix and a random signal, and you try to understand the typical properties of this problem: what is the typical, optimal performance that you can obtain? And let me emphasize that this is very different from
other studies, for instance those which focus on injectivity properties, which give some kind of worst-case bound. Okay, so let me now specify a bit more exactly the questions we're going to try to answer on this model. First, what is the minimum number of samples, in the high-dimensional limit, that we need to recover the signal at least better than a random guess, that is, to recover a finite fraction of the signal, or to recover it perfectly, in a sense that has to be defined? And second, what are the performances of the optimal polynomial-time algorithms for this problem?

Now I'm going to describe the main hypotheses of our study, which are quite generic. The signal and the matrix will be either real or complex, so I define a parameter β which is 1 in the real case and 2 in the complex case. I assume the signal is generated i.i.d. from a prior distribution that I know, with variance ρ. On the sensing matrix I make two hypotheses: I assume that it is orthogonally (or unitarily) invariant on the right, meaning that its right singular vectors are completely delocalized, and I also assume that its empirical singular value distribution converges in the large-n limit. In terms of random matrix models this is actually a very generic hypothesis which encompasses a lot of possible models: of course Gaussian matrices, but also arbitrary products of Gaussian matrices, uniformly sampled unitary or orthogonal matrices, and more generally, given an asymptotic spectral distribution, you can quite easily design a random matrix model which satisfies this hypothesis and has the required asymptotic distribution.
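As a concrete illustration of this setup, here is a minimal numpy sketch of my own (not from the paper): it builds a right-orthogonally-invariant sensing matrix with a prescribed singular value distribution and generates noiseless phase retrieval observations. The particular sizes and spectrum are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_orthogonal(dim, rng):
    """Sample a Haar-distributed orthogonal matrix via QR of a Gaussian matrix."""
    z = rng.standard_normal((dim, dim))
    q, r = np.linalg.qr(z)
    return q * np.sign(np.diag(r))  # sign fix so the distribution is exactly Haar

def sensing_matrix(m, n, singvals, rng):
    """A = U S V^T with Haar U, V: orthogonally invariant, prescribed singular values."""
    s = np.zeros((m, n))
    k = min(m, n)
    s[np.arange(k), np.arange(k)] = singvals
    return haar_orthogonal(m, rng) @ s @ haar_orthogonal(n, rng).T

n, m = 50, 100                                    # alpha = m / n = 2 (arbitrary)
x_star = rng.standard_normal(n)                   # Gaussian prior, variance rho = 1
singvals = rng.uniform(0.5, 1.5, size=min(m, n))  # an arbitrary spectrum
A = sensing_matrix(m, n, singvals, rng)
y = np.abs(A @ x_star) ** 2                       # noiseless phase retrieval channel
```

Any spectrum can be plugged into `singvals`, which is exactly the flexibility of the hypothesis: the invariance comes from the Haar factors, the spectrum is free.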
Right, so now that I have set up the framework, let me state our first result, which is very classical for these studies: it is a conjecture based on the replica method of statistical physics. Maybe the simplest way to state it is to say that there exists a scalar optimization problem, which I do not write explicitly here but which is completely explicit, over two variables: a supremum over two real variables of a sum of three terms, which decompose quite nicely into a contribution from the prior distribution, a contribution from the observation channel, and a contribution from the spectrum of the sensing matrix. This problem is completely explicit once you know these ingredients for a particular problem, and if you can solve it, the claim is that you can read off the information-theoretically minimal mean squared error simply from the maximizer q_x. This conjecture is obtained using the usual tools of statistical physics under the replica symmetric assumption.

Okay, so the first question we can ask is: can we at least prove this conjecture in some cases? And the answer is that we can, in two cases. The first one is if the matrix is Gaussian, real or complex; the real case was already done a few years ago in a paper by Jean Barbier and others, and in the complex case you have to be a bit more careful but it basically goes through as well. The other case in which we can prove it is if the prior is Gaussian and the sensing matrix is a Gaussian matrix times another matrix, random or deterministic, with very light assumptions on it. In either of these two cases we can prove the above conjecture, using interpolation techniques based on an idea of Guerra from the early 2000s, which have been refined a lot since. Okay, so that was maybe our first theoretical result.
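Schematically, the variational principle described above has the following shape (the precise functionals are in the paper; the ψ's below are only placeholders for the three contributions named in the talk, with the dependence on the sampling ratio absorbed into them):

```latex
\mathrm{MMSE} \;=\; \rho - q_x^{\star}, \qquad
(q_x^{\star}, q_z^{\star}) \;=\; \operatorname*{arg\,sup}_{q_x,\, q_z}
\Big[\,
\underbrace{\psi_{P_0}(q_x)}_{\text{prior}}
\;+\;
\underbrace{\psi_{\mathrm{out}}(q_z)}_{\text{channel}}
\;+\;
\underbrace{\psi_{\mu_A}(q_x, q_z)}_{\text{spectrum}}
\,\Big]
```

The point is only the structure: two scalar order parameters, three explicit terms, and the MMSE read off from the maximizer.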
Another nice thing about this formulation, as I will describe now, is that it also gives access to algorithmic performances. So let me recall this variational principle: it involves a function of two variables that I'm going to call the replica-symmetric potential, and we maximize this potential to obtain the information-theoretic MMSE. Another nice property of this GLM problem is that there is a strong conjecture, shown in some cases though not in this most general one, but widely believed, that the optimal polynomial-time algorithms in terms of recovery MSE actually belong to the class of approximate message passing (AMP) algorithms. These algorithms are quite involved, but they are explicit and iterative, and a very nice property, which has been proven in this case, is that they achieve an asymptotic error which is also given by this replica-symmetric potential; but instead of the global maximum, it is basically the local maximum which is closest to the point q = 0, where q = 0 means the MSE is maximal, i.e. a random initialization. So you start from a random initialization in this replica-symmetric potential, you do gradient ascent, and you end up at the error achieved by approximate message passing. This is very nice because, without running any numerical simulations, it allows us to investigate the existence of computational gaps just by looking at the landscape of this potential. Here I'm giving two very simple examples, in which you can see that in one case there is actually a gap and in the other there is not.

Okay, so now let me focus a bit more on phase retrieval. The first result I'm going to state is about the weak recovery problem. Here we consider the phase retrieval problem in a very generic setup: we just assume that the channel distribution is a function only of the modulus, i.e. of the absolute value in the real case.
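To make the "gradient ascent from q = 0" picture concrete, here is a toy illustration; the potential below is an invented double-bump function, not the actual replica-symmetric potential of any model. The information-theoretic prediction is the global maximizer, the AMP-like prediction is the local maximizer reached from q ≈ 0, and when the two differ there is a computational gap.

```python
import numpy as np

def potential(q):
    """A made-up 'replica-symmetric potential' with two local maxima (illustration only)."""
    return 0.6 * np.exp(-(q - 0.15) ** 2 / 0.01) + 1.0 * np.exp(-(q - 0.85) ** 2 / 0.01)

# Information-theoretic prediction: the global maximizer of the potential.
grid = np.linspace(0.0, 1.0, 10001)
q_it = grid[np.argmax(potential(grid))]

# Algorithmic prediction: gradient ascent starting from q ~ 0 (random initialization).
q = 1e-3
for _ in range(20000):
    grad = (potential(q + 1e-6) - potential(q - 1e-6)) / 2e-6  # numerical derivative
    q = float(np.clip(q + 0.002 * grad, 0.0, 1.0))
q_alg = q

rho = 1.0  # prior variance
mse_it, mse_alg = rho - q_it, rho - q_alg
# mse_alg > mse_it: ascent from q ~ 0 gets stuck at the lower bump, a computational gap.
```

Here ascent stops near the first bump (q ≈ 0.15) while the global maximum sits near q ≈ 0.85, so the algorithmic MSE is much larger than the information-theoretic one, which is exactly the landscape picture used in the talk.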
Then we ask the question: what is the minimal number of measurements you need to beat random guessing in polynomial time? That is, how many measurements do you need to get a mean squared error which is non-trivial? We have a sharp answer to this: the threshold is the solution of an equation which is a bit involved but not that complicated, involving the characteristics of the problem, the channel and the spectral distribution. Again we derive it using the analysis I presented before, since, as we said, we can investigate the algorithmic performances by looking at the replica-symmetric potential. One interesting consequence we can draw from this formula, for which we don't really have an intuitive interpretation, is that for any phase retrieval channel the highest weak recovery threshold is always reached, in terms of spectra, by orthogonal or unitary matrices. This is a consequence for which we don't really have an intuition, if someone has one please share, but it is an interesting and very simple consequence of our equations. And if you specialize this equation to noiseless phase retrieval it simplifies a lot, and in particular you can recover the previously known results for Gaussian matrices or for random unitary matrices.

Okay, so this was for the weak recovery problem; now I'm going to specialize even more, to noiseless phase retrieval. Here I'm assuming the channel is noiseless, and I ask the question: how many measurements do you need to achieve the best possible recovery, in the sense that from this point on, adding more measurements would not improve your recovery of the signal? The answer is actually very simple: if you define small r as the fraction of nonzero singular values, then the threshold is simply given by
β times r; recall that β is 1 in the real case and 2 in the complex case. So it's very simple, and interestingly, in the real case this can be derived using a very simple counting argument which had already been made before, and which basically amounts to saying that if you know the absolute value of a real number, you know it up to its sign. So if you know a vector of absolute values, you have an exponential but finite number of possible sign vectors to try, and since we are looking at information-theoretic thresholds, you can derive the threshold quite easily with this argument. However, in the complex case we find that this threshold is 2r, and there you don't have such a simple counting argument; as far as we know, we could only derive it through our replica-symmetric potential analysis.

Now let me briefly present a few numerical applications of our analysis. Here I am again in the noiseless phase retrieval setup, and I'm going to present two simple cases; of course we could analyze many more, but for the sake of the presentation I'll focus on these two: complex Gaussian matrices and column-unitary matrices. For these two cases the weak recovery properties are already known. Here I'm plotting the mean squared error as a function of α, basically the number of measurements per dimension: the blue line is the predicted optimal algorithmic performance, the orange line is the information-theoretic optimal performance, and the points are explicit numerical simulations of the algorithm. As you can see, in both cases there is a computational gap, meaning that in the light blue regions you cannot achieve the information-theoretic error in polynomial time. And as you can see, the points of the AMP algorithm really match the analytical prediction.
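The real-case counting argument can be made completely concrete on a tiny instance: each measurement y_μ = |⟨a_μ, x⟩| fixes ⟨a_μ, x⟩ up to a sign, so one can enumerate all 2^m sign patterns and solve a least-squares problem for each. This is only an illustration of the information-theoretic argument, at exponential cost, not an algorithm one would actually run.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 6                        # tiny instance: brute force costs 2^m
A = rng.standard_normal((m, n))
x_star = rng.standard_normal(n)
y = np.abs(A @ x_star)             # noiseless real phase retrieval: |<a_mu, x>|

best_x, best_res = None, np.inf
for signs in itertools.product([-1.0, 1.0], repeat=m):
    # Hypothesize the sign of every measurement; the problem then becomes linear.
    b = np.array(signs) * y
    x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
    res = np.linalg.norm(A @ x_hat - b)
    if res < best_res:
        best_res, best_x = res, x_hat

# The signal is only identifiable up to a global sign flip.
err = min(np.linalg.norm(best_x - x_star), np.linalg.norm(best_x + x_star))
```

The true sign pattern (and its global flip) yields a zero-residual linear system, which is why an exponential but finite search recovers the signal, exactly the counting argument from the talk; in the complex case the unknown per-measurement phase is continuous, so no such enumeration exists.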
So it's all very coherent. And from this plot, especially in the unitary case, you can draw some further observations. The first one, which is the last in this list, is that we also ran the algorithm for matrices which in theory do not satisfy our rotation invariance hypothesis: namely, we took discrete Fourier transform matrices from which we randomly subsampled the columns, and still the algorithm performs basically as well for these matrices as for the uniformly sampled unitary matrices. So this seems to indicate that you could allow for some structure in the matrix without harming the performance of the algorithm. Another interesting result, which we think was not known before: in the unitary case, in the information-theoretic sense, you see there is an all-or-nothing transition, meaning that at a critical α the information-theoretic MSE brutally drops from 1 to 0. And again, analyzing our equations, you can show that in the complex setting the unitary matrices are the only matrices which can show such a transition; again something for which we don't really have an intuition. In the real case, however, you can find many random matrix ensembles for which you have such a transition, so there seems to be some kind of difference in this sense between the real and complex cases.

Okay, I'm already almost at the end, so let me conclude with this slide, which is a table that summarizes many of our results in a compact way. In the first column I'm giving the random matrix ensemble; the lines are for noiseless phase retrieval, except the last three which are for generic phase retrieval; and I'm giving the values of the different thresholds that we can compute: the weak recovery threshold, and the full recovery threshold both in the information-theoretic sense and in the algorithmic sense.
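For reference, column-subsampled Fourier sensing matrices of the kind mentioned in this experiment can be built in a few lines; this is my own sketch, and the exact sampling and normalization used in the actual experiments are assumptions on my part.

```python
import numpy as np

rng = np.random.default_rng(2)

def subsampled_dft(m, n, rng):
    """m x n sensing matrix: n randomly chosen columns of the m x m unitary DFT matrix."""
    assert n <= m
    F = np.fft.fft(np.eye(m), norm="ortho")   # unitary discrete Fourier transform matrix
    cols = rng.choice(m, size=n, replace=False)
    return F[:, cols]

A = subsampled_dft(128, 64, rng)
# Columns remain orthonormal: A^H A = I_n, i.e. a "column-unitary" sensing matrix,
# but with Fourier structure instead of being Haar-sampled.
gram = A.conj().T @ A
```

Such a matrix is column-unitary like the Haar-sampled ones, yet deterministic up to the column choice, which is what makes the observed match in algorithmic performance non-trivial.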
Basically, in red are results that had not been presented in the literature before, so this gives a nice overview of how our work tries to fill some gaps in the study of hard phases in phase retrieval. And with this I would like to thank you for your attention, and also to point out that for the simulations we used a great numerical package called TRAMP, which was developed in the group, especially by Antoine Baker; they have a paper on it now, and it's a really great modular package for coding approximate message passing for a lot of inference problems. Thank you. Thanks Antoine for the great talk.