Okay, welcome back. Using the replica trick and the replica-symmetric assumption, we have computed the average value of the minimal loss as a function of $\alpha$ and $\sigma^2$, and we have seen that there is a critical value of $\alpha$ beyond which the system is typically incompatible. This critical value clearly moves to the left as we increase the noise.
Now, Rashel and Yan have developed, using objects that we have already computed, an even more powerful approach to compute not just the average value but the full distribution of the minimal loss in the large-$N$ limit, which is essentially what we wanted to do. I will just present the approach; after that it is a lot of calculation, so we'll see how much I can cover. In the end there is a final formula, but what is important here is the method.
What they do is assume a large deviation form, motivated by similar problems where such a large deviation approach is correct. They assume that in the large-$N$ limit the full probability distribution of the minimal loss has the form $R(\hat e)\, e^{-N L(\hat e)}$, where $\hat e$ is $E_{\min}$ divided by $N$.
I'm not sure how familiar you are with the theory of large deviations, and with the fact that a probability distribution depending on a large parameter can develop this type of form. So I have prepared a crash course on large deviations using the simplest example, coin tosses. $P(M, N)$ is the probability of getting exactly $M$ tails in a sequence of $N$ coin tosses. The exact result is $P(M, N) = \binom{N}{M}\, 2^{-N}$, where of course the binomial term takes into account the different orderings of the sequence of heads and tails. The idea of large deviations is encoded in this simple scheme. Plug in the numbers $N = 100$ and $M = 90$.
So what is the probability of getting 90 heads in a sequence of 100 coin tosses? This probability is very tiny, of course; this is a very rare event. It is of order $10^{-17}$. But then if you double both these numbers and ask for the probability of getting 180 heads out of 200 tosses, can you guess roughly what the exponent would be? We have an intuition that this is also a very rare event. The exact exponent is $-33$. So the first event is rare, but the second is 16 orders of magnitude rarer, even though the ratio is the same, just because we have doubled the number of tosses.
This means that this probability, which is an exact result, in the limit $N \to \infty$ with $M$ scaled as $xN$, decays exponentially fast in the number of tosses, like $e^{-N I(x)}$. The coefficient $I(x)$ of this exponentially fast decay is called the rate function, or large deviation function, and it can be computed exactly for this problem. It has a symmetric shape with a minimum, which is also a zero, at $x = 1/2$. So the rate function has its zero and minimum at the most probable value, which corresponds to the least suppression from this exponential decay, and then it goes up to the level $\log 2$ when $x$ is equal to one or zero. Why $\log 2$? Because the probability of getting $N$ tails or $N$ heads out of $N$ tosses is just $(1/2)^N$, which clearly is $e^{-N \log 2}$.
This is a very fundamental object, because it connects typical fluctuations with rare events, where you essentially observe a streak of completely unlikely outcomes. Okay, that's large deviations in two minutes. If you can do better than that, be my guest. So this is the point: we have a large parameter $N$, and we assume a large deviation form for $E_{\min}$.
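The numbers quoted in the coin-toss example can be checked directly. The following is a minimal sketch, assuming a fair coin; the function names `log10_prob`, `rate` and `empirical_rate` are introduced here for illustration only, and the closed form used in `rate` is the standard fair-coin rate function, which the lecture describes but does not write out.

```python
import math

def log10_prob(m: int, n: int) -> float:
    """log10 of the exact probability C(n, m) / 2**n for a fair coin."""
    return math.log10(math.comb(n, m)) - n * math.log10(2)

def rate(x: float) -> float:
    """Fair-coin rate function I(x) = log 2 + x log x + (1 - x) log(1 - x)."""
    def xlogx(t: float) -> float:
        # convention 0 * log 0 = 0, so that I(0) = I(1) = log 2
        return t * math.log(t) if t > 0 else 0.0
    return math.log(2) + xlogx(x) + xlogx(1 - x)

def empirical_rate(n: int, x_num: int = 9, x_den: int = 10) -> float:
    """-(1/n) log P(x n, n) at fixed ratio x = x_num / x_den (natural log)."""
    m = x_num * n // x_den
    return -log10_prob(m, n) * math.log(10) / n

for n in (100, 200, 400, 800):
    m = 9 * n // 10
    print(n, round(log10_prob(m, n), 2), round(empirical_rate(n), 4), round(rate(0.9), 4))
```

The printed empirical decay rate at fixed ratio 0.9 approaches `rate(0.9)` from above as the number of tosses grows, which is the content of the large deviation statement.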
So what Rashel and Yan propose, and earlier on Yan and coauthors, is to analyze the replicated partition function that we defined before, but in a different limit: $\beta \to \infty$ and $n \to 0$, which is pretty much what we did before in a different order, but now assuming that $n\beta$ is equal to a fixed value $s$. The claim is that this object in this limit is related to the large deviation function $L$. This is not the Lagrangian; the symbol is particularly bad, but this is just the large deviation function for this problem.
How do we see this? Let's try to write a chain of equalities. We write $\langle Z^n \rangle$ as $\langle e^{n \log Z} \rangle$. Then we use the fact that little $n$ is $s/\beta$, so we get $\lim_{\beta \to \infty} \langle e^{(s/\beta) \log Z} \rangle$. But then we use the fact that the minimal loss is $E_{\min} = -\lim_{\beta \to \infty} \frac{1}{\beta} \log Z$, which is essentially the starting point of the previous lecture. So here we have $\frac{1}{\beta} \log Z$, and we can replace it with $E_{\min}$, with a minus sign of course.
Well, we are taking the limit, yes, but I've got 10 pages, so you will need to help me out here. I think we've done worse things in the previous hour, so we can probably proceed. The idea is that you proceed, and you're right that this is $\langle e^{-s N \hat e} \rangle$. So essentially this object, in this peculiar limit, is related to the Laplace transform of the random variable that you are after. The conclusion is that the limit $\beta \to \infty$ of $\langle Z^{s/\beta} \rangle$ is $\int d\hat e\, P_N(\hat e)\, e^{-N s \hat e}$.
And then you do the standard thing, which is to insert the large deviation ansatz inside the integral. So you say that $P_N(\hat e)$, we assume, is $R(\hat e)\, e^{-N L(\hat e)}$. Then you evaluate this integral for large $N$, because we have the exponential of $-N$ times something. For large $N$ this integral should go as $e^{-N [\, s \hat e + L(\hat e)\,]}$ at the saddle point, where $s$ is the parameter that fixes the product $n\beta$. Let me be more precise. This will be a function of $s$: when you evaluate the integral at the saddle point you have to extremize over $\hat e$, so the integral goes as $e^{N \phi(s)}$ with $\phi(s) = -\min_{\hat e} [\, s \hat e + L(\hat e)\,]$.
So the idea is that if you can extract this limit out of the object that we have already computed, and put it in a large deviation form depending on the parameter $s$, then this function, the rate function in Laplace space, is related by a Legendre transform to the rate function in real space. What you have to do now is invert the Legendre transform and carve out the rate function in real space from this object $\phi$. The general framework of the method is that the rate function in real space is given by $L(\hat e) = -[\, s^\star \hat e + \phi(s^\star)\,]$, where $s^\star$ is the solution of the equation $\hat e = -\phi'(s^\star)$.
This is completely equivalent: if I start from what you have here, replace the limit $\beta \to \infty$ by a limit $N \to \infty$, put a $1/N$ in front and take the log of this, I think this is the log moment generating function in large deviations, the usual definition of the log moment generating function for $\hat e$.
Which would establish the fact that $\phi(s)$ is precisely some sort of moment or cumulant generating function. And then by Gärtner-Ellis you get the rate function. This is what you're doing.
Exactly.
But if you really follow the recipe from large deviation theory, the limit here would be a limit in $N$. So how do I connect the two?
Well, the limit in $N$ is implicit in here. There is a further limit in $N$, because you are evaluating this object in a large-$N$ setting, right? There is this integral that you need to evaluate using the Laplace method.
Yeah, okay. So, Enrico is asking if I can repeat the question. I'm saying that the object that appears here in the box looks very much, up to a logarithm and a $1/N$, like the so-called log moment generating function in large deviation theory, but there the limit would be taken as $N$ goes to infinity instead of $\beta$. And then there is a general result, under certain conditions, called the Gärtner-Ellis theorem: if you can compute this thing for any $s$, then from its Legendre transform, the convex conjugate, which is the operation done there, you recover the large deviation function. I was trying to connect this classical large deviation approach that I just described, where $N$ goes to infinity, with what is written there, where instead $\beta$ goes to infinity.
Yes, I think this is also a bit misleading, in the sense that this limit has already been taken in some sense. The limit $\beta \to \infty$ has already been taken when you write that $\log Z$ over $\beta$ is minus $E_{\min}$. So that was slightly misleading, of course; the limit has already been taken inside. The further limit is $N \to \infty$, which of course you need to assume here to be able to use the Laplace method for this integral.
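The Legendre inversion discussed here, $L(\hat e) = -[\, s^\star \hat e + \phi(s^\star)\,]$ with $\hat e = -\phi'(s^\star)$, can be tried out on the coin-toss example, where everything is known in closed form. This is a minimal sketch under that assumption: for a fair coin the average of $e^{-s M}$ over $N$ tosses is $((1 + e^{-s})/2)^N$, so $\phi(s) = \log[(1 + e^{-s})/2]$. The function names are hypothetical, and the bisection solver is just one convenient choice.

```python
import math

def phi(s: float) -> float:
    """phi(s) = (1/N) log <exp(-s M)> for N fair coin tosses, M = number of tails."""
    return math.log((1 + math.exp(-s)) / 2)

def minus_phi_prime(s: float) -> float:
    """-phi'(s) = 1 / (1 + e^s); decreases monotonically from 1 to 0 as s grows."""
    return 1 / (1 + math.exp(s))

def s_star(x: float, lo: float = -50.0, hi: float = 50.0) -> float:
    """Solve x = -phi'(s) by bisection, for 0 < x < 1."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if minus_phi_prime(mid) > x:
            lo = mid  # -phi' is still too large, so the root lies at larger s
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rate_from_legendre(x: float) -> float:
    """Inverse Legendre transform: L(x) = -[s* x + phi(s*)]."""
    s = s_star(x)
    return -(s * x + phi(s))

def rate_exact(x: float) -> float:
    """Closed-form fair-coin rate function, for comparison (0 < x < 1)."""
    return math.log(2) + x * math.log(x) + (1 - x) * math.log(1 - x)

for x in (0.1, 0.3, 0.5, 0.9):
    print(x, rate_from_legendre(x), rate_exact(x))
```

The two columns agree, which illustrates the convex-conjugate relation mentioned in the discussion in a case where no replica calculation is needed.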
So maybe the source of confusion was that I left a limit that in reality we had already used. Sorry about that. The goal is then to extract this function $\phi(s)$ from this strange double limit. And if we are able to extract $\phi(s)$, then by this inverse Legendre transform we should be able to access the full distribution of $E_{\min}$, of the normalized $E_{\min}$, in the large-$N$ limit. There are examples of problems where all these steps can actually be carried out essentially until the end. It seems increasingly unlikely that I will be able to do that here, but in principle this program can work until the end.
All we have to do is perform this double scaling limit of an object that we had already computed, so we can leverage what we did just a few minutes ago. I'm just recalling that we computed $\langle Z^n \rangle$ as $e^{-\frac{N}{2} \Phi_n(q_{\min})}$. In the double scaling limit we want this object to go as some $g(s)\, e^{N \phi(s)}$. Comparing these two relations, we obtain that $\phi(s)$, our quantity of interest, is $-\frac{1}{2}$ times the limit of $\Phi_n$ evaluated at $q_{\min}$. But $\Phi_n$ evaluated at $q_{\min}$ we have; we just computed it. It was
$$\Phi_n(q) = \alpha n \log[1 + \beta(1 - q)] + \alpha \log\left[1 + \frac{\beta n (q + \sigma^2)}{1 + \beta(1 - q)}\right] - n \log(1 - q) - \log\left[1 + \frac{q n}{1 - q}\right].$$
That was the result that we had. Now we have to substitute every occurrence of little $n$ with $s/\beta$, in each of the four terms. So we have it.
We also have to do something else, though, because of course we want to extract information in the interesting case where the system is typically incompatible, that is, when the average of $E_{\min}$ is strictly larger than zero. So we want to be above $\alpha$ critical. Otherwise the problem is much less interesting, because it is always, on average, compatible. We want to know, when the problem is not compatible, how the loss is distributed around its average value.
You remember that in order to tackle this region we need to assume a double scaling between $\beta$ and $q$: we need $\beta(1 - q) = v$, which implies $q = 1 - v/\beta$. What you do now is put $1 - v/\beta$ every time you find a $q$. If you do that, you get
$$\Phi_n(q_{\min}) = \alpha \frac{s}{\beta} \log(1 + v) + \alpha \log\left[1 + \frac{s\,(\sigma^2 + 1 - v/\beta)}{1 + v}\right] - \frac{s}{\beta} \log\frac{v}{\beta} - \log\left[1 + \frac{s}{\beta}\, \frac{1 - v/\beta}{v/\beta}\right],$$
because $1 - q$ is always $v/\beta$. Now you have something that you can evaluate in the limit $\beta \to \infty$, which is the limit that you want. You don't have to divide by anything here; you just kill the terms that don't survive the limit. Some terms are killed entirely, and of the others some parts are killed and some survive. In the limit $\beta \to \infty$, what survives is
$$\Phi(s, v) = \alpha \log\left[1 + \frac{s\,(\sigma^2 + 1)}{1 + v}\right] - \log\left[1 + \frac{s}{v}\right].$$
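The limit just described can be checked numerically. The sketch below assumes the four-term form of $\Phi_n(q)$ as it is read out in the lecture, substitutes $n = s/\beta$ and $q = 1 - v/\beta$, and compares with the surviving expression $\alpha \log[1 + s(1+\sigma^2)/(1+v)] - \log[1 + s/v]$ for increasing $\beta$. The function names and the parameter values are illustrative choices, not taken from the lecture.

```python
import math

def phi_n(n, q, beta, alpha, sigma2):
    """Replica-symmetric function Phi_n(q), as quoted in the lecture."""
    return (alpha * n * math.log(1 + beta * (1 - q))
            + alpha * math.log(1 + beta * n * (q + sigma2) / (1 + beta * (1 - q)))
            - n * math.log(1 - q)
            - math.log(1 + q * n / (1 - q)))

def phi_scaled(s, v, beta, alpha, sigma2):
    """Phi_n evaluated with n = s / beta and q = 1 - v / beta at finite beta."""
    return phi_n(s / beta, 1 - v / beta, beta, alpha, sigma2)

def phi_limit(s, v, alpha, sigma2):
    """The beta -> infinity limit Phi(s, v)."""
    return alpha * math.log(1 + s * (1 + sigma2) / (1 + v)) - math.log(1 + s / v)

alpha, sigma2, s, v = 3.0, 0.5, 1.0, 0.7  # arbitrary illustrative values
for beta in (1e2, 1e4, 1e6):
    gap = abs(phi_scaled(s, v, beta, alpha, sigma2) - phi_limit(s, v, alpha, sigma2))
    print(f"beta={beta:.0e}  |finite - limit| = {gap:.2e}")
```

The gap shrinks roughly like $\log\beta / \beta$, which comes from the $-\frac{s}{\beta} \log\frac{v}{\beta}$ term, the slowest of the vanishing contributions.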
So this function $\Phi(s, v)$ is the central object that we need to analyze. We have essentially performed this step, and what remains is this function $\Phi$. Good. So how do we get an equation for $v$? Of course there is still an equation for $v$ in the game that we need to impose: $v$ is going to be a function of $s$, because in this problem only $s$ must survive. What we do is take the off-diagonal condition, substitute the scaling there, and then take the limit $\beta \to \infty$.
If we do that, it isn't too much. This is the off-diagonal condition written replacing every occurrence of $q$ with $1 - v/\beta$; remember there was a $q$ in the numerator. You can check your notes, but unless I am mistaken this is what you will obtain. On the other side, what was $q + \sigma^2$ becomes $1 - v/\beta + \sigma^2$, then we have $1 + v$, and then we have $1 + v + \frac{s}{\beta}\, \beta\, (1 - v/\beta + \sigma^2)$, where $1 - v/\beta$ was again a $q$. It is just a not very interesting way of replacing every occurrence of $q$ with its correct value.
Now, if we send $\beta$ to infinity, this simplifies a lot. We get an equation for $v$ with the denominator $(1 + v)\,[\,1 + v + s(1 + \sigma^2)\,]$.
Of course, some of you might have noticed that I started from the off-diagonal condition, but I could have obtained the very same equation for $v$ by simply differentiating $\Phi(s, v)$ with respect to $v$ and setting this derivative equal to zero. Carrying out that derivative gives
$$\frac{\alpha\,(1 + \sigma^2)}{(1 + v)\,[\,1 + v + s(1 + \sigma^2)\,]} = \frac{1}{v\,(v + s)},$$
and you can check that the two results agree. It is just a sanity check that we are doing the same thing. The off-diagonal condition is the condition for the replicas, coming from the inverse of the replica matrix.
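The equation for $v$ as a function of $s$ can be explored numerically. The sketch below starts from the stationarity condition $\partial_v \Phi(s, v) = 0$ for $\Phi(s, v) = \alpha \log[1 + s(1+\sigma^2)/(1+v)] - \log[1 + s/v]$, which reads $\alpha(1+\sigma^2)\, v (v + s) = (1 + v)[1 + v + s(1+\sigma^2)]$. Clearing denominators this way turns it into a quadratic in $v$; that rearrangement, the choice of the positive root, and the names `v_of_s` and `big_phi` are assumptions of this sketch rather than steps stated in the lecture. It requires $\alpha(1+\sigma^2) > 1$ and $1 + s(1+\sigma^2) > 0$.

```python
import math

def big_phi(s, v, alpha, sigma2):
    """Phi(s, v) = alpha log[1 + s (1 + sigma^2) / (1 + v)] - log[1 + s / v]."""
    return alpha * math.log(1 + s * (1 + sigma2) / (1 + v)) - math.log(1 + s / v)

def v_of_s(s, alpha, sigma2):
    """Positive root of A v^2 + B v - C = 0, equivalent to dPhi/dv = 0.

    Assumes A = alpha (1 + sigma^2) - 1 > 0 and C = 1 + s (1 + sigma^2) > 0,
    in which case the product of the two roots is negative and exactly one
    root is positive.
    """
    c = 1 + sigma2
    A = alpha * c - 1
    B = alpha * c * s - 2 - s * c
    C = 1 + s * c
    return (-B + math.sqrt(B * B + 4 * A * C)) / (2 * A)

def stationarity_residual(s, v, alpha, sigma2):
    """Left side minus right side of the stationarity equation."""
    c = 1 + sigma2
    return alpha * c / ((1 + v) * (1 + v + s * c)) - 1 / (v * (v + s))

alpha, sigma2 = 3.0, 0.5  # illustrative values
for s in (-0.2, 0.0, 0.5, 1.0):
    v = v_of_s(s, alpha, sigma2)
    print(f"s={s:+.1f}  v={v:.6f}  residual={stationarity_residual(s, v, alpha, sigma2):.1e}")
```

At $s = 0$ the root reduces to $v = 1/(\sqrt{\alpha(1+\sigma^2)} - 1)$, which is positive exactly when $\alpha(1+\sigma^2) > 1$.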
So this equation provides $v$ as a function of $s$, where $s$ is the Laplace parameter on which we need to act with the Legendre transform to get the rate function in real space. The only problem we have here is that at some point this will become a pretty horrible equation to solve. So let me proceed and see what I can do.
This object here is what we call capital $\Phi(s, v)$, but now this $v$ is itself a $v(s)$. The object that I want is just $\phi(s) = -\frac{1}{2} \Phi(s, v(s))$, where $v(s)$ solves this equation. That's the flow of information: I need to solve this, plug it back in here, and put the $-\frac{1}{2}$ in front. The program can essentially be carried out until the end with a couple of tricks. I don't know how much time I have, 10 minutes maybe.
My rate function in real space, we said, must be equal to an inverse Legendre transform, $L(\hat e) = -\hat e\, s^\star - \phi(s^\star)$. But we said that $\phi$ is $-\frac{1}{2}$ capital $\Phi$, so this is $L(\hat e) = -\hat e\, s^\star + \frac{1}{2} \Phi(s^\star, v(s^\star))$, where $s^\star$ is found as a function of $\hat e$ by solving the following problem: $\hat e$ must be equal to $-\phi'(s^\star)$. So this is one half of the total derivative of capital $\Phi$ with respect to $s$, evaluated at $s = s^\star$. But $\Phi$ is a function of two variables, $s$ and $v(s)$, so by the chain rule you need the derivative with respect to the first argument and the derivative with respect to the second argument. The derivative with respect to the second argument is zero because of the stationarity condition. So what remains is really just the partial derivative of $\Phi(s, v)$ with respect to $s$, evaluated at $s = s^\star$.
So the program, I think, is relatively clear. It is just a matter of algebra: solving this equation, or trying to, since unfortunately it is not going to be possible to give an explicit solution, then plugging it back in here and computing one derivative. The derivative with respect to $s$ can be computed here. You get
$$\hat e = \frac{1}{2}\left[\frac{\alpha\,(1 + \sigma^2)}{1 + v + s(1 + \sigma^2)} - \frac{1/v}{1 + s/v}\right].$$
This is the equality that connects $s$ with $\hat e$; $s$ is $s^\star$, of course. And if you massage this equation a bit, using the equation for $v$, you can show that it is equivalent to $\hat e = \frac{1}{2 v (v + s^\star)}$.
Now what you have to do is connect this equation with the equation for $v$, because everything in the end must be a function of $\hat e$ alone. We have many things in the game: we have $s^\star$, which is a function of $\hat e$ through this equation, but we also have $v(s^\star)$, which is connected to $s^\star$ by the other equation. So we need to find a clever way to put these three elements together. The best thing you can do is to use this simple expression to rewrite everything as a cubic equation for $v$, which is the source of all problems. You have a cubic equation in which $a$ is $\frac{1 + \sigma^2}{2 \hat e}$. So the connection between $v$ and $\hat e$ is via a cubic equation. We can solve it, of course, but in the full space of parameters we get a messy expression that is not very illuminating.
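The elimination of $s^\star$ can be checked numerically. In the sketch below, the relations $\hat e = 1/[2 v (v + s)]$ and $a = (1+\sigma^2)/(2\hat e)$ are the ones quoted in the lecture, while the explicit cubic $\sigma^2 v^3 - (1 - \sigma^2) v^2 + (\alpha a - 1 - a) v - a = 0$ is a reconstruction obtained here by substituting $v + s = 1/(2 \hat e v)$ into the stationarity equation; the lecture does not read out its coefficients, so treat this form as an assumption. The helper `v_of_s` (positive root of the stationarity equation) and all names are hypothetical.

```python
import math

def v_of_s(s, alpha, sigma2):
    """Positive root of the stationarity equation, written as A v^2 + B v - C = 0.

    Assumes alpha (1 + sigma^2) > 1 and 1 + s (1 + sigma^2) > 0.
    """
    c = 1 + sigma2
    A = alpha * c - 1
    B = alpha * c * s - 2 - s * c
    C = 1 + s * c
    return (-B + math.sqrt(B * B + 4 * A * C)) / (2 * A)

def e_hat_of_s(s, alpha, sigma2):
    """e_hat = 1 / (2 v (v + s)) evaluated on the stationary v(s)."""
    v = v_of_s(s, alpha, sigma2)
    return 1 / (2 * v * (v + s))

def cubic_residual(v, e_hat, alpha, sigma2):
    """Reconstructed cubic in v with a = (1 + sigma^2) / (2 e_hat)."""
    a = (1 + sigma2) / (2 * e_hat)
    return (sigma2 * v**3 - (1 - sigma2) * v**2
            + (alpha * a - 1 - a) * v - a)

alpha, sigma2 = 3.0, 0.5  # illustrative values
for s in (-0.2, 0.0, 0.5, 1.0):
    v = v_of_s(s, alpha, sigma2)
    e_hat = e_hat_of_s(s, alpha, sigma2)
    print(f"s={s:+.1f}  e_hat={e_hat:.5f}  cubic residual={cubic_residual(v, e_hat, alpha, sigma2):.1e}")
```

The residual vanishes along the whole curve, so every pair $(v, \hat e)$ generated from $s$ satisfies the cubic. The sketch does not address which of the cubic's roots to select when starting from $\hat e$ instead of $s$.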
And also, we cannot analyze bounds: we know that $v$, for example, must be positive because of the way we have defined it, and this bound is very hard to read off from this equation in terms of $\hat e$. So you have this final result: the first final result is the cubic, and the second final result is the expression for $L$, which is here, the only one that I erased.
There is an alpha missing? Yes, just here. Okay, you were right, thank you. Now that I've corrected it I can erase it.
So essentially, putting together all the elements that we have here, including this cubic equation, we have an expression for $L(\hat e)$, which I'm just reporting for completeness. This is the large deviation function for the minimal loss, which is something that you can plot. But it is still not completely trivial to plot, because it is given parametrically in terms of this function $v$, and $v$ is an implicit function of $\hat e$ given by this cubic. So you need to solve them together: $v$ is given in terms of $\hat e$, and then you have $L$.
The plot, of course, has a minimum, which is also a zero, at the average value of $E_{\min}$ over $N$, the quantity that we computed at the end of the previous lecture. And then it has a typical large deviation shape. The first problem is that we don't know the range of validity in $\hat e$, because there will be some conditions from the positivity of $v$ that come from this cubic equation; this is very hard to find and it hasn't been done. This equation makes $v$ a function of $\hat e$, but we have a constraint on $v$, because $v$ must be positive. Remember that $v$ was this object, $\beta(1 - q)$.
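One way to draw the parametric curve without solving the cubic is to use $s$ itself as the parameter: for each $s$, compute $v(s)$ from the stationarity equation, then $\hat e = 1/[2 v (v + s)]$ and $L = -\hat e\, s + \frac{1}{2} \Phi(s, v)$. The following is a minimal sketch under the formulas quoted in the lecture; the positive-root solver, the parameter values and the range of $s$ are choices made here for illustration, and the sketch says nothing about the range of $\hat e$ over which the replica result is valid.

```python
import math

def v_of_s(s, alpha, sigma2):
    """Positive root of the stationarity equation, written as A v^2 + B v - C = 0.

    Assumes alpha (1 + sigma^2) > 1 and 1 + s (1 + sigma^2) > 0.
    """
    c = 1 + sigma2
    A = alpha * c - 1
    B = alpha * c * s - 2 - s * c
    C = 1 + s * c
    return (-B + math.sqrt(B * B + 4 * A * C)) / (2 * A)

def big_phi(s, v, alpha, sigma2):
    """Phi(s, v) = alpha log[1 + s (1 + sigma^2) / (1 + v)] - log[1 + s / v]."""
    return alpha * math.log(1 + s * (1 + sigma2) / (1 + v)) - math.log(1 + s / v)

def curve_point(s, alpha, sigma2):
    """Return (e_hat, L) for a given value of the Laplace parameter s."""
    v = v_of_s(s, alpha, sigma2)
    e_hat = 1 / (2 * v * (v + s))
    rate = -e_hat * s + 0.5 * big_phi(s, v, alpha, sigma2)
    return e_hat, rate

alpha, sigma2 = 3.0, 0.5  # illustrative values with alpha (1 + sigma^2) > 1
for k in range(-3, 11):
    s = 0.1 * k
    e_hat, rate = curve_point(s, alpha, sigma2)
    print(f"s={s:+.1f}  e_hat={e_hat:.4f}  L={rate:.5f}")
```

In this parametrization $s = 0$ is the zero of the curve, located at $\hat e = \frac{1}{2}\left(\sqrt{\alpha(1+\sigma^2)} - 1\right)^2$; negative $s$ gives values of $\hat e$ above it and positive $s$ values below it. The pairs can be passed to any plotting tool.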
So a positivity constraint on this object may impose some constraint on $\hat e$.
Yes. There is an argument of sign permanence: you can follow one root starting from a couple of limiting results that are known. You use a perturbative argument where you switch off one of the parameters, I don't remember exactly which, then you switch it back on as a small one, and by continuity you follow one of the solutions.
So it's a continuity argument?
Yes.
So there is a constraint that we cannot work out very easily from the cubic equation, from the sign. I know that in principle everything can be done, it is only a cubic equation, but with so many parameters in there it is hard to find a very explicit result.
We have a second potential problem here, which I have discussed with many of you already. For a very similar problem, the very same approach was used: this is a paper by Fyodorov and Le Doussal on a quadratic form plus a magnetic-field type of symmetry-breaking term. The minimization of that problem leads to a rate function, obtained using this method, that has been proven correct, but only up to a certain point; beyond that point it fails. This was established by Dembo and Zeitouni, who proved that in this very similar problem, treated with the same technique, the rate function computed using replicas is correct up to a certain point, but then the rigorously correct result follows another branch. We don't have the corresponding rigorous result for this class of problems, and that's what we really miss, because it would be very important to have another comparison.
This is one of the first examples that I know of where a full replica-symmetric calculation, with quite mild assumptions I would say, is rigorously disproven, at least in a range of the final rate function, and we don't know exactly what went wrong in the process.
Yes. For this problem it is certainly quadratic; this has been done. It has Gaussian fluctuations around the mean of $E_{\min}$ for this problem. I'm not entirely sure about the problem with the field; I would imagine so. The expansion around the minimum is done in Rashel's thesis. So yes, that is possible to do; it is an easier thing to check, of course.
So I hope I convinced you that this is a very rich problem that can be tackled from different angles. We have a nontrivial result, the fact that even an undercomplete system, in the presence of nonlinear constraints, can be incompatible, and we have a potential puzzle. So I urge the rigorous people who are listening to try to tackle this problem, because we really need a second example that confirms or disproves this calculation, even if only for a tiny region of the phase space.
With this I conclude. Of course I'll take any questions, and let me just say it's been a real thrill and a great privilege to be here, and I hope that you've enjoyed it at least as much as I did. Okay, thanks very much.