So we have seen the Kantorovich problem. You take two probability measures, mu and nu, and you minimize the total cost among all transport plans, that is, among all probability measures pi on the product space whose projection on the first component is mu and whose projection on the second component is nu. And then there is the other problem: you minimize among maps which are pushing mu forward onto nu. And this is what we called the Monge problem. So, these are our two problems. And we have seen something about the solutions of these problems. And, if you want, there is a nicer geometric interpretation, in terms of c-concavity. A c-concave function is an infimum of functions of the form c(·, y1) plus a constant, c(·, y2) plus a constant, and so on; each of them is a function of x.
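In symbols, the two problems recalled here are (with p¹, p² the projections of X × Y onto its factors):

```latex
% Kantorovich problem: minimize over transport plans
(K)\qquad \min\left\{ \int_{X\times Y} c(x,y)\,\mathrm{d}\pi(x,y)
   \;:\; \pi\in\mathcal P(X\times Y),\ (p^1)_\#\pi=\mu,\ (p^2)_\#\pi=\nu \right\}

% Monge problem: minimize over transport maps
(M)\qquad \min\left\{ \int_{X} c\big(x,T(x)\big)\,\mathrm{d}\mu(x) \;:\; T_\#\mu=\nu \right\}
```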
So take this function phi, which is an infimum of such cost functions; it is a function of x. At a point x you look at which function of the family touches the graph from above, and this gives you the notion of c-superdifferential. Note that, since we are dealing with concave-type functions, the right notion is the superdifferential, not the subdifferential as for convex functions. We have seen that the support of the optimal plan is contained in the c-superdifferential of some c-concave phi, and the question is when this c-superdifferential is actually contained in a graph, whether there is something really general here, a general theorem stated precisely. So we will work on Rn, and later on Riemannian manifolds. So X and Y will be open subsets of Rn. So I gave you an argument, which was wrong, but morally right, to show why you expect, or when you expect, the support of pi to be included in the c-superdifferential of a c-concave function. And the argument was simple: imagine that c is smooth. Actually, you just need c to be C1 to run the argument. So say c is C1, and then you know that if a point y belongs to the c-superdifferential of phi at the point x, then a certain function of x has a maximum at that point. So if you differentiate — I mean, at a maximum point the gradient is 0 — you get that the gradient of phi at x equals the gradient in x of c(x, y). And then you can assume, which is just an assumption you put on the cost, that you can invert this relation.
I mean, you can find, you can uniquely determine y by this relation, and this gives that the whole c-superdifferential is a graph. But actually, what is failing here is that you don't know that you can differentiate phi. Phi is just an infimum of cost functions. So it's an infimum of C1 functions, or an infimum of C-infinity functions. But you don't have a uniform... no, actually you do. But the point is, an infimum of C1 functions cannot be better than Lipschitz. Take C1 functions with uniform C1 norm; the infimum is going to be Lipschitz. So no hope. And even if the functions are C-infinity with bounded C-infinity norm and you take an infimum, you cannot get better than Lipschitz. At least, you cannot get a C1 function. There could be corners. So what is the point? The point is that you get a Lipschitz function, and Lipschitz functions are differentiable at a lot of points — at almost every point with respect to the Lebesgue measure. This is Rademacher's theorem. So in some sense, I can run this argument, but I just have to throw away some bad points. But if I throw away points, I have to be careful that I'm not throwing away points which my measure sees. I mean, if there is just one point where the function is not differentiable, and my starting measure mu is a delta sitting at that point, it's clear that I cannot throw away that point. So if I throw away a Lebesgue-negligible set of points, I need that this is a mu-negligible set of points. So I just have to ask that mu is absolutely continuous with respect to the Lebesgue measure. So the theorem: you take X and Y open bounded subsets of Rn, mu which is absolutely continuous with respect to the Lebesgue measure, and nu, which is just a probability measure — I have no assumption on nu. And then I ask that my cost c is, say, C1, and that I can uniquely determine y by a relation like that, so that the map y into the gradient in x of c(x, y) is injective. And this is usually referred to as a twist condition.
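The heuristic argument and the twist condition, written out (assuming phi is differentiable at x):

```latex
y \in \partial^{c}\varphi(x)
\;\Longrightarrow\;
x' \mapsto \varphi(x') - c(x',y) \text{ has a maximum at } x'=x
\;\Longrightarrow\;
\nabla\varphi(x) = \nabla_x c(x,y).

% Twist condition: y \mapsto \nabla_x c(x,y) is injective, so one can solve for y:
y = \big(\nabla_x c(x,\cdot)\big)^{-1}\big(\nabla\varphi(x)\big) =: T(x).

% Example: for c(x,y)=|x-y|^2/2 one has \nabla_x c(x,y)=x-y, hence T(x)=x-\nabla\varphi(x).
```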
Never understood why, but I think it comes from dynamical systems. I don't really understand the reason for "twist"; probably because you have to twist y and x. So this is the twist condition. Then, under all these assumptions, there exists a unique solution to the Kantorovich problem, and this unique plan is given by a map. So this solution is also a solution of the Monge problem. And I can also write it down. The map is just this: T(x) equals the inverse of the gradient in x of c(x, ·), computed at the gradient of phi at x. So here I'm inverting this function as a function of y, for x fixed, and I evaluate it at the gradient of phi. And phi is the Kantorovich potential for the problem, so it depends on the data. So the proof of existence is essentially the one I gave you, because you know that phi is Lipschitz — it is a pretty easy exercise that with these assumptions phi is Lipschitz. So there exists a set A included in X such that the Lebesgue measure of X minus A is 0, and phi is differentiable at every x in A. And if you have this, you also have it, by our assumption, for every admissible plan. So now you take the optimal plan pi bar. You know that its support is included in the c-superdifferential — let me be short here: this is the graph of the c-superdifferential of phi, for some phi which is c-concave. And now you know that you can run the argument I did before for every x in A. Because for x in A, phi is differentiable. If you have a function which is differentiable and has a maximum, then the gradient is 0, and so on. So I know that the part of pi which is above A is a graph. So the argument before tells you that pi over A is a graph; the right formal way to say it is that pi of (X times Y) minus (A times Y) is 0.
So pi — no, so the right way to write this is that pi of the complement of the graph of T is 0 for some function T: pi is concentrated on a graph. Because you know that you have a graph over a set, and you know that the rest is negligible. So it can happen that you have here a vertical part, say, which is actually what happens. But you know that this point is mu-negligible, so you don't really care. So this is the support of pi, the one I drew, which is included in the c-superdifferential. And you can have some part where you are not a graph, but actually this part is negligible — it has a negligible projection. That's the idea. So this is the way you prove existence, and the fact that the map is given by this formula follows from the argument I did before. So let me prove existence — sorry, let me prove uniqueness; we proved existence. I told you that there is always a kind of small programme when you pose yourself a variational problem: existence, uniqueness, and properties of the solution. The three basic questions, let's say. And so we have existence, in this case. For uniqueness, uniqueness is a nice argument. So I was claiming more before: I was claiming that there is a unique solution to the Kantorovich problem. And why is there a unique solution to the Kantorovich problem? Well, the reason is that this argument shows that every optimal plan is concentrated on a graph. Not just that there is an optimal plan which is concentrated on a graph — which would be sufficient to prove the existence of a solution to the Monge problem. Here you get more: you get that every optimal plan is concentrated on a graph. And if you know that every optimal plan is concentrated on a graph, you automatically have uniqueness. And the reason is this one. Suppose you have two solutions, pi 1 and pi 2, solutions of (K). You know that there exist two maps, T1 and T2, such that pi 1 lives on the graph of one map and pi 2 lives on the graph of the other map.
But now, you remember that the problem is convex in pi — actually linear in pi. So (pi 1 + pi 2)/2 is a solution as well: if these are two solutions, this is a solution as well. But this cannot be induced by a map unless the two maps are the same. Because, you see, you have the graph of T1 and the graph of T2, and what this plan (pi 1 + pi 2)/2 is doing is sending the point x half here and half there. So it's obviously not given by a map. OK, so let me state an important corollary, which is just a particular case of this theorem. But I like it — and not only me. I mean, Richard has been showing you that when the cost is the distance squared, on a Riemannian manifold or on a metric space and so on, you have a lot of geometric properties of the space which are encoded in the optimal transportation problem. So let me show you first what happens when the cost is the distance squared in Rn. So the corollary — I call it a theorem, because it has all the dignity of a theorem — is the theorem due to Brenier. So the assumptions: the cost c is the distance squared, |x − y|²/2; then you have your probability measure mu, which is absolutely continuous with respect to the Lebesgue measure, nu, which is a generic probability measure, and X and Y, which are open, say bounded, sets. Then there exists a unique T solving the Monge problem. And this is what we have proved before, because c satisfies the assumptions of the theorem: c is as smooth as you want — it's analytic — and it satisfies the twist condition. So there is a unique solution.
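The fact that the quadratic-cost optimal plan is induced by a (monotone) map can be checked by hand in a discrete one-dimensional toy example. This sketch uses brute force over permutations — by Birkhoff's theorem, the extreme points of the set of plans with uniform marginals are permutation matrices — with made-up random data:

```python
import itertools
import random

random.seed(0)
n = 6
xs = sorted(random.random() for _ in range(n))   # source points (mu, uniform weights)
ys = sorted(random.random() for _ in range(n))   # target points (nu, uniform weights)

def cost(perm):
    # Monge cost of the map x_i -> y_perm(i) for quadratic cost c(x, y) = |x - y|^2
    return sum((xs[i] - ys[perm[i]]) ** 2 for i in range(n))

# By Birkhoff's theorem the extreme points of the set of plans with uniform
# marginals are permutation matrices, so minimizing over permutations solves (K).
best = min(itertools.permutations(range(n)), key=cost)

# For quadratic cost in 1D the optimal plan is the monotone (sorted) coupling,
# i.e. the identity permutation once both point sets are sorted.
assert best == tuple(range(n))
print(best)
```

The optimal plan is a single permutation, that is, the graph of a map — the discrete shadow of the uniqueness argument above.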
And this solution, that's what I would like to point out, is given by the gradient of u — which this time is different from phi, because usually I exchange u and phi, but this u is not phi. U is convex, plain convex, OK? It is the gradient of a convex function. And you also have this Monge–Ampère-type equation for the determinant of the Hessian of u — wait, I need something more. Up to now I know just that mu is absolutely continuous with respect to the Lebesgue measure. So if also nu is absolutely continuous with respect to the Lebesgue measure — so if you can write mu as f dx and nu as g dy, meaning that there are densities with respect to the Lebesgue measure — then this map is a solution of this equation: det D²u = f / (g ∘ ∇u), OK? So in some sense u solves a PDE, which is a Monge–Ampère-type PDE. Later I will give you some comments on the sense in which this PDE is satisfied, OK? Because when I am going to talk about regularity, it's important to understand in which sense this PDE is satisfied. OK. So in this second part of the theorem there is an extra assumption: I assume that nu is also absolutely continuous with respect to the Lebesgue measure, and I get this additional conclusion. And it's the interesting case, in some sense. OK, the proof of part one is the previous theorem: this was proved before; just notice that you are in the hypotheses of the previous theorem. For proving part two, you know by the previous theorem that T is given by inverting y into the gradient in x of c(x, y) at the gradient of phi at x, where phi is c-concave. And now, what is this map? You know c(x, y) is |x − y|²/2. So when I take the gradient in x of c(x, y), it is just x − y, right? So here, if you do the computation, you are writing x minus the gradient of phi at x, OK? For phi which is c-concave — sorry, not concave, c-concave. So, for sure, this is the gradient of |x|²/2 minus phi.
And so the only thing I have to prove is that if phi is c-concave, where c is the cost over there, then |x|²/2 minus phi is convex. And this is an exercise for you — it's just playing with sup and inf, OK? Nothing else. You just write what it means that phi is c-concave, that it is an infimum of cost functions, and so on. |x|²/2 minus phi will be a sup of something. You are going to cancel the |x|², and you get that you have a supremum of linear functions. When you expand the square here, you have |x|², then you have something which is linear in x, and then you have a constant — I mean, a function of y, which is constant in x. So the |x|²/2 is there just to kill the |x|², and you are left with something linear, OK? So that's 1 and 2. Well, 3 — the real proof needs some tools; the proof is straightforward, but the tools you need are not. So let me do the proof in the case where T, which is, I recall, the gradient of u, is injective and smooth — there is no reason for this to be true, and this is exactly the problem, OK? So I prove 3 only in this situation, but then it's simple. Recall that mu was f dx, nu was g dy. So knowing that T pushes mu forward onto nu is the same as saying that for every test function this equality holds: the integral of the test function composed with T against f dx equals the integral of the test function of y against g(y) dy. That is just writing down what the push-forward relation means. And then you just change variables here, y = ∇u(x). And if you do everything right, here you get g of ∇u(x), and here you get the determinant of the Hessian of u at x. So you have that this equality holds for every test function — and let me call the test function psi, because phi was already a pretty important thing — for every test function psi, OK? So if ∇u is injective, in some sense you can cancel out — I mean, the test function is arbitrary. But when you compose the test function with ∇u, you are not sure that you are getting an arbitrary function.
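In one dimension the change-of-variables identity can be checked explicitly: for quadratic cost the optimal map is the monotone rearrangement T = G⁻¹ ∘ F of the distribution functions, and the Monge–Ampère equation reduces to f = (g ∘ T) · T′, with T′ playing the role of det D²u. A minimal sketch, with a made-up pair of densities chosen so that everything is computable in closed form:

```python
import math

# Source density f = 1 on [0, 1]; target density g(y) = 2y on [0, 1].
# (This particular pair is an assumption, chosen so that T is explicit.)
f = lambda x: 1.0
g = lambda y: 2.0 * y

# Monotone rearrangement: F(x) = x, G(y) = y^2, hence T = G^{-1} o F, T(x) = sqrt(x).
# T is the gradient of the convex potential u(x) = (2/3) x^{3/2}.
T = lambda x: math.sqrt(x)

# Monge-Ampere equation in 1D: f(x) = g(T(x)) * T'(x), where T' = u''.
h = 1e-6
for x in [0.1, 0.3, 0.5, 0.9]:
    Tprime = (T(x + h) - T(x - h)) / (2 * h)      # finite-difference T' = u''
    assert abs(g(T(x)) * Tprime - f(x)) < 1e-6
print("Monge-Ampere identity f = g(T) * T' verified")
```

Here ∇u = T is injective (strictly increasing), so the cancellation of the arbitrary test function is legitimate, exactly as in the argument above.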
If ∇u were constant, for instance, this relation is not going to tell you much. I mean, when you have a relation like that for a lot of test functions, then you know that it becomes a pointwise relation — but you need to use the arbitrariness of the test function. And here we have a test function which is psi composed with ∇u, so you need that to be arbitrary. And if, let's say, ∇u is injective, this is true. So you have a lot of test functions, and you can say that f equals (g composed with ∇u) times the determinant of the Hessian. And this proves the whole Brenier theorem. So, a comment on this last part. This proof is correct. But — OK, maybe this is a remark I'll make more precise next time; I want to start on the regularity part. OK. So in some sense, you need to understand what this is: when you have a convex function which is not C2, what is this Hessian? There are several ways to define the Hessian, stronger or weaker — it depends. And I will show you that this is a pretty weak relation. And this is important, because you would like to use this relation to get regularity of your map: now you know that the optimal map is the gradient of a convex function which solves an equation like that, which is a PDE. So you can hope to get regularity. But I'll show you that at this level, even if f and g are smooth, you cannot get regularity, because this determinant of the Hessian, the one I get from this argument, is too weak an object. Then I'll show you that, under some assumptions on the supports, you get more. But the comment is that when you make this derivation rigorous, you see what the right object is, and you see that that object is really weak. It's like a pointwise derivative.
You know that a pointwise derivative cannot encode too much information on the function. Think of the Cantor function, which has pointwise derivative 0 almost everywhere, like a constant function — and a constant function is the best function you can think of — but the Cantor function is far from constant, so it's a mess. So at that point you have just a pointwise derivative, which is not that good. OK. One remark, which is probably stupid, but I like it as well. We have seen that T is the gradient of u, in the case the cost function is |x − y|²/2, for u convex. OK. Someone who is a geometer can argue that gradients are not maps: gradients are vectors, not points. In Rn you have this mess that vectors and maps, points and vectors, you can kind of confuse. But the gradient, whatever it is, has to be a vector, not a point. So here the right way was the one I was writing before, which is x minus the gradient of phi at x, for phi which is c-concave. And this is the right expression, because this is a point and this is a vector. And if you sum a vector to a point, this is a translation, and this is a map from Rn to Rn. And a very fancy way to write down this expression is saying that this is the exponential at x of minus the gradient of phi, where exp_x of a vector v is just x + v. And this is the exponential map in the sense of Riemannian geometry, in Rn: it sends a vector v to the point which is on the geodesic starting from x with velocity v, at time 1. And this is the same thing which is going to happen on a Riemannian manifold. It was just a comment to prepare the theorem. But before the theorem — not a real break, but a mathematical break — I will show you how to apply all this. Up to now it is just theory, so let me give a nice application of this theory, showing you why, for instance, it can be interesting and useful to solve the Monge problem, and an idea of how to write the equivalent theorem on a Riemannian manifold.
So, the equivalent of Brenier's theorem. But first, let me recall what the isoperimetric problem is. Well, you probably know. The isoperimetric problem is: among all domains with fixed volume, you look for the one which has the smallest perimeter. And you know that the solution is a ball. Everyone knows that the solution is a ball — probably even your grandmother and so on. Which is the same as: if you give someone a fixed amount of rope and ask him to enclose the largest possible area, he is going to draw a circle. So the isoperimetric problem is saying that among all sets E such that the measure of E equals the measure of the unit ball, the perimeter of E is greater than or equal to the perimeter of the ball of radius one. And actually it turns out that proving this is not completely easy, because there is a very nice mistake behind a certain kind of proof: "if you know that there exists a solution, then this solution is the ball." One of the most famous proofs, the Steiner proof, was exactly of this form — if you know that there is a solution, then the solution is a ball — but it was not trivial at all to prove that there exists a solution. And the problem is in proving the existence; and this is what, more or less, came later. So you can prove that you can get a solution of this problem. But I'm going to show you how to solve this problem in three lines. And this is a proof due to Gromov. So you have a set E as above. And let me take two probability measures. One is the Lebesgue measure restricted to E, divided by the measure of E, since I'm working with probability measures. And nu is just the Lebesgue measure on the unit ball, normalized again. And then, when you have this, you look at T, which is the gradient of the convex function u pushing mu forward onto nu. And this map is given by Brenier's theorem. So, by the way, Brenier's theorem came later than the proof of Gromov.
Gromov used a different map; his map is even easier. So u is convex, and we saw that u is going to solve this equation, which — with the notation I put before — reads: the characteristic function of E over the measure of E equals the characteristic function of B composed with ∇u, over the measure of B, times det D²u. These two measures are the same, so they cancel. And then you see that when x is in E, the left-hand side is one, and ∇u(x) is going to belong to B. Because, since ∇u is pushing mu forward onto nu, and nu lives in B1, this means that for almost every x in E, ∇u(x) belongs to B1 — you have to push one measure onto the other, so you cannot go outside. You cannot have a lot of points which are sent outside, because otherwise the measure living on those points is going outside. So you know that det D²u is one almost everywhere in E, OK? But since it's one almost everywhere on E, let me do this: that quantity is one, so I can say that n is equal to n times one, which is equal to n times the determinant of the Hessian of u to the power one over n. This is true in E: inside E I know that this determinant is one, so if I take the n-th root, it is still one. Now, this is just n times the product of the eigenvalues of the Hessian, to the one over n, right? The Hessian is a symmetric matrix, and it has nonnegative eigenvalues, because the function is convex, OK? So the lambda_i are nonnegative, where the lambda_i are the eigenvalues of the Hessian of u. But now this is a geometric mean, and this is going to be less than or equal to n times the arithmetic mean. And here you are using that these numbers are nonnegative, OK? The geometric–arithmetic mean inequality works for nonnegative numbers. Now I cancel the n. And what is this? This is the trace of the Hessian, right? The sum of the eigenvalues. And the trace of the Hessian is nothing else than the divergence of the gradient of u, if now I look at the gradient of u as a vector field, OK?
So you have all this inequality, and you just integrate it. You know that this inequality holds true in E, OK? So let us integrate it on E, and we get that n times the measure of E is less than or equal to the integral over E of the divergence of the gradient of u, right? But now, using the divergence theorem, you say that this is equal to the integral over the boundary of E of ∇u dot n — this normal, let me call it n, the exterior normal, since nu is taken, OK? It's just the Gauss–Green theorem. But now, this is less than or equal to the supremum of |∇u| times the perimeter of E. And you see that the supremum of |∇u| is less than or equal to 1, because ∇u takes values in the ball of radius 1. And you see that if you run the same argument on the ball, you have equality at every step: you are moving nothing. This map would be the one moving the normalized Lebesgue measure restricted to the ball to itself — it's not moving anything — and you get equality all along, OK? If E is a ball, you have equality everywhere: also in the arithmetic–geometric mean inequality, obviously, if all the lambda_i are equal to 1, it is an equality, right? I mean, there were just two inequalities in this proof: the arithmetic–geometric mean inequality, which is an equality if the numbers are all the same, and this one, which is an equality if pointwise the gradient has modulus 1 on the boundary of E — which means that the boundary of E is the boundary of the ball. So for the ball you have equality everywhere, and you get that n times the measure of B1 equals the perimeter of B1. Then you put together this and this, and you remember that E and B1 have the same measure, and this gives you the proof. And what is nice about this proof — apart from the fact that it is just a two-line proof — is that you proved the inequality with just, I mean, two inequalities.
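Written out as a single chain, Gromov's proof above is:

```latex
n\,|E| \;=\; \int_E n\,\big(\det D^2 u\big)^{1/n}\,\mathrm{d}x
\;\le\; \int_E \Delta u \,\mathrm{d}x
\;=\; \int_{\partial E} \nabla u\cdot n \,\mathrm{d}\mathcal H^{n-1}
\;\le\; \sup_E |\nabla u|\;\mathrm{Per}(E)
\;\le\; \mathrm{Per}(E),

% and since |E| = |B_1| and n\,|B_1| = \mathrm{Per}(B_1),
% this gives \mathrm{Per}(E) \ge \mathrm{Per}(B_1).
```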
You have a chain of equalities and inequalities with just two inequalities — this one and this one — and you know very well which are the equality cases in those inequalities. So this is the proof. Probably Alessio Figalli is going to use this proof to show you something about stability of the isoperimetric inequality. In the sense that you can ask what happens if my set is not solving the isoperimetric problem, because it's not a ball, but it is very close to solving it: it has a perimeter which is very, very close to the perimeter of the ball. Can I say that E is close to the ball? And in which sense can I say that it is close to the ball? Well, you can actually do it, and this is what Alessio is going to tell you about. And you see that the easier the proof of the inequality, the easier it will be to understand this stability issue, OK? You just have two inequalities, so you just have to understand what you are losing in those inequalities. I'm not claiming that it's trivial, but I'm just saying: the easier the proof, the better your starting point. This shows you that — I mean, there was no optimal transport in this problem, just the isoperimetric problem. But you used the fact that optimal transport provides you a good map. Essentially, the only thing you use about optimal transport is that it provides you a monotone map — a map whose gradient has nonnegative eigenvalues — moving a set onto the ball. So it's like you use the optimal transport problem as a tool to get a good map. This was the idea. So having maps is important, because these maps can be useful to prove something else, OK? And with more or less the same proof — I have no time to do it, but with the same ideas — you can prove the Sobolev inequality along the same path. And it's a very nice proof; you can find it, for instance, in the notes of Ambrosio and Gigli, or in the book of Villani.
And it goes along the same lines. So, we have seen that it is also useful to have, I mean, a theory. With the distance squared, we have seen what happens in Rn. But Richard's lectures started by saying that the distance squared is a cost which gives you a lot of information on the geometry of the underlying space, whatever the space is — I don't know, a general metric space. So I'm going to show you what happens when the space is a Riemannian manifold — I mean, to sketch at least a little bit whether we can solve the Monge problem with the cost which is just the distance squared, OK? So we have (M, g), a Riemannian manifold. I cannot give you too many reminders about what a Riemannian manifold is, but you probably know that when you have (M, g) — which for me is compact, at this level — you have this metric g, where g(x) is a scalar product on the tangent space at x, smoothly varying in x, OK? And you can associate to all this the volume measure, OK? Which is something that, when you write it down in charts, is just the square root of the determinant of g(x), that stuff. And obviously the distance between two points is just the infimum of the integral from 0 to 1 of the square root of g of gamma dot, gamma dot, right? Which is, with exactly the same proof Luigi gave yesterday, the infimum — sorry — among all curves gamma with gamma of 0 equal to x and gamma of 1 equal to y. And the square of this is the same, just by the Cauchy–Schwarz inequality and a reparameterization, as the infimum of the integral between 0 and 1 of g of gamma dot, gamma dot, among curves with gamma of 0 equal to x and gamma of 1 equal to y. And, just to write things more easily, the square root of g of gamma dot, gamma dot is just the modulus of gamma dot, right? You just have to compute it in each tangent space.
At time t you are in one place, and you compute that modulus; at time s you are in another place, and you compute the modulus with a different scalar product, and so on. And g of gamma dot, gamma dot is just the modulus squared, OK? Right. OK, so the infima are the same, and the infimum is achieved. And I'm more interested in this second infimum, because it is achieved by constant speed geodesics — I mean, by constant speed minimizing curves. So a constant speed minimizing geodesic between x and y is a solution — and what does "solution" mean? It is a curve achieving this infimum, OK? And you can prove easily that, in charts, it solves the following equation: gamma double dot k plus Gamma k ij gamma dot i gamma dot j equals 0, where the Gamma k ij are the Christoffel symbols, which I don't even try to write down. And it solves this. And how do you get this equation? OK, I'm not going to show you, but a good way to get it is to write down the Euler–Lagrange equation — and this is something I kind of need — the Euler–Lagrange equation of this functional. So this is what Luigi called the action functional yesterday. The action functional is A of gamma, which is, let me write it with the one half, OK? And without the square root. Then you would like to understand what happens if you take a variation of gamma. So, OK, you have gamma here, and then you take a curve which is very close to gamma: how big is its action? And for some reason, which I will explain to you later, I also need to understand the action of a curve which is very close to gamma but does not have the same endpoints as gamma, our original gamma. The idea is that, since everything is local, you can work in charts, write everything down, and you just have, in some sense, to understand what happens when you look at variations like gamma plus epsilon h, which does not make too much sense on a manifold, right?
Because you are summing, and there is no sum between points. So, OK. You can prove that this is the action of gamma plus epsilon h: the action of gamma, plus epsilon times an integral of something times h, plus epsilon times g at gamma of 1 of gamma dot of 1 and h of 1, minus epsilon times the same thing at 0. I'm not going to give you this proof; it's a nice exercise. Clearly, this something here takes into account that, I mean, this is written down for a general curve, right? This term here is saying that if you can move your curve in some direction, you can lower the action — so for a minimizer this term has to vanish, right? Because h is arbitrary: you cannot allow this thing to have a negative sign for some h. So, a geodesic is something which is minimizing with fixed extrema. Fixed extrema corresponds to saying that h of 1 and h of 0 are 0, right? In that case you don't have the boundary part. And you see that for a geodesic the integral term has to vanish: the term inside the square bracket, which I didn't write, has to vanish, and this gives you the equation. Well, it's not exactly those terms — I think it is those terms paired with the metric, or something like that, OK? So this term has to vanish. But the boundary terms, for a generic geodesic, do not have to vanish, because you are just asking that your geodesic is minimizing with fixed extrema. And it is actually pretty natural that they do not vanish. Think of a geodesic, and think that you make a variation where h of 0 is 0 — so I'm fixing the starting point — but I don't fix the ending point. Then you see that you increase the action, more or less, by this boundary term, right? You have that this one is gamma dot of 1 and this one is h of 1, and you look at the scalar product of these two. And this is how the action is going to increase or decrease.
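For reference, the first variation described here reads, in charts (with the integrand paired with the metric, as said):

```latex
A(\gamma+\varepsilon h) \;=\; A(\gamma)
\;-\; \varepsilon \int_0^1 g_{\gamma(t)}\!\Big(\ddot\gamma^{\,k} + \Gamma^k_{ij}\,\dot\gamma^i\dot\gamma^j,\; h\Big)\,\mathrm{d}t
\;+\; \varepsilon\Big[\, g_{\gamma(1)}\big(\dot\gamma(1),h(1)\big) - g_{\gamma(0)}\big(\dot\gamma(0),h(0)\big) \Big]
\;+\; o(\varepsilon).

% Fixed extrema: h(0)=h(1)=0 kills the boundary terms, and the arbitrariness
% of h forces \ddot\gamma^{\,k} + \Gamma^k_{ij}\dot\gamma^i\dot\gamma^j = 0.
```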
And if you think that the action is just the length, it's pretty nice, right? If you go in this direction, you are increasing it as best as possible: if you want to maximize this scalar product, you have to go in the direction of γ̇, and you are just continuing your geodesic. So that's the way you increase the action. OK, so I need this formula. And then another thing I need is the definition of the exponential map, which is simply a map from the tangent space to the manifold, exp_x : T_x M → M. It takes a vector v to γ_v(1), where γ_v(t) is the geodesic with γ_v(0) = x and γ̇_v(0) = v. So it's not a geodesic between two prescribed points — I mean, it is a geodesic, but I'm not prescribing the two points; I'm prescribing the initial position and the initial velocity. And this is OK, because the geodesic equation is a second order ODE, so I have to prescribe two conditions. OK, so this is the exponential map. Just notice that in R^n it is the map I wrote before, which just takes x and adds v. And this is the Euler-Lagrange equation for the action. With these tools, after a five-minute break, we can prove McCann's theorem, more or less — I mean, give a sketch of the proof of McCann's theorem. Just an errata corrige first. Before, I wrote that the distance is the infimum, among all curves joining x and y, of the length of the curve; and this is equal to the infimum of (∫₀¹ |γ̇_t|² dt)^{1/2} — with the square root, otherwise it's not homogeneous. And this is actually nice, because this way I can write, and I'm going to use, that one half the distance square between two points is just the infimum of the action, where the action is this with one half and without the square root: A(γ) = ½ ∫₀¹ |γ̇_t|² dt.
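As a concrete illustration of the definition exp_x(v) = γ_v(1) — not the lecturer's example; the round sphere S² is simply a case where the exponential map has a closed form — here is a minimal numerical sketch:

```python
import numpy as np

def exp_map_sphere(x, v):
    """Exponential map on the unit sphere S^2 embedded in R^3.

    x : unit vector (base point); v : tangent vector at x (v . x == 0).
    Geodesics of the round sphere are great circles traversed at unit
    speed, so exp_x(v) = cos(|v|) x + sin(|v|) v/|v|.
    """
    norm = np.linalg.norm(v)
    if norm < 1e-12:           # exp_x(0) = x: the constant geodesic
        return x
    return np.cos(norm) * x + np.sin(norm) * (v / norm)

# Starting at the north pole with velocity of length pi/2 toward the
# x-axis, the geodesic at time 1 lands on the equator at (1, 0, 0).
north = np.array([0.0, 0.0, 1.0])
v = np.array([np.pi / 2, 0.0, 0.0])
print(exp_map_sphere(north, v))   # ~ [1, 0, 0]
```

The same construction works on any complete manifold, just without a closed form: one integrates the geodesic ODE with initial position x and initial velocity v up to time 1.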
OK, so I can now state McCann's theorem, which is the analogue of Brenier's theorem on a Riemannian manifold. It tells you that if the cost function is c(x,y) = d²(x,y)/2, and you have μ absolutely continuous with respect to the volume measure, and ν a generic probability measure on M, then there exists a unique solution to (K). So the statement is the same as Brenier's theorem: there exists a unique solution to (K); this solution is given by a map, so it is also a solution to the Monge problem; and you can write this map down: T(x) = exp_x(−∇φ(x)), where φ is d²/2-concave. So the statement is exactly the same we had before. And you say, OK, since the statement is the same, just run the argument you did before. But there is an issue in running the argument I did before, and the issue is that the cost is not smooth anymore. The distance squared on a manifold is not smooth. For an example, say I fix y and look at d²(·, y) as a function of x: it is not C¹, and it's pretty easy to see why. Think, for instance, of M = S¹. Fix a point, say the "north pole", and ask: what is the distance squared of x from the north pole? You can parameterize your circle by θ, OK? If you try to draw the graph of the distance squared, what happens is that your distance starts growing, and you move until you reach the "south pole", which is θ = π. There the distance has a maximum, but you don't have zero derivative: if you move a little bit past it, the distance decreases at the same rate. The geodesic you came along is no longer minimizing, because past that point it's cheaper to arrive from the other direction. So the graph of the distance squared on [0, 2π] has an upper corner at π — not quite a cusp, just an upper corner. Away from it, you have a derivative.
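A quick numerical check of this picture (a sketch under the assumption that S¹ is realized as [0, 2π) with arc-length distance; the names are mine): the one-sided slopes of d²/2 at the antipode jump from +π down to −π, which is exactly an upper corner.

```python
import numpy as np

# Distance on S^1 (circumference 2*pi) from the fixed "north pole" theta = 0:
# you may go around either way, so d(theta) = min(theta, 2*pi - theta).
def d_to_pole(theta):
    theta = np.mod(theta, 2 * np.pi)
    return np.minimum(theta, 2 * np.pi - theta)

f = lambda theta: 0.5 * d_to_pole(theta) ** 2   # the cost d^2/2

# One-sided difference quotients at the antipode theta = pi:
eps = 1e-6
slope_left  = (f(np.pi) - f(np.pi - eps)) / eps    # -> +pi
slope_right = (f(np.pi + eps) - f(np.pi)) / eps    # -> -pi
print(slope_left, slope_right)
```

The left slope exceeds the right slope, so the kink is concave: any line with slope between −π and +π touches the graph from above there, while no smooth function can touch it from below — the picture behind the lemma that follows.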
So the derivative has a jump there, and you see that the function is not smooth. And you say, OK, but even the map φ was not a smooth map, so maybe you can throw the bad set away. Well, d²(·,y) is obviously locally Lipschitz, and your manifold is compact, so it's globally Lipschitz. Being globally Lipschitz, you know that for every y it is differentiable almost everywhere: for every y, I have a full-measure set of x where the distance squared is differentiable. So I could throw the bad set away — but then I change y, I have to throw away another set of points, and so on. And what happens is that I may throw away a lot of points, since I'm making an uncountable union of sets of measure zero. This is actually what happens in this example: for every point, the singularity of the distance is at the opposite point; so if for every point you throw away the opposite point and take the union over the whole circle, you are throwing away everything. So this does not work. But what does work is that these are, in some sense, the only singularities the distance can have. What I will try to show you now is that the only possible singularities of the distance squared are upper corners like the one before — the same singularities a concave function has. It's like a concave function plus a smooth perturbation. So here is the theorem — kind of a lemma. Take y in M, a point I think of as fixed, and look at the function f_y(x) = ½ d²(x,y); I'm looking at this as a function of x, with y fixed. Then what is true is that for every x there exists a tangent vector p(x) such that, formally, f_y(z) ≤ f_y(x) + ⟨p(x), "z − x"⟩ + C d²(x,z) — let me put quotation marks around "z − x".
So: "z − x", the scalar product of p(x) with that, plus an error term. Written this way the formula makes no sense, because on a Riemannian manifold I cannot subtract two points — I want a tangent vector, and z − x is not a tangent vector. But the way it makes sense is the following: the middle term is g_x(p(x), exp_x^{-1}(z)), where exp_x^{-1}(z) is the tangent vector to the geodesic starting from x and going to z. And since the manifold is compact, it's complete, so you know that this geodesic exists. So this makes sense; it's just the scalar product in the metric g_x. Then the error term is C d²(x,z). Now the formula makes sense: f_y(z) ≤ f_y(x) + g_x(p(x), exp_x^{-1}(z)) + C d²(x,z), OK. But notice this is not plain differentiability, because I only have an inequality, not an equality, right? What I'm saying is that at every point, the distance squared can be touched from above by a C¹ function, OK? It's like being differentiable from above, whatever that means. In this sense, it's differentiable from above. So you see that you can allow upper cusps, upper creases, like the one before, but you cannot allow a downward corner, OK? That is a forbidden singularity, because at such a point you cannot stay below a linear function, right? So the only possible singularities of the distance squared are singularities like the one I drew. And actually, I can also tell you what p(x) is: p(x) = −exp_x^{-1}(y). Recall that y is fixed in all this statement, OK? I'm looking at what happens at a point x, and exp_x^{-1}(y) is the velocity of the geodesic going from x to y; p(x) is the opposite of this vector, OK? So you have gradients from above only — yes? [Question:] Sorry, I think sometimes there are maybe two different directions that correspond to the same y.
Right — this p(x) is not unique. This is certainly what happens: you can have a lot of p(x), and one p(x) which works is this one. If you want, you can write p(x) ∈ −exp_x^{-1}({y}): you could have several possible velocities going from x to y, which is what he is saying. Say you have two geodesics; then I would have two possible velocities and two possible p(x), and they are all OK for this inequality, OK? This is exactly what happens in the picture: one slope is the slope of the geodesic coming from one side, the other slope is the slope of the geodesic coming from the other side, and in between you have a lot of possible planes you can put above. And the way you prove this lemma is just by using an expansion for the energy. So, we know that the distance squared between two points — let's call them y and z — over two is just the infimum of ½ ∫ |γ̇|² over curves starting from y and ending at z, right? Now take a minimizing geodesic between y and x. This geodesic is such that its action is exactly ½ d²(x,y). Then I take a point z close to x — think of the distance between x and z as being ε — and I look at the curve obtained by modifying the geodesic near its endpoint so that it reaches z instead. Now I know that ½ d²(y,z) is less than or equal to the action of this new curve, just because the distance squared is an infimum. But the action of this curve is the action of the curve joining y to x, plus the first-order term in ε — and the interior term is 0, since γ is a geodesic.
So in the previous expression, the first-order interior term vanishes because γ is a geodesic. And now I don't have fixed endpoints — I have a fixed endpoint at y, but not at z. So what remains is ε g(γ̄̇(1), h(1)), and then an ε² term. And you see that this is exactly what I have in the lemma. Here you can think of h(1) as an infinitesimal variation: it is the velocity of something connecting x with z — for instance a geodesic; I put this one, but it's not important which, because this is a first-order expression, so you only need to know the tangent vector. So that's the formula there: you have ½ d²(y,z) less than ½ d²(x,y) plus something linear plus something quadratic, and you have the right inequality. And actually this inequality is a genuine inequality when you have two minimizing geodesics. OK. So let us try to prove McCann's theorem. As usual, we take the solution of (K). We have two probability measures and the cost function; we run the Kantorovich problem; we find the solution. We know that there exists a d²/2-concave function φ such that the support of the optimal plan π is included — let me be sloppy here — in the graph of the c-superdifferential of φ, and so on. And what happens? Take (x, y) in the support of π. Then y belongs to the c-superdifferential of φ at x. Once we have this, we know that the function sending z to φ(z) − d²(z,y)/2 has a maximum at z = x, right? This is what it means to belong to the superdifferential. OK, so this is what we have.
Then: we have seen that it's harmless to throw away a set of points x where φ is not differentiable, OK? So up to throwing away a null set, I can assume φ is differentiable. But the point is that we don't know that d² is differentiable. So what happens here? I told you that a d²/2-concave function like φ is an infimum of functions of the form d²(·, y)/2 plus constants, right? And you see that, in some sense, by the lemma, the distance squared is C¹ from above: there is always a C¹ function touching it from above. But now you see that φ, our d²/2-concave function, is touched by the distance squared from above — which means that φ touches the distance squared from below. So you have a function which is C¹ from above, always; and at the points we are looking at, which are these contact points, it is touched also from below by a function which is differentiable, right? So you see that this cannot be a singular point. Geometrically it's pretty intuitive: the only singularities allowed for the distance squared are upper corners like that, and you cannot put a smooth function — C¹, not even smooth — touching an upper corner from below, right? So the contact points between the graph of φ and the graph of the distance squared cannot be singularities. The distance squared could have a singularity here, and I don't know, another there, and so on; things can be pretty nasty. But there cannot be a singularity at the contact point, because the only singularities you allow are upper corners, and it's impossible for an upper corner to be touched from below by something smooth.
So what is happening here — and if you want, you can try to prove it with the lemma I gave you before — is that if (x,y) belongs to the support of π, and φ is differentiable at x (which, throwing away some points, I can always assume), then d²(·,y) is differentiable at x. Just because the distance squared always has a C¹ function touching it from above, and at this point it also has a differentiable function touching it from below: you are trapped between two C¹ functions, and you have to be C¹. So now you can run the argument — you can differentiate. Using that x is a maximum point, you get ∇φ(x) = ∇_x (d²(x,y)/2). And we have seen before what this gradient is: it is −exp_x^{-1}(y), the opposite of the velocity of the geodesic connecting x with y. In this case — and this is related to your observation — once you know that the distance squared is differentiable, you can also prove that there exists a unique geodesic between x and y. If you think of the example, it's pretty easy: if you had two different geodesics, they would have to have different velocities, so you would have at least two slopes touching from above; since differentiability tells you there are not two slopes, you have just one geodesic. So the geodesic is unique, since d² is differentiable at x. And you see that the gradient of this function can be nothing else than this vector: the direction in which the function increases the most is the one continuing your geodesic flow. So ∇_x d²(x,y)/2 is nothing else than the opposite of the vector exp_x^{-1}(y).
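In formulas, the chain of identifications just made is (a reconstruction in the notation above):

```latex
\nabla\varphi(x) \;=\; \nabla_x \tfrac{d^2(x,y)}{2} \;=\; -\exp_x^{-1}(y)
\quad\Longrightarrow\quad
\exp_x^{-1}(y) = -\nabla\varphi(x)
\quad\Longrightarrow\quad
y \;=\; \exp_x\!\big(-\nabla\varphi(x)\big) \;=:\; T(x),
```

where the inversion of the exponential map is legitimate precisely because the geodesic from x to y is unique at points of differentiability.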
So this is the geometry behind this formula. Is it clear? So you have this formula, and now you just invert. The inverse exponential is well defined because we have a unique geodesic; just invert, and you get y = exp_x(−∇φ(x)), which is the claim of the theorem. So you have a map, and this map is given by that formula. So: we have seen that, given two measures, there are assumptions under which I can always solve the Kantorovich problem, and sometimes I can solve the Monge problem. And we have seen that solving the Monge problem, for instance, gives us a two-line proof of the isoperimetric inequality, which is nice. What I would like to tell you in these 20 minutes is how to see that, if you look at the whole set of measures on a metric space, and you endow it with a distance which comes from the distance you have, this can be geometrically and analytically meaningful and can give you a lot of information on the underlying space. This is what Luigi was saying at the beginning of the lectures yesterday: you can, for instance, characterize manifolds with positive Ricci curvature in terms of the convexity of some functional. I'll try to give you more of an overview, because there is not too much time, and next lecture I would like to talk to you about regularity of these maps — it's the issue I like. OK. So let's look at the following. X is our metric space — you can think of it as a Riemannian manifold, whatever, maybe not compact. And I look at the space of probability measures with finite second moment: ∫ d²(x, x₀) dμ(x) < ∞ for some x₀. If X is compact this is not so important, but in general you need it, because I'm going to use the Kantorovich problem with quadratic cost, and I would like things to be finite. This is our space, P₂(X).
And then, if I take two probability measures in this space, I can define a distance, which is called the Wasserstein distance. The Wasserstein distance between μ and ν is just the infimum, among all transport plans π, of ∫ d²(x,y) dπ(x,y) — and then, since I want a distance, I have to take the square root. This is a distance; it's a theorem, which I'm not going to prove. So (P₂(X), W₂) is a complete metric space: if you start from a complete and separable metric space and construct the Wasserstein space over it — this is what is called the Wasserstein space — you get a complete and separable metric space too. And just notice that the map which associates to x in X the delta mass sitting at x is an isometric immersion of X into the Wasserstein space. So some basic properties of X are inherited by the Wasserstein space. What I would like to see now — at least to sketch — is the relation between geodesics of X and geodesics of P₂(X). We have seen that geodesics in a generic metric space play the role that lines and segments play in ordinary two-dimensional Euclidean geometry. And you all know that you can prove a lot of nice things, a lot of nice theorems, just with two-dimensional Euclidean geometry in the plane. Since understanding the geodesics of a space gives you a lot of insight into the geometry of that space — for instance, you can characterize sectional curvature in terms of geodesic triangles — geodesics are an important tool in geometry. So you have a complete metric space, and you have seen yesterday that there is a well-defined notion of length, a well-defined notion of speed of a curve.
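For two uniform discrete measures with the same number of atoms, an optimal plan can be taken to be a permutation of the atoms, so the Kantorovich problem reduces to a linear assignment problem. A minimal sketch (the helper `w2` and the point sets are mine, not the lecture's notation), which also checks the isometric embedding x ↦ δ_x:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def w2(xs, ys):
    """W_2 between uniform measures on the rows of xs and ys (same count).

    For uniform weights an optimal plan can be taken to be a permutation,
    so the Kantorovich infimum is attained at an optimal assignment.
    """
    # cost[i, j] = squared Euclidean distance between xs[i] and ys[j]
    cost = np.sum((xs[:, None, :] - ys[None, :, :]) ** 2, axis=-1)
    i, j = linear_sum_assignment(cost)
    return np.sqrt(cost[i, j].mean())

# The map x -> delta_x is an isometric immersion of X into P_2(X):
a = np.array([[0.0, 0.0]])
b = np.array([[3.0, 4.0]])
print(w2(a, b))   # = |a - b| = 5.0
```

For general (non-uniform, unequal-size) measures one solves the full linear program over couplings instead, but the assignment case is enough to experiment with everything in this lecture.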
And so you can talk about geodesics in a generic metric space. So the question is: can you understand which are the geodesics among measures? In some sense this is interesting because, if you think about it: I give you two points and I tell you the distance between them is something, I don't know, three. But if I give you the geodesic between these two points, you know which is the shortest way to move from one point to the other — time by time, you are taking the optimal path. So here the analogue is: I give you two probability measures, I solve the Kantorovich problem, and I find the Wasserstein distance between these two probability measures. But which is the optimal way to move one probability measure onto the other — I mean, which is the path of particles going from the initial position to the final position? And this is encoded by the geodesics of the Wasserstein space. So the theorem assumes that X is a geodesic space. A geodesic space is one in which between every two points there is a minimizing geodesic, and the distance is compatible with curves — it's like a Riemannian manifold from this point of view. So geodesic space just means that d(x,y) equals the minimum, among all curves going from x to y, of (∫₀¹ |γ̇_t|² dt)^{1/2} — minimum, meaning that it is the infimum and that the infimum is achieved. So this is what it means to be a geodesic space: there is this compatibility between curves and distance, which is what makes it interesting to look at geodesics. Otherwise — I mean, if you put a distance which has nothing to do with curves, you can do it, but it's not meaningful. Actually, if you look at any Riemannian manifold immersed in some Euclidean space, and you restrict the distance of the Euclidean space to the manifold, you get a distance which is not so meaningful.
If X is a geodesic space, then a family of measures, a curve of measures (μ_t), is a geodesic between μ₀ and μ₁ if and only if there exists a measure η, which is a probability measure on the space Geo(X) of geodesics of X. So you have your space below; the geodesics of X form a space of curves, and you can construct probability measures on this space of curves — you can think of this as taking random curves. So you have η, a probability measure on this space. Let me write down the statement. What you need is the evaluation map e_t: it goes from the space of curves — in this case geodesics, or absolutely continuous curves — to X, and to every curve it associates its value at time t, so e_t(γ) = γ_t. Then (e₀)_# η = μ₀, our starting measure; at time 1 you end up on μ₁, so (e₁)_# η = μ₁; and then, obviously, μ_t, which is our curve of interest, is just (e_t)_# η for all t. And then you want that (e₀, e₁)_# η — this is a coupling built from η — is optimal. And as a consequence of all this, you get that the coupling is optimal between every two times: (e_t, e_s)_# η is a coupling between μ_t and μ_s, and it is optimal. So what is the idea? You have your measure μ₀ here and your measure μ₁ there, and the Kantorovich problem, an optimal plan, tells you how to couple the points in an optimal way. What this is telling you is that once you know that, you also know the geodesic: you just have to take the geodesic between each pair of coupled points. And this gives you a curve — I mean, this gives you a curve of measures.
Because, say this point is sent to that point, and so on; then you take geodesics, and you look at the picture at time t, right? You couple one measure with the other and move all particles along geodesics, and you look at the picture at time t. Well, what you are doing here is just writing down a geodesic in the Wasserstein space. So a geodesic in the space of measures is a measure on the space of geodesics: you are just choosing, among all possible geodesics, the ones given by this coupling. OK? And this is exactly how you prove the theorem. I probably don't have time, so let me just give you an idea of the proof, and then maybe some applications of this fact in the next lecture, before coming back to the regularity issue. So you need to construct this η. Say you know that (μ_t) is a geodesic. OK, just a remark before the proof. I will always look at minimizing geodesics — sorry, at constant speed minimizing geodesics. In general, a curve γ is a constant speed minimizing geodesic if and only if d²(γ_t, γ_s) = (t − s)² d²(γ₀, γ₁) for every t and s, if and only if the same holds with ≤ instead of equality. OK? This is a nice, easy exercise for you: you are saying that you are optimal between every two times, in some sense. So, I have this geodesic. Now let's start. I have this measure η, a probability measure on the space of geodesics; I know that μ_t is given by (e_t)_# η; and I also know that the coupling given by the extrema of these geodesics, (e₀, e₁)_# η, is optimal between μ₀ and μ₁.
And now I want to show that μ_t, given by that, is a geodesic. OK? So I start with a measure on the space of geodesics, knowing that the endpoints of these geodesics give an optimal coupling between μ₀ and μ₁, and I want to show that, in some sense, this is optimal for all times. And this is pretty easy, because I look at the Wasserstein distance squared between μ_t and μ_s. This is less than or equal to the integral of d²(x,y) against (e_t, e_s)_# η, which I can take as a coupling between μ_t and μ_s — since W₂² is an infimum, it is bounded by testing with any coupling. Then I just write out what this is: it is the integral of d²(γ_t, γ_s) dη(γ). But now I use the characterization above: all these curves are geodesics, because η is a probability measure on the space of geodesics, so this is nothing but (t − s)² times the integral of d²(γ₀, γ₁) dη(γ). And this, by definition of push forward, is (t − s)² times the integral of d²(x,y) against (e₀, e₁)_# η. And now I know that this coupling is optimal. What does that mean? It just means that this equals (t − s)² W₂²(μ₀, μ₁). And then you see that the characterizing inequality is satisfied by this curve, which means that this curve is a geodesic. In some sense, every curve below satisfies that inequality, and you are integrating over all these curves, so it is pretty clear that this has to be true. To prove the converse of the theorem, you have to understand what happens when you have a curve which you know is a geodesic, time by time, and you want to construct a measure which is concentrated on the space of geodesics and which is optimal between every two extrema. I mean, that's what we are asking.
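The computation just described, written in one chain:

```latex
W_2^2(\mu_t,\mu_s)
 \;\le\; \int d^2(\gamma_t,\gamma_s)\, d\eta(\gamma)
 \;=\; (t-s)^2 \int d^2(\gamma_0,\gamma_1)\, d\eta(\gamma)
 \;=\; (t-s)^2 \int d^2(x,y)\, d\big[(e_0,e_1)_\#\eta\big](x,y)
 \;=\; (t-s)^2\, W_2^2(\mu_0,\mu_1),
```

where the first step tests the infimum with the coupling (e_t, e_s)_# η, the second uses that η-almost every γ is a constant speed geodesic, and the last uses optimality of (e₀, e₁)_# η.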
Any time you take two times t and s, and you look at the distribution of points at time t and at time s — in some sense you project your geodesic onto the space — you want the coupling given by connecting these points with your geodesics to be optimal for the Kantorovich problem. So you have this μ_t, and what are you going to do? I'm not going to give a complete proof, just the intuition — the proof is easy because it's just behind this intuition, but then it's boring measure-theoretic stuff, so I'll just sketch it. I don't know if any of you has ever seen the construction of Brownian motion; it's kind of the same. So what are you going to do? You have μ₀ and μ₁. Then look at μ_{1/2}: what happens at the midpoint time? You have your measures. What you do is take an optimal plan π^{0,1/2}, optimal between μ₀ and μ_{1/2}. So I know how to couple points here with points there: for every x, I know to which y the mass is sent. Then do the same between μ_{1/2} and μ₁. And then you take a point x: you know it goes to a point y, and you take the minimizing geodesic between x and y. You just decide the coupling through the Kantorovich problem, right? Say, for simplicity, that you have a map, so every x goes to just one y; well, take that geodesic. Then you know that this y goes to some z by the other coupling; take that other geodesic, and so on. Then you keep subdividing: you construct μ_{1/4}, then μ_{3/4}, and so on — say that at the finer scale this point prefers to go there, and that one still goes here, and so on. So, by discretizing time, you are constructing measures on curves.
I mean, you take the induced measure on these curves, which are not exactly geodesics but broken geodesics, right? They are geodesics on each interval, but they could break at the junctions, and so on. But since you know, in some sense, that your curve was a geodesic in the space of measures, you can prove that in the limit you get a measure which is actually concentrated on geodesics, and such that μ_t can be represented as you want. So this is the idea behind the proof, and you can find the full proof, for instance, in the notes of Ambrosio and Gigli. The idea is this one: you discretize time, you look at the couplings given by the Kantorovich problem, you connect any two points coupled by the Kantorovich problem with a geodesic, and then you take, as measure on the space of geodesics, just the measure which sees these geodesics. And to see what that means, if you have two points it's pretty easy, right? If you have a delta here, and here α times a delta plus (1 − α) times a delta there, then the measure is just α times the delta on this curve plus (1 − α) times the delta on that curve. You know that this point sends mass α here and mass 1 − α there; you take the optimal curve from here to there; and you take α of one plus (1 − α) of the other. OK.
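The two-deltas picture extends directly to uniform discrete measures: match atoms optimally, connect matched atoms by geodesics, and take the picture at time t. A small Euclidean sketch (helper names and point sets are mine), checking the constant speed property W₂(μ_t, μ_s) = |t − s| W₂(μ₀, μ₁) along the resulting curve:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def w2(xs, ys):
    # W_2 between uniform discrete measures with equally many atoms,
    # computed via an optimal assignment (valid for uniform weights).
    cost = np.sum((xs[:, None, :] - ys[None, :, :]) ** 2, axis=-1)
    i, j = linear_sum_assignment(cost)
    return np.sqrt(cost[i, j].mean()), j   # distance and optimal matching

mu0 = np.array([[0.0, 0.0], [1.0, 0.0]])
mu1 = np.array([[0.0, 2.0], [1.0, 2.0]])
W, perm = w2(mu0, mu1)

# Displacement interpolation: move every atom along the (Euclidean)
# geodesic to its assigned target and look at the picture at time t.
mu = lambda t: (1 - t) * mu0 + t * mu1[perm]

# Constant speed check: W_2(mu_t, mu_s) = |t - s| * W_2(mu_0, mu_1).
Wts, _ = w2(mu(0.25), mu(0.75))
print(Wts, 0.5 * W)   # equal: 1.0 and 1.0
```

This is exactly the measure η concentrated on finitely many geodesics, viewed through the evaluation maps e_t; on a manifold one would replace the straight segments by minimizing geodesics.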