As before, I regard a solution u(t) of this nonlinear equation as a curve in a suitable Hilbert function space H, and accordingly I write the equation as an ordinary differential equation in that space: du/dt plus the linear part of the equation (something like the Laplacian), plus the nonlinearity f(u(t)), equals the random force η_ω(t); and usually I take the initial condition u(0) = 0. The crucial property for me is that this random force is bounded: the norm ‖η_ω(t)‖ is bounded by some constant for every t and every ω. Of course, I need some further properties of this random force. One of them is the following. In the space H, I take the eigenbasis formed by the eigenfunctions φ_j of the linear part of the equation, and I decompose my force in this basis. The j-th coefficient of this decomposition has the form b_j η_j^ω(t) — this is the usual game — and the summation runs from j = 1 to M. Either M is infinity, which is the relatively easy case, or M is a small number, say 2, 3, 4; this is where we really have to use optimal control, as I will explain today. The η_j are independent identically distributed random processes. Another notation: I denote by σ_r the segment [r−1, r] of length 1, where r is any positive integer, and I denote by η^r the restriction of my process to the segment σ_r, for r = 1, 2, 3, and so on.
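In display form, the setup just described reads (the operator A is a placeholder name for the linear part, and C for the bound on the force):

```latex
\begin{aligned}
&\dot u(t) + A u(t) + f\bigl(u(t)\bigr) = \eta^\omega(t), \qquad u(0)=0,\\
&\|\eta^\omega(t)\|_H \le C \quad \text{for all } t\ge 0 \text{ and all } \omega,\\
&\eta^\omega(t) = \sum_{j=1}^{M} b_j\,\eta_j^\omega(t)\,\varphi_j,
\qquad A\varphi_j=\lambda_j\varphi_j,\\
&\sigma_r = [\,r-1,\;r\,],\qquad
\eta^r = \eta^\omega\big|_{\sigma_r},\qquad r=1,2,3,\dots
\end{aligned}
```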
My other crucial assumption is that the processes η^r are also independent and identically distributed. Fortunately, there is at least one class of examples of processes which satisfy all the assumptions I need: random Haar series. They are described in the references I gave; in the very unlikely case that I have time at my last lecture, I can talk about them. Now, among these segments I single out σ_1 = [0, 1]. Shifting in time, I regard every process η^r as a random process defined on σ_1, that is, with argument running from 0 to 1. Next, I denote by E the space where the force sits when we consider what happens for t from 0 to 1. I denote by H_M the subspace where the force lives — the span of the first M basis vectors φ_1, …, φ_M — and the important space for me is E = L²([0,1], H_M). As I stressed before, this may sound illogical: my forces are bounded, they belong to the space L^∞, but I take an L² space as my main space. There are two reasons for this. First, it is very good to work with a Hilbert space. More importantly, if I took the space L^∞, I would arrive at a space which is not separable, and this is really bad. Now we pass from continuous time to discrete time. To this end, we consider the map S from H × E, where the forces sit, back to the space H: this is the mapping which sends the initial data and the random force on [0, 1] to the value of my solution at time 1.
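In symbols, the spaces and the time-one map just introduced are:

```latex
\begin{aligned}
&H_M = \operatorname{span}\,(\varphi_1,\dots,\varphi_M),\qquad
E = L^2\bigl([0,1],\,H_M\bigr),\\
&S\colon H\times E \to H,\qquad S(u_0,\eta)=u(1),
\end{aligned}
```

where u(t) solves the equation on [0, 1] with initial condition u(0) = u_0 and right-hand side η.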
Then we immediately see what we are really talking about: if I examine the values of my solution at times 0, 1, 2, and so on, I am talking about the system S defined by the relation u(r+1) = S(u(r), η^{r+1}), where η^{r+1} is the kick number r + 1; this holds for r ≥ 0, and of course u(0) = 0 is the initial condition. One more notation. I have to study the trajectories of this system as time r goes to infinity, and I adopt the notation u_r(u_0), r = 0, 1, …, for the trajectory of the system S. Last time we saw the following property — it is next to a tautology, but remarkable. If I consider the law of my solution at time 1, this law can be calculated in the following way: I take the pushforward under the map S of the measure which is the direct product of the distribution of the initial data v (let me call the initial data v for the moment; it will be a bit more elegant) with the measure ℓ, where ℓ is the distribution of the kick — the law of the kick number r does not depend on r, and by definition it is the measure ℓ. Since my kicks are bounded, the support of the measure ℓ is bounded in the space E, but I assumed more.
I assumed that the support of the measure ℓ is a compact set K inside the space E. This is a rather innocent assumption; it holds for quite a lot of examples. And of course the same relation tells us that the law of my solution at time k + 1 is the map S applied to the law at time k, direct product with the measure ℓ: D(u_{k+1}) = S_*(D(u_k) ⊗ ℓ); call this formula (M). So what does it mean? If I want to study the evolution of the distribution of my solution — and this is what I am really interested in: the behaviour of the distribution as k goes to infinity — we clearly see from formula (M) that this evolution depends not on the specific choice of the kicks, but only on the law of the kick. We also see that to calculate the distribution of my solution at time k + 1, we need to know only the distribution at time k, not at time k − 1 and so on. This means that the process u_k, k = 0, 1, …, is a Markov process with discrete time, that is, a Markov chain. It is a useful and easy exercise to take a rigorous definition of a Markov process from any book and check that this process is Markov; it is not a big deal. (Question from the audience: in which space does the kick live? It is just E.) Right, so now I start to impose restrictions. There are restrictions which are innocent; the first one is just regularity. Let me remind you my restrictions. Restriction (B1) — or (B1′), its counterpart in the stronger set of assumptions I impose — tells me that the map S maps H × E not just to the space H, but to some smoother space V, which is densely and compactly embedded into H, and that this map is C²-smooth. This is a sort of experimental fact.
If for some quasi-linear parabolic equation we know that the equation is well posed and we can prove that there exists exactly one solution, then the corresponding map is smooth, and usually analytic. So C²-smoothness is a rather innocent assumption. Assumption (B2) is the stability of zero; it has two parts. (a) The norm of S(u, η), for any η from the support of the measure ℓ (I do not care about right-hand sides which do not belong to the support of my measure), is bounded by γ‖u‖ + β, where γ < 1 and β is some constant. This is stability of zero, and again it is rather generic: it is true for a huge class of equations. (b) If I switch off the force, then ‖S(u, 0)‖ ≤ γ‖u‖. So look, roughly: if I have a quasi-linear parabolic equation of mathematical physics such that with zero right-hand side the equation is well posed, then all my restrictions hold for the solutions of this equation. On the contrary, if I take a good equation of mathematical physics and put white noise in the right-hand side, then often I cannot prove that the equation is well posed. So bounded noises are much more comfortable; it is more comfortable to work with them. Now, with all these objects in hand, let us define the number R* = β / (1 − γ); it is a finite quantity, as you see. Now comes an easy exercise — really easy: not two lines, maybe four lines. The exercise tells us the following: the closed ball in the space H of radius R* (this is my notation for the closed ball) is invariant for the system S.
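The "maybe four lines" are just the fixed-point computation for the affine bound in (B2):

```latex
\|u\|\le R_* \;\Longrightarrow\;
\|S(u,\eta)\| \,\le\, \gamma\|u\|+\beta
\,\le\, \gamma\,\frac{\beta}{1-\gamma}+\beta
\,=\, \frac{\beta}{1-\gamma} \,=\, R_* .
```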
That is to say, if I take u with ‖u‖ ≤ R* and η from the compact K which is the support of my measure ℓ, then ‖S(u, η)‖ ≤ R* as well. One says that the ball of radius R* is an absorbing set for my system; for quasi-linear parabolic equations it is always like this. Now look. Let us go back to property (B1′). From it we see that if ‖u‖ ≤ R* and η ∈ K, then S(u, η) is bounded in the space V: its V-norm is bounded by some constant C which depends on the support of the measure and on the number R*. And this is the end of the exercise — a trivial consequence of (B1′). Now let us define the set X, a subset of my main space H, as follows: I take the ball B_H(R*) in the space H, intersect it with the ball B_V(C) in the space V of radius C (with this constant C), and take the closure in the space H. Then, obviously, since V is compactly embedded in H, X is a compact subset of the space H. If you look at these two or three lines of definition, you will immediately see that X is invariant for my system S. So in reality my system evolves on a smoother and bounded part of the main space H: the life of a parabolic equation with bounded force often happens on some compact part of the space H. Because of this, I can consider the restriction of the system S to the compact set X. If I study this restricted system, then afterwards it is a rather easy exercise to derive literally the same consequences for the original system; it is practically the same, and in the references which I gave you, you can find how to do it — it is not really exciting. So from now on, I am going to talk only about the system S restricted to this compact set X.
So I will take initial data u_0 — or v, a random variable with values in the space X — and then the solution stays in X with probability 1, always. Now I want to understand what happens when time goes to infinity. For this I need a small piece not of probability but of measure theory: the dual-Lipschitz distance. This is a distance on the space of measures on M, where M is any Polish space, that is, a complete separable metric space. We will see that the space of measures on any Polish space is itself a metric space; one can introduce a number of different distances on it, but the one I will tell you now is very, very comfortable. First, a notation. For this Polish space M, I consider O(M), the set of all bounded continuous functions f on M such that the supremum norm of f is bounded by one and the Lipschitz constant of f is also bounded by one. Now comes the definition. For any two measures μ and ν on this metric space M, I define the dual-Lipschitz distance between μ and ν in the following way: I take the supremum, over all functions f from this set O(M), of the integral of f against the measure μ minus the integral of f against the measure ν. We immediately see that I could put a modulus around this difference and it would change nothing (replace f by −f), so I will not write the modulus. Also, since |f| is bounded by one, the distance between any two measures lies between zero and two. This is the trivial part. The highly non-trivial part is given by the following remarkable result.
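A concrete aside before that result: for measures with finite support, the supremum in the definition is a finite-dimensional linear program, so the distance can be computed exactly. A minimal sketch for measures on the real line — the function name and the use of `scipy` are my choices, not part of the lecture:

```python
import numpy as np
from scipy.optimize import linprog

def dual_lipschitz(points, mu, nu):
    """Dual-Lipschitz distance between measures with weights mu, nu on the
    finite point set `points` of the line: maximize sum_i f_i (mu_i - nu_i)
    subject to |f_i| <= 1 and |f_i - f_j| <= |x_i - x_j|."""
    n = len(points)
    c = -(np.asarray(mu, float) - np.asarray(nu, float))  # linprog minimizes
    A, b = [], []
    for i in range(n):
        for j in range(n):
            if i != j:  # one-sided Lipschitz constraint f_i - f_j <= d(x_i, x_j)
                row = np.zeros(n)
                row[i], row[j] = 1.0, -1.0
                A.append(row)
                b.append(abs(points[i] - points[j]))
    res = linprog(c, A_ub=np.array(A), b_ub=np.array(b),
                  bounds=[(-1.0, 1.0)] * n, method="highs")
    return -res.fun

d_near = dual_lipschitz([0.0, 1.0], [1.0, 0.0], [0.0, 1.0])  # delta_0 vs delta_1
d_far = dual_lipschitz([0.0, 3.0], [1.0, 0.0], [0.0, 1.0])   # delta_0 vs delta_3
```

For the two Dirac masses at distance 1 the result is 1, while at distance 3 it saturates at 2, in line with the bound "between zero and two" above.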
This is a theorem due to Leonid Kantorovich, a remarkable Soviet mathematician and economist who started working in the thirties and died in 1986. He was the first brilliant mathematician to receive the Nobel prize in economics — not Nash: Kantorovich. So this is one of his results. Okay, look. Fact number one: with this distance, the space of measures P(M) becomes a Polish space — complete and separable, not only a metric space. It is really a perfect distance: if we have any Cauchy sequence of measures, then the limit exists and is also a measure, and the space is separable. The second remarkable property is that this distance metrizes the weak convergence of measures. That is to say, a sequence of measures μ_n weakly converges to some limit measure μ if and only if the dual-Lipschitz distance between μ_n and μ goes to zero. This is extremely useful, because it is much, much better to work with this distance than with weak convergence of measures as such, which by itself is not at all so remarkable. Okay, so this is what we will work with. Now, as I said, I will talk about the system S restricted to the compact X. Therefore all the distributions of my solution will be measures supported by the compact X, and I can regard them either as measures on my space H or as measures on the space X. But there is another useful exercise: for the distances, it makes no difference.
The exercise is the following: if the measures μ and ν are supported by X, then the dual-Lipschitz distance measured in the space H is the same as the dual-Lipschitz distance measured using the distance of the space X. Hint for the proof: the Kirszbraun theorem. This is a very beautiful result (I am not sure about the spelling, but Google will tell you). What does the Kirszbraun theorem say? Take a space of finite or infinite dimension — just R^n, or a Hilbert space — and consider any subset Y at all, not necessarily closed, not necessarily compact. Assume we have a Lipschitz function on this set Y. Then Kirszbraun tells us that we can extend it to the whole space with the same Lipschitz constant. It is not at all easy to prove this result, but it rather easily implies what I told you. Right? Okay, now we have all the definitions and all the objects, and now — the main result. You see, it is a good result, so the statement is short and the proof is long. Maybe the statement is short also because my assumptions are not at all that big. Look. Assume that the assumptions which I have imposed hold: either the weak assumptions (B1)–(B4), or the strong assumptions (B1′)–(B4′). Assume this. Then we have the theorem. Theorem. Let u_0 and u_0′ be any two initial data in the space X. Then for any k the following holds. I consider the law of my solution with initial data u_0, and the law of another solution with initial data u_0′.
I consider these laws at time k and measure the dual-Lipschitz distance between the two. Then there exist a positive constant C and a number κ ∈ (0, 1) — thank you, yes, I should say it properly: there exists C > 0 and there exists κ between 0 and 1, κ smaller than 1, such that for any initial data the following holds — this dual-Lipschitz distance is bounded by C κ^k times ‖u_0 − u_0′‖. Okay, so this says precisely that the laws of any two solutions with deterministic initial data (we will soon see that "deterministic" is not important) converge: the distance between them converges to 0. Okay, Corollary 1. Corollary 1 is simply trivial; it immediately follows from the definition of the dual-Lipschitz distance. For any functional f which is bounded and Lipschitz on the space X, the following happens: if I take the expected value of f(u_k(u_0)) minus the expected value of f(u_k(u_0′)), then this difference is bounded by C κ^k ‖u_0 − u_0′‖; call this estimate (*). In physics, f of a solution is an observable; so any two observables with any two initial data converge to the same limit. A very easy exercise — really easy: if f is any C¹-smooth functional on the space X, then (*) also holds, with a constant C which depends on this functional. So I can take practically any reasonable functional; you see, it need not be assumed bounded, just because C¹-smooth functions on the compact X are automatically bounded. Right. Is this clear? So this is, finally, mixing. We have discussed mixing many times, but this is finally mixing on the blackboard.
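In display form, the theorem and its first corollary read:

```latex
\exists\,C>0,\ \kappa\in(0,1)\ \text{ such that, for all } k\ge 0
\text{ and } u_0,u_0'\in X,\\[2pt]
\bigl\|\mathcal D\bigl(u_k(u_0)\bigr)-\mathcal D\bigl(u_k(u_0')\bigr)\bigr\|_L^*
\;\le\; C\,\kappa^{k}\,\|u_0-u_0'\|\,;\\[6pt]
\bigl|\mathbf E\,f\bigl(u_k(u_0)\bigr)-\mathbf E\,f\bigl(u_k(u_0')\bigr)\bigr|
\;\le\; C_f\,\kappa^{k}\,\|u_0-u_0'\|
\tag{$*$}
```

for every f that is bounded Lipschitz (or, by the easy exercise, C¹-smooth) on X, with C_f depending on f.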
This again — my first question after the talk of Jean-Pierre was: after this, what else can be done? He said that we know that the density exists, but we know nothing about this density. It is similar here: to get some information about the limit — just wait, I need one more corollary, just to make us happy. Corollary 2. There exists a measure μ, supported by the space X, such that for any initial data v — a random variable with values in X — the following happens: if I take the law of the solution u_k with this random initial data v, then its distance to the measure μ obeys the same kind of estimate; it is bounded by C₁ κ^k times the dual-Lipschitz distance between the law of the random variable v and the measure μ. Call this (**). So asymptotically — this is a very clever estimate; it is not at all obvious, but it is not complicated either. Let me give you a scheme of the proof... I do not want to write it, because it will take time and draw your attention away from the main scene which I want to talk about. Step one is to use the Markov property to show that the sequence... no — okay, it will be safer to say it like this: in fact, the theorem is the main result, and you can find the derivation of this corollary in my paper, because to write the proof I would need one more notion and I have no time for this. The measure μ is called the stationary measure, and to study the properties of the stationary measure of a specific nonlinear equation which comes from
physics is a physical problem of prime importance. Okay. So now, finally, we have arrived at maybe the most interesting part of my lectures: I will show how to prove this result, and how the theory of optimal control enters the realm of mixing — what the two have in common. So, Section 4: proof of the theorem under the stronger assumptions (B1′)–(B4′). When the proof has been presented, at the end of the next lecture I will tell you about the additional efforts which should be made to prove the theorem under the weaker assumptions. Okay. To prove this, we will use Doeblin coupling, plus quadratic convergence, plus Markov techniques. It was really remarkable for me that here quadratic convergence is used in a form very close to the form used when we prove the Kolmogorov theorem which underlies KAM theory; maybe this is one of the best applications of the KAM technique. So now, what is Doeblin's idea? I just repeat that I pointed you to Wikipedia for a paper about Doeblin. Okay, so what is our task? Our task is to prove relation (2): that the dual-Lipschitz distance between the law of u_k(u_0) and the law of u_k(u_0′) converges to 0 as k goes to infinity.
Then, once we have proved this and analyzed our construction, we will see that the rate of convergence is exactly of the form C κ^k. So this is what we really want to obtain. Now comes the Doeblin coupling method for two equations. This is something very clever. Look: our phase space for the system S restricted to X is, of course, the compact X. Now, instead, take the phase space X × X and consider a new, doubled dynamics. The dynamics is a pair (u_k, v_k), k ≥ 0. I will tell you the initial data, and then how to calculate this vector at step k if you know it at step k − 1. The initial data you can guess: u_0 equals the given u_0, and v_0 equals u_0′. And now the dynamics — I will put it here because of its crucial importance. Relations (3): the pair (u_k, v_k) is updated from step k − 1 by the formulas u_k = S(u_{k−1}, η_k) and v_k = S(v_{k−1}, η′_k), for k ≥ 1, of course — for k = 0 we know who they are. Here η_k is not literally the same kick which I have in the original system; just wait for a moment. Now, who are these η_k, η′_k? This is the whole point; this is why Doeblin was really great. So look: the pair (η_k, η′_k) — I will sometimes write η′_k or η_k′; it is the same, of course — is a map.
It is applied to the argument (u_{k−1}, v_{k−1}, ω). That is to say, (η_k, η′_k) is a mapping from the space X × X × Ω_k to E × E: this guy sits in X, this guy sits in X, and ω sits in the probability space. Some time ago I explained that it is very good to say that the kick number k sits in an independent copy of the probability space; this is the safest way — then I am absolutely sure that kick number k is definitely independent of kick number k − 1, because it involves absolutely nothing about kick number k − 1. Okay. So this dynamics is essentially the same dynamics as before, with the difference that I cook the kick number k in some special way; and the second component is again another copy of the same dynamics, where again the kick number k is not literally the same one, but just another one. What are the restrictions on this map? They are the following. Let us consider the law of η_k when I fix u and v. Look: if I fix any u and v, then η_k becomes a random variable (by definition, a random variable is a measurable mapping defined on the probability space), so I can examine the law of this random variable — and this law must be ℓ. This is how it must be. So, you see, the new value u_k is obtained in the following way: I take the mapping S, apply it to the old position u_{k−1}, and I put in a kick distributed as it should be distributed. And similarly, of course, for the other guy: the law of η′_k(u, v, ·) is also ℓ. And this is true for any u, for any v, for any k.
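In display form, the doubled dynamics and the constraint on the kicks read (Ω_k denotes the k-th independent copy of the probability space):

```latex
\begin{aligned}
&(\eta_k,\eta'_k)\colon\; X\times X\times\Omega_k \;\to\; E\times E,\\
&u_k = S\bigl(u_{k-1},\,\eta_k(u_{k-1},v_{k-1},\omega_k)\bigr),\qquad
v_k = S\bigl(v_{k-1},\,\eta'_k(u_{k-1},v_{k-1},\omega_k)\bigr),\\
&\mathcal D\bigl(\eta_k(u,v,\cdot)\bigr)
=\mathcal D\bigl(\eta'_k(u,v,\cdot)\bigr)=\ell
\quad\text{for every fixed } (u,v)\in X\times X .
\end{aligned}
```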
What is the point here? The point is that these kicks η_k, η′_k are of course independent of everything in the past, but they are not at all independent of each other: they are strongly correlated. You see, here comes kick number k, distributed as it should be distributed; here comes another kick number k, also distributed as it should be distributed; but these two guys are not at all independent — they are heavily dependent. And this is Doeblin's idea. (And yes, I don't remember if I said it: of course, this is a measurable map.) Now comes the following lemma, which is the whole point. Since each step uses its own copy of the probability space, I denote the points of the k-th copy by ω_k. Now watch. First: the pair (u_k, v_k) depends only on the initial data u_0, u_0′ and on ω_1, …, ω_k. So it does not depend on the future — simply look at how I have constructed it. Second — and this is why this is really very good: the law of u_k equals the law of my solution u_k with initial data u_0, and the law of v_k is, of course, the law of the solution u_k with the other initial data u_0′. You see what I have done. And property number three: the dynamics (3) which I have here defines a Markov chain in the space X × X, in the sense that the distribution of the pair (u_k, v_k) depends only on the distribution of (u_{k−1}, v_{k−1}) — not on the choice of this random variable. Okay, right.
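Compactly, the three properties of the lemma are:

```latex
\begin{aligned}
&\text{(1)}\quad (u_k,v_k)\ \text{is a function of}\ \bigl(u_0,\,u_0',\,\omega_1,\dots,\omega_k\bigr);\\
&\text{(2)}\quad \mathcal D(u_k)=\mathcal D\bigl(u_k(u_0)\bigr),\qquad
\mathcal D(v_k)=\mathcal D\bigl(u_k(u_0')\bigr);\\
&\text{(3)}\quad \mathcal D(u_k,v_k)\ \text{depends only on}\ \mathcal D(u_{k-1},v_{k-1}).
\end{aligned}
```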
Now, you see, (1) is obvious, and (2) is almost obvious; let me show you just the first step — in the general case the proof of (2) does not use (3), so I do not care about (3), I care about (2). Let k = 1; let me show you how this machine runs. Consider the law of u_1. But u_1 = S(u_0, η_1), where u_0 is a constant and η_1 is a random variable with law ℓ; therefore the law of u_1 is nothing but the image under the mapping S of the delta measure at the point u_0 times ℓ — and this is precisely the same formula as for the law of my solution, formula (M), which is somewhere up there. To prove this for k bigger than 1, I have to repeat this calculation for the case when u_0 is not a constant but a random variable; nothing big happens — just use (M); this property denoted by the letter (M) is the only thing being used. Okay, now it is clear what the strategy of the proof should be. From here I have an absolutely trivial consequence: the distance between the laws of u_k(u_0) and u_k(u_0′) equals the distance between the law of u_k and the law of v_k. So now, to prove relation (2) — that this distance quickly converges to 0 — I will simply prove the same statement for the doubled dynamics, by choosing the coupling map (η_k, η′_k) in such a way that this becomes more or less clear in one way or another. Now let us try to push this program forward. This is what Doeblin was doing when he invented his method; of course, he was applying it to finite-dimensional systems. It is very easy to explain what it means, but I have spent too much time on extras, so I simply have no time to repeat it. It is explained rather well in my book: we have there, in the
index, "Doeblin" — so just go to "Doeblin" and you will see. (Question from the audience about u_k.) You will see that it takes just three lines; I do not have time for these three lines, because to explain them properly would take at least five minutes. No, this is precisely correct. You see, the point is that it should be clear — obvious, well, in fact not obvious — that the law of (u_k, v_k) depends only on the law of (u_{k−1}, v_{k−1}); write it down, use (M), and just repeat the proof. Okay, maybe for the beginners I will start the next lecture by proving this; I will see. Okay, so look, let us start with a first naive try — by the way, I think it works for some systems, but definitely not for the equations which I keep in mind. The first naive try is the following. Choose the map (η_k, η′_k) in such a way — and now watch — that for every (u_{k−1}, v_{k−1}) in the compact space X × X we have the relation (*): the norm of S(v_{k−1}, η′_k) minus S(u_{k−1}, η_k) — but this is nothing but v_k minus u_k — is bounded, in the distance of the space X, by one half of ‖u_{k−1} − v_{k−1}‖. So the distance shrinks twice at each step — not necessarily twice: it would work with any constant smaller than 1. If we can achieve this, then we immediately have that, for any choice of the random parameter ω, the distance between u_k and v_k is bounded by 2^{−k} times the distance between the original points; but the distance between any two points of X is bounded by the diameter of X. So this is bounded by 2 to
−k times twice the diameter of the compact set X. If so, then the random processes u_k and v_k converge to each other fantastically quickly for every ω, and deriving the required property (2) from this is a trivial exercise. The only problem is that this is too good: I cannot achieve it. But this naive approach is idealistic, not stupid, because now I will show you what we actually can do. So, this is the second try. First, let us abbreviate: denote u_{k−1} simply by u and v_{k−1} by v, just to shorten notation. One more piece of notation: d = dist(u, v). We have to consider two cases: either this distance d is small, d < δ, where δ is a small constant which we will have to construct by choosing everything around it; or the distance is big, d ≥ δ. Either we start with two close points, or with two distant points. Now a question: which case do you think is more complicated? [Answer from the audience.] No. You see, I simply want to draw your attention to what happens here, because the real difficulty is the first case, when the distance is small. But to explain why the second case is easy, I first have to explain the first case. So look, we now consider the first case, when the distance d is smaller than δ; and δ will be very, very small. This whole game is about a very small parameter — our control is very poor. Of course δ < 1, I stress.
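The naive contraction (★) can be sketched in a toy model. Here S is a hypothetical one-dimensional stand-in for the solution operator, chosen so that it contracts pairwise for every noise sample; two trajectories driven by the same noise then merge exponentially fast.

```python
# Toy illustration of the "first naive try": a random map S that contracts
# pairwise, |S(v, eta) - S(u, eta)| <= (1/2)|v - u| for every noise sample.
# This S is a hypothetical stand-in, NOT the PDE solution operator.
import random

def S(u, eta):
    return 0.5 * u + eta   # toy contracting random map on the real line

random.seed(0)
u, v = 0.0, 10.0           # two initial conditions, distance d_0 = 10
dists = []
for k in range(20):
    eta = random.gauss(0.0, 1.0)   # one noise sample, fed to BOTH copies
    u, v = S(u, eta), S(v, eta)
    dists.append(abs(u - v))

print(dists[:3])   # the distance halves at every step: 5.0, 2.5, 1.25, ...
```

After k steps the distance is 2^{−k} times the initial distance, which is the super-clean convergence the lecture says cannot actually be achieved for the equations in question.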
δ is to be chosen below. Denote by D_δ the δ-neighbourhood of the diagonal: the collection of pairs (u, v) belonging to X × H — well, most likely you will not see why the second factor should not also be the compact set, but it does not matter, this is not really important — such that ‖u − v‖ ≤ δ. Then, before we start, one more word, one more very good word. Let μ and ν be measures on the same measurable space M. Definition: a coupling for the pair (μ, ν) is a pair of random variables (ξ, η), defined on some probability space, such that the law of ξ equals μ and the law of η equals ν. In the first and second lectures I told you the basic fact that every measure can be realized as the law of some random variable. But now, for two measures, I choose any two random variables such that their laws are these two measures. What the underlying probability space is, is of no importance; as the strategy runs, we will be changing the probability space systematically, so it does not matter. So now watch. When u and v are fixed, then (η_k, η'_k) is some coupling for the pair (ℓ, ℓ), because the first is a random variable with law ℓ and the second is a random variable with law ℓ. So, for any fixed pair (u, v), we are looking for a coupling for the pair (ℓ, ℓ).
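The definition of a coupling is easy to illustrate in miniature. Below is a standard toy construction (not the function-space coupling of the lecture): both Bernoulli laws are realized on one probability space, using a single uniform variable, and the two components are strongly dependent.

```python
# A coupling of mu = Bernoulli(p) and nu = Bernoulli(q): realize BOTH laws
# on one probability space via a single uniform random variable U.
# Illustrative toy; the lecture's coupling lives in a function space.
import random

def coupled_pair(p, q, rng):
    U = rng.random()
    xi  = 1 if U < p else 0   # the law of xi  is Bernoulli(p)
    eta = 1 if U < q else 0   # the law of eta is Bernoulli(q)
    return xi, eta

rng = random.Random(1)
p, q, n = 0.3, 0.5, 200_000
samples = [coupled_pair(p, q, rng) for _ in range(n)]
mean_xi  = sum(s[0] for s in samples) / n    # ≈ p
mean_eta = sum(s[1] for s in samples) / n    # ≈ q
mismatch = sum(1 for s in samples if s[0] != s[1]) / n   # ≈ |p - q|
print(round(mean_xi, 2), round(mean_eta, 2), round(mismatch, 2))
```

This particular coupling makes the two components agree as often as possible: the mismatch probability equals |p − q|, which is exactly the total variation distance between the two laws — a fact that resurfaces later in the lecture.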
We start by looking for the coupling in the form (4). In (4), η_k is just any random variable, independent of everything, whose law equals ℓ. And η'_k is the following. I take the random variable ψ which, as before, depends on (u, v, ω) — that is not the point, nothing new, this is what is written there. The new fact is below: the random variable ψ(u, v, ω) which enters (4) has the following form. It is η_k(ω), precisely this one, plus an unknown map Φ, depending on (u, v), applied to η_k(ω):

η'_k(ω) = η_k(ω) + Φ_{u,v}(η_k(ω)).

You see, this is tricky — and this is precisely where the idea of coupling shows up. I have two copies of the law ℓ, and I have to realize them as the distributions of two random variables which are related to each other in some complicated way. So, first I realize the measure ℓ as the distribution of some random variable; secondly, I realize ℓ as the distribution of another random variable, which is just a deformation of the first one — a small, small deformation. And of course my restriction is that the law of this second random variable must again be ℓ; it is written over there.
Here, of course, Φ is a map defined on D_δ × E, where E is the space in which the noise curves live, with values again in E. The map should be such that — and now watch: I simply rewrite what we had in the naive case. For shorthand I will denote Φ_{u,v}(η) by ζ, so that η'_k = η + ζ; ζ will be small. My dream is that

(★)   ‖S(v, η + ζ) − S(u, η)‖ ≤ (1/2) d.

So the dream is the same, but now I take it seriously. First I want to realize this dream assuming that v is close to u; then, as you will see, I will do more. But now look: by the Taylor formula — v equals u plus a small increment of size d, and η + ζ equals η plus the small increment ζ, so we may apply the Taylor formula here:

S(v, η + ζ) − S(u, η) = D_u S(u, η)(v − u) + D_η S(u, η)(ζ) + O(‖v − u‖² + ‖ζ‖²_E),

where D_u S and D_η S are the differentials of S in the first and second arguments, both calculated at the point (u, η), and ‖v − u‖² is in fact d². This is the key relationship.
Right? Now watch. If I really can do what I want to do — if, as in my dream, the solution ζ is of size d — then the new distance between u_k and v_k is of the order of the square of the old distance. And this is the whole idea of quadratic convergence: if I really can achieve this, then this sequence of pairs of random variables converges together fantastically quickly, quicker than at any exponential rate. This is what is called the method of quadratic convergence, and this is what sits inside KAM theory. Okay. Now, to achieve the goal (★) for a fixed pair (u, v), we have to consider the following equation on ζ: naturally, the equation saying that the first-order terms vanish. This is called the homological equation — everybody who does KAM theory calls it the homological equation. This is (##), and it says precisely that this term plus this term equals zero:

(##)   D_η S(u, η)(ζ) = − D_u S(u, η)(v − u).

Let us denote the right-hand side by f.
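The difference between quadratic and merely exponential convergence is visible in a one-line toy recursion (C and the initial distance d_0 are arbitrary stand-ins):

```python
# Quadratic convergence: if each step turns the distance d into C*d**2
# (the dream realized by solving the homological equation exactly), the
# distance dies super-exponentially once C*d_0 < 1.  Toy recursion only.
C, d = 1.0, 0.5
quad = [d]
for _ in range(5):
    d = C * d**2          # d_k = (C*d_0)**(2**k) / C
    quad.append(d)

expo = [0.5 * 2**(-k) for k in range(6)]   # exponential rate, for contrast
print(quad)
print(expo)
```

After five steps the quadratic sequence is already below 10^{-9}, while the exponential one has only reached about 0.016 — this is the "fantastically quick" convergence the lecture refers to.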
If we can solve this equation exactly, then my two sequences of random variables start to shrink towards each other fantastically quickly, in this super-exponential way. If we find a clever way to solve the equation approximately, they will still converge to each other very quickly. And this is precisely what people in KAM theory are doing; this is their bread. So, what do we know about the equation (##)? The following. (a) By the condition (B1'), f belongs to a space V which is compactly embedded in H — a space smoother than H. (b) By the condition (B3') — and this is where the strong assumptions differ from the weak ones — the map D_η S(u, η) has a dense image. This is a good hint that we can resolve this equation approximately. Yes, we can, and this is given by the following lemma from linear algebra. Let us denote A_{u,η} = D_η S(u, η), the differential of S in η evaluated at (u, η). Lemma (from linear algebra). In view of the dense-image property (b), for any ε > 0 there exists N_ε — which is just a natural number; I will explain later why I need it — and a linear operator R_ε which maps the space H to the subspace E_{N_ε}
— the subspace E_{N_ε} generated by the first N_ε basis vectors. In one of the previous lectures I said that, to decompose the noise, I take a basis e_1, e_2, e_3, … of the space L²(0, 1); then the products φ_j e_l form a basis in the space E where the noise curves live, and E_{N_ε} is, so to say, the Galerkin subspace spanned by the first N_ε of these products. So, there exists a map R_ε with the following properties. (0) It is bounded: the norm of the mapping R_ε, between the natural spaces, is bounded by some constant C_ε, uniformly in (u, v) from the diagonal neighbourhood D_δ. (i) The image of R_ε is finite-dimensional: it is a Galerkin mapping. (ii) R_ε approximately inverts A: if I apply A_{u,η} composed with R_ε to f, then I get almost f back,

‖A_{u,η} R_ε f − f‖ ≤ ε ‖f‖_V.

What does this mean? It means that if I really want to solve the homological equation approximately, I simply have to choose ζ = R_ε f, and this is the solution. Why is this true? It is in fact very easy. Because look: if the operator A A* were invertible, then I could write down a right inverse of A explicitly as A* (A A*)^{−1}; if I apply A to this operator, I get A A* (A A*)^{−1}, that is, the identity. I will get the identity.
Now, I only know that the image is dense, so this inverse operator does not exist. But A A* is a non-negative self-adjoint operator. So, to get an approximate inverse, I simply put here (A A* + ε I)^{−1}, and this object does exist. The only disadvantage is that this operator is not Galerkin — its image is not finite-dimensional — so I have to apply some more effort, which is not a big deal; it is easy. So it looks easy; it really is not difficult — the difficulties will come later. The only work here is to construct the proper finite-dimensional truncation. Fine: the homological equation can always be approximately solved like this. Now, with this operator, I can write the approximate solution of the homological equation:

ζ = R_ε f = − R_ε ∘ D_u S(u, η) (v − u).

If so, then, in view of the Taylor relationship, the following happens. I simply look at the first-order terms: instead of vanishing exactly, they leave a residual of order ε. So the norm of S(v, η + ζ) − S(u, η) is bounded as follows — and now watch — the residual term is bounded by ε‖f‖, which is of order ε d.
So here comes C ε d; and here come the other two guys, the quadratic terms; together, the bound is

‖S(v, η + ζ) − S(u, η)‖ ≤ C d (ε + C_ε d + d).

Let us analyze who is who here — I want you to see clearly that I am not cheating you. The term C d · d is clearly the ‖v − u‖² = d² term. The norm of ζ is bounded by C_ε d, by property (0) of the lemma, so the ‖ζ‖² term sits in the C d · C_ε d part. And we also have the disparity — the error ε of the approximate solution of the homological equation — which sits in the C d · ε part. Now, can we choose things so that the new distance is indeed much smaller, so that the whole right-hand side is smaller than d/2? Well, yes, we can. Look: first, we take ε so small that C ε ≤ 1/6. Next, we fix this ε, and with this ε fixed we choose δ — remember, d < δ — such that C C_ε δ ≤ 1/6 and C δ ≤ 1/6. Then each of the three terms is at most d/6, and the sum is at most d/2. [Question from the audience.] Yes, but look here: this is precisely why we need this smoothness — we use the fact that f lies in the smoother space V.
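The regularized approximate inverse described above, R_ε = A* (A A* + εI)^{−1}, can be tried out in a finite-dimensional toy model; the matrix A below is a random stand-in for the differential D_η S, and the claim to check is that the residual of A ζ − f shrinks with ε.

```python
# Approximate right inverse of a surjective operator A, in the spirit of
# the lemma: R_eps = A^T (A A^T + eps I)^{-1}, so A @ (R_eps @ f) ≈ f with
# an error controlled by eps.  Finite-dimensional toy, random stand-in A.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 12))   # "D_eta S": more columns than rows
f = rng.standard_normal(5)

def approx_inverse(A, eps):
    return A.T @ np.linalg.inv(A @ A.T + eps * np.eye(A.shape[0]))

residuals = []
for eps in (1e-1, 1e-3, 1e-6):
    zeta = approx_inverse(A, eps) @ f          # candidate solution of A zeta = f
    residuals.append(np.linalg.norm(A @ zeta - f))
print(residuals)   # residual shrinks as eps shrinks
```

In the lecture's infinite-dimensional setting A A* is only non-negative with dense range, so the εI regularization is what makes the inverse exist; the additional truncation to a Galerkin subspace is the extra step the lecture mentions.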
[Discussion with the audience.] Yes, of course; you are absolutely right. Look at this lemma, the lemma from linear algebra. What is bad about it? It is an extremely weak statement: we have absolutely no control over this N_ε, and hence no control over the constant C_ε. You see, I can tolerate this scheme only because of the kind of problem I am working with; for problems where these constants behave really badly, in a certain setting, it simply would not work. And look what is lost here: this is a genuine quadratic scheme, a real quadratic convergence scheme, but the rate of convergence that I can actually guarantee is only exponential. I am sure there is no super-exponential convergence in this problem. You see? We use a very, very flexible lemma, but it gives a very bad estimate. And then it costs us quite a lot of effort just to check that this weak estimate still allows us to finish the job — at the price that the convergence is not super-exponential anymore. But, as some clever people told me, in his original paper Nash also claimed just exponential convergence; he did not care about the super-fast rate. Now — where have I cheated you? Can somebody who has not read the paper tell me where I cheated you? Because there is one important step still to be done. Look: I have almost realized the first try, the one I told you cannot be done. First, I assumed that the initial data are close — the case of big distance is easy, as you will see. But why did I not achieve exactly this?
Why not exactly? Yes, this is the whole point: I have cheated you, because of course the law of η_k is ℓ — I took it like this. But η'_k was constructed, so the law of η'_k is some measure ℓ', which is not at all ℓ. It is like in Kolmogorov's theorem: the transformation is very good, but it is not symplectic. If this were KAM, I would be dead; but not here — here I am not dead. So I have almost done what I have to do. Of course, I still have to explain what to do if the distance from u_0 to u_0' is big; that is easier, as you will see. More seriously, I have to explain how to save the situation, because I cannot allow the law to be different. Here come two more very beautiful constructions from the sphere of probability. I will tell you one now, and we will continue next time with the second one, so that I will be able to almost finish the proof. So look what happens. I have denoted the law of the random variable η'_k by ℓ'. Now watch how it was defined: ℓ' is the push-forward of ℓ under the map ψ, that is, under the map id + Φ. The map Φ is such that, as is easy to see: first, Φ is at least C¹-smooth in η (and in its dependence on the parameters, but I do not care about that now); secondly, the norm of Φ is bounded by a constant times d; and thirdly, Φ is finite-dimensional — it maps the space E into the space E_{N_ε}. These three facts easily imply the following lemma. Take the measure ℓ and the measure ℓ', and measure the distance between them — not the dual-Lipschitz distance, but the variational distance. Then the variational distance between them is bounded by a new constant C'_ε times d. This d comes from the bound on Φ.
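The effect of a small deformation on the law can be illustrated with a toy push-forward. Here the deformation is linear, ψ(x) = (1 + c)x applied to a standard Gaussian — an arbitrary stand-in for id + Φ with ‖Φ‖ of order c — and the total variation distance to the original law is computed numerically; it scales linearly with the size of the deformation, as in the lemma.

```python
# Toy push-forward: eta ~ N(0,1), eta' = (1+c)*eta, so law(eta') = N(0,(1+c)^2).
# The total variation distance between law(eta) and law(eta') is O(c),
# mirroring the lemma's bound  ||l - l'||_var <= C'_eps * d.
import math

def gauss(x, s):
    return math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def tv_to_standard(c, grid=4000, span=8.0):
    # total variation = (1/2) * integral |p1 - p2|, computed on a grid
    h = 2 * span / grid
    return 0.5 * sum(abs(gauss(-span + i * h, 1.0) - gauss(-span + i * h, 1.0 + c)) * h
                     for i in range(grid))

tvs = [tv_to_standard(c) for c in (0.2, 0.1, 0.05)]
print(tvs)   # the distance roughly halves each time c halves
```

The linear dependence on the deformation size is exactly why the lemma's bound carries the factor d, while the (uncontrolled) constant in front corresponds to C'_ε.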
This constant C'_ε depends on the dimension N_ε, and its growth as ε goes to zero is very, very bad. Now, let me just recall what the variational distance is. The variational distance (or total variation distance) between measures μ and μ' is defined as the supremum, taken over all Borel subsets Γ of the space E — because our measures live on E — of |μ(Γ) − μ'(Γ)|. Another, equivalent definition: it is one half of the supremum, over all measurable functions f on E bounded by one, of |∫ f dμ − ∫ f dμ'|. This is very much like the definition of the dual-Lipschitz distance, but the set of test functions is much bigger now: I can use many more functions. Apart from this, there are two remarkable properties of the variational distance. One of them concerns the variational distance in finite-dimensional spaces — and we are effectively finite-dimensional here, you see. There is a good formula: if μ₁ = p₁(x) dx and μ₂ = p₂(x) dx, then the variational distance is simply one half of the integral of |p₁ − p₂| dx. So, to estimate variational distances, we simply have to estimate integrals — and we are good at estimating integrals. (The second fact will come later.) This is how the lemma is proved: take this formula. By my restrictions on the measure ℓ, its finite-dimensional projections have Lipschitz densities; the map Φ here is C¹-smooth; so we can write down an explicit formula for the density of the image measure — it is simply the formula for the change of variables in the Lebesgue integral under a diffeomorphism of the phase space. After this we have two integrals, and it is an easy job to estimate their difference. Of course, I repeat: if I start to control the constants which appear here, they are something enormous.
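The density formula for the variational distance is easy to check on a discrete toy example (the densities p₁, p₂ below are arbitrary): the supremum over events and the half-L¹ formula agree, and the supremum is attained on the set where p₁ > p₂.

```python
# Total variation distance computed two ways for discrete densities:
# (i)  sup over events |mu1(G) - mu2(G)|  (attained on {p1 > p2}),
# (ii) one half of sum |p1 - p2|.
p1 = [0.5, 0.3, 0.2, 0.0]
p2 = [0.2, 0.2, 0.3, 0.3]

half_l1 = 0.5 * sum(abs(a - b) for a, b in zip(p1, p2))

best_set = [i for i in range(len(p1)) if p1[i] > p2[i]]
sup_events = sum(p1[i] - p2[i] for i in best_set)

print(half_l1, sup_events)   # both equal 0.4
```

This is why estimating the variational distance reduces to estimating an integral of the difference of densities, which is the computation behind the lemma.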
You see — therefore, no quadratic convergence. This is only an ε–δ statement: for any ε there exists δ such that… and nothing more. This is the difference with the classical KAM theory. And with this result at hand, next time I will show you how to save the situation. You remember that the map ψ which I am looking for is rather tricky: it has a more complicated structure than what I have produced now. Because now I have η_k(ω) taken just out of the blue, and the second copy is ψ_{u,v} applied to η_k(ω); but in (4) I look for the coupling in a more general form. So next time I will show you how to start from this coupling — which is not really a coupling, because the law of the second component is ℓ', not ℓ — and how to reconstruct it, how to save the situation and construct a real coupling. Another beautiful tool from probability theory will show up here, called the gluing lemma — the Dobrushin lemma — an extremely beautiful fact of measure theory and probability theory. Okay, thank you. That's all.