Good morning, everyone. I hope you're enjoying your time in Trieste. It's only been a couple of days, but I have certainly seen a lot of mathematics happening already. I hope that you have gone, or will go some evening or at least at the weekend, into town to walk around Trieste, as I suggested.

Today we continue this basic introduction to ergodic theory, which sits alongside the two courses that Corina and Hannah are giving, which for the moment are still giving just a topological description. Later on, tomorrow and the day after, and especially next week, all of these things will come together, and we will use ergodic theory to get some very sophisticated results about the kinds of examples that are being discussed.

I will pick up from yesterday's lecture with the definition of an invariant measure. Let me remind you of the setting. X is a metric space and we have a map f from X to X, which in general will be continuous, although for a lot of what I say it only needs to be measurable. B is the collection of Borel sets, and whenever I mention a set it will be a Borel set. Remember the definition. I'm not sure if yesterday this was given as the definition or as a consequence of the definition of invariance, but anyway: mu is f-invariant if mu of the pre-image of A under f is equal to mu of A, for all A in B. Sometimes one considers an even bigger class of measurable sets, but it doesn't make much difference for us.

So why do we care about this property? Those of you who are already familiar with these results know that this property is magical, almost miraculous. The purpose of today's lecture is to give a bit of motivation for why we care about invariant measures. I will discuss two classical theorems and then give some examples.
These two theorems really are incredible in what they conclude just on the basis that we have an invariant measure. The first one is called the Poincaré recurrence theorem and goes back, I believe, to the 1890s. So it's a very classical theorem, although I think Poincaré probably proved it in the particular case of volume-preserving systems and it was later generalized to general measures. The statement is very simple. Under the assumptions here, if we take a set A of positive measure, then for almost every x in A there exists an integer tau greater than 0, which may depend on x, such that f^tau(x) belongs to A. So almost every point of A comes back to A.

I guess the remarkable thing is what seems like an almost complete lack of assumptions. We have our space X, we have some set A, and we have no information at all about the map. This is the crucial point of the result. The only information we have is the relation between the map and the measure given by invariance. Based on this alone: we have a set A of positive measure, we don't know anything about the structure of the map, but we know that almost every point of A comes back to A.

The proof is actually very simple and goes like this. We want to show that almost every point comes back, so let's define the set of points that do not come back. Write A_0 for the set of x in A such that f^n(x) does not belong to A for all n greater than 0. We just need to show that this set has measure 0, because that means exactly that almost all points of A come back. We also let A_n equal f^-n(A_0), the n-th pre-image of A_0. The calculation is initially kind of abstract, and there is an observation which I will leave as the first exercise for this afternoon's session. Exercise 1 is to show that all of these sets are pairwise disjoint.
It is a fairly simple calculation: if n is different from m, then A_n intersection A_m is empty. And that's really all you need, because now we take the union of all these A_n and ask for its measure. On the one hand, we know that this is less than or equal to the measure of the whole space, which is equal to 1. You agree? It has to be less than or equal to 1, because this is a probability measure. But these sets are disjoint. This is the key point. It means that the measure of the union is equal to the sum of the measures, just using countable additivity of the measure and the fact that the sets are disjoint.

So far we haven't used the invariance of the measure, but now we can. What is the measure of each A_n? Sorry? Mu of A_0, exactly, because A_n is by definition f^-n(A_0), and from the definition of invariance you easily get that all the iterated pre-images have the same measure: mu of f^-n(A) is equal to mu of A for any set A and any n. So this is just the sum from n equals 0 to infinity of the same fixed number, mu of A_0. Why does this complete the proof? This is an infinite sum, so if mu of A_0 were positive it would blow up to infinity, which contradicts the bound by 1. This implies that A_0 has measure zero.

So you see, it's really a very simple proof, even though the statement is remarkable and has a remarkable number of applications, even outside dynamics. It's remarkable how many times, in some completely different context, you end up iterating some kind of system and then you say: by Poincaré recurrence, you get your conclusion.
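Written out in symbols, the argument just given is the following chain. It uses only the sets A_0 and A_n = f^{-n}(A_0) defined earlier, the disjointness claimed in Exercise 1, countable additivity, and invariance. This is simply a typeset restatement of the spoken proof, not an additional step.

```latex
\[
1 \;\ge\; \mu\Big(\bigcup_{n=0}^{\infty} A_n\Big)
  \;=\; \sum_{n=0}^{\infty} \mu(A_n)
  \;=\; \sum_{n=0}^{\infty} \mu\big(f^{-n}(A_0)\big)
  \;=\; \sum_{n=0}^{\infty} \mu(A_0),
\qquad\text{hence}\qquad \mu(A_0) = 0 .
\]
```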
So it's a theorem that has applications even in other fields. Still, however amazing this is, it's not much compared to the next result, which goes back to Birkhoff in the 1930s; I think the paper is from 1931 or so. In some sense it is a more sophisticated version of this. It is not only that, if you take a set A of positive measure, points come back to A; they come back infinitely often. In fact a slightly more sophisticated version of the argument we just gave already shows that they come back infinitely often. But Birkhoff's theorem also talks about the frequency of visits: not only infinitely often, you want to know how frequently. For example, if you look at the orbit of some point, does it spend 10% of the time here, or 20% of the time here? Does that proportion exist?

Under the assumption that mu is an invariant measure, the theorem says the following: for all phi in L1 of mu, and for mu-almost every x, a certain limit exists. In both theorems the conclusion is for mu-almost every x, and I guess you know, or were told, what this means: the statement need not hold for every point of X, but the set of points for which it does not hold has measure zero. That is exactly what we proved here: we did not prove that the set A_0 is empty, just that it has measure zero. So, for mu-almost every x, the following limit exists: the limit as n tends to infinity of 1 over n times the sum, for i from 0 to n minus 1, of phi of f^i(x).

I will make some comments to give you a better feeling for what this means and why it's interesting. Here is a first remark. We have our space X and our map on the space, so we have a dynamics on the space. You have a point x_0 that goes to x_1, that goes to x_2: you have the orbit of the point x under f.
And then we have our L1 function phi, which is a function from the space to R. What this sum is doing is evaluating the function along the orbit and taking the average. That is all it's doing. Phi is only an L1 function, so it can be fairly crazy, not a very nice function, but you look at the value of phi at this point, the value of phi at the next point, the value of phi at all the different points of the orbit, and you take the average. There is absolutely no reason for this to converge, and in general it does not. The theorem says that if mu is an invariant measure then it converges for mu-almost every x.

In general this limit will depend very much on x. There is a very simple way to see this. Take the identity map: if X is the interval [0, 1], for example, and f(x) equals x, then what is the limit, phi-bar of x? Everyone see that? If f is the identity then every point is a fixed point, every point just maps to itself, which means that f^i(x) is always just x. So this is the average of phi evaluated always at the same point x: when you take 1 over n times the sum, you just get phi of x.

Notice also that the average itself has no measure in it. This is just the dynamics. The sum and the limit depend only on the dynamical system, on the point, and on the observable you choose, the test function, the L1 function. The statement of the theorem is that for mu-almost every point the limit exists, but the sum and the limit do not depend on the measure. So here I don't need to specify a measure: I'm just observing that for the identity map the limit clearly always exists, and the average is always the function itself. This shows that in general the limit depends very much on the point: in this case, at every point you just have phi of x.
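In symbols, writing \bar\varphi(x) for the limit whenever it exists (the bar notation is an assumption made here for readability), the Birkhoff average and its value for the identity map read as follows.

```latex
\[
\bar\varphi(x) \;=\; \lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} \varphi\big(f^{i}(x)\big),
\qquad
f = \mathrm{id}\ \Longrightarrow\
\frac{1}{n}\sum_{i=0}^{n-1} \varphi\big(f^{i}(x)\big)
  \;=\; \frac{1}{n}\cdot n\,\varphi(x) \;=\; \varphi(x)
  \ \text{ for every } n .
\]
```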
Phi is just an L1 function, so its value can be completely different at every point. In the next two lectures, Lucia and Davide will introduce a condition on the measure which allows us to actually know what this limit is, and which tells us that the limit is essentially constant: it is the same for almost every point. That is the notion of ergodicity. But for the moment, for the existence of the limit, all you need is invariance, and in general the limit depends very much on the point.

A very useful L1 function for getting a better feeling for this result is the characteristic function of a set. So suppose we take phi to be the characteristic function of some set A in X. As you know, this is the function that has value 1 if x is in the set and 0 if x does not belong to the set. This is clearly an L1 function, so we can calculate the sum. What is it in this case? We have 1 over n times the sum, for i from 0 to n minus 1, of the characteristic function of A evaluated at f^i(x), because that is what we have chosen for phi. And what is this sum? It is not the measure of A. It is just a finite sum, and it is counting something. What is it counting? Each term is 1 every time f^i(x) belongs to A. The characteristic function takes only the values 0 or 1, depending on whether the point where you evaluate it is in the set or not. So for each i you look at f^i(x) and ask: does it belong to A? If yes, the term is 1; if not, the term is 0. So the sum is a sum of terms that are either 0 or 1, and we can write the whole thing as 1 over n times the cardinality of the set of indices i between 0 and n minus 1 such that f^i(x) belongs to A. Do you agree with this?
Please tell me if you have doubts or are confused. The sum is just counting how many times you fall in A, and the cardinality is doing the same thing: I take f^i(x), I check for each i whether it belongs to A, and I count those i's. This gives a number between 0 and 1, because I am looking at n indices: if f^i(x) always belongs to A, the count is n, which divided by n is 1; if f^i(x) never belongs to A, it is 0. So this is always a number between 0 and 1, and it is the proportion of times that you fall into A: the frequency of visits to A.

Birkhoff's theorem says that the limit of this exists, which means that the proportion of times that you fall in A converges. We take some set A and count how many times we fall in A. We look at the first 1,000 iterations and say: OK, it falls in A 10% of the time. Then we take the first 1 million iterations, and maybe it falls in A 20% of the time out of those 1 million, because maybe after a long time it landed in some part of A where it tends to stay for a long time. Then you wait longer and maybe it's 15%, and then the longer you wait, the closer and closer it gets to 15%. It is very similar to flipping a coin: I don't know when it's going to be heads or tails, but if you flip it enough times the proportion of heads tends to converge to 50%.

This is really the fundamental result. It is perhaps the most important result in dynamical systems and ergodic theory, foundational to everything we do.
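The counting described above can be tried numerically. The following is a minimal sketch, not part of the lecture: the function name `visit_frequency`, the choice of an irrational circle rotation as the map, and the set A = [0, 0.15) are all assumptions made purely for illustration (Lebesgue measure is invariant under rotations, so Birkhoff's theorem applies). Floating-point arithmetic is used, so the orbit is only an approximation of the true one.

```python
import math


def visit_frequency(f, x0, in_A, n):
    """Return (1/n) * #{0 <= i < n : f^i(x0) in A}, the frequency of visits to A."""
    x = x0
    visits = 0
    for _ in range(n):
        if in_A(x):      # characteristic function of A evaluated at f^i(x0)
            visits += 1
        x = f(x)         # move to the next point of the orbit
    return visits / n


# Hypothetical example map: rotation of the circle [0, 1) by an irrational angle.
ALPHA = (math.sqrt(5) - 1) / 2


def rotation(x):
    return (x + ALPHA) % 1.0


def in_A(x):
    # Hypothetical target set A = [0, 0.15), which has Lebesgue measure 0.15.
    return 0.0 <= x < 0.15


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, visit_frequency(rotation, 0.0, in_A, n))
```

For the identity map the same function returns exactly 1 or 0, depending on whether the starting point lies in A, which matches the earlier remark that the limit can depend strongly on the point.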
To emphasize the point, I'm going to say it again: the convergence is for mu-almost every point, and in many examples it is not that difficult to find points where it does not converge. This is going to be my second exercise, but let me make it as a remark first. Let's take one of the systems we are familiar with and have been studying, f(x) = 2x mod 1. Did Hannah say that Lebesgue measure... no, you did not talk about invariant measures, I guess. It's not so important for the example, but it is fairly easy to show that Lebesgue measure is invariant for this map. Oh, you did, sorry: you already got this as an example from Irene. I was going to say that she mentioned that it is only necessary to check invariance on intervals in this case. Anyway, Irene said that Lebesgue measure is invariant here, which means that Birkhoff's theorem holds: for any measurable set A, the frequency of visits exists for Lebesgue-almost every point.

So let's take our function phi to be the characteristic function of the interval [0, 1/2]. I take an arbitrary point, which can be inside or outside the interval, it doesn't matter, I take its orbit, and I look at the frequency of visits to [0, 1/2]. Birkhoff's theorem says that for almost every point this frequency converges to something. But the exercise is to find a point for which it does not converge. So, Exercise 2: find x in [0, 1] such that the frequency of visits, 1 over n times the number of i between 0 and n minus 1 with f^i(x) in [0, 1/2], does not converge as n tends to infinity. There are lots of points like that. The idea is that the orbit will spend a certain proportion of time in the interval.
For example, if you look at the first 100 iterates, it is easy to choose a point that spends all that time in here. But then, by looking at the base-2 representation, the symbolic coding of the point, you can choose the point so that after that it spends lots of time over here, and then even more time back here, so that the proportion of time it spends inside A and outside A oscillates and does not converge.

So this is the motivation. I think these two theorems and the examples count as a good enough motivation for the definition of an invariant measure. As I said, it is really quite remarkable that in both of them there seem to be almost no assumptions. You don't know anything about the dynamics. You are just assuming a relationship between f and mu, which is given by the invariance condition.

The question is: is this theory empty? In other words, even though you have some very special examples of invariant measures, maybe in general they don't exist. We can prove whatever we want about invariant measures, but maybe they only exist for circle rotations, for 2x mod 1, and for a very few special cases. So the next question is really: in general, do we have invariant measures? I'm going to state a theorem in this direction. Fortunately, yes, in general you have lots of invariant measures.

Let's just look at a couple of examples first. A very simple example: X is [0, 1] and f(x) = x/2. I'm not sure if this is one of the examples that Irene did. Did you do this example? No? This is my favorite, most trivial and most elementary dynamical system. So what is the system here and what is the invariant measure? It is easy to see that if you take some initial condition x and apply f, you just get half of x.
So you get x_1, and then half of x_1 is x_2, and so on. You see that all the points are just converging to 0, and 0 is a fixed point: f(0) = 0. This implies that the Dirac delta at 0 is f-invariant. Did you do Dirac deltas? No? OK, then let me define it. The Dirac delta at a point p is the measure that is fully concentrated on the point p: the measure of a set A is 1 if p belongs to A and 0 if p does not belong to A. Together with Lebesgue measure, this is probably the most important measure that occurs in dynamical systems. It is a very simple measure: it puts the whole mass on one point. Let me give the invariance as another exercise, a very nice one. Exercise 3: if you have a fixed point and you define the Dirac delta at that fixed point, then this measure is invariant.

This is also a nice example because it highlights the limitations of the theorems. Since the measure is completely concentrated on this one point, when a theorem says something like "for mu-almost every x", what does that mean in this case? The only conclusions we can draw are about that point itself. What mu-almost every x means is that there may be a set where the conclusions do not hold, but this set has measure 0. If mu is the Dirac delta at the fixed point 0, then the set of all other points has measure 0 for this measure. So the conclusions of Poincaré's theorem and Birkhoff's theorem tell us nothing about any of these other points, if this is the measure we choose. We may have systems with many invariant measures, and then what "mu-almost every" means depends on the measure you choose.
For example, the map 2x mod 1 also has a fixed point at 0, so it has an invariant measure which is the Dirac delta at 0, but Lebesgue measure is also invariant. In fact it has lots and lots of other invariant measures, but it has at least those two. So, as much as I've emphasized how amazing and miraculous these theorems are, one of their biggest limitations is that the conclusions depend on the measure you choose. If you choose Lebesgue measure, your conclusions hold for almost every point with respect to Lebesgue. If you choose the Dirac delta at 0, the conclusion reads the same, but in that case "mu-almost every x" just tells you about the point 0. It does not tell you about any other point, because the set of all other points has measure 0. So which measure you choose really makes a difference. This was an opportunity to make those remarks and to introduce this measure.

Let me now ask you a question. I claim that, as opposed to 2x mod 1, for the map f(x) = x/2 the Dirac delta at 0 is the only invariant measure of the system. Can someone tell me why we are sure that this has to be the only invariant measure? Heuristically, intuitively. Excuse me? "We don't have recurrence." Can you be a bit more precise about that? Sorry? Yes, exactly, exactly. By Poincaré recurrence: suppose there was some other invariant measure. It would have to give positive measure to some set of points other than 0, because if it only gives measure to the point 0 then it is the same measure. So there must be some other set A, not containing 0, that has positive measure.
Then, by the Poincaré recurrence theorem, if there was another invariant measure, almost every point of A would have to come back to A. But this does not happen. Think of A as a small set away from 0: the image of A lies closer to 0, and all the points of A converge to 0 and do not come back to A. So there can be no recurrence. Exactly. Poincaré recurrence says that if you have an invariant measure, you have recurrence, and here there is no recurrence except at the fixed point. The fixed point is recurrent: recurrence does hold with respect to the Dirac delta, because this point has positive measure and it comes back to itself all the time.

Using this observation we can modify the example. What if we take the same map, f(x) = x/2, on the open interval (0, 1)? Does this map have an invariant measure? Exactly, exactly: by the same argument it cannot, because the argument did not depend on the endpoints 0 and 1. The only possibility would be the Dirac delta at 0, but 0 is no longer in the space; I have removed it, so we cannot have the Dirac delta at 0. Every point would still like to converge to 0, but 0 is not part of the space, so the orbit does not actually converge. This is an open interval, and not every sequence has to converge within it: you take x_0, x_1, x_2, and they would like to converge to 0, but 0 is not there. So this is an almost trivial example of a dynamical system that has no invariant measure and no recurrence. In general there is no guarantee that a system will have invariant measures.

So the theorem is the following. This is the so-called Krylov-Bogolyubov theorem, which I never really know how to spell or how to pronounce, and the Russian speakers here will forgive me.
This is also a very classical theorem, from 1937, which follows exactly... In this map? I don't think that would give any... I'll have to do that calculation; I haven't done it. We're talking about probability... Sorry, that brings me to a very good point, thank you for pointing it out. Everywhere here, including in the Poincaré recurrence theorem, I am always talking about finite measures, probability measures. If you take for example the real line and a translation, say x goes to x plus 1, then Lebesgue measure is invariant under that translation, but of course in that case you do not have recurrence. That is an infinite measure on the real line, and the Poincaré recurrence theorem does not apply: you take a set, you keep translating it, and you lose it at infinity. So all these theorems, Birkhoff's theorem and the Poincaré recurrence theorem, hold for probability measures, or for any finite measure. As for your question, I'm curious; I haven't done that calculation, and I will do it and check. But here we certainly mean probability measures.

So the theorem is the following. Suppose X is a compact metric space and f from X to X is continuous. Then there exists an f-invariant probability measure. About the compactness assumption: obviously there are situations in which X is not compact and the conclusion still holds, but you cannot completely remove it. The example above shows this: there it is the lack of compactness that makes the existence of invariant measures fail. You can also easily construct similar examples where X is compact and f is not continuous: take a modification of this map which is not continuous at 0, so that 0 is not a fixed point, and you get exactly the same conclusion.
There are obviously examples where these hypotheses do not hold and the conclusion does, but as a general result you cannot really relax these conditions. I'm going to give a sketch of the proof, which is not that difficult; I will not prove it completely, just give the idea, also because I only have 10 minutes.

Part of the reason to give a sketch of the proof is to introduce a general concept which we will need: the space of all probability measures. Let M be the space of all Borel probability measures on X; I will always implicitly mean Borel. On this space we have a topology, so we have a way to decide when two measures are close, or at least when a sequence of measures converges to another measure. This is crucial. There are several ways to define a topology on M; the most standard and useful one for us is the so-called weak star topology. This is a specific instance of a very general topology on function spaces: the probability measures can be seen as sitting in the dual of the space of continuous functions, and the weak star topology is a standard topology on a dual space. It is defined like this: a sequence of measures mu_n converges to mu if and only if the integral of phi with respect to mu_n converges to the integral of phi with respect to mu, for all continuous functions phi. I do not have time to go into more detail. Maybe in the first lecture there was some discussion of integrating functions with respect to measures; if this is unfamiliar to you, just take it as a way to define what it means for a sequence of measures to converge.

There are two facts which I will use in the proof, after which the proof is actually very simple. I will not prove these facts; they are basic facts from measure theory or functional analysis. The first is that if X is compact then M is compact in this weak star topology. The second concerns the map which f induces on measures, which I think has already been defined and is called the push-forward map. We have a map f_* from M to M, and if you remember it is defined in the following way: you take a probability measure mu and use f to define another probability measure f_* mu, whose value on a set A is equal to mu of f^-1(A). So, given a measure mu, you use the dynamics to define another measure f_* mu, which is again a probability measure, so this really is a map on the space of probability measures. The statement is that if f is continuous then f_* is also continuous. These are not completely trivial facts, but they are general facts, and I am going to assume them; then the proof of Krylov-Bogolyubov becomes fairly easy.

Proof, not of these facts but of the theorem. We are going to define a sequence of measures. Let mu_0 be an arbitrary probability measure in M. The construction works for an arbitrary measure, but precisely because it is arbitrary it is useful to think of a specific one, and I would like you to think of the possibility that mu_0 is a Dirac measure at some point x. Note that the space M is always non-empty. Why? What probability measure always exists? A Dirac measure, yes. M is non-empty as long as X is non-empty, because if X has at least one point you can define the Dirac delta at that point, and that is a probability measure. Apart from the Dirac deltas at each point, and convex combinations of these, it is not a priori clear that there are any other probability measures, but
in general, as I said, in many cases there are many probability measures. Sorry, I have to finish soon, so let me be very quick.

Given mu_0, we define a sequence of measures: mu_n is equal to 1 over n times the sum, for i from 0 to n minus 1, of the push-forwards f^i_* mu_0. I just want to point out what this means in the particular case where mu_0 is the Dirac delta at x. Can you see what f^i_* is when we apply it to delta_x? I will not give this as an exercise, but you should check from the definition that it is just the Dirac measure at f^i(x). So in this particular case, mu_n is: take the Dirac deltas along the orbit of x, and take the convex combination of them with equal weights. That's all this is. But we don't need this special case for the proof.

Now, by compactness of M, or sequential compactness to be a bit more precise, this sequence has a convergent subsequence: there exist mu in M and a subsequence n_k tending to infinity such that mu_{n_k} converges to mu. Then I am going to leave as an exercise, the third exercise, or actually now the fourth, Exercise 4: show that the push-forwards of this subsequence converge to the same thing. That is, for each of the measures mu_{n_k} I can take f_* mu_{n_k}, and I claim that this sequence converges to the same measure: mu_{n_k} converges to mu, and f_* mu_{n_k} also converges to mu.

Why does this give the result? Now we use the continuity of f_*. One way to state continuity is that the image of the limit is the limit of the images. So, since mu_{n_k} converges to mu, f_* mu_{n_k} converges to f_* mu. But by the exercise f_* mu_{n_k} also converges to mu. Together, since limits are unique, this implies that mu is equal to f_* mu, which says that mu of f^-1(A) equals mu of A for every A, that is, mu is an invariant measure. Sorry, maybe you cannot see it down here. This is what we wanted to prove.

To summarize, we have used the two properties, the compactness of M and the continuity of f_*. We take an arbitrary measure and we average its push-forwards; that is where we use the dynamics to construct the invariant measure, because in the end the measure you get has to be connected to the dynamics. Then we use compactness to get a limit point, and the continuity of f_* to show that this limit point is fixed by f_* and is therefore an invariant measure.

To wrap up in the last 30 seconds, let me say again that this theorem is remarkable but has its limitations: it just shows that there exists an invariant measure, and it does not give us any information about that measure. As I have emphasized with the example of 2x mod 1, many systems have infinitely many invariant measures, and the question is which one you are interested in, which one you choose. In the case of 2x mod 1, given the choice between Lebesgue measure and the Dirac delta at 0, for most purposes you will choose Lebesgue measure, because then "almost every point" really refers to a set of full Lebesgue measure, whereas if you take a singular measure like the Dirac delta it only refers to a very small set of points. But that depends. In dynamical systems and ergodic theory you often have a choice of measures, and you really want to choose the invariant measure suited to whatever it is you want to do. OK, so I will stop here. Thank you very much.
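The averaging construction in the proof can be illustrated on the contraction f(x) = x/2 on [0, 1] from earlier in the lecture. The sketch below is an illustration under stated assumptions, not the lecturer's code: it takes mu_0 to be the Dirac delta at a point x, uses the fact mentioned above that the push-forward of delta_x by f^i is the Dirac delta at f^i(x), and the function name `integrate_against_mu_n` and the test functions are hypothetical choices. Integrating a continuous phi against mu_n then reduces to averaging phi along the orbit, and for this map the values approach phi(0), consistent with mu_n converging in the weak star sense to the Dirac delta at 0.

```python
import math


def integrate_against_mu_n(phi, f, x, n):
    """Integral of phi with respect to mu_n = (1/n) * sum_{i<n} delta_{f^i(x)}.

    Assumes mu_0 is the Dirac delta at x, so each push-forward is a Dirac
    delta at the corresponding orbit point.
    """
    total = 0.0
    point = x
    for _ in range(n):
        total += phi(point)   # integral of phi against delta_{f^i(x)}
        point = f(point)
    return total / n


def half(x):
    # The example map f(x) = x/2 from the lecture.
    return x / 2


if __name__ == "__main__":
    # Integrals of the continuous function cos against mu_n for growing n;
    # they should approach cos(0), the integral against the Dirac delta at 0.
    for n in (10, 100, 1000):
        print(n, integrate_against_mu_n(math.cos, half, 1.0, n))
```

In this example the whole sequence converges, so no subsequence is needed; the general proof only guarantees a convergent subsequence.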