Good morning. Can you all hear me, also from the back? Good. So welcome to the last class of our ergodic theory course. Today we are going to put together what we learned from Stefano's class and Lucia's class, and the background we've heard in the first two classes, and we're going to discuss the Birkhoff ergodic theorem, which is kind of the starting point, one of the fundamental results in ergodic theory.
Before doing that, let me start by recalling a result we've seen in the second class of Corinna's course. This theorem is due to Weyl, and it says: fix any irrational number $\alpha$. Then for any continuous function $f$ on the circle and for any $x$ in the circle, if you take the average up to time $n$ of the values of $f$ along the orbit of $x$ under the irrational rotation $R_\alpha$, and you take $n$ larger and larger, this converges in the limit to the integral of $f$ with respect to the Lebesgue measure: $\frac{1}{n}\sum_{k=0}^{n-1} f(R_\alpha^k x) \to \int f\,d\lambda$, where by $\lambda$ I'm denoting the Lebesgue measure. And here, let me stress that this holds for any point.
So the picture you should have in mind, again, is the circle $[0,1)$. You fix a continuous function $f$, you pick any initial condition $x$, and you look at the orbit of $x$ up to time $n$, so a long segment of the orbit: this is $x$, this is $R_\alpha(x)$, this is $R_\alpha^2(x)$, and so on. You look at the values of the function along the orbit and you average them, and if you take a longer and longer segment of the orbit, what you see in the end is the integral of $f$. Is this clear? This we've seen already. So please stop me if you have any questions, don't be afraid to ask. And let me add something that wasn't stated explicitly, but should be clear if you look back at the proof of the theorem: not only does this convergence hold for any point $x$, it is also uniform in $x$.
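A quick numerical sketch of the statement just recalled, assuming a specific test function and rotation number that are not from the lecture: $f(x) = \cos^2(2\pi x)$, whose Lebesgue integral is $1/2$, and $\alpha = \sqrt{2}$ (in floating point this is only an approximation of an irrational number). The helper names `birkhoff_average` and `sup_error` are hypothetical. The printed values are the largest error over a grid of starting points, which is a rough stand-in for the uniform convergence in $x$.

```python
import math

def birkhoff_average(f, x, alpha, n):
    """Average of f over the first n points of the orbit of x under rotation by alpha."""
    return sum(f((x + k * alpha) % 1.0) for k in range(n)) / n

def f(x):
    # Continuous function on the circle; its integral over [0, 1) is exactly 1/2.
    return math.cos(2 * math.pi * x) ** 2

alpha = math.sqrt(2)                   # illustrative choice of rotation number
starts = [i / 20 for i in range(20)]   # grid of initial conditions x

def sup_error(n):
    """Largest deviation of the Birkhoff average from 1/2 over the grid of starting points."""
    return max(abs(birkhoff_average(f, x, alpha, n) - 0.5) for x in starts)

for n in (10, 100, 1000, 10000):
    print(n, sup_error(n))
```

The error shrinks as $n$ grows for every starting point on the grid at once, which is what uniform convergence predicts; a finite grid can of course only suggest this, not prove it.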
So the sequence of functions, let me denote it $A_n f$, is just the average up to time $n$ of $f$ composed with $R_\alpha^k$: $A_n f(x) = \frac{1}{n}\sum_{k=0}^{n-1} f(R_\alpha^k x)$. This is a sequence of continuous functions on the circle, and it converges uniformly to the integral of $f$. I'll come back to this part later in this class. Anyhow, this is the kind of result we are aiming for. We want to do something similar, but in our setting of ergodic theory.
So what is our setting? In this class we have, as usual, $(X, \mathcal{B}, \mu)$, which is a probability space. By this I mean that $(X, \mathcal{B})$ is a measurable space and $\mu$ is a probability measure. Let me stress that it's really fundamental that the measure is finite; we've already seen a few times what goes wrong if you take an infinite measure. And again, this is a measure space, so a priori there is no topology whatsoever on $X$: no continuous functions, nothing. Then you consider a transformation $T$ from $X$ to $X$, and the only assumption you put on $T$ is that it's measure preserving. This means that the push-forward $T_*\mu$ is equal to $\mu$. And again, since there is no topology, this map $T$ is not continuous, it's just a measurable map which preserves this probability measure.
So we want to do the same kind of thing. We have $X$, and we want to look at some function $f$ from $X$ to the reals and try to do something like this. But the first question is: what assumptions should I put on $f$? In Weyl's theorem we're looking at continuous functions, but here we have no topology, so we have no way of saying what a continuous function is. So what is the good class of functions we want to look at? Well, if we are aiming for something like this, and we want to take some integral, we should ask at least for the function to be measurable, and we want the integral to be finite. So a natural assumption is to ask the function to be in $L^1(\mu)$. Is it clear what I mean by $L^1$?
Lucia briefly defined it at the end of her class. This is actually a space of equivalence classes of functions which are measurable and such that the integral of the modulus is finite, and you say that two functions are equivalent if they coincide almost everywhere. So these are the assumptions you have: a probability space, a measure-preserving transformation, and an $L^1$ function.
You want to pick a point $x$ in your space and follow the orbit, so here you have $T(x)$, here you have $T^2(x)$, and so on. And you want to look at the average of the values of the function along this orbit. The question is: these averages, let me call them again $A_n f(x) = \frac{1}{n}\sum_{k=0}^{n-1} f(T^k x)$, do they converge? And what does it mean to converge? In which sense do they converge, if they do? Do they converge pointwise? Do they converge in some norm? Well, it doesn't make much sense to ask them to converge everywhere, because $f$ is an equivalence class of functions: if I change the values of $f$ on a set of measure zero, this is the same function in $L^1$. So in which sense do they converge?
This question is answered by the protagonist of this lecture, which is the Birkhoff ergodic theorem. But before that, just some notation. I've already defined $A_n f$, and these are called Birkhoff averages. If I write $S_n f$, by this I mean just the sum without dividing by $n$, so $S_n f(x) = \sum_{k=0}^{n-1} f(T^k x)$, and these are called Birkhoff sums. I think Corinna will use them, so I'm just fixing notation.
So what is the result? Stefano already briefly mentioned part of it during his lecture. This is the Birkhoff ergodic theorem. Am I missing anything? So what is the statement? Again, you fix a probability space and you fix a measure-preserving transformation. I'm using the same notation as Lucia, so m.p.t. stands for measure-preserving transformation.
Then the conclusion is that for all $f$ in $L^1(\mu)$ and for $\mu$-almost every $x$, these averages converge, or maybe I should say: the limit of the Birkhoff averages $A_n f(x)$ exists. This is what Stefano already mentioned. Let me call this limit $\bar f(x)$, where it exists. The statement is also that the function $\bar f$ is actually a function in $L^1$, so it's integrable, that it is $T$-invariant, and that its integral is the same as the integral of your starting function: the integral of $\bar f$ with respect to $\mu$ is equal to the integral of $f$ with respect to $\mu$. Is the statement clear? The averages converge for $\mu$-almost every $x$.
So why is this theorem called the ergodic theorem if there is no notion of ergodicity in it? Well, there is an easy corollary, which might be the form you've seen before. With the assumptions above, if $T$ is also ergodic with respect to $\mu$, then you can actually say what this limit is: for $\mu$-almost every $x$ in $X$, the limit of the Birkhoff averages, and let me rewrite it, $\lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1} f(T^k x)$, is actually the integral of $f$ with respect to the invariant ergodic measure.
Let's prove this corollary; it's actually an easy exercise. You just apply the theorem, which tells you that for $\mu$-almost every point this limit exists. So $\bar f$, which in our notation is this limit where it exists, is an invariant function. What happens by ergodicity? It's constant: $\bar f(x)$ is equal to a constant $\mu$-almost everywhere. So now I just have to convince you that this constant is actually the integral of $f$. Well, the integral of $f$ with respect to $\mu$ is, also by the statement of the theorem, the same as the integral of $\bar f$. But $\bar f$ is a constant, so this integral is just the constant times the measure of the space. And $X$ is a probability space, so the measure is one. Is this clear?
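For reference, the spoken statement and the corollary can be written out in symbols. This is only a transcription of what was said above into LaTeX, using the lecture's notation $A_n f$ and $\bar f$; the name $c$ for the constant is an added label.

```latex
\textbf{Theorem (Birkhoff).} Let $(X,\mathcal{B},\mu)$ be a probability space and
$T\colon X\to X$ a measure-preserving transformation. For every $f\in L^1(\mu)$ the limit
\[
  \bar f(x) \;=\; \lim_{n\to\infty} A_n f(x)
           \;=\; \lim_{n\to\infty}\frac{1}{n}\sum_{k=0}^{n-1} f(T^k x)
\]
exists for $\mu$-almost every $x$. Moreover $\bar f\in L^1(\mu)$,
$\bar f\circ T=\bar f$ $\mu$-almost everywhere, and
\[
  \int_X \bar f\,d\mu \;=\; \int_X f\,d\mu .
\]

\textbf{Corollary.} If in addition $T$ is ergodic with respect to $\mu$, then
$\bar f = c$ $\mu$-almost everywhere for some constant $c$, and
\[
  \int_X f\,d\mu \;=\; \int_X \bar f\,d\mu \;=\; c\,\mu(X) \;=\; c ,
\]
so that $\frac{1}{n}\sum_{k=0}^{n-1} f(T^k x)\to\int_X f\,d\mu$ for $\mu$-almost every $x$.
```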
I guess it was never stated explicitly, but what do we mean by $\mu$-almost every $x$? We mean that there exists a set of full measure on which the statement holds for every point. You want me to write it down or put it as a remark? Almost every means that there exists a set $Y$ such that the measure of $Y$ is one, so it has full measure, and such that for all $y$ in $Y$ the conclusion holds. Whenever we say $\mu$-almost everywhere, or almost surely, we always mean that there exists a set of full measure where this happens or, the other way around, that the set where the property fails has zero measure.
Now, this is a statement about measure-preserving transformations, but in all the examples we've seen you usually had a topological space and a continuous map. And we've seen that for continuous maps on compact topological spaces there always exists an invariant measure, but maybe there are many of them. The statement holds for every invariant measure $\mu$, Stefano already mentioned that. So let's think about what happens if we take the same topological dynamical system but consider different invariant measures. Let's take two ergodic invariant measures. The term on the right changes, right? Because if you change the measure, a priori the integral of $f$ changes. So something on the left has to change as well. What changes on the left? What is the part on the left that depends on the measure?
Actually there are two parts on the left that depend on the measure. One is the point $x$: if the statement holds at this $x$ for one measure, maybe this $x$ is not a good $x$ for another ergodic measure, and actually it's not, okay? But there's also another part that changes with the measure, it's not just the point. The transformation you can take to be the same, but the other thing that changes is the function.
Remember that the assumption is that the function is in $L^1(\mu)$, so it's integrable with respect to $\mu$, and if you change the measure the same function might be integrable for one measure but not for another. I haven't put this in the exercises, but maybe you can think for a few minutes about taking a continuous dynamical system, maybe one of the ones we've already seen, taking two ergodic measures, and finding a function which is in $L^1$ of one but not in $L^1$ of the other. So I'm not sure how to state this as a remark, but be careful about which measure you are considering, okay?
And the last remark, which I haven't stated but which is part of the theorem, is that the same convergence also holds in $L^1$. The statement is for $\mu$-almost every point, so it's pointwise almost everywhere convergence, but the sequence of Birkhoff averages also converges in the $L^1$ norm, which Lucia defined. I should speed up a little bit.
What I'm about to say is not very precise, but if the measure is not ergodic, what is the function you get here? Ergodicity means that the system is indecomposable from the measure-theoretic point of view. So if the system is not ergodic, it means that you can decompose it into different measure-preserving systems, each of positive measure. Let me draw a picture, but this is not precise: maybe your system can be split into several invariant pieces, and what this function $\bar f$ is, is the average of $f$ on each of these separate pieces. So when you average the averages, you get the same average, right? Does it make sense what I'm saying? This is very vague. You would have to make this statement precise, and you can, but it would take more than an hour.
But anyway, you should think of this limit as the average on the minimal invariant subsystem of your space containing the point. And then when you take the average of these averages you get the same average. You can also think of it this way: the integral of $f$ composed with $T$ is the same as the integral of $f$, because $T$ is measure preserving. So the integral of $A_n f$ is the same as the integral of $f$ for every $n$. So this last part of the statement basically says that I can exchange the integral and the limit, which you can't always do.
Some applications. Let's take a number in $[0,1]$ and write down its decimal expansion, $x = 0.x_1 x_2 x_3\ldots$ By this I mean that $x$ can be written as the infinite sum $x = \sum_{i\ge 1} x_i/10^i$, where each $x_i$ is a digit between zero and nine. One thing you might wonder is how many times the same digit occurs in the decimal expansion of a number. What's the proportion of the digit two in the decimal expansion of a random number? The statement is the following. Fix a digit $l$ between zero and nine and count the number of times $k$, with $1 \le k \le n$, such that the digit $x_k$ is equal to $l$, so how many times this happens, divided by $n$. This is the proportion of times the digit $l$ occurs among the first $n$ digits of $x$. Then for Lebesgue-almost every point this frequency tends to one tenth. So each digit occurs with the frequency you expect, and no digit is better than the others, basically. So the statement is that the limit $\lim_{n\to\infty}\frac{1}{n}\#\{1\le k\le n : x_k = l\}$ exists and is equal to one tenth.
How do we prove this? You first notice that $x_1$, the first digit, is equal to $l$ if and only if $x$ belongs to the interval $I_l = [\,l/10,\ (l+1)/10\,)$.
Okay, first we have an issue, because maybe the decimal expansion is not unique. But we've already seen in Hannah's course when an expansion in base two or ten is not unique, and this happens for countably many points. Since we are after a statement for almost every point, that's okay: we don't worry too much about what happens on a set of measure zero. So let's suppose that the expansion is unique; you can think about this in connection with Hannah's course.
Now consider the map $E_{10}(x) = 10x \bmod 1$, which is just multiplication by ten, mod one. We learn in primary school that when we multiply by ten, we just shift the decimal point by one place, right? So $x_k$ is equal to $l$ if and only if the first digit of $E_{10}^{k-1}(x)$ is equal to $l$: we've just shifted the $k$-th digit to the first place. And the first digit is equal to $l$ if and only if the point belongs to $I_l$. So what is this number? The number of times $k$ up to $n$ such that $x_k$ is equal to $l$ is just the sum for $k$ from one to $n$ of the characteristic function of the interval $I_l$ evaluated at $E_{10}^{k-1}(x)$. For the sake of notation, let me shift the indices: it was $k-1$ for $k$ from one to $n$, so let me sum from zero to $n-1$ and put $k$ in the exponent: $\sum_{k=0}^{n-1}\chi_{I_l}(E_{10}^k x)$.
Then we've seen, it was one of the exercises of Lucia's class, for the doubling map, but you can do it as an exercise for the times-ten map, that this map is ergodic with respect to the Lebesgue measure. So we can apply the corollary of the Birkhoff ergodic theorem in this form, and it tells you that for Lebesgue-almost every point these averages converge to the integral of this function with respect to $\lambda$. So the convergence holds on a set of measure one. Let me call this set $N_l$; the measure of $N_l$ is equal to one.
Okay, so there exists a set of full measure on which the Birkhoff averages converge to the integral of the function. But what is the integral of an indicator function? It's just the measure of the set. So this is the Lebesgue measure of $I_l$, and $I_l$ is this interval, so this is one over ten. So the set of points in $[0,1]$ for which the digit $l$ has an asymptotic frequency, and this asymptotic frequency is one tenth, has measure one.
Now take the set of $x$ in $[0,1]$ which are normal in base ten, and by normal in base ten I mean that every digit from zero to nine occurs with frequency one tenth. This set is just the intersection of all the $N_l$: you do the same thing for all the digits from zero to nine. All these sets have measure one, so their intersection has measure one as well. Is this clear? I'll take it as a yes; if you have questions, ask. Okay, maybe in three minutes let's do another example.
[In answer to a question:] A function is $T$-invariant, okay, let me put it here: $f$ is $T$-invariant if for $\mu$-almost every $x$ in $X$, $f(x)$ is equal to $f(T(x))$, that is, $f = f\circ T$. So the value of the function is the same along the whole orbit under $T$.
[In answer to a question:] Yes, sorry, I didn't write too much. These averages have a limit, by the Birkhoff ergodic theorem, for Lebesgue-almost every point. This means that there exists a set of measure one such that the limit exists and is equal to one tenth for all the points in that set. I'm just calling that set $N_l$; the $N$ is just a letter. By this I mean: you write the number in base ten and you look at the frequency of the digit $l$. And the set of all normal numbers is just the intersection of these sets over all the digits $l$ from zero to nine.
So we can compute the same object, the frequency at which some digit occurs, but now for the continued fraction expansion of a number.
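As a numerical aside on the decimal digit count just discussed, here is a minimal sketch that applies the times-ten map exactly, using integer arithmetic, to a truncation of $x = \sqrt{2} - 1$. The choice of this number and the helper names `orbit_digits` and `digit_frequencies` are assumptions made for illustration. It is not known whether $\sqrt{2}$ is normal, so the output is only an experiment, not an instance of the theorem.

```python
import math
from collections import Counter

def orbit_digits(n):
    """First n decimal digits of x = sqrt(2) - 1, read off along the orbit of E_10."""
    scale = 10 ** n
    # x is represented exactly as y / scale, with x truncated to n decimal places.
    y = math.isqrt(2 * scale * scale) - scale
    digits = []
    for _ in range(n):
        # floor(10 x) is the index l of the interval I_l containing x,
        # and the remainder represents E_10(x) = 10 x mod 1.
        digit, y = divmod(10 * y, scale)
        digits.append(digit)
    return digits

def digit_frequencies(n):
    """Birkhoff averages of the indicator of I_l along the orbit, for l = 0..9."""
    counts = Counter(orbit_digits(n))
    return [counts[l] / n for l in range(10)]

for l, freq in enumerate(digit_frequencies(5000)):
    print(l, freq)
```

Working with integers avoids the problem that floating-point multiplication by ten loses all information about the point after a few dozen iterations.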
So suppose this is the continued fraction expansion of $x$, with digits $a_0, a_1, a_2, \ldots$, and you look at the frequency $\frac{1}{n}\#\{0\le k\le n-1 : a_k = l\}$, so how many times the digit $l$ appears among the first $n$ digits of the continued fraction expansion, divided by $n$. You ask yourself: does this limit exist, and what is it? I'll go a bit faster, if it's okay for everyone.
The key point is just to remember that $a_0$ is equal to $l$ if and only if $x$ belongs to the set, what have I called it, $P_l = (\,1/(l+1),\ 1/l\,]$. Remember the connection with the Gauss map: here we have one half, one third, one fourth and so on. One way of finding the continued fraction expansion was to denote this interval $P_1$, this interval $P_2$, and so on. If a point lands here, the first digit of the continued fraction expansion is one; if it's here it's two, et cetera. And the Gauss map acts on the continued fraction expansion as a shift to the left. We all remember this. So $a_k$ is equal to $l$ if and only if $G^k(x)$, where $G$ is the Gauss map, belongs to $P_l$. So you can conclude in the same way.
Okay, I haven't said this and you haven't seen it, so let me state it as a fact: the Gauss map is ergodic, not with respect to the Lebesgue measure, sorry, but with respect to the measure $\mu_f$ which has a density. In Irena's notation maybe I should have called it $\lambda_f$: when you write $\lambda_f$, this means the measure which has density $f$ with respect to $\lambda$, where $f$ is the function $f(x) = \frac{1}{\ln 2}\cdot\frac{1}{1+x}$. I don't remember if she did it in class or as an exercise, but at least we checked that the Gauss map leaves this measure invariant, and as a fact I'll tell you that it's also ergodic. And now you can just apply the Birkhoff ergodic theorem.
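The continued fraction frequencies can also be explored numerically. The sketch below iterates the Gauss map exactly on a rational number $p/q$, which amounts to the Euclidean algorithm, and compares the observed digit frequencies with the integral of the density $\frac{1}{\ln 2}\frac{1}{1+x}$ over $P_l$, written in closed form. Using a long random rational as a stand-in for a typical point is an assumption of this sketch (a rational has a finite expansion, so this is only a finite sample), and the helper names `gauss_measure` and `continued_fraction_digits` are hypothetical.

```python
import math
import random
from collections import Counter

def gauss_measure(l):
    """Integral of 1 / (ln 2 * (1 + x)) over P_l = (1/(l+1), 1/l]."""
    return math.log((l + 1) ** 2 / (l * (l + 2))) / math.log(2)

def continued_fraction_digits(p, q):
    """Digits a_0, a_1, ... of x = p/q in (0, 1), by iterating the Gauss map exactly."""
    digits = []
    while p > 0:
        a, r = divmod(q, p)   # a = floor(1/x), and G(x) = 1/x - a = r/p
        digits.append(a)
        p, q = r, p
    return digits

random.seed(0)
bits = 8000
q = random.getrandbits(bits) | (1 << (bits - 1))   # random denominator with exactly `bits` bits
p = random.randrange(1, q)                          # numerator, so that 0 < p/q < 1
digits = continued_fraction_digits(p, q)
counts = Counter(digits)

for l in range(1, 6):
    print(l, counts[l] / len(digits), gauss_measure(l))
```

The two printed columns should be close but not equal, since only a few thousand digits are sampled.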
For $\mu_f$, or $\lambda_f$, sorry, for $\lambda_f$-almost every $x$ in $[0,1]$, the frequency, the count divided by $n$, converges, as it did for the decimal expansion, to the integral of the characteristic function of $P_l$ with respect to this measure. But again, the integral of a characteristic function is nothing but the measure of the set, so this is just $\lambda_f(P_l)$. And what is this measure? It is the Lebesgue measure with density $\frac{1}{\ln 2}\cdot\frac{1}{1+x}$; this is the definition. So it is just the integral of this function from $1/(l+1)$ to $1/l$. If we remember from calculus, and let's skip a few steps, I'll tell you that if you integrate this function correctly you get $\frac{1}{\ln 2}\ln\frac{(l+1)^2}{l(l+2)}$. Check it if you want, maybe I got it wrong.
Okay, so in the last ten minutes I want to talk about unique ergodicity. Do you have any questions on this part? Let's go back to the setting where $(X, d)$ is a compact metric space and $T$ is a continuous map. We've seen before with Stefano the Krylov-Bogolyubov theorem, which tells us that in this case there always exists a Borel probability measure $\mu$ which is $T$-invariant. But we've seen with Hannah that, more often than not, there are many more than just one invariant probability measure. For example, every periodic orbit supports an invariant measure, and in the case of $E_2$, or of expanding maps of degree two, the set of periodic points is dense. So there are lots and lots of invariant measures.
But now I want to consider the opposite case, where there is just one invariant measure. Definition: we say that $T$ is uniquely ergodic if there exists only one $T$-invariant probability measure.
And again, you might ask: why is $T$ called uniquely ergodic if there is no ergodicity here? Well, there is a proposition, or theorem, that tells you that if $T$ is uniquely ergodic, then it is ergodic with respect to its only invariant probability measure. I'm not going to prove this, but yesterday Lucia briefly mentioned that the set of all $T$-invariant probability measures is a convex set, and its extremal points are exactly the ergodic measures. So if your convex set is just a point, the extremal set is the point itself, and so the measure is ergodic. This is the idea, but I'm not going to do the details.
On the other hand, I'm going to prove another characterization, which should make a link with the first part of this class. It says that for a continuous $T$ from $X$ to $X$, where $X$ is a compact metric space, the following are equivalent. Again, we are in the setting of topological spaces, so I can talk about continuous functions. One: $T$ is uniquely ergodic. Two: for every continuous function $f$, the Birkhoff averages $A_n f(x)$ converge for all $x$ to a constant $c_f$, depending only on the function and not on the point. Three: the same thing, where the convergence in part two is uniform in $x$. That's it.
So you should think of Weyl's theorem that we've seen before. For example, this theorem also tells you that in the case of irrational rotations, since we've already proved two, the Lebesgue measure is the only invariant measure for the rotation. This was not obvious, I guess. Examples: by Weyl's theorem, the rotation $R_\alpha$, where $\alpha$ is irrational, is uniquely ergodic. You will see in the exercises, and this I think is a bit challenging, and the statement will be made precise in the exercise, that you should think of isometries as uniquely ergodic.
We will see next week, probably, that the interval exchange transformations that Corinna briefly mentioned are typically uniquely ergodic, and this "typically" will be made precise next week, I think. On the other hand, expanding maps of the circle $S^1$, the protagonists of Hannah's course, are not uniquely ergodic. And again, the reason is simply that as soon as you have periodic orbits or fixed points, each of those carries an invariant measure, so there is usually more than one. This is always the case for fast chaotic systems: they will have plenty of invariant measures, not just one. But those which are sometimes called slowly chaotic systems are typically uniquely ergodic, so with probability one, if you pick one at random, it will be uniquely ergodic.
I guess I don't have much time for the proof. Maybe we can just prove one implication in the last two minutes, just to have an idea of what techniques you could use to prove a statement like this. We can prove that one implies two. You have to prove a statement for all points, so one thing you can do is fix any point $x$ and look at the sequence of measures supported on a segment of the orbit of the point: $\frac{1}{n}\sum_{k=0}^{n-1}\delta_{T^k x}$, where $\delta$ is the Dirac delta. Is this clear? We've seen this already with Stefano. You look at this sequence of measures for all natural numbers $n$. We've seen that it is contained in the set of all probability measures on $X$, maybe it was called $\mathcal{P}$, I don't remember. And Stefano mentioned that the set of probability measures is compact, so every time you have a sequence, there is a convergent subsequence. And the limit of a convergent subsequence, this was one of the exercises from Stefano's class, has to be an invariant measure for the transformation $T$.
Any weak limit point of this sequence is an invariant measure. But how many choices do you have for an invariant measure as a limit point? Just one. So can you have more than one limit point for this sequence? No: by the definition of unique ergodicity, all the limit points have to be the invariant measure $\mu$. So the whole sequence $\frac{1}{n}\sum_{k=0}^{n-1}\delta_{T^k x}$ has to converge weak-star to the only invariant measure, because you have no other choice. And by the definition of weak-star convergence, for every continuous function $f$, when you integrate $f$ with respect to this measure here, this converges to the integral of $f$ with respect to the measure $\mu$. And what is this integral on the left? The Birkhoff average. So $A_n f(x)$ converges to the integral of $f$.
I think I missed one line from the statement: if any of these holds, then the constant $c_f$ is the integral of the function with respect to the only invariant measure. So this proves that one implies two, and that the constant $c_f$ is this integral. Any questions?
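To see the contrast between a uniquely ergodic map and one with many invariant measures, here is a small sketch under illustrative assumptions that are not from the lecture: the test function $f(x) = \cos(2\pi x)$, the rotation number $\alpha = \sqrt{2}$ (a floating-point approximation), and the doubling map started at the fixed point $0$ and at the period-two point $1/3$, computed exactly with `Fraction`.

```python
import math
from fractions import Fraction

def f(x):
    # Continuous test function on the circle; its Lebesgue integral is 0.
    return math.cos(2 * math.pi * float(x))

def birkhoff_average(T, x, n):
    """Integral of f against the empirical measure (1/n) * sum of Dirac deltas at T^k x."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n

alpha = math.sqrt(2)

def rotation(x):
    return (x + alpha) % 1.0

def doubling(x):
    return (2 * x) % 1   # exact when x is a Fraction

n = 10000
rotation_limits = [birkhoff_average(rotation, x, n) for x in (0.0, 0.3, 0.7)]
doubling_limits = [birkhoff_average(doubling, Fraction(0), n),
                   birkhoff_average(doubling, Fraction(1, 3), n)]

print("rotation:", rotation_limits)
print("doubling:", doubling_limits)
```

For the rotation the averages are close to the same constant for every starting point, as condition two requires. For the doubling map the two periodic orbits give two different constants, one for each of the invariant measures they carry, so condition two fails.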