Okay, good afternoon. Let me remind you of the theorem that we're going to prove today. We have an irrational circle rotation, and the claim is that it is uniquely ergodic: there exists only one invariant measure, which is therefore ergodic. Since we know that Lebesgue measure is invariant, Lebesgue measure is that unique ergodic invariant measure, okay? Last lecture we proved a theorem that makes this tractable: since f is a continuous map on a compact metric space, to prove that f is uniquely ergodic it is enough to prove that the Birkhoff averages converge uniformly in x for every continuous function. In fact, it's enough to show this for a dense subset of continuous functions; that's what we proved last time. So it is sufficient to prove that there exists a dense set Φ of continuous functions, and in this case we're going to use complex-valued functions. For simplicity, in the previous statements we talked about real-valued functions, but the proof is exactly the same, okay? A dense set of continuous complex-valued functions φ such that the Birkhoff averages B_n φ(x) = (1/n) Σ_{j=0}^{n-1} φ(f^j(x)) converge uniformly to a constant φ̄ that depends on φ. And because we know Lebesgue measure is invariant, this constant will in fact have to equal the integral of φ with respect to Lebesgue measure, okay? Is that clear? This is from the theorem that we proved last time: f is uniquely ergodic if there exists a dense set of continuous functions for which these averages converge uniformly in x (uniformly in x, obviously, not in φ) to a constant that depends on φ, okay?
So what is the dense set of functions that we're going to use? For each m ≥ 1, define the complex-valued function φ_m(x) = e^{2πimx}, which is the same as taking cos(2πmx) + i sin(2πmx). Here x is between 0 and 1: we identify S¹ with the quotient ℝ/ℤ, so it's just the interval [0,1). That's why you take the 2π, so that as x runs from 0 to 1 the cosine and the sine do a full period. Okay, so this set of functions is not dense in the space of all continuous functions, but the set of all their linear combinations is dense. This is a classical result, the Stone-Weierstrass theorem: the set Φ of all linear combinations of the φ_m is dense in C⁰(S¹), the continuous functions on S¹, okay? (Strictly, for Stone-Weierstrass one takes all m ∈ ℤ; the constant function φ_0 ≡ 1 is trivial, since its Birkhoff averages are constant, and negative m is handled by exactly the same argument as positive m, so we focus on m ≥ 1.) Moreover, a simple observation is that the Birkhoff average is linear in the function: B_n(φ + ψ)(x) = (1/n) Σ_{j=0}^{n-1} (φ + ψ)(f^j(x)), and distributing the sum, Σ (φ + ψ)∘f^j(x) = Σ φ(f^j(x)) + Σ ψ(f^j(x)), so this equals B_n φ(x) + B_n ψ(x), okay? So it is sufficient to show the uniform convergence to a constant just for the functions φ_m: we need the property for all linear combinations of the φ_m, but if you take a function that is a linear combination of the φ_m, you will have uniform convergence if and only if you have uniform convergence for each of the functions in the linear combination, okay?
So all we need to show is this uniform convergence for these particular functions, and this will complete the proof, okay? So: it is sufficient to prove uniform convergence for φ_m, for all m ≥ 1. How do we do it? Well, notice first of all what φ_m ∘ f is. We can identify the circle with the unit circle in the complex plane, and then translation becomes multiplication: since f(x) = x + α, we have φ_m(f(x)) = e^{2πim(x+α)} = e^{2πimx} · e^{2πimα} = e^{2πimα} φ_m(x). So in this particular case the composition becomes a product: composing with f is the same as multiplying by the constant e^{2πimα}. We will also use the fact that |φ_m(x)| = 1 for every x, because the absolute value of e^{2πimx} is always 1, and a special formula, the geometric sum: Σ_{j=0}^{n} x^j = (1 − x^{n+1})/(1 − x). I'm going to use these three properties to check the result for these functions.
So consider |(1/n) Σ_{j=0}^{n-1} φ_m(f^j(x))|. Every time I compose with f, it corresponds to multiplying by e^{2πimα}, so φ_m(f^j(x)) = e^{2πimjα} φ_m(x); that's why the index is j. Since |φ_m(x)| = 1 and we are looking at absolute values, we can drop that factor, and we are left with (1/n) |Σ_{j=0}^{n-1} (e^{2πimα})^j|. Now I use the geometric sum formula with x = e^{2πimα}, summing up to n − 1, and what I get is (1/n) · |1 − e^{2πimαn}| / |1 − e^{2πimα}|. The numerator is the distance between two points on the unit circle, so it is at most 2 (2 will work in any situation), and we get the bound (1/n) · 2 / |1 − e^{2πimα}|. As n goes to infinity this goes to zero, so the Birkhoff averages converge to zero. And why is this convergence uniform? For every x this converges to zero, and it is uniform because the bound does not depend on x: it's a uniform bound that works for every x, and that's exactly what uniform convergence means, okay? Yes, that's a very good question: what if m equals zero? We don't take m equal to zero, only m ≥ 1; we never even defined φ_0, which would be the constant function 1, and for constants there is nothing to prove. But that's a good observation, because there is one more ingredient worth noting: there are other situations in which the denominator could vanish. If 2πmα is a multiple of 2π, then e^{2πimα} = 1 and we get 1 − 1 = 0 on the bottom. How do we guarantee this never happens? Exactly: here we use the fact that α is irrational. Since m is a nonzero integer and α is irrational, mα can never be an integer. So we've used irrationality in a crucial way; otherwise we'd be proving unique ergodicity of rational circle rotations as well as irrational ones, and rational rotations are not uniquely ergodic, okay? Very good. So this proves that there is a dense set of continuous functions whose Birkhoff averages converge uniformly, which by the theorem we proved before implies that the system is uniquely ergodic. And if you remember, we had a second theorem, a corollary of this. So: f is uniquely ergodic, and in particular, since we know Lebesgue measure is invariant, it has to be the unique invariant measure, and therefore Lebesgue measure is ergodic. So the fact that Lebesgue measure is ergodic means
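The bound at the end of this computation is easy to check numerically. Here is a small sketch in Python; the rotation number α = √2 − 1, the frequency m = 3, and the starting point x are arbitrary choices of mine, not values from the lecture.

```python
import cmath
import math

# Sanity check of the proof's estimate: for f(x) = x + alpha mod 1 with
# alpha irrational, the Birkhoff averages of phi_m(x) = e^{2 pi i m x}
# satisfy |B_n phi_m(x)| <= C / n with C = 2 / |1 - e^{2 pi i m alpha}|,
# a bound that does not depend on x (hence uniform convergence to 0).
alpha = math.sqrt(2) - 1     # an irrational rotation number (assumption)
m = 3                        # any m >= 1
x = 0.37                     # an arbitrary starting point

def birkhoff_avg(n):
    # (1/n) * sum_{j=0}^{n-1} phi_m(f^j(x)), using f^j(x) = x + j*alpha mod 1
    return sum(cmath.exp(2j * math.pi * m * (x + j * alpha)) for j in range(n)) / n

C = 2 / abs(1 - cmath.exp(2j * math.pi * m * alpha))
for n in (10, 100, 1000):
    assert abs(birkhoff_avg(n)) <= C / n + 1e-12   # the geometric-sum bound
```

The same bound holds for any starting point x, which is exactly the uniformity used in the proof.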
that if you take the circle, take any arc (a,b), and take Lebesgue-almost every point (remember, by Birkhoff's ergodic theorem) and look at the frequency of visits to the arc, in other words you plug the characteristic function of this arc into the Birkhoff averages, then the frequency converges to the Lebesgue measure of (a,b), okay? And now we're going to prove the theorem I stated last lecture: in the particular case in which the map is uniquely ergodic, this is true for every point of the circle. That's a much stronger statement; even if it doesn't seem like a big difference to you, it is quite a bit stronger. Theorem: the orbit of every x ∈ S¹ is uniformly distributed, with respect to Lebesgue measure of course. Proof. Let (a,b) be an arc in S¹. Then for all ε > 0 there exist continuous functions φ and ψ such that φ ≤ χ_{(a,b)} ≤ ψ and ∫(ψ − φ) dm < ε. You can easily believe that: if you represent the circle as the unit interval, the characteristic function of the arc (a,b) is a step function. We can approximate it by two continuous functions: one always less than or equal to it, which is zero outside the arc and rises in a very steep but continuous way, and can even be equal to 1 on most of the arc; and another one, ψ, that is just a little bit bigger, always above it. So it's easy to see that you have this property, and then what do we have for these functions? Consider the liminf as n → ∞ of (1/n) Σ_{j=0}^{n-1} χ_{(a,b)}(f^j(x)). This is just what we were trying to measure before: it counts how many times the orbit falls into the arc, so it is exactly the distribution. We would like to show that this converges to the Lebesgue measure of (a,b). So we study the liminf and the limsup and show that they are equal, and equal to m(a,b). Now, because the characteristic function is ≥ φ, the liminf is clearly ≥ the liminf of (1/n) Σ_{j=0}^{n-1} φ(f^j(x)), and this liminf equals ∫φ dm. Why? Not just by Birkhoff's ergodic theorem: Birkhoff only says this for almost every x, but we have just proved that in fact we have uniform convergence for every x, okay? So we are using the previous theorem, not just Birkhoff's ergodic theorem, because here we want to show this distribution for any arbitrary x, not just an x that is generic with respect to Lebesgue measure. What we showed before is uniform convergence to a constant, and this constant has to be ∫φ dm, because almost every point converges to that; since all points converge to the same thing, they must all converge to the integral. So it converges to ∫φ dm, and now we use the approximation: ∫φ dm is ≥ ∫ψ dm − ε. But ψ is ≥ the characteristic function, so this is ≥ ∫χ_{(a,b)} dm − ε, and this is precisely the measure of the arc minus ε, that is, b − a − ε. So for an arbitrary point x, the liminf of the frequency of visits to this arc is ≥ b − a − ε; you can already see that by taking ε small, it has to be at least b − a. And similarly, by exactly the same calculation, the limsup as n → ∞ of (1/n) Σ_{j=0}^{n-1} χ_{(a,b)}(f^j(x)) is ≤ the limsup of (1/n) Σ ψ(f^j(x)), which by the same argument equals ∫ψ dm, and this is ≤ ∫φ dm + ε ≤ ∫χ_{(a,b)} dm + ε, which is exactly b − a + ε. So, since these calculations hold for any ε, the liminf is ≥ b − a and the limsup is ≤ b − a. They both have to coincide and be equal to b − a, so the limit as n → ∞ of (1/n) Σ_{j=0}^{n-1} χ_{(a,b)}(f^j(x)) equals b − a. Any questions? Everything OK? This part of the proof is fairly straightforward. The difficult part really was what we did last lecture, showing in general that this uniform convergence is equivalent to unique ergodicity. The application to the circle is fairly straightforward. So why have I insisted so much on this uniform distribution and this unique ergodicity?
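The conclusion of this theorem is easy to see numerically. This is a quick demonstration of mine, not part of the lecture; the rotation number and starting point are arbitrary:

```python
import math

# For an irrational rotation, the frequency of visits of ANY orbit to an
# arc (a, b) approaches the arc's length b - a, as the theorem asserts.
alpha = math.sqrt(2) - 1     # irrational rotation number (my choice)
a, b = 0.2, 0.5              # an arbitrary arc

def visit_frequency(x, n):
    # fraction of the first n orbit points of x that land in (a, b)
    hits = sum(1 for j in range(n) if a < (x + j * alpha) % 1.0 < b)
    return hits / n

# works for an arbitrary starting point, not just almost every point
freq = visit_frequency(0.123, 100_000)
assert abs(freq - (b - a)) < 0.01
```

Trying several different starting points x gives the same limit, which is the "every point" strengthening over Birkhoff's almost-every-point statement.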
In general, the fact that almost every point is uniformly distributed is already a very interesting statement coming from Birkhoff's ergodic theorem: it tells us about the statistical distribution of almost every orbit. Here we have it for every point. And what I want to do for the rest of these lectures is give a very nice little application of this result to a topic that seems to have nothing to do with dynamics at all, and more with numbers. So this completes the proof of the unique ergodicity of circle rotations and uniform distribution, and before going on to the next example, let's look at this interesting application to number theory. Has anyone heard of something called Benford's law? It is a very curious phenomenon that various people have observed, in particular Benford, Mr. Benford, although some people had noticed it before him. Sometime around a hundred years ago, I think in the late 1800s actually, it was noticed that when you have big collections of numbers, for example a big list of numbers representing all the sizes of the populations of cities in the whole world, or measurements of all sorts of things, and you look at the first digit of these numbers, just the first digit (the first non-zero digit, actually: if the number is at least 1, it's just the first digit; if it's 0.005, then it's 5, the first significant digit), then the digit 1 seemed to occur much more frequently than the digit 2, and the digit 2 much more frequently than the digit 3, and so on and so forth. And this was really mysterious in some ways. In the old days, before electronic calculators, there were logarithm tables. I don't know if you know that logarithms were used to do big calculations before the use of electronic calculators.
If you needed to multiply two big numbers, what you'd do is take the logarithms of the two, take the sum of the logarithms, and then take the exponential, instead of doing the product. And to look up logarithms there were big books containing the logarithms of all these numbers. It was noticed that in these tables the first few pages, the ones corresponding to numbers whose leading digit is 1, were much more worn than the rest. How come? Is it true that there are more numbers with leading digit 1? It turns out that there is some mathematics behind this observation; there is actually a reason. So let me first define things. Let a be a real number, and let D(a) be the leading (non-zero) digit of a. We say that a sequence (a_i) has Benford distribution if for every d = 1, 2, ..., 9 we have that P(d), which I define as a kind of probability of d, satisfies P(d) := lim_{n→∞} (1/n) #{0 ≤ i ≤ n−1 : D(a_i) = d} = log₁₀(1 + 1/d). So what I'm saying is: you look at the first n numbers of the sequence and count how many of them have leading digit d, just the proportion of those numbers with leading digit d, and I want this proportion to converge to exactly this mysterious number log₁₀(1 + 1/d). We will explain it in a second. So what are these numbers? First of all, notice that P(1) ≈ 0.301, approximately 30%. P(2) ≈ 0.176, approximately 17.6%. And then I won't write them all down, but they decrease monotonically all the way down to P(9) ≈ 0.046, call it about 4.6%.
Notice also that Σ_{d=1}^{9} log₁₀(1 + 1/d) = 1. Exercise: you can check this (the sum telescopes, since it equals log₁₀(2/1 · 3/2 ⋯ 10/9) = log₁₀ 10). So you can think of this as a probability distribution on the digits 1 to 9. What it's saying is that, asymptotically along the sequence, the frequency of numbers with leading digit 1 is exactly log₁₀ 2, about 30%; the frequency of numbers with leading digit 2 is exactly log₁₀(1 + 1/2), which is roughly 17.6%; and so on. Now, I've not proved anything, I've just given a definition. It might very well be impossible to find a sequence with this property; I'm just saying that if a sequence has this property, then I say it has Benford distribution. And this is what seems to occur in real life. In fact, when Benford measured all this data, for example the list of all the populations of all the cities in the world or in a big country, it seems that about 30% of the places have a population whose first digit is 1 (so between 10 and 20 thousand, or between 100 and 200 thousand, or 1 million something), about 17% have a population starting with 2, and so on and so forth. It seems crazy. You can check it, you can try. So we're going to prove the following theorem, which is quite interesting. Let k be a natural number that is not a power of 10, so not 10, 100, 1,000, and so on. Then the sequence kⁿ, k to the power n, satisfies the Benford distribution. So for example, if k is equal to 2, the sequence is 2, 4, 8, 16, 32, 64, and so on, and you look at the first digit of all these numbers.
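The Benford probabilities and the telescoping sum above can be checked directly. A minimal sketch:

```python
import math

# The Benford probabilities P(d) = log10(1 + 1/d) from the definition:
# they decrease from about 30% for d = 1 down to about 4.6% for d = 9,
# and they sum to exactly 1 (the sum telescopes to log10(10)).
P = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

assert abs(P[1] - 0.301) < 0.001        # leading digit 1: ~30%
assert abs(P[2] - 0.176) < 0.001        # leading digit 2: ~17.6%
assert abs(P[9] - 0.0458) < 0.001       # leading digit 9: ~4.6%
assert abs(sum(P.values()) - 1.0) < 1e-12   # a probability distribution
```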
And the digit 1 occurs asymptotically with frequency exactly log₁₀ 2. It's surprising that there would be such a pattern, but the proof is very simple, actually remarkably simple. We're going to prove this in two lemmas. So, lemma: let k ∈ ℕ not be a power of 10. Then the sequence (log₁₀ kⁿ mod 1), n = 1 to ∞, is uniformly distributed with respect to Lebesgue measure. And now you can tell me immediately why that is true. Exactly: it's a circle rotation. And is it rational or irrational? Irrational, because k is not a power of 10; this is where we use that assumption. Proof: log₁₀ kⁿ mod 1 = n log₁₀ k mod 1, which is just the orbit of 0 under the circle rotation with α = log₁₀ k, and α is irrational (if log₁₀ k = p/q were rational, then k^q = 10^p, which forces k to be a power of 10). You start, for example, at the point 0 on the circle; when n = 1, it's like rotating by log₁₀ k; then you rotate once more and you get 2 log₁₀ k mod 1, and so on. The mod 1 means we are on the circle, and n log₁₀ k is just adding log₁₀ k at each iteration, so it's just translation by that. And why do we need the full strength of unique ergodicity to make this conclusion, and not just Birkhoff's ergodic theorem? I'm trying to emphasize the difference between these two things. Because here it's exactly the orbit of 0 that we're looking at: the orbit of a specific point under this irrational circle rotation. It would not be enough if all we knew was Birkhoff's ergodic theorem, that almost every point is uniformly distributed; we would not know which points were and which were not. But because we know that every point is uniformly distributed, the orbit of the point 0 under this irrational circle rotation is uniformly distributed. And so how are we going to use this to prove our theorem?
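The identity at the heart of this lemma, that log₁₀ kⁿ mod 1 is the rotation orbit of 0, can be verified numerically. A small sketch of mine with k = 2:

```python
import math

# Checking the lemma's identity for k = 2: log10(2^n) mod 1 equals the
# n-th point of the orbit of 0 under the rotation by alpha = log10(2).
k = 2
alpha = math.log10(k)
for n in range(1, 60):
    exact = math.log10(k ** n) % 1.0    # log10 of the actual power, mod 1
    orbit = (n * alpha) % 1.0           # n-th point of the rotation orbit of 0
    assert abs(exact - orbit) < 1e-9
```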
What's the relation between this and this, between the uniform distribution of log₁₀ kⁿ mod 1 and the Benford distribution of kⁿ? This is what we're going to establish now. We've shown that log₁₀ of this sequence, mod 1, is uniformly distributed, and what we want to show is that the sequence kⁿ itself satisfies Benford distribution. In fact, we're going to show a slightly more general relationship between these two facts. So, lemma: let (a_i) be a sequence of positive numbers, and suppose that the sequence log₁₀ a_i mod 1 is uniformly distributed. Then (a_i) has Benford distribution. And this, in some sense, is the key to why Benford distribution occurs in real life. I haven't actually thought about it that deeply, but in some sense it is related to sequences whose logarithms are uniformly distributed; that's when you get Benford distribution. So let's prove this. We suppose that log₁₀ a_i mod 1 is uniformly distributed, and we show that (a_i) has Benford distribution. Notice, first of all, what it means for the leading digit of a_i to equal d. It means that a_i ∈ [d · 10^j, (d+1) · 10^j) for some j: it's in the interval that starts with d, whether that's d thousands or d millions or whatever. And this is equivalent, taking log base 10 everywhere, to saying that log₁₀ d + j ≤ log₁₀ a_i < log₁₀(d+1) + j.
Here j is just some integer, so we can take mod 1 everywhere, and this is the same as saying log₁₀ d ≤ log₁₀ a_i mod 1 < log₁₀(d+1). Why? Because taking mod 1 just means taking the fractional part of the number. Now log₁₀ d is between 0 and 1, so j + log₁₀ d is some number between j and j + 1, and so is log₁₀ a_i; hence the fractional part of log₁₀ a_i lies exactly between log₁₀ d and log₁₀(d+1). So I will write this, for simplicity, as: log₁₀ a_i mod 1 belongs to the half-open interval [log₁₀ d, log₁₀(d+1)). And what matters is the Lebesgue measure of this interval, exactly. Our assumption is that the sequence is uniformly distributed, and this is an interval inside the unit interval [0,1), so the proportion of times this number belongs to the interval converges, as n goes to infinity, to exactly the size of the interval, which will turn out to be exactly log₁₀(1 + 1/d); I'll write it out now, but it's very simple. So, since by assumption log₁₀ a_i mod 1 is uniformly distributed, and the leading-digit property is exactly equivalent to membership in this interval, the limit as n → ∞ of (1/n) #{0 ≤ i ≤ n−1 : D(a_i) = d} is exactly equal to the limit of (1/n) #{0 ≤ i ≤ n−1 : log₁₀ a_i mod 1 ∈ [log₁₀ d, log₁₀(d+1))}.
And because this is uniformly distributed, this limit is exactly equal to the measure of the interval, which is log₁₀(d+1) − log₁₀ d = log₁₀((d+1)/d) = log₁₀(1 + 1/d). So you see, this is not a very complex or mathematically sophisticated argument, but it's quite curious, and we have used in an essential way the fact that the sequence is uniformly distributed. So we used the unique ergodicity of circle rotations, a dynamical, ergodic-theoretic property, to prove something that you could consider an interesting number-theoretic statement. And we will also see some other applications to number theory in the next couple of lectures. So we still have a little bit of time, not much, but let's take just a couple of minutes' break, and then I will introduce the next topic so that next lecture we can already be a little bit further ahead. OK. So I promised that our first two examples would be examples in which we already know that Lebesgue measure is invariant, or in which at least that is easy, and the challenge is to show that it's ergodic. In this first case of circle rotations, we showed ergodicity indirectly, in some sense, by showing that there exists only one invariant measure, and therefore Lebesgue measure has to be ergodic. Now we will look at another very important class of systems, already a little bit familiar to you from the previous course: piecewise affine full branch maps. Do you remember what full branch maps were? Do you remember what piecewise affine means? So what's an example of a piecewise affine full branch map? Yes, exactly: 10x mod 1, or 2x mod 1, or 5x mod 1, and so on. However, this is already a slightly more advanced course.
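Before the break, the theorem on powers of k can be checked empirically. This is my own demo (with k = 2 and a number of terms I chose): count the leading digits of 2¹, ..., 2^N and compare with log₁₀(1 + 1/d).

```python
import math

# Empirical check of the theorem: leading digits of the powers of 2
# follow the Benford distribution log10(1 + 1/d).
N = 3000
counts = {d: 0 for d in range(1, 10)}
p = 1
for _ in range(N):
    p *= 2
    counts[int(str(p)[0])] += 1     # leading digit of the exact integer 2^n

for d in range(1, 10):
    freq = counts[d] / N
    assert abs(freq - math.log10(1 + 1 / d)) < 0.01
```

With N = 3000 the frequencies already agree with log₁₀(1 + 1/d) to within about one percent, reflecting the small discrepancy of the rotation by log₁₀ 2.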
So we're going to look at something a little bit more general: we're going to allow an infinite number of branches. Let me give the definition. A map f from an interval I to itself is a full branch map if there exists a finite or countable partition P of I into subintervals, a partition mod 0, let's say. Now that we're talking about measures, we have a slightly easier way of not worrying about whether these intervals are open or closed or half-open, because we only need almost every point to lie in the interior of a partition element. So when we say a partition of I mod 0, we mean that the elements of the partition cover almost every point of I; that's what we're interested in, from the point of view of Lebesgue measure at least. The condition is that for each ω ∈ P, the map f restricted to the interior of ω is a bijection from ω onto I. So the basic picture is this: you have some number of subintervals, and on each subinterval a bijection onto the whole interval. This is the most general definition of full branch. Each of these restricted maps is called a branch, and if the branches are homeomorphisms, or diffeomorphisms, or affine, then we say that f is piecewise C⁰, piecewise C¹, piecewise C^∞, piecewise affine, et cetera: f is piecewise affine, say, if each f restricted to the interior of ω is affine, and similarly for homeomorphism, C¹ diffeomorphism, C² or C^∞ diffeomorphism, and so on. So what does affine mean in this context? You remember? It means that the derivative is constant on each branch. Affine does not by itself mean the derivative is greater than 1, but it has to be greater than 1 in this case: if the derivative is constant and the branch sends a smaller subinterval onto the whole interval, then the derivative has to be bigger than 1.
Affine just means the derivative is constant on each branch. Example of a piecewise affine full branch map: f(x) = 2x mod 1. As you said, this is the typical example. Of course, every f(x) = kx mod 1, where k is an integer, is also piecewise affine and full branch. What about if k is not an integer, for example k = 2.5? Then it's not full branch, because the last branch does not map onto the whole interval, but it's still piecewise affine. And notice that piecewise affine, even the full branch case, does not necessarily imply that the derivative is the same on all branches: you can also have a picture where different branches have different constant slopes. It just means that on each branch the derivative is constant, so the map that sends ω onto the whole interval is affine. (And by affine I implicitly assume each branch is continuous: continuous and affine.) In the previous course, we looked at these examples quite a bit. Now, the main theorem we're going to prove is that Lebesgue measure is invariant and ergodic. Theorem: if f: I → I is piecewise affine and full branch, then Lebesgue measure is invariant and ergodic. We don't have time to finish the proof today; we will finish next lecture. But for today, let's just state the lemma for the invariance, which is quite easy, and I will leave some of it as an exercise. So: is Lebesgue measure invariant? Let's just look at 2x mod 1 first; we did this example right in the first lecture, I think. So how do you see that Lebesgue measure is invariant?
So the key is that, again, because of the general abstract properties of measure theory, since these are Borel measures, it's enough to show invariance on intervals: if you take an interval, the measure of its pre-image is the same as the measure of the interval. So is that true in this case? It's easy to see. Take an interval J = (a, b). The pre-image of J is made of two intervals, one in each branch. So we need to check that m of f inverse of J equals m of J; that's the definition of invariant measure. Is that true in this case? Yes. Why? Because each of these two pre-image intervals has half the length of J, so their union has the same measure as J. What about the case where the slopes are different? Then J again has one pre-image in each branch, but they're no longer the same size. So I will leave it as an exercise, but it's an important exercise, a good candidate for an exam question: prove this lemma, that for a piecewise affine full branch map, Lebesgue measure is invariant. In this case, it's fairly easy. You can calculate the relative sizes of these pre-images in terms of the derivatives, and you can show that the invariance always holds: wherever you put the branch point, it will change the derivatives, and it will change the sizes of the pre-images, but not their sum. What about the infinite case, full branch with a countable number of branches? It's the same calculation. Exactly: the pre-images are all disjoint. So even in the infinite case, where you have an infinite number of branches, the length of the pre-image corresponding to each branch is exactly inversely proportional to the derivative on that branch. So this is a not completely trivial calculation, a very important exercise: prove that Lebesgue measure is invariant even in the infinite case. Because here, I'm always assuming the map can have finitely or countably many branches.
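The invariance computation above can be sketched numerically. For a piecewise affine full branch map, the pre-image of an interval J inside the branch with slope s has length |J|/s, and since each branch of length L has slope 1/L and the branch lengths sum to 1, these pre-image lengths sum back to |J|. A minimal sketch, using the hypothetical two-branch map with slopes 3 and 3/2:

```python
# Invariance check for a piecewise affine full branch map: the pre-image of
# an interval J = (a, b) under the branch with slope s has length (b - a)/s,
# and the sum over branches recovers |J|, because sum of 1/s over branches
# equals the sum of the branch lengths, which is 1.
branches = [(0.0, 3.0), (1/3, 1.5)]  # hypothetical (left endpoint, slope) data

def preimage_length(a, b):
    # Each branch contributes a pre-image interval of length (b - a)/slope.
    return sum((b - a) / slope for _, slope in branches)

a, b = 0.2, 0.7
assert abs(preimage_length(a, b) - (b - a)) < 1e-12
```

The same computation goes through verbatim with countably many branches, since the branch lengths still sum to 1.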
From now on, whenever I talk about piecewise full branch maps, they can have a countable number of branches. So this will prove that Lebesgue measure is invariant. Are there other invariant measures, or is the map uniquely ergodic? Can we show that it's ergodic by showing that it's uniquely ergodic? Lebesgue measure is invariant; is it the only invariant measure? No. This map has fixed points, and not only fixed points: besides fixed points, it has periodic points. How many periodic points does it have? The set of periodic points is dense, so a lot of periodic points. How do you know it's dense? Exactly: it's conjugate to the shift map, or rather semi-conjugate, almost conjugate, to the shift map; we studied this in detail. So it's got lots of periodic points, and each of these periodic points carries an invariant Dirac delta measure. So it's got lots of invariant measures. In fact, we will see later that it's got many more: not only is Lebesgue measure invariant, and all the Dirac deltas on the periodic points, but there are lots of other invariant measures that are neither one nor the other, measures that are not sitting on periodic points but are not Lebesgue measure either. We're going to study a very, very rich class of invariant measures for this map. We saw already that from a topological point of view, this map has very rich dynamics, and this is reflected in the fact that also from an ergodic theory point of view, there are lots of invariant measures. But for the moment, we concentrate on Lebesgue measure. We have that it's invariant, and what we're going to show is the very important property that it's ergodic, which is very non-trivial. It does not follow by abstract uniqueness arguments, because it's not the unique invariant measure.
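The density of periodic points for 2x mod 1 can be checked directly, without going through the shift: every point of the form k/(2^n - 1) is periodic with period dividing n, and such points are spaced 1/(2^n - 1) apart, hence come arbitrarily close to any point. A sketch using exact rational arithmetic:

```python
from fractions import Fraction

# Periodic points of f(x) = 2x mod 1: doubling k/(2^n - 1) five times gives
# 32k/31 mod 1 = k/31, so every k/31 is periodic with period dividing 5.
# As n grows, these points are dense in [0, 1).
def double(x):
    return (2 * x) % 1

n = 5
q = 2**n - 1                  # 31: the points k/31 are spaced 1/31 apart
for k in range(q):
    x = Fraction(k, q)
    y = x
    for _ in range(n):
        y = double(y)
    assert y == x             # the orbit returns after n steps
```

Each such point supports an invariant measure: the average of the Dirac deltas along its periodic orbit.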
So we really need to show that it's ergodic: we need to show that if you take a set A with f inverse of A equal to A, an invariant set, then A has either full Lebesgue measure or zero Lebesgue measure; it is either everything up to measure zero or nothing up to measure zero. We will show that next time.
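As a preview of what ergodicity buys us (via the Birkhoff ergodic theorem, which is not part of today's lemma), here is a numerical sketch, an illustration rather than a proof: for a typical point, time averages along the orbit of 2x mod 1 converge to the space average. Iterating in floating point is hopeless, since the map doubles the rounding error at every step, so the sketch uses the semi-conjugacy with the shift: the orbit of a point with binary digits b1 b2 b3 ... is obtained by shifting the digit sequence.

```python
import random

# Birkhoff averages for f(x) = 2x mod 1, computed via the shift on binary
# digits: x_n = 0.b_{n+1} b_{n+2} ... in binary. For the observable
# phi(x) = x the space average is the integral of x over [0, 1], i.e. 1/2.
random.seed(0)
N, depth = 50_000, 30         # orbit length; binary digits kept per point
bits = [random.randint(0, 1) for _ in range(N + depth)]

def point(n):
    # The n-th orbit point, truncated after `depth` binary digits.
    return sum(bits[n + j] / 2**(j + 1) for j in range(depth))

avg = sum(point(n) for n in range(N)) / N
assert abs(avg - 0.5) < 0.02  # time average is close to the space average
```

The tolerance here is statistical, not exact: the time average fluctuates around 1/2 on the order of 1 over the square root of N.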