Okay. Good afternoon, everyone. This is the first class of the course in ergodic theory. I imagine this is not a subject you are familiar with, and maybe you have not heard of it before. It is very closely related to the course on dynamical systems that we did previously. In that course we looked at dynamical systems from a very topological point of view: whether you have fixed points, periodic points, dense orbits, structural stability, topological conjugacy, and so on. Many of these systems, as you remember, have very complicated behavior. It turns out that a very interesting and very successful way of studying them is to use the framework of probability theory. Even though the systems are deterministic (we talked a little about determinism and chaos at the end of the previous course), it can be very useful to describe them using probabilities, and that is what we are going to do in this course. The previous course is not a prerequisite, in the sense that you do not need to have done a course in dynamical systems to study this, but it will be useful because we will use several of the examples we studied there. On the other hand, what is fairly important is a little background in measure theory. I know you have done a little measure theory, but for completeness what I want to do today is review the fundamental notions. For several of you this will probably be known, familiar, or elementary, but I think it can be useful anyway. I will just give the key points very schematically, because in particular I want to define the general notion of a probability measure, which we will use throughout the course. So this is what I will focus on for most, or probably all, of today.
So I will give a quick introduction to measure theory. What is a measure? Consider the interval I = [0, 1] and take some sub-interval [a, b]. It is very elementary that we have the notion of the length of this interval: b minus a. General measures can be thought of as generalizations of the notion of length, or area, or volume. So you are already familiar with some measures even if you have never done any measure theory, because you know what we mean by the length or the area or the volume of some object. However, there exist subsets of I to which we cannot assign a length in this way but which clearly have measure, or mass. Can you give me an example of such a subset? A Cantor set. A Cantor set does not contain any intervals, so it is not clear how to describe its size. You might say a Cantor set should have zero size because it has zero length. So, example one: let C in I be a Cantor set. You know what a Cantor set is. You take an open interval U_1 and remove it from the whole interval. In the traditional construction this is the interval (1/3, 2/3), but it doesn't have to be. You are left with two intervals; from these you remove two more open intervals, U_2 and U_3, and you are left with four intervals. Then you repeat: you remove an open interval from each of the four, call them U_4, U_5, U_6 and U_7, and so on. The Cantor set is then the complement, C = I minus the union from i = 1 to infinity of the U_i. You have seen this construction before.
Each of these intervals has a length and they are all disjoint, so you can sum their lengths: the length of the union of the U_i is equal to the sum over i of the length of U_i. Now what is the length of U_i? I can decide what I want it to be. Because this is just a countable family, I can make the intervals very small, and by taking them small enough I can make this infinite series converge to as small a number as I like, of order epsilon, say. For example, I can take the length of each U_i to be epsilon over i squared; then the sum is a little under 2 epsilon. What does that say about the length, or the mass, of this Cantor set? Even though it does not contain any intervals, and therefore does not have any length in the elementary sense, it is the complement of something whose total length is of order epsilon, which means that in some sense C has size about 1 minus epsilon. For me this is the nicest initial motivation for why we need a more general notion than length to measure sets: here is a set to which we cannot assign any meaning as a length, but which definitely should have some kind of mass, because its complement has length of order epsilon. So this is one nice example of a set that we cannot measure using the standard notion of length. There is another very simple example. Sorry? The set of points 1/n? Okay, this is a sequence going to 0. But what is its length? It is just a countable set of points.
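The estimate for the removed intervals can be checked numerically. The following is a minimal sketch, assuming the gaps U_i really can be chosen disjoint with lengths epsilon / i^2 (which needs epsilon small enough that the total is below 1); the function name `removed_length` is made up for this illustration.

```python
import math

def removed_length(eps, n_terms):
    """Partial sum of the gap lengths eps / i**2 for i = 1..n_terms."""
    return sum(eps / i**2 for i in range(1, n_terms + 1))

eps = 0.01
# The full series sum_{i>=1} 1/i^2 equals pi^2 / 6, so the total removed
# length is eps * pi^2 / 6, a little under 2 * eps.
total_removed = eps * math.pi**2 / 6
partial = removed_length(eps, 100_000)

# Whatever is left over is the "mass" we would like to assign to the Cantor set.
cantor_mass = 1 - total_removed
print(partial, total_removed, cantor_mass)
```

The point of the sketch is only that the removed length can be made as small as we like, so the leftover set carries almost all of the mass of the interval.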
Each point definitely has zero length, and then you can ask what the length of this sequence is. We will see from the general definition that the generalization of length gives zero mass to this sequence. But the example I had in mind is simply the rationals and the irrationals: consider the sets Q ∩ [0, 1] and (R \ Q) ∩ [0, 1], the rationals and the irrationals in the interval. As you know, the rationals are countable and the irrationals are uncountable, and it turns out that, as we generalize the notion of length, the irrationals have in some sense full measure in the interval. So let me now be a little more precise about these definitions. We need a more general concept of a measure to handle sets with complicated topological structure. What should the basic feature be? Let X be a set and let P(X) be the set of all subsets of X. What we would like is a function mu from P(X) to, for example, [0, 1], which tells you the measure of each of these subsets: we are trying to measure the size of any subset of the space. And what should this measure satisfy? The key compatibility condition is the following: if A_i, i = 1 to infinity, are disjoint, then we would like the measure of the union of the A_i to be equal to the sum of the measures of the A_i. This is called countable additivity. Any consistent way of measuring the size of sets really must satisfy this condition. It was implicit in the observation I made for the Cantor set.
There I said that the measure of the union is equal to the sum of the measures because the sets are all disjoint. Otherwise you don't have any consistent way of measuring these things. Now, it turns out that in general it is impossible to find such a function: there are sets that are not measurable. I want to give you a nice example, because it uses some ideas from dynamical systems and from the course we did before. So consider this example. Let X be the unit circle S^1 and let f(x) = x + alpha be an irrational rotation. Do you remember the dynamics of this? The orbit of every point is dense in the circle. In particular each orbit is infinite, because it is not periodic. So each point x_0 has an infinite orbit x_i = f^i(x_0), i from minus infinity to infinity, which is dense in S^1. Now let me define the following set. Let A_0 in S^1 contain a single point from each orbit of f. What does this mean? I take an initial condition x_0 and its infinite orbit, I choose one of the points of the orbit, and I put it in the set A_0. Then I discard the whole orbit; I don't look at it anymore. Now I choose a point that does not belong to this orbit, I look at its orbit, I choose one of those points to belong to A_0, and so on. So for every point in S^1, exactly one point of its orbit belongs to A_0. That is, for every x_0 in S^1 there exists a unique i in Z such that x_i = f^i(x_0) is in A_0. This is exactly what I mean by that statement. And now the question is: how big is this set A_0? It is uncountable.
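The claim that orbits of the irrational rotation are infinite and dense can be illustrated numerically. This is only a sketch under stated assumptions: the circle is modelled as [0, 1) with addition mod 1, alpha is a floating-point approximation of an irrational number, only the forward orbit is computed, and the helper names `rotation_orbit` and `largest_gap` are invented here.

```python
import math

def rotation_orbit(x0, alpha, n):
    """First n points x0, f(x0), ..., f^(n-1)(x0) of f(x) = x + alpha mod 1."""
    return [(x0 + i * alpha) % 1.0 for i in range(n)]

def largest_gap(points):
    """Length of the largest arc of the circle [0, 1) containing no orbit point."""
    pts = sorted(points)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])  # arc that wraps around through 0
    return max(gaps)

# Golden-mean rotation number, a standard choice of irrational alpha.
alpha = (math.sqrt(5) - 1) / 2

for n in (10, 100, 1000):
    # The largest empty arc shrinks as the orbit gets longer, which is
    # what density of the orbit looks like at finite resolution.
    print(n, largest_gap(rotation_orbit(0.0, alpha, n)))
```

A finite computation cannot prove density, of course; it only shows the orbit points never repeat in the range tested and that the empty arcs keep shrinking.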
It is not equal to S^1, because it contains just one single point of every orbit. But it is an uncountable set, because there is an uncountable set of orbits: each orbit is countable, so there must be uncountably many of them to fill up the whole circle. What is the size of this set? Let's try to measure it. Suppose there exists a measure mu which is countably additive, as above, and translation invariant like Lebesgue measure, in the sense that if you take a set and translate it, the measure does not change: mu(f(A)) = mu(A) for any set A, just as the length of an interval does not change when you translate it. Then what happens? Let A_i = f^i(A_0), where A_0 is the set we just constructed. Notice that for all i, j in Z with i different from j, the intersection of f^i(A_0) and f^j(A_0) is empty. Why is that? Because A_0 contains only one point from each orbit, so no two iterates of A_0 can intersect. If there existed some y in the intersection of f^i(A_0) and f^j(A_0), then f^{-i}(y) and f^{-j}(y) would both be in A_0. But these are two different points, because i is different from j and no orbit is periodic, and both belong to the orbit of y. That is not possible. So this means that all the A_i are disjoint.
So all the A_i are pairwise disjoint, and so the measure of the union over i from minus infinity to infinity of the A_i is equal to the sum over i from minus infinity to infinity of mu(A_i). So the question is, what is this measure? Here we know the answer, because what is this union? It is exactly S^1: A_0 contains exactly one point from every orbit, and for each of these points we are taking the full orbit, since A_i = f^i(A_0). So the measure of the union is 1, for the measure that gives the circle total length 1. And what is the measure of each A_i? It is the same as the measure of A_0: mu(A_i) = mu(A_0) for all i, because f is a translation. So what are the possibilities? If mu(A_0) is zero, the sum is zero. If mu(A_0) is non-zero, the sum is infinite. In both cases we do not get 1. So this contradicts the assumption that there exists a measure mu which is countably additive and translation invariant. For the moment this is just a motivating example; in a moment I will construct a measure that does satisfy these properties in a suitable sense. I was trying to do two things simultaneously, but I cannot. The reason I gave this example is that the real problem is that we implicitly assumed that the measure mu was countably additive on all possible subsets of the space. That is really the source of the contradiction.
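Written out in symbols, the argument above is the following chain, assuming as in the lecture that the circle is normalised to have total measure 1 and that mu is defined on every subset.

```latex
\[
S^1 = \bigsqcup_{i \in \mathbb{Z}} A_i, \qquad A_i = f^i(A_0), \qquad
\mu(A_i) = \mu(A_0) \ \text{ for all } i \in \mathbb{Z},
\]
\[
1 = \mu(S^1) = \sum_{i=-\infty}^{\infty} \mu(A_i)
  = \sum_{i=-\infty}^{\infty} \mu(A_0)
  = \begin{cases}
      0      & \text{if } \mu(A_0) = 0, \\
      \infty & \text{if } \mu(A_0) > 0.
    \end{cases}
\]
```

Neither value equals 1, so no countably additive, translation-invariant mu can assign a measure to A_0.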
What I showed is that if there exists a measure with these two properties which is defined on all possible subsets of the space, this leads to a contradiction. We do not want to give up countable additivity, because it is fundamental to our intuition of what a measure means. Translation invariance relates to a specific measure, Lebesgue measure, which I will describe in a second and which is the natural generalization of the length of an interval. We don't want to give up the possibility of a translation-invariant measure either, because length is translation invariant. So what we need to give up is the requirement that the measure be defined on all possible subsets of the space. This is the motivation for the notion of a sigma-algebra, which I'm sure you have seen before. I don't know how much it was justified to you that measures are defined on sigma-algebras of subsets and not on all subsets, but this is the explanation. So let's give the definition. Let X be a set and let A be some collection of subsets of X. We say that A is an algebra of subsets of X if the following conditions hold. One, the empty set belongs to A. Two, if a set belongs to A, then its complement belongs to A; this says that A is closed under taking complements. Three, for any finite collection A_1, ..., A_n in A, the union of A_i for i = 1 to n belongs to A; this says that A is closed under taking finite unions. We say that A is a sigma-algebra if, moreover, three prime holds: if A_i belongs to A for all i in N, then the union of A_i for i = 1 to infinity belongs to A. So it is called a sigma-algebra if it is closed under taking countable unions. And if A is an algebra, we define the sigma-algebra generated by A to be the smallest sigma-algebra containing A.
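On a finite set the closure conditions in this definition can be carried out mechanically. The sketch below is an illustration only: it assumes X is finite (so every algebra is automatically a sigma-algebra, since there are only finitely many sets to unite), and `generated_algebra` is a hypothetical helper name.

```python
from itertools import combinations

def generated_algebra(X, generators):
    """Smallest family of subsets of the finite set X that contains the empty
    set and the generators and is closed under complements and unions."""
    X = frozenset(X)
    family = {frozenset()} | {frozenset(g) for g in generators}
    while True:
        # Condition two: complements. Condition three: (pairwise) unions.
        new = {X - a for a in family}
        new |= {a | b for a, b in combinations(family, 2)}
        if new <= family:      # nothing new appeared, so the family is closed
            return family
        family |= new

algebra = generated_algebra({1, 2, 3, 4}, [{1}, {2}])
for s in sorted(algebra, key=lambda s: (len(s), sorted(s))):
    print(set(s) if s else "empty set")
```

Here the generators {1} and {2} cannot separate 3 from 4, so the set {3, 4} behaves like a single atom and the generated family is strictly smaller than the family of all subsets.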
In general a sigma-algebra contains many more subsets than an algebra, obviously, because countable unions also have to belong to it. An easy example: again let X be the unit interval [0, 1], and let A be the set of all finite unions of intervals, plus of course the empty set. The intervals can be open or closed or half open, it doesn't matter. Is this an algebra? Yes, because the complement of a finite union of intervals is also a finite union of intervals, and it is clearly closed under finite unions. Is it a sigma-algebra? No. First of all, you can clearly find a countable union of disjoint intervals that is not a finite union of intervals. And, most interestingly, just as in the example of the Cantor set that we saw before, the complement of a countable union of intervals may contain no intervals whatsoever. So this is a nice example of how, when you pass from the algebra to the sigma-algebra, you really enlarge the family of sets you are dealing with. In the algebra everything looks like intervals; in the sigma-algebra you get sets like the Cantor set that look nothing like intervals. So the collection of finite unions of intervals is an algebra but not a sigma-algebra. Now a general definition: for a metric space X, the sigma-algebra generated by all open sets (it then automatically contains all closed sets) is called the Borel sigma-algebra. These things are familiar, right?
So now the key definition. A measure is a function mu on a sigma-algebra A, with values in [0, infinity], such that for any countable collection A_i in A of disjoint sets, the measure of the union of the A_i is equal to the sum of the measures of the A_i. Yes, and the measure of the empty set is zero; we can include that. I wasn't sure whether that is an independent requirement or follows from additivity, but you're right: if we allow the measure to take the value infinity, it has to be required separately. In general we will work with finite measures: if mu(X) is finite we say that mu is a finite measure, and in that case it is automatic. If mu(X) = 1 we say that mu is a probability measure. It is easier to work with probability measures, and in fact working with finite measures is the same thing as working with probability measures, because if mu-hat is a finite measure we can define mu by mu(A) = mu-hat(A) / mu-hat(X) for all A in our sigma-algebra. Then mu is a probability measure, because if you put X in here you get mu-hat(X) over mu-hat(X), which is equal to 1. (This needs mu-hat(X) to be non-zero; you're right, but it is not so interesting to work with the zero measure.) So this is just a normalization: you divide by the measure of the whole space and you have a probability measure. I'm glad most of you are familiar with this; still, I think it's good to clarify our notation. How do we construct such measures? This is a key problem, and the most fundamental tool is the following theorem, which is perhaps the basic result of measure theory. Let X be a set and let mu-hat be a finitely additive function on an algebra
A-hat of subsets of X (to be precise, mu-hat should also be countably additive on A-hat whenever a countable disjoint union of elements of A-hat happens to lie in A-hat again). Then there exists a unique countably additive function mu on the sigma-algebra A generated by A-hat which coincides with mu-hat on A-hat. This is sometimes called the Hahn-Kolmogorov, or Carathéodory, extension theorem, and it is a really very powerful and important result. Let's look at an example, going back to what we started with at the beginning of the lecture. Suppose X is the unit interval [0, 1], let A-hat be, as before, the set of finite unions of intervals, and let mu-hat from A-hat to [0, 1] be the length of an element a of A-hat. So we have our set and we have our algebra of subsets: we said before that the finite unions of intervals form an algebra because the collection is closed under complements and finite unions. And we have our function. Is this function finitely additive? If you take a finite number of disjoint elements of this algebra, then the total length is just the sum of the lengths; that is what I mean by the length. So mu-hat is clearly finitely additive, and with a little more work one checks the countable additivity condition on A-hat. What the theorem says is that there exists a unique extension to the sigma-algebra generated by A-hat: you start with the finite unions, you take countable unions of these and complements of these countable unions, and so on, and you include all of these subsets in the collection. In this case this is precisely the Borel sigma-algebra on the interval. So there exists a unique measure mu from A to [0, 1], on the Borel sigma-algebra A generated by A-hat, which coincides with mu-hat on A-hat. This is a kind of review, so I am not giving you the proofs of all these things; I am just trying to convey the basic notions.
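The function mu-hat on the algebra of finite unions of intervals is concrete enough to compute. The following is a small sketch, assuming each element of the algebra is given as a list of pairs (a, b) with a <= b (endpoints do not affect length); `length_of_union` is a name invented for this illustration.

```python
def length_of_union(intervals):
    """Total length of a finite union of intervals given as pairs (a, b).
    Overlapping pieces are merged first so that nothing is counted twice."""
    total = 0.0
    current_start, current_end = None, None
    for a, b in sorted(intervals):
        if current_end is None or a > current_end:
            # A new connected piece starts; bank the length of the previous one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = a, b
        else:
            # The interval overlaps the current piece, so extend it.
            current_end = max(current_end, b)
    if current_end is not None:
        total += current_end - current_start
    return total

A = [(0.0, 0.25)]
B = [(0.5, 0.75), (0.75, 1.0)]
# A and B are disjoint, so finite additivity says the lengths add up.
print(length_of_union(A), length_of_union(B), length_of_union(A + B))
```

The extension theorem is what takes this elementary function from finite unions of intervals to every Borel set; the code only covers the starting point.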
What is this measure called? That's right, this is exactly Lebesgue measure. In particular, although it is not completely trivial, Lebesgue measure is translation invariant, because length is translation invariant. So this takes us back: it is the measure we wanted for the example of the non-measurable set I gave before. What that counterexample shows, not just in some sense but exactly, is that even though the theorem says you can extend this function to a countably additive function on the sigma-algebra generated by A-hat, you cannot in general extend it to the collection of all subsets of X. You can extend it to the sigma-algebra but not to all subsets. There are subsets that are not in the sigma-algebra, and these are not measurable, in the sense that it does not make sense to talk about the measure of these sets. Another observation: we will be using this theorem implicitly a lot, in the sense that often, to understand a measure, it is sufficient to understand it on the algebra. Because of the uniqueness of the extension, when you want to construct a measure with certain properties, or prove that a measure has certain properties, very often it is enough to establish them on an algebra that generates the Borel sigma-algebra, for example one built from open and closed sets, because that determines the properties of the measure on the whole sigma-algebra. This is just a comment that I will refer back to. So, just a few more things.
One of the key uses of a measure is to generalize the notion of integration, as you know. So let me remind you briefly what it means to integrate a function with respect to a measure, because we will be using that a lot. So, integration. Let (X, B, mu) be given. This triple defines what we call a measure space: we have a set, a sigma-algebra of subsets, and a countably additive measure on the sigma-algebra. In general, because we will always be working with metric spaces, unless we specify otherwise we will assume that we work with the Borel sigma-algebra. Measures defined on the Borel sigma-algebra are sometimes called Borel measures. Most of the time I will not repeat this, but for the moment let's be completely clear and write out the full definitions. For A in the Borel sigma-algebra, we define the characteristic function of A, a function 1_A from X to {0, 1}, by 1_A(x) = 0 if x does not belong to A and 1_A(x) = 1 if x belongs to A. A simple function is one of the form g = the sum from i = 1 to n of c_i 1_{A_i}, for disjoint subsets A_1, ..., A_n in the sigma-algebra (thank you, yes, disjoint) and constants c_i in R. So this is a function g from X to R. In general the c_i can be negative too, but for now we take the c_i non-negative. What does this look like? We have some sets A_1, A_2, A_3, A_4 and the function is a kind of step function: it takes the value c_1 on A_1, c_2 on A_2, c_3 on A_3 and c_4 on A_4. This is the graph of the simple function. But keep in mind that in general these subsets are just elements of the sigma-algebra. They are not necessarily open sets.
They are not necessarily disjoint open sets or intervals; the picture is only schematic. In the case of the interval, and this is crucial for having a proper perspective on how powerful integration with respect to a measure is, A_1 could be a Cantor set, A_2 could be another Cantor set disjoint from the first, and so on. So even though schematically it looks like a step function, the function can really be very wild. For example, take the function g(x) = 0 if x is rational and 1 if x is irrational. If you try to draw the graph, it is not at all simple; you cannot draw it. But it is a simple function in this sense. I think this is an important observation, so as not to be misled by the simple diagrams we always draw for these things. This is why it is so powerful: we have understood that there is a very large class of sets we can work with, and you can take the characteristic function of any of these sets and define simple functions like this. And simple functions we can integrate. What is the integral of a function? It is just the area under the graph, as in your usual notion of integral. So for a simple function, the integral of g d mu is by definition the area under the graph: c_1 times the measure of A_1, plus c_2 times the measure of A_2, and so on. Here we use the fact that we have a measure. So the integral is simply the sum from i = 1 to n of c_i mu(A_i). That is the easy case, simple functions. Next come general non-negative measurable functions, but I have not yet given the definition of a measurable function, so let me do that first.
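The formula for the integral of a simple function is a finite sum and can be sketched directly. In this illustration a simple function is represented, by assumption, as a list of pairs (c_i, mu(A_i)); the sets A_i themselves never enter the computation, only their measures, which is exactly why they are allowed to be as wild as a Cantor set. The name `integrate_simple` is hypothetical.

```python
def integrate_simple(terms):
    """Integral of g = sum_i c_i * 1_{A_i}, given pairs (c_i, mu(A_i)) for
    disjoint measurable sets A_i: the sum of c_i * mu(A_i)."""
    return sum(c * mass for c, mass in terms)

# A step function taking the values 2, 1, 3 and 0.5 on four disjoint sets,
# each of measure 1/4 (for instance four quarters of the unit interval).
g = [(2.0, 0.25), (1.0, 0.25), (3.0, 0.25), (0.5, 0.25)]
print(integrate_simple(g))
```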
A function g from X to R is measurable if g^{-1}(A) is in B for all Borel sets A in R. So a function is measurable if the pre-image of every measurable set in R (in R we take the Borel sigma-algebra; sometimes one says Borel measurable) belongs to the sigma-algebra B that we are given on X. It is a way of saying that the function is not too crazy: it does not, for example, take the value 1 on a non-measurable set and 0 everywhere else. That is something we cannot deal with; we cannot integrate it. Then, for a non-negative measurable g, we define the integral of g d mu to be the supremum of the integral of g-hat d mu over all simple functions g-hat with 0 less than or equal to g-hat less than or equal to g. So we have a function which in principle is very wild, and we approximate it from below by simple functions. I think most of you have seen this before; I am just reviewing. I will not give complete proofs, but it turns out that this is a very good definition and gives a very good notion of integrability for non-negative functions. If the function is not non-negative, you split it in two. For a general measurable g from X to R, let g^+(x) = max{g(x), 0} and g^-(x) = -min{g(x), 0}. Both of these are non-negative functions: g^- is minus the minimum of 0 and g(x), so you only pick up g(x) where it is negative, and you take its negative so it becomes positive or zero.
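For reference, the two definitions just given can be written in symbols as follows, with g-hat denoting a simple function as in the lecture; the identities on the last line follow directly from the definitions of g^+ and g^-.

```latex
\[
\int g \, d\mu \;:=\; \sup \Big\{ \int \hat g \, d\mu \;:\; \hat g \text{ simple},\ 0 \le \hat g \le g \Big\}
\qquad (g \ge 0 \text{ measurable}),
\]
\[
g^+(x) = \max\{g(x), 0\}, \qquad g^-(x) = -\min\{g(x), 0\}, \qquad
g = g^+ - g^-, \qquad |g| = g^+ + g^-.
\]
```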
And similarly for g^+. Notice that the integral of a non-negative function can be infinite: the supremum can be infinite, and in some cases it is. So each of these two integrals could be infinite. But both g^+ and g^- are non-negative functions, so we can define the integral of each. If the integral of g^+ d mu is finite and the integral of g^- d mu is finite, then we say that g is mu-integrable, and we let the integral of g d mu equal the integral of g^+ d mu minus the integral of g^- d mu. Yes, the maximum and minimum are taken pointwise, at every point. We let L^1(mu) be the set of all mu-integrable functions. Notice the big difference from Riemann integration. There, you take an approximation and you say the function is integrable if the limit converges. Here there is no such problem: for a non-negative measurable function the integral is always well defined, and the only question is whether it is finite or not. So when we say integrable in this setting, we do not mean that the integral is or is not well defined, just that it is finite. An easy example: take the function g(x) = 0 if x is rational and 1 if x is irrational, on [0, 1]. This is a simple function, so in fact we could have done this before. If mu is Lebesgue measure, then the integral of g d mu is equal to what? The measure of the irrationals in [0, 1], because this is a simple function that takes the value 0 on the rationals and 1 on the irrationals. So the integral is 0 times the measure of Q ∩ [0, 1] plus 1 times the measure of [0, 1] \ Q. It could be 0 or 1, or something else; which is it? What is the measure of the rationals, and why? Because the set is countable. And why does that make the measure zero?
Because it is countable, by countable additivity: each point has measure zero, and the set is a countable union of points, so it has measure zero. So the irrationals have full measure, measure 1, and the integral is equal to 1. Notice that this integral depends on the measure we are integrating against, not just on the function. This is important. There are many other measures, and in this course we will work with many different kinds. In general there are many measures on a set X. For example, take again the unit interval with the Borel sigma-algebra, and let p be some point in X. Then the Dirac delta at p, delta_p, is a measure: delta_p(A) = 0 if A does not contain p, and 1 if A does contain p. So here is the interval [0, 1] and here is the point p, and what we are saying is that this measure lives completely on the point p. The set {p} is an element of the Borel sigma-algebra, because individual points belong to the Borel sigma-algebra. I take any set in the sigma-algebra and define its measure in this way, and it is almost trivial to check that this satisfies all the properties of a measure, especially additivity: since the measure is all concentrated on this point, if you take a union of disjoint sets, either none of them contains p and the measure of all of them is 0, or exactly one of them contains p and the measure of the union is 1, whatever the others do. So what is the integral? Take the same function g as before. What is the integral of g with respect to this measure? In what way does it depend on p? Exactly: because the measure is completely concentrated at p, the value of the integral is always exactly g(p).
Look at the definition of the integral I had before. What I'm doing is integrating a function, whatever this function is, and the result is always g of p, independently of the function. In this case I get g of p, so I get that this integral is either 0 or 1, because those are the only values that this function takes. If I take a function that takes on all sorts of values and I use the measure that is concentrated only at p, then this measure will only see the value of g at p. In some sense, the measure is giving a weighting to the different regions in space, and it is weighting the values of the function according to the distribution of this measure in space. If the measure is completely concentrated on p, then every other point in space has weighting 0, so it's just irrelevant in terms of this integral. The value of the function at the point p is the only thing that counts. It's a good exercise to actually check this in terms of the definition of the integral. Okay. So I think this is almost everything. There's one more thing I wanted to say, which is that there is the space of all measures. We will see that there are many other examples of measures, so it will be useful to talk about the space of all measures. So let's suppose that X is a compact metric space, and let B be the Borel sigma algebra. Then we write M for the space of Borel probability measures on X. This space has a lot of measures. For example, if X is my unit interval, how many examples of measures have we already seen? We've seen Lebesgue measure. We've seen the Dirac delta on p, but we also have the Dirac delta on any other point. These are all different measures. For any point, you can have the Dirac delta on that point.
So just by itself like that, it is an uncountable set of probability measures there. Okay. And then you have other measures. If you have another point q here, you can define, for example, a measure mu which is equal to one half of delta p plus one half of delta q. This means that half of the measure lives on p and half of the measure lives on q. So in this case, the integral of a function g with respect to mu would be one half of the value of g at p plus one half of the value of g at q, because this measure sees these two points and assigns them equal weightings. And there are lots of other measures. I'm giving you kind of random examples here. Another example of a measure is the following. Let phi from X to R be continuous. Let mu be a probability measure. Suppose the integral of phi d mu equals one. Then we define a probability measure mu phi in the following way: mu phi of A equals the integral of phi d mu over A. It looks like I'm casually making this up, but all of these are very important examples that we're going to use a lot in the course. So what does this mean? We take our space X. We have our measure mu on the space, which is a probability measure. And then we take a function phi, which is continuous, which we want to be a positive function, and such that the area under the graph is equal to one. In fact, this function doesn't need to be continuous. Sorry, it just needs to be integrable. So let's write phi in L1 of mu. And suppose that the area under the graph is equal to one. Then I can use this to define a new measure. How do I define the new measure? So I take a set A.
I want to know what the measure of this set A is here. Well, I define the measure of this set A as the area under the graph over the set A. This is exactly what I have here. No, this will still be a probability measure on the same Borel sigma algebra. We define a probability measure mu phi on the same sigma algebra as mu. You fix the sigma algebra, right? In general, you always fix the sigma algebra. So you have a sigma algebra on which mu is defined, and what you get is a new measure defined on the same sigma algebra. For every measurable set A you can define this object here, which is the area under the graph over the set A. And again, remember that this A does not need to be a nice open set like this. It can be a Cantor set. It can be a very crazy set. But I can define this integral on the set A in just the same way: it is equal to the integral of phi multiplied by the characteristic function of A, d mu. So this is just integrating a function the way I defined the integral before. Okay. So these are examples to show that you have very many probability measures. You have all the Dirac delta measures. You have all the combinations of Dirac delta measures, like one half of this Dirac delta plus one half of that one. You can take two million Dirac delta measures and take a combination with non-negative weights adding up to one, and that still gives you a probability measure. It's a way of distributing your mass. The way I think of measures is this. You have a kilogram of sand, and you take this kilogram of sand and you just distribute it however you want on your space. Then for every measurable set, you see how much sand is sitting on that set. When you take the Dirac delta measure, you put all your kilogram of sand just on that point. So all your mass is on that point.
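The measure defined by a density can be sketched numerically. This is only an illustration under stated assumptions: mu is taken to be Lebesgue measure on the unit interval, the density phi(x) = 2x is a hypothetical choice whose integral over the interval is one, and the set A is restricted to an interval so that a midpoint Riemann sum can stand in for the integral (for a continuous phi on an interval the two integrals agree). The function name `mu_phi` is made up here.

```python
def phi(x: float) -> float:
    """Hypothetical density on [0, 1]; its integral over [0, 1] equals 1."""
    return 2.0 * x


def mu_phi(a: float, b: float, n: int = 10_000) -> float:
    """Approximate mu_phi([a, b]), the integral of phi over [a, b],
    with a midpoint Riemann sum on n equal subintervals."""
    h = (b - a) / n
    return sum(phi(a + (i + 0.5) * h) for i in range(n)) * h


total = mu_phi(0.0, 1.0)   # the whole kilogram of sand, approximately 1
left = mu_phi(0.0, 0.5)    # mass on the left half, approximately 0.25
right = mu_phi(0.5, 1.0)   # mass on the right half, approximately 0.75

print(total, left, right)
```

Lebesgue measure would give each half mass one half. With this density the total is still one, but more of the sand sits on the right half.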
When you take a function like this, you're just redistributing your mass a little bit. If this was the unit interval and your mu was Lebesgue measure, for example, then all you're doing here is shifting the mass slightly. If you took the constant function one here and you did this, you would get again exactly Lebesgue measure. But if you take a function that is not constant, then the measure of a set is no longer its Lebesgue measure, but a little bit more or a little bit less, depending on the shape of the function there. So there are many ways of distributing the sand, and a probability measure is just one of those many ways. Okay. So it is very useful to have some kind of structure on the set of all these measures, because it's a huge set and it contains measures which are very complicated to understand. By a structure we mean a topology on this space, a way of saying when two measures are close. For example, we take this measure mu phi, we take Lebesgue measure, or we take the Dirac delta at the point p. Are these measures close in some way? On this space, the natural topology is called the weak star topology, because it comes from a certain standard topology in functional analysis. It is very simple to define in terms of the integral. If we take a sequence of measures mu n in M and a limit measure mu in M, then we say that mu n converges to mu in the weak star topology if the integral of phi d mu n converges to the integral of phi d mu for all continuous functions phi from X to R. Here you're integrating the function with respect to the measure on the whole space, on all of X.
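As a numerical illustration of this definition, here is a small Python sketch with Dirac measures on points x_n converging to a point p. The choices p = 0.5, x_n = p + 1/(2n), and the three continuous test functions are assumptions made only for this sketch, and checking three functions is of course not a proof of convergence for all continuous functions.

```python
import math


def integrate_against_dirac(phi, point):
    """The integral of phi with respect to the Dirac delta at `point` is phi(point)."""
    return phi(point)


p = 0.5
# x_n = p + 1/(2n) stays inside [0, 1] and converges to p.
points = [p + 1.0 / (2 * n) for n in range(1, 10_001)]

# A few continuous test functions on [0, 1] (illustrative choices).
test_functions = {"cos": math.cos, "square": lambda x: x * x, "exp": math.exp}


def gap(phi, x_n):
    """Distance between the integral against delta_{x_n} and against delta_p."""
    return abs(integrate_against_dirac(phi, x_n) - integrate_against_dirac(phi, p))


first_gaps = {name: gap(phi, points[0]) for name, phi in test_functions.items()}
final_gaps = {name: gap(phi, points[-1]) for name, phi in test_functions.items()}

for name in test_functions:
    print(f"{name}: gap at n=1 is {first_gaps[name]:.6f}, "
          f"gap at n=10000 is {final_gaps[name]:.6f}")
```

For each test function the gap between the two integrals shrinks as n grows, which is what the definition asks for.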
So you can check, as an exercise, that if you take a sequence of Dirac measures on points x1, x2, x3 and so on, and this sequence of points converges to p, then this sequence of measures converges to the Dirac measure on p in this sense. It's a simple exercise, because when you integrate a continuous function against a Dirac measure you just get the value of that function at that point, and by continuity these values converge to the value at p, for every continuous function. So this is a certain way of saying that the measures are close. All these definitions are important for what we will do. For everything we've said in this lecture, I am presuming that you are somewhat familiar with most of these things, because I've not given a course in measure theory, just a review. If there are some things that you really don't know or that you found difficult, you should review these aspects. I can give you some references, but we will not really need anything more than what I said. So to some extent, as long as you feel comfortable, even if you do not fully understand the proofs and so on, it might be sufficient, but you should think about these definitions and try to understand them. Let me take just two more minutes and give two final definitions. When you look at this space of probability measures and you take two measures, you not only want to know how close they are in this topology, but sometimes you want to compare them in different ways. So I want to give two definitions that compare measures in two particular ways. Let mu1 and mu2 be probability measures. Then mu1 is absolutely continuous with respect to mu2 if mu2 of A equals 0 implies mu1 of A equals 0, for all measurable A, of course.
I am always presupposing that we have fixed a certain sigma algebra and that we always take sets from that sigma algebra. Okay, I'll give the second definition and then just a couple of quick examples. So that was the definition of absolutely continuous. This is mutually singular: mu1 and mu2 are mutually singular if there exists a set A in the sigma algebra such that mu1 of A equals 1 and mu2 of A equals 0. For notation, when mu1 is absolutely continuous with respect to mu2 we write mu1 << mu2, and when they are mutually singular we write mu1 ⊥ mu2, so mu1 is orthogonal in some sense to mu2. Mutually singular is quite easy to see. It means that they live on different sets: you put your kilogram of sand for mu1 on some set and your kilogram of sand for mu2 on a different, disjoint set. That's what it means. Absolutely continuous is a little bit trickier, so let me give a quick example. Again let's take the interval 0, 1, it's easiest. Let's suppose that mu2 is Lebesgue measure and mu1 is of the form mu1 of A equals the integral of phi d mu2 over A, for some phi in L1. This is just the example I gave before. Then what is the relation between mu1 and mu2? Is mu1 absolutely continuous with respect to mu2? Or are they mutually singular? Why is that? What does absolute continuity of the Lebesgue integral mean? Well, we don't need small here, we just need 0. It's clear that if the measure of A is 0 then the integral of phi on A will be 0. That's what gives the absolute continuity. If mu2 of A equals 0, then mu1 of A, which equals the integral over A of phi d mu2, equals 0, because what we're doing here is looking at the values of the function phi weighted on a set of measure 0. The weight of mu2 on A is 0, so the integral is 0. The measure does not give any weight to the set A. It does not see the set A in some sense.
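The two definitions can be tried out on a toy model. The sketch below assumes a finite space where every subset is measurable and a probability measure is just a dictionary of point masses; this is not the setting of the lecture (the unit interval with Lebesgue measure), only a finite analogue where both conditions reduce to checks on individual points. The function names and the four example measures are made up for illustration.

```python
def is_absolutely_continuous(mu1: dict, mu2: dict) -> bool:
    """mu1 << mu2 on a finite space: every point with mu2-mass 0 has mu1-mass 0.
    Every mu2-null set is a union of such points, so this covers all sets A."""
    points = set(mu1) | set(mu2)
    return all(mu1.get(x, 0.0) == 0.0 for x in points if mu2.get(x, 0.0) == 0.0)


def are_mutually_singular(mu1: dict, mu2: dict) -> bool:
    """Mutually singular on a finite space: no point carries mass for both,
    so the set A where mu1 lives has mu1(A) = 1 and mu2(A) = 0."""
    points = set(mu1) | set(mu2)
    return all(mu1.get(x, 0.0) == 0.0 or mu2.get(x, 0.0) == 0.0 for x in points)


# Hypothetical probability measures on the four-point space {0, 1, 2, 3}.
uniform = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
weighted = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}   # a density with respect to uniform
left_half = {0: 0.5, 1: 0.5}                  # all the sand on {0, 1}
right_half = {2: 0.5, 3: 0.5}                 # all the sand on {2, 3}

print(is_absolutely_continuous(weighted, uniform))
print(are_mutually_singular(left_half, right_half))
print(is_absolutely_continuous(uniform, left_half))
```

Here `weighted` is absolutely continuous with respect to `uniform`, the two half measures are mutually singular, and `uniform` is not absolutely continuous with respect to `left_half` because it puts mass on points that `left_half` does not see.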
So measures given by a density like this are the typical example of absolutely continuous measures. In fact, there's a theorem called the Radon-Nikodym theorem that says that if one measure is absolutely continuous with respect to the other, then it is always of this form. It has what's called a density: phi is called the density of mu1 with respect to mu2. So the typical form of two measures where one is absolutely continuous with respect to the other is that one is given in this form in terms of the other. What about if we take now mu1 equal to Lebesgue measure and mu2 equal to the Dirac delta on some point p? They're mutually singular. What is the set? For example, the set consisting of the point p itself. It has measure 1 for the Dirac delta and measure 0 for Lebesgue, because every point has Lebesgue measure 0. What if we take two Dirac deltas, delta p1 and delta p2, at two different points? Then they're mutually singular, clearly. You just take the set consisting of p1 and it satisfies the definition. These are some examples, but you can have fun thinking about other examples. I think this really covers the whole crash course, a full course in probability theory, in just two hours. This is most of what we need. It's useful to know these things. Starting from the next lecture, we will apply this to understand dynamical systems, as I promised. Thank you very much.