Good. So, today I want to explain to you what a dynamicist means when he or she says that a deterministic system behaves in a probabilistic manner, and of course also to explain how this concept is related to ergodicity. Let me write a note here, so we remember what I am talking about: deterministic systems (let me write larger) behaving probabilistically, and the relation to ergodicity. If you trace the roots of these concepts, you will probably end up studying works by Poincaré or maybe Boltzmann, where they try to study complex physical systems. It is possible to explain these concepts in a modern way by using two theorems, which I will try to explain first: the strong law of large numbers, which is from probability theory, and Birkhoff's ergodic theorem, which is from dynamical systems. I will do it one at a time. So I'll start with the strong law of large numbers, and I'll try to explain it using examples. I myself am not a probabilist, I do more geometry, so if you find anything that is not comprehensible or seems suspicious, do feel free to stop me and we can discuss. I'll start by stating this theorem, or law, and then try to explain, hopefully in less than half an hour, what it means. First I'll write the theorem just as it is and then I'll explain it, so don't worry if there is something you don't understand. Let me write it here.

Theorem (strong law of large numbers). Let (M, 𝓜, μ) be a probability space. (This is not a very good choice of letters, they all look like each other, but it will be clear what they are once I explain them.) Let (x_k) be a sequence of random observables, which are in particular functions x_k : M → ℝ with some properties I will explain later; here M is the set, or space, we are working with. Assume the x_k are independent and identically distributed.
Then for μ-almost every point p in M the limit exists and

lim_{n→∞} (1/n) Σ_{k=0}^{n-1} x_k(p) = ∫_M x_0 dμ.

Can everybody see this? Okay. Here the particular choice of zero does not really matter: as we'll see, "identically distributed" means that you can replace x_0 by any x_i, but I just don't want to write an extra sentence, so I put zero there. So this gives, somehow, some statistical or average information about the behavior of objects defined on our space.

Let's try to understand this theorem as quickly as possible with a simple example: throwing a die. First let me explain what these objects are. For those who already know it: M will be our space, μ is the measure, and 𝓜 is the collection of measurable subsets of our space. For those who don't, say you are doing a probability experiment where you throw a die. Then your outcomes lie in the set S = {1, 2, 3, 4, 5, 6}. Of course, when doing a probability experiment you are generally not interested in just one throw but in n throws, a million throws, two million throws. So you would ideally want to work with a space of the form

M = S × S × S × ⋯ = { (x_1, x_2, x_3, …) : x_i ∈ S },

an infinite product, which can of course be written as sequences in S. Here x_1 is the result of your first throw, x_2 the second throw, x_3 the third, and so on, and such a sequence represents an ideal probabilistic experiment. In this particular setting you can take 𝓜 to be simply all subsets of M, and then the measure will be the following; I will explain what it means after I write it. First of all, the measure will be a function μ : 𝓜 → ℝ, in fact into [0, 1]: it takes a subset of M and gives it a value, which will basically be the probability of observing an event inside that set. And you can define it like this.
If you take an element of 𝓜 (which here is all possible subsets), it will be of the form A = A_1 × A_2 × ⋯, where A_i ⊆ S, and the measure is defined using the number of elements in each factor:

μ(A) = ∏_{i=1}^∞ (#A_i / 6).

Note that each A_i is finite, with at most 6 elements, so each factor in the product is at most 1. Let's see what this does. It is just an extension of the basic probability calculations that you see, probably as an undergraduate, though not in this form. For instance, take the set A = {1} × S × S × ⋯, where the first factor is just the element 1 and all the other factors are all of S. Then μ(A) by this rule is 1/6, because there is just one element in the first factor out of the 6 elements of S, and all the other factors contribute 6/6 = 1. So, as you see, this just calculates the probability of getting 1 on the first throw. You can then generalize this experiment to other kinds of observables you want to study, but this is the basic setting you should keep in mind when, for instance, you are looking at this theorem.

Now, using this, let me explain what it means to be independent and identically distributed. By the way, for those who are not familiar with integration with respect to arbitrary measures: just imagine ∫ dμ to be a volume, for the purposes of this lecture. It is much more general, of course, but anyway, it is some notion of integral that we have here. In the theorem you are integrating a function over your space, so in particular these observables should be integrable. Let us start with the easier notion. A sequence of random observables is identically distributed if, in some sense, your chance of seeing a certain value is the same for all of them, and this can be formulated in the following way.
So it should by now be clear that notions such as "chance of seeing" or "probability of something happening" are precisely related to this measure, so you should expect a statement about the measure when I say identically distributed. The x_k are identically distributed if

μ({ p : x_i(p) ∈ L }) = μ({ p : x_j(p) ∈ L })

for all i, j and all L (let me call the set L, so we do not have double j's). Again, the left-hand side can in some sense be seen as the probability of observing values inside L for the i-th observable, and the same for j, and this is saying that it is the same for all i, j and all possible ranges of values. If you see the same probabilistic behavior for two random variables, then they are identically distributed.

Now, independence; I have to move over here, sorry, if that seems okay for everybody. Independence drastically means that if you know that one of the random observables achieves a certain value, you gain no information about what the other random observable will be; they are basically independent of each other. Like if you toss a die, wait, and toss another die: the result of the first throw will not affect the result of the second. So the results are like independent observables. I'll also formalize this, but first let me write what it means to be independent. The x_k (maybe I should stress that this is a collection of random observables) are independent if the probability, if you like,

μ({ p : x_i(p) ∈ K and x_j(p) ∈ L }) = μ({ p : x_i(p) ∈ K }) · μ({ p : x_j(p) ∈ L })

for all subsets K and L of ℝ and for all i ≠ j. If you look at the second factor of the product, it is just the probability of x_j taking values inside L. (Here also, sorry, K and L are subsets of ℝ; in our particular setting they could also be sets of integers, but anyway.)
So the right-hand side is just the chance of x_i taking values inside K times the chance of x_j taking values inside L, and the condition is saying that if you are given that x_i(p) is already inside K and you look at the probability that x_j(p) is inside L, it is the same as without that information. So they are not, sort of, affected by each other.

An example in the case of dice is just the following, and in fact it formalizes the feeling that dice throwing is independent and identically distributed. You define the following random variables (I will write the sequence with y's now):

y_k((x_1, x_2, x_3, …)) = x_k;

y_k just selects the k-th element of the sequence, so this random observable tells you what the k-th outcome is. If you look at these random observables, I will leave it as an exercise to show that the y_k are i.i.d., which is short for independent and identically distributed. Now if you apply, for instance, the strong law of large numbers in this particular case of random observables, what you get is the following: for μ-almost every p,

lim_{n→∞} (1/n) Σ_{k=1}^{n} y_k(p) = ∫_M y_1 dμ = (1+2+3+4+5+6)/6 = 3.5.

Computing the integral on the right-hand side is an exercise again, but the crucial point is "μ-almost every p". I will just make a side remark here: if you take, for instance, the sequence (1, 1, 1, 1, …), it obviously does not satisfy this; but if you collect together all the sequences inside M, all representing possible outcomes of your dice throws, which fail this equality, they will still be a very small set in the sense of measure.
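The dice example can be sketched numerically. A minimal simulation (the sample size, the random seed, and all variable names are illustrative choices, not from the talk): the empirical frequency of throwing a 1 approaches the cylinder-set measure 1/6 computed above, and the running average of the throws approaches ∫ y_1 dμ = 3.5, as the strong law predicts for a typical sequence.

```python
import random

random.seed(0)
n = 200_000
# One truncated "point" of the sequence space S x S x S x ...:
throws = [random.randint(1, 6) for _ in range(n)]

# The set A = {1} x S x S x ... has measure 1/6; the fraction of
# experiments whose first throw is 1 is approximated here by the
# frequency of 1 across many independent throws.
freq_one = sum(1 for t in throws if t == 1) / n
print(freq_one)        # close to 1/6 ≈ 0.1667

# Strong law of large numbers: the average of y_1, ..., y_n approaches
# the integral of y_1 with respect to mu, which is (1+2+...+6)/6 = 3.5.
running_avg = sum(throws) / n
print(running_avg)     # close to 3.5

# Exceptional sequences such as (1, 1, 1, ...) exist, but together they
# form a set of mu-measure zero, so a random sample will not produce one.
```

The two printed values illustrate the two sides of the theorem: a frequency computed along one typical sequence, and the averaged value predicted by the measure.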
Here "μ-almost every point p" means the following, for those not familiar with the notation: there exists a subset of M which has full measure, and on which the equality is true. By the way, note that since we are working in a probability space (okay, not a very good choice of letters), the measure of the whole space is one. And this is what we obtain here; in this ideal case it also says that a typical sequence of dice throws will have this statistical behavior.

By now (let me check, okay) it should also be clear that when you take a physical phenomenon and you want to model it using probability, the aim is not to do the following: "I have this die here, I know at which angle I am going to throw it, I know its weight, I know the direction, and so on; what is the next result I get?" This is not what probability aims at. Probability aims to get statistical or averaged behavior of the system you are studying. This brings up a question which is also related to the concepts we are trying to understand (let me make sure I'm on track, okay). If you are given a physical system, like dice throwing, or other systems like an economy or the propagation of a disease, how do you know, is there a mathematical way of showing, that such a system can be reasonably modeled by probability? How do we know that when we write a model like this for dice throwing, the results we obtain by calculations here reflect the patterns you get by throwing real dice? This is very much related to the concepts we are trying to explain; in fact, you will see by the end of the talk that ergodicity is a partial answer to this, a mathematical partial answer. And in fact it was some of the main motivation of, let's say, Poincaré or Boltzmann, basically to say
that for complex physical systems, instead of finding an exact model, you can study them probabilistically. This was the idea and direction which led to things like the ergodic theorem, and which in fact led to the notion of a deterministic system behaving probabilistically. So, to pass to that part, I am now going to explain Birkhoff's ergodic theorem. Okay, I'll keep the previous theorem here so that you see the similarity between the two. This theorem is supposed to give us some information about the statistical behavior of deterministic systems; whatever that means, you will understand now. It tells the following.

Theorem (Birkhoff). Let (M, 𝓜, μ) be a probability space. Let f : M → M be a measurable transformation. Here "measurable" is just some sort of regularity condition which is weaker than continuity but which is about as weak as you can reasonably go; decreasing the regularity of a function beyond that is not very reasonable. Assume that the measure μ is f-invariant and ergodic. Now take any random observable x on your space, not a sequence this time, just some random observable. Then (basically, I don't need to define anything else) for μ-almost every point p we have

lim_{n→∞} (1/n) Σ_{i=0}^{n-1} x(f^i(p)) = ∫_M x dμ.

Of course I will also explain this theorem, but the similarity is more or less obvious. I'm not saying yet how the two theorems are related, but obviously the similar part is the conclusion, and in both cases you have a set of critical conditions which give you the result you want. Before I explain the relation between these conditions, I will make a small remark for people wondering what this even has to do with a deterministic system: what is the system being studied? This is just the simplest abstract model of a
physical system that you can imagine. Let me motivate it as a side remark in the following way. The original systems that Poincaré or Boltzmann were studying were Hamiltonian flows: of three planetary bodies, or (sorry) Hamiltonian flows of interacting gas particles inside a box. Such a system defines a flow on the phase space, a map ℝ^{6n} × ℝ → ℝ^{6n}, where n is the number of particles (hence the six: each particle has position and momentum coordinates) and the ℝ factor is the time; let me call this variable t, the time variable. One way to simplify this is just to pass to discrete time: fix a small time scale t_0 and look at f = the time-t_0 map of the flow. This is just one way of discretizing to a map; there are many other ways, some of them known as Poincaré maps, and so on. But here f is basically supposed to represent the discrete-time evolution of a certain physical system, whatever it is. Then if you look at the further iterates of this map (let me continue over there), starting from a point, you are just more or less following the discrete-time evolution of the system, for instance in this kind of setting; there may be more complicated ways, but I'll not talk about them. Also for dice throwing: you can try (although it will be almost impossible, I guess) to model a dice throw as a set of maps with some parameters, like the speed of the throw and the angle of the throw, such that when you apply the map for the given speed and so on, you get the next result. So that could also be modeled as a family of maps, not just one map; this is another example where such a setting comes into play. All of this is to make the question of the relation between deterministic systems and probability mathematically rigorous: this is the simplest setting of a dynamical system that you can imagine. Now that we at least hopefully understand that this represents
some behavior of a dynamical system, let's try to understand the relationship between these conditions. Of course I'll first give the definitions, as always. μ is f-invariant if, for any measurable subset A (some set whose measure you can compute), tautologically,

μ(f^{-1}(A)) = μ(A).

Of course one might be wondering why you are inverse-iterating the map rather than forward-iterating. The most satisfactory answer that I have for this is that it is a technical issue: if the map is, for instance, not invertible, then if you take a measurable set you don't know whether its forward images are measurable or not, but you do know that its inverse image is measurable. So this is one technical reason why you would want to formulate it like this. Geometrically, it means the following. Since f may not be invertible, f^{-1} is a set-valued map: when you look at all the inverse images of A you get some sets B_1, B_2, B_3, all of which map into A in a non-one-to-one manner; points from different pieces might end up at the same point. Invariance says that if you compute the total measure of these pieces, it is the same as the measure of A. So this is somehow a notion that something is left invariant, and we'll later see (let me just make a side remark) that this will be related to identical distribution.

Now the second condition: μ is f-ergodic if f^{-1}(A) = A implies that either μ(A) = 1 or μ(A) = 0. The dynamical intuition about this condition is the following (again, f^{-1} is to be viewed as a set map in case f is not invertible). If you have an invariant set, then you can decompose your space into two pieces such that the two sets do not interact with each other in terms of dynamics: all the points of one are sent into it, all the points of the other are sent into the other. So in some sense you can divide your space into two
parts which you can break down and study separately, and ergodicity is saying that beyond a non-significant level this is not possible: if there is some invariant set inside your dynamics, then it's almost the whole space, so you don't have this picture; it's almost the whole space that you are working with. (Sorry? Uniqueness of what? Yes, yes.) So, basically, as I have said: in any case, if you have such a set, you can break your whole space into these two parts, each of which just sort of evolves inside itself; the invariance condition f^{-1}(A) = A is enough to get that. But now you look at the measures of these sets, at whether each is a significant part of the whole space or not, and ergodicity basically says that in such a case either μ(A) is one or μ(A) is zero. So you cannot decompose your space into two significant parts which do not interact; it always sort of mixes within itself. Intuitively it looks something like this: if you start at a point, the orbit is not restricted to just one region. Of course, these things are all defined up to full measure, so there will probably be some small sets which do not satisfy this. (Yes, if you have such a thing it violates ergodicity: if there is a significant amount of initial conditions which start in some trapped region and stay there, then it violates ergodicity.) And our aim now is to see that invariance is related to identical distribution, and ergodicity will somehow be related to independence.

Having said this, I have to go over here, sorry. Of course, when you see these two theorems together for the first time, you might wonder: can I not prove Birkhoff's theorem simply by demonstrating that the observables x_i = x ∘ f^i are i.i.d.? (i.i.d.: ah yes, sorry, independent and identically distributed.) So this is where determinism kicks in. These are, as I will explain now, deterministic iterates of a fixed map, so there is no way they can be
independent, although, as I will show, they are identically distributed; I want to demonstrate that first, with maybe five minutes, let me see, okay. So: the x_i are identically distributed. Let's look at what it means, at how the values of the x_i are distributed in this case. Take

μ({ p : x(f^i(p)) ∈ L }), where L is a subset of ℝ.

Now x(f^i(p)) ∈ L means that f^i(p) ∈ x^{-1}(L) (where again this is just the set map), which means p ∈ f^{-i}(x^{-1}(L)), where f^{-i} denotes the i-th inverse image under the map. So if you take the set Q = { q : x(q) ∈ L } = x^{-1}(L), then you see that the set above is equal to f^{-i}(Q). And if you keep iterating the invariance condition, you can also prove the following: for any k, for any number of inverse iterates,

μ(f^{-k}(Q)) = μ(Q),

which is induced by the first condition. Then you obtain that μ({ p : x_i(p) ∈ L }) = μ(Q) for every i. So all of these observables, which are built from the iterates of your map, have their values distributed according to just the original random observable: they are identically distributed, and this is how invariance is related to identical distribution.

Now, to show how ergodicity is related to independence, I first need to show that these observables are actually dependent: the x_i, as random observables, are not independent. Heuristically, this could be explained in a dynamical manner in the following way. Let's say you know that, for a fixed j, x_j(p) lies inside some range of values; this particularly means, by the same calculations that we have done, that your initial starting point is inside some region in your phase space, or configuration space, or what you like. Now you can look at all the possible orbits: basically you look at all the iterates of the set where your point falls, and this will
give you, somehow, the distribution of the other random observables, because x_k(p) is just x(f^k(p)). So in particular, the information that our initial point is in a certain region tells us that all the forward iterates are possibly trapped inside some region, and this will possibly prevent x_k from achieving some values that it otherwise could. That is a heuristic argument telling you why they are not independent: they are related to each other in a deterministic manner.

This can be formalized a bit more, with a more convincing argument (although one would of course need a specific example to carry it out), in the following way. Take some subset U of M such that for some k, f^{-k}(U) ≠ U: you have U and some inverse iterate of U (possibly also more of them, all the inverse iterates) which does not coincide with the original set, and such that

μ(f^{-k}(U) \ U) > 0.

This is a generic situation which you can imagine happening in many systems, so this would be a way to construct an example. Now take any set A, sort of take any set in that difference, so that its inverse image is moved off of it, and you will have

μ(f^{-k}(A) ∩ A) = 0, while μ(A)² > 0,

because the intersection is the empty set, while A is obviously some positive-measure set (not one, sorry, bigger than zero). Now, why does this contradict independence? As your random observable you just take (I'll leave the details as an exercise) the characteristic function

x(p) = 1 if p ∈ A, x(p) = 0 if p ∉ A.

If you write the condition of independence using this function and using this particular pair of sets, you see that you would need

μ(f^{-k}(A) ∩ A) = μ(f^{-k}(A)) · μ(A),

where by invariance μ(f^{-k}(A)) is again just μ(A), so the right-hand side is μ(A)² > 0, while the left-hand side is zero. So, as I have demonstrated with a, let's say, thought experiment, independence is not reasonable to expect in a generic situation. These observables are related to each other in an obvious way by a deterministic
algorithm, and if you know the value of one, then you know the values of the others, more or less; so they are definitely not independent. So what happens in this case is that the independence condition on random observables is basically replaced by the ergodicity assumption. And once you have proven that a particular physical system you have at hand satisfies invariance and ergodicity, it becomes reasonable to study observables on this space in a probabilistic manner. By "in a probabilistic manner" I mean the following: x could be, say, the energy of your configuration at a point. Then the theorem would just say that instead of evolving your system, instead of finding an exact solution for your system and computing the energy everywhere, if you are just interested in the asymptotic behavior (by which I mean the limit; of course, when you get close to your limit is also very hard to estimate, but anyway, I'll leave that aside), then instead of calculating the exact thing you just do an averaging process. This is of course what physicists call the case where the time average equals the space average. You see that the quantity

(1/n) Σ_{i=0}^{n-1} x(f^i(p)),

which looks like just a summation, is actually a time average: you just follow the orbit of your point, sum your observable along it, and then divide by the total time, in a discrete manner. (There is also an analogue of this for flows, but this is the discretized version.) And ∫_M x dμ is obviously the space average, or what people call the ensemble average, where you just compute the average energy considering all configurations, rather than a simulation or an exact solution. Of course, this also tells you that if you take this particular system, which is invariant and ergodic, and now, as in the dice example, you construct the space of infinite products and define the measure in the same way, so the measure of a set would be, again, let's call it μ̄,
μ̄(A_1 × A_2 × ⋯) = μ(A_1) · μ(A_2) ⋯. Now if you pick some points with some probability using this measure (this is a vague thing, I know, but it is just a way of thinking about it; this is again the dice example, where certain sets have certain probabilities), so if you pick a generic point inside this space, up to full measure, a random point, which means a random sequence of points, then asymptotically its behavior will be the same as if you were calculating this average along the orbit. So this is also another way to see how this turns into a probabilistic setting. This gives a partial answer to understanding when you can study a physical system statistically, studied using probability.

Now of course come some other questions, for which I have little time, but I'd like to give a few remarks, going back to the original questions of Poincaré and of course other people: is it possible to study the ergodicity of physical systems? For those who are interested: well, the answer is not always positive. They were probably expecting that almost all, or generic, Hamiltonian systems should be ergodic (by a Hamiltonian system I mean some physical system that obeys Newton's laws, basically). But there is a theorem, called the Kolmogorov–Arnold–Moser theorem, that says there are actually quite large classes of Hamiltonian systems which are not ergodic. I mean, yes, basically, which are not ergodic. For anybody interested in going further, I would suggest looking in that direction.

Okay, let me see if I have anything left. Okay, I still have 10 minutes, so I will just also add one more final remark. What we have done here is take a physical system and show that it behaves probabilistically. But you can also show that if you take a truly probabilistic setting, like this one, you can view it as the most basic ergodic dynamical system, in the following way. So this is somehow
establishing a relation in the reverse direction. As always, the example to keep in mind is the dice throw. Now assume that we are here, that we have all these things: as in the dice example, take your infinite product space, which will be a space of sequences, and assume that each probability experiment is independent and identically distributed, which means that you can define a measure on this space in the following way:

μ̄(A_1 × A_2 × ⋯) = μ(A_1) · μ(A_2) ⋯,

where μ is the first given measure. And define the following shift map on this space:

σ((x_1, x_2, x_3, …)) = (x_2, x_3, x_4, …);

it takes a sequence and gives you the next one, basically shifts the sequence to the left. You can show that σ is invariant and ergodic with respect to μ̄, and this will somehow imply (I'll just show briefly) that you obtain the previous theorem for a special class of functions. So take again a characteristic function: for A ⊆ S, let p denote the sequence and define x(p) = 1 if the first entry x_1 is inside A, and x(p) = 0 if it is not (I will just leave checking these things as a small exercise). Then you can see that basically

x(σ^{k-1}(p)) = 1 exactly when x_k ∈ A,

so this function checks whether your k-th result falls in this collection of outcomes or not: you do a collection of experiments and you wonder which of them fall inside A, where A is a subset of S. Now if you apply Birkhoff's ergodic theorem to this sequence of observables, what you get is

lim_{n→∞} (1/n) Σ_{k=1}^{n} x(σ^{k-1}(p)) = ∫ x dμ̄.

What is the quantity on the left? This quantity is precisely the average number of times your experiment results in an outcome inside this set. And the quantity on the right (yes, yes, sorry) is precisely the measure of A:

∫ x dμ̄ = μ̄({ p : x_1 ∈ A }) = μ(A).

So if you assume Birkhoff's ergodic theorem, you can now turn your probability space of independent and identically
distributed trials (which basically means this: you have identically distributed and independent trials given by μ) into a dynamical system, and prove this famous theorem, that the asymptotic frequency of observing a certain class of events in a sequence of experiments is equal to its measure, or equal to its probability. You obtain this theorem in a dynamical manner.

And finally: I didn't have enough time, of course; I realized after preparing the talk that I had too much to talk about, but I'll put some notes on the web, on my web page. As you see, this is just average information about certain random observables, averaged information about, let's say, the energy of your system. So this is a very crude probabilistic behavior; you just have some information about the averages. You can actually also relate what is called the central limit theorem in probability to the decay of correlations (well, this is both a probabilistic term and a dynamical term, I guess) that is present in a dynamical system, to show that in some cases you can even get much finer information about these observables individually, rather than just their averages. But I will leave that to people who are interested, to look at the notes that I will put on my website. Finished.

(Where are the notes?) I don't remember right now, sorry, but it is there: if you type SISSA and my name into Google, there will be a directory inside the SISSA web page which links to my web page, and you can find it like that. Or you can email me; okay, you can email me, that's also another option, which is less painful. Okay.
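A minimal numerical sketch of this last construction (the choice A = {1, 2}, the sample size, and all variable names are illustrative, not from the talk): i.i.d. die throws form one "point" of the sequence space, and the Birkhoff average of the indicator observable along shift iterates recovers the frequency of outcomes in A, which converges to μ(A).

```python
import random

A = {1, 2}                         # a set of outcomes in S = {1,...,6}; mu(A) = 2/6 = 1/3

random.seed(0)
n = 150_000
p = [random.randint(1, 6) for _ in range(n)]   # truncated point of S x S x S x ...

def sigma(seq):
    """Shift map: drops the first entry, shifting the sequence to the left."""
    return seq[1:]

def x(seq):
    """Indicator observable: 1 if the first entry of the sequence lies in A."""
    return 1 if seq[0] in A else 0

# Birkhoff average (1/n) * sum_k x(sigma^k(p)). Since x(sigma^k(p)) just reads
# the (k+1)-th entry of p, we avoid materializing all the shifted copies:
birkhoff_avg = sum(1 for entry in p if entry in A) / n
print(birkhoff_avg)                # close to mu(A) = 1/3

# Sanity check that the shortcut agrees with literally iterating sigma,
# on a short prefix of the sequence:
prefix, vals, seq = p[:5], [], p[:5]
while seq:
    vals.append(x(seq))
    seq = sigma(seq)
assert sum(vals) == sum(1 for e in prefix if e in A)
```

The printed average is the "time average" of the talk's final argument; its agreement with μ(A) = 1/3 is exactly the frequency interpretation of probability recovered dynamically.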