Yeah, so let's go back to where we stopped last time, which was the logistic map, and the idea of a generalized dimension. To recall: the generalized dimension D_q was defined as

    D_q = lim_{epsilon -> 0} [1/(1-q)] log( sum_j mu_j^q ) / log(1/epsilon).

The picture is that you take the full phase space, whose measure is normalized to unity, and break it up into cells; C_j is the j-th cell, and mu_j is the invariant measure associated with it. So I have in mind a system which is ergodic and displays chaos, in which a typical phase trajectory wanders round and round in the phase space, or in some portion of it such as an attractor. I take that portion, break it up into cells C_j, associate a letter of an alphabet with each cell, and do the symbolic dynamics of the trajectory, in the sense that at each iteration I keep track of which cell the typical trajectory is to be found in. Having done that, I compute this generalized dimension, which is a kind of weighted sum of the invariant measures of the cells raised to the power q. The set of numbers D_q gives information about the nature of the attractor. In general there is no reason why the D_q should be integers; q itself need not be an integer, and D_q is some function of q. The general statement I made was that D_q is a non-increasing function of q: as q increases from minus infinity to plus infinity, D_q never increases, maybe not strictly decreasing, but certainly non-increasing. The specific statement was: if q' > q, then D_{q'} <= D_q. That was the statement I made about these generalized dimensions.

The next thing we saw was that for a one-dimensional map whose invariant density is uniform, you can break the interval into n cells of equal size 1/n, and it is then an easy matter to show that D_q = 1, the topological dimension of the attractor: the fractal dimension is the same as the topological dimension, the attractor being the entire interval. But if the invariant density, the measure, is not uniform, this is no longer true, and the question was what happens for the maps we looked at whose invariant densities we know. So let me take that up now and compute D_q for the logistic map at fully developed chaos, mu = 4. Recall that the map function is f(x) = 4x(1-x), and the normalized invariant density is rho(x) = 1/[pi sqrt(x(1-x))]. Now what do we do? Here is the interval [0,1]; I break it up into n cells of equal size, with boundaries at 1/n, 2/n, and so on: C_1 is [0, 1/n], C_2 is [1/n, 2/n], and finally C_n is [(n-1)/n, 1]. I am going to compute the measure of each cell, take the limit n -> infinity, and compute D_q. That's our program. The measure mu_j of the j-th cell is the integral over C_j, which runs from (j-1)/n to j/n, of dx rho(x), that is, of dx/[pi sqrt(x(1-x))]. Now n is going to become very large and the size of each cell very, very small, so what is this approximately equal to? In any of these cells, say the j-th, the value of the integral, as the interval becomes smaller and smaller, is the value of the integrand at the midpoint multiplied by the length of the interval. The length of the interval is 1/n, so mu_j is 1 over (n pi) times 1 over the square
root of [(j - 1/2)/n][1 - (j - 1/2)/n], because the cell runs from (j-1)/n to j/n and its midpoint is (j - 1/2)/n. So that is what mu_j is, and D_q can now be written down, at least formally. Putting epsilon = 1/n, so that log(1/epsilon) = log n, we have

    D_q = lim_{n -> infinity} (1/(1-q)) log { sum_{j=1}^{n} [ (1/(n pi)) ( ((j - 1/2)/n)(1 - (j - 1/2)/n) )^{-1/2} ]^q } / log n.

This is the quantity whose limit I want. Now, a sum from 1 to n suggests very strongly that you actually have an integral here: you could convert this sum back into an integral once again. What would that integral be? If I forget the overall factor n^{-q} and pull out a 1/n, the sum is related to the integral from 0 to 1 of dx [ pi sqrt(x(1-x)) ]^{-q}, because a Riemann sum can always be written back as an integral, provided the integral converges. And when is it going to converge? The integrand has singularities at 0 and 1, and because of the square root it converges as long as q/2 < 1; if q/2 > 1 it diverges, and when q/2 = 1 you have dx/x, which is logarithmically divergent. So the integral converges for q < 2 and diverges for q >= 2, and it immediately means you have to consider two cases separately: case 1, q < 2, and case 2, q >= 2.

So let's look at q < 2 first. Pull the factor (1/n)^q = n^{-q} out of the sum; what remains under the sum is [ pi sqrt( ((j - 1/2)/n)(1 - (j - 1/2)/n) ) ]^{-q}, which is just the function [ pi sqrt(x(1-x)) ]^{-q} evaluated at the cell midpoints. If I divide this remaining sum by n, then in the limit n -> infinity it is exactly the integral from 0 to 1 of that function; and having divided by n, I had better put the n back. So the numerator is the log of n^{-q} times n times the Riemann sum, that is, the log of n^{1-q} times a quantity that tends to the integral; and since (1 - q) log n = log n^{1-q}, this becomes very simple to evaluate:

    D_q = 1 + lim_{n -> infinity} (1/(1-q)) log { integral_0^1 dx [ pi sqrt(x(1-x)) ]^{-q} } / log n,

where the first term, unity, comes from log n^{1-q} divided by (1-q) log n. Recall that q < 2, so this integral is finite; it converges, and the log n in the denominator then ensures that the second term goes to 0 completely. So it is immediately clear that D_q = 1 for q < 2. D_0 had better be equal to 1, because that is exactly what the fractal dimensionality of the attractor was, and the attractor was the full unit interval, so D_0 was in any case equal to 1. Now we discover that D_q
for all q less than 2 is equal to 1. If you had a uniform density, as in the Bernoulli shift or the tent map at fully developed chaos, then D_q = 1 for all q; but here we can assert it only for q < 2, and we have to re-examine the problem for q >= 2. So let's do the next step, which is to go back and ask what happens in the second case, q >= 2, when we cannot convert the summation into an integration because the integral does not converge, although the sum itself is completely finite. What do we do in this case? In the denominator you are again going to get (1 - q) log n. In the numerator an n^{-q} again comes out, but you cannot convert the rest into an integral, so there is not much use doing that. Instead, simply multiply each of the two factors inside the square root by n: that gives an n^2 under the square root, that is, an n, which cancels the 1/n in front, and each term of the sum becomes [ (1/pi) ( (j - 1/2)(n - j + 1/2) )^{-1/2} ]^q. So

    D_q = lim_{n -> infinity} (1/(1-q)) log { sum_{j=1}^{n} [ (1/pi) ( (j - 1/2)(n - j + 1/2) )^{-1/2} ]^q } / log n.

Now what sort of sum is this? For small j, near the lower end, j - 1/2 is a finite small number while n - j + 1/2 is very large, of order n, so that factor dominates and those terms go to 0 like 1/n^{q/2}. At the other end of the sum, for j near n, the second factor is a small number and the first is of order n, so once again the terms go like 1/n^{q/2}. For the terms in the middle, near j = n/2, both factors are of order n/2, and therefore those terms disappear faster, like 1/n^q, because there is a square root of n^2. So the conclusion is: you have a very large number of terms for very large n; the terms at the lower end vanish like 1/n^{q/2}, the terms at the upper end also vanish like 1/n^{q/2}, and the terms in the middle vanish faster, like 1/n^q itself. The whole thing can therefore be written in the following form (one can make this much more rigorous, but this is as simple a way of understanding it as any):

    D_q = lim_{n -> infinity} (1/(1-q)) log { n^{-q/2} sum_{j=1}^{n} (something or other) } / log n,

where n^{-q/2} is the rate at which every surviving term vanishes, and the sum tends to a finite, non-negative limit as n -> infinity once this dominant factor has been pulled out; the terms in the middle go to 0 even faster, but there are certainly lots of contributions of the dominant kind. That immediately implies that the log of the finite sum, divided by log n, goes to 0, while log n^{-q/2} gives -(q/2) log n, so

    D_q = (-q/2)/(1 - q) = q/(2(q - 1)).

That is all, and this is the answer for q >= 2. What happens at q = 2? The 2s cancel and you have D_q equal to 1.
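Both branches of this spectrum can be checked numerically by evaluating the finite-n sum directly from the exact invariant density, using the midpoint approximation for mu_j introduced above. This is an illustrative sketch of mine (the function names and the choice n = 10^6 are not from the lecture, and the convergence in n is only logarithmic, so the agreement is approximate):

```python
# Finite-n estimate of D_q for the logistic map at mu = 4, using
# mu_j ~ rho((j - 1/2)/n) / n with rho(x) = 1/(pi*sqrt(x(1-x))), eps = 1/n.
import math

def d_q_finite(q, n):
    # (1/(1-q)) * log(sum_j mu_j^q) / log(1/eps),  eps = 1/n, q != 1
    s = sum((1.0 / (n * math.pi * math.sqrt(x * (1.0 - x)))) ** q
            for x in ((j - 0.5) / n for j in range(1, n + 1)))
    return math.log(s) / ((1.0 - q) * math.log(n))

def d_q_exact(q):
    # the limiting spectrum derived above
    return 1.0 if q < 2 else q / (2.0 * (q - 1.0))

for q in (0.5, 3.0, 5.0):
    print(q, round(d_q_finite(q, 10**6), 3), d_q_exact(q))
```

The finite-n values track the two branches, D_q = 1 below q = 2 and q/(2(q-1)) above it, with corrections of order 1/log n.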
So there is continuity in D_q at q = 2, and beyond that it decreases. If you sketch D_q against q for this problem, this is what it looks like: it starts with the value unity, stays at 1 all the way up to q = 2, then drops at that point, and asymptotically as q -> +infinity it tends to one half; it is easy to see that this is the shape. So this is the spectrum of generalized dimensions for the logistic map at fully developed chaos. This set of numbers D_q is something like the set of moments of the invariant measures, weighted appropriately; it is an infinite set of numbers, actually a function in this case, and it gives you information about the distribution itself. Anything for which D_q is a genuine function of q, and not just a constant, is not just a fractal: there are many, many other dimensionalities buried in this attractor, and such an object is called a multifractal. D_q is sometimes called a set of multifractal exponents; it is related to other exponents and other functions which also characterize this distribution. The whole thing is an effort to find out, from this coarse-grained dynamics, what the nature of the attractor is. It is like finding the moments of a distribution and then reconstructing the probability distribution itself; that is really what is being done.

Let us take another example with similar behavior, and then we will try to draw a general lesson from it. This was the map in which you had intermittency: the square-root cusp map, f(x) = 1 - 2 sqrt(|x|), with x in the interval [-1, 1]. Because there is a marginally unstable fixed point at -1, the invariant density in this case is a linear function, rho(x) = (1 - x)/2: it piles up near the left-hand side and tapers off to 0 at x = +1. You have to do exactly the same thing as before, and once again you expect a sort of "phase transition" at some value of q. Where would that be? Ultimately what you are trying to do is the integral over [-1, 1] of rho(x)^q; as long as it is finite, you can convert the summation into an integration and do the problem very easily. When does it diverge? Clearly when q becomes negative enough, namely q < -1, because the integral of dx/(1 - x) is logarithmically divergent, and any higher power of (1 - x) in the denominator diverges as well. So you expect funny business at q = -1, and beyond that D_q remains essentially constant; and indeed that's true. I request you to go through exactly the same steps as before and show, as an exercise, that in this problem D_q = 2q/(q - 1) for q <= -1. What does this become at q = -1? It becomes 1, and, strictly speaking, it remains 1 thereafter. And as q -> -infinity it hits the value 2. So D_q starts at 2 asymptotically on the far left, bends down, comes to 1 at q = -1, and stays at 1 from there on; that is the way D_q looks in this case.

So it is immediately clear that if rho(x) is some kind of polynomial, or a rational function, then the zeros of the numerator and the zeros of the denominator, as in the other case, produce points where the slope of D_q changes abruptly. In fact it is not hard to show, and I would like you to show, the following general result. Suppose rho(x) is proportional, apart from some constant, to (x - a)^alpha / (b - x)^beta, with alpha, beta > 0, on the interval [a, b], so that it has singularities at the end points of the range in which the map is defined; certainly beta has to be less than 1, otherwise the density is not integrable. Then D_q has changes of slope at the points q = -1/alpha and q = 1/beta, one on each side. The graph looks as follows: D_{-infinity} tends to 1 + alpha; as q increases there is a phase transition at q = -1/alpha, after which D_q equals 1; it remains 1 up to q = 1/beta, and then it tapers off to the limiting value 1 - beta, so D_{+infinity} tends to 1 - beta. If there are more factors, more zeros and more places where the density diverges, they will all show up right there: all those points will in general give discontinuities in the slope of D_q. So this gives you some idea of what generalized dimensions look like. We will come across other examples among higher-dimensional fractals, but I urge you to complete these derivations. It is very simple: all you have to do is use exactly the technique I used for the logistic map. Whenever you can convert things to an integral, you should do so, and then the limit is easy to take; otherwise you must retain the sum and find out what its dominant behavior is. You can do this very rigorously, but what I did there was essentially right.

Okay, now let's go back a little bit and ask: we have some kind of coarse graining, but can we relate the dynamics of these coarse-grained dynamical systems, this symbolic dynamics, to some random process? It turns out the answer
is that in many cases you can. We saw what the idea of a generating partition was: I stated that if you divide the phase space into cells and keep track of the symbolic dynamics of the trajectory moving between these cells, and if you can reconstruct the actual dynamics by looking at these sequences of letters, then it is a generating partition. It is not at all clear that every system has generating partitions, but there are other cases where the partition has other kinds of interesting properties: the jumps of the system between the cells look exactly like a random process called a Markov chain. Let's see what that is. For that I need to introduce a few concepts in probability, especially Markov chains; it is not clear how much background I should assume here, but let's take our usual attitude and spend a few minutes on a digression on Markov chains or Markov processes, depending on whether time is discrete or continuous.

Let me assume that I have a random variable X which changes with time; the time could be discrete or continuous, it doesn't matter. Suppose this variable generates a time series as we go along. How do you specify the probabilities or distributions of this variable? If it is a continuous variable I must talk about a probability density rather than a probability; I will use the two words interchangeably for the purposes of this little digression. One piece of information is the probability, or probability density, that the variable has the value x at time t, or is in a range x to x + dx at time t; let me call it P1(x, t), some non-negative function. (If time is discrete the argument is n; for the moment let me use continuous time and come back to discrete time a little later.) But this is not enough. I also need to know the probability that the variable has the value x2 at time t2 and x1 at time t1; let's stick to the standard convention and write earlier times to the right and later times to the left. You need to know this two-time joint probability as well, and also the three-time probability that it is x3 at t3, x2 at t2 and x1 at t1, and so on. In general these are all different functions, and to specify the random variable completely, to find all possible correlations in it, you need an infinite number of joint probability distributions, with arbitrary numbers of time arguments.

However, by definition,

    Pn(xn, tn; xn-1, tn-1; ...; x1, t1) = P(xn, tn | xn-1, tn-1; ...; x1, t1) Pn-1(xn-1, tn-1; ...; x1, t1),

where the vertical bar denotes "given": given that the variable had the value xn-1 at tn-1, xn-2 at tn-2, and so on down to x1 at t1, the first factor is the conditional probability that it has the value xn at tn, and the second is the joint probability of all the conditioning events; it has n-1 time arguments, so let me call it Pn-1. Note that the conditional probability is a different object from the joint probability. This simply says that if you have a sequence of times t1 < t2 < t3 < ..., then the probability that you had x1 at t1 right up to xn at tn is, by definition, the probability of all the earlier events happening at the times indicated, multiplied by the conditional probability that, given those events have occurred, at the next time step you get the value xn. This is almost intuitively obvious. You can then start reducing: the (n-1)-time joint probability can itself be written with an (n-2)-fold conditioning, and so on, so that in general you can reduce the entire thing to a product of conditional probabilities until you hit the single-time probability. That's about all you can do in general.

For a Markov process the memory is short. To be specific, a first-order Markov process means that this conditional probability is independent of all the earlier events and depends only on the immediately preceding one: a one-step memory. If I call this the present state, then the future, the next step, is determined by the present and not by the past history. So for a Markov process the whole earlier conditioning gets wiped out. Let me state that again precisely, and then we will come back and write out our joint probabilities. The statement is

    P(xn, tn | xn-1, tn-1; ...; x1, t1) = P2(xn, tn | xn-1, tn-1),

identically: a function with only two time arguments. This is the Markov property, and this is called first-order Markov; if the conditional probability depends on the two preceding steps it is second-order Markov, and so on, but we are going to restrict ourselves to first-order Markov processes. (For continuous time these are probability densities rather than probabilities. A question was raised: here these are two different events at two different time instants, and the property holds at any pair of instants; but if time flows continuously, how close together can those two instants be? Arbitrarily close, and it is still true. That's a good question, and we will come back to what it implies: it will imply a certain renewal of the process as we go along. You can ask: if something happened at one instant and something at a later instant, we are saying the conditional probability depends only on what happened at the earlier of the two, but something else could have happened in between. Yes indeed, and therefore this quantity is going to obey some kind of chain condition which propagates you from one time to the next; we are going to see what that is. In continuous time the property is true as long as tn-1 < tn.)

But what is the implication of this? It is an enormous one, because it immediately implies that you can take the general n-time probability and write it as a product of conditional probabilities with just two time arguments each:

    Pn(xn, tn; ...; x1, t1) = P2(xn, tn | xn-1, tn-1) P2(xn-1, tn-1 | xn-2, tn-2) ... P2(x2, t2 | x1, t1) P1(x1, t1).

This is because the n-fold joint probability equals P2(xn, tn | xn-1, tn-1), with all the earlier dependence gone, multiplied by the (n-1)-time joint probability of the remaining events; but that can again be factored, and so on, until you end up with P2(x2, t2 | x1, t1) multiplied by the single-time probability of x1 at t1. So for a Markov process, or a Markov chain if t is discrete, this factorization is always possible. What does that mean in practical terms? It means that if you give me the function P2 and the function P1, the job is done: all joint distributions are known in terms of just two objects, the two-time conditional distribution and the one-time probability itself. That is the remarkable simplification that occurs
for a Markov process. Now we are going to make a further assumption: that the statistical properties of this random variable do not change with time, in other words that it is a stationary random process. This implies that you can shift all the time arguments by the same constant and nothing changes. What does that imply? It says that P1(x, t) is independent of the time argument: you can subtract the t out completely. In other words, if you ask for the average value of x, or the mean square value, or any moment of x, these should be independent of time; the k-th moment is by definition the integral of dx P1(x, t) x^k if the set of values of x is continuous, otherwise a summation, and the t-dependence appears only through P1. The only way all these moments can be independent of time is if the distribution itself does not depend on time. So for a stationary random process you have a further simplification: P1(x, t) = P(x), with no t-dependence, and since no confusion should arise I am going to drop the subscript and just call it P(x). It is a probability if x takes on a discrete set of values, a probability density if x takes on continuous values: the stationary distribution. And what happens to P2(x, t | x0, t0), the probability that you have x at time t given that you had the value x0 at time t0? It is independent of the choice of t0: you can subtract t0 from both time arguments, so it equals P2(x, t - t0 | x0, 0), where you could call t0 the origin of time. For simplicity of notation let me write this, in a slight abuse of notation, as P(x, t - t0 | x0): I won't even bother to write down the 0, and I use the same symbol P, though strictly I should distinguish P2 and P1; we know what is happening here. So there is no time argument at all in the single-time probability, and just a single time argument, the difference, in the conditional one. The n-time joint density then becomes

    P(xn, tn - tn-1 | xn-1) P(xn-1, tn-1 - tn-2 | xn-2) ... P(x2, t2 - t1 | x1) P(x1),

since each factor really depends only on the corresponding time difference. So the matter has simplified enormously: if you give me the two functions P(x) and P(x, t | x0), one time-independent and one with a single time argument, the job is done.

There is one more property which you would expect; you have to establish this, but you would expect it. Take the conditional density P(x, t | x0), which says: given that x0 happened at t = 0, as the initial condition, this is the distribution of the value at time t. What would you expect to happen to this as t tends to infinity? If things are reasonable and the system is sufficiently random, I would expect this probability to become independent of x0: the system should forget the initial condition altogether. So I would expect

    lim_{t -> infinity} P(x, t | x0) = P(x),

the stationary distribution itself, independent of the starting point. That is a consistency condition I put on the system, and if the dynamics has sufficient mixing, it turns out this is always going to be the case. So really there is just one function which you have to determine, because once I have P(x, t | x0), I take its limit as t tends to infinity and I know P(x), and then the joint probabilities for arbitrary n are all determined by this single function of a single time argument. And what is the initial condition on it? If x is a continuous variable, then in addition to the asymptotic limit above you have

    lim_{t -> 0} P(x, t | x0) = delta(x - x0),

not P(x0): this is the conditional density in x, and at t = 0 it should have support only at x0. For a continuous, normalizable variable, the analogue of a distribution of measure one concentrated at a point is a delta function, so that is the only thing that contributes and everything else is 0. So you have an initial condition, you have an asymptotic limit, and the entire process is determined by this single function. It is for this quantity that one generally writes down evolution equations, master equations, whatever you call them, because after all, in any physical problem it is only conditional densities that you can model. You cannot model absolute probabilities: given that a certain event or set of events has happened, you can talk about the future, but there is no way of writing equations for absolute probabilities. The equations one writes for Markov processes, for objects of this kind, are called master equations; there are several of them, some integral equations, some differential equations, and so on, and there is a very elaborate theory of Markov processes which tells you what these things are and how they evolve. But now let's go to specifics
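The forgetting of the initial condition is easy to see in the simplest finite-state setting. The three-state one-step transition matrix below is an arbitrary mixing example of mine, not something from the lecture: starting from delta distributions at two different states x0, repeated application of it drives P(x, n | x0) to the same stationary distribution.

```python
# A three-state Markov chain: T[i][j] = P(next state = i | current state = j).
# Entries are illustrative; each column sums to 1.
T = [[0.5, 0.2, 0.3],
     [0.3, 0.4, 0.3],
     [0.2, 0.4, 0.4]]

def evolve(p, steps):
    # apply the one-step transition matrix `steps` times
    for _ in range(steps):
        p = [sum(T[i][j] * p[j] for j in range(3)) for i in range(3)]
    return p

# delta-function initial conditions at two different starting states
p_from_0 = evolve([1.0, 0.0, 0.0], 50)
p_from_2 = evolve([0.0, 0.0, 1.0], 50)
print(p_from_0)
print(p_from_2)   # the two limits coincide: the chain forgets x0
```

After 50 steps the two distributions agree to within rounding error, and one further step leaves them unchanged: this is the stationary distribution P(x).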
and let's see how to write this down so let me work in the context of continuous processes and after that I will talk about jump processes we need to do both there are several ways of doing this but let me do this in terms of something which I will motivate in sort of an intuitive sort of way so I would like to write an equation for p of x, p and I drop the x0 for the moment because that's incorporated in initial condition saying that p of x, 0 is going to be some delta of x-x0 wherever I start then I ask if x is continuous runs over a whole set of values then I ask what's delta p over delta t equal to on this side the two ways in which I can have contributions to the increase of the probability mass at the value x one of them would be at if the system well let me let me motivate this go back a little bit and write so I go back right p of x, t x0 at 0 could be written in the following way because of this Markov property you could say that you start at x0 at 0 and you evolve to some value x' at some time t' so on the time axis you have 0 here you have t' here and you have t here t' is less than t but greater than 0 you propagate from here to there and then you propagate from x' to x in the remaining time but x' could have been anything in between and therefore you integrate over all possibilities and you expect a condition like this to be satisfied this is called the chain condition it essentially renews this process because you are really saying that this process has only one step memory so it goes from x0 to some intermediate state any intermediate value x' and then from x' to x in the remaining time and this is true for all t' between 0 and t and there is an integration over the intermediate steps this has got a long name it is sometimes called the Chapman-Kolmogorov equation or the chain condition because it is like a chain which propagate go from step one step two step two to the next step three and so on it should really be called the 
Bachelier-Smoluchowski-Chapman-Kolmogorov equation, because several people wrote it down at or around the same time, but we will not go into the history of this. This equation helps us write something down for the conditional density. I should immediately hasten to add that one should not be under the misconception that this equation is unique to Markov processes: other processes which are not Markov also obey such a chain condition, because it is in some sense a renewal property, and all renewal processes would obey similar equations; but Markov processes certainly do. What one does is the following: one takes t − t' to be infinitesimal, some small δt, very close to t, and then writes, in place of the propagator over that infinitesimal interval, a transition probability per unit time. That changes the chain condition from an integral equation which is nonlinear in p to a linear equation for the rate of change of p. From the chain condition it follows, under suitable conditions, that

∂P(x, t | x0)/∂t = ∫ dx' [ w(x | x') P(x', t | x0) − w(x' | x) P(x, t | x0) ],

where w(x | x') stands for the transition probability per unit time of making a transition from the value x' to the value x; it is a rate, because we are writing a rate equation. The first term is a gain term: you go from x0 to an intermediate point x', and then you jump from x' to x at this rate. But there is also a loss term: you might well have reached x itself at time t, and you can jump out of x to any other value x', which is the w(x' | x) term. This looks exactly like the rate equations you write down for chemical reactions or any other system governed by rate equations. Now, how do you write the same thing if you had a Markov
chain in discrete time, say? Well, first suppose that the values this variable can take, instead of forming a continuous set, form a discrete set x1, x2, x3, and so on, while time remains continuous. Then it is very straightforward: you still have a rate equation of this kind, but with a summation in place of the integral. Let me write the transition probability per unit time from the value xj to the value xi as wij, by definition. Then, if pi(t) is the probability of being at the value xi at time t, what would you write instead of the integral? You have to sum over all values, so the rate of change is

dpi(t)/dt = Σ'j [ wij pj(t) − wji pi(t) ],

where the sum over j runs over the whole set of values, which could be finite or infinite, and the prime on the summation denotes j ≠ i. In the integration this did not matter, because the point x' = x is a set of measure zero, but in a sum it matters, so you put a prime there. That is very much like the rate equations one writes down, and once you specify the matrix of transition probabilities, the job is essentially done. Now, what kind of solutions does this set of equations have? Let us see if there is a convenient way of writing it down.
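To see this gain-loss equation in action, here is a minimal numerical sketch (the three states and the rate values are made up purely for illustration) that integrates dpi/dt = Σ'j [wij pj − wji pi] with a crude Euler step:

```python
import numpy as np

# Hypothetical rates: w[i, j] = transition probability per unit time for
# the jump from state j to state i (diagonal entries are unused here).
w = np.array([[0.0, 1.0, 0.5],
              [2.0, 0.0, 1.0],
              [0.5, 3.0, 0.0]])

def dp_dt(p):
    gain = w @ p                  # sum_j w_ij p_j : jumps into state i
    loss = w.sum(axis=0) * p      # p_i sum_j w_ji : jumps out of state i
    return gain - loss

p = np.array([1.0, 0.0, 0.0])     # initial condition p_i(0) = delta_i1
dt = 1.0e-3
for _ in range(20000):            # Euler steps up to t = 20
    p = p + dt * dp_dt(p)

print(p, p.sum())                 # total probability stays equal to 1
```

The gain and loss terms cancel identically in the sum over i, so normalization is preserved automatically; by t = 20 the vector has relaxed to the stationary distribution, where dp_dt vanishes.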
Suppose I write the column vector with entries p1(t), p2(t), and so on up to pn(t), if you have, say, a finite number of states; if you do not, it keeps going forever, and we really do not care. But, by the way, what is the initial condition on this? It is anything at all; it depends on where you started. If I start at the value xk, then pi(0) = δ_ik: the probability of the state k is 1, and otherwise it is 0. So I am starting with some specific value at t = 0, and with that initial condition you have to solve this equation. Now let me call this column vector P(t). Then the set of rate equations looks very much like a matrix equation: it says that dP(t)/dt = W P(t) for some matrix W. And what is this W? What are its elements? Suppose i = 1: then dp1/dt contains the gain terms w12 p2 + w13 p3 + ..., so it is clear that W has off-diagonal matrix elements w12, w13, and so on along the first row. And then you have a diagonal element; what is that equal to? Put i = 1 again: there is a term equal to minus p1 times the sum over j of wj1, that is, minus (w21 + w31 + w41 + ...). So the structure is: the off-diagonal element Wij is just the rate wij, while the diagonal element Wii is minus the sum of all the rates wji out of state i, which is the sum of the other elements in the same column. You have a matrix equation of this kind, which is not very difficult to solve if W is a constant matrix, which is what it is in our stationary case; you then have a matrix W which determines all the transition rates.
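The construction of W from the rates takes only a couple of lines; here is a sketch with made-up numbers, checking that each column of W sums to zero and that W acting on a probability vector reproduces the gain-loss form of the master equation:

```python
import numpy as np

# Hypothetical rates: rates[i, j] = rate of the jump from state j to state i
rates = np.array([[0.0, 1.0, 0.5],
                  [2.0, 0.0, 1.0],
                  [0.5, 3.0, 0.0]])

# Off-diagonal: W_ij = w_ij.  Diagonal: W_ii = -(total rate out of state i),
# i.e. minus the sum of the other entries in the same column.
W = rates - np.diag(rates.sum(axis=0))

print(W.sum(axis=0))              # each column of W sums to zero

# W acting on a probability vector reproduces the gain-loss master equation
p = np.array([0.2, 0.5, 0.3])
gain_loss = rates @ p - rates.sum(axis=0) * p
print(np.allclose(W @ p, gain_loss))
```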
Its off-diagonal elements give you the transition rates from one state to another: W12 is the rate at which you jump from state 2 to state 1, W13 the rate from state 3 to state 1, W21 the rate from state 1 to state 2, and so on; and the diagonal elements are the negatives of the sums of the rest of the elements in the same column, so that the sum over each column of this matrix W is 0. So you have a matrix whose elements are all real, and whose diagonal elements are minus the sums of the remaining elements in their respective columns. The formal solution is of course P(t) = e^(Wt) P(0). If you start in a particular state, P(0) is a column vector with 0s everywhere except a 1 at that particular state, and e^(Wt) acting on that initial column vector gives you the probability vector over all the states. So this is how you would handle discrete-valued processes in continuous time. If you now go to discrete time as well, you have what is called a Markov chain, and that too is determined by precisely such a matrix of transition probabilities, something analogous to W. Once again: instead of an exponential e^(Wt), if time comes in discrete steps n, what do you think this will be replaced by? To go from time step 0 to time step 1 you apply the one-step transition matrix; to go from step 1 to step 2 you apply it once again; so it will be just that matrix raised to the power n. Just as the discrete version of the Laplace transform is the z-transform, this is simply a matrix power; it is exactly the same principle. So you need to know the matrix raised to an arbitrary power, which incidentally you also need to know if you want to exponentiate it. But there is an even easier way to handle the matter, and that is to look at what happens to the Laplace transform of this equation.
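Both solution formulas, continuous and discrete time, can be sketched together (again with made-up rates); scipy's expm supplies e^(Wt), and choosing the one-step matrix to be e^(W δt) makes the two answers coincide at matching times:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical rates as before; the resulting W has columns summing to zero
rates = np.array([[0.0, 1.0, 0.5],
                  [2.0, 0.0, 1.0],
                  [0.5, 3.0, 0.0]])
W = rates - np.diag(rates.sum(axis=0))

p0 = np.array([1.0, 0.0, 0.0])     # start in state 1 with certainty

# Continuous time: P(t) = exp(Wt) P(0), still normalized to 1
p_t = expm(W * 5.0) @ p0
print(p_t, p_t.sum())

# Discrete time: one-step matrix T applied n times, P(n) = T^n P(0)
T = expm(W * 0.1)                  # one-step matrix for a time step of 0.1
p_n = np.linalg.matrix_power(T, 50) @ p0
print(np.allclose(p_n, p_t))       # 50 steps of 0.1 reproduce t = 5
```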
Writing P̃(s) for the Laplace transform of P(t), the transformed equation says that s P̃(s) − P(0) = W P̃(s), which implies P̃(s) = (sI − W)^(−1) P(0), where I is the unit matrix. So if you have the resolvent (sI − W)^(−1) of the matrix W as a function of s, you have the Laplace transform, and you can then invert the Laplace transform to find P(t). This resolvent exists as long as s does not hit an eigenvalue of W; at the eigenvalues the resolvent blows up. You can already see from this that the situation is very similar to what we had for dynamical systems: the entire behavior of this probability vector is governed by the eigenvalues of the transition matrix W, just as the behavior of a dynamical system was governed by the eigenvalues of its Jacobian matrix. In that sense these Markov processes have really been reduced to a set of first-order equations, very much like a dynamical system, and very similar mathematics is used. What I am going to do next is to take a map like the tent map, construct a partition of it, and show that with this partition the discrete-time dynamics behaves exactly like a Markov chain, so that I can actually tell what happens to the n-time, arbitrary-time probabilities. We will see how that is done next time.
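As a closing sketch (a two-state example of my own, not from the lecture), the resolvent formula can be checked against an explicitly known P(t). With rate 2 for the jump 1 → 2 and rate 1 for 2 → 1, W has eigenvalues 0 and −3, the stationary vector is π = (1/3, 2/3), and P(t) = π + (P(0) − π) e^(−3t), whose Laplace transform is π/s + (P(0) − π)/(s + 3):

```python
import numpy as np

rates = np.array([[0.0, 1.0],     # rate 1 for the jump 2 -> 1
                  [2.0, 0.0]])    # rate 2 for the jump 1 -> 2
W = rates - np.diag(rates.sum(axis=0))     # W = [[-2, 1], [2, -1]]
p0 = np.array([1.0, 0.0])

# Eigenvalues of W: 0 (the stationary state) and -3 (the relaxation rate)
print(np.sort(np.linalg.eigvals(W).real))

# Resolvent form of the Laplace transform: P~(s) = (sI - W)^(-1) P(0)
s = 0.7
p_tilde = np.linalg.inv(s * np.eye(2) - W) @ p0

# Known answer for this W: P(t) = pi + (p0 - pi) exp(-3t), pi = (1/3, 2/3)
pi = np.array([1.0, 2.0]) / 3.0
analytic = pi / s + (p0 - pi) / (s + 3.0)
print(np.allclose(p_tilde, analytic))
```

The zero eigenvalue corresponds to conservation of probability (the stationary state), while the negative eigenvalue sets the relaxation rate, exactly as the eigenvalues of the Jacobian set the rates near a fixed point of a dynamical system.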