So, we started discussing Markov processes in the last class. Markov processes are also called Markov chains, and we are mostly interested in discrete-time Markov chains. In the last class we defined the Markov property, discussed what we mean by a time-homogeneous Markov chain, and connected the transition probability matrix with the n-step transition probability matrix. We said that P(n) = P^n. What does P(n) mean? It is the n-step transition probability matrix, and P is the one-step transition probability matrix, which we simply call the transition probability matrix. We also concluded that P(0) = I, the identity matrix. We showed that P(n+m) can be written as the product P(n)P(m); in particular P(n) = P(1)P(n-1), and by splitting repeatedly in this fashion we concluded that P(n) = P^n. Okay, fine.

Now, after completing this last time, I also wanted to introduce the notion of finite dimensional distributions, which I could not do. Let me finish this today before moving to another topic on Markov chains. We said that to completely characterize any stochastic process, I need to give its finite dimensional distributions. That is, for a discrete-time process whose random variables take discrete values, I need to characterize joint probabilities of the form

P(X_{n1} = j1, X_{n2} = j2, ..., X_{nm} = jm).

This is a joint distribution involving m random variables, and I need to give such joint distributions for all possible collections of my random variables to completely characterize the random process. Here I have taken some particular m random variables at time indices n1, n2, ..., nm; you can choose any time indices you want, and any number m. And j1, j2, ..., jm are the possible realizations these random variables can take, so I also need to define this for all possible values my random variables can take.

Now let us see what governs these finite dimensional distributions in the case of a Markov chain. What I can do is introduce the initial state and keep everything else:

P(X_{n1} = j1, ..., X_{nm} = jm) = sum over i0 in S of P(X0 = i0, X_{n1} = j1, ..., X_{nm} = jm).

If I do this, nothing changes; the two quantities are the same. I have just introduced the random variable X0 and summed over all the possible values it can take, and all possible values come from the state space S. Now you can split this probability as P(X0 = i0) times P(X_{n1} = j1 | X0 = i0), and you can keep doing this: first I brought X0 out and conditioned on it, then I brought X_{n1} out and conditioned the next term on that, and so on. Now let us try to understand the term P(X_{n1} = j1 | X0 = i0). What am I asking? At the beginning, in the 0th round, I am in state i0, and I am going to jump to state j1 in n1 rounds.
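To make the recap concrete, here is a minimal sketch, assuming numpy and a made-up two-state transition matrix, of how the n-step matrix P(n) is obtained by exponentiating P, and how the relation P(n+m) = P(n)P(m) can be checked numerically:

```python
# A minimal sketch, assuming numpy; the two-state matrix P below is a made-up example.
import numpy as np

P = np.array([[0.7, 0.3],
              [0.4, 0.6]])   # hypothetical one-step transition matrix (rows sum to 1)

def n_step_matrix(P, n):
    """P(n) = P^n: the n-step transition probability matrix."""
    return np.linalg.matrix_power(P, n)

# P(0) is the identity matrix, and P(n + m) = P(n) P(m).
print(n_step_matrix(P, 0))                                       # identity
print(np.allclose(n_step_matrix(P, 5),
                  n_step_matrix(P, 2) @ n_step_matrix(P, 3)))    # True
```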
So you are basically going from state i0 to state j1 in n1 steps. What is this probability in our notation? It is p_{i0 j1} with a superscript, and the superscript is the number of steps, n1 - 0 = n1, so I will write p_{i0 j1}^{(n1)}. Now look at the next term, P(X_{n2} = j2 | X_{n1} = j1, X0 = i0). Does this probability depend on X0 = i0? No: once I have conditioned on X_{n1} = j1, by the Markov property it does not depend on anything before that. So I can drop the conditioning on X0 = i0 and write this term in our notation as p_{j1 j2}^{(n2 - n1)}: going from j1 (not i1) to j2 in n2 - n1 steps. I can keep repeating these steps, and the last term is going to be p_{j_{m-1} j_m}^{(n_m - n_{m-1})}.

So the joint distribution of these m random variables can now be expressed in terms of these factors. Each term, such as p_{i0 j1}^{(n1)}, is an entry of the n1-step transition probability matrix, and similarly the next one is an entry of the (n2 - n1)-step transition probability matrix. So all of these quantities depend on transition probability matrices of different step sizes. But what we demonstrated earlier is that, whatever the number of steps, the corresponding matrix can be obtained from one single transition probability matrix by raising it to the appropriate power. So the claim is that all these quantities can be computed from the transition probability matrix P. For instance, p_{i0 j1}^{(n1)} comes from P^{n1}; since that is a matrix, I have to tell you both the row and the column index, and the entry I am interested in is (i0, j1). So all these factors can be obtained from the transition probability matrix P.

But to complete the finite dimensional distribution I need one more term: what is the probability that X0 = i0? This is usually called the initial distribution. X0 is your initial step, and you want to know the probability that the initial state is i0, for all possible states i0 in S. This is your initial distribution, or initial probability distribution, and it is often denoted by pi. So if you know this initial distribution and your transition probability matrix P, you will be able to characterize all finite dimensional distributions of the Markov chain.

Now contrast this with an iid process. To completely define an iid process, I needed to tell you just one distribution: the one that is common across all the random variables. But to completely describe a time-homogeneous Markov chain I need to tell you two things: its initial distribution and its transition probability matrix. The initial distribution is important because where the Markov chain goes after some rounds depends on where it started; if it starts at a different point, the future you are going to see may be different.
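Here is a small sketch, with a made-up pi and P and assuming numpy, of how one such finite dimensional probability is assembled from exactly the factors above: the initial distribution, followed by entries of the appropriate matrix powers of P.

```python
# A minimal sketch, assuming numpy; pi, P, and the chosen times/states are illustrative.
import numpy as np

pi = np.array([0.5, 0.5])                 # hypothetical initial distribution
P  = np.array([[0.7, 0.3],
               [0.4, 0.6]])               # hypothetical transition matrix

def fdd(pi, P, times, states):
    """P(X_{n1}=j1, ..., X_{nm}=jm) = sum_{i0} pi(i0) * [P^{n1}]_{i0,j1}
       * [P^{n2-n1}]_{j1,j2} * ... * [P^{nm-n_{m-1}}]_{j_{m-1},jm}."""
    prob = 0.0
    for i0, pi_i0 in enumerate(pi):
        term = pi_i0
        prev_time, prev_state = 0, i0
        for n, j in zip(times, states):
            term *= np.linalg.matrix_power(P, n - prev_time)[prev_state, j]
            prev_time, prev_state = n, j
        prob += term
    return prob

# e.g. P(X_2 = 0, X_5 = 1) for the chain above
print(fdd(pi, P, times=[2, 5], states=[0, 1]))
```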
So that is why, to completely describe a Markov chain, I need to tell you the initial distribution as well as the transition probability matrix. The initial distribution tells you where you are starting, and the transition probability matrix describes how you are going to evolve from that point.

Markov chains, the way we have described them, actually capture a lot of things that we face in reality, and they often give a nice characterization of those phenomena. For example, say you want to model or analyze an aircraft: where it is going to go and how it is going to get there. Suppose its trajectory is mostly determined by its velocity, its acceleration, and perhaps how much fuel it is carrying. You give it some initial trajectory and then look at its position at different times, say every second or every minute. At every time instant I measure its velocity, its acceleration, and the amount of fuel it is carrying at that time. If you know these, you can analyze how far it is going to go; I do not need to know all the values of the acceleration, the velocity, or the fuel at the previous time instants. If I measured them at, say, t = 100 units, that is enough: everything that happened before is, in a sense, captured there. The fuel has been consumed up to this point, but what remains is what matters for knowing the future. So the current time is already capturing a summary of what has happened in the past. If you tell me the entire past, I can tell you the current state. But if I tell you only the current status, that is a kind of summary of what has happened; it may not tell you exactly how things happened in the past, but it contains enough of a summary to describe the future. For example, if at time t you just know the fuel remaining and the current acceleration and so on, that is enough for you to see how far the aircraft goes in the future. So Markov chains are doing exactly this: they say that if your scenario is such that the current state captures enough information, summarizing what has happened in the past, then you can ignore the past and analyze your future based on this current information alone.

You ask about the initial X0: of course X_{n1} depends on what happened at X_{n1 - 1}, the previous slot. But here I am just trying to characterize the distribution jointly at different time instants; I am not singling out the origin, I am interested in the joint behavior at these points. Indeed, what is going to happen here does depend on where you start from; that is why the initial behavior affects what we are going to see at n1. We are just trying to capture that summary, not what happens in between. And when you look at the finite dimensional distributions, you need to characterize that aspect as well.
For example, here n1 could be just 1; I am considering all possibilities for the indices, I am not ignoring that case. Similarly, n2 could be just n1 + 1; it is not only one step ahead that matters, and I am also looking jointly at all the distributions. What you are asking about is how the chain depends on the previous step, but here we are asked to find all these joint distributions, not just that. What we are trying to show is that when you have the Markov property, all these joint distributions are captured by the transition probability matrix together with the initial distribution. The transition probability matrix already captures what you are asking: from this state, what is the probability that I go to the next state. But that is not the only thing I am interested in; to completely understand the process I need to know all these joint distributions. Yes, X_{n1} only depends on what happened in the previous step, and it does depend on where you start from. But if you only give me the state at n1 - 1 and ask what happens next, that does not completely describe my process; I may also be interested in what happened before that. That is what the finite dimensional distributions mean: I need to give a complete description of the process.

We are just trying to model this: somebody has given you the description, and then you want to analyze the system. How to obtain this transition probability matrix is a different question; for that one really has to do a big experimental setup and make real measurements. In this course we are putting all of that under the carpet: how to compute the initial probabilities and how to obtain the transition probability matrix is a separate matter; you would have to do simulations or experiments and then estimate those parameters.
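Once those parameters are handed to you, the analysis can even be done by simulation. Here is a minimal sketch, with a made-up pi and P and assuming numpy, of generating one sample path: the initial distribution chooses where you start, and the transition matrix drives every subsequent step.

```python
# A minimal sketch of simulating a time-homogeneous Markov chain from pi and P
# (made-up values; numpy assumed).
import numpy as np

rng = np.random.default_rng(0)
pi = np.array([0.5, 0.5])
P  = np.array([[0.7, 0.3],
               [0.4, 0.6]])

def simulate(pi, P, n_steps):
    """Draw X0 from pi, then draw each X_{k+1} from row X_k of P."""
    states = [rng.choice(len(pi), p=pi)]
    for _ in range(n_steps):
        states.append(rng.choice(P.shape[1], p=P[states[-1]]))
    return states

print(simulate(pi, P, 10))   # one sample path of length 11
```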
So these are all parameters for us. P, the transition probability matrix, is a parameter, or rather a matrix with many parameters. I do not know those parameters; what I am saying is that if you pass these parameters to me, I will analyze the system and tell you how it is going to behave. For example, if you want to evaluate two or three aircraft, each aircraft comes with its own parameters; as a system analyst you take all these inputs and analyze the system. Who gives you the parameters? The vendor of the system: he will say these are my system's characteristics, its fuel efficiency is this, it can sustain this much turbulence, it can go at this speed, it can take this much acceleration, and so on. Using those, you come up with the set of parameters and then use them to understand the future behavior.

Okay, let us look at another example. All of us know queues; we spend a lot of time waiting in queues. Suppose we have a queue and there is a server. Let us take a simple, idealized case where time is slotted: in every slot, with some probability, a new person arrives and joins the queue. It may happen that in a particular slot nobody joins, but with some positive probability somebody may join. Meanwhile there is a server that keeps serving: say you are going to watch a movie and you are in the line to get a ticket, so somebody is preparing a ticket for you, and as soon as you get the ticket you move out. The line may grow longer or shorter depending on whether people are joining or leaving the system.

Now let Xn be the number of people in the queue at time slot n. How you define a time slot is up to you: you may want to count the number of people in the system every second, or every minute; it depends on the scenario. If people are arriving very frequently you may want a smaller time scale; if they are arriving slowly you may want a larger one. For the time being, say a slot is one second, so in every second a person joins the system with some probability, and you want to see how many people are in the system.

Assume for the moment that you know the service rate, the rate at which this person prepares tickets; from that you can understand at what rate people are leaving the system. The person who gets the ticket rushes into the theater and the queue may go down. But the rate at which the server serves people could itself be stochastic: some customers are annoying, ask the ticket clerk a lot of questions, and keep him engaged, so they take more time to be served; others just grab the ticket and go, so they take less time. So there is stochasticity in the service, and, as we said, there is stochasticity in the arrivals as well. Now, suppose you know the description of this stochasticity. If at any time you know how many people are in the system, can you at least say, probabilistically, what the next state of the queue will be?
For the time being, let us say a new person enters with probability p in every slot, and in every slot the person being served finishes and leaves the system with probability q. So if your system currently has n people waiting in the queue in this slot, can you say what the possible values of the next state X_{n+1} are? The number of people waiting in the queue will increase by 1 if a new person joins and nobody leaves; it will decrease by 1 if nobody joins but one person leaves; and it will remain the same if either nobody joins and nobody leaves, or one person joins and one person leaves. So to describe the new number of people waiting in the queue, does it matter how the queue evolved from the beginning up to slot n, or is the previous slot alone enough? If I know my current status, I can describe what is going to happen in the future. In a way, the number of people waiting in the queue is already a summary of the system, a summary of what has happened so far, and that summary is itself enough to describe the future. I do not really need to know X1, X2, all the way up to X_{n-1}; all I need is Xn. So you see that Markov chains capture exactly this kind of system, where the current state summarizes the past well enough to describe the future. This arises in many cases: the queue model and the aircraft example are toy examples, but many realistic systems follow this kind of scenario.

Yes, in that case, if I think of Xn as a process, it is going to follow this. But right now, rather than checking whether it satisfies the Markov property formally, what I want you to understand from this example is the motivation: is it enough for me to know the current state of the system to describe the future, or do I need to know everything that has happened so far? The current state captures a summary, and that summary is enough to describe the future; that is where we want to start using Markov chains.

It can change, but I am not saying we can do without knowing p and q: you give me p and q, and then as an analyst I analyze. So there are two things: there are parameters, and given these parameters, the analyst sits, analyzes, and tells you what happens. Who is going to give these parameters is a different question. For example, if you are a planner, somebody will say I have 10 people of manpower for this, somebody says I have this much budget, somebody says I have only this much time. This is given to you, and now you come up with the plan and the analysis. It is exactly like that here: the description is given to you, and you analyze. If somebody says the time I have is less, or the money I have is less, or I have less manpower, then you re-analyze with the new parameters and report the performance. Right now I am just acting like the big boss: you tell me what all is happening and what resources you have, and I will plan or analyze what is going to happen. Okay, fine.
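As a rough sketch of this slotted queue (the arrival and departure probabilities p and q below are made-up values, numpy is assumed, and I am additionally assuming nobody can depart from an empty queue), note that the draw for the next state uses only the current number of people in the queue, not the earlier history:

```python
# A minimal sketch of the slotted queue with arrival probability p and
# departure probability q per slot (hypothetical values; numpy assumed).
import numpy as np

rng = np.random.default_rng(0)
p, q = 0.3, 0.5   # hypothetical arrival / departure probabilities

def next_state(x):
    """Given x people currently in the queue, draw the number in the next slot."""
    arrival   = rng.random() < p
    departure = (x > 0) and (rng.random() < q)   # assumption: no departure from an empty queue
    return x + int(arrival) - int(departure)

# The distribution of the next state depends only on the current state:
x, path = 0, [0]
for _ in range(20):
    x = next_state(x)
    path.append(x)
print(path)
```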
Now, the way we have defined the Markov property, it says: if I tell you the state of the system at some index, then from that point onwards, to explain my system, I do not need to know anything about the past. But most of the time it so happens that the time index we are dealing with may not be deterministic; it could itself be a random quantity. For example, I want to see how much further my aircraft goes once its fuel tank becomes half full. The point at which the fuel tank becomes half full could itself be random, because the fuel consumption depends on the environment the aircraft faces, which you may not control; that itself is stochastic. Your fuel tank may reach half in 2 hours, or 1 hour 50 minutes, or 1 hour 55 minutes; it could be a random time. And from that point onwards you want to understand how things behave in the future. So the index n here could itself be a random quantity.

Another example: say you drop into a cinema hall at some random time. It is not necessary that you arrive when the counter opens; you may join the queue 10 minutes or an hour after that. So the time when you join the queue could itself be random, and you want to see what happens once you join: for example, how much time you need to wait before you get your ticket. To analyze that, you have to account for the fact that your entry time is not deterministic but random.

Yes, earlier this was an arbitrary but deterministic time: an index n. Now I make it random by saying that I look at X_T, where T is a random variable; T decides which index I will be looking at. To repeat the example: imagine joining this queue; the amount of time you need to wait before you get a ticket is random, because it depends on how many people were ahead of you and how fast each of them gets served. If you know exactly at what time you entered, fine; but you may be entering the system at a random point, and from that random point you want to understand how much time you may need to get served. So, a priori, if you know exactly at what time you entered, it is fine; but if the time of joining is itself random, how does that affect your analysis of the future?

So now the question is: does the Markov property we have discussed so far carry over to the case of such random times? Suppose T is a random variable which is integer valued, and suppose I tell you the observed states of the system up to this random time T, with the state at time T being i.
Can I then write the probability that the chain is in state j1 in the next round simply as the one-step transition probability p_{i j1}? If T were some fixed n, this is exactly our earlier description: the conditional probability is independent of everything else except Xn = i, and that is the definition of p_{i j1}. But now I am asking what happens when this time is not deterministic but random. Notice that when I make it random, T could take any of the possible values that this random variable can take.

So let us first make the notation a little clearer: what do I mean by X_T, and how do we interpret this term? I know what Xn(omega) is: the value taken by the random variable Xn at the sample point omega. Now what does X_T mean? T itself is a random variable, so we denote this as X_T(omega): at the sample point omega, first find the index T(omega), and then take the value of the process at that index at the same omega. So when I write X_T, the way you have to read it is: T is a map from Omega to {0, 1, 2, ...}, and each Xn is a map from Omega to R, for all n, all defined on the same sample space. When I write X_T, I am not specifying a fixed index n; I am specifying a random index. So X_T(omega) means: first look at the index at the sample point omega, then look at the value of the process at that index, at that same omega.

Let us take another example and see if this fits well. Say I am interested in 100 stocks in the stock market (or whatever number you like), and what I am interested in is how the values of the stocks change every day. Call these stocks 1, 2, ..., 100; I can think of my sample space Omega as this set of stocks. On each day I can focus on one stock and see how its value changes: if you fix a stock, say the one corresponding to omega, then the values it takes on the different days form one sample path. Now, what I want is the value of a particular stock on the day its value goes below a certain number. What is T going to be? T(omega) describes that day: the day on which the value of stock omega went below that number. That day could be random; you do not know it in advance. You go to that day and look at the value taken by that stock on that day. So in this case T(omega) gives me the day on which the value of the stock omega went below the number, and X_T tells me exactly what the value of that stock was on that day. So you realize why I may be interested in knowing my value process at a random time: you may want to impose conditions like this. For example, in the stock case, you may ask: what is the value of my stock when it exceeds 500 rupees? You look for the time when it exceeded 500. Note that I am only saying it exceeded 500; its value itself need not be 500, it may be 600. It may have exceeded 500 rupees on the 100th day.
The price at that moment is given by X_T(omega), and the day on which it exceeded 500 is given by T(omega); here omega corresponds to the particular stock I am interested in.

No, it is not that X alone does this; I can think of X as indexed by both the time index and the sample point. It is the same thing we have written as Xn(omega). You understand the meaning of these random variables: for every n I define Xn, which assigns real values to sample points omega. Instead of fixing n, I can make it a joint map, where n belongs to N, the set of natural numbers. To define the process as a collection of random variables, I ask: give me an index, then give me a sample point, and I will tell you the corresponding value. Instead of defining this separately for each index, I can just say: give me the sample point and the index together, and I will tell you the value it takes. In this view X is a process; we discussed this when we talked about stochastic processes. A stochastic process can be thought of as a collection of random variables, where every index corresponds to a random variable, or alternatively as a map X which assigns a value in R to each pair (omega, n). In that case X_T(omega) can be thought of as X(T(omega), omega): T tells you which index to look at, and then you read off the value. For example, in the stock case, on a particular stock omega you have defined T as the time when the value exceeds 500, so T(omega) tells you that time, that index. Yes, there is an ambiguity there; for the time being, for our example, let us refine it as the first time the value exceeds 500. With such refinements T is unique, and then we can define it like this.
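As a small sketch of this idea (all values below are made up and numpy is assumed), here is one toy sample path of a value process, the random time T defined as the first index at which it exceeds a threshold, and the value X_T read off at that random index:

```python
# A minimal sketch of evaluating a process at a random (first-passage) time.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical random-walk "value process" standing in for a stock price path.
path = 500 + np.cumsum(rng.choice([-5, 5], size=200))

threshold = 520
hits = np.where(path > threshold)[0]
if hits.size > 0:
    T = hits[0]                           # first time the value exceeds the threshold
    print("T =", T, "X_T =", path[T])     # X_T(omega) = X_{T(omega)}(omega)
else:
    print("threshold never exceeded on this sample path")
```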