We started discussing convergence of sequences of random variables in the last class, and we defined several notions of convergence: convergence in the almost sure sense, convergence in probability, convergence in the mean squared sense, and, at the end, convergence in distribution.

Let us take an example. Recall the unit interval probability space we defined earlier, and suppose we have three random variables X1, X2, X3 defined on it. Each of them takes only the three values a1, a2, a3, and each takes these values on subintervals of the same widths p1, p2, p3 (my scale in the drawing may not be exact, but assume the widths match). The only difference between the three random variables is that the subintervals on which the values are taken are rearranged from one to the next: X1 takes the value a2 on one subinterval of width p2, while X2 takes the same value a2 on a different subinterval of the same width p2.

Now extend this periodically: let X4 = X1, X5 = X2, X6 = X3, and in general X_{n+3} = X_n for all n, so the random variables are periodic repetitions of the first three. If you have a sequence of random variables like this, do you expect it to converge in the almost sure sense or in probability? No, because the sequence is periodic and keeps fluctuating. But look at the distributions. What is the distribution of X1? X1 is a discrete random variable: it takes only the three values a1, a2, a3, with probabilities p1, p2, p3. And X2 has exactly the same distribution: it also takes the values a1, a2, a3 with the same probabilities p1, p2, p3. So as distributions these random variables are identical; they merely put their mass on different intervals.

This is exactly where we want to understand convergence in distribution: we have to go beyond almost sure convergence, convergence in probability, and mean squared convergence. Now, what is the limiting distribution here? If you look at this sequence of distributions, the limiting distribution is the one with P(X = a1) = p1, P(X = a2) = p2, and P(X = a3) = p3; that is, the limit is any random variable taking the values a1, a2, a3 with probabilities p1, p2, p3.
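To make this concrete, here is a minimal Python sketch of the construction on the unit interval probability space. The specific values a = (1, 2, 3), the widths p = (0.2, 0.3, 0.5), and the particular permutation scheme are my illustrative choices, not fixed by the lecture; the point is only that sample paths keep cycling while every X_n has the same pmf.

```python
import numpy as np

# A minimal sketch of the lecture's construction on the unit interval
# probability space ([0,1] with the uniform measure). The values a_i,
# the widths p_i, and the permutation scheme are illustrative choices.
a = np.array([1.0, 2.0, 3.0])   # values a1, a2, a3
p = np.array([0.2, 0.3, 0.5])   # interval widths p1, p2, p3 (sum to 1)

# X_n rearranges which subinterval carries which value; each width
# travels with its value, so every X_n has the same pmf, and the
# construction is periodic: X_{n+3} = X_n.
perms = [np.array(s) for s in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]]

def X(n, omega):
    order = perms[n % 3]
    edges = np.cumsum(p[order])                       # right endpoints
    idx = np.minimum(np.searchsorted(edges, omega), 2)
    return a[order][idx]

rng = np.random.default_rng(0)
omega = rng.uniform(size=50_000)

# For a typical fixed omega the sample path keeps cycling (no a.s. limit):
print([float(X(n, omega[0])) for n in range(6)])
# ...but the distribution of every X_n is identical: a_i w.p. p_i.
for n in range(3):
    vals = X(n, omega)
    print(n, [round(float(np.mean(vals == v)), 2) for v in a])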
That is the limiting distribution here. Now let us look at another sequence of random variables. I am going to define X_n = (-1)^n U / n, where U is a uniform random variable on [0, 1]; so it is a scaled uniform random variable, with scale (-1)^n / n.

Let us try to understand what the distribution of X_n looks like. Take n odd. Is this random variable positive or negative? It is negative. How does its CDF look? U takes values between 0 and 1, so the smallest value of X_n is -1/n, attained when U = 1 (because n is odd), and its largest value is 0. So the CDF starts at -1/n, rises linearly, and hits 1 at x = 0. Now take n even. In this case X_n is a positive random variable: its smallest value is 0 and its largest value is 1/n, and the CDF rises linearly from 0 to 1/n and then saturates at 1.

So the CDF has one of these two shapes depending on whether n is odd or even, and as n tends to infinity both shapes squeeze toward a step function. Where is the jump happening? At the origin, x = 0. So we expect the limiting distribution to be the one that takes the value 0 with probability 1; that is my limiting distribution.

Now let us try to understand how the CDFs converge at different points. The CDFs F_{X_n}(x) are functions of x, so let us see, for different values of x, how they converge. Suppose I take a value of x on the negative side of the real line, x < 0, and let n go to infinity. What will F_{X_n}(x) converge to? It converges to 0: the odd-n CDFs keep shifting to the right, so at some point, whatever x you picked, the starting point -1/n goes beyond it and the CDF is 0 there, while the even-n CDFs are 0 for every x < 0 anyway. Now if you look at F_{X_n}(x) for x > 0, the sequence converges to 1. Both of these match the limiting CDF: for x < 0 it is 0, and for x > 0 it is 1.

Now let us look at the case x = 0. What is the value of the odd-n CDF at x = 0? It is 1, by the right-continuity property. And what is the value of the even-n CDF at x = 0? It is 0. So I have a sequence alternating between 1 and 0. Will such a sequence converge? No. So the CDFs converge for x < 0 and for x > 0, but at x = 0 they do not converge.
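Here is a quick empirical check of this CDF story; the sample size and the test points x are arbitrary choices for illustration.

```python
import numpy as np

# Empirical CDFs of X_n = (-1)^n * U / n with U ~ Uniform(0, 1).
rng = np.random.default_rng(1)
U = rng.uniform(size=200_000)

def F(n, x):
    """Empirical CDF of X_n = (-1)^n U / n evaluated at x."""
    Xn = ((-1) ** n) * U / n
    return float(np.mean(Xn <= x))

for x in (-0.25, 0.25, 0.0):
    print(f"x = {x:+.2f}:", [round(F(n, x), 3) for n in range(1, 9)])
# Pattern: for x < 0 the values tend to 0, for x > 0 they tend to 1
# (matching the limit CDF), while at x = 0 they alternate 1, 0, 1, 0, ...
# and never settle, exactly at the discontinuity point of the limit.
```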
As you see, this limiting distribution has a jump, a discontinuity, at the point x = 0, but at all other points it is continuous. And you see that precisely at the point of discontinuity, the convergence does not happen. This is why, in the definition of convergence in distribution, we said that a sequence of random variables converges in distribution if the CDFs converge to the limit CDF at all points of continuity of the limit. Why do we ignore that point? Because, as in this example, the CDFs converge at every point except the point of discontinuity, and we still want to interpret the sequence as having converged to this limit; we just have to ignore the point of discontinuity. To account for this, the definition of convergence in distribution we stated last time explicitly requires convergence of the CDFs only at the points of continuity of the limiting distribution.

We have already discussed that distributions are associated uniquely with their characteristic functions: the characteristic function uniquely determines the distribution, and vice versa. Based on this, we can state the following result, which I am just going to state without proof. If we have a sequence of random variables X_n and another random variable X, then the following three statements are equivalent. First, X_n converges to X in distribution (recall this notation; this is what we mean by converges in distribution). Second, for every function f that is continuous and bounded, the sequence of expectations E[f(X_n)] converges to E[f(X)]; note that we take the expectation not of the random variable directly but of a continuous bounded function of it. Third, the characteristic functions converge pointwise: the characteristic function of X_n evaluated at a point u converges to the characteristic function of X evaluated at the same point u, and this should happen for all u. If any one of these holds, the others hold as well, and in particular the sequence converges in distribution. I am going to skip the proof, but we will take this result for granted; please look into the proof in the book.

Fine. So we have now characterized all four convergence notions we studied. The question now is: what is the relation between them? I am going to state it as a proposition. Let X_n be a sequence of random variables. (a) If X_n converges to X in the almost sure sense, then X_n converges in probability to the same random variable X; this we have already said. (b) If X_n converges to X in the mean squared sense, then X_n again converges in probability to the same random variable X.
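In symbols, the three equivalent statements in the result above are:

```latex
X_n \xrightarrow{d} X
\;\iff\;
\mathbb{E}[f(X_n)] \to \mathbb{E}[f(X)] \ \text{for every bounded continuous } f
\;\iff\;
\varphi_{X_n}(u) \to \varphi_X(u) \ \text{for every } u \in \mathbb{R},
\qquad \text{where } \varphi_X(u) = \mathbb{E}\!\left[e^{iuX}\right].
```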
(c) If X_n converges to X in probability, then X_n converges to X in distribution. (d) Suppose there exists a random variable Y such that |X_n| ≤ Y with probability 1 for all n, and Y has a finite second moment. Then X_n converging to X in probability implies that X_n converges to X in the mean squared sense.

So what does this result say? Part (a) says that if a sequence of random variables converges in the almost sure sense, then it also converges in probability. Part (b) says that if a sequence converges in the mean squared sense, then it also converges in probability. Part (c) says that if a sequence converges in probability, then it further converges in distribution.

Now the question is: we have shown implications in one direction; what about the other directions? Is it true that convergence in probability implies almost sure convergence, or that it implies mean squared convergence? Part (d) answers the second question partly: convergence in probability implies convergence in the mean squared sense provided something extra holds; it is not always true. The condition is that there exists a random variable Y such that the whole sequence is dominated, |X_n| ≤ Y with probability 1 for all n, and this Y further has a finite second moment. If I can find such a Y, then it is true that convergence in probability implies convergence in the mean squared sense. Without such a condition the implication can fail: we have already discussed an example where the sequence converges in probability but not in the mean squared sense. The implication from convergence in probability to almost sure convergence is also not true in general, and it is not so easy to come up with conditions under which it holds.

What is the meaning of "with probability 1" in the condition? It means that the set of all ω where |X_n(ω)| is upper bounded by Y(ω) should have probability 1. The condition may be violated on some ω's, but if that set has mass 0, for example a single point when we have continuous-valued random variables, I do not care about such points.

Notice also how everything routes through convergence in probability. If I have almost sure convergence, I first check that it implies convergence in probability, and once I know that, this condition comes to my rescue: it is not that I have a direct route from almost sure to mean squared convergence; I have to go through convergence in probability, and if after that the condition holds, then I can say something about the mean squared sense.

For the directions not covered here, we do not in general have clean conditions; whatever we know is what we have stated, namely that under these conditions the implications hold. If for some reason you want to use the claim that convergence in distribution implies convergence in probability, you need to provide a proof for that. It may happen that for the specific example you have in hand, convergence in distribution does also imply convergence in probability, but you need to establish it. On the other hand, if you have convergence in the mean squared sense, you can just say, using this proposition, that it already implies convergence in probability.
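To see why part (d) needs the dominating Y, here is a sketch of one standard counterexample; this is my choice and not necessarily the example from the earlier class. On the unit interval probability space take X_n = sqrt(n) on [0, 1/n] and 0 elsewhere: the sequence converges to 0 in probability but not in the mean squared sense.

```python
import numpy as np

# Standard counterexample (illustrative choice): on ([0,1], uniform) let
#   X_n(omega) = sqrt(n) if omega <= 1/n, else 0.
# Then P(|X_n| > eps) = 1/n -> 0      (convergence in probability to 0)
# but  E[X_n^2]       = n * (1/n) = 1 (does NOT go to 0: no m.s. limit).
rng = np.random.default_rng(2)
omega = rng.uniform(size=500_000)

for n in (10, 100, 1000, 10_000):
    Xn = np.sqrt(n) * (omega <= 1.0 / n)
    print(n,
          round(float(np.mean(np.abs(Xn) > 0.1)), 5),  # -> 0
          round(float(np.mean(Xn ** 2)), 3))           # stays near 1
# No Y with E[Y^2] < infinity can dominate all the X_n here (Y would
# need Y(omega) of order sqrt(1/omega)), so part (d) does not apply.
```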
The implications drawn with solid lines in the diagram are the ones we know. When you use the dashed line, you first have to establish the hypothesis that such a Y exists. And for the directions where there is no line at all, you need to provide a proof before you use them.

There is one more important property that we will just write down. Suppose X_n converges to X in the mean squared sense, or almost surely, or in probability, and X_n also converges to Y in the same sense. Then it must be the case that P(X = Y) = 1. What this says is that limits are unique up to a set of probability 1. Remember, in the previous class we had a simple example where we defined X_n(ω) = ω^n; you recall that example? For that example we said that one possible limit is X = 0 for all ω, and we came up with another possibility where X(ω) = 0 for all ω strictly less than 1 and X(1) = 1. Both were possible limits according to the definition, but they are equivalent in the sense that they take the same values on a set of probability 1. (A question came up: for the example ω^n, are we not defining a limit for every ω? Recall that for that example we had already restricted ourselves to the unit interval probability space. The general statements here do not assume that; those were specific examples. Here we only assume the X_n are all defined on some common probability space.)

The last point concerns distribution. Suppose I establish that X_n converges to some random variable X in distribution, and that X_n also converges to another random variable Y in distribution. Then it must be the case that X and Y have the same distribution.

Let us now try to prove at least parts (a) and (b), which will also help us revisit some of the concepts we defined before, for example continuity of probability. So let us prove (a). We assume that X_n converges to X almost surely, and we need to show that this implies X_n converges to X in probability. Let us define the set A_n = {ω : |X_n(ω) − X(ω)| < ε}. Now, if I want to show that X_n converges in probability, what do I need to show? I need to show that P(A_n) tends to 1 as n goes to infinity; that is the definition of convergence in probability. Let us try to see if we can do that.

For that we need to construct some specific sets. Define B_n = {ω : |X_k(ω) − X(ω)| < ε for all k ≥ n}. See what I have done: I was interested in the set A_n, which for a given n includes all the ω satisfying this condition at that n. Now I have slightly refined it and defined a new set where I want the condition to hold not only at n, but for all k ≥ n: at n, at n + 1, and all the way up to infinity. Because of this, which set is contained in which? B_n is the more stringent one, where I am asking for more, so B_n is contained in A_n.
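In symbols, the two sets just defined, and the containment between them, are:

```latex
A_n = \{\omega : |X_n(\omega) - X(\omega)| < \epsilon\}, \qquad
B_n = \bigcap_{k \ge n} A_k
    = \{\omega : |X_k(\omega) - X(\omega)| < \epsilon \ \text{for all } k \ge n\},
\qquad B_n \subseteq A_n .
```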
Why is that the right containment? For B_n I want the condition to hold not only at n but for everybody beyond n, so it may happen that for some ω the condition holds at n but fails at some later index, and such ω drop out. Take a point ω belonging to A_n: the condition is satisfied at n, but is it necessarily satisfied at n + 1 as well? It may not be, so that point may drop out; a point that belongs to A_n need not belong to B_n. That is why B_n ⊆ A_n, and not the other way around.

Now let us look at the sequence of B_n's: is it monotonically increasing or decreasing? Say I increase n from 10 to 15. Earlier I wanted the condition to hold everywhere beyond 10; now I only want it to hold everywhere beyond 15. So as n increases, I have fewer conditions to check, and the sets can only grow: B_1 ⊆ B_2 ⊆ ⋯, that is, B_n is an increasing sequence. If you are not yet convinced, take n = 1: you want the condition satisfied for all k ≥ 1, all the points. Now take n = 10: you want it satisfied only after 10. Compared to the first case, in the second case you have fewer conditions to check. (And could the sets be equal? Yes; that is our convention, a subset allows equality. If I meant a strict subset I would have written it differently.)

Whenever I have an increasing sequence of sets, I know what the limiting set is: B is the union of the B_n's. And what is my convention in that case? P(B) equals the limit, as n tends to infinity, of P(B_n). Because I have a monotone sequence of sets, I can apply the continuity of probability and write it like this.

Now let us take the set of all ω for which X_n(ω) converges to X(ω), and pick a particular ω in this set. By the definition of the limit, we know that there exists an N_ε such that |X_n(ω) − X(ω)| < ε for all n ≥ N_ε; I have just applied the definition of the limit. If for some ω this is true, then that ω already belongs to B_{N_ε}, and since B is the union of the B_n's, that ω belongs to B. So B contains the set {ω : lim_{n→∞} X_n(ω) = X(ω)}.
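Collecting what we have established so far in one display:

```latex
B_1 \subseteq B_2 \subseteq \cdots, \qquad
B = \bigcup_{n \ge 1} B_n, \qquad
P(B) = \lim_{n \to \infty} P(B_n), \qquad
\{\omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\} \subseteq B .
```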
So B contains this set. Now let us apply probabilities to B_n and A_n. Since B_n ⊆ A_n, I have P(B_n) ≤ P(A_n), and trivially P(A_n) ≤ 1. What I want to show is that the sequence P(A_n) tends to 1; if I can somehow argue that a lower bound on it also goes to 1 as n goes to infinity, then I am done.

Now let us invoke what is given to us. Because B contains the set {ω : lim_{n→∞} X_n(ω) = X(ω)}, it must be the case that P(B) is at least the probability of that set. But by my hypothesis that probability is 1; that is the definition of almost sure convergence. So P(B) is lower bounded by 1, which means P(B) = 1; it has to be 1. Now let n go to infinity in P(B_n) ≤ P(A_n) ≤ 1: by continuity of probability, the left side tends to P(B), which I have shown to equal 1. So 1 = P(B) ≤ lim_{n→∞} P(A_n) ≤ 1, and it must be the case that the sequence P(A_n) tends to 1. This is exactly convergence in probability, so almost sure convergence implies convergence in probability.

Let us also quickly discuss part (b); this one is slightly easier. Suppose we assume X_n converges to X in the mean squared sense. What does that mean? It means E[(X_n − X)²] goes to 0. Now, what do I want? I want to say something about P(|X_n − X| > ε), because that quantity is what defines convergence in probability. So how can I write this probability in terms of the expectation? Do you recall any relation we studied so far that bounds the probability of a random variable being large in terms of an expectation? The Markov inequality. If I apply the Markov inequality here, what is the upper bound? The probability that this quantity exceeds ε is upper bounded by the expected value of its square divided by ε²; here ε is some positive constant, however small. Now let n go to infinity: by my assumption that X_n converges in the mean squared sense, the numerator goes to 0, so the bound goes to 0, however small the denominator is. If the bound is 0 in the limit, then P(|X_n − X| > ε) goes to 0 as n goes to infinity, and that is exactly the definition of convergence in probability.

So let us leave it here. For parts (c) and (d), you can look into the book; you have to again go through the construction of such sets and do some epsilon-delta business to get the proofs. You can just skim through those, but let us make sure we understand the results themselves. Part (c) says that convergence in probability implies convergence in distribution, and we have already talked about this.
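For reference, the Markov inequality step in part (b), applied to the nonnegative random variable (X_n − X)², reads:

```latex
P(|X_n - X| > \epsilon)
  = P\big((X_n - X)^2 > \epsilon^2\big)
  \le \frac{\mathbb{E}\big[(X_n - X)^2\big]}{\epsilon^2}
  \;\xrightarrow{\,n \to \infty\,}\; 0 .
```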