Welcome back. So, we were discussing convergence notions. We have defined almost sure convergence, which is also known as convergence with probability 1, and we have defined convergence in probability. So, I will give two more definitions. First, convergence in the r-th mean. We say X_n converges to X in the r-th mean, where X_n is a sequence of random variables, if the limit as n tends to infinity of the expectation of |X_n - X|^r is equal to 0. And for r equal to 2, X_n is said to converge to X in the mean squared sense. So, this is another notion of convergence. If r equals 2, for example, you have the mean squared error between X_n and X: the expectation of |X_n - X|^2 goes to 0 as n tends to infinity. So, the mean squared error goes to 0. And generally for r, you take the r-th moment, assuming it exists, and so on. So, if this happens, then X_n is said to converge to X in the r-th mean, or the r-th moment. So, this is one notion of convergence. And then I will put down the last definition. We say X_n converges to X in distribution if the limit as n tends to infinity of F_{X_n}(x) is equal to F_X(x) for all x where F_X is continuous. So, this is convergence in distribution. This is alternatively also known as weak convergence. See, convergence almost surely is sometimes referred to as strong convergence, and convergence in distribution is sometimes referred to as weak convergence. So, this notion of convergence corresponds to the sequence of CDFs F_{X_n} converging to the CDF of the limiting random variable. You do not necessarily demand convergence for all x; you demand it wherever F_X is continuous. After all, at points of discontinuity, we always know that CDFs have to take the right-continuous value.
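For reference, the two definitions just stated can be written compactly in standard notation (this is just a transcription of the statements above, not anything new):

```latex
% Convergence in the r-th mean (r >= 1):
\[
X_n \xrightarrow{\,r\,} X
\quad\Longleftrightarrow\quad
\lim_{n\to\infty} \mathbb{E}\,|X_n - X|^r = 0 .
\]
% Convergence in distribution (weak convergence):
\[
X_n \xrightarrow{\,D\,} X
\quad\Longleftrightarrow\quad
\lim_{n\to\infty} F_{X_n}(x) = F_X(x)
\quad \text{at every } x \text{ where } F_X \text{ is continuous.}
\]
```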
So, at points of discontinuity, there is no ambiguity about the functional value. The limit is only demanded at points of continuity. So, if you think about it, this notion is not really a convergence of random variables; it is only a convergence of distribution functions. It is not as though X_n and X are getting close in any way. In convergence almost surely, for example, X_n and X were getting very close for almost all omega: on a set of probability 1, X_n and X were converging. In this case, X_n and X can be very different. For any given omega, it is not necessarily true at all that X_n(omega) and X(omega) have to be close. All that is demanded here is that the CDFs converge. The actual value of X_n(omega) and the actual value of X(omega) can be very different, even on a set of positive probability. Is that clear? Only the sequence of distribution functions converges. So, rather than saying a sequence of random variables converges in distribution, it would have been better terminology to say that the sequence of distribution functions of the random variables converges. But somehow this terminology, a bit of a misnomer, has stuck, so we still say that the random variables converge in distribution rather than saying the sequence of distributions converges. Is this clear? So, how many definitions did I put down? Let me say definition 0 was sure convergence, which is useless. Definition 1 was almost sure convergence. Definition 2 was convergence in probability. And these last two should be definition 3, convergence in the r-th mean, and definition 4, convergence in distribution. I have called sure convergence definition 0 because it is pretty useless. So, for all practical purposes there are 4 notions of convergence we will work with. Now, notation. So, did I introduce some notation? I probably did not. So, notation.
So, for definition 0, which is pointwise or sure convergence, we will say X_n converges to X pointwise, writing "p.w." on the arrow. For almost sure convergence we will write either "w.p. 1" on the arrow or "a.s." on the arrow. Now, normally you should never say X_n converges to X without saying in what sense you are talking. See, when you are talking about a sequence of real numbers, you can just say x_n converges to x, because there is only one notion of convergence. But when you are talking about a sequence of random variables, there are so many notions. So you should never say that X_n converges to X without saying in which notion it is. Actually, there are people who write papers without saying what sense their convergence is in; that is very bad. So, you should always mention the sense of convergence you are talking about for random variables. Then we had convergence in probability; for that we will write "i.p." on the arrow, for "in probability". For convergence in the r-th mean we will write r on the arrow. And for mean square convergence, r equal to 2, instead of writing 2 we will write "m.s.": X_n converges to X in the mean square sense. And finally, for convergence in distribution there are a couple of standard notations. One is to just put D on the arrow; the other is to use a double arrow. Some people write X_n double-arrow X, but that looks like "implies", so I do not really like that notation so much. We will use X_n arrow-D X: just put a capital D on the arrow. That is as far as notation is concerned. So now, of course, we will give several examples of sequences of random variables converging in various senses.
But before doing that, I think it is actually instructive to put down the main theorem that says how these notions of convergence are in fact related. And then we will look at some examples, because those examples may themselves show that one sense of convergence does not imply some other sense, and so on. So, let me put down the main theorem; this is exceedingly important. By the way, a very good source for all this material is Grimmett and Stirzaker, chapter 7: a very good source for convergence of random variables. Theorem: hierarchy of convergence. This is a very important theorem. "Hierarchy of convergence" means it gives you a sense of what implies what, and whether the other way around holds or not; in some sense it gives you a hierarchy of which notion is stronger and which is weaker. The following implications hold. Almost sure convergence implies convergence in probability. Convergence in the r-th mean, for r greater than or equal to 1, also implies convergence in probability. And convergence in probability implies convergence in distribution. And of course, it goes without saying that pointwise convergence implies almost sure convergence. Between r-th mean convergence and almost sure convergence (or for that matter pointwise convergence) there is no relationship whatsoever: you may have r-th mean convergence without almost sure convergence, and you may have almost sure convergence without r-th mean convergence. There is no implication in either direction between these two. And generally the reverse implications are false. So, let me just say: no other implications hold in general. The bit about pointwise convergence implying almost sure convergence is obvious. So, how many results do we have to prove? We have to prove 3 implications.
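The hierarchy just stated can be summarized in one line (arrows denote implication; this is only a restatement of the theorem):

```latex
\[
\text{pointwise}
\;\Longrightarrow\;
\text{almost surely}
\;\Longrightarrow\;
\text{in probability}
\;\Longrightarrow\;
\text{in distribution},
\qquad
r\text{-th mean } (r \ge 1)
\;\Longrightarrow\;
\text{in probability}.
\]
% No implication holds in either direction between almost sure convergence
% and r-th mean convergence, and the reverse arrows above fail in general.
```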
And in order to prove that no other implications hold in general, we have to give counterexamples. If I give one counterexample, that is enough to show that some particular reverse implication does not hold. So I will have to give 5 counterexamples in all: one each for the 3 reverse implications, and two to show there is no implication in either direction between almost sure and r-th mean convergence. Is that clear? So I have to prove 3 implications and give 5 counterexamples to establish this theorem. That is what I will spend today and the next lecture doing. In the process I will have given you several examples, because I will be giving counterexamples to prove that the reverse implications do not hold. Now, the first implication, almost sure implies in probability, is a fairly non-trivial result: it requires a somewhat serious proof. The second, r-th mean implies in probability, is relatively easy; it is probably the simplest. You see why we need r greater than or equal to 1: if r is greater than or equal to 1, you can just use Markov's inequality. So, suppose you want to prove that r-th mean convergence implies that X_n converges to X in probability, for r greater than or equal to 1. You have to prove that for every epsilon, the limit as n tends to infinity of the probability that |X_n - X| exceeds epsilon is 0. One thing you should note: when you are trying to prove a notion of convergence, first write down its definition. That is a very safe way to start; do not try to intuitively wing it or something. Just put down the definition and try to prove it; that is how we should go about it.
So, this is what I want to show is 0: the limit of the probability that |X_n - X| is greater than epsilon. Convergence in probability says this limit is 0. But by Markov's inequality this probability is bounded: since raising to the power r is monotonic for r greater than or equal to 1, I can put an r on both sides, and then the probability is less than or equal to the expectation of |X_n - X|^r divided by epsilon^r. And since I have convergence in the r-th mean, this bound goes to 0 for every epsilon greater than 0. So the probability has to go to 0. This is by Markov's inequality, so this bit is done; you have proved what you wanted to prove. Next, the proof that if X_n converges to X in probability then X_n converges to X in distribution. The proof is as follows. Fix epsilon greater than 0. You assume convergence in probability, and you have to prove convergence in distribution. Write F_n for F_{X_n}. Now F_{X_n}(x) equals the probability that X_n is less than or equal to x, which equals the probability that X_n <= x and X <= x + epsilon, plus the probability that X_n <= x and X > x + epsilon. I am just splitting the event into these two; I am trying to prove this implication. Now look at the first joint event: its probability is less than or equal to the probability that X <= x + epsilon alone, because it is an intersection of two events, so its probability is at most the probability of either one. So that term is at most F_X(x + epsilon).
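Written out, the Markov step just described is (raise both sides to the r-th power before applying Markov's inequality):

```latex
\[
\mathbb{P}\big(|X_n - X| > \epsilon\big)
= \mathbb{P}\big(|X_n - X|^r > \epsilon^r\big)
\;\le\; \frac{\mathbb{E}\,|X_n - X|^r}{\epsilon^r}
\;\xrightarrow[n\to\infty]{}\; 0 ,
\]
% the last step by the assumed r-th mean convergence; since this holds for
% every eps > 0, we get X_n -> X in probability.
```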
So, F_{X_n}(x) is less than or equal to F_X(x + epsilon) plus the probability of the second event. What is this second event? You are saying X_n is less than or equal to x, but X is greater than x + epsilon; that necessarily implies that the difference |X_n - X| is greater than epsilon. So that gives an upper bound: the second term is at most P(|X_n - X| > epsilon). You will agree that a very similar argument for the minus-epsilon side gives F_X(x - epsilon) <= F_{X_n}(x) + P(|X_n - X| > epsilon). So, if you combine these two, what happens? You get F_X(x - epsilon) - P(|X_n - X| > epsilon) <= F_{X_n}(x) <= F_X(x + epsilon) + P(|X_n - X| > epsilon). This is just algebra: I am bringing the probability term to the other side to get the first inequality, and the second inequality is the one I derived here. So now, as I send n to infinity in this double inequality, what happens? What am I trying to prove? I have assumed that convergence in probability holds and I am trying to prove convergence in distribution. Convergence in probability means that as I let n go to infinity, both probability terms drop out in the limit, for every epsilon. So what I will have is that the limit of F_{X_n}(x) is sandwiched between F_X(x - epsilon) and F_X(x + epsilon). So, if F_X is continuous at x, I can send epsilon to 0 and there will be a sandwich effect.
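The full sandwich, assembled from the two one-sided bounds above:

```latex
\[
F_X(x-\epsilon) - \mathbb{P}\big(|X_n - X| > \epsilon\big)
\;\le\; F_{X_n}(x) \;\le\;
F_X(x+\epsilon) + \mathbb{P}\big(|X_n - X| > \epsilon\big).
\]
% Letting n -> oo kills both probability terms (by convergence in
% probability); then eps -> 0 squeezes F_{X_n}(x) to F_X(x) at every
% continuity point x of F_X.
```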
So, as epsilon goes to 0 -- see, I am only concerned about F_X at points of continuity of F_X -- the left limit and the right limit will be equal, and therefore F_{X_n}(x) will converge to F_X(x). So we see F_{X_n}(x) converges to F_X(x) wherever F_X is continuous at x. So this is also done. Fine, any questions on this? It is actually just algebra; nothing very deep is going on. So this is done. Now, let me settle this side of the story once and for all. I have to prove that the reverse implication does not hold, by producing a counterexample. The motivation for the counterexample is as follows. As I said, convergence in distribution just means that the distribution functions are converging; it does not mean that X_n and X are getting close in any sense. Whereas convergence in probability means that the limit of the probability of the difference exceeding any epsilon is 0; that is really what it means. So, here is the counterexample to show that convergence in distribution does not imply convergence in probability. Let X_1, X_2, ... be such that X_i = X for all i, where X is Bernoulli(p). What am I doing? This is a very pathological example, but that is enough. I am taking an example where the entire sequence is just X, X, X, and so on, and that X is Bernoulli(p): it has value 1 with probability p and value 0 with probability 1 - p. Actually, let us do Bernoulli(1/2); I think that is a better idea. This is just one counterexample; if I find one, that is good enough. Then the trick is to define Y = 1 - X. So whenever X takes the value 1, Y takes the value 0, and vice versa.
So, now the claim is that clearly X_n converges to Y in distribution. Why? Because the X_n are all Bernoulli(1/2), and Y is also Bernoulli(1/2). In fact, there is no real convergence happening: the CDFs are all the same. All the F_{X_n} are equal to F_Y, because they are all Bernoulli(1/2). Are you with me? Good. Now, what is the problem? X_n does not converge to Y in probability, because |X_n - Y| is always equal to 1. Agreed? So this is a counterexample; a very pathological one, but good enough. Obviously X_n converges to Y in distribution, but |X_n - Y| is always equal to 1; there is no way it is going to become smaller than any epsilon you want. So the reverse implication is generally not true. But there is a partial converse that holds in a very special and very important case. Namely, if X_n converges to a constant, that is, the limiting random variable is a constant c, then convergence in distribution and convergence in probability are in fact equivalent notions. So, this is something you can take as your homework: if X_n converges to c in distribution, where c is a constant in R, then show that X_n converges to c in probability. Fine, is the statement understood? Prove it; it is not all that difficult. So, convergence in probability always implies convergence in distribution; the converse is true if the limit is a constant random variable. When the limit is a constant, the two notions are in fact equivalent, but generally speaking convergence in probability is strictly stronger than convergence in distribution.
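A minimal simulation sketch of this counterexample (the simulation is illustrative only; the argument itself is exact). We draw X ~ Bernoulli(1/2), set every X_n = X, and set Y = 1 - X:

```python
import random

# X ~ Bernoulli(1/2); X_n = X for every n; Y = 1 - X.
random.seed(0)

x_vals = [random.randint(0, 1) for _ in range(100_000)]  # draws of X (= every X_n)
y_vals = [1 - x for x in x_vals]                         # corresponding Y

# Same distribution: both X_n and Y are Bernoulli(1/2), so the empirical
# means (and hence the empirical CDFs) agree up to sampling noise.
mean_x = sum(x_vals) / len(x_vals)
mean_y = sum(y_vals) / len(y_vals)
print(abs(mean_x - mean_y) < 0.05)  # True

# But on every single outcome |X_n - Y| = |X - (1 - X)| = 1, so
# P(|X_n - Y| > eps) = 1 for any eps < 1: no convergence in probability.
print(all(abs(x - y) == 1 for x, y in zip(x_vals, y_vals)))  # True
```

The point the code makes is exactly the lecturer's: the distributions match perfectly, yet the random variables themselves are never within epsilon of each other.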
So, by the way, this is why convergence in distribution is called weak convergence: it is sitting at that end of the chain of implications; it is the weakest notion of convergence. So, this side of the story we have completed: we have proven the implication, and we have given a counterexample to show the converse is not true. And we have also said that when the limit is a constant, the converse does hold. All right, and this we have shown. Now, that convergence in probability does not imply convergence in the r-th mean, or the second mean, you can show by simply taking a sequence of random variables which converges in probability but whose second moment is infinite, or blows up; in that situation you will not have convergence in the mean square, for example. You can cook up some simple example to knock off this implication. So this is also essentially done; this whole bit we have completed. The real conceptual subtleties come in the remaining part of the story: the difference between almost sure convergence and convergence in probability. That is where more work is needed. Recall what almost sure convergence says: the probability of the set of omega for which X_n(omega) converges to X(omega) is equal to 1. Convergence in probability, on the other hand, says that the probability of |X_n - X| exceeding epsilon goes to 0 as n tends to infinity. So, we have to prove that almost sure convergence implies convergence in probability, and then give a counterexample showing that convergence in probability does not imply almost sure convergence; neither of these is a two-step argument. So, how much time do I have? About 10 minutes. In 10 minutes I can do the counterexample showing that convergence in probability does not imply almost sure convergence; that is probably the most intriguing one.
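One standard example of the kind alluded to here (my concrete choice for illustration, not the one given in class): take X_n equal to n with probability 1/n and 0 otherwise. Then:

```latex
\[
\mathbb{P}(X_n = n) = \tfrac{1}{n}, \qquad \mathbb{P}(X_n = 0) = 1 - \tfrac{1}{n}.
\]
% In probability:   P(|X_n - 0| > eps) = 1/n -> 0, so X_n -> 0 i.p.
% Not mean square:  E|X_n - 0|^2 = n^2 * (1/n) = n -> oo,
% so X_n does not converge to 0 in the mean square sense.
```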
This is an interesting, and by now standard, counterexample. The counterexample showing that convergence in probability does not imply convergence almost surely I will do today. Next class I will prove a major theorem about almost sure convergence; a corollary of that big theorem will be that convergence in probability follows from almost sure convergence. So there is one big important theorem coming up next class, but I do not have the time to do that now; I will just do this counterexample. If I show one example, that is enough. Let X_n be equal to 1 with probability 1/n, and 0 with probability 1 - 1/n; you would have seen something like this already, I guess. And the X_i are independent. So you have a sequence of independent random variables: X_2 takes the value 1 with probability 1/2, X_3 takes the value 1 with probability 1/3, and so on. As n becomes larger and larger, it becomes increasingly unlikely that the value it takes is 1; the larger the n, the more likely it takes the value 0. Now, what do you think these X_n limit to, in whatever sense? Zero, you would think, because with very high probability they are going to 0. What we will show is that X_n converges to 0 in probability, but X_n does not converge to 0 almost surely. So, clearly X_n converges to 0 in probability; this is very easy. Why? We want the limit as n tends to infinity of the probability that |X_n - X| > epsilon, and what is X, the limiting random variable? It is 0. So we look at the probability that |X_n - 0| > epsilon; you fix some epsilon and look for this probability. But X_n only takes the values 1 or 0, so if X_n is greater than epsilon then X_n must be equal to 1. So this is the same as the limit of the probability that X_n = 1, and what is the probability that X_n = 1? It is 1/n, and that limit is 0.
So, this limit is clearly 0, and X_n converges to 0 in probability. Now I am going to show that X_n does not go to 0 almost surely. The way you do that: we will show X_n does not go to 0 almost surely. Any ideas? You have independence and you have probabilities 1/n; what do you think you will do in this scenario? The Borel-Cantelli lemma, that is right. So, let A_n be the event that X_n = 1. Then the A_n are independent, and the sum over n from 1 to infinity of P(A_n) is infinite. Therefore, by the second Borel-Cantelli lemma, with probability 1 infinitely many A_n occur; that is, X_n = 1 infinitely often. So X_n does not go to 0 almost surely. So, what is the picture? This is going back to your coin tosses: you can think of this as coin tosses where the n-th coin comes up heads with probability 1/n. You know that no matter how far out you go, the Borel-Cantelli lemma says there will be some occasional head popping up. Which means if you look at the entire sequence X_n, no matter how far out you go, there will be some 1 popping up, which means X_n does not go to 0 almost surely. In fact, the event that X_n goes to 0 has probability 0; far from having probability 1, it has probability 0. So this example nicely illustrates the difference between convergence in probability and convergence almost surely. It shows that convergence in probability does not imply convergence almost surely, but next class we will prove that the reverse implication is true: almost sure convergence implies convergence in probability. Next class we will prove a theorem which actually brings out this difference very clearly. Let us stop here.
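A small sketch simulating this example: independent X_n with P(X_n = 1) = 1/n. Marginally P(X_n = 1) goes to 0, yet by the second Borel-Cantelli lemma the 1s never die out. The exact fact used below is that P(no 1 in (m, N]) = prod over n from m+1 to N of (1 - 1/n), which telescopes to m/N:

```python
import random

random.seed(1)

def last_one_index(N):
    """Largest n <= N with X_n = 1 (always >= 1, since P(X_1 = 1) = 1)."""
    last = 0
    for n in range(1, N + 1):
        if random.random() < 1.0 / n:  # X_n = 1 with probability 1/n
            last = n
    return last

# P(last 1 lands beyond N/10) = 1 - (N/10)/N = 0.9, so across many runs
# the final 1 falls in the last 90% of the sequence about 90% of the time:
# the 1s keep appearing arbitrarily late, exactly as Borel-Cantelli II says.
N = 10_000
trials = [last_one_index(N) for _ in range(200)]
frac_late = sum(t > N // 10 for t in trials) / len(trials)
print(0.8 < frac_late <= 1.0)  # True (the estimate concentrates near 0.9)
```

So even though each individual X_n is overwhelmingly likely to be 0 for large n, the sequence as a whole almost surely contains infinitely many 1s, which is precisely why it fails to converge to 0 almost surely.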