So, now I am going to look at another example, which will lead us to the second notion of convergence, called convergence in probability. So, what is the example? Let us again look at a sequence of random variables defined on the unit interval probability space. The first random variable X1 takes the value 1 throughout [0, 1]. Then I split the interval in half: X2 takes the value 1 on the first half [0, 1/2] and 0 elsewhere, and X3 takes the value 1 on the second half [1/2, 1]. Next, X4 takes the value 1 on the first quarter [0, 1/4], X5 on the second quarter [1/4, 1/2], X6 on the third quarter [1/2, 3/4], and X7 on the last quarter [3/4, 1]. I can keep on writing a sequence like this, right? You see that the number of pieces doubles at each level, like a geometric progression, and I can generalize this with one equation: write the index as n = 2^k + j, where k ≥ 0 and j ranges over 0, 1, ..., 2^k − 1, so that k is the integer part of log2(n). Then X_n takes the value 1 on the interval [j · 2^{-k}, (j + 1) · 2^{-k}], an interval of length 2^{-k}, and 0 elsewhere. This is the generalization, but let us just try to understand the case n = 1. For n = 1, k must be 0, and then what are the possible j's? Only j = 0, giving n = 2^0 + 0 = 1. So this corresponds to X1, and what is the length of its interval? What was k?
k was 0, right? So the length 2^{-k} is 1, and the interval is [0, 1], which is exactly X1. Continuing to the next level, when k = 1 and j = 0, that is n = 2, X_n takes the value 1 on [0, 1/2]; for the next index, j = 1 and n = 3, it takes the value 1 on the next interval [1/2, 1]. Like this, you can generalize the description of your X_n's and you have the whole sequence. Now, if I have a sequence of random variables like this, where does it converge? Take any n; it can be expressed in the form n = 2^k + j for some k and j, and for that n, X_n is 1 on an interval of length 2^{-k}. For example, take n = 3: what are the k and j corresponding to this? We have 3 = 2^1 + 1, so k = 1 and j = 1, and the interval is [1/2, 1], of length 1/2; together with the j = 0 interval, this covers the whole unit interval at that level. Now, for this kind of X_n, where does it converge? First of all, any guess what the limiting random variable should be? As you go down the levels, you expect these random variables to put their mass on smaller and smaller intervals, right? Each X_n takes the value 1 somewhere, but on an ever narrower interval. So what do you expect as n goes to infinity? The mass keeps shifting to narrower intervals: X4, for instance, puts its mass on the first quarter, and on the rest of the interval it puts 0 mass.
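To experiment with this construction, here is a minimal sketch in Python. The function name `typewriter` is my own label for this indicator sequence, not a name used in the lecture; the mapping n = 2^k + j is exactly the one described above.

```python
def typewriter(n, omega):
    """X_n(omega) for the sequence above: write n = 2**k + j with
    0 <= j < 2**k; X_n is the indicator of [j / 2**k, (j + 1) / 2**k]."""
    k = n.bit_length() - 1          # k = floor(log2 n)
    j = n - 2 ** k                  # position of the block at level k
    return 1 if j / 2 ** k <= omega <= (j + 1) / 2 ** k else 0

# X1 is 1 everywhere; X2, X3 cover the two halves; X4..X7 the quarters.
row = [typewriter(n, 0.6) for n in range(1, 8)]
# → [1, 0, 1, 0, 0, 1, 0]
```

The point ω = 0.6 lies in the second half and the third quarter, so at each level exactly one index gives it the value 1.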
Here X5 puts its mass on the second quarter, and on the other part it is 0, right? And like that it continues. So as n tends to infinity, the interval carrying the mass is shrinking: the variable puts mass only on a very, very small interval and 0 mass everywhere else, even though we cannot point to a single fixed small interval where the mass ends up. So let us ask whether X_n converges to some X where X(ω) = 0 for all ω. Now let us try to understand this. What do we need to check? Whether the probability P({ω : X_n(ω) → X(ω)}) is equal to 1. Let us see. Take any ω; to include this ω in that set, X_n(ω) should converge to 0, right? Because that is what I have taken X to be. For the time being, take ω = 0.3. What value does X1 give this ω? 1. And what does X2 give it? 1. But what does X3 give it? 0. And X4? 0. And X5? 1. Now picture the sequence arranged in rows, like a binary tree: 1 interval in the first row, then 2, then 4, 8, 16, and so on. In each of these rows, ω falls inside exactly one of the intervals, so it gets the value 1 at exactly one index of that row and 0 at all the others, right? So however deep you go, is it possible that after some index X_n(ω) stays at 0 forever? No: in every row there is one index where it takes the value 1 again. For example, it took 1, then 1, then 0, then 0, then 1; go to the next row below, and at some index it takes 1 again. So however far down you go, you will always find some larger n at which the value is 1, right?
And if that is happening, does X_n(0.3) converge to 0 at all? No, right? It violates the definition of convergence: however far I go, I can find an n beyond that point at which X_n(ω) = 1. That is why it cannot converge to 0. And this is true for any ω, right? Do you understand this point? There is no ω here such that X_n(ω) converges to 0. So then what is that set? If my X is identically 0, the set {ω : X_n(ω) → X(ω)} is empty, a null set, and its probability is 0, not 1. Because of that, X_n does not converge almost surely to the X that is identically 0, okay? But notice what you see here: as n proceeds, for each fixed ω I am getting the value 0 most of the time; only occasionally does a 1 appear, and the rest of the time the value remains 0. So it looks like, most of the time, the random variable takes the value 0 at ω, and it is only some occasional later n that violates convergence. So you would still like to say that this sequence of random variables converges to the random variable X that is identically 0, but the definition of almost sure convergence is not capturing that notion. We need to define something weaker than that. For that we have another notion of convergence, called convergence in probability. What does it say? We say that a sequence of random variables converges to X in probability if, for any epsilon greater than 0, a certain sequence of probabilities converges to 0; notice that the limit is now outside the probability.
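The "1 keeps recurring" argument can be checked numerically. The sketch below re-creates the indicator sequence (the helper name `typewriter` is my own label) and lists every index up to 2^10 at which X_n(0.3) = 1; there is one hit in every row, so the hits never stop.

```python
def typewriter(n, omega):
    """Indicator sequence from the example: n = 2**k + j, 0 <= j < 2**k,
    and X_n is 1 exactly on [j / 2**k, (j + 1) / 2**k]."""
    k = n.bit_length() - 1
    j = n - 2 ** k
    return 1 if j / 2 ** k <= omega <= (j + 1) / 2 ** k else 0

omega = 0.3
# In every row k there is an index n = 2**k + j whose block contains omega,
# so the value 1 keeps recurring: X_n(omega) cannot converge to 0.
hits = [n for n in range(1, 1024) if typewriter(n, omega) == 1]
```

Since the range 1 to 1023 covers rows k = 0 through k = 9, `hits` contains at least ten indices, the last of them beyond 512.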
If, for every ε > 0, the probability P(|X_n − X| ≥ ε) converges to 0 as n tends to infinity, then we call this convergence in probability, and we denote it by X_n → X with a p over the arrow, or lim_{n→∞} X_n = X in probability. So now let us see whether the example we had here satisfies this definition. Again take the same sequence, and take X to be identically 0. To verify the definition I need to check it for any ε > 0, so let us fix an ε > 0. Now, X is anyway 0 at all points, so I am asking for the probability that |X_n| ≥ ε, and since the X_n are all nonnegative in my case, this is just P(X_n ≥ ε). Where does the limit of this go in our example? Recall that these X_n are defined on the unit interval probability space, so the probability of taking a value in some interval is nothing but the length of that interval. Now, what is the probability that X_n takes a value greater than or equal to ε? Since X_n only takes the values 0 and 1, for any 0 < ε ≤ 1 the event {X_n ≥ ε} is exactly the interval on which X_n is 1, whose length is 2^{-k} (and for ε > 1 the probability is simply 0). As n increases, these intervals are shrinking, so the probability of putting positive mass shrinks very fast. And since ε is strictly positive, you can find n sufficiently large so that beyond it this probability has come down below any given level: the probability tends to 0.
So it is true that the limit is 0 for every ε > 0, and so it satisfies the definition of convergence in probability: X_n converges to X here in probability. So what is the difference, then, between convergence in the almost sure sense and convergence in probability? Look at the two definitions. In convergence in probability, at each n you are only looking at the distribution of the pair (X_n, X): the event {|X_n − X| ≥ ε} involves just those two random variables. Whereas if you look at the definition of almost sure convergence, there you compute a probability on the entire sequence: the event {ω : X_n(ω) → X(ω)} depends on the joint distribution of the whole sequence. So the definition of almost sure convergence is affected by the joint distribution of the entire sequence, while convergence in probability only looks at the pairwise joint distribution of X_n and X at each time. In that sense, almost sure convergence is a much more demanding requirement than convergence in probability. Later we will see that this is indeed the case: almost sure convergence is a strictly stronger property, because almost sure convergence implies convergence in probability, but not the other way around. Now, before I do that, look at this definition again. What I asked is that the probability of X_n taking a value larger than ε converges to 0. But it may happen that, while the set on which the value exceeds ε has very small mass, the value taken on that small set is itself very, very large. For example, in the examples I just erased, instead of letting X_n take only the value 1, I could let it take a very large value: instead of 1, make it 100, 200, whatever.
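The shrinking-probability computation is easy to tabulate. In this sketch (helper name `prob_exceeds` is mine), for 0 < ε ≤ 1 the event {X_n ≥ ε} is exactly the block of length 2^{-k} at level k = floor(log2 n):

```python
def prob_exceeds(n, eps):
    """P(X_n >= eps) for the indicator sequence with n = 2**k + j:
    the block has length 2**-k, and X_n only takes the values 0 and 1."""
    if eps > 1:
        return 0.0                  # X_n never exceeds 1
    k = n.bit_length() - 1
    return 2.0 ** -k                # length of the block at level k

# The probability halves at each level, so it tends to 0 for every eps > 0.
decay = [prob_exceeds(n, 0.5) for n in (1, 2, 4, 64, 4096)]
# → [1.0, 0.5, 0.25, 0.015625, 0.000244140625]
```

This makes the contrast explicit: each individual probability shrinks to 0 even though, pointwise, the value 1 keeps recurring forever.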
In those cases, yes, it still happens that the variable takes its nonzero value on smaller and smaller intervals, but the value it takes on that small interval could be very, very large. Now, when you are interested in the expectation of the random variable, you are not just interested in the probability: you are also interested in the value taken together with its probability, because the expectation involves the product of the value of the random variable and the probability with which it is taken. Because of that, it may happen that a sequence of random variables satisfies convergence in probability, but on the small-probability region where it takes positive values, that value is very, very large. In some applications this may not be desirable: you want the values to be small even on that small interval. Think of it like this: when I put money in a game, the probability that I lose could be very small; or take a lottery: the probability that I win is very small, but if I win, the amount is huge, so the product of value and probability is still very large. Think of it in the risk sense: the probability that something fails is very small, but if it fails, it is too expensive for me. In that case, you are interested in the value taken as well: a huge value with small probability is still a concern for me. When that is the case, instead of convergence in probability, you may be interested in whether the sequence converges in some expected sense. To capture this, we define another notion, called convergence in mean square sense. Now, instead of looking at the value of the random variable itself, what I look at is an expected value, and not just the expected value, but the mean squared error from the limiting random variable X.
So, we say that the sequence of random variables converges to X in mean square sense if, when I take the squared difference and then its expectation, E[(X_n − X)^2], this goes to 0 as n tends to infinity. Notice that I have put the condition that each of these random variables should have a finite second moment; this is needed for these expectations to make sense. Now the natural question comes: fine, we are looking at different notions of convergence, but does one already imply another? We already said that convergence in probability is a weaker notion than convergence in the almost sure sense, and now we are talking about convergence in mean square sense. Is it some weaker notion of an earlier one, or a totally independent notion of convergence? We will look into that. But before that, let us look at an example. Take random variables again defined on the unit interval probability space, such that for each fixed n, X_n takes a constant value a_n on the interval from 0 to 1/n, and the value 0 on the rest of the interval. So what do you think about this? Let us try to understand whether it converges in the almost sure sense, for which I need to compute the usual probability. But first, what is your guess? What could the limit be? As n tends to infinity, the interval [0, 1/n] shrinks, and the variable puts a nonzero value only on this small interval, which shrinks rapidly as n increases. So most values of ω will eventually get the value 0, right? Because as n increases, the endpoint 1/n keeps shifting to the left, past any fixed ω > 0. So the guess is that this converges to the 0 random variable. A single point, such as ω = 0, does not matter, right?
What matters is everything up to probability 0: if the limit is 0 everywhere except possibly one point, I am still going to call it the 0 random variable. So, does it converge in the almost sure sense? Take the limit of X_n(ω) for whichever ω you like. Initially, say, ω lies inside the block, so it gets some positive value a_n. But as n increases, the endpoint 1/n comes to the left of ω, and from then on X_n(ω) = 0, right? Because of that, every ω > 0 eventually takes the value 0, so the probability of the convergence set is 1, and X_n converges to 0 almost surely. Now, what about convergence in probability? What do I need to verify? Whether lim_{n→∞} P(|X_n − 0| ≥ ε) = 0. Is this true? I have already substituted 0 for X here, right? Now, the event {X_n ≥ ε} is contained in the interval [0, 1/n] where X_n is positive (if a_n < ε, its probability is already 0), and that interval is shrinking: P(X_n ≥ ε) is at most 1/n. So yes, X_n may take a value greater than ε, but on what region? A region that keeps shrinking, of probability at most 1/n, which goes to 0 as n goes to infinity. So it converges in probability as well. Now let us check convergence in mean square sense. What is the value I need? I need to compute E[(X_n − X)^2], and since I have taken X to be 0, this is nothing but E[X_n^2], right? What is this? X_n takes the value a_n with probability 1/n, and with the remaining probability it takes the value 0. So E[X_n^2] = a_n^2 / n.
Now, I want this quantity to go to 0 if I am to claim that X_n converges to 0 in mean square sense. Is this true? Notice that a_n is a number depending on n, and I am dividing a_n^2 by n; whether a_n^2 / n goes to 0 depends on the choice of a_n. Say I define a_n = log n: then a_n^2 / n = (log n)^2 / n, which goes to 0. But if I say a_n = √n, then a_n^2 / n = n / n = 1, the ratio is 1 for every n, right? So it does not go to 0 in this case. In the first case it converges in mean square, and in the second case it does not. So what is happening here? It looks like convergence in the mean square sense depends on how large the amplitude, the value a_n, is relative to the shrinking interval. If a_n grows only logarithmically in n while the interval shrinks like 1/n, then the product goes to 0. But if the value taken is much, much larger, then the product can still be large even though the interval is tiny. For example, take n = 10^6: the interval has length 1/n = 10^{-6}, so you put a positive value only on a very small interval, but the value you take there could itself be 10^6. It is like a lottery: your chance of winning is 1 in a million, but if you win, you get a million rupees, so the product is still large, right? So, as you see, whether this converges in mean square sense or not depends on the sequence a_n. So far we have looked at a sequence of random variables converging to some random variable X.
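The computation E[X_n^2] = a_n^2 / n and the two choices of a_n can be tabulated directly. In this sketch the helper name `mean_square_error` is my own; the formula is the one derived above.

```python
import math

def mean_square_error(n, a):
    """E[(X_n - 0)**2] when X_n equals a(n) with probability 1/n
    and 0 otherwise: the expectation is a(n)**2 / n."""
    return a(n) ** 2 / n

# a_n = log n: (log n)**2 / n -> 0, so X_n -> 0 in mean square.
log_case = [mean_square_error(n, math.log) for n in (10, 10**4, 10**8)]
# a_n = sqrt(n): n / n = 1 for every n, so no mean square convergence.
sqrt_case = [mean_square_error(n, math.sqrt) for n in (10, 10**4, 10**8)]
```

The `log_case` values decay toward 0, while the `sqrt_case` values sit at 1 no matter how large n is, matching the two cases in the lecture.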
Now we will look at the distributions themselves: if I have a sequence of random variables with certain distributions, can they converge to a random variable with some limiting distribution? We say that a sequence of random variables converges in distribution to a random variable X if we look at their CDFs: with F_{X_n} denoting the CDF of the random variable X_n, we require that F_{X_n}(x) converges to F_X(x) at every point x that is a continuity point of F_X, the CDF of the limiting distribution. In that case we call it convergence in distribution; the points x, again, are the continuity points of the CDF of X. Fine then, let us stop here.
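To make the continuity-point restriction in this last definition concrete, here is a small illustrative example of my own, not taken from the lecture: take X_n to be the constant 1/n and X the constant 0. Then F_{X_n}(x) → F_X(x) at every x ≠ 0, while at the discontinuity x = 0 we have F_{X_n}(0) = 0 for all n but F_X(0) = 1; the definition deliberately ignores that point, so X_n → X in distribution.

```python
def F_n(n, x):
    """CDF of the constant random variable X_n = 1/n."""
    return 1.0 if x >= 1.0 / n else 0.0

def F(x):
    """CDF of the constant limit X = 0."""
    return 1.0 if x >= 0.0 else 0.0

# At continuity points of F (any x != 0), F_n(x) converges to F(x):
at_half = [F_n(n, 0.5) for n in (1, 10, 100)]    # → [0.0, 1.0, 1.0]
# At the discontinuity x = 0, F_n(0) stays 0 although F(0) = 1:
at_zero = [F_n(n, 0.0) for n in (1, 10, 100)]    # → [0.0, 0.0, 0.0]
```

Without excluding discontinuity points, even this obvious "constants shrinking to 0" example would fail to converge in distribution, which is exactly why the caveat is part of the definition.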