So, let us get started. In the previous lecture and the one before, we started talking about the exponential family of distributions and then about random sampling. We talked about the sample mean, sample variance and sample standard deviation and their properties, and then started looking into sampling from normal distributions. That led us to two special distributions, Student's t distribution and the F distribution. We also discussed relations between various distributions: what the reciprocal of an F-distributed random variable gives, the relation between the t and F distributions, and the mapping of the F distribution to the beta distribution. You also saw that last one in the quiz.

We have been talking about statistics, which we said are simply functions of the bunch of random variables we have observed. As special cases we looked at the sample mean, sample variance and sample standard deviation, and at that time we discussed the notion of unbiasedness. We called the sample mean an estimator of the mean of the distribution, and the sample variance an estimator of its variance. So if we have n samples X_1, ..., X_n, each with mean μ, then averaging them gives the estimator μ̂ = (1/n) Σ X_i for the mean value, and this μ̂ is a random quantity because it depends on the random variables. We said that if E[μ̂] = μ, then μ̂ is an unbiased estimator of μ. Similarly we saw S², an estimator for the variance, and we said it is also unbiased.

Now we will look into another property of estimators, called consistency. But before we define that notion we need some more definitions, because consistency is an asymptotic notion. Notice that unbiasedness is defined for any fixed n; take n = 10 or n = 15, whatever. If you take n = 10, you average those 10 samples, and their mean happens to be μ when the X_i are i.i.d.; the same holds for n = 15. Maybe I should add the subscript n and write μ̂_n, just to indicate that the estimator depends on n samples. So irrespective of how many samples you use, E[μ̂_n] = μ; for an unbiased estimator this is true for every n. Consistency, which I am going to introduce now, is instead about n tending to infinity. For that we are going to look into convergence of random variables as we have infinitely many of them.

Our first notion of convergence is called convergence in probability. Did you hear about convergence in probability in IE 621? No? Ok. We say that a sequence of random variables X_1, X_2, ... converges in probability to a random variable X if, for every ε > 0, the probability P(|X_n − X| > ε) goes to 0 as n goes to infinity.
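Going back for a moment to the unbiasedness property above, here is a minimal simulation sketch (my own illustration, not part of the lecture; the population is assumed Exp(1), so μ = 1): for any fixed n, averaging many independent batches of n samples shows E[μ̂_n] ≈ μ.

```python
# Hypothetical illustration: unbiasedness of the sample mean for any fixed n.
# Population assumed Exp(1), so the true mean is mu = 1.
import numpy as np

rng = np.random.default_rng(0)
for n in (10, 15, 50):
    batches = rng.exponential(scale=1.0, size=(100_000, n))  # many batches of n samples
    mu_hat = batches.mean(axis=1)                            # one estimate mu_hat_n per batch
    print(n, mu_hat.mean())  # average of the estimates is close to mu = 1 for every n
```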
Now, just to contrast convergence in probability with a standard limit: for a sequence of numbers a_1, a_2, ... tending to a, we simply write a_n → a, and you know the standard definition of that limit. But now I am faced with a sequence of random variables, and what convergence means for them has to be defined appropriately; the first such notion is what we call convergence in probability, whose definition I just gave. Note that once you take the probability P(|X_n − X| > ε), that is some number for each n, and it is this sequence of numbers that we require to converge to 0. The complement form of the definition is that P(|X_n − X| ≤ ε) goes to 1; this says exactly the same thing. At this point I did not say that X_1, X_2, ... have to be i.i.d.; they need to be neither identically distributed nor independent. It is just a definition: if a sequence of random variables is such that this holds, we say it converges in probability. The shorthand notation, analogous to the deterministic case, is X_n → X with a superscript p on the arrow. We will not go further into that; I am introducing this definition just to tell you what consistency is.

Now, you have already encountered sequences of random variables earlier, right? You encountered them when you dealt with the law of large numbers and the central limit theorem. So let us see. Suppose the sequence is i.i.d. with common mean μ and variance σ² < ∞ (there was a typo on the slide: it should be less than infinity, not greater). The law of large numbers said that if you take the average X̄_n = (1/n) Σ X_i of these samples and keep taking it as n tends to infinity, this value goes to μ. Now let us see whether there is a connection between that statement and the definition of convergence in probability. The claim is that it simply says X̄_n converges to μ in probability.

Let us see how. I apply the definition: choose any ε > 0 and look at P(|X̄_n − μ| ≥ ε). I know this is at least 0, by definition. Now, can somebody tell me how to get an upper bound? Markov's inequality, right? Markov or Chebyshev? You can say either, and here is why. What I actually did is square both sides: P(|X̄_n − μ| ≥ ε) = P(|X̄_n − μ|² ≥ ε²), and applying Markov's inequality to the nonnegative random variable |X̄_n − μ|² gives the bound E[|X̄_n − μ|²]/ε². But E[|X̄_n − μ|²] is nothing but Var(X̄_n), so the bound is Var(X̄_n)/ε². I could have written this down directly and called it Chebyshev's inequality, but we derived Chebyshev's inequality from Markov's inequality in exactly this way; I am just repeating those steps. Now, what is Var(X̄_n)? We computed it when we calculated the variance of the sample mean: it is σ²/n. As n goes to infinity this quantity vanishes, irrespective of which ε you chose. So notice that P(|X̄_n − μ| ≥ ε) is lower bounded by 0 and upper bounded by something going to 0 as n goes to infinity. So this probability converges to? 0. By definition, that means X̄_n converges to μ in probability.
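Here is a rough sketch of exactly this argument (my own illustration, with an assumed Uniform(0,1) population, so μ = 1/2 and σ² = 1/12): the empirical tail P(|X̄_n − μ| ≥ ε) sits below the Chebyshev bound σ²/(nε²), and both shrink to 0 as n grows.

```python
# Hypothetical illustration: empirical tail probability vs. the Chebyshev bound.
import numpy as np

rng = np.random.default_rng(1)
mu, var, eps = 0.5, 1.0 / 12.0, 0.05  # Uniform(0,1) population
for n in (10, 100, 1000, 10_000):
    xbar = rng.uniform(size=(20_000, n)).mean(axis=1)  # 20,000 sample means of size n
    tail = np.mean(np.abs(xbar - mu) >= eps)           # empirical P(|Xbar_n - mu| >= eps)
    print(n, tail, var / (n * eps**2))                 # tail <= sigma^2/(n eps^2), both -> 0
```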
So that is why the formal way of interpreting the law of large numbers is: the sample mean converges to the population mean in probability. When we stated the law of large numbers we did not mention a notion of convergence; we just said that if you take the average it will go to some number, but at that time we did not formally define what the limit means. Now we are formalizing it: that convergence was convergence in probability. So the sample mean converges to the population mean, and we already know the sample mean is unbiased for any n. Whenever it so happens that E[X̄_n] = μ and X̄_n also converges in probability to the same parameter μ that I am interested in, I am going to call X̄_n a consistent estimator of the mean.

What is the difference between unbiasedness and consistency? Agreed, one is a limiting property: consistency is asymptotic, as n goes to infinity. But is there anything more? Yes: you see an expectation in one statement and no expectation in the other. Remember what X̄_n, which we denoted μ̂_n, is: it is the average of n samples, and we said it is still a random quantity. If you give me one bunch of n samples I will get some value; if you give me another bunch of n samples I will get another.

So let us do this. One of you gives me a set of samples x_1, x_2, ..., x_n; I call it set 1 and superscript the samples with (1). Another one of you gives me another set, which I superscript with (2). Each of you could give me n samples, but let us take just two sets. From the first set I compute μ̂_n^(1) = (1/n) Σ x_i^(1), and similarly the average of the second set gives me μ̂_n^(2). All of these samples come from the same underlying population. Now, are μ̂_n^(1) and μ̂_n^(2) going to be the same? Not necessarily. But are they the same in expectation? Yes, because we know the sample mean is an unbiased estimator. But suppose both of you keep giving me more and more samples, and I keep averaging. Where is μ̂_n^(1) going to converge? To μ. And where is μ̂_n^(2) going to converge? Also to μ.
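A small sketch of this two-sample-sets picture (illustrative; the population is assumed Exp(1), so μ = 1): the two estimates differ for small n but drift to the same μ as n grows.

```python
# Hypothetical illustration: two independent sample sets, same population mean.
import numpy as np

rng = np.random.default_rng(2)
for n in (10, 1_000, 100_000):
    set1 = rng.exponential(size=n)      # student 1's n samples
    set2 = rng.exponential(size=n)      # student 2's n samples
    print(n, set1.mean(), set2.mean())  # both estimates approach mu = 1
```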
Even though they need not be equal for finite n, when you let n go to infinity both converge to the same μ. That is why, if one of you gives me n tending to infinity samples, I just average and I know the result is μ; I do not care about the other person's samples, as I already have enough. This is why it is called consistent: whichever set I use, as n tends to infinity it consistently gives me the same estimated value of μ. So this is what we mean by consistency: a sample quantity is consistent if its sequence converges to a constant. In the case of the sample mean we know that constant is simply the population mean, and it is the law of large numbers that tells us to which constant it converges.

So far we talked about the sample mean as an estimator of the population mean. Now let us look at the sample variance and see whether it is a consistent estimator of the population variance. How are we going to do that? Here is the definition of the sample variance S_n² for n samples, and we already discussed that the sample variance is unbiased. Again, I want to check how far S_n² is from σ², so I want to compute P(|S_n² − σ²| ≥ ε). I know how to handle this probability, or at least how to get an upper bound on it: again applying Markov's inequality in the same way, you get Var(S_n²)/ε². What is this quantity? It is nothing but the variance of your sample variance: S_n² is the sample variance, and we take its variance. Did we compute the variance of the sample variance? If not, you can verify that it also goes to 0 as n tends to infinity. Because of that, by definition, S_n² converges to σ² in probability, and hence it is consistent.

Let us revisit the definition of consistency: a sample quantity is consistent if its sequence converges to a constant, whatever that constant is. Say Y_n is a sequence of random variables; if it converges to some constant value, we call it consistent. Now, is X̄_n consistent? Does X̄_n converge to a constant? Yes. How? The law of large numbers already told us that X̄_n converges to μ, and that convergence is convergence in probability; that is why it is consistent. Similarly, we just computed that S_n² goes to the constant σ² in probability, and that is why the sample variance is also consistent. And similarly you can show that the sample standard deviation, which is simply the square root of S_n², is also consistent. Any questions about convergence in probability and consistency? Are you in sync with me, or lost, or is anything not consistent here?
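A quick sketch (my own check, with an assumed Uniform(0,1) population, so σ² = 1/12) of the step the argument needs: the variance of the sample variance S_n² shrinks as n grows, while its mean stays at σ².

```python
# Hypothetical illustration: Var(S_n^2) -> 0 while E[S_n^2] stays at sigma^2.
import numpy as np

rng = np.random.default_rng(3)
for n in (10, 100, 1000):
    s2 = rng.uniform(size=(20_000, n)).var(axis=1, ddof=1)  # unbiased S_n^2, one per batch
    print(n, s2.mean(), s2.var())  # mean stays near 1/12 ~ 0.0833, variance shrinks to 0
```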
About the notation: these are estimators, right? The sample mean is an estimator, and we obtained it by averaging n samples. Our informal convention was: whenever μ is the parameter we are trying to estimate, we write the estimator as μ̂_n, and whenever a quantity is an average of a number of samples we put a bar on it. X̄_n is nothing but (1/n) Σ X_i, hence the bar. S_n² gets the bar for the same reason: it is (1/(n−1)) Σ (X_i − X̄_n)², so even though there is some centralization and the factor is 1/(n−1), it is still an average of a number of quantities. I think the confusion is that sometimes I put the bar and sometimes I did not; you can also write S_n² without the bar, no issues. I just like to be consistent: X̄_n gets a bar to indicate it is an average, and S_n² is likewise an average of a number of quantities. The main thing is to be clear about the definition of convergence in probability.

Now, one more exercise. If I say X̄_n converges to X in probability, what does this X have to be? X̄_n by our definition is the sample mean, so X has to be the population mean μ; it cannot be anything else. That X is not a random variable, it is a constant. That is also why we call the estimator consistent: it converges to a constant, and that constant happens to be the population mean here.

Now, one more simple example. Take X_n uniform on the interval (0, 1); notice it is an open interval, 0 and 1 are not included. Now I define another random variable Y_n = X_n^n (I wrote X_n² at first; that should be X_n^n, the nth power of X_n, let me correct it). Does Y_n converge to anything? If Y_n converges to some Y in probability, what does that Y have to be? See, the X_n values all lie strictly between 0 and 1, and if I raise them to the power n with n tending to infinity, what happens to the value? It goes to 0. So is this Y equal to 0? Yes: the limiting value Y is 0, which is no longer a random variable but a constant, so the definition of convergence in probability applies, and we should say Y_n converges to 0 in probability.
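A sketch of this example (illustrative): the exact tail is P(Y_n > ε) = P(X_n > ε^(1/n)) = 1 − ε^(1/n), which goes to 0 as n grows, and a simulation agrees.

```python
# Hypothetical illustration: Y_n = X_n^n with X_n ~ Uniform(0,1) converges to 0 in probability.
import numpy as np

rng = np.random.default_rng(4)
eps = 0.01
for n in (5, 50, 500):
    y = rng.uniform(size=100_000) ** n              # draws of Y_n = X_n^n
    print(n, np.mean(y > eps), 1 - eps ** (1 / n))  # empirical vs exact P(Y_n > eps), both -> 0
```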
Next, we will quickly go through some more definitions without spending much time; these are just definitions. What are the other notions of convergence? There are two more, called almost sure convergence and convergence in distribution. You do not need to know much about them, but it is better to know the definitions.

We say that a sequence of random variables X_n converges to X almost surely if P(lim_{n→∞} X_n = X) = 1. Does anyone understand what this means? It is a compact definition, so let us unpack it. Take a sample point ω; once you fix ω, you can compute the values X_n(ω), and X_n(ω) is then a deterministic sequence, agreed? So for that fixed ω this is a standard limit: if lim X_n(ω) = X(ω) holds, then that ω lies in the set {lim X_n = X}, and if for some ω the condition does not hold, it does not belong to that set. The definition says that if the collection of all those ω for which the limit holds has probability 1, then X_n converges to X almost surely.

This is different from convergence in probability: you can construct random variables that converge in probability but not almost surely, but if a sequence converges almost surely it always converges in probability. So X_n → X almost surely (written with a.s. on the arrow) implies X_n → X in probability, but the implication need not go the other way. So which is the stronger notion of convergence? Almost sure, right. Fine.

The other notion is convergence in distribution. Suppose you have a sequence of random variables and their CDFs converge pointwise: take some point x, compute the CDFs of all the random variables at that x, and suppose that value converges to the CDF of the limiting random variable X at x, and this happens at all continuity points of F_X. Notice that a CDF can have jumps; we only ask for convergence at the points where F_X is continuous. If this happens, we call it convergence in distribution and denote it X_n → X in distribution, with a d on the arrow.

Now, if you think a little carefully, did we come across convergence in distribution before? The CLT, and the limit was the normal distribution, right. What the CLT says is: take an i.i.d. sequence X_1, X_2, ... with mean μ and variance σ², and look at Y_n = √n (X̄_n − μ)/σ. Then Y_n converges, in what sense? In the distribution sense. In fact, I can say Y_n converges in distribution to Y, where Y is N(0, 1). I think this is good enough.

So that is the relation between the notions: if you have almost sure convergence you know the sequence converges in probability, and if it converges in probability you know it converges in distribution as well, with all these convergences happening to the same limiting X. So if X_n converges to X almost surely, then X_n converges to the same X in probability, and X_n converges to the same X in distribution too. Fine. We do not need to go into more detail than this; you will study convergence in distribution much more elaborately when you take Probability 2. You are doing IE 621 now, right? There is a second part of that course; I do not know what its exact label is. If you take that next version, you will study all of these modes of convergence.
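A minimal sketch of convergence in distribution via the CLT (my illustration; Uniform(0,1) samples are assumed, so μ = 1/2 and σ = √(1/12)): the empirical CDF of Y_n = √n (X̄_n − μ)/σ, evaluated at a few points x, approaches the standard normal CDF Φ(x).

```python
# Hypothetical illustration: CLT as convergence in distribution, checked pointwise on CDFs.
import numpy as np
from math import erf, sqrt

def phi(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

rng = np.random.default_rng(5)
mu, sigma = 0.5, sqrt(1.0 / 12.0)  # Uniform(0,1) mean and standard deviation
for n in (2, 10, 100):
    xbar = rng.uniform(size=(100_000, n)).mean(axis=1)
    y = sqrt(n) * (xbar - mu) / sigma         # standardized sample mean Y_n
    for x in (-1.0, 0.0, 1.0):
        print(n, x, np.mean(y <= x), phi(x))  # empirical CDF of Y_n vs Phi(x)
```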