To continue with our discussion, today we talk in detail about the distribution of the sample mean. Why is this distribution important? It allows us to make valid inferences about the population mean. Our objective is to infer about the population mean, but since we cannot approach each and every unit of the population, we have to get some idea about the population from the sample mean. In this connection, the probability distribution of the sample mean is of significance. Last time we took an illustration where we considered a big population, and from that population we drew a sample of size 25. Now, this sample of size 25 is not unique; different samples contain different units. So, if I want to draw a sample of size 25 from a big population of say 1000 students, I can get a set of 25 students. If I ask you to pick a sample of size 25, you need not pick the same set of 25 students, so your sample will consist of another set of 25 students. Our objective could be to find out the mean height of students in the population. So, I will get a sample of size 25 and compute a sample mean, and you can compute a sample mean based upon your sample of size 25. Like this, we can have 1000 choose 25 possible samples. The objective now is to find out how these sample means behave, that is, what their probability distribution is. So, we look at the distribution of the sample mean and its general properties. Let x 1, x 2, ..., x n be a random sample from a distribution with mean mu and standard deviation sigma. So, we are picking a sample of size little n from a population having a distribution with mean mu and standard deviation sigma.
Then the expected value of the random variable sample mean x bar is the same as the population mean mu, and the variance of the sample mean x bar is the variance of the population, sigma square, divided by the sample size little n. Additionally, instead of the mean x bar, I can consider the statistic which is the sum of the observations in the sample. So, if my statistic is t naught, which is x 1 plus x 2 and so on up to x n, then the expected value of this statistic t naught is n times the population mean mu. The variance of the statistic t naught is n times sigma square, where sigma square is the variance of the population from where we are drawing the sample. And the standard deviation of t naught is nothing but the square root of the variance, that is, root n times sigma. Here, if you look at the expected value of x bar, this is nothing but the expected value of x 1 plus x 2 plus so on plus x n, by n. And this is 1 by n times the expected value of x 1 plus so on plus the expected value of x n. Now, we know that x 1, x 2, ..., x n is a random sample from a population having a distribution with mean mu. So, the expected values of the random variables x 1 up to x n are all mu, and the sum is n mu, which divided by n gives me the expected value of x bar as mu. Similarly, for the variance, this is the variance of x 1 plus so on plus x n, by n. Since this is a linear combination of the random variables, we have seen the result that the coefficients of the linear combination get squared, so we have 1 by n square times the variance of the sum. And since x 1, x 2, ..., x n is a random sample, they are independent random variables, so we can add the variances; this is by independence. This gives us n sigma square, because each of these variances is sigma square, and this multiplied by 1 by n square gives you sigma square by n as the variance of x bar. So, from this result on the variance, what we observe is that the distribution of x bar has less and less variability as the sample size becomes larger.
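The two results just derived can be checked exactly on a small example. The sketch below is only an illustration under an assumed population (one roll of a fair six-sided die, which the lecture itself uses later); the function name `mean_and_variance_of_xbar` is made up for this example. It enumerates every ordered sample of size n, drawn independently, and compares the mean and variance of x bar with mu and sigma square by n.

```python
from fractions import Fraction
from itertools import product

# Assumed example population: one roll of a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)

# Population mean and variance, computed exactly.
mu = sum(x * p for x in faces)
sigma2 = sum((x - mu) ** 2 * p for x in faces)

def mean_and_variance_of_xbar(n):
    """Enumerate all ordered samples of size n; return (E[x bar], Var[x bar])."""
    prob = p ** n  # every ordered sample is equally likely for a fair die
    means = [Fraction(sum(s), n) for s in product(faces, repeat=n)]
    e = sum(m * prob for m in means)
    v = sum((m - e) ** 2 * prob for m in means)
    return e, v

for n in (1, 2, 3, 4):
    e, v = mean_and_variance_of_xbar(n)
    # The last two printed columns should agree: Var[x bar] and sigma^2 / n.
    print(n, e, v, sigma2 / n)
```

Exact fractions are used instead of floating point so that the comparison with sigma square by n is an equality rather than an approximation.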
The first result, that the expected value of x bar equals mu, is also called the unbiasedness property; I will discuss this in a while. And because the variance is inversely related to the sample size, the larger the sample size, the smaller the variability in the distribution of x bar. So, here suppose this is the distribution for say n equal to 1, which means this is the distribution of the population from where we are drawing the sample. And if I increase the sample size to say n equal to 10, then this curve corresponds to x bar based on n equal to 1 and this one corresponds to x bar based on n equal to 10. In other words, the first picture is nothing but the distribution of the population from where we are drawing, because x bar based on n equal to 1 is nothing but the individual observations in the population. As the sample size increases, we see that the variability or the spread of the distribution of x bar decreases, for the simple reason that the variance is inversely related to n. Now let us consider the case when the population from where we are drawing the sample is normally distributed. We take this simplest situation first. So, let x 1, x 2, ..., x n be a random sample from a normal distribution with mean mu and standard deviation sigma. Then for any n the distribution of x bar is again normal, with mean mu and variance sigma square by n. We worked out the mean and the variance of x bar just a while back, and those hold irrespective of the distribution of the population from where we are drawing the sample. However, if additionally the distribution of the values in the population is normal, then the distribution of the sample mean is also normal.
So, that is the important part to note. Similarly, if the population from where we are drawing a random sample is normal, then the sample total t naught is also normally distributed, with mean and variance as indicated earlier, which are n times mu and n times sigma square. Now, what happens to the distribution of the sample mean when the population from where you are drawing the sample is not normally distributed? This brings us to an important result called the central limit theorem, or the CLT in short. What does this theorem say? Let x 1, x 2, ..., x n be a random sample from a distribution. We do not indicate what the distribution of the population is. However, we know that the mean of that distribution is mu and the variance is sigma square. So, we only know the mean and the variance of the population, but we do not know the corresponding probability distribution. Then, if n is sufficiently large, x bar has approximately a normal distribution with mean mu and variance sigma square by n. So, what we say here is that when n is sufficiently large, irrespective of the population being normally distributed or not, the sampling distribution of x bar is still approximately normal. A similar thing holds for the sample total t naught, which would also follow an approximate normal distribution, with mean n mu and variance n sigma square. Now let us try to see this through an illustration. What we need to illustrate here is that the larger the value of n, the better the approximation. So, let us roll a fair die. What is the distribution? The possible values are 1, 2, 3, 4, 5, 6 and the probability of each outcome is 1 by 6. So, the population is that of rolling a fair die, with its outcome being the random variable. The probability distribution looks like this, where x is the value which comes face up when you roll the die. Now let us define another random variable, x bar, which is x 1 plus x 2 by 2.
So, this is the situation when you roll the die twice. The outcomes would be x 1 and x 2, and you want to look at the mean value of the outcomes, so you have the random variable x bar. What are the possible values of x bar? When you roll the die once the possible values are 1 through 6, but when you roll the die twice, you are looking at the mean of the two values. What are the possible values of the mean? There would be 1, which occurs when both outcomes are 1. The mean could be 1.5, which is the situation when one of the outcomes is 1 and the other is 2. Like this the possible values go up to 6; 6 would be the mean when both outcomes are 6. And what are the corresponding probabilities of getting these values? What is the chance of getting the mean as 1? That would be 1 by 36, because these are independent events, so you can simply take the product of the probabilities of the two events. So, 1 by 6 is the probability of getting 1 in the first roll, and the probability of getting 1 in the second roll is also 1 by 6. The product is 1 by 36. What is the probability of getting x bar as 1.5? It is 2 by 36, because of all possible pairs, the two outcomes 1, 2 and 2, 1 are favorable to getting the mean as 1.5. Then it will be 3 by 36 for getting the value 2, since 2 can occur as 1, 3 or 3, 1 or 2, 2. Like this it will be 4 by 36, then 5 by 36, then 6 by 36, and then it will start reducing again, down to 1 by 36 for getting the value 6. The maximum probability is for getting the mean as 3.5, which you can get in 6 possible ways. So, now I draw the probability distribution of x bar. Suppose this height is 1 by 6, corresponding to 3.5.
So, we have 1, 2, 3, 4, 5 and 6 on the axis, and the highest probability is at 3.5. The other probabilities will look something like this, falling away on either side. So, what we observe here is that although the population from where we are drawing the sample is not normal, as we go from sample size 1 to sample size 2 the shape of the distribution of x bar slowly turns into a shape similar to that of a bell shaped curve, or the normal curve. If instead of x bar based on a sample of size 2 you roll the die 3 times, find the various possible values of x bar and the corresponding probabilities, and draw the picture, it will slowly smoothen and take the shape of a normal distribution. Now, as against this, suppose I start from a distribution of the population which is not symmetric. Here we started from a symmetric distribution, and we see that as the sample size increases the shape takes the form of a normal distribution. However, consider a situation where the die is not fair: suppose the probability that the value 1 comes face up is 1 by 12, that of 2 is 1 by 6, that of 3 is also 1 by 6, 4 is also 1 by 6, 5 is 1 by 6, but the probability of getting the value 6 is 3 by 12. Clearly this is a probability distribution, because the sum of the probabilities adds up to 1. Now, in this case, when we consider rolling the die 2 times, the possible values of x bar are again 1, 1.5, 2 and so on up to 6. The outcome for the first value is 1, 1; for the second it will be 1, 2 or 2, 1; and like this you have the last outcome as 6, 6. As for the corresponding probabilities, the first will be 1 by 12 into 1 by 12, which is 1 by 144; this is the probability of getting the mean as 1. The second one would be 1 by 12 times 1 by 6 plus 1 by 6 times 1 by 12, which works out to 1 by 36. Similarly, you can go on like this, and the last one is 3 by 12 times 3 by 12, which is 1 by 16.
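The two tables of probabilities described above, for the fair die and for the unfair die, can be reproduced by enumerating every pair of rolls. This is a minimal sketch; the helper name `xbar_distribution` and the dictionary layout are choices made for this example, not something from the lecture.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def xbar_distribution(pmf, n=2):
    """Exact distribution of the mean of n independent rolls.

    pmf maps each face to its probability; the result maps each
    possible value of x bar to its probability.
    """
    dist = defaultdict(Fraction)  # missing keys start at probability 0
    for rolls in product(pmf, repeat=n):
        prob = Fraction(1)
        for r in rolls:
            prob *= pmf[r]  # independence: multiply the probabilities
        dist[Fraction(sum(rolls), n)] += prob
    return dict(sorted(dist.items()))

# Fair die: every face has probability 1/6.
fair = {face: Fraction(1, 6) for face in range(1, 7)}

# The unfair die from the lecture: 1/12, 1/6, 1/6, 1/6, 1/6, 3/12.
loaded = {1: Fraction(1, 12), 2: Fraction(1, 6), 3: Fraction(1, 6),
          4: Fraction(1, 6), 5: Fraction(1, 6), 6: Fraction(3, 12)}

fair2 = xbar_distribution(fair)
loaded2 = xbar_distribution(loaded)
for value in fair2:
    # Columns: value of x bar, probability (fair), probability (unfair).
    print(float(value), fair2[value], loaded2[value])
```

Fractions are printed in lowest terms, so 2 by 36 appears as 1/18. Passing `n=3` to the same helper gives the three-roll case mentioned in the lecture.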
So, here, to begin with, the distribution of the population is clearly skewed, but as you increase the sample size n, you will see that the skewness goes on decreasing and the distribution approaches symmetry. So, as n increases the skewness decreases. In other words, if you start from a population having a distribution which is highly skewed, then as your sample size increases the sampling distribution of x bar slowly takes the shape of a bell shaped curve. Initially the skewness is there, and it goes on decreasing as the sample size n increases. So, what we have is that if n is sufficiently large, the sampling distribution of x bar is approximately normal. In fact, through this picture we can see what the central limit theorem is telling us. The population distribution is skewed, as in the first picture, which is purple in color. The sampling distribution of x bar when n is small slowly moves towards a bell shaped curve, and when n is large the distribution of x bar is almost that of the normal curve. Now, we have used the words sufficiently large. As a rule of thumb, if the sample size n is greater than 30, the central limit theorem can be used. Now, certain points to note from this result. First, when n is large, the sampling distribution of the sample mean is well approximated by a normal curve, even when the population distribution is not itself normal. Second, a sample mean based on a large n will tend to be closer to the population mean than will a sample mean based on a small n. This fact follows from the result on the variance of x bar, which is sigma square by n. So, as n becomes larger there is less and less variability, and so you have more chance of having a sample mean closer to the central place, which is the population mean.
The sampling distribution of x bar tends to be centered at the value of the population mean. The spread of the sampling distribution of x bar tends to grow smaller as the sample size n increases. And as n increases, the sampling distribution of x bar tends to a normal distribution with mean equal to the population mean mu and standard deviation equal to the standard deviation of the population, sigma, divided by the square root of n. So, what we have here is the distribution of the random variable in question in the population, which has a mean say mu. Now look at the distribution of x bar based on a sample of size n. That distribution will have a spread which is smaller than that of the original distribution, because the variance is a function of 1 by n, and as n increases the distribution gets more tapered and more concentrated in and around the population mean. As against this, look at a situation where we have a skewed distribution for the population random variable x, and see what happens to the distribution of x bar as n increases. To begin with it was a skewed distribution, but as n increases the distribution of x bar slowly takes the shape of a normal curve, and if you go further it becomes more centered. So, the distributions of x bar become more and more centered in and around the population mean as n increases. So, what we have is this: when a sample x 1, x 2, ..., x n is drawn from a population with mean mu and s d sigma, and either n is large, from the central limit theorem, or the population has a normal distribution, the standardized variable, x bar minus the mean of x bar divided by the s d of x bar, follows a standard normal distribution.
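The statement that skewness fades as n increases can be checked numerically. The sketch below is an illustration only: it assumes the unfair die with probabilities 1 by 12, 1 by 6, 1 by 6, 1 by 6, 1 by 6, 3 by 12 as the population, uses the usual moment definition of skewness (third central moment divided by the cube of the standard deviation), and the helper names are made up for this example. It builds the exact distribution of the sample total by repeated convolution; dividing the total by n does not change the skewness, so the result also describes x bar.

```python
def convolve(a, b):
    """Distribution of the sum of two independent variables ({value: prob} dicts)."""
    out = {}
    for x, px in a.items():
        for y, py in b.items():
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out

def skewness(dist):
    """Third central moment divided by the variance to the power 3/2."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    third = sum((x - mean) ** 3 * p for x, p in dist.items())
    return third / var ** 1.5

# Assumed population: the unfair die used as the skewed example.
loaded = {1: 1 / 12, 2: 1 / 6, 3: 1 / 6, 4: 1 / 6, 5: 1 / 6, 6: 3 / 12}

def skewness_of_xbar(n):
    """Skewness of the mean of n independent rolls of the unfair die."""
    total = dict(loaded)
    for _ in range(n - 1):
        total = convolve(total, loaded)
    # Skewness is unchanged by the positive rescaling total / n.
    return skewness(total)

for n in (1, 2, 4, 16, 30):
    print(n, skewness_of_xbar(n))
```

For independent observations the skewness of x bar equals the population skewness divided by the square root of n, so the printed values should shrink towards zero at that rate.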
We have already seen that x bar follows a normal distribution when the population from where we are drawing the sample has a normal distribution. Also, from the central limit theorem, we have seen that even if the population from where we are drawing the sample is not normal, x bar still approximately follows a normal distribution provided the sample size is large enough. So, the standardized value, x bar minus the mean of x bar, divided by the s d of x bar, follows a standard normal distribution, where we replace the mean of x bar by mu, which we have worked out, and the s d of x bar is nothing but sigma by root n. Now, let us use these results to work out an example. Let us consider a random sample of size 16 drawn from a normal population having mean mu equal to 10 and standard deviation sigma equal to 4. So, this is the information given to us about the population. What is the probability that the mean x bar of the sample of size 16 would lie between 9 and 11? Here we have to know the distribution of the sample mean in order to work out this probability. The distribution of x bar is such that the mean of x bar is the same as the mean mu of the population from where the sample has been drawn, which is given as 10, and the standard deviation of x bar is the standard deviation of the population divided by the square root of the sample size, which in this case is 4 by square root of 16, which works out to 1. Once we have this much information, our problem is to find the probability that the sample mean lies between 9 and 11. So, we standardize the variable: we subtract mu and divide by sigma by root n throughout, so that the lower limit becomes 9 minus mu by sigma by root n and the upper limit becomes 11 minus mu by sigma by root n. On the left side we therefore have 9 minus the population mean, which is 10, divided by sigma by root n, which we had worked out as 1.
This is less than or equal to z, the standardized variable corresponding to x bar, and z in turn is less than or equal to 11 minus 10 by 1. So, in other words, we need to find the probability that z lies between minus 1 and plus 1, where z has the standard normal distribution. And what is this probability? We have the empirical rule, from which we had studied that 68 percent of the observations under the standard normal curve lie within one standard deviation of the mean. So, it follows from the empirical rule, or otherwise from the tables, that this probability works out to 0.68. If I now ask you about the probability that the mean x bar is greater than or equal to 11, what is the chance? Just as earlier, this would be the probability that z is greater than or equal to 1, which works out to 0.16. The middle area is 0.68, so the remaining 0.32 is split between the two tails: this tail is 0.16 and the other is also 0.16, from symmetry. So, the answer is that the probability that x bar is greater than or equal to 11 is 0.16. Let us take another example. Consider mu as 18 grams per hot dog. What is this? This is nothing but the fat content, and we are given that sigma is 1; this happens to be the claim of say McDonald's. So, what we have here is that McDonald's are making say hot dogs or burgers, and they are claiming that for the population of hot dogs that they are selling, the mean of the fat contents is 18 grams. It is also given or known that the standard deviation of the fat contents is 1 gram. Now, given this piece of information, you have been informed in advance that on an average the fat content of what you are eating is 18 grams, and with that knowledge you are making an informed decision whether to have a hot dog or not.
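Going back to the sample of size 16 worked out a moment ago, the same two probabilities can be computed without tables. This is a small sketch using `statistics.NormalDist` from the Python standard library; the variable names are chosen for this example. The exact values are about 0.6827 and 0.1587, which the lecture rounds to 0.68 and 0.16.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: normal population, mean 10, s d 4, sample size 16.
mu, sigma, n = 10, 4, 16

# Sampling distribution of x bar: normal with mean mu and s d sigma / sqrt(n) = 1.
xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))

p_between = xbar.cdf(11) - xbar.cdf(9)   # P(9 <= x bar <= 11)
p_at_least_11 = 1 - xbar.cdf(11)         # P(x bar >= 11)

print(round(p_between, 4), round(p_at_least_11, 4))
```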
If you feel that 18 grams of fat content is ok for your health then you would eat it, and if you feel no, it is too high in cholesterol or whatever, then you would rather not eat it. So, given this piece of information, you will go and purchase and eat the hot dog. Now, in order to validate the claim made by McDonald's, suppose I collect a random sample of 36 hot dogs and find that the sample mean is 18.4 grams. So, I randomly go to different McDonald's joints and pick up 36 hot dogs, do an analysis of the fat contents in each of these 36 hot dog samples, and I find that the sample mean of fat contents happens to be 18.4 grams. Now, it is possible that these 36 hot dogs that I have picked up in my sample are on the higher side of fat contents, but can we say something more than this, or can we say something to question McDonald's itself? For that, I ask the question: what is the probability of getting such a sample mean, or a more extreme value of the sample mean, that is, a higher value? What are the chances that I can get a sample mean of 18.4 or more? In other words, what is the probability that x bar is greater than or equal to 18.4? How do you work out this probability? For this we need the distribution of x bar. Can we say anything about the distribution of x bar? We do not know the distribution of the fat contents in the population. We know that the mean is claimed to be 18 and the s d is given as 1, but we do not know the actual picture, the actual shape of the curve of the distribution of the population.
However, can we still say anything about the distribution of x bar? We use the central limit theorem, which says that the distribution of x bar would be approximately normal irrespective of the distribution of the population from where we are drawing, provided the sample size is greater than or equal to 30, which does hold in this case. Therefore, I can find this probability by standardizing: x bar minus mu, divided by the s d of x bar. That will be 18.4 minus the claimed population mean, which is 18, divided by the s d of x bar, which is 1 by square root of 36. This works out to the probability that z is greater than or equal to 2.4. Look at the standard normal table and you will find that this probability works out to 0.0082. So, what does it mean? It means that the claim which McDonald's has made, based on which people are deciding whether or not to have this hot dog, appears to be doubtful. Because under the assumption that the population mean is indeed 18, getting a sample mean of 18.4 or more is very unlikely. So, there is a possibility that our assumption that the population mean is 18 may not be true. That is how we can make probability statements about a hypothesis. We can say that the hypothesis that the population mean is 18 is questionable, because if it is true, getting such a sample mean is possible, but highly unlikely. Now, let us look at the point estimate for the population mean. What is the point estimate of the population mean mu? It is a single number, a statistic based on the sample data, that represents our best guess for the value of the population mean. We have already seen that the sample mean is a statistic in this sense, that it can represent the true population mean. However, that is not necessarily the only statistic that can be used to represent the population mean.
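For reference, the hot dog calculation above can be reproduced in a few lines. This is a sketch under the same assumption the lecture makes, namely that with n equal to 36 the central limit theorem lets us treat x bar as approximately normal; the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: claimed mean 18 g, s d 1 g, 36 hot dogs, observed mean 18.4 g.
claimed_mu, sigma, n, observed_mean = 18, 1, 36, 18.4

# Standardize the observed sample mean under the claimed population mean.
z = (observed_mean - claimed_mu) / (sigma / sqrt(n))

# Upper-tail probability P(Z >= z) for a standard normal Z.
p_upper = 1 - NormalDist().cdf(z)

print(round(z, 2), round(p_upper, 4))
```

The tail probability is about 0.0082, matching the table value quoted in the lecture.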
We can take the median; the sample median is also a statistic, and it can therefore be a guess for the population mean which we are trying to estimate. The same goes for the sample mode. So, any such value which is based on the data that we get from the sample is called a statistic. A statistic whose mean value is equal to mu is said to be an unbiased statistic. For example, the sample mean, as we have seen, is an unbiased statistic, because the expected value of the sample mean x bar is equal to the population mean mu. A point estimate of say 5.5 feet says nothing about how close it might be to the true population mean mu. So, suppose you have a population and you draw a sample, and based on that sample you work out the point estimate, which happens to be 5.5 feet; say you are looking at the heights of the individuals in the population. Getting a point estimate of 5.5 based on the sample mean just gives you an estimate. It does not tell you how close that value is to the true population mean mu. Now, as an alternative, we might report an entire interval of possible values for the population mean mu, and this brings us to what is called interval estimation of the population mean mu. So, apart from getting a point estimate, if we can make some probabilistic statement about how likely it is that the population mean lies between two given values, which is your interval estimate, that would be more meaningful. Before that, I would like to illustrate the property of unbiasedness. Suppose I have a population having units say 1, 2 and 4. So, I have a population of size 3. From here, the population mean is how much? The population mean is 4 plus 2 plus 1, which is 7, divided by 3; that is my population mean. And suppose I am drawing from this population samples of size 2. So, what are the possible samples? My sample could be the units 1 and 2. It could be 1 and 4. It could be 2 and 4.
These are all possible samples of size 2 when I am drawing from a population of size 3. The corresponding sample mean for 1 and 2 would be 1.5, the sample mean for 1 and 4 is going to be 2.5, and for 2 and 4 the sample mean is 3. And if I look at the mean of means, that would be how much? It is 1.5 plus 2.5 plus 3, which works out to 7, divided by the 3 possible samples, which happens to be the same as the population mean mu. So, this indicates that the expected value of the sample mean x bar is the same as the population mean mu. As yet another illustration, you can consider a population say 0, 1, 2, 4. So, here your mu is 7 by 4. The samples in this case would be 0, 1; 0, 2; 0, 4; 1, 2; 1, 4; and 2, 4. The corresponding sample means would be 0.5, 1, 2, 1.5, 2.5 and 3. From here, the mean of means is the sum of these six sample means, which is 10.5 if you add these numbers, divided by 6, which works out to 7 by 4. So, in other words, we see that the sample mean happens to be a statistic which has the desirable property of being an unbiased estimator or unbiased statistic. This brings us to what are called confidence intervals. An alternative to reporting a single value for the parameter being estimated, which is the case of the point estimate, is to calculate and report an entire interval of possible values, which is called a confidence interval. A confidence interval is always calculated by first selecting a confidence level, which is a measure of the degree of reliability of the interval to have captured the true population mean. So, while calculating the confidence interval, when we give two numbers, we will say that these two numbers would capture the true population mean with a certain probability. When we make such statements it becomes more appealing, and we are more informed as to what the chances are that the true population mean lies within certain values. I think we will continue this next time. Thank you.