Welcome to our lecture on sampling distributions. This is a new concept for you. When you take a sample, you have a sample mean, x-bar, and there is the parameter you don't know, the population mean mu. But think about it for a minute. You could have taken a different sample, and you would have had a different x-bar; yet another sample, yet another value of x-bar. You could have hundreds or thousands of samples, and each one of them would have a different value of x-bar, but there's only one population mean mu. When you think of it this way, x-bar is a random variable, just like the underlying population variable x, with its own mean and its own standard deviation. The distribution of the x-bars, if we imagine all the different samples we could have taken, each one with its own x-bar, is the distribution of that random variable. Of course, there's only one population mean; we call it mu, and chances are we don't know what it is. Let's continue imagining this distribution of the x-bars. Every time we take another sample, we get another value of x-bar; we could have hundreds of these samples, thousands, millions, until we imagine a fully formed population of x-bars. And you know what? Approximately 95% of the x-bars would be very close to the true population mean mu, within plus or minus two standard deviations. This is because the x-bar random variable is normally distributed as long as the sample size is relatively large: it follows a normal distribution centered about mu, the same mean as the underlying x-population. This reasoning that we just went through, which we'll see a bit more formally, is called the central limit theorem, and everything we do in statistical inference relies on this theorem. Here's what the central limit theorem says.
It says that even if a population distribution is not normal, the sampling distribution of the x-bars is normal, or at least may be considered approximately normal, for large samples. First of all, if we're taking samples from a normal distribution, the distribution of the x-bars is certainly normal. The theorem says more: even if the population distribution is not just non-normal but very far from normal, as long as we take x-bar as our random variable and look at its sampling distribution, x-bar will follow a normal distribution. The condition is that n, the sample size, should be large. What is large? That varies depending on the statistician you talk to and the textbook you're looking at: certainly at least 30, some say 50, some say even higher. This is the central limit theorem. Picture a hypothetical sampling distribution of the sample mean, the x-bar, when n is relatively large; here are its properties. Remember, don't confuse x-bar and x. This is x-bar, the sample mean, so it has to be based on a certain sample size; we're looking at the x-bars and their distribution. So now we know: if n is relatively large (we'll probably use 50 as a cutoff), the x-bars will follow a normal distribution. The expected value of x-bar, the average of all the averages, is mu. Now what about the standard deviation? The standard deviation of the x-bars is written sigma with a subscript x-bar and is called the standard error of the mean; it equals the population standard deviation divided by the square root of n. So for large samples, the sampling distribution of the mean, the x-bar, can be approximated with a normal distribution, and as you see on the right side of the slide, the expected value of x-bar is mu.
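The theorem is easy to see in a quick simulation. Here is a minimal sketch in Python (not part of the course materials; any non-normal population would do), drawing samples of size 50 from an exponential population, whose mean and standard deviation are both 1, and looking at the resulting x-bars:

```python
import random
import statistics

random.seed(42)
n = 50            # sample size; "large" by the course's cutoff
num_samples = 10_000

# Exponential(1) is far from normal, yet the x-bars behave normally.
xbars = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(num_samples)]

print(statistics.fmean(xbars))   # close to mu = 1
print(statistics.stdev(xbars))   # close to sigma / sqrt(n) = 1 / sqrt(50) ≈ 0.141

# Roughly 95% of the x-bars land within two standard errors of mu.
se = 1 / n ** 0.5
print(sum(abs(x - 1) <= 2 * se for x in xbars) / num_samples)
```

Despite the heavily skewed population, the x-bars average out to mu, spread out by sigma over the square root of n, and land within two standard errors of mu about 95% of the time, exactly as the lecture describes.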
And the standard error of the mean, shown as sigma x-bar, is sigma over the square root of n. Now here's one little thing you don't have to worry about in this course. In the rare situation where you've sampled more than 5% of the population, meaning lowercase n is more than 5% of capital N, the population size, there is something called a finite population correction factor. That's so rare we're not even putting in the formula for it, but it does adjust the formula. So you can forget about it; just be aware that a finite population correction factor exists for the unusual case where you've sampled more than 5% of the population. Now let's look at a very, very trivial example, just to help you understand the whole idea of sampling from a population and the various possible x-bars. It helps you understand what we mean when we say x-bar is a random variable. Take a population that's very small, 5 elements: 1, 2, 3, 4, 5. We have 5 elements in the population, and we're taking samples of size 2. Again, it's very strange that your sample size is 40% of the population, given what we just said about finite population issues, but this will still illustrate the point. The number of possible samples you can take is 5 choose 2, which is 10. Since we know the whole population here, 1, 2, 3, 4, 5, with capital N equal to 5, we can compute the population parameters. Mu is 3: the sum 1 plus 2 plus 3 plus 4 plus 5, divided by 5. The population mean is 3, and the population standard deviation is 1.41. We've looked at the entire population. Now let's look at the sample means for all 10 possible samples. You might get a 1 and a 2, and the average of that would be 1.5. You might get a 1 and a 3; the average of that would be 2. You might get a 1 and a 4; that would be 2.5.
And you might even get a 4 and a 5, which gives 4.5. So those are the 10 possible sample means. Now take the average of the averages: add them all up to get 30, divide by 10, and guess what? The expected value of x-bar, not x but x-bar, is 30 over 10, which is 3, the same as mu. Now we see what unbiased means. Again, this is a very trivial example. Try to imagine the same thing happening when you're taking samples of size, let's say, a thousand out of a population of 330 million; that's roughly the population of the United States. The same thing is going to happen. If you're measuring income, say, and you average out all the x-bars, you'll get mu. It's called unbiasedness: there's no bias, and the expected value of x-bar is always going to be mu. Now let me say a little more about the sampling distribution of x-bar, not to be confused with the distribution of x. Remember, x is an individual value; x-bar is a sample mean. We can actually turn x-bar into a z-value. You can see the formula there: z equals x-bar minus mu, divided by sigma over the square root of n. That converts your sample mean into a z-score. Why? Because you're subtracting the mean of the x-bars, which is mu, and dividing by the standard error of the mean, sigma x-bar, the standard deviation of the x-bars; now you have a z-score. In fact, if n is large (some say 30, some say 50), your teacher will tell you that you can use s as the unbiased estimate of sigma, and here's how to get a z-statistic: take x-bar minus mu, where mu is again the mean of all those x-bars, and divide by s over the square root of n, since we're using s as the unbiased estimate of sigma. This will be used a lot in this course. Let's do a problem.
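The five-element example above is small enough to enumerate completely. Here is a quick sketch (illustration code, not from the lecture slides) listing all ten samples and confirming that the average of the averages is exactly mu:

```python
from itertools import combinations
from statistics import fmean, pstdev

population = [1, 2, 3, 4, 5]
mu = fmean(population)         # population mean: 3.0
sigma = pstdev(population)     # population standard deviation: ~1.41

# All 5-choose-2 = 10 equally likely samples of size 2.
sample_means = [fmean(s) for s in combinations(population, 2)]
print(len(sample_means))       # 10
print(fmean(sample_means))     # 3.0 -- the average of the averages is mu
```

Enumerating every possible sample is only feasible because the population is tiny; with real populations we rely on the theory instead, but the unbiasedness it demonstrates is the same.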
Suppose we have steel chains with an average breaking strength of mu equals 200 pounds and a standard deviation sigma of 10 pounds. How do we know mu and sigma? Well, presumably the production process is very well controlled, and we know what the average and the standard deviation are supposed to be. Now let's say you take a sample; you're not going to test every single chain that comes off this production process. Suppose we take a sample of n equals 100 chains, test each one, and record its breaking strength in pounds. We add those up and divide by 100, and what do we get? The average of the sample. So here's the problem: what is the probability that the sample mean breaking strength will be 195 pounds or less? This is the same as asking what proportion of the sample means in the distribution of the sample means will be 195 pounds or less. And you know the first thing you have to do to solve this problem, right? Draw a picture. See where that 195 pounds falls in the picture and figure out how to get the probability you're interested in. There's the picture: the picture of the x-bar distribution. And remember, the mean of the distribution of the x-bars is the same as the mean of the distribution of the underlying random variable x, so it's 200. What we want to know is the probability, which is the same as the proportion, the percentage, the area under the curve, that's 195 pounds or less. With 200 pounds in the middle as the mean, asking for 195 pounds or less means we're asking for a tail probability, and you can see that very clearly in the picture you're looking at. Now, you don't really know how large or small that tail is until you actually compute the z-value, right? And we get the z-value using the formula we saw on a previous slide. The x-bar value of interest is 195.
And as always, every time we use a z-statistic, we take the random variable we're interested in, subtract its mean, and divide by its standard deviation. In this case the random variable is the sample mean: its mean is mu, the same as the mean of the underlying population x, and its standard deviation is the standard error of the mean, sigma over the square root of n. We end up with a z-value of negative five. By now you've had enough practice with the normal distribution to know that negative five is way out in the extreme negative side of the distribution. If you try to look this up on a z-table, you'd need a very, very good z-table, and you probably won't find it. Why? Because the probability is very, very close to zero, and in fact that's as good an answer as you can give: the probability that the sample mean breaking strength will be 195 pounds or less is very close to zero. And remember, by the definition of the normal distribution, the probability never goes all the way down to zero, so theoretically all we can say is that it's very close to zero. If you use a cumulative standard normal table (or software) to get the area in the tail from negative infinity to negative 5.0, it is a very, very tiny number, about 0.000000287, that is, 2.87 times 10 to the minus 7. That's the probability and the answer to the question. The problem we're going to do now, with light bulbs, foreshadows what we're going to be doing in inference, called hypothesis testing. You see the word claim? That's going to be called a hypothesis. A manufacturer claims that its bulbs have a mean life of 15,000 hours and a standard deviation of 1,000 hours. That's their claim, and we're going to accept it temporarily; we call that a straw man. We've accepted it, and let's see.
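Before we continue, the chain-strength answer can be checked without a table. Here is a small sketch using Python's standard library (the complementary error function gives the normal CDF; this is just a check, not a course requirement):

```python
from math import erfc, sqrt

def normal_cdf(z):
    # Standard normal CDF via the complementary error function.
    return 0.5 * erfc(-z / sqrt(2))

mu, sigma, n = 200, 10, 100      # chain problem: known mu and sigma
se = sigma / sqrt(n)             # standard error of the mean: 1.0
z = (195 - mu) / se              # (195 - 200) / 1 = -5.0
print(z)                         # -5.0
print(normal_cdf(z))             # ≈ 2.87e-07, essentially zero
```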
A quality control expert takes a random sample of 100 bulbs and finds a mean, again a sample mean, so it's x-bar, of 14,000 hours. Should the manufacturer revise the claim? Basically, we look at the sample, which we call the sample evidence, and ask: given the sample evidence, does the claim about mu make sense or not? Again, this foreshadows something very important called hypothesis testing, but you should already be able to follow the logic, and we'll do more of it throughout the course. As usual, we're going to take the sample evidence and turn it into a z-score, but temporarily we accept the claim. They told us mu is 15,000; that's the company's claim, so why not believe them at this point? So we take the sample evidence, the x-bar of 14,000, and the sigma of 1,000 (they told us the population standard deviation is 1,000, and we've accepted it temporarily), and we turn it into a z-score. Now, if something is normally distributed, most z-scores should be between plus and minus 2, approximately. Let's see what this z-score becomes. The numerator is 14,000 minus 15,000: 15,000 is mu, 14,000 is x-bar. In the denominator we have the standard error of the mean: since we're accepting sigma as the standard deviation, we divide it by the square root of n, so it's 1,000 over the square root of 100, that is, 1,000 over 10, which is 100. So we have minus 1,000 over 100, and our sample evidence is an incredible minus 10. We don't have a table that goes to minus 10; even our cumulative table goes only to minus 6. For minus 10 you'd have to do a lot of research to find the value, because the probability is far smaller than one in a trillion.
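The bulb calculation follows the same pattern as the chain problem; here is a quick sketch of the arithmetic (again using the standard library's erfc for the normal CDF, purely as a check):

```python
from math import erfc, sqrt

def normal_cdf(z):
    # Standard normal CDF via the complementary error function.
    return 0.5 * erfc(-z / sqrt(2))

mu_claimed, sigma, n = 15_000, 1_000, 100   # the manufacturer's claim
xbar = 14_000                               # the sample evidence
se = sigma / sqrt(n)                        # 1,000 / 10 = 100
z = (xbar - mu_claimed) / se                # -1,000 / 100 = -10.0
print(z)                                    # -10.0
print(normal_cdf(z))                        # ~7.6e-24: far off any printed table
```

A left-tail probability on the order of 10 to the minus 24 is why no printed z-table bothers to go that far out.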
That x-bar is 10 standard errors below the claimed mean, so what I would tell the company is that they should revise their claim. The likelihood of the claim being true is very, very small, incredibly small; not impossible, remember the distribution goes all the way to minus infinity, but this is super unusual. Reject the claim and tell the company: we doubt that your bulbs actually last 15,000 hours; it's probably a number closer to the 14,000 hours. Now we're going to look at the two kinds of problems you're going to see. In problem A we look at an individual motor; in problem B we look at a sample mean based on a sample of n equals 100 motors. So let's look at the problem. A large automobile company makes a motor, and they've estimated the life of their motors: mu is 100,000 miles and the population standard deviation is 10,000 miles. So we know mu and sigma. Question A asks: what is the probability that a randomly selected individual motor has a life between 90,000 miles and 110,000 miles? Again, that's x, not x-bar. Question B asks a different kind of question: if we take a random sample of 100 motors, so n is 100, and we look at the sample mean x-bar, what is the probability that the sample mean will be below 98,000 miles? Let's solve A: the probability that a randomly selected motor made by this company has a life between 90,000 miles and 110,000 miles. Always draw the picture, and you can see from the diagram what this looks like. We're going to turn 90,000 into a z-score and 110,000 into a z-score. The z-value for 90,000, given that you know mu and sigma, is 90,000 minus the mu of 100,000, divided by the sigma of 10,000, which is minus one. The z-score for 110,000 is 110,000 minus 100,000 over 10,000, which is plus one.
So you're really looking for the area between minus one and plus one using your z-table. From the table, the area on the right side is 0.3413 and the area on the left side is 0.3413, so your answer is 0.6826. Now let's solve part B. Remember, this one is about x-bar, so we have to work with the distribution of x-bar, which also follows the normal distribution. The question was: what proportion of sample means will be below 98,000 miles? Using the formula, and now it's the formula for x-bar, not x: z equals x-bar minus mu, divided by sigma over the square root of n. We know mu and sigma, which were given to us: 98,000 minus 100,000, divided by 10,000 over the square root of 100, works out to minus two. So 98,000 miles is a z-score of minus two, and we want the left tail. Remember, the table gives the area from 0 to z, but we also know the area from zero to minus infinity is 0.5; mu cuts everything in half, 0.50 on one side and 0.50 on the other. To get the left tail, you take 0.5, which is really the area from zero to minus infinity, and subtract the piece between minus two and zero. The area from zero to minus two, which we find from the table, is 0.4772. Take 0.5 minus 0.4772 and you get 0.0228. That tells us the probability of getting a sample mean below 98,000 miles is 2.28%; roughly 2% of the sample means will be in that left tail. Just to remind you what we said before: this topic is very important for the rest of this intro statistics course because it has implications for statistical inference, which is a huge topic you're going to study in a lot of different ways. It's all about the relationship between x-bar and mu. Mu is the parameter of the population; we don't have the population in front of us, so we don't know mu. Without actually conducting a census, we're never going to know mu.
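Both motor answers can be verified the same way. Here is a sketch (the erfc-based normal CDF again, standing in for the z-table):

```python
from math import erfc, sqrt

def normal_cdf(z):
    # Standard normal CDF via the complementary error function.
    return 0.5 * erfc(-z / sqrt(2))

mu, sigma = 100_000, 10_000

# Part A: one individual motor, so standardize X itself (no sqrt(n)).
z_lo = (90_000 - mu) / sigma            # -1.0
z_hi = (110_000 - mu) / sigma           # +1.0
print(normal_cdf(z_hi) - normal_cdf(z_lo))   # ≈ 0.6827 (table: 0.6826)

# Part B: a sample mean of n = 100 motors, so use the standard error.
n = 100
z = (98_000 - mu) / (sigma / sqrt(n))   # -2.0
print(normal_cdf(z))                    # ≈ 0.0228
```

Notice the only difference between the two parts is the denominator: sigma for an individual motor, sigma over the square root of n for a sample mean.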
Certainly not with 100% certainty. On the other hand, we have a single sample that we've taken. We have the x-bar value from that sample, and we're going to use that x-bar value to make inferences about the population mean mu. What kind of inferences? Well, we're going to estimate: we'll do estimation of mu using x-bar as our basis, and we're going to test hypotheses about mu using the x-bar value we got in our sample. Since there are so many possible x-bars, as many as there are possible samples, we use the x-bar value we happen to get as a tool to make inferences about the true population mean mu. In this course, when we talk about sampling distributions, we're focusing on the distribution of the sample mean: x-bar is used to make inferences about mu. Of course, statisticians are interested in other statistics and in using them to make inferences about other parameters. We could be talking about the population proportion, the population standard deviation, or the population median. There are a lot of different population parameters we would like to study based on statistics we get from our sample. In each case, we use a statistic computed from the sample to make inferences about the parameter. How do we do that? Well, the mathematical statisticians do it for us; we're not going to derive it in this class. The basic idea is that every one of these statistics has its own sampling distribution, which has been very well studied and can be applied. Perhaps you'll see that if you take more advanced statistics courses. In this course, when we say sampling distribution, we mean the sampling distribution of the sample mean. As always, do lots and lots of problems; it will stand you in good stead.
You'll understand the material, and it will help you on the exams: if you've done a lot of practice, when you look at a problem you'll quickly recognize which type it is, remember how to solve it, and solve it faster, because in an exam situation you don't want to be going back and forth, puzzling things out from your notes or your textbook.