Alright, so let's take a look at a fundamental problem here, which is the problem of sampling. I want to introduce two new terms. First off, let's consider a set of numerical values: for example, the age of every single person in the United States, or the scores of every single person who has ever taken a particular quiz, or the height of every single tree in a forest. This complete set of numerical values is known as the population, and the primary goal of statistics is to determine information about the population.
However, the problem is that it's in general very difficult to obtain the complete population data. And so what we have to do is make do with a subset of the population. So for example, if I take a look at that population of ages of all persons in the United States, I may take a look at the ages of 100 persons, and that's going to be a sample of size n equals 100. Or maybe I'll take a look at the scores of 3 people, and that's a sample of size n equals 3, drawn from the set of everybody who's ever taken the quiz. Or maybe I'll find the heights of 73 trees, and that's a sample of size n equals 73, drawn from the heights of every single tree in the forest.
And so we come to the fundamental problem of statistics: given information about the sample, what can I determine about the population? To get a handle on this, let's take a look at some samples. So there's my population of quiz scores, and let's try to find a few samples of size n equals 3. That means I want to pick 3 of the quiz scores to be the values in my sample. So for example, I might pick the first 3 quiz scores, 8, 5, and 3. So there's one sample. Maybe I'll take the last 3 quiz scores, 9, 8, and 7. Maybe I'll take a couple of random scores in the middle, 10, 7, 9.
Or maybe I'll take a different set, 5, 10, 7, and so on. So the idea is that I may form these different samples of size n equals 3 by taking 3 of the quiz scores.
Now, what can we do with that? Well, again, the idea is we'd like to find information about the population based on our sample. And so one thing we might start off with is that our population has a mean, a population mean, and we'll use the Greek letter mu for it. And we can compute that population mean, and it turns out to be 8. On the other hand, let's take a look at those samples. My sample of the first 3 quiz scores had a sample mean of about 5.33. My second sample, the last 3 scores, had a sample mean of 8. And my other samples had different sample means.
In practice, we only ever see one of these samples. So maybe I'm collecting these quizzes, and I peel off the first 3 quiz scores, and I get my sample mean of about 5.33. Or maybe I wait until everybody's done, and I take a look at the last 3 quiz scores, and I get my sample mean of 8. Or maybe I take a few scores from the middle, and so on. And so the problem is: given one observation, one sample mean, what can we say about the population? Maybe I see this one, or maybe I see this one, or maybe I see this one, and so on.
Well, we can approach the problem as follows. The 4 different samples of size n equals 3 give us 4 different values of a sample mean. And what I can do is think about those 4 sample means as themselves being a sample of size n equals 4, drawn from the population of sample means. This population of sample means is a set of values, and so it will itself have a population mean. And I'm going to designate that mu of x-bar; that's the population mean of our set of sample means.
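As a rough illustration of the idea so far, here is a short Python sketch that computes the sample mean of each of the four samples mentioned above, and then the mean of those four sample means. The variable names are made up for this sketch. Note that four sample means are only a small sample from the population of sample means, so their mean should not be expected to match the population mean exactly.

```python
from statistics import mean

# The four samples of size n = 3 mentioned in the lecture.
samples = [
    (8, 5, 3),   # the first 3 quiz scores
    (9, 8, 7),   # the last 3 quiz scores
    (10, 7, 9),  # a few scores from the middle
    (5, 10, 7),  # a different set
]

# One sample mean per sample.
sample_means = [mean(s) for s in samples]
for s, m in zip(samples, sample_means):
    print(s, "has sample mean", round(m, 2))

# Treat the four sample means as a sample of size 4 drawn from
# the population of sample means, and take its mean.
mean_of_sample_means = mean(sample_means)
print("mean of these four sample means:", round(mean_of_sample_means, 2))
```

The first two printed sample means agree with the values quoted above (about 5.33 and 8).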
And an important result, the fundamental result in statistics, is known as the central limit theorem, and it guarantees that under a broad range of circumstances this population mean of the set of sample means is going to be the same as the population mean itself. And what that tells us is that our sample mean is going to be a good estimate for the population mean. We can use the sample mean to estimate the population mean.
What about those square deviations that we looked at? Well, if our sample mean was exactly equal to the population mean, then the mean square deviation of the sample would actually be a good estimator for the mean square deviation of the population. But in general, the sample mean will not be equal to the population mean, which means that the mean square deviation of the sample is in general going to be a little too small as an estimate for the mean square deviation of the population. And so we have to remedy this by computing what's known as the standard deviation, and this is based on the concept of the variance.
The variance is going to be formed as follows. If what I have in front of me are the data values for the entire population, which has size n, then the population variance is the sum of the square deviations divided by the number of values, n. That's just our mean square deviation. On the other hand, if I only have the data values for a sample of size n, then the sample variance is going to be the sum of the square deviations, same as before, but this time I'm going to divide by n minus 1. This is not the mean square deviation. It's going to be a little bit larger than the mean square deviation, because I'm dividing by a somewhat smaller number. In either case, the standard deviation is going to be the square root of the corresponding variance.
For example, let's say I have data that consists of the values 5, 8, and 2. And first, suppose this is my entire population of data values.
This is everything that I'm going to have. And so I can find the standard deviation when I view this as my entire population. On the other hand, if this is a sample of size n equals 3, drawn from a much larger set, then I can also find the sample standard deviation of this set.
So if I view this as my entire set, our mean is just going to be our population mean, 5, and my standard deviation is going to be the square root of the mean of the square deviations. So first of all, I'll compute my variance: data value minus mean squared, data value minus mean squared, data value minus mean squared, so that's 0 plus 9 plus 9, which is 18, divided by n equals 3, because I'm viewing this as an entire population. The entire population is this data set, so I have three values. My variance is going to be 6, and my population standard deviation is going to be the square root of 6, about 2.45.
On the other hand, if I view this set of three values as being a sample, then I'm going to compute the mean, first of all, in exactly the same way. So my mean is going to be 5, and note this is the sample mean, because I'm now thinking about this as a sample. And I can compute the sum of the square deviations, but to get the sample variance, I'm going to divide by n minus 1. So again, that's value minus mean squared, value minus mean squared, value minus mean squared, which is 18 as before, and this time, because I'm viewing this as a sample of size n equals 3 drawn from a larger population, I'm going to divide by n minus 1. So that's 3 minus 1, which is 2, and after all the dust settles, my sample variance is 9, so my sample standard deviation is going to be the square root of 9, which is 3.
Now the question is, how do you know if you're dealing with a sample or the population? The answer is, the problem will say: this is the population, or this is a sample.
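The arithmetic in this example can be checked with a few lines of Python. This is just a sketch of the two calculations described above, with variable names invented for the illustration. The last lines compare against the `statistics` module in the Python standard library, whose `pvariance` and `pstdev` functions divide by n and whose `variance` and `stdev` functions divide by n minus 1.

```python
import math
import statistics

data = [5, 8, 2]
n = len(data)

# The mean is computed the same way whether the data is viewed
# as a population or as a sample.
mean_value = sum(data) / n

# Sum of the square deviations from the mean.
sum_sq_dev = sum((x - mean_value) ** 2 for x in data)

# Population view: divide by n.
pop_var = sum_sq_dev / n
pop_sd = math.sqrt(pop_var)

# Sample view: divide by n - 1.
samp_var = sum_sq_dev / (n - 1)
samp_sd = math.sqrt(samp_var)

print("mean:", mean_value)
print("population variance:", pop_var, "standard deviation:", round(pop_sd, 2))
print("sample variance:", samp_var, "standard deviation:", round(samp_sd, 2))

# Cross-check against the standard library.
print(math.isclose(pop_var, statistics.pvariance(data)))
print(math.isclose(samp_var, statistics.variance(data)))
print(math.isclose(pop_sd, statistics.pstdev(data)))
print(math.isclose(samp_sd, statistics.stdev(data)))
```

The only difference between the two views is the divisor, which is why the library offers separate population and sample versions of each function; which one to call depends on whether the data is the whole population or a sample from it.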