So, we'll introduce some measures of variation: a SAD, MAD story with many important implications. So far, we've introduced several measures of center, like the mid-range, mode, median, and mean. This brings up a new problem. Given a center, we can measure the deviation from the center, how far the data values are from whatever center we pick, and there are two common approaches. First, we can find the absolute deviation, the absolute value of the difference between the center and the data value. Second, we can find the squared deviation, the square of the difference between the center and the data value.

Now, while we can use any measure of center, there are two common choices. We can either choose the center as the median, or we can choose the center as the mean. And it's vitally important to keep in mind that, since the center is what you choose it to be, you must specify which measure of center you're using.

So, for example, let the values in a data set be 5, 8, 10, 5, and 6. Let's find the absolute deviations from the median and the squared deviations from the median. So, here we've picked our center as the median value. That means the first thing we'll have to do is find the median, so we'll put the values in order: 5, 5, 6, 8, 10. So the median is 6, which means our absolute deviations are going to be the absolute value of the difference between each data value and 6. Our first data value is 5, so that absolute deviation will be the absolute value of 5 minus 6, or 1. The second data value is 8, so the absolute deviation will be the absolute value of 8 minus 6, which is 2. The third data value is 10, so the absolute deviation will be 4. The fourth data value is 5, so our absolute deviation is 1. And our last data value is 6, so the absolute deviation will be 0.

We also want to find the squared deviations from the median. And for that we can square each of these deviations.
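As a quick sketch of the steps so far, here is the same computation in Python. The language and the variable names are our own choice for illustration, not part of the lesson.

```python
from statistics import median

data = [5, 8, 10, 5, 6]

# We choose the median as our center; sorted, the values are 5, 5, 6, 8, 10.
center = median(data)

# One absolute deviation and one squared deviation per data value.
absolute_deviations = [abs(x - center) for x in data]
squared_deviations = [(x - center) ** 2 for x in data]

print(center)
print(absolute_deviations)
print(squared_deviations)
```

Note that the center is passed around explicitly: swapping `median` for another measure of center would change every deviation.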
So those squared deviations will be 1 squared, 2 squared, 4 squared, 1 squared, and 0 squared. Computing these values gives us the squared deviations: 1, 4, 16, 1, and 0.

Now, the absolute and squared deviations are computed for each individual data value. And what that means is that if we have a mess of data values, we get a mess of absolute deviations and a mess of squared deviations. To translate this into information about the entire set, we'll sum them. And this gives us two new things. The sum of the absolute deviations (SAD) is the sum of the absolute deviations from the center. Similarly, the sum of the squared deviations (SSD) is the sum of the squared deviations from the center.

So let's find the SAD, the sum of the absolute deviations, and the SSD, the sum of the squared deviations, using the median as center for our data set, 5, 8, 10, 5, and 6. Now, this is the data set that we've already computed the absolute deviations and the squared deviations for. So let's just pull in those numbers. The sum of the absolute deviations is 1 plus 2 plus 4 plus 1 plus 0, or 8. And the sum of the squared deviations will be the sum of the squares of these deviations, 1 plus 4 plus 16 plus 1 plus 0, or 22.

Now, it should be clear that we have several ways to choose a center: the median, the mean, or any other measure of center. We also have several ways to choose the measure of variation. We could take a look at the sum of the absolute deviations, the sum of the squared deviations, or we could even use other functions of the deviations.

Let's consider one more idea. The deviations themselves are a set of data values. And rather than dealing with all the data values as individual numbers, we should pick a representative deviation by using some measure of center. We can pick any measure of center we want, but we usually choose the mean. So this gives us two more important ideas. The mean absolute deviation is the mean of the absolute deviations.
And the mean squared deviation, also known as the mean squared error, is the mean of the squared deviations.

So let's find the MAD, the mean absolute deviation, and the MSD, the mean squared deviation, using the median as the center for our data set. Since the first step in finding the mean is adding up all of the data values, the first step in finding the mean absolute deviation or the mean squared deviation is finding the sum of the absolute deviations or the sum of the squared deviations. And we've already found those numbers in a previous problem: 8 and 22. So if I want to find the mean absolute deviation, I'll add up all of those absolute deviations and divide by the number of values, 5, and that gives me 1.6. Meanwhile, the mean squared deviation is going to be the mean of the squares of the deviations. So we'll add these squares of the deviations together and divide by 5. And that gives us a mean of 4.4.

It's worth taking a moment to consider what these values mean. Remember, the mean represents the fair share of a quantity if everyone gets the same amount. And so the mean absolute deviation would be the fair share of the absolute deviations if the absolute deviations were distributed equally. So this value, 1.6, suggests that a typical value is about 1.6 units away from the median. Similarly, the mean squared deviation would be the fair share of the squared deviations if the squared deviations were distributed equally. So that means that a typical value is going to have a squared deviation of 4.4. And that suggests that a typical value is about the square root of 4.4, or about 2.1, units away from the median.

And this last point leads to the following idea, that of the root mean squared error. Since the mean squared deviation is the mean of the squared deviations, we need to take the square root of the mean squared deviation to obtain useful information about the individual deviations.
And this leads to our root mean squared deviation, the square root of the mean squared deviation. For our data set, with the median as center, that's the square root of 4.4, or about 2.1.
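To pull all of these measures together, here is a minimal sketch in Python. The function name `variation_measures` and the dictionary keys are hypothetical names we made up for illustration. The center is a required argument, because the center is whatever you choose it to be.

```python
from math import sqrt
from statistics import mean, median


def variation_measures(data, center):
    """Return SAD, SSD, MAD, MSD, and RMSD of data about the chosen center."""
    absolute_deviations = [abs(x - center) for x in data]
    squared_deviations = [(x - center) ** 2 for x in data]
    n = len(data)

    sad = sum(absolute_deviations)  # sum of the absolute deviations
    ssd = sum(squared_deviations)   # sum of the squared deviations
    mad = sad / n                   # mean absolute deviation
    msd = ssd / n                   # mean squared deviation
    rmsd = sqrt(msd)                # root mean squared deviation
    return {"SAD": sad, "SSD": ssd, "MAD": mad, "MSD": msd, "RMSD": rmsd}


data = [5, 8, 10, 5, 6]

# Same data set, two different choices of center.
about_median = variation_measures(data, median(data))
about_mean = variation_measures(data, mean(data))

print(about_median)
print(about_mean)
```

With the median as center, this reproduces the numbers worked out above: a SAD of 8, an SSD of 22, a MAD of 1.6, an MSD of 4.4, and an RMSD of about 2.1. With the mean as center, the same data gives different values, which is why the center always has to be stated.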