So, descriptive statistics is all about describing a set of data values using a couple of key numbers, and one of those key numbers is known as a measure of center, some indication of where the data values are located. Now, the introduction of a measure of center might begin as follows. Suppose I have a set of data values, and I want to pick something to represent those data values. So, for concreteness, maybe I have a set of quiz scores that look something like this, and I want to pick a number that is somehow representative of this set of quiz scores.
Well, how can we decide what a good representative is? To introduce that, we want to consider this idea of a deviation from a center. So, whatever I think of as the center, what I might want to look at is how much the data values deviate from that center. And the easiest, perhaps most obvious way of doing that is to use something called the absolute deviation. So, suppose I choose some number x, not necessarily a data value. I then find the sum of the absolute deviations from x by finding the sum of the absolute values of the differences between x and each of my data values.
So, for example, here's my set of data values, 8, 5, 9, 6, 2, 7, 4, and I want to look at the absolute deviations from some place. So, maybe I'll take 8 as my chosen value. That means I want to look at the absolute deviations from x equals 8, and once I figure out what those absolute deviations are, I'll add them all together and get my sum. So, our first data value is 8. My absolute deviation from 8 is going to be the absolute value of my data value, 8, minus my center, 8, and that's the absolute value of 0, which is just 0. My next data value is 5. So, I want to find the absolute value of the difference between 5 and 8. That's 5 minus 8, which is negative 3, and the absolute value of negative 3 is 3.
Next data value is 9, so I want to find the absolute value of 9 minus 8, which is 1. Next data value is 6. Absolute value of 6 minus 8 is 2. Next data value is 2. Absolute value of 2 minus 8 is 6. And I do the same for 7 and 4, which gives 1 and 4. So, here are all of my absolute deviations from my chosen value, 0, 3, 1, 2, 6, 1, 4, and the sum of those absolute deviations is going to be 17. What this means is that if I choose 8 as my representative value, then in some sense 8 is some distance away from the data values, and that distance can be roughly characterized by the number 17.
Well, can I do better? Suppose I take 5 as my center, and so I might take a look at the absolute deviations from x equals 5. Again, all I'm going to do is find the absolute value of the difference between each of my data values and my chosen center. So, absolute value of 8 minus 5, absolute value of 5 minus 5, absolute value of 9 minus 5, absolute value of 6 minus 5, and so on. So, I find those absolute values, 3, 0, 4, 1, 3, 2, 1, and I sum them, and I find that the sum of the absolute deviations from 5 is 14. And one way we might look at this is that, because the sum of the absolute deviations from 5 is less than the sum of the absolute deviations from 8, in some sense this value of 5 is closer to these data values than the value 8 is, and so 5 would seem to be a better representative.
And this raises a new question. I have lots and lots of possible candidates for my representative value, and what I might want to do is look for the number that minimizes this sum of absolute deviations. What number gives me the least possible value for this sum of absolute deviations? Well, it turns out that this is going to be tied in with something called the median. So, the idea is the following. We're going to take our set of data values, we'll put them in order, it doesn't really matter which direction, but let's say least to greatest, and the median value is going to be defined as follows.
If there's an odd number of values, the median is going to be the value that's right in the middle. If there's an even number of values, then there are going to be two values in the middle, and so our median is going to be the sum of those two values divided by two, the midpoint of my two middle values. And from this definition, we get the following result: the median will minimize the sum of the absolute deviations. If we want to minimize that sum of absolute deviations, we'll measure them from the median.
So, for example, again, here's our data set, and we want to first find the median and then verify that we actually do get a smaller sum of absolute deviations. So, I have to put the data values in order, and in order those data values are going to be 2, 4, 5, 6, 7, 8, 9. There's my set of data values. There's an odd number of data values, so the median is the one right here in the middle, and that's going to be 6. Notice that there are three values to the right, there are three values to the left, and 6 is right here in the middle. So there's my median, 6, and just to verify our claim, I'll find the sum of the absolute deviations from 6. So, again, for each data value, my absolute deviation is the absolute value of the data value minus the center. I find the sum of those absolute deviations, and I get 13. And remember that the sum of the absolute deviations from 5 was 14, and from 8 was 17, so here we see the median does give us a smaller sum of absolute deviations than either of our other choices.
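If you'd like to check these sums yourself, here's a minimal Python sketch of the same calculation. The function name `sum_abs_dev` and the half-point grid of candidate centers from 0 to 10 are just choices made for this illustration, not something from the lecture. The median comes from Python's standard `statistics` module.

```python
from statistics import median


def sum_abs_dev(data, x):
    """Sum of the absolute deviations |value - x| over all data values."""
    return sum(abs(value - x) for value in data)


# The quiz scores used in the worked example.
scores = [8, 5, 9, 6, 2, 7, 4]

# Compare the three centers discussed: 8, 5, and the median.
for center in (8, 5, median(scores)):
    print(center, sum_abs_dev(scores, center))
# prints:
# 8 17
# 5 14
# 6 13

# Illustration only: scan candidate centers 0, 0.5, 1, ..., 10
# and pick the one with the smallest sum of absolute deviations.
candidates = [k / 2 for k in range(21)]
best = min(candidates, key=lambda x: sum_abs_dev(scores, x))
print(best)  # prints 6.0, which matches the median
```

The scan over candidates is only a spot check on this one data set. It doesn't prove the general result, but it does agree with the claim that the median gives the least sum.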