Okay, so let's take a look at another measure of center. This is the mean, which is what most people have in mind when they talk about a representative or average data value. But the mean is just one way of representing a set of data. Again, we judged how representative a value is for a set in terms of the deviations of the data values from that value. So we might take a look at a different way of measuring that, which is the squared deviation, and it's introduced for the following reason.

Imagine two different data sets: 5, 5, 8, 11, 11 and 8, 8, 8, 8, 8, 20. Both of these data sets have a median of 8. But if you look at the data values in these two sets, 8 in some sense seems to be a better representative for one set than it is for the other. Part of the issue here is that in the second set, all of our deviations really are concentrated in a single value, which is way, way out there, whereas in the first set our deviations are spread out among a bunch of different values. Both sets have the same median, and here they also have the same sum of the absolute deviations from the median, namely 12, so that sum can't tell them apart. But in whatever sense we want, this oddball value, 20, makes the second set a little bit more dispersed.

We can remedy this problem of telling the two sets apart by looking at the sum of the squared deviations. As before, we take the deviations from a given value, and this time we look at the sum of the squares of those deviations. So for example, we might take a look at the sum of the squared deviations from the median, x = 8. For our first set, the sum of the squared deviations, again that's data value minus my representative center, is (5 - 8)^2 + (5 - 8)^2 + (8 - 8)^2 + (11 - 8)^2 + (11 - 8)^2, and I find the sum of the squared deviations is going to be 36.
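As a rough sketch of the arithmetic so far, here is a short Python check. The function names `sum_abs_dev` and `sum_sq_dev` are just illustrative choices, not anything from the lecture. It shows that the absolute deviations from 8 come out the same for both sets, and it reproduces the 36 for the first set.

```python
def sum_abs_dev(data, center):
    """Sum of the absolute deviations of the data values from center."""
    return sum(abs(x - center) for x in data)


def sum_sq_dev(data, center):
    """Sum of the squared deviations of the data values from center."""
    return sum((x - center) ** 2 for x in data)


first_set = [5, 5, 8, 11, 11]
second_set = [8, 8, 8, 8, 8, 20]

# Absolute deviations from the median 8 do not separate the two sets.
print(sum_abs_dev(first_set, 8), sum_abs_dev(second_set, 8))  # 12 12

# Squared deviations from 8 for the first set: 9 + 9 + 0 + 9 + 9.
print(sum_sq_dev(first_set, 8))  # 36
```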
For the other set, the sum of the squared deviations is (8 - 8)^2 five times, once for each 8, plus (20 - 8)^2. So the sum of the squared deviations here is 144, and because that is significantly greater than the 36 we got for the first set, in some sense we can view the second set as being a little bit more dispersed.

And again, as with the median, I might ask the question: what is going to minimize this sum of the squared deviations? We can now introduce the following concept. The mean of a set of numbers is the sum of the numbers divided by the number of numbers. Again, this is what we often think about when we talk about averages, but it's a very specific measure of center. And based on this definition we can prove the following theorem: if you want to find a value that minimizes the sum of the squared deviations, choose the mean. In other words, the sum of the squared deviations from the mean is going to be less than or possibly equal to the sum of the squared deviations from any other number that you could think of as a possible representative.

Well, there's another property of the mean that is in some sense more important than this property of minimizing the sum of the squared deviations. And again, this emerges from the basic definitions of arithmetic as follows. Suppose I have my data values, x1 through xn. I calculate the mean as follows: again, the mean, x bar, is the sum of those data values divided by the number of data values. And I'm going to do a little bit of algebra. Any division corresponds to a multiplication, which here says that n times the mean equals the sum of the data values. And again, multiplication is just repeated addition.
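Before going on with the algebra, here is a small numerical check of the two claims just made, written as a sketch in Python. It is not a proof: it only compares the mean against a grid of candidate centers (0, 0.5, ..., 25, an arbitrary choice), and the helper names are illustrative.

```python
def mean(data):
    """Sum of the numbers divided by the number of numbers."""
    return sum(data) / len(data)


def sum_sq_dev(data, center):
    """Sum of the squared deviations of the data values from center."""
    return sum((x - center) ** 2 for x in data)


first_set = [5, 5, 8, 11, 11]
second_set = [8, 8, 8, 8, 8, 20]

# Candidate centers 0.0, 0.5, ..., 25.0 (an arbitrary grid for illustration).
candidates = [c / 2 for c in range(0, 51)]

for data in (first_set, second_set):
    m = mean(data)
    best = sum_sq_dev(data, m)
    # No candidate on the grid beats the mean.
    assert all(best <= sum_sq_dev(data, c) for c in candidates)
    # The division in the definition, undone: n times the mean is the sum.
    assert len(data) * m == sum(data)
    print(m, best, sum_sq_dev(data, 8))
# 8.0 36.0 36
# 10.0 120.0 144
```

For the second set the mean is 10, not 8, and the sum of the squared deviations drops from 144 at the median to 120 at the mean.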
So by my definition of multiplication, and since n is necessarily a whole number, I have the following: n times x bar is the sum of x bar, the mean, taken n times. Now the thing to notice here is that there are as many copies of the mean on the left as there are data values on the right-hand side. And what this suggests is the following conclusion. Imagine my data values represent some finite quantity that's initially distributed among however many recipients I have: maybe the weight of cereal in n boxes, maybe the points given on an exam to n students, maybe the heights of n people. The mean corresponds to the value that each of these recipients would get if we were to redistribute the total amount equally among our recipients.

Now for heights of people, that doesn't work out so well. But again, if I imagine these to be weights of cereal in a bunch of different boxes, and I imagine redistributing the cereal so that every box has the same weight in it, then the weight that each box would get would be the mean. Or if I imagine these to be the scores of n different students, and I imagine the students sharing the points so that every student gets the same grade on the exam, so the students with higher grades give away some points and the students with lower grades receive some, then the mean corresponds to the grade that everybody would receive if all the grades were equalized.

And so for example we might take a look at this. Adam, Beyoncé, Carrie, Diana and Evan M are scheduled to play at a concert. Adam wants to play 5 songs, Beyoncé 3, Carrie 6, Diana 2, and Evan M decides he wants to do 14 songs. So let's determine first off the mean number of songs the artists wish to perform, and let's interpret the significance of that value.
So finding the mean is pretty easy. We're just going to take the numbers that we have, 5, 3, 6, 2 and 14, add those up, and then divide by the number of numbers that we have. We have 1, 2, 3, 4, 5 numbers (it doesn't matter if they're different or not), so we're going to divide by 5, and we get 30 over 5: the mean number is going to be 6. Now again, the significance is that if we imagine these to be some finite quantity that can be redistributed, then if each of these artists plays 6 songs, the total number of songs being performed is going to be the same, 30. And that's the significance of the mean: if we equalize the amounts given to each one of the recipients, the equal amount that each one gets is the mean value.
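The concert example can be sketched in a few lines of Python. The dictionary layout and the variable names (`songs`, `transfers`) are assumptions made for illustration. The `transfers` values show how many songs each artist would gain (positive) or give up (negative) in the equal redistribution.

```python
# Number of songs each artist wishes to perform, as given in the example.
songs = {"Adam": 5, "Beyoncé": 3, "Carrie": 6, "Diana": 2, "Evan M": 14}

total = sum(songs.values())       # 5 + 3 + 6 + 2 + 14
mean_songs = total / len(songs)   # total divided by the number of artists

# Songs each artist gains (+) or gives up (-) to end at the mean.
transfers = {name: mean_songs - wish for name, wish in songs.items()}

print(total, mean_songs)          # 30 6.0
print(transfers["Evan M"])        # -8.0

# Redistribution keeps the total: gains and losses cancel out,
# and n copies of the mean add back up to the original sum.
assert sum(transfers.values()) == 0
assert mean_songs * len(songs) == total
```

The gains and losses summing to zero is the same fact as n times the mean equaling the sum of the data values, just seen from the recipients' side.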