Hello, in this video we are going to cover measures of center, that is, descriptive statistics that describe the center of a set of data. A measure of center is a value at the center or middle of a data set. These include the mean, median, mode, and midrange. You've likely heard of mean, median, and mode. For many of you the midrange might be something new; you've likely heard the word range, but not midrange. Some notation you'll want to know before we go any further: this squiggly-looking sign, Σ, is the Greek letter sigma. It means sum, that is, find the sum of a set of values. We'll use the generic letter x to represent any individual data value. We'll use little n to represent the sample size, the number of data values in a sample, and big N to represent the population size, the number of data values in a population. When I talk about the mean, I mean the arithmetic mean; there are other types of means, and if you ever take math classes focused on numerical analysis and so forth you'll learn about those. The mean, or arithmetic mean, is a measure of center. It's found by adding up your data values and dividing by how many you have. The mean is a type of average. We use x bar, x with a bar on top, for the mean of a sample. So x bar, the sample mean, is Σx / n: the sum of all the data values (remember that's sigma, and then x, which represents all the data values) divided by how many you have in the sample. We divide by little n because we're finding the sample mean. Please learn that notation: x bar is the sample mean. Next, the Greek letter that kind of looks like a u but isn't: it's mu, μ. Mu is the mean of a population.
So mu is Σx / N: you add up all the data values and divide by how many there are in the population, because remember mu is the population mean. Now some of the pros of the mean. Sample means drawn from the same population vary less than other measures of center, and all data values are used. One of the cons is that the mean is very sensitive to outliers. If you have a really big value or a really small value in your data set, it's really going to mess your mean up. What that means is that the mean is not a resistant measure of center. The median, which we already discussed, is the same thing as the second quartile: it separates the data into halves. It is the middle number of an ordered set of data. The data must be ordered before you find the middle number, the median. It may or may not be a part of the data set. Some books use x tilde, x with a little squiggle on top, and some books use capital M to represent the median, so it just depends. The median is not affected by outliers, because remember how you find the median: you count in through the data values until you find the one in the middle, so the outliers are crossed out. Remember, you can mark out a lower value, mark out a higher value, mark out a lower value, mark out a higher value, until you get to the number in the middle. Outliers do not affect it, so the median is resistant. To find the median by hand, you sort the data. If the number of data values is odd, the median is the number right smack dab in the middle. If the number of data values is even, you'll have two middle numbers that you must average, meaning add them up and divide by two. The mode is the number that occurs most often in the data set. If a data set is bimodal, it has two modes; if a data set is multimodal, it has more than two modes; and sometimes a data set has no mode, which means no data value is repeated.
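The by-hand rules for the median and mode described above can be sketched in Python. This is only an illustration: the helper names `median_by_hand` and `modes` and the small data lists are made up for this sketch and are not from the video. It also shows the mean moving when one value becomes an outlier while the median stays put.

```python
from statistics import mean, multimode

def median_by_hand(values):
    """Sort the data, then take the middle value (odd count)
    or the average of the two middle values (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def modes(values):
    """Return every most-frequent value; an empty list means no mode
    (no data value is repeated), following the definition in the video."""
    if len(set(values)) == len(values):
        return []
    return multimode(values)

# Hypothetical data, chosen only for illustration.
print(median_by_hand([3, 9, 1, 7, 5]))   # odd count: middle of 1, 3, 5, 7, 9 -> 5
print(median_by_hand([3, 9, 1, 7]))      # even count: (3 + 7) / 2 -> 5.0
print(modes([2, 2, 5, 5, 8]))            # bimodal -> [2, 5]
print(modes([1, 2, 3]))                  # nothing repeats -> []

# Replace the largest value with an outlier: the mean jumps, the median does not.
print(mean([1, 3, 5, 7, 9]), median_by_hand([1, 3, 5, 7, 9]))      # 5 5
print(mean([1, 3, 5, 7, 900]), median_by_hand([1, 3, 5, 7, 900]))  # 183.2 5
```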
So mode is the only measure of central tendency that can be used with nominal data. Remember, that's categories and labels, like the color of your shirt, student IDs and so forth, because you can look at shirt colors and say, well, green was the most frequently worn color. I have some sample data sets up here. In part A I have 5.4, 1.1, 0.4, 2, 0.7, 0.3, 0.4, 0.8, 1.10. We want to know which data value occurs the most often, and it's clearly 1.10, or 1.1. In part B you'll notice that 27 occurs three times, 55 occurs three times, 88 occurs twice, and 99 only once. Since there's a tie, with 27 and 55 both occurring three times, the data set is bimodal: 27 and 55 are both modes. In part C I have 1 through 10, and every data value occurs once, so since no data values repeat we have no mode. This is just showing you the different types of data sets that exist in terms of the mode. Now the midrange is literally the midpoint of the maximum and minimum data values. As a formula, you take the maximum value, add the minimum value to it, and divide by two. That is the midrange. One downside is that the midrange is obviously sensitive to outliers, because it uses only the minimum and maximum values of a data set. It's not very common; like I said, most of you have probably not heard of it before. Very rarely do I ever hear of the midrange, but one of the positive things about it is that it is pretty easy to calculate: it uses two data values, you add them up and divide by two. It adds variety to the different types of measures of center that exist, and it prevents someone from thinking the median is the average of the maximum and minimum values. In my example I have a sample of 10 car dealerships, and the number of a certain type of car at each one is given below: 15, 12, 4, 11, 6, 4, 15, 12, 21, and 8.
I want to find the four measures of center. That's mean, median, mode, and midrange. So I'm going to make my list off to the side here. The mean, or x bar because it's the mean of a sample, will be, I don't know, we'll find it in just a minute. You have the median; I'll use big M to represent the median. You have your mode. Remember, we can potentially have more than one mode; we'll find out in a minute. And then you have your midrange. How about we calculate the midrange first? If I look at my data set, what's the highest value and what's the lowest value? Well, the highest value is clearly 21 and the lowest value is 4. So for my midrange, I have 4 plus 21 over 2, or 25 over 2, which is 12.5. How about we look for the mode too, and then I'll use my Google Sheets spreadsheet to do the mean and the median to save us some time. All right. So the mode is the number that occurs the most frequently. It looks like 12 occurs twice, 15 occurs twice, and 4 occurs twice. Wow. So we have three numbers that each occur twice, which means I have three modes. This data set would be called multimodal, and the modes are 4, 12, and 15. All right. You can find the median by hand: you would have to sort the data values and find the middle number. You can find the mean by hand: you add up all the data values, all 10 of them, and divide by 10. But I'm going to use the Google Sheets spreadsheet. In my spreadsheet, I am on the one variable stats tab. I'm going to first clear out column A, which is where I will type all of my data. So let me type all 10 data values. I have 15, 12, 4, 11, 6, 4, 15, 12, 21, and 8. Always double-check to make sure you typed in the correct numbers, because it can be easy to accidentally type the wrong thing. And what do we have here? It looks like we have a mean of 10.8 and a median of 11.5. The calculations are done right there for you. All you have to do is type in the data set.
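For anyone who would rather check the dealership numbers in code than in a spreadsheet, here is a minimal Python sketch using the standard `statistics` module. The variable names are made up for this sketch; the data are the 10 values from the example, and, like the spreadsheet, the library sorts the data for you.

```python
from statistics import mean, median, multimode

# The 10 dealership counts from the example.
cars = [15, 12, 4, 11, 6, 4, 15, 12, 21, 8]

sample_mean = mean(cars)                 # sum of the values (108) divided by n (10)
sample_median = median(cars)             # average of the two middle values, 11 and 12
sample_modes = sorted(multimode(cars))   # every value tied for most frequent
midrange = (min(cars) + max(cars)) / 2   # (4 + 21) / 2

print(sample_mean)    # 10.8
print(sample_median)  # 11.5
print(sample_modes)   # [4, 12, 15]
print(midrange)       # 12.5
```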
You don't have to sort it or anything. The computer is going to do all the work. So thank you, Google Sheets. The mean is 10.8 and the median is 11.5, so those are all of our measures of center. Let's do it again. The eight highest salaries of people in a small town were collected, and the following were obtained, in thousands: 192, 142, 172, 161, and so forth. Let's find the four measures of center. Out of the whole town, I only picked eight salaries. So let's find the sample mean, the median, the mode, and then the midrange. With the midrange, what's the smallest value and what's the largest value? Well, 134 is the smallest, and it looks like 192 is the largest. Remember, for the midrange you add the smallest data value to the largest data value and divide by 2. So what you end up getting is 163,000. Is there a mode? Is there a number that occurs the most times? I see two data values that are 142,000. Does anything else occur twice or more? No. Our mode is 142,000. For the mean and the median, option one is to do it by hand: for the median, you sort the data and find the middle number, and for the mean, you add all the data values and divide by 8. But I think we'll use Google Sheets. Hello, Google Sheets. How are you doing today? Hopefully you're ready to do some work. So I'm typing in all eight data values. Feel free to try it yourself, too, to make sure your spreadsheet's working correctly. All right, I typed in all eight values. Make sure you type them in correctly, and don't be ashamed to go through and check your work. I believe I did that correctly. So we have a mean of 160,375 and a median of 156,000. All right, so we found our measures of center. Go us.
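As a quick arithmetic check of the salary midrange, here is a short Python sketch that uses only the two values the midrange needs, the smallest (134 thousand) and the largest (192 thousand). The variable names are made up for this sketch, and the full list of eight salaries is not reproduced here.

```python
# Smallest and largest of the eight salaries, written out in full.
lowest = 134_000
highest = 192_000

# Midrange: add the minimum and the maximum, then divide by 2.
midrange = (lowest + highest) / 2
print(midrange)  # 163000.0
```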
So does this information allow us to conclude anything about the salaries of people in the small town? Let's look at the data. What did the data include? It included the eight highest salaries. So is this an accurate representation of the salaries of people in that small town? I would say no, because only the eight highest were used. Looking at this data, you might say, yeah, the mean salary of the town was 160,375. But that's the mean of the eight highest salaries, so it is not representative of the town itself. They picked the eight highest salaries. They didn't even randomly pick eight salaries. They just said, hey, let's take the highest. So that's a bit of practice with measures of center, and with checking whether the data are actually representative of what they're supposed to describe. That's all I have for now. Thanks for watching.