Statistics and Excel, Population Variance and Standard Deviation. Got data? Let's get stuck into it with Statistics and Excel.

You're not required to, but if you have access to OneNote, we're in the icon on the left-hand side, OneNote Presentation 1432, Population Variance and Standard Deviation tab. We're also uploading transcripts, so you can go into the View tab and use the Immersive Reader tool, changing the language if you so choose, to read or listen to the transcripts in multiple languages, using the timestamps to tie in to the video presentations.

We're in the OneNote desktop version here, with the data on the left-hand side. Remember that in prior presentations we've been thinking about how we can take our data set and summarize it, representing it in meaningful ways using both numerical and pictorial representations. The numerical representations include standard statistics like the mean or average, the median, quartile one, quartile three, and so on. For pictorial representations, we talked about the box and whiskers plot, or boxplot, as well as the histogram.

Each of these tools has its uses. However, we now want to think more about the spread of the data around a center point. The histogram gives us an idea of that pictorially, but we would also like a numerical representation of it. Last time we thought about an average deviation concept, and it's useful to think about it intuitively once again. Let's do a quick recap of that, and then we'll move on to the more standard calculations, which are the variance and the standard deviation.

We have our simple data set of four values, negative four and positive six among them, which add up to zero. If we take our average calculation, we get an average of zero: if I add the values up, they add to zero, and if I divide by four, I still get zero. That's going to be our middle point, zero.
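For anyone following along outside Excel, the average calculation just described can be sketched in a few lines of Python. The four values are an assumption: the transcript names only negative four and positive six, and negative six and positive four are filled in because they are the values that make the set sum to zero and match the absolute-deviation total of 20 quoted in the recap.

```python
# Assumed data set: only -4 and 6 are named in the transcript; -6 and 4 are
# inferred so that the four values sum to zero.
data = [-6, -4, 4, 6]

total = sum(data)   # add the values up
n = len(data)       # number of items
mean = total / n    # the middle point

print(total, n, mean)  # prints: 0 4 0.0
```

The sum is zero, so dividing by four still gives zero, which is the middle point used in the rest of the recap.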
Our average deviation is not the standard formula used for calculating the spread, but the intuitive calculation we talked about last time simply takes each data point and compares it to the middle point. You can see why that would be useful: it gives us the difference of each data point from the middle point, so I can think about where that data point lies in relation to the middle point, or average. That's the top part of our equation for the average deviation.

Then we said, well, if I sum these differences up, I'm always going to get zero, which means I don't get a meaningful number. What I can do instead is take the absolute value. That means I don't care whether a point is above or below; I just want the distance from the middle point. Then we get our absolute value numbers, and I can sum them up to get 20. I can then take that 20 and divide by the number of items, four (one, two, three, four of them), and I get five. That's a simple way to get an average distance: you're taking an average of the distances from the middle point, the average deviation.

Now we have a slight twist. This is the formula for the average deviation. You'll remember the point of the absolute value is that I can't have negative numbers, because then the differences add up to zero. I have to make them positive, so we take the absolute value, which makes sense, and then divide by n. Instead of doing that, the standard deviation takes the square of x sub i minus mu, where mu represents the mean, sums those squares, divides by n, and then takes the square root. Look at the difference over here.

If I look at my image, by the way, this is a histogram. You can see you've got the data, with the middle point being zero, the negative numbers on the left side, and the positive numbers on the right.
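The two calculations can be compared side by side in a short Python sketch. As before, the data set is an assumption: only negative four and positive six are named in the transcript, and the other two values are inferred from the totals it quotes. The variable names are illustrative, not taken from the presentation.

```python
import math

# Assumed data set: only -4 and 6 are named in the transcript; -6 and 4 are
# inferred so the values sum to 0 and the absolute deviations sum to 20.
data = [-6, -4, 4, 6]

n = len(data)
mu = sum(data) / n                       # population mean (the middle point)

deviations = [x - mu for x in data]      # x_i - mu for each data point
raw_sum = sum(deviations)                # 0 here, so not a useful measure

# Average deviation: take absolute values, sum them, divide by n.
abs_sum = sum(abs(d) for d in deviations)
average_deviation = abs_sum / n

# Population variance: square each deviation instead, sum, divide by n.
variance = sum(d ** 2 for d in deviations) / n

# Population standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(raw_sum, abs_sum, average_deviation)  # prints: 0.0 20.0 5.0
print(variance, round(std_dev, 4))          # prints: 26.0 5.099
```

With these values the average deviation is 5, the population variance is 26, and the standard deviation is about 5.099, slightly larger than the average deviation because squaring gives more weight to the points farther from the middle. In Excel, the built-in AVEDEV, VAR.P, and STDEV.P functions should return the same three numbers for this range.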
That is what you would see if I were to plot out just those four points on a histogram. We can now look at the variance and the standard deviation. Note that both of these are useful, because sometimes the variance, rather than the standard deviation, gives us the relevant information. The standard deviation is probably the first thing that comes to mind when you're thinking about this type of calculation, where you want an idea of the spread of the data around a center point represented by a number instead of by a histogram. But the variance is useful too, and it's kind of like