 Statistics and Excel, Standard Deviation and Variance Population Location Data. Get ready, taking a deep breath, holding it in for 10 seconds and looking forward to a smooth soothing Excel. First, a word from our sponsor. Yeah, actually we're sponsoring ourselves on this one because apparently the merchandisers, they don't want to be seen with us. But that's okay, whatever, because our merchandise is better than their stupid stuff anyways. Like our CPA six-pack shirts, I must have for any pool or beach time, mixing money with muscle, always sure to attract attention. Even if you're not a CPA, you need this shirt, so you can like pull in that iconic CPA six-pack stomach muscle vibe, man. You know, that CPA six-pack everyone envisions in their mind when they think CPA. As a CPA, I actually and unusually don't have tremendous abs. However, I was blessed with a whole lot of belly hair. Yeah, allowing me to sculpt the hair into a nice CPA six-pack-like shape, which is highly attractive. Yeah, maybe the shirt will help you generate some belly hair too. And if it does, make sure to let me know, maybe I'll try wearing it on my head. And yes, I know six-pack isn't spelled right, but three letters is more efficient than four, so I trimmed it down a bit, okay? It's an improvement. That's okay because we'll basically build this from a blank worksheet, but if you do have access, three tabs down below. Example, practice, blank. Example, in essence, answer key, practice tab, having pre-formatted cells so you can get right to the heart of the practice problem. The blank tab, having a blank worksheet except for our data set, so we can practice formatting the cells within Excel as we work through the practice problem. If you don't have this data set, you could type it in there. This is not a very long data set, or you can use some of our other methods to create your own data set. You can also take a look at Kaggle K-A-G-G-L-E dot com to find data sets to practice with as well. Let's go back to the first tab to get an idea of what we will be doing. In prior presentations, we've been talking about our statistical calculations in a bit more detail. Prior to that, we were mainly focused on the intuitive representations of the data in pictorial format, things like the histograms and the box and whiskers. So now we're concentrating in on the spread of data around the center point, around the mean, and we're using our standard deviation and variance calculations to do that. So now we just have a different data set over here that we can practice those similar calculations related to population information. So we have the location over here and then the population data that we will be working with. So let's just get rid of this. I'm going to just format our table and then we'll do the similar calculations we've seen in the past. I'll do them a little bit more quickly here because we have seen some of these in the past. So I'm going to format the entire worksheet by selecting the triangle, right-clicking on the worksheet, formatting the cells. I'm going to go into the numbers tab or go to the currency, negative numbers, bracketed and red, no dollar sign, and we don't need the decimals in this practice problem. Removing them, okay. I'm going to make the entire thing bold, home tab, font group, bold. That's too bold. You are too bold. No, the whole thing is going to be overly bold, overly bold. And then I'm going to hold control, scroll up a bit and let's put our cursor in one of these items so we can go to the insert tab, tables, group and add the table. So we'll add a table dancing ants doing their magic mambo around this thing, mambo, mambo, mambo. And then I'll double click on the title to put a space between them so that I can wrap the text. I'm going to select the title, go to my home tab, and then I'm going to go to the alignment, wrap the text and center it. And then I can make these a little smaller. I can make this one a little smaller. And that looks good. So we have the name over here, I could sort by the name A to Z, it looks like it's sorted by, I could sort by then the results, the population from Z to A or from A to Z. Or I did that backwards, Z to A or A to Z, you get the idea. And then let's do our standard kind of statistical calculations that we've been working with, make it a skinny C so that we can see it better. We need a skinny C so we can see everything else because the C was in the way, it's too large, it's taken up too much time. So let's make us a YD and then we're going to say that we want the mean, we'll calculate the mean or the average. Average, we'll calculate the min number, we'll calculate Q1, we'll calculate the median, which is going to be Q2. And we'll calculate Q3. And we'll calculate the max. And then we'll calculate our focus here, the new calculations we have, standard deviation for the population. That stands for not really this population, but the population of whatever data we have as opposed to a sample. So we're going to do the population calculations and we'll look at the sample calculations later. So remember those are slightly different, but we'll just practice the population one. Variance for the pop. Okay, so then let's do our calculations function-wise equals the average. We're going to take in our data, double-click on the average and then just put in our cursor so we hit the drop down. Pick in the data and enter. The min equals the min, shift 9. And pick it up our data. And there's the smallest one, which we can see if I sort from A to Z. And then quartile 1. Quartile 1 is equal to Q, U, Tab. To pick up the quartile, select in the range and then I need one more argument. We get to the next argument by selecting a comma and then a 1 for Q1 or quartile 1. And then the median, which I could say is quartile 2, but I'm going to use the median calculation. And that's more common. It's easier to type, takes less arguments. We'll just take the whole thing. That's the one in the middle. Then we'll take quartile 3 equals quartile 3, Tab, selecting the data. Quartile 3, I need a comma for a second argument and a 3. We'll take the max or largest number in the data set equals the max control shift. I mean, sorry, shift 9, data, max. If I sort from Z to A, we could see that one. So there's the 30,348. Standard deviation equals the ST. And I'm picking the STD EV dot P for the population. The STD for the population, STD for the population. All right, selecting that one and picking it up. And then we want the variance equals the variance, which is the VAR for the population. And there's the calculations. So there we have it. All right, if I entered a histogram so we can see this pictorially, I can go to the insert tab up top, charts and graphs, histogram, putting in that histogram. We need a gram about our toe. And we call it a histogram. Okay, that's not what it is. But there it is. So now we've got our buckets of our data down below. We can see it's skewed to the right here for our data. Now let's calculate our variance and standard deviation a little bit more longhand in a formula type of basis or table basis, which might be giving us more insights than simply using the Excel functions. So you'll recall if I type in my variance, my formula, I'll just copy and paste the formula over here is that I'm going to lower the font a bit. You could type that in if you want, but I'm not going to retype it in because we have seen it in the past. We've typed it in there, but you could do it by going to the insert tab and go into the symbols equation. And I suggest using the ink equation so that you can just write it in there and it'll type it in there as you write it in there, which is nice. So let's make this home tab. I'll make it orange. So it's a different color. And then I'll pull this up here. I'll make this black and white on the header. There's our formula for the variance black, white. And then we have the standard deviation formula, which is similar, but has in essence another step in the series of steps, which is taking the square root, which is why it's represented by just a sigma instead of the sigma squared. And we can bring that down to 12 as well. 12 as well. And let's bring it on down and let's put some orange around that. Okay, so now let's kind of do this in our table form. This should be black and white. I need a header formatting before. Hold your horse. Hold your horse for crying out loud because it needs a hug. The horse needs a hug. You've got to hold your horse sometimes because it's really, you could tell it's got a long face. The horse has a long face that needs a hug. You got to hold your horse. All right, let's put some blue. Home tab font group. We're going to make this bordered and blue. And then let's just take our data set. I'm going to copy from A to B. Copy A and B. Selecting those, right-click and copy. Bring this on over to the right. And we're going to put this in column R. I'll just paste it normal. I'll make a skinny queue. Any questions? Skinny queue? We're short on the queues here, short on questions. All right, so then we're going to go up top and say that this is going to be... We're going to start doing our top bit here. So we're going to take each of these items minus mu or the average. So that's going to be the mean or the mean or the average equals the average. Selecting the average. And I'm going to pick my data set, recalculating the average again. There's our average. Everything compared to the mean. Every data compared to the mean. X of I up to X of N, which compared to the mean. So we're going to say, all right, this is going to be the difference. Which is equal to the population minus the mean. We have some positives, some negatives because that middle point is the middle point. And therefore we need to get rid of the negatives, not with a absolute value, but we're going to square them, squared, which gets rid of the negatives, but also squares it. So that's why we got the square root when we want to bring it back down to the standard deviation as opposed to the variance. But I'm going to take the prior cell and care it. It care it. That's to the power of taking it to the power of grace. No, the two squared to the power of squared. And then so that's the whole top part, but we got to sum it up now. We could do that by going to the bottom of our table. And we can we can then go into my table designs, table style options and put a total down here. And then over here, I could just keep the total as the name. That's good. Maybe here I'll do the average calculation so we can recalculate the average again, which we see right there. We can over here do the count because I don't need to sum it up or anything. I would like you to count the cells one, two, three, four and so on. And then here let's sum that up to give a double check that it should be zero. It is because we took the distance from the middle point of all of them, some of them being negative. And then we got this large number because we squared all of them, which took away the negative, but now we get this big number. Let's calculate then the variance. So this is the squared, squared difference from the mean, which is equal to this, that's kind of like the subtotal we have thus far, which is the top part of our variance formulas that has been squared here. And so we're going to pick that up. That's the numerator, I should say, if I could stop stuttering, if I could stop stuttering. And then we're going to pick the count. We'll say divide by that thing so I can hit a divide without it beeping at me. And then the count, which is represented by N in our equation. And this is going to be equal to the 78 counting the cells, which is counting the cells. And then that's going to be giving us the variance, which is represented by the sigma. Insert and symbol, cool symbol, Greek symbol here, Greek and Coptic. But I already have it in recent, so I'm going to pick it up down there, add it. And I'm going to hit enter and then go back into a type of two, shift left to select the two, right click on the two, format the two, and make it a subscript so that it does that, making it look like a squared thing. And then we're going to do the squaring, which is which I'm sorry, we're not going to square it. Divide now this equals this cell divided by the cell gives us the the variance or sigma squared. Let's put an underline home tab font group underline. And then we can also do the the standard deviation, which is also a sigma, but not squared. Insert symbols were in the Greek and Coptic. There's the sigma there. There it is down here, either one. Pick it. Pick the sigma instead of your nose. I don't know why I had to say that is I couldn't raise it anyways. This is going to be equal to the square root, the square root. You probably weren't picking your nose in the first place. I didn't mean to accuse you of picking your nose. I felt like saying that. So that comes out to 5134. Let's go ahead and make this border blue home tab font group blue and bordered. So there we have it. So we've calculated this a couple different ways. Again, just notice that when you do it this way, sometimes those separate steps can give you more intuition about what your what these numbers are actually doing. And they can give you sometimes the mid steps actually give you value in and of themselves. These numbers are often useful as well when we're comparing different sets of data, right? So then we can because then we're going to be comparing like the variance and the standard deviation compared to, you know, a similar possibly similar related variance and standard deviation. And so we'll take a look at that more in future presentations, but we can see we got the 5134 here, which ties out to the calculation 5134 here and the variance of the 2696. They're trying to dyslexify me there. The 26, the 26 variance here. Hold on a second. Something is wrong. Let's go back on over here. I think I picked up the wrong variance because if I go into here, I picked up the variance for S instead of P. Okay, so then that should be 2635388. See, they tried to sneak that by me by dyslexifying it with the 696 thing. But no, I totally saw who's I did it myself. I did it to myself. I don't I tried to trick myself by doing that, but no, so there, there it is.