 Statistics and Excel, Population Variance and Standard Deviation. Get ready, take it a deep breath, holding it in for 10 seconds and looking forward to a smooth soothing Excel. First, a word from our sponsor. Yeah, actually, we're sponsoring ourselves on this one because apparently the merchandisers, they don't want to be seen with us. But that's okay, whatever. Because our merchandise is better than their stupid stuff anyways. Like our CPA six pack shirts, a must have for any pool or beach time. Mixing money with muscle, always sure to attract attention. Yeah, even if you're not a CPA, you need this shirt. So you can like pull in that iconic CPA six pack stomach muscle vibe, man. You know, that CPA six pack everyone envisions in their mind when they think CPA. Yeah, as a CPA, I actually and unusually don't have tremendous abs. However, I was blessed with a whole lot of belly hair. Yeah, allowing me to sculpt the hair into a nice CPA six pack like shape, which is highly attractive. Yeah, maybe the shirt will help you generate some belly hair too. And if it does, make sure to let me know. Maybe I'll try wearing it on my head. And yes, I know six pack isn't spelled right, but three letters is more efficient than four. So I trimmed it down a bit. Okay, it's an improvement. If you would like a commercial free experience, consider subscribing to our website at accountinginstruction.com or accountinginstruction.thinkific.com. Here we are in Excel. If you don't have access to this workbook, that's okay because we basically built this from a blank worksheet starting in a prior presentation. So you could start back there. However, if you want to start here with simply a blank worksheet, you can do that as well. If you do have access to this workbook, there's three tabs down below. Example, practice blank. Example, in essence, answer key practice tab having pre-formatted cells. So you can get right to the heart of the practice problem. Blank tab is where we started with just a blank worksheet so we can practice formatting cells within Excel as we work through the practice problem. So last time, what we did is to try to think about this concept of the standard deviation, which gives us an idea of the disbursement, the spread of the data around that middle point, which we are defining as the mean. We thought about the intuitive way that we might come about this if we didn't know firsthand or first off about the standard deviation calculation by starting with an average deviation where we basically took our data points. We've had a very simple data set and we took our data points and we compared it to the mean, the middle point, the average, and then we basically took the absolute value of those items and divided by the count to get a number to give us an idea of where things are populated around that middle point. So we basically took the mean, the average of the distances from each data point to the middle. Now let's make a quick little histogram here just so we can see our data. So I'm going to select our data set and go to the insert tab and then insert tab, chart. Let's add a histogram that we've seen in prior presentations. Pretty boring histogram like that. Let's add a couple buckets here. I'm going to put some buckets down below and we'll go into our buckets on the left. Let's make like eight buckets just to make it. So there we have kind of an idea of our data set, right? Like if I was to picture this data set, what we have is that middle point, the average, the mean is in the middle, which doesn't have any, which is here, and then the data are around that. So we wanted to include some negative and positive numbers just to show that you might have negative and positive numbers on the mean around that midpoint. So then we have two on the negative side and two on the positive side. So that's kind of a pictorial representation of the data. And we're trying to say, well, the middle kind of point and this small little data set is the middle of zero, right? And we're trying to think about what's the distance from basically that middle point with the small data set. Okay, so now we're going to do a similar kind of thing, but now we're going to use the actual calculations that are typically used. And that's going to be the standard deviation and we'll calculate the variance, which is kind of a component of the standard deviation. Now be aware that there's different calculations for if you're dealing with the entire population versus the sample. We're imagining an entire population firsthand. So remember the two buckets we talked about of kind of statistical problems. One, you have all the data, that's what we're imagining here. So we're dealing with the entire population and two, where you have just simply a sample of the data and the calculations will be a little bit different. So right now we're going to look at, they'll be very similar, but a little bit different. Right now we're going to be dealing with a population, the entire population. All right, variance of the population. So population, I'm going to do black and white on the header, home tab, fonts group, drop down on the bucket making it black and white. And then let's add another formula. So I'm going to draw out the formula again by going to the insert tab, symbols and equation. So now we've got our equation item. I'm going to go to the tools now and then ink. I'm going to write in the equation with our little ink thing here. So let's see if we can write this thing in and see if it'll populate properly. So here we go. So I'm going to say this is going to be a sigma. So usually shown as a sigma squared. So it's not seeing my sigma thing. There it is. Sigma squared. And then so here we have it. I'm going to say that that's equal to, that's what we usually use to represent the variance. So the symbol oftentimes will be sigma squared. Okay. So then we're going to do our sum sign looking like this. And so then on top, I'm going to put an N up here, which it doesn't see yet. But then when I go down here and I put the I equals one, then it sees that it still doesn't see my N up top. So okay, let's see if I erase the N. Erase the N. The N is too ugly. It doesn't want to see it N up top. And let's erase this. I think I put a J there. Let's make this an I. Make sure it's an I. So I equals one. Okay, I'm going to keep on going even though it still doesn't see my N. And I'm going to say that this is going to be a brackets. I'm going to say X. So it thinks it's a C, but I'm going to put an X and it picks up the X and then sub I X sub I minus mu. Which is representing the mean. Close it up. And then we're going to put squared up top. So there we have it. It still doesn't recognize my N. What do I have to do? Excel for you to recognize. So I put my N here. Let's try to circle it. Say what? Why can't you see that? So it's still N. So there it is. So I circled it. There it puts the N up top. All right. And then let's put that whole thing over the top of an N. So there we have it. So you can see this is similar to what we did before. And that we're going to take each of our data points. We had four data points. And we're going to subtract them from the mean represented by mu, which in our case was zero. But this time we're going to square them. So that's going to be something that's a little bit different than last time. And then dividing it over the number, which is N. So that's going to be our variance calculation. So I'm going to say, all right, let's insert that. Now this is going to make more sense when we then do the next step to go from the variance to the standard deviation, which we're going to take like the square root. So in essence, we like square it. And then we take the square root of it, you know? So which kind of negates to some degree kind of like the squaring, right? So it's a little bit different than what we did before, which we just said, let's just take the absolute value of it so we don't end up with the negative numbers that will net out to zero. In this case, we're going to square it, which has a similar kind of characteristic that it's not going to, because when we square it, we're not going to get the negative numbers. So that does something similar to the absolute value, but obviously it also increases the value of the numbers. And then we're going to divide it by N. All right, so let's go home tab, font group. We'll make this orange. Let's increase the size of it a bit. And so that looks good. All right, and then we're going to do another one, which is going to be the standard deviation for the population. Standard deviation for the population. Let's make this header formatting, home tab, font group, black and white. Okay, so this is going to be much the same as this here except that we're going to take the square root of the entire thing. So we'll put the whole thing under a square root, for example. And that's why it's going to be represented not with a sigma squared, but simply a sigma. So because we're going to take the square root, remove the square root here and take the square root. Okay, so and then let's go to the insert tab and just draw it again. So I'm going to say make another equation. See if I can write this, hold on a second. Not that that's not what I wanted. Let's do this one. And let's see if we can write it again. And so I'll just do it real quick here. Okay, so I basically wrote the exact same thing except I don't have a squared on the sigma. And then I'm just going to add the new thing I'm going to put I'm going to try to put this whole thing under a square root symbol like so. And so there we have it. So if we were to remove the square root, we would in essence have that variance that we looked at last time, which is we're going to take each data point minus the mean. And then we're going to square it, which has the property of removing all of the negative numbers. Therefore we don't need to do the absolute divided by the count, the number of N. And then because we squared it, you can kind of think about it while we squared it. So what if I then take the absolute value of it? One of the things that that will do is it's going to it's going to remove the negative numbers because when we squared it, we got negative numbers and then we can absolute it, which kind of negates to some degree the squaring and and so now we've come up to kind of a similar thing that we would have, but not exactly similar. That's going to be one of the points when we did the average here. So let's go ahead and insert that. So we're going to bring that down here. So let's pull this down under here. And what's my size? This was came out to be 16. Let's make this 16. And let's make it orange. So you could see this is kind of similar to what we did with our intuitive calculation, which was taking each point minus the mean absolute value, not squaring it, right, but we took the absolute value, get rid of the negatives divided by two versus this where we're taking each item minus the mean and then squaring it removing. So we don't have to do the absolute value, but now we've squared it divided by N and then in essence, taking the square root. All right, so what's the different? Let's see. Let's do the actual calculation and look at our data set. So I'm going to copy my data set over here. Let's just copy this column. And so I'm going to see if I can copy just this. Let's just copy that data set and put it over here. Let's make a skinny V put in my cursor between V and W making it skinny and paste it right there. So it's not in a table, even though it looks like a table because I only copy part of the table. So let's go to the insert tab up top tables and make a table out of it. And there we have it. All right, so then I'm going to do the same starting point. We're going to have the mean, the same top part. I'm just making the numerator here in essence. So we take the mean, which is equal to the same thing we calculated before zero, which is simply the average of our data. So we just took that zero is taking the average of the data, which is, you know, summing them up and dividing by four. And then I'm going to double click on this number. I want to make it absolute. So I'm selecting F4 and the keyboard dollar sign before the D and the one so that each of these numbers are pulling from the same cell. And then I'm going to take the difference against difference. Same thing we did before. This is going to be equal to the six minus the zero. So there's our difference from the mean, which of course are the same numbers in this case, but wouldn't always be that because we picked the mean to be zero. This is where we get something different. We're not going to take the absolute value, but we're going to square each of those differences. So the way you do that is you say equals. I'm going to point to that number and you have a carrot. If you're going to take to the power of something and that's on shift six on the keyboard. So there's the carrot taking it to the two or squaring it. So we're going to say enter. So now we've squared all of them. I'm going to add a total column at the bottom now so we can do that by going to the table design and totals. Now if I sum up the data, if I add up all my data and I just take take the I could take the average by the way if I take the average that's where our zero comes from. And if I count my data here, I could count the data. There's four of them right and then I can take the sum of the differences which is always going to add up to zero because we're comparing everything to the middle point or the mean and then we've got the squared amount instead of the absolute value. So before we had 20 because we just took the absolute value and now we've got of course a bigger number of 104. Alright so now let's take our squared or this is going to be a calculation of the both to this squared difference from the mean. So this is going to be our squared difference from the mean which you can look at either of these formulas but if I'm looking up here I'm going to get to the variance. I've got the numerator so I've got the 104 and then the count so this is going to be the count which is represented by n that equals this number that we counted because we just counted one, two, three, four and let's put an underline home tab font group and underline and that's going to give us then if we divide it's going to give us the variance divide it doesn't want to give me the divide or I have to do it this do it this way and then divide and then variance. If I want to represent the variance with a symbol I could use I can format the cell as one way we can do it or we can enter a symbol for it so for example if I was to enter a symbol and I go to the insert tab symbols I would then go to the normal text Greek and gothic and then you can look for it up top I also have it down here in my recently used so there's my sigma insert it needs to be squared so I'm going to put a two and then select that to right click on it format the cells make it a subscript and so now we've got a subscript of the two I don't know why it's so large though which is kind of odd uh what happened in my I have a huge number two uh it's just the font size got ridiculously large for some reason whoops let's make this a two and bring it down to I think it was 11 is the font 11 something like that this font is 11 number two it needs to be at 11 okay alright and then that's going to be equal to this over this 26 and then we're going to get the standard deviation standard deviation which is represented by just a sigma so if I go into the insert symbols and enter a sigma insert we do that by taking the square root so we're going to take the square root and so that's going to be equal to this actually you got to do a formula with that so it's going to be the square root function which equals sqrt square root of the 26 gets us to five now let's add a couple decimals and you can see it's not exactly five it's a 5.1 so it's a little bit higher which is typically the case when we use this method as opposed to our simply intuitive method of just the average the average deviation so you get a distinct number and that's another one of the properties that is different from the standard deviation than the average so we might go into that in a little bit more in future presentations just to drill down on that point as to why another reason why we might use this number as opposed to doing a calculation like this but for now let's go ahead and make this blue and bordered home tab font group making it blue and bordered if you don't have that blue it's in the color wheel there's the blue and now of course we can also get to the shortcut there by using our excel functions so I can say this is going to be the population variance using excel right and I'm going to say that let's make this sell a little bit larger and this is going to be the same as this so I'll copy that and put it here so the formatting is there with it as well and I could say this is going to be equal to the VA and we're looking for these two here varp.pvar.s we're looking at the population right now and we'll look at the sample in a second and more in a future presentation so if I select my data then we get the 26 same as this 26 up top and let's go ahead and make this bracket and blue and then I can take my I can calculate my population standard deviation let's make this one a little bit wider again so I can then copy my sigma it's going to be represented by sigma so this is going to be equal to ST and once again you've got the standard it's these two we're looking at STDEV dot P dot S we're looking at the population at this point so I'm going to say dot P select my data and enter let's add a couple decimals and we get to the 5.1 let's make this blue and bordered now just to compare that to the formulas which we'll take a look at in more detail later which is the sample variance in Excel so that's going to be equal to the variance so the variance but let's look at the sample just to get the difference and let's pick up the data so if I use that calculation I get 35 adding a couple decimals we'll add some decimals here it's even there and then we're going to say if I took my sample standard deviation in Excel equals the standard deviation for the sample then our data we get something a little bit different right so we'll talk more about that in future presentations because we're focused on the population here but let's make that yellow for now just to say that that's a little bit different of the formula for the population okay in a future presentation we might dive into a little bit more on analyzing the difference between using this calculation and why this kind of more intuitive possibly more intuitive calculation another reason why we might use it use this one instead as the default right