Hi, this is Dr. Don. I have a problem I want to share with you. In the problem, we have salaries for data scientists in five offices of a company, and the manager of the Atlanta office feels that her data scientists are paid less. She thinks that the average salary of her data scientists is less than the average salaries in the other offices. So we need to run a test of hypothesis to see if her claim is correct, that her salaries are significantly different, or lower in her case, than the other salaries. We do this using an ANOVA, a single-factor ANOVA, because we're just dealing with salaries. We're not adding in another factor like gender.

The null hypothesis is always nothing going on. Remember the three Ns: null, nada, nothing happening here. For the ANOVA, it's that all the means are equal. Another way of saying that, as you might see, uses the Greek letter mu for mean. It says that mu 1 equals mu 2 equals mu 3 and so forth, mean 1 equals mean 2 equals mean 3 and so forth. So don't be surprised if you see the Greek letter mu as one way of expressing the null for the ANOVA. The alternative hypothesis is that at least one mean is different. If it comes back significant, we don't know initially which mean it is, but we know that at least one is different.

We're going to use the ANOVA tool in the Excel Data Analysis ToolPak to do this. So let's go over here and click on Data to select that ribbon. Over on the far right, you should have Data Analysis showing because you've activated the ToolPak. If it's not showing, we have some help on how to do that. I'm going to click on that. I want to make sure I select ANOVA single-factor. We don't want to use the two-factor. I want to click OK. In the dialog box, I want to clean that out. I want to go over here, first of all, and make sure my cursor is in the input range. And I want to select all my data. I'm going to start with A1, hold down my left mouse button, and drag over to select all the data.
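Written out in symbols, the null and alternative hypotheses described above would look something like this for the five offices. The subscripts 1 through 5 are just an assumed numbering of the offices, not labels taken from the worksheet.

```latex
\begin{aligned}
H_0 &: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 \\
H_a &: \text{at least one } \mu_i \text{ differs from the others}
\end{aligned}
```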
And you can see that we have an unequal number of observations. That's OK for ANOVA as long as you don't have an extreme difference. Say we had 30 for Atlanta but only two for San Francisco; that might be a problem. But generally having some differences doesn't hurt. I want to check to make sure that I've got my data grouped by columns. My data is in columns: Atlanta, Boston, San Francisco. If your data was in rows, Atlanta, Boston, San Francisco, you would click rows. Our labels are in the first row. I did include those, so that should be checked. I'm going to leave the alpha at 0.05. I'm going to go to the output range. I'm going to delete that. And I'm going to select that cell. And I just click OK.

Looking at our output, we see we have some funky-looking values there. That's Excel's way of saying there's not enough room to show all of the information. This is scientific notation. E+08 means we would move our decimal point 8 places to the right, for a plus. If it's a minus, and you see that sometimes when we're looking at a p-value, we would be moving it to the left. But here we want to move it to the right. What I like to do is just highlight my columns and then double-click on the divider there to expand everything. And we got rid of the scientific notation. What I want to do here, because we're dealing in dollars, is go to Home and reformat that to dollars so I can see that more clearly. We're going to talk about the variance a little bit, so I'm going to highlight that and put commas in there so you can see that better.

Our results tell us that the average salary, the mean salary, in Atlanta is apparently lower, at least than in San Francisco and Boston. But the output of the ANOVA tells us that it's not necessarily statistically significantly lower. And how do we know that? Well, we compare our F statistic, 2.47, to our critical value of F, 2.59.
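As a minimal sketch of the two ideas just covered, here is how the scientific notation and the F comparison could be written in Python. The function name `reject_null` is hypothetical, and the two E-notation values are made-up illustrations rather than numbers from the worksheet. The 2.47 and 2.59 are the values read from the output above.

```python
def reject_null(f_stat: float, f_crit: float) -> bool:
    """Return True when F lands in the rejection region (F greater than F critical)."""
    return f_stat > f_crit


# Scientific notation as Excel displays it (illustrative values only).
big = float("1.5E+08")    # plus sign: decimal point moves 8 places to the right
small = float("2.5E-03")  # minus sign: decimal point moves 3 places to the left
print(f"{big:,.0f}")      # written out in full, with commas
print(f"{small:.4f}")

# F statistic and critical value as reported in the ANOVA output.
f_stat = 2.47
f_crit = 2.59
if reject_null(f_stat, f_crit):
    print("F is in the rejection region: reject the null")
else:
    print("F is not in the rejection region: do not reject the null")
```

The p-value check works the same way in the other direction: the null is rejected only when the p-value is below alpha.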
And because the F is not greater than 2.59, it's not in the rejection region, which means we do not reject the null. Now the p-value is greater than 5%, so that is also telling us not to reject the null. I'm going to just highlight those cells in yellow, yellow because I want to report this in my final report. So we have two ways of telling us that this is not statistically significantly different.

You may be wondering why we're getting a non-significant, no-difference outcome here when we have these obvious differences in the averages of these salaries. The ANOVA compares the variation between groups, in other words, between our five groups, with the variation inside the groups. Each of these groups has a lot of variation, and we see that here in this variance column. Now the variance can be hard to understand because it's in squared units. I think it's easier to understand if we just calculate the standard deviation, which is just equal to the square root of the variance; I'm using the Excel square root function. You can see for Atlanta that's about 20.6 thousand, and it goes up to almost 37,000 for Seattle. So that helps you understand a little bit of the variation that's going on there.

Let me show you one more thing. Let me show you a table where we just add and subtract the standard deviation from the mean for each office. You can see it here for Atlanta, with just one standard deviation above and below: it goes from 91,000 to 132,000. In Seattle, it goes from 90,000 to 164,000. So that's a lot of variation within these groups. And of course, we've got a lot of variation here, from 111 to 158. The combination of those two things gives us this non-significant result, with a p-value greater than 5% and an F statistic less than the critical value. So I hope this helps.
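To make the between-groups versus within-groups comparison concrete, here is a rough sketch in plain Python of how a single-factor ANOVA F statistic can be computed, along with the standard deviation and the mean plus or minus one standard deviation for each office. The salary numbers are hypothetical, in thousands of dollars, and are not the data from the worksheet. The function name `one_way_anova_f` is also just an illustration, and the sketch uses the sample variance, which I am assuming matches the variance column in the Excel summary.

```python
import math
import statistics

# Hypothetical salaries in $1,000s. NOT the data used in the video.
salaries = {
    "Atlanta": [90, 105, 110, 125, 130],
    "Boston": [100, 120, 135, 150],
    "Seattle": [85, 110, 127, 135, 140, 165],
}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Within groups: spread of each value around its own group mean.
    ss_within = sum(
        sum((x - statistics.fmean(g)) ** 2 for x in g) for g in groups
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


for office, values in salaries.items():
    mean = statistics.fmean(values)
    variance = statistics.variance(values)  # sample variance (squared units)
    sd = math.sqrt(variance)                # back to the original units
    print(f"{office}: mean={mean:.1f}, sd={sd:.1f}, "
          f"one sd below={mean - sd:.1f}, one sd above={mean + sd:.1f}")

f_stat, df_b, df_w = one_way_anova_f(list(salaries.values()))
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

When the within-group spread is large relative to the gaps between the group means, the denominator of F grows and F shrinks, which is the pattern described above for the salary data.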