Remember the key question in hypothesis testing: if the null hypothesis is true, what is the probability of all outcomes as likely as or less likely than the observed outcome? And then, do we feel lucky?
So, for example, suppose a committee of three people is selected from a group of twelve women and six men. If the committee consists of two men and one woman, does this support a charge that the committee selection is biased? To answer this question, we need to find some probabilities. So let's suppose the committee members are chosen completely at random; that's our null hypothesis. In that case, we have three people, so let's base our probability distribution on the number of men, which can be zero, one, two, or three, and find the corresponding probabilities. We can use a hypergeometric probability distribution here, because we're choosing a group from a group.
If we're choosing three people from a group of twelve plus six, or eighteen, there are eight hundred and sixteen ways to do this. In order for the committee to have three men, we have to choose those three men from among the six men. There are six choose three, or twenty, committees of three men, so this outcome would have probability 20/816, or about 0.0245. For the observed outcome, we pick two men from the six and one woman from the twelve, so there are a hundred and eighty ways to choose a committee of two men and one woman, and this has probability 180/816, or about 0.2206. If the committee has only one man, that's six men, choose one of them, and twelve women, choose two of them: three hundred and ninety-six ways to choose a committee of one man and two women, so this would have probability 396/816, or about 0.4853. And finally, if there are zero men on the committee, that means all three members are chosen from the group of twelve women, so there are two hundred and twenty ways to choose a committee of three women, and the probability of occurrence will be 220/816, or about 0.2696.
Now, the observed outcome is selecting a committee of two men and one woman.
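The counts above can be checked with a few lines of code. This is only a minimal sketch using Python's standard library rather than anything from the lecture; the names WOMEN, MEN, COMMITTEE, and ways are made up for illustration.

```python
from fractions import Fraction
from math import comb

# Group sizes from the example: 12 women, 6 men, committee of 3.
WOMEN, MEN, COMMITTEE = 12, 6, 3

# Total number of possible committees: 18 choose 3.
total_ways = comb(WOMEN + MEN, COMMITTEE)


def ways(men_chosen):
    """Number of committees with exactly `men_chosen` men (hypothetical helper)."""
    return comb(MEN, men_chosen) * comb(WOMEN, COMMITTEE - men_chosen)


# Hypergeometric distribution of the number of men, as exact fractions.
distribution = {k: Fraction(ways(k), total_ways) for k in range(COMMITTEE + 1)}

for k, p in distribution.items():
    print(f"{k} men: {ways(k)} of {total_ways} committees, probability {float(p):.4f}")
```

Using exact fractions here is a design choice, not a requirement; it just avoids rounding questions when the probabilities are compared later.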
So, if the null hypothesis is true, there are two outcomes with probability equal to or less than that of the observed outcome: the observed outcome itself, and the outcome with three men on the committee. The corresponding p-value is the sum of those two probabilities, 200/816 or about 0.245, so we're seeing an outcome with a reasonably high probability. This committee selection is probably just randomness.
Now, one very important idea to keep in mind: probabilities do not scale. In particular, what happens if we increase all the numbers by the same factor? Well, let's find out. Suppose a committee of three times five, or fifteen, people is selected from a group of twelve times five, or sixty, women and six times five, or thirty, men. If the committee consists of two times five, or ten, men and one times five, or five, women, does this support a charge that the committee selection is biased? Notice that all we've done is scale everything by a factor of five. The important thing is that the percentages are exactly the same as they were before; the only difference is that the raw numbers are larger.
Now, this does require us to calculate a bunch of hypergeometric probabilities, and that can be rather tedious. Fortunately, this is the 20th century. What's that? Oh, sorry, the 21st century, and we have technology at our disposal. Again, most spreadsheets, calculators, and quite a few web apps will calculate hypergeometric probabilities for you. If you're doing this for real, you'll probably have all of this information on a spreadsheet, so it's useful to learn how spreadsheets calculate hypergeometric distributions. Now, you could take, oh, five or six classes at five hundred to a thousand dollars apiece to learn all the details of a spreadsheet, or you could call up the help feature, which will tell you how to use the hypergeometric distribution. And here's the important thing.
You can't look it up unless you know what it's called. So let's type in "hypergeometric," and even then, how the program refers to it might be different, so we have to use a little bit of insight. Probably this HYPGEOMDIST is useful, and that five hundred to a thousand dollar class will basically teach you how to read the manual. Let's see, there's X, and there's a sample size, and then successes, and, okay, let's put this together.
X is the number of results achieved. Since we're interested in these probabilities, we want to let X be zero, one, two, all the way up to 15, as the number of men on the committee. The sample is the size of the random sample, and we're selecting a committee of 15 people. The successes are the men, and we have a total of 30 men, so that's the most we could possibly draw. And the population is 30 plus 60, or 90 people altogether. So we can calculate these probabilities.
Now, what we actually observe is that there are 10 men, so we find the probability of the observed event. That would be drawing 10 men in a sample of 15 when you have 30 men out of a total population of 90, and that probability is 0.003583. Again, whether you calculate this by hand, use a spreadsheet or an app, or actually list out all of the possible combinations, it doesn't make a difference: somehow, we obtained this probability.
Next, we want to find all outcomes with equal or lesser probability. Again, let's take advantage of the fact that we're using a spreadsheet: fill down our formula and set this up so it'll just calculate these hypergeometric probabilities for us. And again, learn how to use a spreadsheet by paying $500 to $1,000 for a class, or by reading some manuals, maybe even watching some YouTube videos. All of this is optional. This is merely doing something that we could do by hand but don't want to. And if we look, the probabilities that are equal or less are these. Add them up.
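The same fill-down-and-add procedure can be sketched outside a spreadsheet. This is a minimal illustration in Python's standard library, not the spreadsheet function itself; hypergeom_pmf and p_value are hypothetical names, and the argument order (x, sample, successes, population) simply follows the order described above.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, sample, successes, population):
    """Probability of exactly x successes in the sample (hypothetical helper)."""
    failures = population - successes
    return Fraction(comb(successes, x) * comb(failures, sample - x),
                    comb(population, sample))


def p_value(observed, sample, successes, population):
    """Sum the probabilities of all outcomes as likely as or less likely
    than the observed one, assuming the null hypothesis of random selection."""
    observed_p = hypergeom_pmf(observed, sample, successes, population)
    total = Fraction(0)
    for x in range(sample + 1):
        p = hypergeom_pmf(x, sample, successes, population)
        if p <= observed_p:  # exact fractions, so no rounding issues
            total += p
    return total


# Original committee: 2 men in a sample of 3, with 6 men among 18 people.
small = p_value(2, 3, 6, 18)

# Scaled by five: 10 men in a sample of 15, with 30 men among 90 people.
scaled = p_value(10, 15, 30, 90)

print(f"observed probability (scaled case): {float(hypergeom_pmf(10, 15, 30, 90)):.6f}")
print(f"p-value, original committee: {float(small):.4f}")
print(f"p-value, scaled committee:   {float(scaled):.4f}")
```

Running both cases side by side shows the point about scaling: the percentages are identical, but the two p-values are very different.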
And so we find that if the null hypothesis is true, we've observed a very unusual event. Now, we still have to make a decision. All the mathematics does is give us a quantifiable and consistent way of making that decision. We have to decide whether we're going to reject the null hypothesis on the basis of this p-value. And again, the question you always have to ask yourself is: what are the consequences if we make the wrong decision?