When testing hypotheses, one hypothesis is the null hypothesis. You can tell which one is the null hypothesis because it will be the one that yields a probability. But why is it called the null hypothesis? One reason is that it frequently appears when comparing different populations, where the null hypothesis is that there is no difference between the two populations. So, suppose two groups have means mu1, mu2 and standard deviations sigma1, sigma2. If we take sufficiently large samples of size n1, n2 from the two groups, the sample means will be approximately normally distributed, with means mu1 and mu2 and standard deviations sigma1/sqrt(n1) and sigma2/sqrt(n2). So, for independent samples, the difference in the sample means will be normally distributed with mean mu1 - mu2 and standard deviation sqrt(sigma1^2/n1 + sigma2^2/n2).

And so this leads to the following type of problem. Suppose we have two groups of students, a treatment group who will get access to special tutoring and help sessions, and a control group who will not. The program costs money, so the natural question to ask is: did it make a difference? So let's set this up. Suppose 50 children are selected from the control group and 125 are selected from the treatment group. They're both given the same standardized test. The control group has a mean of 75 with standard deviation 8, while the treatment group has a mean of 77 with standard deviation 12. Does the data support a claim that the two groups exhibit a difference in test scores?

So the difference in the sample means will be normally distributed with mean equal to the difference in the population means. But by the null hypothesis, the two means are equal, so the mean of their difference will be zero. And so under the assumption of the null hypothesis, the difference in the sample means will be normally distributed with mean mu = 0. The standard deviations of the sample means are sigma1/sqrt(n1) and sigma2/sqrt(n2), and we can compute these from our values. And remember, our stand-in for the population standard deviation is the sample standard deviation. And so we find...
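The standard deviations of the two sample means can be worked out directly from the figures quoted above. Here is a minimal sketch in Python, assuming the sample standard deviations 8 and 12 stand in for the population values; the function name `standard_error` and the variable names are illustrative choices, not something from the lecture.

```python
import math

# Figures quoted in the lecture: sample size and sample standard deviation
n_control, sd_control = 50, 8.0
n_treatment, sd_treatment = 125, 12.0


def standard_error(sample_sd, n):
    """Standard deviation of a sample mean: sd / sqrt(n).

    Assumption: the sample standard deviation is used as a stand-in
    for the unknown population standard deviation.
    """
    return sample_sd / math.sqrt(n)


se_control = standard_error(sd_control, n_control)
se_treatment = standard_error(sd_treatment, n_treatment)

print(round(se_control, 3), round(se_treatment, 3))  # 1.131 1.073
```

Note that the treatment group's sample mean is slightly less variable than the control group's, even though its test scores are more spread out, because its sample is larger.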
Now these are the standard deviations of the two sample means, so the standard deviation of the difference of the sample means will be the square root of the sum of their squares. And so we have the mean, the standard deviation, and the assumption that it's normally distributed. Our actual observation is that the difference is 77 - 75 = 2, and that gives us a p-value of 0.18.

Now, probability and statistics will give us a p-value, but the important question is what we do with it. So remember, the decision to reject or fail to reject the null hypothesis should always incorporate the consequences of a wrong decision. So the real question is not whether the data supports the claim, but rather whether the data is strong enough to support a decision. And to answer that we have to consider the consequences.

So suppose the two groups represent a treatment group given special tutoring and a control group. If there is no difference between the means, funding for the treatment might be cut. And the null hypothesis is that there is no difference between the two groups. So if we reject the null hypothesis, we'd probably claim there is a difference, and this could be used to continue funding for the program. And if our decision is incorrect, we would have wasted money. And here's where the p-value comes in. The p-value is the probability of seeing a difference at least as extreme as the one we observed if the null hypothesis is in fact the true state of the world. So if we treat evidence this strong as enough to reject, that's how often we'd be rejecting the null hypothesis incorrectly when it's true.

On the other hand, suppose we fail to reject the null hypothesis. Then we'd act as though there is no difference between the groups and we'd probably terminate the program. If our decision is incorrect, we would have deprived a group of children of a program that would help their educational development. And so now we have two consequences. Incorrectly rejecting the null hypothesis means we're wasting money. Incorrectly failing to reject the null hypothesis means a loss of opportunity. So let's take a look at that p-value again.
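The p-value calculation described above can be sketched in Python as follows. This is a rough check under stated assumptions, not necessarily the method used in the lecture: it treats the two samples as independent, uses the normal approximation, and computes a two-sided p-value. With the figures as quoted, it comes out near 0.20, in the same neighborhood as the 0.18 mentioned above; the gap may come from rounding or a slightly different setup.

```python
import math
from statistics import NormalDist

# Figures quoted in the lecture
n_control, mean_control, sd_control = 50, 75.0, 8.0
n_treatment, mean_treatment, sd_treatment = 125, 77.0, 12.0

# Observed difference in sample means
observed_diff = mean_treatment - mean_control

# Standard deviation of the difference of two independent sample means:
# square root of the sum of the squared standard errors
se_diff = math.sqrt(sd_control**2 / n_control + sd_treatment**2 / n_treatment)

# Under the null hypothesis the difference has mean 0, so the z-score
# is the observed difference divided by its standard deviation
z = observed_diff / se_diff

# Two-sided p-value (assumption: a difference in either direction counts)
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"se_diff = {se_diff:.3f}, z = {z:.3f}, p = {p_two_sided:.3f}")
```

A one-sided test, which asks only whether the treatment group scored higher, would halve this p-value.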
The p-value of 0.18 means that if there is in fact no difference, there's an 18% chance, or about 1 in 5, that we'd see a difference in scores at least as large as the one we observed. So it's reasonably likely to happen even if there is no difference between the groups. Now, we still have to make a decision. So now you have to ask yourself which mistake is worse. If it is more important to save money than to provide opportunities, you would fail to reject the null hypothesis, conclude there is no difference, and cut the program. In other words, while there is a difference in the sample, it could easily have occurred by chance. On the other hand, if you decide it's more important to provide opportunities, you would reject the null hypothesis and conclude the program does make a difference. In other words, while it's possible the difference occurred by chance, the possibility that it didn't warrants the continuation of the program.
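One way to make the "which mistake is worse" question concrete is to choose, before looking at the data, how small the p-value must be before we reject. The sketch below is only an illustration: the function `decide` and both threshold values are hypothetical and do not come from the lecture.

```python
def decide(p_value, threshold):
    """Reject the null hypothesis only if the p-value is below the threshold."""
    if p_value < threshold:
        return "reject the null"
    return "fail to reject the null"


p_value = 0.18  # the value quoted in the lecture

# Hypothetical thresholds, chosen only for illustration:
strict_threshold = 0.05   # wasting money is treated as the worse mistake
lenient_threshold = 0.25  # losing the opportunity is treated as the worse mistake

print(decide(p_value, strict_threshold))   # fail to reject the null
print(decide(p_value, lenient_threshold))  # reject the null
```

The same data lead to opposite decisions, which is the point: the p-value stays fixed, and the consequences determine how much evidence we demand.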