Welcome to our lecture on understanding how hypothesis testing works. In previous lectures about hypothesis testing, we were content to follow a very simple step-by-step process that resulted in either "reject the null hypothesis" or "don't reject the null hypothesis." Everything was very clearly laid out and regimented. In this lecture, what we're trying to do is help you understand exactly what you're doing when you test a hypothesis: what it means if you manage to reject it, and what it means if you don't.

In this example, we're going to look at a comparison of men and women and how well they do in terms of leadership aptitude. As you can see from the example, the mean for women was 83.7. This is on a scale that goes from 0 to 100. The mean for the men was 74.3. Of course, we didn't look at all the women and men in the world. We took a sample of 64 women and 54 men, and as you'll see in a moment, the difference is 9.4 points. We want to know: is this 9.4 difference significant, or is it just a chance difference?

As you can see, we end up with a z value of 2.97. Now, z is a distribution; we need that to get a probability. If we're testing at the .05 level with a two-tailed test, our computed z value needs to be greater than 1.96 on the positive side, or less than minus 1.96 on the left tail, to reject. For anything between plus 1.96 and minus 1.96, we don't reject; in quotes, we "accept." Okay, here we got 2.97, which is clearly more than 1.96. So we say we reject H0 at the .05 level. In other words, we have found that men and women are different with regard to leadership aptitude.

Okay, let me talk about a straw man, because the null hypothesis is generally a straw man. What is a straw man? We set it up because we're trying to knock it down. We're actually saying, let's pretend that there's no difference between the two groups, in this case men and women. They're exactly the same. Okay, now we say, let's look at the sample evidence.
Is this the kind of sample evidence we should see if there indeed is no difference? So, if men and women are exactly the same in leadership ability, should we find a 9.4 difference between the two samples? That's what we found. We found a 9.4 difference. Is that what we'd expect if there's really no difference? Clearly, if we said no difference and we found a difference of, let's say, a quarter of a point, we'd know right away that we can't reject H0. But here we found a difference of 9.4 points, and that's what we're trying to assess. So, we start off with a straw man, the null hypothesis. We're pretending right now there's no difference between men and women. Now, is this the sample evidence we should be seeing?

Now, if you look at the cumulative z distribution table, you can get the area from 2.97, that's our z value, out to infinity. Because we're not only looking at the difference of 9.4 in our sample; what about more than 9.4? So, we're actually looking at differences of 9.4 or more: 10.4, 15.4, 20.4, 30, 50, those clearly would be more than what we found. We want to know, what is the likelihood of finding this difference if nothing is going on? Again, the straw man is no difference. So, we're looking at our sample evidence or something even more extreme. Now, we've got to double that probability, because we're doing a two-tailed test. Essentially, what we find is that there are only three chances in a thousand. If there's actually no difference between men and women, that's the chance of getting a difference of 9.4 or greater; again, we're doing it as a two-tailed test, so we're looking at minus 9.4 and less too. A difference with an absolute value of 9.4 or greater has only three chances in a thousand of happening if men and women are indeed the same. So, essentially what we're finding is, this is not what should be happening.
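
If you'd like to check the "three chances in a thousand" figure yourself instead of reading it off a table, here is a minimal sketch in Python. It is not part of the lecture's materials; it assumes Python 3.8 or later and uses the standard library's `statistics.NormalDist` for the standard normal (z) distribution. The variable names are just illustrative, and the only inputs are the lecture's z value of 2.97 and the .05 level.

```python
from statistics import NormalDist

z = 2.97        # computed z value from the lecture example
alpha = 0.05    # significance level for the two-tailed test

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Critical value for a two-tailed test: alpha/2 in each tail (about 1.96).
z_crit = std_normal.inv_cdf(1 - alpha / 2)

# Area from z out to infinity: the chance of our evidence or something
# more extreme on the positive side, if there is really no difference.
upper_tail = 1 - std_normal.cdf(z)

# Two-tailed test: double it to cover the negative side as well.
p_two_tailed = 2 * upper_tail

print(f"critical z   = {z_crit:.2f}")
print(f"upper tail   = {upper_tail:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
print("reject H0" if abs(z) > z_crit else "do not reject H0")
```

The two-tailed probability comes out at roughly 0.003, which is the three chances in a thousand, and 2.97 is beyond the critical value of about 1.96, so both ways of looking at it lead to the same decision.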
This sample evidence should not be happening if the straw man is true, that men and women are the same, that the two groups are the same. This is too big a difference to be chance. So, that's what the three chances in a thousand tells us. Suppose it had been seven chances out of a hundred. Remember, five out of a hundred is our criterion; .05, that's our alpha. If it had been 7%, 10%, 20%, we would not reject. But this is only three chances in a thousand. So, we basically have to knock down the straw man and reject H0.

If you use a computer, the computer simply gives you the p-value, the probability of getting that sample evidence or something more extreme if the null hypothesis is true. And then all you have to do is compare it with the alpha. The p-value comes in the printout; you'll see it in Excel and other computer programs. You can see right away whether you should be rejecting H0 or not. So, for example, if you're working with an alpha of .05 and the computer prints out a p-value of, let's say, 0.0003, you know right away to reject H0. It's telling you that getting the sample evidence, or something more extreme, is very unlikely if the two groups are the same. This is not what should be happening. That's why we reject H0. So, again, if the p-value is less than alpha, say alpha is .05 and you get something less than .05, let's say 0.0002, you're going to reject H0.

The only way to learn statistics is to do lots and lots of problems. And again, you'll see the methodology is always the same. It's a certain method that we use over and over again, so just keep doing problems. You'll learn this very well, and you'll understand a very important concept. And this is the basic concept you're learning in the inference part of the course: when you take a sample, there has to be some kind of margin of error, the sampling error. Just remember that your sample statistic, let's say x-bar, is not mu. It's an estimate of mu.
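
To see what "x-bar is an estimate of mu" looks like in practice, here is a small simulation sketch in Python, again using only the standard library. Everything in it is an assumption made for illustration: the population mean of 80 and standard deviation of 15 are made-up numbers, not values from the lecture, and the sample size of 64 is simply borrowed from the women's sample in the example. The scores are drawn from a normal distribution and are not clipped to the 0 to 100 scale.

```python
import random
from statistics import mean

MU = 80.0      # hypothetical population mean (assumption, not from the lecture)
SIGMA = 15.0   # hypothetical population standard deviation (assumption)
N = 64         # sample size, borrowed from the women's sample in the example

rng = random.Random(42)  # fixed seed so the run is repeatable

# Draw five independent samples from the same population and record
# each sample mean (x-bar).
sample_means = []
for _ in range(5):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(mean(sample))

# Each x-bar lands near mu but not exactly on it; the gap is the sampling error.
for i, x_bar in enumerate(sample_means, start=1):
    print(f"sample {i}: x-bar = {x_bar:.2f}, sampling error = {x_bar - MU:+.2f}")
```

Every sample comes from the same population, yet each x-bar is a little different, and none of them is exactly mu. That gap is the sampling error a hypothesis test has to account for before calling a difference significant.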