In this lecture, we seek a further understanding of two-sample hypothesis testing. Welcome to our lecture. In previous lectures about hypothesis testing, we were content to follow a very simple step-by-step process that resulted in either "reject the null hypothesis" or "don't reject the null hypothesis." Everything was very clearly laid out and regimented. In this lecture, what we're trying to do is help you understand exactly what you're doing when you test a hypothesis, what it means if you manage to reject it, and what it means if you don't.
In this example, we're going to look at a comparison of men and women and how well they do in terms of leadership aptitude. As you can see from the example, the mean for the women was 83.7. This is on a scale that goes from 0 to 100. The mean for the men was 74.3. Of course, we didn't look at all the women and men in the world. We took a sample of 64 women and 54 men. And as you'll see in a moment, the difference is 9.4 points. We want to know: is this 9.4-point difference significant, or is it just a chance difference?
As you can see, we end up with a Z value of 2.97. Okay, now Z is a distribution. We need that to get a probability. With our computed Z value of 2.97, if we're testing at the 0.05 level with a two-tailed test, you need to be greater than 1.96 in the right tail or less than minus 1.96 in the left tail to reject. Anything between minus 1.96 and plus 1.96, we don't reject. In quotes, we "accept." Okay, here we got 2.97, which is clearly more than 1.96. So we say we reject H0 at the 0.05 level of significance. In other words, we have found that men and women are different with regard to leadership aptitude.
Okay, let me talk about a straw man, because the null hypothesis is generally a straw man. What is a straw man? We set it up because we're trying to knock it down. We're actually saying, let's pretend that there's no difference between the two groups, in this case men and women. They're exactly the same.
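As a brief aside, the critical-value decision from the example can be sketched in a few lines of Python. This is only an illustration: it takes the Z value of 2.97 as given in the lecture and uses the standard library's NormalDist to recover the 1.96 cutoff. The variable names are made up for this sketch.

```python
from statistics import NormalDist

alpha = 0.05        # significance level used in the lecture
z_computed = 2.97   # Z value reported in the lecture (taken as given)

# Two-tailed test: put alpha/2 in each tail of the standard normal curve.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # roughly 1.96

# Reject H0 when the computed Z falls in either tail.
reject_h0 = abs(z_computed) > z_critical

print(f"critical values: +/-{z_critical:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Any computed Z between minus 1.96 and plus 1.96 would make reject_h0 false, which is the "don't reject" region described above.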
Okay, now we say, let's look at the sample evidence. Is this the kind of sample evidence we should see if there is indeed no difference? So if men and women are exactly the same in leadership ability, should we find a 9.4 difference between the two samples? That's what we found. We found a 9.4 difference. Is that what we'd expect if there's really no difference? Now clearly, if we said no difference and we found a difference of, let's say, a quarter of a point, we'd know right away that we can't reject H0. But here we found a difference of 9.4 points. And that's what we're trying to see. So we start off with a straw man, the null hypothesis. We're pretending right now there's no difference between men and women. Now is this the sample evidence we should be seeing?
Anyway, we see that there's a difference between the two sample means for women and men: 83.7 versus 74.3, a 9.4-point difference in leadership aptitude scores. Now, what does that mean? We see a difference, and there are two possibilities. Okay, one possibility is that men and women are exactly the same when it comes to leadership, and that difference of 9.4 is just sampling error. It's just chance. It just will happen. There's going to be a difference and it's not meaningful. That's one possibility. There's a second possibility, however, that it's too much. That 9.4 difference is just too much to be attributable to chance. And rather, we're going to have to conclude that women and men are different when it comes to leadership aptitude, that the 9.4 difference is significant, and that women are better leaders than men.
Now, if you look at the cumulative Z distribution table, you can see that it gives you the area from 2.97, which is our Z value, out to infinity. We go to infinity because we're not only looking at a difference of exactly 9.4 in our sample. What about more than 9.4? So we're actually looking at differences that are 9.4 or more, like 10.4, 15.4, 20.4, 30, 50. Those clearly would be more than what we found.
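If you don't have the Z table handy, the area from 2.97 out to infinity can be approximated in Python. This is a minimal sketch, assuming the lecture's Z value of 2.97; it is a stand-in for the table lookup, not the method used in the lecture, and the variable names are chosen only for illustration.

```python
from statistics import NormalDist

z_computed = 2.97  # Z value reported in the lecture

# Area under the standard normal curve to the right of the computed Z.
# This covers a sample difference of 9.4 or anything larger.
upper_tail = 1 - NormalDist().cdf(z_computed)

print(f"P(Z >= {z_computed}) = {upper_tail:.4f}")  # about 0.0015
```

This is the one-tail area only, covering differences of 9.4 or more in one direction.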
So we want to know: what is the likelihood of finding this difference if nothing is going on? Again, the straw man is no difference. So we're looking at our sample evidence or something even more extreme. Now, we've got to double that probability because we're doing a two-tailed test. Essentially, what we find is that there are only three chances in a thousand, if there's actually no difference between men and women, of getting a difference of 9.4 or greater. Again, we're doing it as a two-tailed test, so we're looking at minus 9.4 and less, too. A difference with an absolute value of 9.4 or greater has only three chances in a thousand of happening if men and women are indeed the same.
So essentially what we're finding is that this is not what should be happening. This sample evidence should not be happening if the straw man is true, that men and women are the same, that the two groups are the same. This is too big a difference to be chance. So that's what those three chances in a thousand mean. If it had been more than that, say seven chances out of 100 (remember, we're using five out of 100 as our criterion, 0.05, that's our alpha), if it had been 7%, 10%, 20%, we would not reject. But this is only three chances in a thousand. So we basically have to shoot down the straw man and reject H0.
If you use a computer, the computer simply gives you the p-value: the probability of getting that sample evidence or something more extreme if the null hypothesis is true. And then you only have to compare it with alpha. The p-value comes in the printout. You'll see it in Excel and in other computer programs. You can see right away whether you should be rejecting H0 or not. So for example, if you're working with an alpha of 0.05 and the computer prints out a p-value of, let's say, 0.0003, you know right away to reject H0.
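To make the doubling and the comparison with alpha concrete, here is a small Python sketch. It assumes the lecture's Z value of 2.97 and alpha of 0.05. The helper names two_tailed_p_value and decide are hypothetical, invented for this illustration, and are not from any statistics package.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Double the one-tail area beyond |z| under the standard normal curve."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value, alpha=0.05):
    """Apply the rule from the lecture: reject H0 when p-value < alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# The lecture's example: Z = 2.97, about three chances in a thousand.
p = two_tailed_p_value(2.97)
print(f"p-value = {p:.4f} -> {decide(p)}")

# The hypothetical p-values mentioned in the lecture.
for example_p in (0.07, 0.10, 0.20, 0.0003):
    print(f"p-value = {example_p} -> {decide(example_p)}")
```

The 7%, 10%, and 20% cases come out as "do not reject H0", while 0.0003 and the computed p-value both fall below alpha.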
The p-value is telling you that getting the sample evidence or something more extreme is very unlikely if the two groups are the same. This is not what should be happening. That's why we reject H0. So again, if the p-value is less than alpha, say alpha is 0.05 and you get something less than 0.05, let's say 0.0002, you're going to reject H0.
As always, I encourage you to do as many problems as you can find. It will help to strengthen your knowledge of statistics and it will also help you do well on exams. Practice, practice, practice.