In this lecture, we will learn about making inferences about the means of two groups: the two-sample Z-test. Welcome to the lecture. Previously, we examined what are called one-sample tests. Now we're changing things a little bit, not much. We're going to be examining two groups; these are called two-sample tests. We're going to examine two populations, look at a parameter, say mu, and determine whether there's a significant difference between the population means. As always, we look at the sample means, the sample evidence, and we use that to determine whether the difference between the two means is significant or not. Now, it doesn't have to be means; we're also going to look at proportions later. But these are all called two-sample tests. And H0, that's our straw man, is that mu1 equals mu2 if we're dealing with means, which essentially says there's no difference between the means.

Let's look at the null hypothesis. If we're comparing means, the null hypothesis is that mu1 equals mu2: the population mean of group one is the same as the population mean of group two. Or in other words, mu1 minus mu2 is zero. No difference. You're really looking at an H0 that says there's no difference, zero difference, between the two population means. So, for example, we could be comparing men and women on some measure, like who does better in college in terms of GPA, and then we examine the GPAs of men and women. (There are other groups too, but maybe there aren't enough of them, so we're just looking at men and women.) H0 is always about the population parameters, and the random variable is x-bar 1 minus x-bar 2, the difference between the sample means. We're going to look at the sample evidence to see whether it supports the H0 that there is zero difference between the two groups.
If we're looking at the difference between two population means, the statistic of interest is the difference between the two sample means, x-bar 1 minus x-bar 2. So the question is: what's the expected value of that random variable, and what's the correct measure of its variation? We don't have to derive it here, because this isn't a course in mathematical statistics, but it turns out that the expected value of the difference between the two x-bars is mu1 minus mu2, the difference between the two population means. And the standard deviation of this random variable is the square root of sigma1 squared over n1 plus sigma2 squared over n2. You'll see how that becomes important on the next slide.

So if our sample sizes are large enough (and you have to know what "large enough" means for the company you work for or the instructor of the class you're taking), or, easier, if the sigmas, the two population standard deviations, are known, we can do a two-sample z-test for making inferences. To get the calculated value of this z statistic, we do what we do for every z statistic: every time you see a formula that starts z equals, it's always the random variable minus its mean, divided by its standard deviation. That's why it was so important to figure these things out on the previous slide. The random variable is x-bar 1 minus x-bar 2, its mean is mu1 minus mu2, and its standard deviation is the square root of sigma1 squared over n1 plus sigma2 squared over n2. Of course, we can simplify this nicely: under the null hypothesis, mu1 equals mu2, so mu1 minus mu2 is 0, that term falls out, and you're left with the formula at the bottom of the slide. Now, what do we do if we don't know sigma? We'll come to that in just a few seconds; we can manage without sigma as long as our sample sizes are large enough.
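The z statistic described above can be sketched in a few lines of Python (the function name and argument order are mine, not from the lecture):

```python
import math

def two_sample_z(xbar1, xbar2, sd1, sd2, n1, n2):
    """Two-sample z statistic under H0: mu1 = mu2.

    Because H0 says mu1 - mu2 = 0, that term drops out of the
    numerator, leaving the observed difference in sample means
    divided by its standard error.
    """
    std_err = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (xbar1 - xbar2) / std_err
```

With known sigmas you would pass those in; with large samples you pass the sample standard deviations, as the next slide explains.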
We use S1 and S2 as point estimators of sigma1 and sigma2: S1 is the sample standard deviation in group 1, S2 the sample standard deviation in group 2. Just like with one-sample tests, we're using the sample standard deviation as a point estimator for the population standard deviation, and we can only do that with a z if our sample sizes are large enough.

If we're working with sample data from two groups and constructing a confidence interval estimator, what we're really doing is constructing a confidence interval estimator for the difference between the two population means, mu1 minus mu2. Normally we'd be looking at a CIE of mu, but in this case we have two groups, presumably with two means, and we're looking to estimate the difference between the means. We use the formula that you see in front of you. Again, the random variable is x-bar 1 minus x-bar 2; that's the sample evidence, and that goes in the center of the confidence interval estimator. We have z from the table, which gives us our level of confidence, and we have the measure of variation, sigma1 squared over n1 plus sigma2 squared over n2, under the square root.

What happens if there's really no difference between the two groups? In other words, we have two samples, but they're just two random samples from the same population. We think it's two different populations, men and women, but if they have exactly the same parameter, it really is one population: human beings. What happens to the confidence interval in that case? Well, it should turn out that there's a zero in there, because if the two means are the same, the difference between the means is zero, and we look for that value of zero in the confidence interval to determine that the two groups really are the same, really are one group. We'll see a problem like that later on.

The gold standard of medical research is testing a drug by comparing two groups. You assign people randomly to the two groups.
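As a sketch of that confidence interval estimator (the function name and the illustrative numbers in the usage note are mine):

```python
import math

def diff_ci(xbar1, xbar2, sd1, sd2, n1, n2, z_crit=1.96):
    """CIE for mu1 - mu2 with large samples (z_crit = 1.96 gives 95%)."""
    diff = xbar1 - xbar2                                  # center of the interval
    moe = z_crit * math.sqrt(sd1**2 / n1 + sd2**2 / n2)   # margin of error
    return diff - moe, diff + moe
```

If the two samples really come from the same population, the sample means will be close and the resulting interval will tend to straddle zero, which is exactly the signal discussed above.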
One group takes the new drug, and the other group, which we call the control group, takes a placebo. In a double-blind study, no one knows who's in which group. In this case, we're looking at the number of colds. We're going to test the drug to see if it reduces the number of colds people get. So we have two groups, one taking the drug, the other taking a placebo, and we'll see if the means are different.

Look at the sample evidence. In group one, the drug group, n is 81, and they were getting 4.4 colds a year with a standard deviation of 0.7 colds. In the placebo group, group two, there were 64 people, and their mean was 4.8 colds a year with a standard deviation of 0.8 colds. We can see there's a difference of negative 0.4 colds: 4.4 minus 4.8. The question is, is that difference statistically significant, or could it just be chance? Any time you compare two groups, the likelihood is that there'll be some difference, no matter what. The big question in statistics is: is this chance, or is it a real difference? If you want to understand chance differences, take a coin, flip it 50 times, and you'll get a certain number of heads; you may get, say, 23. Take the same coin, flip it again 50 times, and it's almost a certainty that you won't get the same number of heads. It'll be slightly different, and that's just chance; it doesn't mean anything. That's why we always have to test. The government is not going to let you sell a new drug unless you've tested it for significance; you want to make sure that it's different from taking a placebo.

As you can see, we set it up: H0 is mu1 equals mu2, another way of saying no difference, and H1 is that mu1 is not equal to mu2. And now we have the z formula: we take 4.4 minus 4.8 over the square root of 0.7 squared over 81 plus 0.8 squared over 64.
When you do all the arithmetic, you end up with minus 0.4 over 0.127, which is negative 3.15. Clearly, that's in the rejection region; anything to the left of negative 1.96 is going to be significant. There was less than a 0.05 probability of this occurring by chance, so we conclude that the two groups are indeed statistically different: the drug group has fewer colds per year. It would be presumptuous to say the true difference is exactly 0.4, because there's always a margin of error. But for the moment, we know there's a statistically significant difference between the means of the two groups.

If you want to construct a 95% confidence interval for the difference, remember your difference was negative 0.4 colds. The reason it's negative is that the lower number belonged to group 1; if you switched the groups around, you'd get a positive number. It doesn't make a difference. But we had negative 0.4: the drug group had 0.4 fewer colds than the placebo group. So let's construct the confidence interval. You have negative 0.40 plus or minus 1.96 (that's what you use for a two-sided 95% confidence interval) times the standard error of the difference, the number under the square root in the denominator, which was 0.127. You multiply them, and the margin of error is 0.25 colds. Basically, that's how you would report it: you found a 0.40 difference with a margin of error of 0.25. Or, written out, the 95% confidence interval says the drug group gets fewer colds, anywhere from 0.65 fewer all the way down to 0.15 fewer. Notice there's no zero in the interval, so you're basically sure, with 95% confidence, that the drug group will have fewer colds. Again, the way you would report it: a 0.40 difference, fewer colds, with a margin of error of 0.25.

In problem two, we're comparing men and women to see if there's a difference on a standardized science test.
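Here is the drug-versus-placebo arithmetic written out as a quick check in plain Python. One small note: the slide rounds the standard error to 0.127 before dividing, which gives −3.15; at full precision the z comes out closer to −3.16. The conclusion is the same either way.

```python
import math

# Group 1: drug, group 2: placebo (colds per year)
xbar1, sd1, n1 = 4.4, 0.7, 81
xbar2, sd2, n2 = 4.8, 0.8, 64

std_err = math.sqrt(sd1**2 / n1 + sd2**2 / n2)   # about 0.127
z = (xbar1 - xbar2) / std_err                    # about -3.16, left of -1.96 -> reject H0

moe = 1.96 * std_err                             # margin of error, about 0.25
ci = (xbar1 - xbar2 - moe, xbar1 - xbar2 + moe)  # about (-0.65, -0.15); no zero inside
```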
I guess some chauvinist thinks that men are better than women. Okay, we're going to see if it's true or not. We took a sample, as you can see, of 100 men, and their average (this is a sample mean) was 80 with a standard deviation of 10. For the 64 women, the average was 76.5 with a standard deviation of 16. Now, it would be presumptuous to say, oh, the men beat the women, because as you know, there's always sampling error to worry about. So we want to know: is there a significant difference between men and women on the science test at alpha equals 0.05?

All right, we set up H0: mu1 equals mu2, and H1: mu1 is not equal to mu2. Again, we're using 0.05; we cut it in half, 0.025 in the right tail, 0.025 in the left tail. By now we know the critical values are 1.96 and minus 1.96. Now we turn the sample evidence into a z-score. We end up with 80 minus 76.5 in the numerator; that's a 3.5-point difference by which the men beat the women on the science test. The standard error of the difference is the square root of S1 squared over n1 plus S2 squared over n2: that's 10 squared over 100 (which is just 1) plus 16 squared over 64 (which is just 4). So we end up with the square root of 5 in the denominator. 3.5 divided by 2.24 is 1.56, and we're in the acceptance region. We have no evidence to reject; a 3.5-point difference is not enough, and it could very well be sampling error. So you'd say, in simple English, there's no statistically significant difference between the two groups, men and women, on the science test scores.

As you'll see in a moment, we're going to do the confidence interval for the difference just to show you what happens, but you really shouldn't do one here. Once you've said there's no difference, you're basically claiming that the men and women are the same when it comes to science, so there's no reason to do a confidence interval; you've already said no difference, zero difference.
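The science-test arithmetic can be double-checked the same way (plain Python; 3.5 over the square root of 5 is about 1.57 at full precision, which the slide, working through 2.24, rounds to 1.56):

```python
import math

# Group 1: men, group 2: women (standardized science test)
xbar1, sd1, n1 = 80.0, 10.0, 100
xbar2, sd2, n2 = 76.5, 16.0, 64

std_err = math.sqrt(sd1**2 / n1 + sd2**2 / n2)   # sqrt(1 + 4) = sqrt(5), about 2.24
z = (xbar1 - xbar2) / std_err                    # about 1.56; inside (-1.96, 1.96) -> don't reject

moe = 1.96 * std_err                             # about 4.4 -- bigger than the 3.5 difference
ci = (3.5 - moe, 3.5 + moe)                      # about (-0.9, 7.9); zero is inside
```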
So don't contradict yourself by doing the confidence interval. But we'll show you what happens if you do decide to do it. You have 3.5 plus or minus 1.96 times 2.24, so the margin of error is 4.4. When the margin of error is bigger than the observed difference, there's got to be a zero in the interval. So here's our confidence interval: it goes from negative 0.9 all the way to plus 7.9. Obviously zero is in that interval, so that basically reinforces that there's no difference. But let me repeat: if you have no evidence to reject H0, if you've "accepted" it and can't reject, you don't do a confidence interval, because you know zero has to be in the interval.

The next problem looks at two colleges. You know, colleges very often say, our students are smarter than your students. We have these arguments between Brooklyn College and Baruch College; if you're at Queens, it'll be Queens versus Hunter. Everyone's fighting over this. So now we're looking at the difference in CPA exam scores between College X and College Y. Here's the sample evidence. Notice we're looking at a sample of 260 students: 100 students from College Y and 160 from College X. There does seem to be a slight difference: 72.5 versus 74.1, those are the means, and the standard deviations are 8 and 9.1. But as you know, until we test, say at an alpha of 0.01, we're not sure whether the difference is sampling error or not. So we do a statistical test. H0 is mu1 equals mu2; H1 is mu1 is not equal to mu2. Since we're testing at the 0.01 level, we cut the 0.01 in half for the two-tailed test: 0.005 in the right tail and 0.005 in the left tail. The critical values are 2.575 on the right and negative 2.575 on the left. Then we turn the sample evidence into a z-score. The numerator is the difference between the means, 72.5 minus 74.1, or negative 1.6.
In the denominator, we have the standard error of the difference: the square root of 8.0 squared over 160 plus 9.1 squared over 100. We finish all the arithmetic and end up with a z-score of minus 1.44, which is definitely not in the rejection region; it would have to be to the left of negative 2.575. Something like negative 2.8 would have been in the rejection region; negative 1.44 is not. We don't reject H0, and our conclusion, in simple English, is that there's no significant difference between the two colleges on the CPA exam scores. In other words, it could very well just be sampling error, and neither college can claim or brag that it does better. As for the 95% confidence interval for the difference, you don't need to do that: you know zero is going to be in the interval, so you basically don't do a confidence interval. You said there's no difference, which is another way of saying that you've accepted zero as the difference. As far as you're concerned, the two colleges do just as well as each other on the CPA exam.

And I'm back. Did you miss me? In this problem, we're studying the lifetimes of smartphones produced by company X and by company Z. We took a sample of 60 phones from company X and a sample of 64 phones from company Z, and we found an average lifetime of 6.5 years for company X's phones and 5.7 years for company Z's phones, with respective standard deviations of 1.2 years and 0.8 years. We want to test whether there is really a significant difference between these two companies' phones, and we're testing at an alpha of 0.01. So here we have our two-sample hypothesis test using z for the difference between two means. The null hypothesis is that the two means are exactly the same; the alternative hypothesis is that they're different. You'll notice that we've been kind of lazily just saying mu1 equals mu2.
We could have said mu X equals mu Z, the mu of company X equals the mu of company Z, but it's simpler and less confusing to stick with the notion of a group one and a group two; that's why you see mu1 and mu2 there. We're working at an alpha of 0.01, and with these two-sample tests we're doing two-tailed tests, so we split the alpha into two equal pieces: half a percent on one side, half a percent on the other. The critical values from the z table are plus and minus 2.575. The calculated value of z from the sample evidence is 6.5 minus 5.7 divided by the square root of 1.2 squared over 60 plus 0.8 squared over 64, and you end up with 0.8 over 0.1844, or 4.34. That's a pretty big number for a z, and we are definitely in the region of rejection; it's larger than 2.575. So we reject the null hypothesis at p less than 0.01, a very stringent test. There is a statistically significant difference between the two companies with regard to the lifetimes of their smartphones.

Now we construct a 99% confidence interval estimator for the difference between the two population means, mu1 minus mu2, using the appropriate formula. The middle of the confidence interval is 0.8 years; that's the value we got for the difference between the two sample means. The z value that gives you 99% confidence, the central 99% of the distribution, is 2.575. We multiply that by 0.1844, the standard error of the random variable, and 0.47 is the margin of error. So the 99% CIE of the difference between the two means goes from a low of 0.33 years to a high of 1.27 years.

Here's another problem, slightly different. We're looking at a company and its minority employees versus its white employees, and the company would like to determine whether its minority employees actually earn less than the white employees do. These are two random samples, 60 minority employees and 100 white employees, and you can see the data there.
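As a quick check of the smartphone arithmetic above (plain Python, variable names are mine):

```python
import math

# Group 1: company X, group 2: company Z (smartphone lifetimes in years)
xbar1, sd1, n1 = 6.5, 1.2, 60
xbar2, sd2, n2 = 5.7, 0.8, 64

std_err = math.sqrt(sd1**2 / n1 + sd2**2 / n2)   # about 0.1844
z = (xbar1 - xbar2) / std_err                    # about 4.34, beyond 2.575 -> reject H0

moe = 2.575 * std_err                            # margin of error, about 0.47
ci = (0.8 - moe, 0.8 + moe)                      # 99% CIE, about (0.33, 1.27)
```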
We're using an alpha significance level of 0.05. We've got an average of $1,300 a week for the minority employees and an average x-bar of $1,500 a week for the white employees. We know the sample means are different; what we want to know is not only whether the difference is significant, but whether it is in a particular direction: whether we can show that the minority employees really do earn less. We'll see how we do that on the next slide.

Okay, notice the H0. This is going to be called a one-tailed test, and notice the rejection region is on the left. If you're not sure where it goes, look at the H1; H1 always points in the direction of the rejection region. So the entire 0.05 is put on the left, and when you put the entire 0.05 in the left tail of the z distribution, the critical value is negative: don't forget, it's negative 1.645. Now, the way it's set up in this kind of test, H0 is the straw man: you're pretending, for the moment, that the minorities make at least as much as the white employees. If you reject it, you shoot the straw man down, and your conclusion is, no, the minorities don't make more; they make less than the white employees. The rest is just simple arithmetic. You get the z value, and it ends up being very strong: minus 6.52. The probability is going to be a lot less than 0.05; it'll probably be less than one in a million. So your conclusion is that you reject H0: minorities don't make as much as white employees in this company. What are you saying, then? That the minorities make less, and now you possibly have a lawsuit. So your conclusion, based on the two samples you took, is that minorities make less in this company. And remember, the company could have, you know, 100,000 employees.
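A minimal sketch of the one-tailed decision rule (the standard deviations from the slide aren't restated in this transcript, so this just uses the reported z of −6.52 rather than recomputing it):

```python
# One-tailed (left-tail) test at alpha = 0.05.
# H1 says mu1 < mu2 (minorities earn less), so the entire 0.05
# goes in the left tail and the critical value is negative.
z_crit = -1.645
z = -6.52                # calculated value reported in the lecture

reject_h0 = z < z_crit   # True -> conclude minorities earn less
```

Contrast this with the two-tailed tests earlier, where the 0.05 was split into 0.025 per tail and the cutoff was ±1.96.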
So we looked at the sample evidence, and based on that sample evidence, we decided that minorities make less than white employees. Do you want to know how to retain all this information and do well on your exams? Here you have it: practice, practice, practice. Once you understand the material, find as many problems as you can and just keep on doing them, over and over again. You have problems in the lectures, problems in the Do It Nows and Test Your Knowledge on the lecture page, problems in the homeworks, problems in the handouts, and exam prep practice problems on the handout page. Lots and lots of problems. Thank you very much for joining us in this lecture.