Welcome to the lecture on the two-sample t-test. All right. Suppose you're comparing two groups. Again, if the sample sizes are large (and it's not clear how large: some say 32, and if you really want to cut it close, 32-something; some say 60), or if you know the sigmas of the two populations, you could use z. With small samples, certainly if you're below 32, you're going to have to use t. So use t basically when sigma 1 and sigma 2 are unknown to you, the samples are kind of small, and the populations follow some kind of normal distribution, or at least aren't badly skewed.

Okay, you see the formula in front of you for the two-sample t-statistic. Just remember that s squared pooled has to be computed first. It's not a big deal. Essentially, it's an average, a weighted average of the two variances. In fact, if you put the two groups together and made one group out of them, which is what H0 says, that there's no difference, and you took the variance of that one group, that would be roughly s squared pooled. So we have a formula for it. You'll see it in a second. Once you calculate s squared pooled, this one is not very complicated. But remember, mathematically, you lose two degrees of freedom, and you're going to be using t.

Well, in front of you, you've got the pooled variance, and you see how easy this is to do. You do n1 minus 1 times s1 squared, plus n2 minus 1 times s2 squared, divided by the degrees of freedom, n1 plus n2 minus 2. That gives you s squared pooled. And as I mentioned, that's just a weighted average of the two variances from the two groups.

To use the two-sample t-test, really, you first have to test for something called homoscedasticity. You have to show, and there's a way to do it with an F-test, which you're not going to learn, that the two variances are equal. That's what homoscedasticity means: the two variances are statistically equal. They're not exactly equal, but statistically equivalent. And again, there's an F-test for that. Okay?
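As a rough sketch of the pooled-variance recipe just described, here is a small Python version. The function names and the sample numbers are made up for illustration and are not from the lecture, and the t formula is the standard textbook pooled form, which is assumed to match the one on the slide.

```python
import math


def pooled_variance(n1, s1_sq, n2, s2_sq):
    """Weighted average of the two sample variances, each weighted by n - 1."""
    df = n1 + n2 - 2  # two degrees of freedom are lost
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df


def two_sample_t(mean1, mean2, n1, n2, sp_sq):
    """Pooled two-sample t statistic for H0: mu1 = mu2 (assumes equal variances)."""
    standard_error = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


# Hypothetical inputs: group 1 has n = 10, variance 2.0; group 2 has n = 20, variance 5.0
sp_sq = pooled_variance(10, 2.0, 20, 5.0)
print(round(sp_sq, 4))  # 4.0357, closer to 5.0 because group 2 carries more weight
```

The pooled value lands between the two variances and leans toward the larger group, which is what "weighted average" means here.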
Now, if you don't have homoscedasticity, and what you show instead is what we call heteroscedasticity, which means the variances are not statistically equivalent, then you adjust the formula. We're not going to learn about it, but just be aware of it. Okay? So for this course, we're going to assume, basically, that we do have homoscedasticity.

Here's the problem. Suppose we want to compare men and women on reading scores on a standardized reading test. We take a sample of 31 people. There are 16 men and 15 women. Too small to use z. We don't know the sigmas for the two populations. We'd like to know if the two means are different or if they're the same. We're going to assume equal variances. We'll assume the two sigmas are the same, even though we're not testing that because we haven't learned it. We'll assume that the two underlying populations are normally distributed. Note that on this test, women outperform men by four points. Is that just random variation, sampling error, or is that an actual significant finding? That's what we're going to test. We'll test at the alpha equal 0.05 level of significance.

Here we do the hypothesis test. The null hypothesis is that the two means are exactly equal: the mean reading score for men and the mean reading score for women really are equal, and there is no difference between the groups. The alternative hypothesis, the one we accept if we reject the null hypothesis, is that they're not. There is a difference. We don't have to say how big a difference. All we have to say is, if we reject H0, that means there is a difference between the groups. We will use the t with 29 degrees of freedom. You compute the degrees of freedom as n minus 1 from the first group plus n minus 1 from the second group. That ends up being n1 plus n2 minus 2. So in this case, it's 31 minus 2, or 29 degrees of freedom. The next step is to get the calculated value of the t29 test statistic.
We have to get the pooled variance first and plug it into the formula for the t29 test statistic. That's t with 29 degrees of freedom. When we do, we get a t of negative 0.62. That's very small. It's less than 1 in absolute value. It's definitely in the region of acceptance, or in the region of do not reject, if you want to call it that. Clearly, it's close enough to 0 that we don't reject the null hypothesis. It's possible that there really is no difference between men and women on their test scores, no matter what we saw in the data. Do men do better than women in anything, by the way? Not at all.

Okay. In this problem, we're looking at company pay. The question is whether there is a difference in daily pay between two companies. Notice we do find that company one is paying their workers $210 a day and company two is paying $175. But again, this is a sample. In fact, we only looked at 30 people, 10 from company one and 20 from company two. So to say that the difference between $210 and $175, which is a $35 difference, is statistically significant, you'd better test this first before you, you know, go to court or anything. In any event, look at H0: mu1 equals mu2, and H1 is that mu1 is not equal to mu2. And it's a t28, as you know, because you lose two degrees of freedom. And there you see the s squared pooled, which is the first thing you calculate, and it's 472.3. Then you use the formula, which, again, is given. And you can see that you get a t28. When you finish all the calculations, your sample evidence is basically summarized by that t28 of 4.16. Now 4.16, as you can see, is in the rejection region. Okay, so 4.16 is way beyond 2.7633. It's all the way to the right of that. So since it's in the rejection region, we conclude that the pay rates of the two companies are indeed different. Company one pays more than company two. There is a statistically significant difference in the pay of the two companies.
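The company pay calculation can be checked in a few lines of Python. This is only a sketch: every input below is a summary figure quoted in the lecture, the variable names are made up, and the t formula is the standard pooled form, assumed to be the one on the slide.

```python
import math

# Summary figures quoted in the lecture for the company pay problem
n1, n2 = 10, 20               # workers sampled from company one and company two
mean1, mean2 = 210.0, 175.0   # average daily pay in dollars
sp_sq = 472.3                 # pooled variance from the slide
t_critical = 2.7633           # critical value quoted in the lecture

df = n1 + n2 - 2
standard_error = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / standard_error
reject_h0 = abs(t_stat) > t_critical

print(df, round(t_stat, 2), reject_h0)  # 28 4.16 True
```

The $35 difference is about 4.16 standard errors away from zero, which is why it lands in the rejection region.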
Here's another problem. We are considering two different types of precast concrete beams. The difference between these is the type of material that's used. And the material is measured for its strength, in terms of pounds per square inch of pressure, PSI. The question is, is there a difference between the beams supplied by these two suppliers, supplier A and supplier B? We're calling the data from supplier A group 1 and the data from supplier B group 2. Now with this, just like with both of the previous two examples, if we had more data, we'd be using z. But here you see the sample sizes are very small. We have only 12 from the first supplier, 10 from the second supplier, and we don't know sigma. We're assuming that the two variances are statistically equal. We're assuming that the populations are normally distributed.

We're setting up the hypothesis test here. The null hypothesis is that the two means are the same: mu one is equal to mu two. The alternative hypothesis is that they're different. We're working with a t with 20 degrees of freedom, n1 plus n2 minus 2. With alpha equal to 0.05, you have 2.5% in each tail. So the critical values from the t table are plus or minus 2.086. And that sets up the regions of rejection in the tails, with the region of acceptance, or the region of non-rejection, in the middle. We compute the t statistic from the data, first computing the pooled variance and then inserting it into the formula for t. We get a t value of 1.07. And of course the conclusion is to not reject the null hypothesis. We cannot say that the two means are different. All we can say is actually that we can't say that they're different. As always, we don't have a very clear statement when we're, quote unquote, accepting the null hypothesis. All we can say is, ah, we can't reject it. There's no statistically significant difference between the beams made by supplier A and by supplier B.

Now we're going to learn how to use Microsoft Excel to solve two-sample t-tests.
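The critical-value decision rule used in the beam problem can be written as a tiny Python helper. This is a sketch only: the function name is hypothetical, and the t value and critical value are the ones quoted in the lecture, not recomputed from raw data.

```python
def two_tail_decision(t_stat, t_critical):
    """Two-tail rule: reject H0 only if t falls beyond +/- the critical value."""
    if abs(t_stat) > t_critical:
        return "reject H0"
    return "do not reject H0"


# Beam problem: 12 beams from supplier A, 10 from supplier B
n1, n2 = 12, 10
df = n1 + n2 - 2

print(df, two_tail_decision(1.07, 2.086))  # 20 do not reject H0
```

Taking the absolute value covers both tails at once, so a t of minus 1.07 would give the same decision.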
You probably want to stop doing all these calculations by hand. So you're going to be using Microsoft Excel. And we have simple instructions on how to use Microsoft Excel to solve these kinds of problems on the virtual handout page. It tells you exactly what to do, and it's very, very simple.

In this problem, a marketer wants to know whether men and women spend a different amount on wine. Because statistically, you know that men spend a lot more on beer. Let's see about wine. The researcher randomly sampled 34 people: 17 were men and 17 were women. And here we see the averages. The average amount spent on wine in a year by women is $437.47. The average amount spent by men was $552.94. Now we're going to look at the Excel printout and we're going to determine from the printout whether the difference is statistically significant or not.

Now let's look at that printout from Excel. You see it has variable one and variable two. I mean, you can actually change the variable names and call them men and women, but just remember variable one is the women, variable two is the men. First of all, the first row, mean. Well, you see the averages: $437.47 for the women, and the men were spending $552.94. You see the variance below that. You see the variance for the women, 97,784.13971, and you see the variance for the men. You see the observations, 17 and 17. That was the sample size: 17 men and 17 women. Here's that pooled variance you learned about. Here it's done for you. Notice the pooled variance is in between the variance for the women and the variance for the men, and it's 101,002.8193. The hypothesized mean difference is zero. That's another way of saying that mu one equals mu two. If mu one equals mu two, the difference, mu one minus mu two, is zero. There you get the degrees of freedom, 32. We know how we got that. 17 plus 17 is 34. We lost two degrees of freedom. So 34 minus two is 32. t-stat, that's the computed t-stat.
That's the one where we use that formula that has the pooled variance and the one over n1 plus one over n2, that whole complicated formula. There it is. It's the t-stat. We call it the calculated or computed t-stat. And it's minus 1.05. And the last few lines that you get are the critical values, depending on whether you're doing a one-tail test or a two-tail test. And we'll learn about the probabilities in a moment.

What we're looking at over here, just in case you want to know where the data came from, is the input. And again, if you had to do this yourself, you have to input two columns of numbers. So here are the numbers we used. This is the amount the 17 women spent on wine in a year. And then we have the amount the men spent. This is basically two columns of numbers. So that's the data that we used.

The main thing the researcher wants to know is, is there a significant difference between women and men in how much they spend on wine? It looked like it was a big difference: 437 and change versus 552 and change. But look at our computed t-stat, minus 1.05. Anyway, using the old method that you were doing by hand, you see the critical values, all right? You see the critical values for t32. In the very last row, t critical two-tail, 2.0369, whatever. Those are the critical values. It's plus 2.0369, whatever, on the right, and it's minus 2.0369 on the left. So we were testing at alpha 0.05. That was at the 0.05 level. And our t-stat, the calculated t, was minus 1.05. That's not beyond minus 2.036. So we didn't get into the rejection region, right? And again, the reason for the minus is that the women, I think it was the women, were first. They spent less than the men, so that's the lower number. If you reversed it and put first the one that spent more, you'd have a plus. Same thing, because it's symmetrical. That's one way. But statisticians actually do it differently. That's the hard way, actually, of doing it.
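To see where Excel's t-stat comes from and why the sign flips with the order of the groups, here is a short Python sketch built from the printout values quoted in the lecture. The function and variable names are made up, and the formula is the standard pooled form, assumed to be the one Excel applies in this printout.

```python
import math


def pooled_t(mean1, mean2, n1, n2, sp_sq):
    """Pooled two-sample t statistic for H0: mu1 = mu2."""
    return (mean1 - mean2) / math.sqrt(sp_sq * (1 / n1 + 1 / n2))


# Values read off the Excel printout described in the lecture
women_mean, men_mean = 437.47, 552.94
n_women, n_men = 17, 17
sp_sq = 101002.8193   # pooled variance
t_critical = 2.0369   # t critical two-tail, 32 degrees of freedom

t_women_first = pooled_t(women_mean, men_mean, n_women, n_men, sp_sq)
t_men_first = pooled_t(men_mean, women_mean, n_men, n_women, sp_sq)

print(round(t_women_first, 3), round(t_men_first, 3))  # -1.059 1.059
print(abs(t_women_first) > t_critical)                 # False, so do not reject H0
```

Swapping the columns only changes the sign, and since the rejection rule looks at both tails, the decision is the same either way.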
The easy way is to look at the probability. We've been doing two-tail tests. Look at that probability. That's the probability of getting the sample evidence, which is the difference that you saw, or even something more extreme, a bigger difference. All right? Well, the probability of getting that, if H0 is true, is 0.29739, et cetera. Call it about 30%. You see where I got that from? That's the row right before the end, which is P, capital T less than or equal to small t, two-tail. It's a probability. You see the P? It's giving you the probability of getting the sample evidence or something more extreme. They're saying that even if H0 is true, and there's no difference between men and women in how much they spend on wine, there's almost a 30% chance of getting this sample evidence. So there's no reason for you to reject H0. See, you only reject H0 if that probability is less than 5%, the alpha of 0.05, meaning the sample evidence is unlikely. We're always looking at, what is the likelihood of getting the sample evidence if your claim is true? There's no difference between men and women. What is the likelihood of getting this sample evidence? And we're seeing that it's not so unlikely. This could very well happen. In fact, roughly 30% of the time it should happen. So you don't have the evidence to reject H0, that there's no difference between men and women. So in conclusion, there's no statistically significant difference between men and women in how much they spend on wine.

Let me repeat just a couple of points. Some of you seem to be wondering why the calculated t-statistic, the t-stat, ended up being negative. Again, women spent less than men on wine, even though it wasn't significant, so you get a negative number because you're subtracting a larger number from a smaller number. If you had put the men first, and there's nothing wrong with doing that, if they were in the first column, then you would have gotten a positive one.
But you get exactly the same results because the t-distribution is symmetric. Another question you might have is, what would the calculated t-stat have to be for us to reject H0? Again, we saw that the critical value is there, right, for the two-tail test. It showed us the critical value is 2.03693, et cetera. So you need a calculated t-value, called the t-stat, of more than 2.03693, et cetera, or less than minus 2.03693 to reject. Remember, our calculated t was 1.059. It wasn't in the rejection region. Actually, we had a minus 1.059. So you need more than 2.0369 on the positive side, or less than minus 2.03693 on the negative side, if you want to reject H0.

Okay, here's another problem where we're going to use Excel. Now, here we put the input data first. A company wants to know whether men and women have the same job satisfaction scores. We have a scale that goes from 0 to 10. 0 is the lowest. That means you're really, really unhappy on your job. And 10 is very happy with your job, the highest job satisfaction. And if you look at the data, you'll see there were 18 men and 18 women. We have the actual scores. And now we're going to do a test to see if there's a difference. Now, again, this could lead you to court if you're treating women differently. So it's important to know whether we have a statistically significant difference.

Anyway, now we're looking at the Excel output. Again, just remember, variable one is the men and variable two is the women, and there were 18 of each. And you can see that from observations, 18 and 18. Notice the mean for the men: 6.1666666, et cetera. And for the women, their average job satisfaction was 3.555555555, et cetera. Okay, we have the pooled variance. You see the pooled variance is right between the two variances of 4.735 and 5.0849. See that? And it's 4.9101, et cetera. Hypothesized mean difference is zero. Mu1 equals mu2. That's our null hypothesis. We have 34 degrees of freedom.
No surprises there. 18 plus 18 is 36. Minus 2 is 34. Here's your calculated t-stat: 3.535. That's the t-stat that you calculate. Actually, in this case, the computer did the work for you. But this is the one that uses that formula that has the pooled variance in it and n1 and n2. That's the complicated formula. Guess what? Here it is, done for you. So the t-stat is 3.535, et cetera. And again, if you do the two-tail test, you see the critical values there. All the way at the bottom, the t critical for the two-tail test: 2.0322. That's for a t with 34 degrees of freedom. And we'll discuss in a moment what this means.

Well, if you look at the printout, you'll see the average satisfaction for the men was 6.1666. A whole bunch of sixes, right? And for the women, it's 3.555. A lot of fives there. Let's call it 6.17 versus 3.56. That seems like a big difference on a 10-point scale. So we want to know if it's significant or not. You can't go to court unless you've shown that you're looking at a statistically significant difference. Well, if you've done a two-tail test, you see the probability of getting the sample evidence if H0 is true. Remember that probability, the one with the capital P, the capital T and the small t, the whole thing there? That two-tail probability is telling you, what is the probability of getting the sample evidence or something more extreme if H0 is true? H0 says there's no difference. Men and women are exactly the same. Well, what's the likelihood of getting the sample evidence? And I'm rounding now. The printout said 0.001199. Let's round it to 0.0012. That's 12 chances out of 10,000. 12 out of 10,000 chances of getting the sample evidence, essentially. In other words, we're saying this is highly unlikely. This is not what you expect to see. This kind of difference of 6.1666 versus 3.555, you don't expect to see that kind of difference if men and women have the same satisfaction in the population.
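For readers without Excel, the two-tail probability can be approximated in plain Python. This is a hedged sketch, not how Excel computes it: it integrates the Student t density numerically with Simpson's rule, and the function names are made up. The means, pooled variance and sample sizes are the printout values quoted in the lecture, with the means rounded to four decimals.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tail_p(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t_stat|) using Simpson's rule on [0, |t_stat|]."""
    b = abs(t_stat)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability between 0 and |t_stat|
    return 1 - 2 * area


# Job satisfaction printout: 18 men and 18 women
men_mean, women_mean = 6.1667, 3.5556
n_men, n_women = 18, 18
sp_sq = 4.9101
alpha = 0.05

df = n_men + n_women - 2
t_stat = (men_mean - women_mean) / math.sqrt(sp_sq * (1 / n_men + 1 / n_women))
p_value = two_tail_p(t_stat, df)

print(df, round(t_stat, 3))  # 34 3.535
print(p_value < alpha)       # True, so reject H0
```

The same function applied to the wine example, a t-stat of minus 1.059 with 32 degrees of freedom, should give a probability near the 0.297 that Excel reported, which is far above 0.05.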
Remember, we're always looking at the sample evidence to see whether it supports what you're saying about the population. Well, this sample evidence, based on the 18 men and 18 women, does not support the claim that you're making about the men and women in the population. And there may be a million men and women working at this company. Well, the sample evidence doesn't support it. And as statisticians, we would reject H0 and say there is definitely a difference between men and women in job satisfaction at this firm, and the men have much higher job satisfaction than the women. And that would have to be something that needs to be investigated by the lawyers, et cetera, in the court system. They'll be very concerned about this.

Well, this is basically just repeating it all. You see, the average job satisfaction of the men was rounded to 6.17. For women, it's 3.56. There's a difference of about 2.61 on a zero to 10 scale. And we know it's very unlikely that this sample evidence comes from populations that are the same. So we rejected H0, and we said there's a statistically significant difference between the average job satisfaction scores of men and women at this firm.

So again, what we're trying to show you is how to use Excel to do your own analysis. You just take two groups, you put in the two columns of numbers, and Excel will do all the work for you. It'll give you the means, the variances and the pooled variance, and it will do the calculation, called the t-stat. And then you look at the probability. If that probability is very low, and if you're testing at 0.05 that means less than 5%, you reject H0. If the probability is high, let's say it's like 20%, or even 12%, 10%, you're not going to reject. Generally, the cutoff is 5%. If the probability of getting the sample evidence under H0 is less than 5%, you reject your claim, you reject H0. As simple as that. As always, go back to the lecture notes, go back to the homework problems, do lots and lots of problems.
That's the best thing you can do for yourself in order to get a good grade on the exams in this course.