 Hi there, in this video we're going to look at comparing the means of two independent samples. So, the requirements. To use the method we're about to use, the standard deviations for both populations, sigma 1 and sigma 2 (sigma is the population standard deviation), are unknown, and no assumption is made about the equality of those standard deviations. Next, the two samples are independent, both samples are simple random samples, and either or both of the following conditions are satisfied: the two sample sizes are both large, meaning over 30, or both samples come from populations having normal distributions. Remember, that means bell shaped. So the notation that you're going to see coming up here is mu sub 1, remember that's the Greek letter mu. Mu sub 1 is the population mean for group 1, sigma sub 1 is the population standard deviation for group 1, n sub 1 is the sample size of group 1, x bar sub 1 is the sample mean of group 1, and s sub 1 is the sample standard deviation of group 1. If you replace all those 1 subscripts with 2s, that represents the corresponding information for group 2. So to perform these hypothesis tests, we are going to go to Google Sheets, to the data list tab, and we'll go to the two variable confidence interval p-value region and we'll type in the summary statistics for each of the samples. So researchers conducted trials to investigate the effects of color on creativity. Subjects with a red background were asked to think of creative uses for a brick; other subjects with a blue background were given the same task. Responses were scored by a panel of judges, so they gave scores to these people. Researchers make the claim that blue enhances performance on a creative task. Test this claim at the 0.01 level of significance. So you have the red background folks and the blue background folks. I'm going to call the red background folks group 1 and I'll call the blue background folks group 2. 
I need to list my hypotheses for this test. So the null hypothesis is always going to be that the two population means, mu 1 and mu 2, are equal. My claim is that blue enhances performance on a creative task. Therefore, that would mean that the mean for group 2 will be greater than that for group 1. So blue is performing better than red. That means the mean for my red group should be smaller than the mean for my blue group. It's always best for these sorts of tests to write group 1 on the left, for the sake of getting your information labeled in the correct spot when you use Google Sheets or any sort of technology. So blue enhances performance means the red group is going to have a lower mean than the blue group. So my null hypothesis is mu 1 equals mu 2, and my alternative hypothesis will be mu 1 is less than mu 2, and that is indeed my claim. That's what's mentioned in the question. So when we go to Google Sheets, we're going to type in the mean for group 1, the sample mean that is, the sample standard deviation and the sample size. Then for group 2, we'll do the mean, standard deviation and sample size there. And then we'll mention what the sign of the alternative hypothesis is. So let's go on and venture to Google Sheets, to the data list tab. We're focused on the two variable confidence interval p-value region. The first thing you'll do is type in your sample mean for group 1, 3.39, your sample standard deviation, 0.97, and your sample size, 35. For the second group, it's 3.97, 0.63 and a sample size of 36. The alternative hypothesis sign is going to be less than. So your p-value is 0.0021. That's what you need for a hypothesis test, that p-value of 0.0021, and we have to compare it to the significance level alpha. We have to compare it in this case to 0.01. Is 0.0021 less than or greater than 0.01? It's clearly less than. 
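If you want to check that p-value outside of Google Sheets, here is a minimal Python sketch using only the standard library. It assumes the spreadsheet runs an unequal-variance (Welch) t test with the Welch-Satterthwaite degrees of freedom; the video does not say which formula the sheet uses, so treat that as an assumption. The function names `t_pdf`, `t_cdf` and `welch_test` are made up for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), integrating the density from 0 to |x| with Simpson's rule."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def welch_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, degrees of freedom, standard error).

    Assumption: unequal variances, Welch-Satterthwaite degrees of freedom.
    """
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se  # group 1 on the left, as in the video
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se

# Color example: red background is group 1, blue background is group 2.
t_stat, df, se = welch_test(3.39, 0.97, 35, 3.97, 0.63, 36)

# H1 is mu1 < mu2, so the p-value is the area in the left tail.
p_value = t_cdf(t_stat, df)
alpha = 0.01

print("t =", round(t_stat, 3), " df =", round(df, 1), " p =", round(p_value, 4))
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Under that assumption the p-value should land close to the 0.0021 shown in the video, which is below the 0.01 significance level.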
Since we're under alpha, since we're less than our significance level, we reject the null hypothesis. So we are rejecting the null hypothesis, which means all eyes are pointing to the claim as being supported. So our summary statement is: there is sufficient evidence to support the claim that blue enhances performance on a creative task. To make a confidence interval for two means, you go to the data list tab, go to the same area, and you make sure you input the confidence level. So using the same data from the color creativity example, let's construct a 98% confidence interval estimate. So your confidence level is 98% or 0.98. All you have to do is go to Google Sheets, and we already have all of our data inputted, go to confidence level, type in 0.98 and push enter, and you see that you get your lower limit of negative 1.05 and your upper limit of negative 0.11. I'm rounding to two decimal places here. So my confidence interval is going to be from negative 1.05 to negative 0.11. Remember, the thing about confidence intervals is that when 0 is not included, there is likely a difference in the two means, because if the difference between two things is 0, then there is no difference. So since 0 is not included, there is likely a difference. We can say with confidence level 98%, the difference between the means of the two groups is between negative 1.05 and negative 0.11. So let's do a couple of examples where we'll do both the hypothesis test and the confidence interval. So in a study of proctored and non-proctored tests in an online intermediate algebra course, researchers collected the data for test results given below. Use a 0.01 level of significance to test the claim, there's your keyword claim, that students taking non-proctored tests get a higher mean than those taking proctored tests. Use a hypothesis test and then a confidence interval. So let's think about this claim statement here. 
My group 1 is going to be my proctored students; the sample size, the sample mean and sample standard deviation are given. My group 2 is my non-proctored students: sample size, sample mean, sample standard deviation. And the claim is that students taking non-proctored tests get a higher mean than those taking proctored tests. So if you are comparing the mean of group 1 with the mean of group 2, and I say non-proctored students, which is group 2, are scoring higher, that means the bigger side of the inequality sign faces group 2. So this is saying that the proctored students get a lower mean score than the non-proctored students. It's important to always write group 1 on the left, otherwise you could mix things up when you type your stuff into Google Sheets or whatever technology you use. All right, so let's talk about our hypothesis test. The null is always that the two means are equal to each other. And the alternative in this case is that the mean for group 1 is less than the mean for group 2. Remember, group 1 is those that had a proctored exam. Group 2 would be those that had a non-proctored exam. That's your claim. The alternative hypothesis is your claim. All right, so we need to input our information into Google Sheets. The sample mean for group 1, x bar 1, is 74.30. Our s1, our standard deviation for group 1, is 12.87, and our sample size for group 1 is 30. All right, then for group 2, we have 88.62 as your sample mean. You have 22.09 as your sample standard deviation, and then you have 32 as your sample size for group 2. Highlight all the important information we need for Google Sheets. And when we do our confidence interval, well, since less than is in the alternative hypothesis, this is a one-tailed test. If you recall, when we built confidence intervals, we had two tails. 
So for a one-tailed test, our confidence level is always going to be 1 minus 2 times alpha, 1 minus 2 times 0.01, which is going to give you 0.98. That's another piece of important information. Now you go over to Google Sheets and input all of this exciting information into the data list tab, in the two variable confidence interval p-value region. So for group 1, I had a mean of 74.30, a standard deviation of 12.87 and a sample size of 30. For group 2, I had a sample mean of 88.62, a standard deviation of 22.09 and a sample size of 32. Now my alternative hypothesis sign is less than, and my confidence level is 0.98. So we look at the p-value, 0.0014, and look at the confidence interval lower bound and the confidence interval upper bound. That's all we need for this example. All right, so our p-value, which we just discovered, is 0.0014. Let's compare it to alpha, let's compare it to 0.01. It's definitely less than, so we can reject the null hypothesis. We're under the limbo bar alpha, so we reject the null hypothesis. Which means our null hypothesis is out of here. It's gone. See you later. And all eyes are now pointing to the claim. There is sufficient evidence to support the claim. Now for the confidence interval, it's important to note that it was from negative 25.27 to negative 3.37. And once again, 0 is not in the interval. So there is a difference. Well, there is likely a difference, I should say. So 0 is not in the interval, so there is a difference, most likely. So the proper conclusion statement and the proper confidence interval statement are as follows. For the hypothesis test, the conclusion statement is the following. There is sufficient evidence to support the claim, and you always give your statement in terms of the claim, that students taking non-proctored tests have a higher mean than those taking proctored tests. 
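The p-value and the 98% interval for the proctored example can be reproduced with a short standard-library Python sketch. As before, it assumes the spreadsheet uses the unequal-variance (Welch) method with Welch-Satterthwaite degrees of freedom, which the video does not state. The helper names (`t_pdf`, `t_cdf`, `t_ppf`) are invented for this illustration, and the critical value is found by simple bisection rather than by any particular library routine.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), integrating the density from 0 to |x| with Simpson's rule."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_ppf(p, df):
    """Quantile: the x with t_cdf(x, df) = p, located by bisection."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Proctored example: proctored is group 1, non-proctored is group 2.
mean1, sd1, n1 = 74.30, 12.87, 30
mean2, sd2, n2 = 88.62, 22.09, 32
alpha = 0.01

v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
se = math.sqrt(v1 + v2)
# Assumption: Welch-Satterthwaite degrees of freedom.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

diff = mean1 - mean2            # group 1 minus group 2
t_stat = diff / se
p_value = t_cdf(t_stat, df)     # left tail, since H1 is mu1 < mu2

# One-tailed test, so the matching confidence level is 1 - 2 * alpha.
conf_level = 1 - 2 * alpha
t_crit = t_ppf(1 - (1 - conf_level) / 2, df)
lower, upper = diff - t_crit * se, diff + t_crit * se

print("p =", round(p_value, 4))
print("interval:", round(lower, 2), "to", round(upper, 2))
```

Under the stated assumption, the result should agree with the spreadsheet values quoted in the video (a p-value near 0.0014 and an interval from about negative 25.27 to negative 3.37), up to rounding.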
And the confidence interval statement: with confidence level 98%, the difference between the mean scores of students taking proctored tests and non-proctored tests is between negative 25.27 and negative 3.37. So if you want to write these down, feel free to pause the video. Let's look at one last example here, where we know the mean weight of men is usually greater than the mean weight of women, and the mean height of men is greater than the mean height of women. A person's body mass index, or BMI, is computed by dividing their weight in kilograms by the square of their height in meters. Given below are the BMI statistics for random samples of males and females. Use a 0.05 significance level to test the claim, ooh, there's that keyword claim for a hypothesis test, that males and females have the same mean BMI. All right, so I have my summary statistics for males and females: sample size, sample mean, sample standard deviation. I think we'll call males group one just because they're the first group listed, and we'll call females group two, since they're the second group listed. All right, so the claim is that the two means are equal to each other. So when I go through and run my hypothesis test, it's always the case that the null hypothesis is mu one is equal to mu two, the two means are equal, and the alternative will be the opposite of that, mu one is not equal to mu two. It's always helpful to write out what your claim is first and then identify which hypothesis it goes with. Here, the claim is the null hypothesis. 
All right, all the information you're going to type in to Google Sheets is your sample mean for group one, your sample standard deviation for group one (we've got all these long decimal numbers we've got to type in, plenty of room to make mistakes), the sample size for group one, and then you also input the same information for group two: the sample mean, the sample standard deviation, and the sample size, which is 40. And the sign of your alternative hypothesis is not equal to. That's what we'll type in to Google Sheets, seven pieces of information. What about that confidence interval, what's our confidence level going to be? Might as well kill two birds with one stone here. Our test is two-tailed, so our confidence level is going to be one minus alpha, or one minus 0.05, so that's 0.95. I've highlighted everything that we're going to type in to Google Sheets, so let's go there now. So type in your information for group one, that means the sample mean, sample standard deviation and sample size, and type in the same information for group two. Put in your alternative hypothesis sign, which is not equal to, and then your confidence level of 0.95, and you'll be given a p-value of about 0.2066, and then you have your confidence interval lower bound and upper bound, negative 1.039 and 4.719. So let's indicate the information we just found. So we had a relatively high p-value of 0.2066. Is that less than 0.05? Definitely not, it's greater than. So we fail to reject the null hypothesis, we fail to reject H naught. The null hypothesis is our claim in this case, so there's not sufficient evidence to reject the claim here. And then your confidence interval: negative 1.039 is your lower bound and your upper bound is 4.719. So what this means is, zero is in the interval, so there is likely no difference between the two population means, based on the information we've collected from these two samples. 
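The two rules of thumb used in these examples (which confidence level goes with which test, and what it means when zero is in the interval) can be written as a tiny Python sketch. The function names `confidence_level` and `contains_zero` are hypothetical helpers made up here, not something from the spreadsheet.

```python
def confidence_level(alpha, tails):
    """Confidence level that matches a test at significance level alpha.

    Two-tailed test: 1 - alpha.  One-tailed test: 1 - 2 * alpha.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return 1 - alpha if tails == 2 else 1 - 2 * alpha

def contains_zero(lower, upper):
    """True when 0 lies inside the interval, i.e. likely no difference."""
    return lower <= 0 <= upper

# BMI example: two-tailed test at alpha = 0.05.
print(round(confidence_level(0.05, tails=2), 2))   # 0.95
print(contains_zero(-1.039, 4.719))                # True: likely no difference

# Proctored example: one-tailed test at alpha = 0.01.
print(round(confidence_level(0.01, tails=1), 2))   # 0.98
print(contains_zero(-25.27, -3.37))                # False: likely a difference
```

The interval endpoints are the ones read off the spreadsheet in the video; the sketch only checks whether zero falls between them.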
So zero is in the interval, which means it is likely that there is no difference. All right, so let's write our conclusion statement for the hypothesis test and our one-sentence statement for the confidence interval. We say that there is not sufficient evidence to warrant rejection of the claim that males and females have the same mean BMI. And we say that with confidence level 95%, the difference between the mean BMI of men and women is between negative 1.039 and 4.719. Remember, zero is in the interval, so it suggests that there is likely no difference between the two population means. But anyway, that's all I have for now, I hope you enjoyed. Thanks for watching!