Welcome. We're now going to go through the entire process of calculating a confidence interval by hand, and then we'll start using technology to create our confidence intervals for us.
Step one is to verify that the required assumptions are satisfied. In order for us to use the methods that we're about to use, the sample must be a simple random sample, the conditions for the binomial distribution must be satisfied, and the normal distribution can be used to approximate the distribution of sample proportions because there are at least five observations and at least five non-observations. So for instance, if you're looking at M&Ms and at the proportion of M&Ms that are brown, you must show that there are at least five brown M&Ms in your sample and at least five non-brown M&Ms in your sample. We don't focus too much on verifying the assumptions in this class. We learn more about the process and procedure.
Step two is to find the critical value. Remember, that's the z value whose area to the right is alpha over two. Alpha is the significance level, which is one minus the confidence level. Step three is to calculate the margin of error. Then in step four, we take our sample proportion, our point estimate p hat, and we subtract the error from it and add the error to it, and that's our confidence interval. You have a lower bound, you have an upper bound, you're done. And make sure you round accordingly. So if it says round your confidence interval to two decimal places, make sure you round to two decimal places.
So in the Twitter example, we noted that a poll of 1,007 randomly selected adults showed that 85% of respondents know what Twitter is. The sample results are a sample size n of 1,007 and a sample proportion p hat of 0.85. So p hat is going to be the starting point of our confidence interval.
The first thing they want you to do officially is to find the margin of error E that corresponds to a 95% confidence level. So there's a formula for margin of error. The margin of error is equal to your critical value, that's z sub alpha over two, times the square root of p hat times q hat over n.
Alright, so for a 95% confidence level, the critical value is always 1.96. There's a shortcut table from the previous video you could use. Another way to find the critical value is to draw your bell curve. You're trying to find the positive critical value, and you know that 95% is the center of your bell curve, which leaves 2.5% for each of your tails. So in order to find your critical value, you'd use technology where the area to the left is 97.5%. In general, you can assume that a 95% confidence level has a critical value of 1.96.
Alright, under the square root, p hat is 0.85, so what would q hat be? Well, q hat is 1 minus p hat, so 1 minus 0.85 is 0.15. And the sample size n is 1,007. So what this ends up giving us is 1.96 times the square root of 0.0001266137. I'm scared to round until the final answer, because even rounding to two decimal places, or three or four, might cause your error to be a little bit off. It might cause error in your error. Let's just say that. So if you take the square root of that long decimal and multiply it by 1.96, you end up getting 0.0221. So that's going to be your error.
If a question asks you to find the margin of error, you must calculate the error by hand. The technology will not tell you the specific calculation. It'll tell you your confidence interval, but it won't break it down and tell you what the error is.
So now find the 95% confidence interval estimate of the population proportion p. So literally we're going to take p hat and we're going to add and subtract the error from it.
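As a quick check on the arithmetic, the margin of error formula can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet used in the course: the function name `margin_of_error` and the variable names are made up for this example, and the critical value comes from the standard library's `NormalDist` (area to the left of 97.5% for a 95% level), which gives roughly 1.96.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence):
    """Margin of error E = z_(alpha/2) * sqrt(p_hat * q_hat / n).

    Hypothetical helper for illustration; uses the normal approximation
    described in the lecture.
    """
    alpha = 1 - confidence                      # significance level
    z = NormalDist().inv_cdf(1 - alpha / 2)     # critical value, about 1.96 at 95%
    q_hat = 1 - p_hat
    return z * sqrt(p_hat * q_hat / n)


# Twitter example from the lecture: n = 1,007 and p_hat = 0.85
n = 1007
p_hat = 0.85

e = margin_of_error(p_hat, n, 0.95)
lower = p_hat - e   # subtract the error from the point estimate
upper = p_hat + e   # add the error to the point estimate

# Round only at the very end, as the lecture recommends
print(f"E = {e:.4f}")
print(f"{lower:.4f} < p < {upper:.4f}")
```

Nothing is rounded until the final `print`, which mirrors the advice above about not rounding the intermediate values.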
So step one here is p hat minus E. Your sample proportion is 0.85 and you're going to subtract 0.0221. When you subtract those two values you get 0.8279. Then you take your sample proportion and you add the error to it, 0.85 plus 0.0221, and that's going to give you 0.8721. So this confidence interval will be 0.8279 is less than p is less than 0.8721. You could also write it in parentheses with the two values separated by a comma. So one's inequality form, the other's interval notation form. So that's how you calculate a confidence interval by hand. And that's telling me that the true proportion is between 0.8279 and 0.8721 with 95% confidence.
So based on the results, can we safely conclude that more than 75% of adults know what Twitter is? And the answer is yes, because our whole interval, from 0.8279 all the way through 0.8721, sits above 0.75. That's 82.79% to 87.21%, just as an FYI, which is way more than 75%.
So now, assuming you're a newspaper reporter, write a brief statement that accurately describes the results and includes all the relevant information. In order to describe a confidence interval in words, we use the following format. We say "with 95% confidence," so we state our confidence level, then the proportion, and in this case we're talking about the proportion of adults who know what Twitter is, "is between 0.8279 and 0.8721." You literally list your values. That's always the structure for writing your confidence interval statement. If you write a confidence interval you must write out the description. The two go hand in hand.
So now let's look at technology. I'm not going to read this to you, I'd rather just show you on the next page, but it's literally as easy as typing three pieces of information into the Google Sheets spreadsheet. So the colors from a sample of 100 M&Ms, because who doesn't like M&Ms?
Yeah, I know some of you don't, but a lot of you do. We collected a sample of 100 M&Ms and 8% of them were brown. Use the sample data to construct a 98% confidence interval estimate of the proportion of brown M&Ms. And Mars candy claims that 13% of its M&Ms are brown, so is this claimed rate wrong?
So all we have to do is write down three pieces of information. We're talking about brown M&Ms now. This is for Google Sheets: we're going to go to a tab that's called the data list tab, and we're going to go to the region that's called 1 prop ci p value. It just means you're finding the confidence interval for one proportion. The first thing you're going to put is your number of observations. Remember, we're talking brown M&Ms, and you want the actual count of observations. How many M&Ms were brown? Well, it said 8%, but we need an actual count. Out of the 100 M&Ms, 8% or 0.08 are brown, so that's 8. That's going to be x on your spreadsheet. Your sample size n is 100, and then your confidence level is the only other information you need. It's 0.98, or you could literally type 98 with the percent sign and it'll recognize it as a percentage.
So let's type these three values into the Google Sheets spreadsheet. We go to the data list tab, we focus on proportions, one proportion confidence interval p value, and we type in a value for x. There are 8 M&Ms that were brown. Then you type in your sample size n, which is 100, and the only other thing you need for a confidence interval is your confidence level, 0.98. So I have 8, 100, and 0.98. Don't worry about any of the other information. You just need to focus on the confidence interval lower bound and the confidence interval upper bound.
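If you want to double-check what the spreadsheet reports, the same three inputs (x, n, and the confidence level) are enough to redo the calculation yourself. The sketch below assumes the spreadsheet uses the same normal-approximation formula as the by-hand method, which the lecture does not actually confirm, and the function name `one_prop_ci` is invented for this example.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_ci(x, n, confidence):
    """Confidence interval for one proportion via the normal approximation.

    Hypothetical helper: x is the count of observations, n the sample size,
    confidence the level as a decimal (for example 0.98).
    """
    p_hat = x / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    e = z * sqrt(p_hat * q_hat / n)                      # margin of error
    return p_hat - e, p_hat + e


# M&M example from the lecture: 8 brown out of 100, 98% confidence
lower, upper = one_prop_ci(8, 100, 0.98)
print(f"lower bound: {lower:.3f}")
print(f"upper bound: {upper:.3f}")

# Mars's claimed rate of 13% is compatible with the sample if it falls inside
claim = 0.13
print("claim inside interval:", lower <= claim <= upper)
```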
Those are the two things that you need from the spreadsheet. The rest will come later on. So let's do three decimal places: 0.01, and the six rounds to a seven because there's an 8 in the following place, so 0.017, and then 0.143. So that is our confidence interval, 0.017 to 0.143. That's anywhere from 1.7% to 14.3%.
So is Mars's claimed rate wrong? Well, 13% is literally 0.13, and since 0.13 is in the interval, the claimed rate is not necessarily wrong.
The only other thing we need to do here is write out the one-sentence interpretation of our confidence interval, and that interpretation is: with 98% confidence, the true proportion of M&Ms that are brown is between 0.017 and 0.143.
There's one more thing we're going to do here, and that's determining sample size. Sometimes you need to determine the correct sample size to use to estimate a population proportion. You can't just go out and pick any number of people. If there are requirements that must be met for your boss, or for the company that's contracted you to do these calculations, you need to use a sample size that meets those requirements. Solving for n in the margin of error formula gives the following two formulas for determining sample size. The desired sample size is the critical value squared times p hat times q hat, over the error bound squared, and that's when you have an estimate of p hat.
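Written out, "solving for n" in the margin of error formula goes like this. The symbols are the same ones used throughout the lecture (E for the error bound, z sub alpha over two for the critical value); only the intermediate squaring step is spelled out here.

```latex
\begin{align*}
E &= z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} \\
E^{2} &= z_{\alpha/2}^{2}\,\frac{\hat{p}\,\hat{q}}{n} \\
n &= \frac{z_{\alpha/2}^{2}\,\hat{p}\,\hat{q}}{E^{2}}
\end{align*}
```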
If you don't have an estimate of p hat, you still square the critical value, but you then multiply it by 0.25. The reason why is that if you don't know anything about p hat, which means you don't know anything about q hat, you use 0.5 for both. What's 0.5 times 0.5? That's 0.25. That's where that comes from. And then you still divide by the error bound squared. If the computed sample size is not a whole number, you round up, because if you need a sample size of 40.2, 40 is not going to make the cut. You have to bump it up to 41.
So many companies are interested in knowing the percentage of adults who buy clothing online. How many adults must be surveyed in order to be 95% confident that the sample percentage is in error by no more than three percentage points? And we're going to use a recent result from the Census Bureau that 66% of adults buy clothing online.
So the important thing here is we're using a 95% confidence level, which means instantly that our critical value is 1.96. Alright, p hat, our point estimate here, is going to be 0.66. That's our sample proportion. What about q hat? Well, it's one minus 0.66, so that's 0.34. And what else do we have to know? We need to know the error. They said they want the error to be no more than three percentage points, so that means the error E would equal 0.03. Now all you have to do is plug and chug. So you have 1.96 squared times 0.66 times 0.34, over 0.03 squared. It's up to you to do this calculation, use a calculator, but on top I got 0.86205504 and on the bottom I got 0.0009. So you divide these two things and you get 957.8 for the necessary sample size. Ordinary rounding would take that to 958 anyway, but remember, you must always round up. So you need a sample size of 958 people in order to meet those requirements and to have that error bound. So that's pretty cool.
Well, now let's assume we have no prior information suggesting a possible value of the proportion, so we have to use the formula where p hat times q hat is replaced with 0.25. It's still the same confidence level, so it's still the same critical value of 1.96, and it's still the same error of no more than 3%. So even though we have less information here, the formula is actually a little bit easier, which is pretty awesome if you ask me. So it's the critical value squared times 0.25, over 0.03 squared. You perform this calculation, you get 0.9604 divided by 0.0009, and you're going to get 1067.1. So if you go out and ask 1067 people, that's not enough. Remember, you must always round up for this sample size. Always round up. So that's going to give you 1068 people.
So notice that in part a of this question we only needed 958, and in part b we needed 1068. Just by having some prior evidence of the sample proportion, we saved ourselves from having to ask around 110 more people. So anyway, that's all I have for now. Thanks for watching.
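For anyone who wants to redo the two sample size calculations, here is a minimal Python sketch. The function name `required_sample_size` is made up for this example, and it uses the lecture's rounded critical value of 1.96 so the numbers match the ones worked out above. The one design choice worth noticing is `math.ceil`, which always rounds up instead of rounding to the nearest whole number.

```python
from math import ceil


def required_sample_size(z, e, p_hat=None):
    """Sample size needed to estimate a proportion within error bound e.

    Hypothetical helper: with an estimate p_hat it uses z^2 * p_hat * q_hat / e^2;
    with no estimate it uses z^2 * 0.25 / e^2.
    """
    if p_hat is None:
        pq = 0.25                   # 0.5 * 0.5 when nothing is known about p_hat
    else:
        pq = p_hat * (1 - p_hat)    # p_hat * q_hat
    return ceil(z**2 * pq / e**2)   # always round up to the next whole person


z = 1.96   # critical value for 95% confidence, as used in the lecture
e = 0.03   # error of no more than three percentage points

n_with_estimate = required_sample_size(z, e, p_hat=0.66)   # part a
n_no_estimate = required_sample_size(z, e)                 # part b

print("with p_hat = 0.66:", n_with_estimate)
print("with no estimate: ", n_no_estimate)
print("extra people needed without prior information:", n_no_estimate - n_with_estimate)
```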