Well hello, this is module five, talking about the normal distribution. If you recall, back in module one when we made histograms, we said the shape of a histogram was normal, or that data are normally distributed, if your frequencies start low, get high, and then get low again. So there's a peak right there in the middle, and on either side the distribution is approximately symmetric. That's called the normal distribution. We use the normal distribution when it comes to continuous data. Remember, continuous data are data values that can take on decimal form, to as many decimal places as you would like.

So the first stop is actually the empirical rule. The empirical rule is an estimation or approximation rule that tells you what percentage of the data is located within a certain number of standard deviations of the mean. So this is an estimation rule. There's no technology used. It's just a rule that we're about to learn about.

So a density curve is the graph of a continuous probability distribution, and it must satisfy the following properties. For a density curve, which is that bell shape that normally distributed data take, the total area under the curve must equal one. That's because the area represents probability. Remember, the probabilities of all the possible outcomes within a sample space, aka the area under the curve, must add up to one, or a hundred percent. And every point on the curve must be on or above the x-axis. So the x-axis will be our number line, and the probabilities will be the area underneath that curve.

So here's the empirical rule. For data sets having a distribution that is approximately bell shaped (very important, this "bell shaped" wording here), the following properties apply. I can guarantee you that if my data are bell shaped, and remember that means the histogram starts low, gets high, gets low again, and is symmetric about the middle, about 68% of all values will fall within one standard deviation of the mean. From one standard deviation below to one standard deviation above,
that's 68%. Then 95% of all data values will fall within two standard deviations of the mean, and approximately 99.7% of all data values will fall within three standard deviations of the mean.

So what does this mean exactly? Well, let's do this by talking about a picture. So here I have a bell curve, normally distributed data. In this picture, notice the x-axis serves as a number line. Right smack dab in the middle of the x-axis is your mean. Remember, that's represented by mu, the Greek letter μ. And then you have one standard deviation above and one standard deviation below, so that's within one standard deviation of the mean. 68% of the data will be located within one standard deviation of the mean. Well, I'm taking 68 and dividing it between two regions, the region just below the mean and the region just above the mean. That's where that 34% comes from.

Then we said there's 95% of the data located within two standard deviations of the mean. So what this means is, if you take 95% and take away the 68 that's already accounted for by one standard deviation, this is going to leave you with 27. And if you take 27 and divide it by 2, that's where the 13.5 comes from, because you have these two regions. So together these four percentages add up to 95%. 95% of the data is within two standard deviations of the mean.

Let's go further. Then we said 99.7% of the data values are within three standard deviations of the mean. So if I take 99.7 and take away the 95% that's within two standard deviations of the mean, that leaves me with a measly 4.7 to divide up between two regions. Well, 4.7 divided by 2 is 2.35. So that's why these two outer pieces are each 2.35%.

And then we can even go a little bit further. If three standard deviations encompasses 99.7% of the data, then 100 minus 99.7 leaves you with 0.3. What's 0.3 divided by 2?
It's 0.15, so that's why these outer tails out here, that little tail off to the very right side and the little tail off to the left side, are each 0.15%. So we'll actually be able to use this picture, this diagram, to answer some questions.

Now, on a certain test, scores are bell shaped (remember, that's a super important word) with a mean of 90, also important, and a standard deviation of 10, also important. By the time this is over I'm going to have underlined everything, right? It's all important. Everything is important. Sort of. So the results follow a normal distribution. Use the empirical rule, which means we're using that estimation rule, this awesome picture that has regions with percentages in them, to find the percentage of people whose scores are between 70 and 110, and then between 60 and 100.

The first thing you have to do is take your empirical rule diagram and put it within the context of the question. So we know the mean, mu, is 90. That goes in the middle. Well, what is one standard deviation above the mean? What data value would that be? Well, the standard deviation is 10, and 90 plus 10 is 100. All right. What about two standard deviations above the mean? Let's add another 10 to get 110. Then three standard deviations above is the data value 120. All right, so now what about one standard deviation below? 90 minus 10 is 80, then you have 70, then you have 60.

Now let's answer the question. What percentage of people have scores between 70 and 110? So what regions would those be? Well, take every region between 70 and 110 and add up those percentages. So you have 13.5 plus 34 plus another 34 plus 13.5, and that's a nice, beautiful 95%. That makes sense, because 70 is two standard deviations below the mean and 110 is two standard deviations above, and remember, 95% of the data is located within two standard deviations of the mean.

Well, what about the percentage of people who have scores between 60 and 100?
So look at all the regions between 60 and 100. I'm just going to shade anything from 60 all the way to 100. That's four regions. So my first shaded region is 2.35, plus my second region's 13.5, plus 34, plus 34, one percentage for each region that we shaded, each region that's included. And this is going to give us 83.85%. So please use the picture. You have to use this visual. It's extremely beneficial. It's extremely helpful.

So that's the empirical rule. It's used for estimating the percentage or proportion of data values located within a certain number of standard deviations of the mean. That's all I have for now. Thanks for watching.
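
As a rough sketch of the arithmetic in the test-score example, the region percentages from the empirical rule diagram can be added up in Python. The names `BANDS` and `empirical_percent` are made up for illustration, and the function assumes both score bounds sit exactly a whole number of standard deviations from the mean, no more than three away, which is the only situation the diagram covers.

```python
# Percentage of data in each band of the empirical rule diagram.
# Each band is one standard deviation wide; the key is the band's
# lower edge, measured in standard deviations from the mean.
# (Illustrative names, not from the lecture.)
BANDS = {-3: 2.35, -2: 13.5, -1: 34.0, 0: 34.0, 1: 13.5, 2: 2.35}


def empirical_percent(mean, sd, low, high):
    """Estimate the percentage of values between low and high.

    Assumes low and high are whole numbers of standard deviations
    from the mean and lie within three standard deviations of it.
    """
    z_low = (low - mean) / sd
    z_high = (high - mean) / sd
    if not (float(z_low).is_integer() and float(z_high).is_integer()):
        raise ValueError("bounds must be whole standard deviations from the mean")
    if not -3 <= z_low <= z_high <= 3:
        raise ValueError("bounds must lie within three standard deviations")
    # Add up every band between the two bounds, as in the shaded picture.
    total = sum(BANDS[z] for z in range(int(z_low), int(z_high)))
    return round(total, 2)


# Test scores: mean 90, standard deviation 10.
print(empirical_percent(90, 10, 70, 110))  # 95.0
print(empirical_percent(90, 10, 60, 100))  # 83.85
```

A score such as 75 is not on one of the diagram's boundaries, so the sketch raises an error for it rather than guessing; the empirical rule by itself does not say how to split a region.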