Take our four standard deviations, remembering that if I want to plot all the data as a nice bell curve, tails included, four standard deviations on either side of the mean covers the vast majority of the data. So I take the standard deviation, 815, times 4, and subtract that from the midpoint, or mean, of 2189, which gets us to about negative 1071. The figures shown here are rounded, so that's not exact; with the unrounded values the low point is about negative 1069. Now that's a negative number, so you might say, why not just stop at zero? You could. But sometimes it's nice to plot all the way down into the negatives so you see the whole shape of the bell, and it gives you another verification, because the percentages should add up to roughly 100%. So we'll keep it for now, just to demonstrate that. Going the other way, 815 times 4 standard deviations plus 2189 gets us to the high point of about 5448 calories.
So for my count over here, the x values start in the negatives and run all the way up into the positives. I've cut some of it out here; we'll have the whole thing in Excel. Here it crosses into the positive calories, and so on. Then we can do our P(x) calculation, which would be NORM.DIST. But first, notice the x column: we built it with a formula, which we'll also demonstrate in Excel, because we want to go from negative 1069 up to positive 5448. You could do that by typing negative 1069 and negative 1068, highlighting those two, and letting Excel pick up the sequence as you drag down, but you'd have to drag down more than 6500 rows, so it's faster to use the SEQUENCE function. For the number of rows (rows, not columns) we want the sum of those two endpoints plus one: 5448 plus 1069 is 6517 rows, plus one.
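As a rough sketch of the arithmetic above, here it is in Python rather than Excel (the language is an assumption; the walkthrough itself uses Excel). The mean 2189 and standard deviation 815 are the rounded figures quoted above, and the endpoints -1069 and 5448 are the ones the walkthrough lands on with its unrounded inputs.

```python
MEAN = 2189  # rounded mean calories, as quoted in the walkthrough
SD = 815     # rounded standard deviation, as quoted in the walkthrough
K = 4        # number of standard deviations on each side

# Bounds from the rounded inputs
low_rounded = MEAN - K * SD    # 2189 - 3260 = -1071
high_rounded = MEAN + K * SD   # 2189 + 3260 = 5449

# Endpoints the walkthrough actually uses (from its unrounded inputs)
low, high = -1069, 5448

# Counting every whole calorie from low to high, inclusive,
# is why the row count is the sum of the two magnitudes plus one.
rows = high - low + 1          # 5448 + 1069 + 1 = 6518

# Python stand-in for filling the x column in one step
xs = list(range(low, high + 1))

print(low_rounded, high_rounded, rows, xs[0], xs[-1])
```

The "plus one" is there because both endpoints are included in the count.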
So that would be 6517 plus one for the rows argument, then we skip the columns argument, which is why we have two commas, and then the starting point is that negative 1069, so something like =SEQUENCE(6517+1,,-1069). Then it fills in all of these x values for us without us having to drag anything down. Once we have that, we can do our NORM.DIST. Now it looks funny because the calories are negative up top, but remember we kept the negatives so the curve covers the full four standard deviations on the low side. In NORM.DIST we give it the x value, then the mean and the standard deviation, which of course are this number and this number, and then the cumulative argument, which is going to be not cumulative: FALSE, or zero.
If we copy this all the way down, you can see it plotting the values out. We get into the positive numbers down here, so now the likelihood of our data set being at 126 calories is 0.0020%. Note that we're getting pretty small numbers, in part because a calorie is a pretty small unit of measurement. So if I'm looking at just this one calorie point of 166, the percent is pretty low. It's more likely that we'll be asking questions about ranges, like what's the likelihood of being at 167 or below. You'd be tempted to just sum the values up, but strictly you'd use another formula, because we're talking about area under the curve. Although, because we're using such fine steps here, just summing the values gives you a pretty good approximation. We'll talk more about that later.
So now here's where an issue comes up: we want to be able to compare this to the actual count.
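To make the non-cumulative NORM.DIST step concrete, here is a minimal Python sketch. It assumes the rounded mean (2189) and standard deviation (815) quoted earlier and uses the standard library's `statistics.NormalDist` as a stand-in for Excel's function; the variable names are illustrative only.

```python
from statistics import NormalDist

# Rounded figures from the walkthrough (assumption: the workbook uses unrounded ones)
dist = NormalDist(mu=2189, sigma=815)

# One x value per whole calorie, like the filled x column
xs = range(-1069, 5448 + 1)

# Non-cumulative value at each x (the density, i.e. cumulative = FALSE)
pdf = {x: dist.pdf(x) for x in xs}

p126 = pdf[126]  # roughly 0.00002, i.e. about 0.0020%

# Because the step is one calorie, simply summing the densities
# comes very close to the true area under the curve over the range.
total = sum(pdf.values())
area = dist.cdf(5448) - dist.cdf(-1069)

print(f"P(126) = {p126:.6%}")
print(f"sum of densities = {total:.5f}, area from CDF = {area:.5f}")
```

The sum falls just short of 1 because the tails beyond four standard deviations are left out.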
Now, the way we've done that in the past: I can take my actual data and count all the numbers over here with a COUNT function, how many data points do we have, and it comes out to 457. So we have 457, far fewer data points than the last example, where we had something like 4000. So I could take this P(x) number times the 457, but you're going to get really small fractions, because we have such small units of measurement here.
Or, last time, what we did is group all of our actual data into bins, or buckets, based on the calorie counts. But that's not going to work as well this time either, because the buckets are so fine that we're just going to have a bunch of zeros, then every once in a while one data point that landed in a bucket, and then a bunch of zeros again, since we have so many small units of calorie count.
So for example, here is us taking the percent times the count. Remember the count was 457. Even if I go down to one of the larger percents, it's still quite a small number. If I take that 457 and multiply it by this one, which is 0.000021, and put it in decimal format, you get this really small number, about 0.0096. That isn't going to match any actual data count, because an actual count is a whole number: you can't have less than one of a data point. So I match that up against my actual frequency. The actual frequency means we're looking at these in terms of buckets, counting how many times in our actual data set we had a value above 126 and up to and including 127, and you get zeros for nearly all of them.
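The mismatch described above can be sketched in Python. Note the loud assumption: the real 457 calorie values are not reproduced here, so the sample below is synthetic, drawn from a normal distribution with the rounded mean and standard deviation purely as a stand-in. Only the count (457), the 0.000021 example, and the "above 126, up to and including 127" bucket rule come from the walkthrough.

```python
import random
from collections import Counter
from math import ceil
from statistics import NormalDist

N = 457                                  # number of data points, from the walkthrough
dist = NormalDist(mu=2189, sigma=815)    # rounded figures (assumption)

# Expected count at a single calorie value = P(x) * N
example = N * 0.000021                   # the on-screen example, about 0.0096
expected_at_peak = N * dist.pdf(2189)    # even at the mean this stays well below 1

# SYNTHETIC stand-in for the actual data set (not the real data)
rng = random.Random(0)
sample = [rng.gauss(2189, 815) for _ in range(N)]

# One-calorie buckets: a value above 126 and up to 127 lands in bucket 127
freq = Counter(ceil(v) for v in sample)

# How many of the 6518 buckets between -1069 and 5448 are empty
empty = sum(1 for k in range(-1069, 5448 + 1) if freq[k] == 0)

print(f"expected count at the example row: {example:.4f}")
print(f"expected count at the mean: {expected_at_peak:.4f}")
print(f"empty one-calorie buckets: {empty} of 6518")
```

With only 457 values spread over 6518 one-calorie buckets, most buckets have to be empty, while the expected count per bucket is always a fraction below one, so the two columns cannot line up bucket by bucket.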
And then every once in a while you're going to have a one over here in the frequency column, so it's going to be difficult to compare the two. In past examples, when we were talking about heights or weights, for example, this frequency count lined up fairly nicely, because we didn't have such small units of measurement. We were then able to take each bucket's percent of the total and compare it to the P(x) over here.