 statistics and excel normal distribution calories example part number one get ready taking a deep breath holding it in for 10 seconds looking forward to a smooth soothing excel first a word from our sponsor yeah actually we're sponsoring ourselves on this one because apparently the merchandisers they don't want to be seen with us but but that's okay whatever because our merchandise is is better than their stupid stuff anyways like our trust me I'm an accountant product line yeah it's paramount that you let people know that you're an accountant because apparently we're among the only ones equipped with the number crunching skills to answer society's current deep complex and nuanced questions if you would like a commercial free experience consider subscribing to our website at accounting instruction dot com or accounting instruction dot think of it dot com here we are in excel if you don't have access to this workbook that's okay because we'll basically build this from a blank worksheet but if you do have access there's three tabs down below example practice blank example in essence answer key practice tab having pre-formatted cells so you can get to the heart of the practice problem the blank tab blank worksheet with just our starting data in it so we can practice formatting the cells within excel as we work through the practice problem if you don't have any starting data to practice with you can try to go into resources such as cagle dot com or create your own practice data set by going to the data tab up top analysis group and the data analysis if you don't have an analysis group you can usually turn that on in the options under the file tab as we have seen in prior presentations let's go on down to the example tab to get an idea of what we will be doing we've got our data on the left we're going to have our calorie counts by date being able to then sort our calorie count once we put a table around it we're going to do some familiar stuff we've seen in prior examples now with our calorie data one of the differences however between this example and prior examples is that when we think about the calorie count we're looking at fairly low units of measurements so when we actually plot our graph we can do it calorie by calorie but when we want to compare this information to our actual count it is often useful then to kind of group some of this information together so we'll put it into buckets which will be one of the distinguishing factors of this problem to prior problems so let's go on over to the blank tab and take a look at it i'm going to delete this here and let's do our standard formatting now i'm going to format the entire worksheet which will mess up the dates here and uh then i'll fix the dates so i'm going to note that that's going to happen i'll select the triangle right click on the entire worksheet and then format the cells let's go to the currency and negative numbers bracketed red dropping it down with no dollar sign and we don't need the decimals either so let's remove the decimals okay so you can see the date field has now gotten messed up i'm not worried about that i'll fix that here before i do though home tab font group in bolden and then i'm going to select column a and change the date field so home tab number and then i'm going to go to the number drop down and just bring it back to that short date field and so there we have it i'm going to put a table in here now so it'll be more easy for us to sort the data so i'll click anywhere in this set of data home tab insert i'm going to put a table in it now i don't think there's any blank cells in our data so i think the table the dancing ants should be doing their dance around the whole worksheet which is down to 458 and there it is let's just double check that that is indeed the case if i scroll down boom table has got all of the data within it now we could sort the data data by date now of course or we can sort it by the calorie count and so we might go from low to high got those zero days in there we got from high to low those days where we uh had a lot of calories for whatever reason let's keep it there because we could see the numbers here that's where we have to work that's what we have to work on any case i'm going to make column c a little bit smaller and let's do some of our standard calculations we'll do the mean we'll do the standard deviation which i'm just going to put sd the median and let's try the mode and see if the mode works as well i'm in the mood for some mode let's go ahead and then do the mean which is going to be equal to the average tab we're going to select all of the data i'm just going to do that with the drop down because it's in a table format making it nice and easy the dancing ants doing their average dance not that it's average their dances are all above average in terms of how well the ants actually dance how they move their hips is just astounding if you were to zoom in on it but average calculation that's what like the rain like when the rain happens but this time the ants dance made the average calculation so then we go to the standard deviation the standard deviation double clicking on that and this is going to be for the population so i'm going to select the entire thing again and now they're doing their dance they look like the same dance with those dancing ants but this is the standard deviation dance which would look totally different if you zoomed in as you well know because otherwise how would excel know if they did the same dance how would excel know to do the average calculation versus the standard deviation let's do the mode the median the median is the one in the middle so we'll start typing that in i'll select the entire data again now they're doing their median dance hitting the one in the middle i think it's got a little rocky balboa theme song in it uh when they dance to that one because because of in honor of rocky's wanting to hit the one in the middle hit the one in the middle and this is the mode the mode let's do a single mode and select the entire thing and boom so if we look at our data we can say okay the mean is pretty close to the median which is an indication that we might have something that deals with a bell type curve in terms of our data and therefore we might want to plot an actual bell curve for more predictive power about it the mode is pretty close as well but remember the mode is main is saying that one number shows up multiple times and because calories are so small in units uh then then it you know it might not be the case that even though we have something that's kind of in the has a mean in the middle and has a bell shaped curve that the numbers are exactly showing multiple numbers of the same number right because because they're pretty small units of calories but in any case the mode's fairly close as well so we're thinking the standard deviation might work you would assume just based on the information we're looking at that a normal distribution might fit due to the fact that if we're counting calories usually the calorie count's going to have to be somewhat consistent and vary from day to day if someone was going to maintain some some average of of weight you would think there should be some relationship you would think so let's do our what we're going to do now is plot this we're going to say okay i'll plot the x and the p of x and say we're going to plot this out and so i'll select this and make it black and white hometown font group bucket drop down making it black making it white and then we will center it okay so so now the question is on on the calories notice it's we had from zero see this is where the issue comes in we had from zero going all the way up to four thousand calories so you can see that's if we do it calorie by calorie that's a lot of units now it's kind of nice to put all the calories in there all the singular units because then when we do our p of x calculations we'll we'll get the thing to add up to basically a hundred and we'll have a nice smooth graph but when we do comparisons to our actual data we might want have some other method of doing it so i'm going to so i'm going to start out saying i'm just going to do it calorie by calorie but the question is how far down do i need to go so let's say we're going to take the number of deviations number of standard deviations let's say we're going to go for below and for above that means we'll have a lower x and an upper x so the lower x is going to be equal to we're going to have the mean the middle point minus the standard deviation times four standard deviation times four less than the mean and note what happens we end up with a negative number now in practice you would think you can't get a negative number it would stop at zero but in theory the bell curve goes on forever so i'm going to just keep that for now you could change it to zero but i'm going to keep it for now because when i plot my curve if i if i have four standard deviations i can get basically all of the data and it should add up to like a hundred percent right and on the upper side i'm going to say this is going to be the mean plus the standard deviation times four so there's the upper side of things okay so then we're going to have uh uh let's do this let's do it then so i'm going to start then at a negative which again it's funny weird uh weird weird so so a negative 1069 and then it's going to go to negative 1068 and then so on and so forth let's see if the system if i select those two it can count that out so then it goes down to seven so i could just count this down to five thousand it might also be useful for me to do a sequence calculation which might be a little bit easier so i can look at the difference between these two how many are there we've got we've got uh six thousand five hundred and seventeen so let's try it this way instead of me copying it down which we could do i could say this equals sequence and then i'm going to say uh the rows that we want are going to be this many rows six five one seven plus one actually no not a plus one we'll keep it right there and then comma and then the start point is going to be this is going to be the start point and let's close it up and then boom and it didn't do it k pos oh i put the start point i should have put columns so i need two commas here and then the start point so now the start point because there's just one columns all right then boom and it counts it down so if i hit control shift down i should end at five four four eight so it's at five four four seven so let's add let's i should have done the plus one i i was right the first time let's do that uh plus one and then boom and then control shift down so now it goes down to five four four eight perfecto just like moondo would do it i have a friend named moondo he's a perfectionist so whenever i do something like really well i say perfecto just like moondo would do it perfecto moondo so then i'm going to say that uh we can do our norm dot dist calculation equals norm dot dist so the x is just going to be that x again it's negative looks funny but we're going to put that in place because because that'll plot the the bell curve out for standard deviations comma the mean is going to be this mean over here i'm going to select f4 to make an absolute dollar sign before the e and the one so when i copy it down it will copy down the standard deviation the 815 f4 to make it absolute dollar sign before the e and the two comma let's make it uh not cumulative which means false or zero is what we need to put in i'll put a zero because it's easier it's faster let's percentify it before copying it down home tab number group percentify so we can recognize adding some decimals nothing there we're going to copy it down by just double clicking on that fill handle and then if i say control shift down it goes all the way down to the bottom and there's there's some activity there we go in the middle now notice a lot of the numbers are quite small because we're going we're going calorie by calorie let's add a couple more decimals i'm going to select the entire h column home tab number adding a couple more what you went the wrong way k posso percent adding a few more decimals so you can see we have pretty small numbers for any one of these uh due to the fact that when we go call it one calorie at a time we're finally getting into the positives that that if you're looking at for that exact number you're going to get a pretty small number but the benefit of going one calorie at a time is that if i select all of this data if i go control shift down and then we sum it up this will be the total and i say alt equals to sum it up then it adds almost to a hundred percent right because we're at four standard deviations on the low and the high now that double check numbers nice because that's also one of the reasons why sometimes it might be useful to plot the negative numbers even though in practice you're not going to have in real life negative numbers on this scenario just because that gives you that kind of verification number and it'll plot the graph a nice bell curve tailing off to that side as well you could remove the negative numbers if you would like to as well so i'm going to make column f a little bit thinner so then if we were to plot this i'm going to put my cursor on p of x well let's plot let's plot the calories first i'm going to put my cursor on the calories control shift down let's make a graph of that i'm going to say control backspace to go back up insert and let's go to the charts let's go to a just a histogram let's go to a histogram and there's our data so that's another indication here this is giving us our buckets of calories zero to to 370 370 to 740 and so on it looks like we're kind of in the middle here this is our average count 1,850 to 2,220 and then we had a count of in that bucket 120 so you can see this looks somewhat symmetrical somewhat like looking like a bell curve which would justify us basically doing you know this bell curve type of analysis with this kind of data so we have that let's plot this one so i'm going to say control shift down on this one and i'm going to say shift up so i don't pick up the total control backspace to get back to the top before we insert the graph insert tab charts this time we want the bar chart selecting the bar chart and boom and because there's so much data it almost looks like an area graph right because we did it so we did it cell by cell it almost looks like we entered an area but if you click in here you can see it's actually it's got these bars in there but the lines are so small because it's calorie by calorie that it looks like we entered an area graph on it so there is that looks quite bell shaped so we're going to say okay that's the shape of that one and then we can say let's compare that to our sample and this is where we're going to kind of reveal one of the issues that we have with this particular system of going one calorie at a time at a time so if we were to compare this two hour sample there's a couple different ways we can do it we can we can then take our actual data here for the calories and and then we can put those into buckets and take a percentage of the total of all the counts that we had or i can try to say let's take our percentages here and and multiply it times the count and then compare that to our calorie count so in other words let's take a look let's take this data right here and multiply it times the count of the data over here so i'm going to say let's do a count of the data this equals count and then brackets i'm going to select all of the data and so and enter so there's 457 counts that we have over here so if i multiply the p of x p of x times the count times the sample count and let's make this black white and wrap it maybe i'll double i'll put a space here and then maybe center it then then that should give us then the probability of each of these numbers coming up if we had a count of 457 based on our actual bell curve not not the actual data so we're going to multiply these percentages times the sample so if that was the case what would we expect to happen well i'm going to take this number i'm going to select f4 on the keyboard to make it absolute because i'm going to copy it all the way down times the percent which is very small here and enter now if i if i select that i'm going to add some decimals to it so let's make it quite long you can see it's a very small fraction of a number now obviously when we have an actual sample of 457 there's no way that i'm i'm going to get a fraction of a count right it can only be one or above on how many times i have a count so that's one of the issues when we try to compare this so if i double click going down you can see our numbers are quite small and the problem is in in the actual counts right we would have actual whole numbers when we when we count them so that's why we might need to kind of group these together into basically buckets if we wanted to kind of do a comparison right so if i compare that to my actual data let's compare this to the actual data actual frequency frequent frequency not sure if i spelled that right i'll check it in a second home tab font group black white let's wrap it and let's uh center it and now we'll do a frequency count which is going to put each of these into the buckets uh a bucket count so in this case you would have everything up to and including uh 68 and over and above 69 so it's going to be equal to frequency tab this is an array formula we'll select the entire data on the left and then comma and we'll pick up the x's which is our bends control shift down shift up control backspace to get back to the top so we can see what is happening here i'm going to close it up and then enter so now if i say control shift down you can see it takes it all the way down to the bottom here i'm going to try to shore up this last bit so that it doesn't uh pick up so it doesn't spill all the way over so uh let's see if i can take this up one and then enter and so now it spills down boom control shift down and it's still going all the way down to here let's try it again we're going to double click on this go all the way down to the bottom and i'm going to pull this one up to right there and then enter control shift down okay so there we have it now if i if i sum this up i say alt equals now we've got the 457 so the count is there so the issue then though is the fact that i can't get a count of course let's add some decimals to these home tab number adding a bunch of decimals right it's not going to help because i'm not going to get a count that is going to be less than one right i can only get a i can only get a count that is is going to be one or above so even though i've got these fractions based on the bell curve you're only going to get actual numbers here and so here's you could see sparsely ones that are sparsely arrayed here when we're counting only calorie by calorie because there's so many options when you're counting one cell at a time i can also compare if i wanted to compare to the percentages i can take my actual count divided by the total so i could do this equals this divided by the total control shift down which is that 457 i just want the 457 and then f4 to make it absolute and then enter and then make that a percent home tab number percentify adding some decimals and double click that down so that's something we've done in prior presentations but again it's not going to it's not going to be comparable all the time because i i'm only going to have certain items that have a count and there's going to be all of these places where i don't have a count because we're going calorie by calorie so whichever way we do it we run into that problem so this is going to be a percent of total so let's make this home tab font group black white wrap it and center it so next time what we'll do is we'll say okay let's see if how we can kind of put this this information into buckets for possibly so i can compare it to the to the actual count we'll take a look at that next time