 statistics and excel bell curve test score example part number one get ready taking a deep breath holding it in for 10 seconds looking forward to a smooth soothing excel first a word from our sponsor yeah actually we're sponsoring ourselves on this one because apparently the merchandisers they don't want to be seen with us but but that's okay whatever because our merchandise is is better than their stupid stuff anyways like our CPA six pack shirts a must have for any pool or beach time mixing money with muscle always sure to attract attention yeah even if you're not a CPA you need this shirt so you can like pull in that iconic CPA six pack stomach muscle vibe man you know that CPA six pack everyone envisions in their mind when they think CPA yeah as a CPA I actually and unusually don't have tremendous abs however I was blessed with a whole lot of belly hair yeah allowing me to sculpt the hair into a nice CPA six pack like shape which is highly attractive yeah maybe the shirt will help you generate some belly hair too and if it does make sure to let me know maybe I'll try wearing it on my head and yes I know six pack isn't spelled right but three letters is more efficient than four so I trimmed it down a bit okay it's an improvement if you would like a commercial free experience consider subscribing to our website at accounting instruction dot com or accounting instruction dot think of it dot com here we are in excel if you don't have access to this workbook that's okay because we'll basically build this from a blank worksheet but if you do have access there's three tabs down below example practice blank example in essence answer key practice tab having pre-formatted cells so you can get to the heart of the practice problem blank tab blank worksheet so we can practice formatting the cells within excel as we work through the practice problem let's go to the example tab to get an idea of where we will be going working with the bell curve or normal distribution starting out with an example familiar to both students and instructors that of course being grades we'll actually use a random generator tool to create our data great tool to be familiar with as you practice with a bell curve for example doing some calculations we'll create a histogram seeing if the characteristics of our data might conform to those of a bell curve if so which of course we hope it will then we will then plot the actual information to create a bell curve and then create a bell curve and we'll get a little fancy with the characteristics of the bell curve in a following presentation all right let's go into the blank tab to get down to business I'm going to start by formatting the entire worksheet as we always do select in the triangle up top right clicking on the selected area formatting the cells I usually go to currency and then negative numbers bracketed and read with no dollar sign and removing the decimals I'm going to say okay I'm going to put some standard data here that I'm going to use to then create our random data so this is going to be the test scores I'm going to say that the mean is 75 so we're thinking about a test situation notice I'm going to be writing this not in percent so I'm just going to be putting a 75 instead of 0.75 or 75 percent so we're talking the scores can be going from you know 0 to 100 representing 0 to 100 not in percent or decimal format this is going to be the standard deviation which you can represent with a sigma or I'm just might be putting sd to represent it in excel because it's easier and faster to type and then I'm just going to say that it's going to be 10 I'm going to use this data to then populate my randomly generated data now note in practice of course we wouldn't have all this information if you were the instructor you would simply be taking the test scores plotting them out and then finding based on that information what the mean is what the standard deviation is however if we're generating the random numbers we can use excel to give us that information having an element of randomization in it but we have to give them some characteristics such as a mean and standard deviation I'm going to select the entire worksheet and bold this you don't need to bold it yourself but I like to have it bold for recording purposes because you have to be bold when you're recording when you're on camera you need to be you need to be bold that's what I'm told by the producers and my editor here so in any case we're going to then say that we're going to take our data and we're going to add our data now if you don't have this analysis tool then you're going to need that to generate the random data so to do that you can go to the file tab on the left you can then go down to the options at the bottom and then within the options we want to have the add-ins and then down here at the bottom we've got the add-ins if you hit the drop down we want the excel add-ins and then say go and then you want to check off that you have the analysis tool pack and once you have the analysis tool pack in your tool belt it's not that heavy or anything uh to when you have it in your tool belt you know so you can have it there then it'll be in the data tab analysis and now you've got your data analysis so I'm going to be putting it right here so I'm going to say this is going to be my random data now it's not totally random because it's going to be in accordance with the bell curve right but we're going to say let's let's format this home tab let's go to the bucket black and white this is my normal uh header formatting I'm going to wrap it home tab alignment wrap it wrapping it random data that was my wrap of it that wasn't a very good wrap didn't rhyme or anything I'm going to go to the data analysis up top and then let's go down to random number generation random number generation all right so the number of of variables I'm just going to put one that's like the columns we only need one number of random numbers so how many numbers do we want let's say like let's say like 500 we have 500 numbers actually let's just make it larger just to let's make it a thousand that'd be a good lot data a lot of data there so we're going to say that the distribution this is the key we want it in accordance with uh the normal distribution so normal distribution and then when you do that it gives you the parameters that are necessary we need the mean or the average now again in real life you wouldn't have that because you're going to generate the data looking at the test scores in order to get the average but here we have it because we need those parameters that doesn't remove the randomness but it's a randomness that's controlled randomness so standard deviation is going to be the 10 that's going to be giving us the spread right all right so that's going to give us our data the random seed I don't need anything there the output is I'm going to put it right here so this is going to be the location of the output I want it to start right there so I want it in this worksheet right here on D2 so D2 just like R2 D2 and R2 D2 the he's from Star Wars gave us the numbers so there we have there's our randomly uh generated numbers and that's great so now let's run some calculations so we're imagining if this was real life this is all we would have this random number generation from rent from past test scores we probably wouldn't I'm going to make this smaller we probably wouldn't have a thousand test scores but we might if we've been doing if we've been teaching for a long time we've had a lot of test scores but uh we're gonna we're we're gonna take this data and now do our calculations on it so that would normally be that we would want the mean or the average so let's take that I'm going to then say this is going to be equal to the average we've seen these so I'm going to do these fairly quickly this is a average and then I'm going to hold down control shift I'm holding control shift and down to take me right down to the bottom of the data set and enter and boom double clicking on it there is our formula there is our answer it's 75 percent what do you know that's what we put in the random data thing so home tab that's not how it is these days because they inflate the scores man it's not really 70 so it's so if I add some decimals it's 74 92 a b is a c these days because it's it should be around 70 in any case that should be the average 70 is average right not name not no they totally inflate it now that's what I hear at least I don't know I'm out of it these days standard deviation standard deviation but you can test that stuff out with your statistics right if you have the data which they probably hide the data but standard deviation is going to be standard deviation of the population I'm going to say I'm going to say control shift down and enter and so there we have 10 I'm going to add some decimals home tab number group adding a couple decimals there now I'm also going to add the median so the median that's the middle number so the average is adding them all up dividing by the number of them which was 1000 this one is going to take the one in the middle just like Rocky's coach told him when Rocky saw three of Drago's out there which the Russian guy he said hit the one in the middle which was good advice so we'll do so we should keep that in mind so we'll take control shift down and enter and so that's 75 so we're going to say home tab number group and adding decimals here now the fact that these two are similar in our actual data is one indication that it might actually follow a bell curve system notice that not all data follows a bell curve but a lot of data you would kind of assume would if you're talking about errors for example that's the classic example of when bell curves came about is when they were trying to make predictions people never got it exactly right but they had errors like where the stars are going to show up in the sky or whatever like that you know you can have errors that hopefully will average out to the actual answer and so you would think same with test scores you would think height a lot of things in nature you would think but a lot of stuff doesn't conform to the bell curve so it's just like the other curves we talked about in the past we want to say does it conform to a bell curve type of system and if it is something that you would think would conform to a bell curve like grades or something and it doesn't then again that's a question to say well what is going on here because it's something because you would think it would and then you and then you dig into why maybe it's not okay so let's make column h smaller and then i'm going to make a histogram this is another indication of if it would be a bell curve i'll select the the name up top control shift down all the data now i'm going to hold control backspace to get back up to the top without unselecting the data and then insert tab we're going to go into the charts so i'm going to just make a standard good old histogram there it is histogram of the data so boom so now this of course giving us our buckets down below i let uh excel just choose the buckets so it chose these buckets and maybe i can format this data if i format this data and i say let's go to formatting and say that we want to have it to be let's say currency negative numbers bracketed two decimals and so now we've got some brackets uh some data that's not so chaotic with so many decimals down here so these you'll recall are the ranges that it's going to be putting in and then it's counting the number of occurrences within the thousand that fall into those those ranges and this middle point of course where the most occurrences happen is around that 75 so if we put the focal point here it'd be around the middle notice this doesn't look exactly like a bell because we only have a thousand a thousand data points but it approximates a bell and if i drew a smooth bell curve on it the question is would that bell curve give us predictive power into the future and the assumption here would be yes uh because because it looks somewhat bell shaped right so so now that's what we'll do now i'm going to pull this on over to the right and say okay i think the bell curve is applicable in this situation would be the would be our determination and so now we're going to we're going to actually plot the smooth bell curve which will be more exact just a smooth curve and notice that the the calculation of the bell curve the formula is a fairly complex formula but the point of the bell curve usually isn't to try to understand the formula exactly although that's a good exercise it's it's to it's to say well how can i apply whoever made the formula gave us a great gift right because now we're able to apply that to these to these situations using tools such as excel to approximate our data so i'm going to say this is going to be x i'm going to say this will be p of x and so let's start with those two we'll plot our data out i'm going to go to the home tab font group let's make it black and white i'm going to center it alignment center let's make this one a bit smaller all right so just to get an idea of uh of what we can do now uh we can let we're going to we're going to say what should our x's be right we could have a lower limit and an upper limit basically on the x's now we could just have x's go from zero on up to 100 or we might try to say hey look i'm just going to go uh four dvh standard deviations out and that's usually enough to capture all of the data so the way the bell curve basically is going to work here of course is that most of the data is going to be in one standard deviation and then within two standard deviations a large part of the data is going to be there and if you get over three standard deviations very small amount of the data would be outside of that range and then so four standard deviations again would basically encompass almost everything right so if i'm trying to say what should my x's be what should be the bottom part and the top part so i can plot my graph if i choose four standard deviations i'm going to be picking up most of the data so let me show you what i mean here we're going to say if i took uh a a let's call this lower x and then the upper x for our chart so these are the chart x uh areas the lower and upper so i can say all right this this lower amount is going to be equal to the mean which is the middle point that's where that's going to be the top the tallest part of the bell curve and then i'm going to say minus because it's the lower point minus this 10.09 which measures the spread and i want to go uh four standard deviations lower so i'm going to say times four so i'm let's pull this down a bit i'm going to pull this down and i'm going to say standard deviation deviations and let's say four like that so i'll do it this way so this minus that and then i'll say times four and i can point to it there and then maybe i should add some decimals okay and then the upper the upper limit is going to be equal to the mean the middle point plus four standard deviations one standard deviation is 10.09 part of the spread times four we want four of them and that'll give us uh 115 so we're going to say then the range is going to be from 34 up uh to to 115 now obviously again it might cap at 100 so the 115 when you think about a bell curve in general remember that the tails of the bell curve can go out forever in a theoretical bell curve obviously from a practical real world example there might be an upper or lower limit in our case the lower limit generally being zero and the upper limit uh 100 but if i was to capture four standard deviations this is the range that we can pick so if i was to say i'm going to try to plot this thing then instead of starting at at zero i could possibly start at 34 about and that would pick up pretty much all of the data right so i could say let's go from uh and now i could do this with like well let's do it just this way i'm going to say 34 and then 35 i'm going to select those two and i'm just going to bring it on down till i get to 116 right so that should capture all of the data and you can see the number format i can go till i get down to 116 right there that's when it happens and that should capture all of our all of the data the primary part of the data so then i'm going to say all right and then now let's let's do uh let's actually just plot it so if i go over here this is going to be our norm uh dot dist function for each of these x values so i'm going to say this equals norm dot dist so norm dot dist this is going to be our major function here's the arguments that we need to to input in order to get the result for it so we're going to say norm dot dist has an x value of 34 now i'm not going to do an array right now i'm going to do it basically without an array formatting and then comma the mean is going to be this mean 74 i'm not going to use the mean that we used when we first started it's close but not exactly the same as the mean of our actual data i'll use the mean of the actual data and then i'm going to select f4 making it absolute dollar sign before the g and the two comma standard deviation the spread is going to be that 10 uh 0.09 f4 so that when i copy it down those two cells will not move down comma and then this is whether we want it to be cumulative or not which is similar to the arguments if you saw the price if you saw our presentations on on the poisson distribution and and those other distributions they had some of them have this similar kind of argument but it might be a little bit different when you're talking about the normal distribution because you're talking about the area under the curve when when you're talking about the cumulative meaning there's kind of calculus involved because you're talking or possibly because you're talking about the area under the curve right so integral so but so but conceptually similar kind of concept to it but we want to have it uh as of a certain point so i'm going to say zero to make it false or you can type in false or you can put in zero zero is easier to type so i'm going to close it up and spell and then and then i'll let's make that a percent home tab number group i'm going to percentify this add some decimals and then i'll double click on the fill handle and that'll copy it down so i can copy it down and so we can we can for example uh look at the and by the way if i select this entire thing control shift down oh what what did you do excel what did you do okay i hit the right error control shift down and i look at all my data it adds up to 100 percent right also remember that in order to get this at the 100 percent that we represented our data on the left which represents test scores not in the format of decimals or percentages in this case but as whole numbers so if you have the information of test scores in the format of percentages or decimals sometimes it might be easier to multiply at times of 100 representing the data as basically whole numbers so that when you get your percentages over here then it'll basically add up to the 100 percent so now we can ask questions such as or this calculation for example means that what would be the likelihood for example that we would get a 64 percent we're at the 2.2 percent likelihood that's not the question of what's the likelihood that i get 64 or under that would be one of the cumulative types of questions so then we can of course ask questions what's the likelihood that i get 64 percent or above for example and we'll do some of those calculations in a little bit here but for now let's continue plotting out our graph now you could just take these percentages and plot the graph out this way so i can select my item up top i'm going to hit control shift down and then i'm going to say control backspace taking me back to the top so i can insert another histogram so i can go into or let's let's do a actual chart this time so i can say this is going to be charts and this is going to be a bar chart not a histogram so the bar chart and of course you get this nice smooth bell curve looking thing because of course we did this with our our actual formulas and functions so this is going to be the p of x i'll leave that there and it's graphing now the percent the percent likelihoods but we need to fix that bottom bit so i'm going to select the data up top i'm going to go into the select data and on the i'm going to edit this side and we want to pick up our numbers which are starting at 34 not one that's why it's i'm that's why it's messed up here so we're going to put our cursor on 34 control shift down and and then i'm going to say okay now you got to be kind of careful making sure it picks it up over here because sometimes excel gets a little little finicky over here so if this just has one number then something got messed up when you did that and you have to do it again but i'm going to say okay and so now if i scroll up top so now we've got this nice bell curve and that middle point is around you know the 75 at that middle point nice and smooth okay and so then we can also add an area uh type of bell curve i'm going to pull this to the side we will get back to it soon because we could then think i would like to compare is there a way to compare my data to the actual data the actual data versus the bell curve so we'll think about that in a little bit but first note that you can also plot this with an area so i'm going to select this item again control shift down and then i'm going to hit control backspace and this time we're going to go into the insert tab and we're going to go to the charts and graphs and if you select i believe this one then down here you've got your areas down below so we'll pick this is the area and so i'll pick that one and so now you've got a graph uh that gives you that the area graph and this is the one uh that we can we can work with because oftentimes when you're thinking about the bell curve you're trying to get the area under the curve right that's going to be part of our calculations because that's going to give us our our probability so let's once again i got to fix this this bottom bit so i'm going to go up top and go to the select data i'm going to go to the edit over here and select this one again and then pick our x's holding control shift down usually like you could hit the control backspace but again sometimes that messes it up so i just like to go okay and then okay and see if it picks it up and be awfully careful with excel when we do that part because again excel gets a little wonky sometimes right there so there we there we have it so we'll get a little bit more fancy on this one and say can we uh can we plot questions like if it's over a certain amount or under a certain amount and can we get the z score on there so we'll do that uh in future presentations but for now let's put this at the bottom of our stack of charts cool charts that we've been making so notice that if you just want to see the shape of it the bar chart works well the area chart is going to give us that area however uh which is what we're usually thinking in when we're thinking of the of the normal distribution now we could compare this to our actual data so let's do our actual and then let's put our frequency frequency that's totally not spelled right is it there's no way i'll spell check it oh they say it is i don't know i'll take their word for it i still have my doubt font group let's go black white rapid center it i'll make it a little bit larger and now i'm going to do a frequency of our actual data to see how many times out of the thousand test scores we have that we get to each of these x's so this is going to count our actual data in accordance with these x's so i'm going to say this equals uh the frequency tab and then i mean this is an array formula so fancy array formula i'm going to pick my data over here control shift down all of the data control backspace up to the top then i need to pick my x arrays so i'm going to say comma and then put my cursor on the x arrays control shift down control backspace taking me back up so this is saying all right excel i would like you to find all of this data and see and and put it into the groups our buckets with these numbers being in essence the top part of the bucket right so it's going to take everything below 34 up to and including 34 and then everything from 34 uh or above 34 up to and including 35 and so on hopefully i got the cutoffs right there so if i scroll down there it is now i picked up this last bit down there which i don't i don't want it to hang out that far so i'm going to change this to three and so there we have it and so then if i put my totals down here i can say alt enter this should add up to 100 percent let's percentify at home tab number percentify and then alt enter for the sum function this should add up to a thousand because that's how many we told excel to count that's how many sample test scores that we had in our data set now if i want to compare this i can't compare this to the p of x directly i could make a histogram from this right i could make a histogram and i'll come up to a similar kind of histogram but i what i'd like to do is say well how can i get the percentages so i can either i can either make this into percentages or i can make the percentages into a frequency by multiplying the percentages times a thousand now normally it'd be easier to say percent of total to make to make our data into a percent so i'm going to say black white center so i'm going to take every number divided by the total this equals this number divided by control shift down i just want that thousand so enter double clicking on it i'm going to make this second number absolute f4 so that that bottom number doesn't move down as i copy the formula down i'm going to make it a percent before copying it down home tab number group percentifying it couple decimals fill handle double clicking it copying it down so now i have a set of data similar to this set of data so this is my actual percent of the total and this would be my predicted you know percents of whatever total i'd be using and so then i can i can take my difference and i can say okay what's the difference between my actual data and this is the perfect bell curve minus my actual percent data percentifying this home tab number percentify add some decimals and then copy it on down so now we can see our differences here so when we're thinking about our data then i'll stop it here we'll continue on with this next time but the general idea is we can take our actual data set in real life in practice you can actually build your data set but you would need the mean and the standard deviation to do so in practice you might not know those numbers of course and you would take your data and then calculate those numbers the mean the standard deviation if the median is similar to the mean it's likely it might follow a normal distribution so you might then plot a normal distribution in this format being then able to create our graphs and then we can also look at our actual data and compare it to the normal distribution at each point which could also give us an indication as to how close the actual data mirrors a normal distribution we'll continue on with our graphs in a future presentation