 statistics and excel bell curve batting average comparison part number two get ready taking a deep breath holding it in for 10 seconds looking forward to a smooth soothing excel your first a word from our sponsor yeah actually we're sponsoring ourselves on this one because apparently the merchandisers they don't want to be seen with us but but that's okay whatever because our merchandise is is better than their stupid stuff anyways like our crunching numbers is my cardio product line now I'm not saying that subscribing to this channel crunching numbers with us will make you thin fit and healthy or anything however it does seem like it works for her just saying so you know subscribe hit the bell thing and buy some merchandise so you can make the world a better place by sharing your accounting instruction exercise routine if you would like a commercial free experience consider subscribing to our website at accounting instruction dot com or accounting instruction dot think of it dot com we are in excel if you don't have access to this workbook that's okay because we basically built this from a blank worksheet but we started in a prior presentation so if you're starting from a blank worksheet you may want to begin back there however if you do have access to this workbook there's six tabs currently down below the two example tabs in essence answer keys the two practice tabs having pre-formatted cells so you can get to the heart of the practice problem the two blank tabs is where we started with a blank worksheet and are continuing at this point in time quick recap of what we did last time we went online to find some resources for baseball statistics we downloaded them in a CSV or comma deliminated type of format and then we put them into our blank data so that we can sort the information populated in a table filter it pick up what we need and then copy it over to our blank tab looking at just basically the batting averages we converted the batting averages to a decimal format so that we have whole numbers with them now so now we have the 60 representing basically the 60 percent here for 2000 or 1920 and 2022 so we have the two years so that we can compare the two we did our standard data calculations the mean standard deviation median mode for 1920 and 2022 looking at the differences between the two we noted that the mean is similar to the median and mode for both years which would indicate that we could have a bell shaped curve which might help us with our comparisons so now we are constructing our bell shaped curve starting with the 1920 data we plotted from negative five which is unusual you might say why would we have an X of negative five the batting averages would need to be between one and 100 but four standard deviations goes down to negative five here and so we want to have all of the data that we could on the bell curve so we have a nice smooth bell curve and we can kind of double check that our data adds up to 100 as we did here when we did our norm dot dist calculation this set of data basically adds up to 100 percent now and that is useful to be able to see that and in part it's doable because we converted the stats which were in percent or decimal format to you know whole numbers okay so now let's also do our z-score which is a nice thing to be able to look at especially when we're comparing two different years because when we're talking about people's batting averages we might say well look at the things were a lot different in 1920 than 2022 but we can see where they were relative to the field at that point in time relative to the mean and that's going to give the z-score how far away from that middle point so I can go to the home tab font group let's make this black white let's center it and do the z-score so the z-score is going to be equal to brackets I'm going to take the x value which is a negative five in this case minus the mean the distance from the mean for 1920 data closed up the brackets and divide that by the standard deviation so there we have it and enter now I want to copy that down so I'm going to double click on it anything that's not in the current field I'm working in then I want to make absolute that's going to be these two and column J so the J2 I'll put my cursor in there F4 and the keyboard dollar sign before the J and the two put my cursor on J3 F4 on the keyboard or dollar sign before the J and the three enter back on it double click in the fill handle to bring it on down so there we have it so if we're looking at a like a a 32% batting average then that's 3.08 likelihood for that exact area and it's 1.06 on the z-score or 1.06 away from the middle point the middle point being right here at around 24 to 25 that's where the z-score hits zero because of course that is the mean for the 1920 data so then we can also take a look and and try to compare this to our actual data so what I'm going to do now is pick up our actual data to compare to the actual data we could take these percentages and multiply them times the count over here let's calculate the count the count equals count and then I'll take my data for 1920 so it's just going to count all of the data and enter so there's 629 count over here this is going to be equal to the count and we'll count this data and here we have 822 so I could multiply these percentages times the 629 and get whole and get numbers that we would think would basically represent if I had that sample using like the bell curve data or I could try to convert my data into percentages so I can compare and that's probably the better way to go here so what I'm going to do is call this the frequency so this is going to be the frequency and I'm gonna say let's make that home tab let's say actual actual data frequency and let's wrap the text home tab alignment wrap the text center it and make it black and white and so there we go that should work I'll check the spelling hopefully later and the frequency we're going to see how many times in this data set do we get a count of x so x being in this case above five and less than or equal to four or four if I was talking down here which would be more common for batting average right it would be up to one up to an including one from zero zero to one that's what the frequency is going to give us it's our buckets so I'm going to say this equals the frequency which is a spill array the data set I'm going to pick over the data which is over here and say boom there's our data and then comma and now we want the bins array which is going to be this information control shift down and control backspace there we have in it cloning close up the table and enter and it spills it on down so there we have our accounts it's going a little far here so I'd like to remove that last one or actually it has a number in it so maybe I should make this go down one more go down to 55 just so we can include that last number and so okay so there we have it and then I'm going to total this and and then sum it up over here so if I sum this up by the way I should get alt equal that should be a hundred or one home tab no numbers percentify and this one over here if I say alt equals we're going to get to the 629 which should be our account right there's the 629 that's a check a double check that we've kind of got all of our information now instead of multiplying these percentages times 629 so I can compare to the frequency I'm going to take the percent percent of total the percent of the total so that we can convert this into percentages which I can compare to these numbers then on the actual data so actual let's say actual actual percent of total home tab alignment center rapid black white and then we'll put our cursor here this is going to be equal to this divided by I'm going to put my cursor in the data so I can say control shift down taking me to the bottom picking up the 629 I'm going to say f4 on the keyboard to make it absolute so there we have it so we're going to take each one of these divided by the total enter and let's make that into a percent home tab number percentify add some decimals putting our cursor on the fill handle double clicking it taking it on down so now we've got our percentage use comparisons so this compared to this now this compared to this this is our bell curve this is the actual percent of total data let's look at our differences then we could say that well it's the difference between the actual and our percent data home tab no font group black white alignment center and let's say this is going to be equal to the p of x minus the actual percent of total making that a percent home tab number percentify adding some decimals double-click in the fill handle to take it on down so we can see the differences there so it looks like it lines up pretty good pretty good now we could actually plot this information we can plot these two basically on top of each other to see how closely they line up the bell curve and the actual percent of total so we'll pick up control shift down and then shift up so I don't pick up the total control backspace so I get back to the top and then insert charts we're gonna put a bar graph and boom so there's our actual data looks very bell ish bellish shaped and let's say this is for 1920 comparison and then I'm gonna pick up the actual in data so let's put our cursor here let's go to the chart and let's add the data I need to first pick up my proper X's so these X's aren't right I need to pick them up over here so I'm gonna edit and then select this one and go into our X's like we do every time control shift down and shift back so I don't pick up the total enter okay so it looks like it's got the correct X's so I'm gonna say okay got the correct X's because we went to Texas and found all of my X's that's a song sorry that's a song I got distracted all of my X's are in Texas anyway we're gonna I don't know why that song just would you would you delete that stew Phil would you delete that cuz that shouldn't be in there select that's my editor we're gonna then go into here and and we're gonna say now we'll then take this and say we want to select the data and we want to say that we have add let's add the next data set which is gonna be named the actual percent of total and then we'll take our column over here and pick that one up and say percent boom looks good okay and okay so now we've got the two on the same so you can see that that let's add a legend let's add a legend I thought it was gonna put a picture of me right in there cuz I'm a legend I thought it was just gonna put my portrait on the side whatever anyways you can see that the that the actual data is pretty close you know so you could see to the to the bell shaped curve which means that the bell shaped curve should give us some predictive power over what we're doing here so let's pull that on down now we could of course make an area type of graph as well but let's what I want to do now is do the same kind of thing for for the next year let's put let's put something above this so that we can show that it's 1920 data so I'm gonna select all of this information and right click and then insert and shift it down shift it down okay and then this is gonna be 1920 and let's make that number format and let's just make it a general and then I'm gonna select that and I'm gonna I'm gonna copy it across I'm not gonna some people might merge I don't like to merge so I'd rather right click and then format the cells so and then go into the alignment and then I want it to be horizontal center across the selection so that'll put it in the middle without combining so they're still separate cells but it's like in the middle and then I can go into the home tab and let's make it a different color like a blue maybe let's make it blue and white so that's the 2020 stuff all right let's now do the stuff for for 2022 so I might be able to hide some data first let's make a skinny T it's like mr. T got skinny because he stopped working out it's a skinny T and then we're gonna right click and then hide no one knows who mr. T is anymore whatever everybody knows mr. T man he's on the A team and so this one is blank because I hid the data all right let's do it again up top and then let's do this for 2022 so now we're on I'll put this up top 2022 formatting that home tab number let's make it general format I'll make this one green let's say so I'm gonna hit the home font group let's make it dark green and then white okay so we have the same starting point we have X's we've got the P of X P of X and then we're gonna have let's select those two I should have kept it unhidden so I can black white I'm gonna make the X a little smaller we should probably center it alignment center and then I should probably just copy the X's from what we did before but I'll recreate them well actually let's unhide and copy the exact same X's so I'm gonna I'm gonna unhide putting my cursor from L to U Lou and then right click and unhide and let's just pick up the exact same X's to make sure that we have them lined up this equals this five putting my cursor on it double clicking the fill handle it won't let me do that so I got to drag it down for crying out loud Phil why is the handle not working why is Phil's handle not working so I'm gonna scroll up we don't need the decimals decimals would you number group kindly depart so that we can get down to business over here then I'm gonna hide from M to S so miss is gonna we're gonna hide miss and then hide she's distracting people the miss okay so now we're gonna say this is gonna be equal to norm dot dist and so now we're gonna pick up the X which is now 2022 data so so I'm sorry it's gonna be that X the five comma the mean is that 22.2 for the 2022 data I want to F for it because I want to copy it down dollar sign before the K and the two so that 22.2 don't move down comma standard D that's gonna be the 6.12 F4 in other words make it absolute dollar sign before the K and the three comma cumulative no false or zero closing it up enter putting my cursor on it percentifying the cell home tab number group percentify so we can recognize and then we're gonna put our cursor on the fill handle and double-click it on down there we have it and now it adds up to 100% if I go to the total down here and we say alt equal 1 or 100 home tab number percentify decimal eyes and recognize all right so then we're gonna say the Z we can do our Z score which is important for our comparison purposes home tab font group black white let's center it let's make it a little smaller and the Z is going to be equal to brackets the five minus the X minus the mean 22.2 in this case close the brackets dividing by the standard D standard deviation 6.14 there's our Z I want to copy it down so anything that's outside of my data over here that's outside of the table I'm working in I need to make absolute that's anything in column K so that's gonna be this K2 F4 dollar sign before the K and the 2 K3 F4 dollar sign before the K and the three enter putting my cursor on the Z score double-click in it down and we can see that it gets to that zero point around 22 and 23 because that's the mean and that makes sense so we can compare the Zs for years 2022 and 1920 now let's do the actual frequency you're a frequency frequency I don't know font group black white did you just call me a freak I would know it's the freak the frequency I said I said frequency we're gonna say this is gonna be the frequency which is which is how many times it's gonna fall into the buckets for example this one being above zero up to and including one so we'll do our frequency equals frequency tab data array it's gonna be over here control shift down control backspace to get back up comma and then the bends array which are our X's which all live in Texas we try to stay out of Texas therefore I because you there's that's where the X's are and so it's not a pleasant place a lot of times that's a song okay get the song out of your head no song references no one even knows what you're talking about man no one even knows what you're talking about okay sorry but this cut there's still two down here so let's actually pull this down and bring this down to one more I'll pull these two down so that we get that last two in there and this should be 56 and then there we have it so now I can total this up this is gonna be alt equals alt equals 822 so that 822 should be the total to that so that looks good that's our check number now and now what I'd like to do I could multiply all these numbers by the 822 so I can compare it to the frequency but I'd rather make these into percentages so now I'm gonna say this is going to be the actual actual percent of total and then we're gonna go home tab font group black white wrap it center it and then we'll say this is gonna be equal to that zero divided by I'm gonna go into my data so I can go control shift down just so I can find that bottom number f4 on the keyboard dollar sign before the X and the 65 so I can copy it down not having the total move down when I do so percent to find the cell before we copy it down home tab number group percentify to recognize add some decimals fill handle double-click in it and fill fill does the calculations and there we have it so I can say alt equals that should be a hundred percent right there home tab number group percentify to recognize and so there we have now I can say the difference the difference between the actual data and the P of X data home tab font black white center is equal to P of X stuff minus actual percent of total percentify home tab number group percentify decimals double-click and fill handle to bring it on down there we have it all right so let's let's put a I have a 2000 I thought I put a 2000 oh there it is let's center this 2022 thing now selecting this whole bit up top right clicking formatting the cells and I will do this by going to the alignment group horizontal center across selection and okay you can't see it because it's in the middle now and it's white there so home tab font group bucket drop down make it green and there we have it the hiccups okay so that looks good so so now I could say that that we we do a P of X calculation so I could say well let's look what the P of X looks like control shift down and shift up to not pick up the total control backspace to get back to the top and then I'm gonna say insert charts let's put this in place with a bar chart boom and so there's our our actual data this is for 2022 comparison let's say and then I'm gonna put my actual data on top of it well first let's fix the X's the X's down here they need to be fixed something needs to be done about these X's we're gonna go to the chart design we're gonna go to select and then edit this on the left and then add our X's I'll be selecting my X's okay I'll be so you don't just give me random X's I'm the one that picks them so there they are and then we're gonna say okay and then now let's put the actual data on top of it to compare it so we're gonna go to the chart design data select and then we're gonna add more data and I'm gonna call it actual percent of total and then we're gonna say boom and let's pick up our actual data and okay okay and so now you can see this one's a little bit kind of weirder because well let's add the legend hit the plus button we're gonna add a legend and it's gonna show a picture of me oh man I'm I thought when you look it up in the dictionary legend then I thought there's a picture of me when that when you do that but no they put the legend here so then now this one you know it's a little bit more walkie but it still kind of conforms to so you think the bell curve would still give us some predictive power so we're gonna say alright let's put that down here in our collection of pictures I'll put this over here I have a lot of pictures on your phone of your family and stuff no they're of charts their chart pictures they're really important to me I have to keep them safe over there okay so now we're gonna say that we want the difference so we can call this let's put it up top differences let's say let's make that orange home tab font group drop down making it let's say dark orange or something and white and then let's put the differences for the X's and in the P of X's maybe and then we can say the differences between the actual percent of total that we can do differences on selecting these items home tab font group making it black white centered wrapping it I need to unhide my 2019-20 data which is both between L and T putting my cursor on the drop-down or on the column L to you Lou we're going to Lou and right click skip to the Lou my darling and we want to unhide where's your darling she's skipped off to the Lou skip into the Lou my darling that's a song nobody knows what you're talking about that song is ancient for crying out no need what's a Lou and okay whatever skip into the Lou man so let's just do this actually let's just pull over the X's so the X's are just gonna be this and then I'm gonna copy it down and we're gonna copy it down to here and then let's go to the home tab numbers and remove some of the decimals okay we can make this a little bit smaller possibly alright and then the P of X's I'm gonna say is equal to the difference between the 1920 P of X data so this minus then this data so you can say boom there's that and I can make it a percent possibly home tab number percentify adding some decimals double clicking to take it on down so now we're looking at the difference between 1920 and and 2022 I could do that for let's do that let's insert a cell here right click insert for the Z's as well so for the Z's I could say this equals the Z here minus the Z there so and then I don't want to make that a percent let's format this one home tab font group format paint it and so there's our difference and then I'll just double click to bring that on down we can take the differences in the actual percent of the total so this equals the actual data for to 1920 minus 2022 boom making that into a percent home tab number percent adding some decimals double clicking it on down so there's some some comparison data and then of course we could make some other graphs and try to graph these things on top of each other if we wanted to such as the 1920 P of X and the 2022 just to see them overlapping each other to see what that would look like so let's take the P of X here I'm going to say control shift down and then shift up and control backspace back to the top insert charts and let's insert a bar chart so now we have now we have P of X 2000 or 1920 verse P of X 2022 something like that and then I'll pull this on over here and let's also pull in the 2022 I need the X's so I'm going to pick up the X's too so we're going to say don't pick up the X's it's not a good idea it's not a good we have to we have to do it we're going to go in here we need our own X's it's better than just having random X's be given say that we pick them up or we get random X's okay so there we have it and then so now we're going to select the data and go into chart design select the data and I can put on top of it I'm going to add another data set and say this is going to be the second P of X or let's call it this one let's just name it P of X 2022 and then I'm going to pick this data is going to be this range P of X 2022 boom and okay so now you can see the two if we if we approximate a bell curve on the two you can see they're slightly you know this this they're slightly skewed the shape of the bell curve let's let's add a legend here legend hold on I thought I changed the name this orange one oh my data so this one let's edit this and this is 1920 so let's just call it 1920 okay so the legend and then this one let's call it edit let's make that one 2022 and okay okay so now we can see that the shapes of these on top of each other and see that it's kind of skewed and then this one obviously more peaked in 2022 and we can see the differences over here in our mean is 24 versus 22 so those middle points you would think would be 24 versus the 22 so we can see kind of the the relative data here now when we look at the z scores remember we're thinking about the z scores are relative to the middle point of of their of their distribution which we have these two different kind of bell curves based on the two different time frames so we have that and then we could do a comparison of like the percent of of the total versus the percent of the total over here but let's let's keep it at that for now and and just note then when you do some of these comparisons then sometimes the z score becomes relevant because when you're trying to compare one year versus another year you might be saying well you know if I if I if I looked at something down here and we had we had a player that had like a 30 a 30% right a batting average in 1920 and we looked at their z score which is this column and so it's at it's at 0.79 up above that average point and if I go over here the 30% if we look at the z score is 1.27 above and so if I just look at those two numbers you can say well they're both at that 30 but but if I compare it to what was happening relative this one over here in 2022 is a lot higher up you know it's so it's further away from that middle point the average of the times so you would expect you know that would that would be a differentiating factor possibly even though you're looking at the same percentages remember that the general idea of the statistics oftentimes is that I can't look at just how many at bats they have or how many times they got on base because they're going to have different at bats that's why we make the percent so so the percent already kind of evens things out a bit but it's still not going to be give us we might want to have more questions about who's the better player and what time frame and whatnot and the argument of course is going to be yeah but and the things were a lot different and in 1920 than 2022 so let's see how well that percentage was compared to the the middle point the average the z score right and that would be another another factor that you can kind of put into play let's go ahead and add some formatting here so I'm going to select all of my data here up top I'm just going to make it blue home tab font group bucket if you don't have that blue it's in the more colors there it is the blue and I like to put some borders around it if I may I'm going to do that here too you may no one's going to stop you home tab font group border blue border blue at least I don't think anyone's going to stop you I won't stop you you can make it a different color if you want I won't stop you from doing that either it's a free it's a free country man it's a free country well at least it used to be border blue I wouldn't be surprised if someone stops you now the king of England probably come in here and try to tell you what color you should make your chart home tab font group border blue and then we'll say this one home tab font group border blue and then this one uno vase moss one more time control shift down home tab font group border blue I'll make my table however I want to make my table it has only two legs on it and falls over every time I put something on it that's how I want it that's how I made it that's how it's going to be okay that's how it's going to be this is going to be center across let's make it home tab font group and orange and let's pull our graphs over here so we have all our pictures so we can see them we need to have our it's not enough just to have your pictures your special pictures you have to have them ready so that people can look at them you can show them off when at like parties and stuff so then so there they are let's go into the review and I'm not oh now it's gonna I can't spell check it because it has all the names let's try to highlight all of this stuff and just spell check it over here see if I misspelled fixing my spelling spell check frequency I knew it that wasn't actual okay we can make some skinnies let's these can be skinnier mr. T you're still not skinny enough I know you lost some weight but still all right that looks pretty good looks pretty good