 statistics and excel deck of cards statistics and excel got data let's get stuck into it with statistics and excel or actually one note here but we'll still talk about excel you're first a word from our sponsor yeah actually we're sponsoring ourselves on this one because apparently the merchandisers they don't want to be seen with us but but that's okay whatever because our merchandise is better than their stupid stuff anyways like our trust me I'm an accountant product line yeah it's paramount that you let people know that you're an accountant because apparently we're among the only ones equipped with the number crunching skills to answer society's current deep complex and nuanced questions if you would like a commercial free experience consider subscribing to our website at accounting instruction.com or accounting instruction.thinkific.com not required to but if you have access to one note we're in the icon left-hand side one note presentation 1325 deck of cards statistics and excel tab we're also uploading transcripts to one note so that you could use the immersive reader tool changing the language if you so choose and then being able to either read the transcript or listen to this transcript in multiple different languages using the timestamps to tie into the video presentations one note desktop version here remembering the two major categories of statistical problems we are working with number one where we have all the data of the population we then using our statistical tools to organize that data in such a way that we can extract meaning from it number two where we don't have all the data of the entire population but maybe we can get a sample of the population applying statistical tools to that sample in a similar way as we would apply statistical tools to the entire population not because we're interested in the sample itself but hoping that we can take the information from the sample and it will tell us something about the entire population now in prior presentations we looked at examples such as heights where we took an entire population and then took a sample from that population so that we can test out whether or not the sample does give us information that we can possibly apply to the entire population we thought about a more theoretical situation of a coin flip where the entire population is basically infinity infinite number of coin flips and the sample then would be however many times we flip a coin in order to test that out now we want to take a look at a deck of cards because when we look at a coin flip for example you still have only those two options 50 50 the deck of cards adds a little bit more complexity because now we have 52 cards in a deck so we want to be thinking about the statistical implications how we might organize the data as well as how we might use our tools of excel to help us out with our analysis when we're working with something like a deck of cards so if we think about a deck of cards we have numbers of the cards and we have the suits of the cards that we're going to have to be dealing with now with the numbers of the cards it goes from ace or one two three four five six up to ten and then it doesn't keep on going up to 13 but it kind of does because you've got the jack queen and king so we could assign a number to all the cards we can say well they're basically numbered from one to 13 and notice if you'd play different games you might say well some games say that all the face cards count for ten or something like that those are particular games but if we just simply assign a number to the cards we can do that and say well there's one to 13 that can be a useful tool for working in excel or even for memorization of certain things and then we have the suits I'm not putting the suits in any kind of order of importance the suits might have different implications for different games that are being played but the suits we have the four suits states hearts diamonds and clubs so if we take each of the cards then such as an ace we could have four aces the ace of spades ace of hearts ace of diamonds ace of clubs therefore we have four aces of different suits we have four twos of different suits we have four threes of different suits and so on and so forth and we've got the four jacks of different suits for queens and four kings so if we add all this up we can put this into a table to try to analyze this right we could say okay well I if I number the cards one to one to 13 and then each of those cards have four of four of each of them that's gonna give us four of each card or 52 so you can add these up both ways nicely in a table right so now you've got one two three four I can sum up this column to get to 52 or I can say how many of each suit are in a deck how many spades how many hearts how many diamonds how many clubs 13 of each 13 plus 13 plus 13 plus 13 is going to be 52 so that's nice that's useful for us to kind of organize this in our minds and on the page but now I still have this kind of issue of if I want to set up some kind of statistical problem because I don't have one numerical value for each card right I have to I have to say two things in order to name one card I've got to say well this is an ace of spades or an ace of hearts or an ace of diamonds I don't have one number that can stand for the card so that's going to be kind of an issue for us that will take a look at in a second now if I just run some standard statistical kind of analysis I can say well one card out of 52 what if I had a deck of 52 cards it was an even deck of 52 cards and I draw one card out of 52 well then then the chances of it being any one of those numbers if it was chances of it being an an ace of hearts or whatever is one out of 52 right or point one point nine two if I move the decimal about over and I can say okay well what what are the odds of any suit I think that's how you spell suit right or is that like a business suit I don't know 13 out of 52 because there's 13 of each suit 13 of each suit out of 52 so the odds of pulling a hearts or the odds of clubs or whatever 25 percent right because it's 13 over 52 and then I can say okay well what are the odds of one number of any suit meaning what are the odds that I choose an ace or or a jack well there's four jacks in the deck of 52 so I can then say well then four over 52 is going to be about 7.96 or so on and so forth so those some standard statistical analysis we can get just from that table now the next thing we might do if we're if we're trying to analyze this board I could say well why don't I assign a number to each of the cards that's a unique number now obviously there's many different ways that you can do this but I'm just the concept would be if I assign a number to it then I can have just one cell representing any card so for example here we had if I if I regroup my my table over here vertically I can just say okay I need two columns the way I had it before right I need the ace to 13 which represents the king of spades and then it starts over ace of hearts to 13 which is the king of hearts ace of diamonds to 13 the king of diamonds and so on but what I'd like to do is assign one number to it so I'm gonna make this other column and just assign one through 52 so now I've got one number assigned to each card and if I if I can do that and again there's no like uniform numbering system I could have put the the clubs on top or the hearts on top or whatever but once I have a numbering system with one number then I can use my excel tools a little bit easier and it might actually help for some memorization tools if you if you had a system like that coming up as well then I can say okay well what how can I simulate possibly a random draw from a deck so this is how you can kind of make a computer your own little computer you know card game or something right you can program basically excel in some into some degree you know you could say you could say well what if I took a random draw of 52 cards well I can use a I can use the random generator here now because I've named each card individually the bottom card being one the top card being 52 now notice there's kind of an issue with this I won't get into it right now if you were to try to play games with this or something like that because if you took one card out of the 52 card deck right now there's only 51 cards and that one card you took out you know is gone so you have to kind of account for for the fact that when you draw cards out in a practice game but I won't I won't get into that in detail here just want to note that you can kind of create a generator here based on this now by saying I'm gonna make a random generator and here's my random generations as if I took one card I put it back in the deck shuffled the deck again and drew one card and each of these numbers then represent a unique a unique card because according to my table so the first one I drew 39 39 according to my table is the 13 of diamonds which is a king of diamonds because 13 represents a king because because of right so that if I if I just if I had a 48 48 represents the nine of clubs okay so now we've got a unique a unique card and I could take I could even take these numbers if I wanted to and tell Excel to find the related you know suit so it'll give me the it'll give me the the number and suit but we won't get into that now but then I can take I can take my random numbers from my random number generator which is always just going to keep generating random numbers and and then paste it so that it's just a hard coded number over here so now I've pasted these random numbers that we are that we drew out and we can you know we can do some analysis of it so let's take a look at our table here let's do it this way so this is a similar table that we had before we have where we had our aside numbers so these are the aside numbers these are the card numbers 1 through 13 spades 1 through 13 hearts 1 through 13 diamonds these are the numbers we assigned just 1 through 52 these are the suits and then we have that's just deleted here we have the results so so the results are using this count formula which I'm saying we're telling Excel count this range of data if the range of data has the assigned number in it so in other words this cell has this formula in it which is using account function and we want to say given this sample that we just drew out a number every time we drew a number out of the deck and then put it back into the deck right and then drew out another number one out of 52 each time of the random sample however many times we did it we did it a fair amount of times is going to give us a 79 so 79 times it it was an ace an ace of spades particularly right and then and then 100 times it was a 2 and then notice down here if I go down here this is a 29 so it looked for how many 29's I got and the 29's represent a 3 of diamonds so we had 88 3 of diamonds you see that are drawing out of here now if I look at the percentage then this then is the percent compared to the total so if I add up all the results the results add up to 5,000 so we actually did this 5,000 times so we kind of mimicked us taking a card out of the deck and then putting it back into the deck shuffling it taking a card out you know 5,000 times alright and then so it's so the total of these is 79 79 over 5,000 if I move the decimal two places over that's 1.58 so that means we drew out an ace of spades 1.58% of the time this one is 100 divided by 5,000 so that's 2% so we drew out 2% of the time a 2 spades the 3 of spades we drew out 109 over 5,000 which is 2.18% now according to what we would expect to happen because we're kind of mirroring a similar situation with the coin flip here where in the coin flip we kind of imagine that the entire population would be as though you did this infinite amount of times in which case if it was a fair coin it would be 50 head 50% head 50 tail and this case same kind of concept we would say well if we imagined that we did this infinite amount of times drew a card out of the deck and then put it back in drew a card out an infinite amount of times then we would expect that it would come out to be 1 over 52 which is going to be that 0.92 if I move the decimal place over percent of about rounded so so now we can compare our results the statistical results we came up with 1.58 79 times out of 5,000 versus the actual so we have our differences of in this case 0.34 0.08 and you can see that the differences are kind of are gonna are gonna some are over and some are under so we have a similar kind of situation we did with the coin flips here right with the coin flips we were trying to the null hypothesis was the coin was fair and we want to then flip the coin multiple times to see if that was false same thing here the null hypothesis is that the deck has 52 cards in it and it's fair and you're drawing it correctly and it's a random draw and all that kind of stuff and if that was the case we would expect if you did it infinite amount of times it would come up to 1.92 and so then we're we're gonna do it and see if that is the case and and with our statistical analysis and this is our differences that we have here so then if I summed up all of the if I averaged all of these all of these numbers I come up to 1.923 which is pretty close to the to the actual result right because we did it because we did it a fair amount of times and by that I mean it we ran the experiment a lot of times five thousand is pretty a good number of times so now we could also say well what if I took what if I took the count of the spades for example because we know what the statistical numbers that we looked at we'd say well if I just looked at spades there's 13 out of 52 so if I did that infinite amount of times if I drew one card infinite amount of times you would think that I would have a spade like 25% of the time right so so then if I count the spades here that's going to be using this formula so I'm now I'm looking for the spades so so so just notice the function here you could arrange this function a couple different ways because now I've got a sums if function meaning I want you to sum the results so the results are going to be this column and I want you to sum them for the spades so here are all the spades down to here now in order to sum the spades I could I could tell Excel I want you to sum this column the sum range if this column has a spade in it right that would be the easy you know that's probably be the easiest way to do it based on this table but you can also say that if I use the absolute numbers over here there's a the spades are from 1 to 13 so you could try to tell Excel that you want to sum this column if the if the numbers between you know 1 and 13 I believe this when we did it we we did it based on this column here but in any case Excel can add that up for you right it'll add it up and then we can take the percent so we can say okay well I've got one three one two three three out of five thousand draws let me do that again one two three three out of five thousand and that comes out to two four point six six and this one is one three three four out of five thousand that comes out to twenty six point six eight percent the percentages out of all of them of course add up to one hundred because that's going to give us our check number and we would expect that it would be about 25 percent which is 13 over 52 cards in the deck 13 of every suit over 52 so you can see this one is off you know they're they're off but they're somewhere in the range right that we would expect so the results seem to kind of verify that this is a fair deck that has actually 52 cards in it we can also do the same thing and say well what if what if we were to count like each card but there's four cards of each suit so anyway in in the cases of aces I could say well there's four aces out of 13 I'm sorry four aces out of 52 so we would expect that we would draw an ace if we did this infinite amount of times the whole population of the cards about seven point six so we can tell excel try to count the number of of aces so we've got another sum if formula meaning I want you to sum the range so we're going to tell excel sum up this range of the results if the result has an ace in it now notice when I look for an ace it's easier to look at this column because this is the card number that's going to show me the one ace here another one so we can tell excel sum up this column if the the criteria this column has a one in it otherwise it gets a little bit more confusing we could also say sum up this column if this column has either a one or the the next ace is is down here or a 27 right or a 40 right so it would be easier to to to to use these two columns right some sum up this result if this column has a one in it right so we could we do that here we're going to sum it up some criteria criteria range if it's equal to a one and if we do that then these are the results we get in this first one for example is three six three over divided by the total five thousand five thousand gives us point seven point two six this one seven point four six and again they look somewhat you know within a range that seems reasonable if we compare that to what we expect to happen if we did this infinite amount of times it would be four over fifty two or seven point six nine percent and then again we can look at the differences and some are under some are over that's kind of what we would expect so based on this kind of statistical drawing it looks somewhat like a fair deck now we could make histograms of this this what we're what we did here is try to make a histogram of the entire data set so we just did a histogram and then I adjusted the histogram to try to give me each number but the problem with the histogram if you're trying to do that is that it needs a range so this is going from one to two two to three three to four and so on but you can see that it gives you just just how many times we drew in essence each number is basically what I'm looking for and you can see it's kind of somewhat even right because you would expect if you did an infinite amount of times for example if if there was if there was each number one over fifty two is that percent wise one point nine two percent times five thousand we did this five thousand times you would expect them to be hovering around ninety six right that's kind of what what you would expect to be the case and in it was you know so there so there we have it now you could also make that with a bar chart so in this case I used a bar chart to look at the results so now I've got one through fifty two so I did the same thing but this time I assigned this to the x-axis and the results to the y-axis and now this one's notice the bar charts a little bit nicer because now I can just have one number represented down here instead of the bucket ranges even though you can kind of get a similar chart with both of them and then I added the numbers it's a little it's a little crowded in but you get the idea and then this is going to be a histogram of the percent results so just to get an idea of that we've got the percent results now remember the percents you would expect to be to be around one over fifty two you would expect them to be hovering around one point nine two so and that's kind of what we have here right so it's a little bit over here but one point nine two is kind of what we would expect is kind of like the middle point of a histogram like that and you and you and that's kind of what what you what you see here so here's the bucket one point five four to one point seven two there were there were seven results and that bucket range and so on and so forth okay so the next thing we can say well what if I wanted to skew the data how could I represent that in excel if it wasn't a fair deck like there's a card missing or something well we can do the same kind of random generation one to fifty two and there's multiple ways that you can then take that randomly generated number set and then skew it in whatever way you want if you wanted to practice using a data set that wasn't exactly fair depending on what you're doing but for us let's just say that we took our data this represents an even one over fifty two as if you drew one card five thousand times out of a fifty two fair deck and then we're going to go into it using in excel the find function and replace so now we're going to replace everything that was a twenty nine with a one so we're going to remove all the twenty nines which I think is is is like a three or something of some diamonds or something and then we're going to replace it with a one which is an ace of spades so now someone has stacked the deck with the ace of spades and removed the the three of diamonds or something like that so let's see what that would look like so now and so now this is what we made the change on we we increase the ace of spades and we removed entirely the three of diamonds so then if I if I populate our results here we have our aside numbers again on this column this is the card number with the suit one through the king of spades one through the king of hearts or ace through the king and then these are the results now the results were we're doing the same thing we're saying count if meaning I'm telling excel to count this number these numbers but we adjusted it to remove all of what are they all of the three of diamonds and replace them with a with a spade ace of spades so we did a count that column and we said count if and then according to this number and so the results we get now we've got these results and we can say okay if I do my same analysis I can say well this 181 ace of spades came up out of 5000 divided by 5000 that came out to 3.62 which obviously looks quite high because you would expect it to be one over 52 if you did it infinite amount of times 1.92 right and then there you can do this for the rest of them this one came out to 108 divided by 52 108 108 divided by 5000 comes out to that this one came up to 94 divided by 5000 and so on so those these are hovering around kind of what we would expect so this is the actual this is this is what we would what it would be if we did an infinite amount of times or the fair amount and then of course if we get down to here we see there are zero diamonds or three of diamonds obviously that's an indication that that would be very rare to happen if we drew 5000 times out of the deck we didn't get any of those right so that would be unusual so obviously this would be an indication that the null hypothesis that it's a fair deck would be incorrect and we probably we can also do the same thing if we count of the spades right so if I counted up all of the cards because we increased the ace of spades you would expect the ace of spades to be higher so now you we got we've got the spades came out to be 1 364 out of 5000 which is 27.28 versus 25% which would be the ace of spades there's 13 out of 52 if it was fair 25% right and so so you can do that kind of analysis and you can do the analysis with each you know card but I won't we won't do that again here let's just take a look at the charts so here are the results here so here's the histogram of the results so that's a histogram of this one and again you would expect if I had a histogram of the results that that the results would be around I mean if I drew 1 out of 52 that's 1.92 percent times 5000 so you would you would think it would be hovering around 96 and so you've got kind of an around here but you've got these kind of funny outliers that and sometimes those are the things that you know depending on what we're looking for the outliers might be something that you know that way is that we're going to draw our attention right and then we've got if we did it this way this is just a histogram of the of the full data set again so we took the full data set and said count them and this one gives us a pretty good picture of something funny going on because because this would be I tried to get the histogram to do something similar to listing just one out of 52 right and and and we get obviously this one is looking quite high and this one's at zero and we would expect that this would be hovering around 1 over 52 times 5 thousand around the 96 right so this one like that was obviously that looks funny so that will give us some indications on the random draws and I can and then I just regenerated the bar chart this way so now I can actually have the numbers 1 to 52 not using a histogram but the bar charts and in this case I used this column for the X and then the results for the Y so these are just a couple different ways we can see the data and analyze and analyze the data in a similar way we did with the coin flips so that we can then use our kind of sampling the sampling concept being that the whole population would be as though we drew the cards an infinite amount of times and if we did that again that then we can come we can we can think what we know the whole population theoretically in this case right we know the whole population would be would be you know one out of 52 so percent right and so and so again we can compare that to the sample being whatever finite amount of times that we run that test and then of course we can take a look at the results from the sample and see how different they are from the null hypothesis or what we would expect the entire population to be if it was fair and then if there's a substantial difference between the two that's the evidence that we might have that we would then reject the null hypothesis and and and and come to a different conclusion