 Statistics and Excel coin flip statistics example got data let's get stuck into it with statistics and Excel or one note in this case but we'll talk about Excel too you're not required to but if you have access to one note we're in the icon left-hand side one note presentation 1315 coin flip statistics example in Excel tab we're also uploading transcripts to one note so you could use the immersive reader to first a word from our sponsor yeah actually we're sponsoring ourselves on this one because apparently the merchandisers they don't want to be seen with us but but that's okay whatever because our merchandise is better than their stupid stuff anyways like our crunchy numbers is my cardio product line now I'm not saying that subscribing to this channel crunchy numbers with us will make you thin fit and healthy or anything however it does seem like it worked for her just saying so yeah subscribe hit the bell thing and buy some merchandise so you can make the world a better place by sharing your accounting instruction exercise routine if you would like a commercial free experience consider subscribing to our website at accounting instruction dot com or accounting instruction dot think of it dot com will possibly change the language if you so choose and then either read the transcript or listen to the transcript and multiple different languages tying it out to the video presentations using the timestamps one note desktop version here data on the left hand side remembering the two major categories of statistical problems one being where we have all the data all the information for the entire population and are applying our statistical tools to organize that information so we can get meaning from it and to a situation where we don't have all the data all the information for the entire population and are often reliant on samples of that data and then we apply similar statistical tools not so much so that we can fully understand the example the sample itself but so that we can infer the results of that sample to the entire population now here we're thinking more of the second of those buckets where we don't know the entire population however often times it's useful to think about situations where we do know the entire population and we can get basically the answers and then we can also take in a sample from that population and use our sampling techniques to see if when we infer the results of the sample to the entire population they are applicable and so in a prior presentation we did that with heights we had an entire population of heights we took samples of those heights to see how representative these samples were to the entire population now we'll do a similar kind of thing with coin flips which is a little bit easier sometimes because you have like the 50 50 the two results on the coin flips as opposed to the heights data where you have the range of the heights but also it's a little bit more mathematical and theoretical because when we think about coin flips the entire population you can think of as a conceptual type of thing which would be an infinite amount of flips so in other words when we're thinking about coin flips if we had a coin and we kind of assumed that if it was a fair coin that would be our general assumption the fair coin would mean that anytime I flip it you're going to get an even chance of it coming up heads or tails so that's going to be like the null hypothesis and then how would we test whether or not that is indeed the case well we would have to flip it a bunch of times and then test whether or not what we think would be happening is indeed what is happening and if we come up with evidence that that is not the case something else is happening then the question is what kind of statistical evidence would we need to override the general null assumption that it would be 50 50 now when we think about this from a statistical sampling concept you would think about well what's the entire population here that we're thinking about well it's a conceptual idea that if you flipped the coin an infinite amount of times then you would come out to 50 50 but obviously in the real world you will never be able to flip the coin an infinite amount of times so any amount of times that you flip the coin in theory you can think of as a similar kind of concept as when we took a sample of the entire population of heights so any amount of flips is going to be some finite example of the amount of flips of the conceptual concept of flipping it an infinite amount of times and if it was a completely fair coin we know the outcome of an infinite number of times just conceptually would be 50 50 if it was a fair coin so that's kind of the idea so then two things like like always we would like to do here we would like to say we want to know this conceptually for statistical and we also want to think about how we can we test this out in excel so we'll give an idea of how we can use excel tools to kind of test this out now obviously if you were to generate this in excel you can use something like the random generator again last time we used a random generator to just give us a random decimal number of multiple decimals out this time however we're going to use the random between so if i want to simulate a coin flip i can basically say look what i want to do is i'm just going to assign a one to heads two to tails and now that i have a numerical assignment to my outcomes i can use a random generator random between the bottom being one the top being two and then excel will give us this random generation which basically simulates anything that's random between two numbers right so now we've got our random generator that we can play with and simulate something of this nature so now as i look at my random generation i can use this formula and i can copy it down to as many cells that i want now however many cells i copy it down to if i copy it down to a hundred cells that would be like simulating a hundred flips of the coin and a hundred flips of the coin is kind of like from a sampling idea or concept a hundred flips out of an infinite amount of flips a sample of a hundred out of a sample out of the actual population which is a conceptual idea of infinity right so then so then uh if we if we take a look at our i can then basically organize my data so once i have my random coin flips i can then i can then say this random generator is going to keep on generating random results so i'm going to copy this random generated field and paste it static so now it's pasted without the actual formula but just as hard-coded ones and twos i can also use excel if i want to then give me the results in terms of words i can use a formula in excel to say if there's a one tell me it's a heads right so we'll do that in excel if you want to if you want to see those formulas to do that but but conceptually i can just say well one is a head and two is a tails i sorted it by heads and then tails here so we can sort them heads on top tails on the bottom i can get an idea from doing that just saying it looks kind of close to 50 50 maybe right but i can't really tell just by looking at the data results so then we have to run some statistical analysis on it so then i would like to tell excel to count the number of heads simply count the number of heads now conceptually that we would just take our data set and we'd say this is how many times it came out heads if you did this and you actually flipped the coin then you would have to tally out how many times it counts as head but excel will do that for us with our random generated tools here so we can play with this stuff a lot more quickly and understand it more conceptually and understand excel formulas so this is equals count if so it's taking this data uh we can take either one but if we take the the numerical data we're going to say count if there's a one uh and and if we take the the data that's the text data we can say count if you see a heads count if you see heads then pull pull that and we came out to 52 and then you can do the same thing for the tails so i'm going to say excel take this data and count if if it was this column you see a two or count if over here you see a tails and then if i sum those two up notice i didn't do exactly 100 flips and i'm kind of happy that i didn't oftentimes in statistics people use 100 because if i had 100 and there was 52 heads out of 100 well that's 52 percent but it's useful to do that added step of saying well what if it wasn't 100 flips then you would have to do the math to think about think about how much it was out of meaning if it wasn't out of a hundred 52 wouldn't be 52 percent if you came out with 52 so if i summed those up it came out to just 99 flips so we flipped it 99 times came out heads 52 times came out tails 47 times all right well then we can say if i take that 52 divided by 99 that's how i'm going to get the percent so i can pull out the trusty calculator 52 over 99 it's going to give us if i move the decimal two places over 52.53 about it's rounding right if i multiply times 100 times 100 the the percent 52.53 about and then if i look at the other one tails is going to be 47 over over 99 if i move the decimal two places over 47.47 about and that will give us the total percent of 100 percent so so we don't have to flip it exactly 100 times in order to get a percent right which is often what you see an example problem so if i do that with a formula it would just be you know the 52 over the 99 and this would be the 47 over the 99 this is a common calculation to to see right you've got the the the sum and then the total and then i want to see i want to see then the percent of each item compared to the total so this is like a running percent which is quite common calculation very useful to be able to visualize in in excel or mathematically because it's it's very common and useful all right i'm going down here now we go down and so now we could do this many more times in excel so now i'm going to use the same random between function here and and i could test it out and say let's say we have we're going to do a few different tests and we're going to flip it anytime between two times and 15 times now just like you would expect if i flip it just two times then then you're not going to get a result that's going to be representative in this case i got two tails and you know that might not obviously if i got two tails that's going to be misleading so i'm not i don't have a big enough sample to really tell me anything about the data right the more times i flip it intuitively you would expect that we're likely to get closer to the 50 50 right but but still three times is still you might get three tails if you flipped it three times right this is a test where we flipped it four times so i'm just using the random generator tool and just in each cell flipping it two times this one we flipped it three times in test three we flipped it four times at a test four we flipped it five times and so on and so forth so you can see as we go through the results as we flip it more times you would expect that we would get a result closer to what we know the actual result is in the actual population just knowing it intuitively or theoretically or logically if it was a fair coin using our conceptual knowledge would be 50 50 right the entire pop if we flipped an infinite amount of times all right so then so if i so so this is a another random generator tool and this will keep on populating random generations and so you can test it randomly and i can copy this entire test which will keep randomly generating every time i click on something in this table and i can then paste it down here one two three in other words pasting the value only and then i can run my statistical testing on it so in this case we had the test of two we came out with two heads and zero tails so if i was to do my percentage two out of two is a hundred percent for that for the heads zero out of two is zero for the tails so and there's our hundred so so that adds up to a hundred percent so this is kind of misleading we only flipped it two times and we we came out with a result that obviously isn't going to give us too much information about the entire population if we were to in flip it infinite amount of times so this would not be sufficient to overturn the null assumption that it's fair coin because we don't have enough of a sample to really get an information in this case we have three so it happened to come out one heads and two tails for three so one divided by three is thirty three percent about and two divided by three is sixty seven percent in this case we flipped it four times it happened to come out two heads two tails which are four and fifty fifty and this one it came out five times we flipped it five times two heads three tails and so on and so forth now in excel to get these results what we did is we used count if again i said count if these two cells have a one in it and so two of them had a one in it and on the second one i said count if the cell has a two in it so here's the criteria count if that range has the criteria of a two and then it counted how many of them there are and then we divided we divide out to get to our heads which is the heads divided by the total next we might want to run similar examples or experiments where we're going to flip it more times so in this case our number sequence got a little bit messed up over here in excel but we're imagining these are multiple tests where we're going to be flipping the coin many more times using the same concept in excel of equals random between one and two one representing heads two representing tails so now we have a bunch of tests that we did and we just copied this random generation tool in all of the cells this is just going to randomly generate all of our outcomes for us and then if i copy that entire random generation tool onto another cell so that i can then change it from a random generation to just hard coded numbers we get our results and so again the number sequence on the left hand side got a little bit messed up so we we uh we flipped it a bunch of times here i'm not sure exactly uh how many times but we will find out in the calculations down here so then i can sum each of these up so if i look at the heads the formula is going to be excel look at this column of numbers that we flipped i think it's going to be 75 times but look at that column of numbers and then and then tell me how many times count the times that you see a one and and excel gives us 34 times and then we do the same thing for the tails we say hey excel count this column of numbers and tell me how many times you get a 41 and i mean i'm sorry how many times you get a two that's this count if and excel says 41 times now between a one and a two which are the two things that populate these cells we come up to 75 as the total and so then we can say all right if there's 75 of them i can take a look at the heads 34 out of divided by the tails uh i mean divided by the total 75 and that comes out to 45 about percent 45.33 percent i could do the same for the tails 41 divided to by out of the total 75 gives us 54.66 55 about 45 plus 55 is 100 which is kind of our double check that we have done things correctly there so then if i if i look at it just the heads then i would expect it to come out you know 50 50 or the entire population if it was a fair coin would be 50 50 if we flipped it infinite amount of times here we came out to 45 there's 53 56 52 we could do the same thing for tails of course but if we just zero in on one of the outcomes then it gets a little bit easier for us to think of that series of outcomes right so 60 this one's you know pretty high kind of outline for 74 flips right but it still could clearly happen that we have in you know 45s here's our you know it should be you know we would think it'd be around 50 50 now if i took that series here of percentages results i might want to extract that and put it in a vertical column in a column format so i could do that so i could in excel i could do that by basically copying it and then pasting it and transpose it so we'll do that in excel if you want to practice that in this practice problem in excel and then i could compare it to what the hypothetical expected result would be meaning this would be the average for the entire population it would come out 50 50 if we flipped it infinite amount of times if it was a fair coin and then we can look at the differences between the outcomes per test so this is the outcomes per test 45 percent heads versus 50 percent for the actual this one came out 53 versus three versus 53 percent difference 56 versus 50 and you can see that of course we would expect that some of some of the outcomes would be over some would be under if these were random samples now if we counted all of the tests which we did 75 of them then and if i took the average of all the averages then we're getting pretty close right now we're at 50.33 as opposed to 50 so so so you could see as of course we take a larger sample you would expect us to get closer to the actual average of the entire population which is a theoretical concept in this case of 50 50 of an infinite number of flips so here's the the the the number of heads if we were to take a a histogram of the data and so and then this one so let's look so we could do and we could do this multiple times right i could take the random number generator and basically do the same the same thing again and we're not going to come out with the exact same results you'll see over here we came out with well i can see it in my data the first one was uh was 45 percent and over here where did i where did i just go i came out with uh 45 well that one happened to be the same but then it was 53 and 56 so if i go back on over so now we did the same thing and i've got uh i've got uh the 45 51 51 so here's the same process that we did to generate another bunch of flips 75 uh flips each and and we did it you know multiple times we could take the same you know data the heads data represented vertically so here it is represented vertically 45 51 51 and we can compare you know histograms that are generated from them the histograms are not going to be you know exactly the same but you would expect them to start looking similar as we have uh you know larger data sets would be the general idea so these are histograms of the averages of all the outcomes that we did uh 100 tests of 75 now then you might then say well well how could i simulate a a situation where it's not a fair coin so now i now i have the null hypothesis hypothesis which is 50 50 if it was fair but if it's not fair i would have to prove that it's not fair how can we simulate that well one way we could say let's use my random generation tool and say that if it's a one it's the heads and if it's anything other than a one it's going to be tails so we could use the same kind of concept but then go to three and so now we're going to apply everything that's one a heads everything that's not a one a tail do the same number generation tools but now we get from one to three right and then we can copy this whole random generator tool over to to get the hard coded numbers and so here's our hard coded number outcomes and again the numbers are a little bit messed up here for the for the number of flips but i think this comes out to actually 70 74 flips so uh and it might come out to 75 so i think this 74 is a little bit so what happened here is is here i took the number of heads which means that i'm going to say count if there's a one and it took all the heads now this one i actually cheated a little bit and i didn't catch the fact that my numbers were not properly uh calculated here because this is well actually this is one this one's populated properly one two so this one looks right so the other one had some funny business going on so i think this one is right but i still kind of short cut it by saying that well if there's 21 heads and i flipped it 74 times then the two and the three are going to be the what other rest are populated and i'm assuming those are tails so i just took 74 minus uh minus 21 gives us the 53 so that the 53 and the 21 add up to 74 which i'm saying is the total number of flips now notice that it would be better if i actually did another count if function uh to count this column if it came out to be three or two because then i get kind of a double check on my total which i like to do as an accountant but but uh but that takes a little bit more complex of a formula because then i have to hey say if it's not only a one but a one or a two right so i kind of short cut it here because if it's not a one i could have said i could also said you know if it's not a one then put something here right if it isn't this thing then put something here right but in any case that adds up to 74 and then i can take my 21 over uh 74 that gives us 28.37 percent and the other one is 53 over uh 74 so again and so that's 72 now again you would expect it to come out 50-50 so if i just look at heads and i was to look at these results if i flipped a coin multiple times and i came out with results of it came out 28 heads 26 heads 36 heads 42 heads notice that none of these results are above 50 percent that uh is quite unusual if it was a a fair coin right so so so this would this would give us a preponderance of evidence to say though the null hypothesis does not look proper here because uh although i'm looking at a sample it seems quite unlikely that this is going to be the case right so so then if i was to look at our results this way i transposed all the heads in a column and now we're going to compare it to the 50 percent that we would expect and the first one it's off by 22 so heads are too low by 22 it's too low by 24 by 14 so notice it's always too low right that it's not always the same number but but that's given us pretty good evidence that we're like okay this thing looks like it's weighted towards tails because because uh we would we would expect then uh that it would be that that it that you know it wouldn't so it didn't always come out tails right but but and this is the common thing that's going to happen we're going to get the expectation which is the null hypothesis what we would expect to happen if it were fair and then test it out and see if there's a difference to to to from the actual data to what you would think would be happening or to the entire population if it were in this case the entire population being as though we flipped a fair coin infinite amount of times and would have then a 50 50 split so here's a here's a histogram for the heads in a not fair coin of the results that we took here and notice it's the results that are populating around 31 to 35 percent instead of around the 50 and you've got this this shape that looks like it's that's happening somewhat consistently right and the outliers around around that center point somewhat balanced versus this is the histogram we looked at in a fair situation where it's closer to the 50 percent and then you've got this kind of shape populating around it so that's the kind you know just the kind of concepts that would that we can theoretically think about uh here applying more of a of a mathematical concept because we kind of thought of infinity in this case which is a mathematically conceptual term for the entire population if we flipped an infinite amount of times and also how we can kind and and then the statistics is similar to a situation that we saw before in weight where where we don't have a theoretical concept we have actual population of weights of individuals versus a sample of that example population but that's a similar concept as the theoretical infinite flips versus the sample a number of flips and we we can see how you can play with this stuff in excel pretty easily once you know some of these concepts which really helps you to understand it better if you can actually you know run tests in excel