 Statistics and Excel. Election polls statistics example. Got data? Let's get stuck into it with statistics and Excel. Well, we'll be using OneNote here, but we'll still be talking about Excel. You're not required to, but if you have access to OneNote, we're in the icon left hand side, OneNote presentation 1335 election polls statistics example tab. We're also uploading our transcripts so you can first a word from our sponsor. Yeah, actually, we're sponsoring ourselves on this one, because apparently the merchandisers, they don't want to be seen with us. But, but that's okay, whatever, because our merchandise is, is better than their stupid stuff anyways. Like our accounting rocks product line. If you're not crunching cords using Excel, you're doing it wrong. A must have product because the fact as everyone knows of accounting being one of the highest forms of artistic expression means accountants have a requirement, the obligation, a duty to share the tools necessary to properly channel the creative muse and the muse, she rarely speaks more clearly than through the beautiful symmetry of spreadsheets. So get the shirt because the creative muse, she could use a new pair of shoes. If you would like a commercial free experience, consider subscribing to our website at accountinginstruction.com or accountinginstruction.thinkific.com. Use the immersive reader tool changing the language if you so choose and being able to either read or listen to the transcript in multiple different languages using the time stamps to tie into the video presentations. One note desktop version here remembering the two primary categories of statistical problems. One being where we know all the data of the population using statistical tools to organize that data in such a way that we can extract meaning from it. Number two being where we don't have all the data for the entire population, but possibly are able to get a sample of the population applying similar statistical tools to the sample as we would for the entire population if we had all the data, not because we're particularly interested in the sample, but hoping that the information about the sample will be useful to tell us something about the entire population. Now it's useful oftentimes when we're practicing statistics to use a situation where we already know the information about the entire population so that when we can then take samples of it and test whether or not the samples actually do tell us something that we can infer to the entire population so that we can then apply those same statistical tools in cases where we don't know the entire population and we can then take the samples and we can come up with tools to see how confident we can be about those samples. So we're just kind of exploring those concepts at this time in prior presentations. We did it with an entire population of heights taking a sample of the population of heights to see how close the sample relates to the results for the entire population of data. And then we also did it in more theoretical areas when we're thinking about coin flips for example or we're thinking about drawing cards from a deck where the entire population you can think of as a theoretical concept as though we flip the coin an infinite amount of times as though we drew a card from a deck an infinite amount of times. Now in this case we're going to just we're just going to assume that there we know the answer of the entire population so that we can come up with our sample to see how close the sample is to the population. So we're going to assume we have an election type of situation and we already know the results obviously again in real life we don't know the results that's why we take the sample but when we're doing our statistical testing and thinking about the tools we're using again it's useful for us to think about a situation where we're going to say I know the result for the entire population already and that's going to be 60 percent of the population is for candidate A. And so now we're going to think about taking well how can we simulate a sample to see how close the sample would be to the actual result of the 60 percent. We want to do this to think about our sampling method statistically as well as think about how we might structure a problem like this using our tools in Excel. So if we use our tools in Excel we might say if we were to take a random sample because we're just saying it's 60 percent then we could structure our random function again so now we're using our random function again but now we're going to say that it's random between one and 100 and so if we drew a random group of people and they said they were for candidate A and we're assuming that the actual population would be 60 percent for candidate A and 40 percent non-candidate A right something other than candidate A. Then we can use this and say well if the result of our random draw between one and 100 comes between below 60 or below then that would be like similar to as though the candidate is for candidate A. So I can basically make a random sample like this we can use our generator tools and say I'm going to say this is a random draw between one and 100 any draw that happened that's between 60 or lower is equivalent to a vote for candidate A anything that's above 60 would be a non-candidate A vote a vote for somebody else and of course we can copy this down so now we've got our sample of this person is below 60 60 or below therefore vote for candidate A this one's above 60 therefore vote not for candidate A this one's below 60 therefore it's a vote for candidate A this one's above 60 so we have our our random people that we can basically apply this to a yes or no kind of equivalence to candidate A so we can structure our tools in excel and we can copy this over because this is an easy formula to copy and we can run a test like this quite easily to try to get an understanding of what the results of this would be so now we've got a random box generation that's doing 10 samples of however many samples down that we want to do and we can then start to use this to apply our statistical tools to see if it's if it gives us results that basically we would expect now once I have a random generating tool like this I can then copy the entire thing because this has formulas in it and paste it static meaning only the numbers so these are simply hard coded numbers for for all those random generations so now if I go to the bottom of this summing up our results this is for candidate A where we're asking excel to look at this series of numbers in this column which represents the poll that we took which is going to give us results of either them being for candidate A or not for candidate A the formula looking like this we're going to say equals count the items in this column if this is the range there's the criteria range this column and then the next criteria comma is if it's less than or equal to 60 now because we have to put this less than or equal to is and those are kind of like text we have to put the quotations around that and then we have to connect it to the 60 with an an so it looks a little bit complex but not too bad once you do it a few times and therefore we're looking at this column and it's counting all of the numbers that are equal to 60 or below 60 and then we can do the same thing if it's not a so if it's not a we're looking at this column it's this formula look at that range excel please comma and this time just take greater than not equal to but everything that's above 60 and that's going to give us this number this number then is a double check or this number is our total adding those two up so 102 plus 48 that means that we did this 150 times in other words if i count all of these rows that would come up to 150 so then i can take my percentage and i can say what's the percentage of times that we had a four candidate a versus against that would be 102 divided by the total of 150 68 percent of the time four candidate a and the other being 48 divided by the total of 150 32 percent of the time not so again we would expect that we had the starting point a priority that we said that it should be 60 40 if we did this infinite amount of times because or if we had the data for the entire population we said that the population data was 60 for candidate a so if i did this again we did the same thing here again and this comes out to 65 35 this one 57 so this one's below 43 this one comes out to 57 43 52 48 67 and 33 and so on so here's our series of results for us doing this multiple times now i'm going to take this column of numbers i'm just going to take the ones that are for candidate a which the percent for candidate a and put it in the format of a column which we can do it excel by basically transposing and it's a pretty simple procedure if you want to watch this practice problem or do it as well with us in excel so then we can say i'm going to then list it this way so now we've got column a and these are the samples so how many samples did we take 10 and these are the results that we got that were the percent for column a and so we can compare that then to what was expected or what is the actual number which we said that we knew a priority beforehand we knew it was already 60 and we created a test that you would think that if we did the test on infinite a number of times because it would come out to 60 percent because we took the test to take a random draw between one and 100 and we took everything that was between below 60 uh and below so so in any case the difference is eight five three uh and then you could see the difference is kind of above and below if i was to take the average of of these this column it comes out to around 61 which is pretty close to what to the actual of 60 okay so so if we were to we can we could run this experiment with excel just given the fact that it's a pretty simple thing to run i this is kind of blurry because i copied a whole bunch of columns here we'd ran the same kind of thing one through a hundred but you can do it basically a whole bunch of times not an infinite amount of times but a whole lot of times in excel so we so it's interesting to try to to run this same test which you can build very easily and analyze your results and see how much closer you get to kind of the actual this is just basically the sample running this i just copied the same formula all the way through and we did it you know a whole bunch of times so this is a huge random number generator or relatively huge you know and then i just copied that and then here's the actual results which i just gave us a few of them so now you've got sample one two three same thing we did before and the total of the sample uh down here and so now you've got same kind of concept this is we took this column and said count them if they are less than or equal to 60 we came up with 50 of them this one count them if they're greater than 60 we got 30 we did this 80 times so this number of columns here is 80 we took a sample of 80 out of the theoretical population is 80 and then we said okay well 50 out of 80 50 out of 80 we're four candidate a that's 63 about and 30 out of 80 is 38 about and that's the total of 100 percent now we did this a whole bunch of times this time so now now we have a whole lot more results now if i put these results again into a column i'm going to go way over here and say and i short cut at this table so this doesn't have this isn't as long as the table we will do in excel which is giving you the kind of idea but here's the results so now we've got uh uh the results uh the percents and we did this a whole bunch of times you could see boom boom boom all the way down to here and then we took the average so we did a whole bunch of times this time and we compare each one to the expected and then this is the difference now and then you can basically kind of make histograms so this is a histogram of uh the percent results that we've got from testing it a whole bunch of times and and you would expect then if you test this a whole bunch of times that it would be hovering around 60 percent right so if i look at my histogram over here we've got we've got the middle point uh is leaning a little bit to the right right because you would expect it to be at like 60 and so it's interesting then to look at your results and see how the histogram as you as you look at larger results and try to make a histogram out of it then then look at start to look at the structure of the histogram as you as you do this uh and have more results that you're that you're adding to it because you would think that then the form should get close to the theory in our case because we did kind of a mathematical kind of concept and we took it everything below uh 60 percent we took a random draw you would think that the more uh the more times you you have a result you would get something that's going to be tapering off and looking more kind of like a bell shape uh uh kind of kind of structure as you as you take more results so it could be an interesting concept to get an idea of certain things doing that conceptually in excel with small amounts of data and then just run the same thing with larger amounts of data and look at the structure of the of the graphs and histograms as you do more of it which should take you closer if you're doing a mathematical thing to the actual population which in this case we said was 60 of the entire population which is really kind of just a mathematical concept because we we just it would be the similar thing as with the coin flip right we if we did it an infinite amount of times you would expect it to be you know 60 but it's going to then it's going to taper off and and what not if you group them all together so uh there's that one