 Statistics and Excel. Calories, Data, Statistics, Sample, Example. Get ready, taking a deep breath, holding it in for 10 seconds and looking forward to a smooth, soothing Excel. First, a word from our sponsor. Yeah, actually, we're sponsoring ourselves on this one because apparently the merchandisers, they don't want to be seen with us. But that's okay, whatever, because our merchandise is better than their stupid stuff anyways. Like our Accounting Rocks product line. If you're not crunching chords using Excel, you're doing it wrong. A must-have product because the fact as everyone knows of accounting being one of the highest forms of artistic expression means accountants have a requirement, the obligation, a duty to share the tools necessary to properly channel the creative muse. And the muse, she rarely speaks more clearly than through the beautiful symmetry of spreadsheets. So get the shirt because the creative muse, she could use a new pair of shoes. If you would like a commercial-free experience, consider subscribing to our website at accountinginstruction.com or accountinginstruction.thinkific.com. Here we are in Excel. If you don't have access to this workbook, that's okay because we'll basically build this from a blank worksheet, possibly being able to get sample data sets from Kaggle.com, that's K-A-G-G-L-E.com. If you do have access to this workbook, three tabs down below. Example, practice blank. Example, in essence, answer key. Practice tab, having pre-formatted cells so we can get right to the heart of the practice problem. The blank tab, just having our data set so we can practice formatting cells within Excel as we work through the practice problem. Let's go to the example tab to get an idea of what we will be doing. We've got our data set, which we will make a table out of. This is calories, our related data set. We're going to imagine that this is the entire population data set on the left-hand side. We'll make a histogram from it. We'll make an average of that entire data set. Then we want to take samples of this data set to see how closely the samples mirror the results of the full data set. We're going to be practicing how we can basically take randomized samples from what we're imagining as our whole data set. This is similar to what we've seen in the past, but slightly different because we can use our randomized numbers, but then we want to line them up to line items on the actual data set. We can then put our random numbers next to the entire data set to use that to scramble the data set in basically a random method. Then we'll imagine how we can use this tool so that we can pull multiple samples together. Then we'll do multiple samples here that we will be pulling together. It looks like we have samples of 20 or 19 or 20. Then we'll make a whole bunch of those samples together to practice that method of sampling. Then we'll see what the results of that sample are. Let's go to the blank tab and I'm going to delete this. Here's our data on the left. It's currently sorted by date, so we're tracking in essence calories here. By day, we're going to imagine this is our full data set. Let's insert a table. I'm going to go to the Insert tab up top. We're going to go to the Tables and Insert a Table. It's going all the way from A1 to 458, so fairly long data set, but not huge. I'm going to say OK. There's our data set. It's currently sorted by date. We could sort it by calories to see lowest to highest and then highest to lowest, so we could do that. Or we can sort it back by the date over here, lowest to highest. Then we're going to take that data and do some calculations from it. Let's do our standard average. I'm going to pull in column C, bringing that in a bit. Our standard calculations then would be the average or mean. Before I do this, by the way, I should have done this first. I'm going to format the entire worksheet. Let me put my cursor over here on the triangle. This is going to mess up the date formatting, so we'll see how to deal with that. I'm going to right-click. I'm going to format the entire worksheet. I'm going to go then to my baseline formatting, which I would like to be the currency, negative numbers bracketed and red. I don't think we need the decimals, so I'll remove the decimals and no dollar sign. I'm going to say OK. When I do that, it's going to mess up the date formatting over here. I'm going to go OK. I'll mess up the date formatting. Then I'll just select this entire column and go to the Home tab, Number Group, and in this dropdown, this is like the quick formatting, I can find then the date. I can go down to the short date and it'll populate it back to the date. That looks good. Let's make this one white on the font. Normally, I would like to do that to the sheet before I add the actual data, possibly if possible. That will make things a little bit faster sometimes. Then I'm going to select the calories, or let's calculate our average first. Let's go to the average, average, or mean. The average is mean. I'm going to pull this to the right and we'll say this equals the average. There's our function. Double-click in the function, selecting just the numbers so we have the dropdown in our table. We're picking up the table of calories. That's it. Enter. Then we might want the median. We'll use our median function, equals the median. I'm doing this fairly quickly because we've seen these in the past. Double-click in the median, selecting our data, and enter. There's the median. We might want the min, or let's take the max first. Max and then the min equals the max. Double-click in the max function, and then selecting our data. That's the top calorie. Then the min equals, and then min, selecting the min, and the data, and there's the min. Zero. Okay, so let's select the entire data here. I'm going to make it bold. Home tab, font group, I'll bold it. I'm also going to scroll in just a little. That looks pretty good. Let's enter a histogram now. I'm going to select the entire data and go to the Insert tab. The charts will make a histogram. There's our histogram of the data. I'll delete the title. I'll make this data blue and bordered as is normally our custom. I'm going to go to the Insert tab, I'm sorry, Home tab, font group, bucket dropdown. If you don't have that blue, it's in the More colors, Standard, and there's the blue I'll use, and then OK. Let's put some borders around it, Home tab, font group, and put some borders around it. There we have it, and there's our chart. Okay, so now let's create our samples. If I wanted to make a random sample of these numbers, then what I could do is say, I'm going to put this next to a column where we randomly generate numbers and then organize the randomly generated numbers in order so it will shuffle the calories column randomly. Then we can pick the numbers. Now, if I want to do multiple samples, then we can make multiple of these random shuffling tools. Let's try doing that. I'm going to select the entire column, column B, right-click, and copy. I'm going to put that over here somewhere. I'll put it over here in M. Right-click, I'm going to paste it just 1, 2, 3 because I want to put a different table around it. And then here's my random shuffle. My random shuffle tool will be over here. I'm going to use just the normal random shuffle generator equals RAND. Double-clicking that, and then I'm just going to close up the brackets or just enter, and it'll close it up for you. That's a 1, but remember, if I add numbers, Home tab, numbers, or decimals, add decimals, then you've got this pretty long randomly generated number, right? Now if I double-click on that, it'll do the same formula all the way down. So now we've got these randomly generated numbers all the way down. So if I shuffle the data by these randomly generated numbers, then I should get a random pick on the right-hand side and then I could just see how many I want to include in my random pick. So let's put a table around this so they're connected together. So when I shuffle them, they will shuffle together. So I'm going to then go to the Insert tab, Tables, Make a Table, and it's selected in the whole data set. That looks good. Okay. So now I can shuffle it by the random shuffle here, A to Z or Z to A. Now every time I do that, it basically reshuffles it again. The random generator keeps on generating. So that's great because it allows me to create more random samples. When I want to actually keep it static, then I'm going to copy these two and paste it some other place with the static numbers so it won't keep shuffling. But before I do that, let's make a couple of these. Okay, let's make like five of these random generating machines. So I've got one here. I'm going to copy both of them. I'm going to say Copy and then I'll Paste, right-click and Paste. Or I can just say Control-C. I'm going to paste another one, two, paste another, three. So I've got there's four of them. I'll paste one more over here. So there's five of these random generating machines. I'm going to make a bunch of skinny columns. Selecting column W, holding down Control, column T, holding down, I'm holding down Control to select each of the columns which are not next to each other, non-adjacent as they say. I'm going to put my cursor in between them, make them the same skinny size. So now they're the same skinny. And then maybe we can make all these columns skinnier as well, including column Y, Y, Y. And so there we have it. Alright, so now let's make samples of like 20 of these, right? So we're going to say I'm on one, so I'm going to go down to 21. So these are all randomly shuffled. So let's randomly shuffle them again. Notice I have a different number up here than up here because when I shuffle them, they shake it up again. So I'm just going to shuffle them all, shuffle it up. Alright, so now they all have different numbers. And if I just pick the first set, I'm going to get a random sample because they've shuffled them all up. So now let's imagine that we want like five samples of 20, right? So then I could copy just 20 of these, right? Go down to 20 of each of these. But if I want to do that, if I want to make a random five sample of 20 generator over here, then let's do that. I'm going to close that. I'm going to make column Z a little bit smaller. And then I'm going to say that we can say this is the count, let's say. And then this is going to be sample, sample one. And then I think it'll copy if I just do one, if I put my cursor on the fill handle. It does sample two, sample three, sample five, up to five. Alright, and then I'm going to make that a header. So I'm going to put my cursor over that home tab font group. Let's make that black, white, let's center it. Not that center, this center. And then I'm going to put this, it's going to go from one to and we're going to make a sample of 20. I'm going to select those two and just count down to 20. So it's going to count down to 20. Alright, let's center those home tab alignment and center them. And then I'm just going to say this equals the sample of the first random generator. So that one, so I'm going to say tab to go to the cell to the right equals and then this is going to be from the second random generator. So this one, tab and then equals and this is going to go from the third random generator. So it's going to be this one and then tab and this one's going to be for the fourth random generator. This one, tab and then this one's going to be from the fifth random generator and enter. So they're all different and they're going to shake up. See how they're going to shake up as these shake up, right? As this changes, as I change these and I reshuffle them, let's change it this way, then these are going to change. Alright, so then I can copy this down, putting my cursor here and copy it down to just 20. And now it's picking up the top 20 that are going to randomly shuffle every time I shuffle this deck over here. So if I go to the right and I shuffle this, we can go shuffle, shuffle, shuffle, shuffle and shuffle. Then this whole thing should shuffle, picking the top 20 of each which are randomly selected from the entire set due to the shuffling nature. That's the idea. So let's make this blue and bordered, home tab, font group, bordered, drop down, blue. And so that's our generator. And now let's imagine that we want, how many samples do we want to make? We want to make like, let's make like 75 samples. We'll make a decent amount of samples and we can just make them in sets of five, right? So I can say, I'm going to do this again. So this is going to be sample one, actually let's do it one tab over. Let's make a skinny here, a skinny AG. And then I'm going to skip a cell and put it here. Let's say sample one and I'm going to bring it out to 75. Fill handle, bring it out to 75. 75, right there. And then I'm going to make that home tab, font group and make this black and white. And let's center it. So alignment and center. Okay. And then I could just do this a couple of ways. I could copy the data over. I can't use a cell reference because I'm going to shuffle the data each time, right? So what I could do is I could just select all of this data that we have generated and control C, copy it. And then I'm going to paste it over here, one, two, three. So I'm going to paste it one, two, three. And then I might also paste the values. I mean, paste the formatting. So it gives me that blue. So I can could then paste it the formatting. And then possibly we want to hide these. I could then say let's hide these cells so I can see them closer, but I'll do that next time. And then we can reshuffle. See, we have to reshuffle every time. Shuffle again. Boom. And then shuffle the second one and then shuffle the third one. Shuffle the fourth one and shuffle number five. And that should rearrange our numbers over here. So we have a different set of numbers and then I can do the same thing. I can copy these and say this is our next sample that we made. So I'm going to right click and paste it one, two, three, just the values because I don't want the formulas in there. And right click and then paste the formatting. Now then I could hide these so I can still work next to pretty close to each other. So the way you hide is I'm going to go from the column AI to column AR. Let go, right click the selected cells and then hide them. So now I can see it's going from AH to AS. So I know there's cells between it. You can also see this little kind of thing in between a little box. And then I can do the same thing so I can reshuffle over here. We're going to go, OK, shuffle it up again. This is kind of redundant, but it's fun. This is fun to do. So we're going to shuffle it up. It's like shuffling cards, which is more fun than playing the actual game, shuffling the cards. So we're going to copy that and paste it, right click and paste it one, two, three. And then hold on a second, undo, undo, control Z. I just want to copy these numbers, not the counting. And then right click, hopefully I didn't do that in the other ones. Paste it one, two, three and then right click and I'll paste the formatting as well. So we'll do this a couple more times. So it's a little tedious. Shuffle and then we'll shuffle and then the third one will shuffle. And then the fourth one will shuffle and the fifth one will shuffle. And then we'll copy our new generated data, newly generated data, control C, and then paste that right here. Right click, paste one, two, three. Right click, paste the formatting only. And then I'm going to hide going from AS to BB, right click and hide so that we're lined up for five more. I did mess up on one of them I can see. So I'm going to unhide. I'm going to go from AH to BC, right click and unhide just to note that it looks like somewhere I picked up. No, that's right. That looks right. So let's go back to BB to AI, right click and hide again. I was second guessing myself, which is not something that no one should be second guessing me because I don't make mistakes or things that other people do. So we'll then do this again and then let's copy the next one and we'll copy the samples, control C, right click, paste one, two, three. Right click, paste value formatting only and then shuffle. I shouldn't have done 75 of these shuffle, shuffle, shuffle, shuffle. And then copy the results, copy, control C, move it on over, right click, paste one, two, three, paste formatting. Go from BC to BL, right click, hide. And then again, ultra base. Now we could have made more of these. Like you might say, well, why didn't you make more shuffling? I mean, I could have made, you know, 75 shufflers and that's another method that we could, you know, you could use. But you can test out what would be working best for you. I'm going to copy once again, copy the new shuffle data, right click, paste one, two, three, right click, paste the formatting and then shuffle, shuffle. Keep it rolling, keep grinding, keep the nose to the grindstone. But my nose is almost gone. My nose looks like Michael Jackson at the end of, no, just got to keep the nose to the grindstone. We're going to copy that and we're going to paste it over here, paste it one, two, three, right click, paste the formatting. And then hide from BM to BV, right click and hide. All right, ultra base another time. Let's do it again. Do it till you get it right till it's routine. So you don't even think about it. It's just like, it's just an extension of being this. I don't know what I'm talking about. We're going to then copy this, Roger, Roger that control C, right click, paste one, two, three, right click, paste the formatting. And then a couple more times, just a couple more random, random, random, random. No one wants to see you shuffle cards. What I shuffle like an expert, people wait around the block to watch me shuffle cards. Tell you what, the way I shuffle is amazing. Right click, paste one, two, three, right click, formatting. Let's hide. We're almost there. Hide BW to CF, right click and hide. And so then I'm going to do this a couple more times. Okay, I spared you from watching me do the last couple of months, but I did the same process until we get up to 75, the sample 75. So now what I'm going to do is just unhide the cells. So we have all of our samples. So we have hidden cells between AH and CQ. So I'm going to put my cursor on AH and scroll on over to CQ so those hidden cells are in between. By the way, you could go all the way over from here. If you need it to, as long as the hidden cells are in between what has been selected, right click and unhide now. So now we have unhidden and then I can put my totals down below. Let's put an average down here. So I'm going to then take the average of the sample. I'm going to take the average of the sample that we have equals the average brackets and scroll on up. So the average of the sample that we have taken is 2,174. You'll recall that the average of the entire population, 2,189. So there's our comparison and we could compare, of course, each data point that we got compared to the average. So let's take the bottom number here, the average, and I'm going to copy that all the way across to the right to our 75 samples of 20. So there's our averages for the whole set of samples that we took. And then let's take this entire thing, I'm going to make it blue. So I'm going to go up top and say let's make that drop down blue, white, bordered. And then I'm going to, I'd like to see that in a vertical format. So I'm going to do what we've seen in the past here, take this whole thing and then paste it in a vertical format. I'm going to copy this, put it in a column, in other words, control C or copy scrolling up. I'm going to put it, we put the wrong button, put it right here in DID, right click. I lost it now because I fumbled around, I fumbled. All right, fumbled. Let's copy it again. Ultra copy, ultra vase another time. Once again, por favor, right click and copy. Copy that, copy that, Roger, Roger. And then we're going to paste that in DID, right click and paste it. Let's do it one, two, three. And then I'm going to copy again, right click and copy. And then paste it over here in DG and paste it special because we want to transpose, transpose, make it into a column if you would. And Excel is like, okay. And then it does it, no problem, no problem says Excel. We can do that without a problem. So let's make this into a table. I can say insert and tables. Let's make a table. Boom. So there's our averages. And we can compare the average to what we had for the expected or actual or population average, right? The population average was equal to, I'm going to scroll all the way to the left to pick up that original average that we came up to. 2189. So there's the population average. And I'm going to say double click on this one. I'm going to say make it absolute by selecting F4 dollar sign before the E and the one to pull that all the way down. This is the difference from the average of all those. And so this is going to be equal to this minus this. So there's the difference all the way down. We could make a total column down here and a table. You can do that by going to the table tools and total row. And then on this column, I would like it to give me the average of the averages, which is, which is that pretty close, right? And then I can, for this one, I want the sum, which is doing of the differences. So that's, that's that I'm going to make this a little bit smaller. Now you could, you could take a histogram then of our samples to see what, what each sample kind of looks like in terms of the, of the data. So for example, I could go into my sample data here for 75 and insert charts, a histogram. Now we only took 75. So the histogram, you know, is somewhat limited in the data. But here's our histogram of 75. And we could change the buckets if we want to, to maybe add more buckets like five buckets. And so there's our histogram with five buckets. We can compare that to the data for 74. So I could take the data for 74 and insert if we so choose a histogram. And we can compare the results on 74, putting this down here. I'm going to just call this 74. And we can then take a look at our buckets, change that. I think we did five buckets, right? I have five buckets on this one. So there's, you know, different data sets for the randomly generated data mapped out with a histogram. Let's do one more 73 and insert charts, histogram, histogram, histogram. It's hissing at me like a cat. And then this is going to be, I don't know why I had to say that. I don't know, histogram. We're going to say let's make this one into number of buckets five. So they all have five buckets. So we can compare, you know, the samples data that we have. And we can also make a histogram if we wanted to of the averages of all the samples, because we took 75 samples. So I could then take these results if we wanted to. And then make a histogram of that, insert a histogram. I should have moved up before I did it. Now I got to move it up. Oh, that's not too bad. I thought it was going to be more tedious than that. Okay. So this is going to be the samples. Let's just say, and that has, you know, five buckets. So there's just a different kind of tool that we can basically use for the shuffling tool and how we can generate our random data samples and use that, of course, to create our histograms of the sample data that we have chosen per sample or possibly of the averages of all, you know, the average sample data.