 Statistics and Excel, uniform distributions with dice. Get ready, taking a deep breath, holding it in for 10 seconds, looking forward to a smooth soothing. Excel. First, a word from our sponsor. Yeah, actually, we're sponsoring ourselves on this one because apparently the merchandisers, they don't want to be seen with us. But that's okay, whatever. Because our merchandise is better than their stupid stuff anyways. Like our CPA six-pack shirts? A must-have for any pool or beach time. Mixing money with muscle? Always sure to attract attention. Yeah, even if you're not a CPA, you need this shirt. So you can like pull in that iconic CPA six-pack stomach muscle vibe, man. You know, that CPA six-pack everyone envisions in their mind when they think CPA. Yeah, as a CPA, I actually and unusually don't have tremendous abs. However, I was blessed with a whole lot of belly hair. Yeah, allowing me to sculpt the hair into a nice CPA six-pack-like shape, which is highly attractive. Yeah, maybe the shirt will help you generate some belly hair too. And if it does, make sure to let me know. Maybe I'll try wearing it on my head. And yes, I know six-pack isn't spelled right. But three letters is more efficient than four. So I trimmed it down a bit, okay? It's an improvement. We'll basically build this from a blank worksheet, but if you do have access to three tabs down below, we've got the example, practice blank, example, in essence, answer key, practice tab, having pre-formatted cells so you can get right to the heart of the practice problem. Blank tab, blank worksheet so we can practice formatting the cells within Excel as we work through the practice problem. Let's look at the example tab to get an idea of where we will be going as we think about a uniform type distribution using an example of rolling a dice. Now note that the die has six sides, six number on the die. If we thought about rolling the dice an infinite amount of times, which we can imagine to be the entire population, an infinite amount of rules, we would expect basically a uniform type of distribution. Then we can imagine, well, what if we rolled it a finite amount of times and plotted out the distributions of a finite amount of times of rules, and we can compare that then to the uniform distribution. So that's what we will do. Let's go to the blank tab and let's first, so we're imagining the dice rule. So remember there's six sides on the die. So what would be the expected rules of any number if we were to roll it, let's say a thousand times. So we're going to say rules, let's going to say a thousand times, and let's format ourselves. Hold on a sec. Let's select the entire worksheet, right-click on the cells, and then format them. I'm going to go into the currency, negative numbers bracketed, make them no dollar sign and no decimals and okay. I'm also going to make it bold. You may not need to, but I'm going to work it bold here. Home tab, font group, bold. So there we have it. I'm going to hold down control and scroll in a bit. So I'm currently at the 265% on the scroll in. So I'm going to imagine that we roll it. Let's roll the die a finite amount of times, which is going to be a thousand times as opposed to the population, which would be an infinite amount of times that we can imagine rolling it. And then the outcomes, we're going to imagine any one number. The odds on each rule of any one number coming up is going to be equal to one out of six. So there's six numbers on the dice. Let's go ahead and format that number. Let's make it a percent and add some decimals. So six numbers on the dice. We would expect then a one in six chance on each roll for any one number on the dice. So let's then say, well, the expected rules of any number, expected rules of any one number, then if I roll it 1000 times and I have any one number, whether that be a one, two, three, four, five or six, I would expect how many times for that number to come up. Well, if each roll has a one six chance, it would be 1000 times that percent. So we'd get 167 about, if I add a couple decimals, home tab, number group, a couple decimals, 166.66 on forever. So that would be what the expected results would be in essence for one number if we rolled it a finite amount of times 1000 times. We would expect to see 166.66, you know, number ones, two's, three's and four's. Now note that this expected result is actually impossible to do, right? Because I can't get an outcome that's not going to be a whole number. So note that we're basically making a model here, a prediction based on what we know that we know can't actually happen in real life because I can't get a roll that's going to be not a whole number. The model is still, of course, useful because we can get, you know, we can get the expected outcome with the model. So if I then let's make a skinny C here and then the headers of our table, I'm going to say these are the dice numbers and then I'll tab and say these are going to be the expected number of rolls. I'll just say expected rolls and enter. I'm going to format these now by selecting these two up top. We're going to go to the home tab, alignment group, wrap the text and then alignment group and center the text. I'm going to go to the font group, bucket dropdown, make it black and then the dropdown, make it white. All right, so there we have it and I'm going to say the number of rolls is just going to be one, two, three, four, five, six, right? And then we're going to say what are the expected outcomes for each of them, each of them, if we roll them a thousand times is going to be is going to be 166.67 is going to be the expected outcome for each of them. So I'll just say this equals this outcome and let's say F4, that's going to put an absolute value or dollar sign before the B and the 3 and then I'll take that and just copy it down and put my cursor in the fill handle copy that down. There we have it. So the total then that we would expect to have in kind of our perfect world would be equal to the sum. And by the way, I'm going to start using some more keystroke. Sometimes you could hit alt enter here. Let's do that again. Alt enter and then it'll try to sum up what's above it. So notice that keystroke could be a lot faster. Oftentimes whenever you're using the sum function, alt enter and so there we have it. And then there and so there's the thousand rules. That's what we would kind of expect to happen. Let's put an underline here, home tab, font group and underline it. Let's make this into our kind of format that we've been using. I'm going to select these items, font group, drop down on the bucket. And if you don't have that blue, it's in this. I'm going to make it that blue. That's the blue I like and then font group, drop down and all borders. We can also make this one a little bit more skinny so that we can just trim this up. Let's do the same over here as well. Making that home tab, font group, blue and bordered. Okay, so there we have that. Now if we were to plot this, then I can plot this out and just say, okay, well, if I select these items, these two and I was going to the insert and I'm going to use a bar chart to plot this one. So charts, drop down, we're going to make a bar chart which is going to look kind of like a histogram. I'm going to pull this over, right? Because I'm going to say the numbers I want on the bottom and I'm going to close this up a little bit. And so then I need to adjust the data. So I'm going to go in here and I'm going to say, let's go to the data up top. So I'm in the chart design and then the data. And what I want on this side, which is going to be on the y-axis, is just the expected roles. So I don't want this one or I could basically delete this one, delete and I want the expected roles. And then over here, I'd like to make sure that I have my actual labels, not the ones that they're going to make up. So I'm going to copy these items and okay. So there we have it. We're going to say okay. And it's picking up the name of expected roles. I don't really need the legend over here. I could add the names on the side. I could say these are going to be the data labels, right? I could add data labels and say there's the data labels. And then we could also add access titles if we wanted to. So on the access titles, this is going to be the results. So I could say this equals the expected roles. And on this side, I'm selecting it. I'm going to say this equals the dice numbers, right? So you can format your chart thusly. And notice sometimes now that I have actually my chart information, I might not even need like this whole bit right here, right? I might not need this stuff because now I've got the numbers that just basically represent it. So sometimes it might be nice just to unclutter things to actually delete this and then click on these little lines. So you get the lines and delete them so that you just get the numbers that are representing them. And that could be another format that you can use just as we build our charts. But you get this kind of boring uniform type of distribution. And if we were to write an equation for the uniform eula, it would be something just like f of x equals c. So for every x we have, we're going to get c. So it's a uniform distribution. And note that this is a family of distributions because it's possible that if we roll the dice less numbers, like if we roll the dice only 300 times, then we have a different distribution. It's still a uniform distribution here that is at the 50, right? I'm going to say 1,000 and bring it back there. Now, of course, in real life, we wouldn't get a uniform distribution because we're using a sample in essence instead of the entire population of infinite rules, right? We're just doing 1,000 rules. So we can simulate, okay, what would actually happen and compare a simulation of rolling the dice? So let's say we do a simulation of rolling the dice, and now we'll actually test this out and see how, and then we can compare to what the expectation is. That's basically what we often do, right? We're going to say this is the perfect model of the world, our expectations, and then we'll basically see what actually happens in the test and how close is it to the actual expectation. So we're going to go to the home tab, font group, make this black and white, and then we'll center it and then we're going to roll our die. Things are going to get dicey here. Things are getting dicey. So we'll do this by using our random function again, equals random brackets or random between, I should say between, tab. There's our formula, and I'm going to say we want to start at 1, comma, and go up to 6. So it's going to give me a random generation of 1 to 6 representing a dice rule, so that's just perfect. And then I can copy that down. I'm going to go down to 1,001, looking at the numbers on the right-hand side, because 1 is already taking up to get 1,000 rolls. So I'm just going to bring this down to 1,001 and note I can do that quite easily in Excel, even though it's quite a lot of rolls. It would be a lot more difficult to do this in the old days when you actually had to sit at the table and roll the dice a thousand times. That would be a pretty cushy job for somebody to be the dice-roller for the experiment, but now we can just kind of simulate it. And so then if we do that, then this is what it's spitting out for the random outcomes. So notice it changes again all the time, so I don't want it to change. I want that to be my generator. And now with the generator, I'm going to copy this whole column, right-click and copy, and then paste it, but hard-code it so that I'm going to say right-click and paste just the numbers. So now it's not regenerating every time, and my generator is still over there if I want to do it again. So let's go ahead and make this one, home tab, font group, black, white. Let's center this thing. And so there are our results. So now let's copy this same table from my expected results and then add the actual results in another column. So to do that, I'm going to put my cursor in column Q equals, scrolling to the left and just picking up the dice rolls. And then I'll just copy that down. I'll copy everything that's in that table down. Six, and then the total. And then I'll copy it to the right as well. And there's the same information that we pulled in from the table. Let's make us format it. I'm going to go to the home tab, font group, black, white, wrap it, center it. By the way, you could do it this way too. Let's do it this way, it'll be easier. I could select this entire thing and say I just want the formatting so I can go home tab, format painter. Just give me the formatting and then just put that right here. Boom, and it paste the format of it beautifully. Okay, so then let's take our actual rules, actual rules and compare it. And so I'm going to format paint this one again, home tab, format painter, boom, to get that. And then in order to get the actual rules, what I'd like to do is say, Excel, take everything in this series of numbers and count them if every time you see a 1. So we can use our trusty count if function to do that. So this equals count if brackets. I'm going to go over here, put my cursor in O2, hold down control shift and down on the keyboard taking it all the way down to the bottom and then we can have control backspace taking me back up to the top and so I can see the formula. So there it is, closing it up and we have too few arguments. Count if, I need to finish the argument, is a comma. What's the criteria? Second condition. Number 1. So count if in that range you see a number 1. Enter. And then I can copy this down. Now notice this time I didn't put a table over here and sometimes the tables are useful and sometimes they're not so I'm going to try to go back and forth between using the table or not. If I had a table that I was referring to I wouldn't need to make the cells absolute references. But here since I don't I could make them absolute by selecting F4 here and F4 so the range of the table doesn't move. Now note you can also use spills and arrays so there's actually multiple ways to do the same things these days which is really cool but also confusing so I'm going to try to mix in some of that stuff as we go. So in any case I'll do that. I'm going to copy this down put my cursor on the fill handle drag it down. So then I should have the same range here right picking up the same numbers this way and it's counting if there's a 6 in there. So that looks good. Now I can kind of double check by summing it up and I should still get up to 1000 so that's our double check that my range didn't get a skewed or anything like that. I could take the difference then difference between what we expected and what actually happened this equals what we expected minus what actually happened I can copy that down double clicking on the fill handle and then down here alt equals that's our keystroke for the sum function enter there we go with our differences I can go home tab font group and format this in the same formatting and then let's make all of this bordered and blue and let's put some underlines under here and underline so there we have it we can make this a little bit thinner maybe can make that little thinner I don't really need a space in between these two or this one we can close that up if we want to and so then we could make a histogram based on what the actual results are so there's a couple ways we can do that we could take a histogram of the entire column of results or since we've summarized the results in essence into buckets over here we can use our bar chart so let's use the bar chart I'm going to select this column holding down control and select this column because I want to look at the actual results and then I'm going to go into the insert tab and we're going to go into the charts and add a bar chart so there's the bar chart I'll drag it to the right let's make it a little bit smaller bring it on over to the right and so there we have it so then I'm going to go into my chart here chart design tab up top select the data so once again I don't want the dice over here so I'm going to select the dice only and delete it I'm sorry yeah we don't want the dice and now I deleted the wrong one sometimes it gets a little easier but I'm going to delete them both and add another one again and this is going to be the dice and then I'm going to say the column we want is this column hold on a sec I don't want the dice I'm going to say we want the actual rows and then delete this and say we want this that's what we want here okay and then on this side I want to make sure that it got my actual numbers so it looks like it's got one to six here but I want to go here and then pick up these numbers so that's the X and this is the Y actual rules clicked off okay so there we have it so I'm going to delete this thing down here and then we have the actual rules and we could also say if I hit the little plus button say we want data labels we can add the data labels like we did before and we can compare this to what we had in our perfect world summary over here so there we have that now you could also make a histogram from the data from the actual dicey rules that we ruled so I could select my data putting my cursor in O2 holding down control shift down arrow and then holding down control back space going back to the top insert another histogram and this time a histogram not a bar chart and then I can pull this down now the histogram is going to give us our buckets which isn't exactly what we want here because we would like it to number you know one to six but I can kind of adjust those buckets I can go into these buckets down here and say that I want I just want six buckets and then it'll give me something similar right and then I can say okay let's go up top and say that we want this to be our data labels and you can see that we get you know the same data points as we did up top I'm going to delete the chart title actual rules and so there we have that now we could obviously we have something different than than you know what we expect like in a perfect world right because we're taking an approximation taking a sample in essence from the population which would be the infinite number of rules so we could also then say well what if we did this multiple times so we can make multiple histograms to see that we'll get a different result each time with this random number generator and if we just played with our random number generator I could say make four of these generators right and then and then just copy and paste them four times so for example here's my number generator if I want to simulate us rolling a thousand rules four times I can just copy this entire let's copy this entire column right click and copy and then I'll just bring this over here and say that we did this four times so I'll say right click paste one two three right click paste one two three right click paste one two three right click paste one two three so now we've got four hard coded rules again this is rules number one rules number two rules number three rules number four so that would be quite long a lot of rolling to do if we were in the good old educational department statistical studies back in the day where we had that cushy job of just rolling the die a thousand times four times over but they have they've eliminated that job now because we can do it this way and the unions tried to stop it you know they was like they're like dude what are you doing we need to like this is an important job but somehow they couldn't hold out so so then we can make histograms of the four here and check those out so if I go from here holding control shift down let's do a quick couple histograms and we'll say and so notice each one of those even though I copied the same ones are giving me different results right even though I pasted them from the same thing so if I go over here and say let's hold on a sec control shift down control backspace let's just enter four histograms insert charts histogram histogram he made a gram of his toe and called it a his toe gram called it his toe gram this is his toe gram alright what are you talking about let's make this let's go down here and then say number we want six of these and so I'm going to say tab okay and so there's that one let's put some some chart titles or data labels on it the chart title I'm just going to say rule one and then we'll just check out what happens on rules to put in my cursor here control shift down control backspace scrolling down a little bit because I want it underneath insert chart histogram and then we'll put that down here and I'm going to say let's do the data labels and then go down here here access we want how many bends we want six bends close that up and then I'll put my chart data labels on that one and this was this is rules too man there's a lot of rules this is getting seriously dicey this is dicey business tell you what you thought fishing fish on the boat the salmon catcher people had dicey jobs there's nothing then I'm going to put my cursor on ad to control shift down and then control backspace and then I'm going to scroll down a bit I want this here insert and charts group another histogram and this is for the next one I'm going to put my cursor down here and let's say that we want how many bends do we want let's say six bends close it up add some chart labels and this is dice roll my spelling rules right I don't know I don't think that's how you spell rules I think you need two L's on the roll that's like a roll like a cinnamon roll oh man people are going to think I'm talking about cooking this whole time is that I started over I'm not doing it over just because people know what I'm talking about people know what I'm talking about okay so let's do one more and we're going to put our cursor on this one and control shift down control backspace scrolling down uno vase mas one more time insert charts histogram and we're going to say buckets are going to be number six tab closing plus button data groups this is rolls four not cinnamon rolls baking terms these are dice man we're in the gambling house rolling dice alright so you can see the point is that they're different they're different because it's a sample but what we would be doing is comparing that then to what we would expect in a uniform type of distribution and that would be how we can usually will be generating or approaching certain questions now obviously the uniform distribution the fact that it can be simulated with a line and a formula is quite useful because that gives us the predictive power right so even though it's not going to give us a certain prediction in a sample situation the fact that we have a general concept that we can approximate like we can approximate the examples here with a line and so that's what we would and so if we can and if we can do that then obviously that helps us for predictive power in the future now when we get into other kind of families of distributions that's kind of what we're looking for we're looking for the formulas will get more complex to simulate a line through or curve but if we can get a curve that can be approximated with a with some kind of formula then that would be great because that gives us that's what gives us a predictive power to then use that formula