 Statistics and Excel. Poisson Distribution. Potholes in Road Example. Get ready, taking a deep breath, holding it in for 10 seconds, looking forward to a smooth, soothing Excel. First, a word from our sponsor. Yeah, actually, we're sponsoring ourselves on this one, because apparently the merchandisers, they don't want to be seen with us. But that's okay, whatever, because our merchandise is better than their stupid stuff anyways. Like our crunching numbers is my cardio product line. Now, I'm not saying that subscribing to this channel crunching numbers with us will make you thin, fit, and healthy or anything. However, it does seem like it worked for her. Just saying. So, you know, subscribe, hit the bell thing, and buy some merchandise so you can make the world a better place by sharing your accounting instruction exercise routine. If you would like a commercial free experience, consider subscribing to our website at accountinginstruction.com or accountinginstruction.thinkific.com. We are in Excel. If you don't have access to this workbook, that's okay, because we'll basically build this from a blank worksheet. But if you do have access, three tabs down below. Example, practice blank. Example, in essence, answer key, practice tab, having pre-formatted cells so you can get to the heart of the practice problem. Blank tab, basically a blank worksheet, except for our image here, which you don't really need for the practice problem. So we can practice formatting the cells within Excel as we work through the practice problem. Let's go to the example tab to get an idea of where we will be going. Looking at the Poisson distribution, once again, oftentimes examples related to line rating situations, as we saw in a prior presentation. But we can also have situations over space. So in this case, we're talking about potholes in a road, thinking about the mean number of potholes, the average number of potholes being 20 for 100 miles of road. As opposed to a line waiting situation where we were saying how many people will show up over a certain interval of time. Let's go back to the blank tab now and populate some of our data up top. So I'm going to go into A1. Let's first format the entire worksheet like we normally do, select in the triangle, right-click it on the selected area. We're going to format the cells. I'm going to go to currency, negative numbers, bracketed and red, no dollar sign. I'm going to get rid of the decimals for now and add them as I need them. Okay, let's go to the home tab, fonts group and make the whole thing bold as well. You don't have to do that, but I will do that. Alright, so here's our conditions. We're going to say that the mean potholes per 100 miles per, I'm going to say per miles so that I can put them in a separate cell. I'm going to make A a little bit larger. So the mean number of potholes I'm going to say is 20 per 100 miles. Okay, so now what I would like to do is kind of imagine as we did last time that we were actually going out and counting the potholes per 100 miles. Now in this case to do that to get a random generated number, we have to already have the mean, but in practice you might not know the mean, right? What would we do to start this process? We might go out and count the potholes for every 100 mile interval and try to see how many potholes are in each of those 100 mile intervals. And then with our data, we might plot our data to then see if it conforms with a Poisson distribution. And if it does, the Poisson distribution might be a way for us to more easily make predictions and predictions into the future about the pothole situation. So let's do a random number generation, which will mirror or mimic us basically going out and counting the potholes. To do that, we're going to go into the data tab and we need the data analysis. We saw turning that on last time. If you don't have that, you go to the file tab, you go to the options down below, and then you go to the add ins and then manage. You want the Excel add ins and go and then you want the analysis tool kit. That's the one we want. Okay, so I'm going to make a little bit wider here. I'm going to make my image a little bit smaller. I'm going to make column C a little bit smaller. I'm going to put the data here. So this is going to be my X data results. Let's format that first home tab font group. I'm going to make it black and white, which is our normal heading formatting. Let's center it as well. And then I'm going to go into D2 because that's where I want the data to go insert. I'm sorry, not insert data up top. And then we want to go into the analysis and the data analysis. I'm going to do a random number generation. And here's the criteria. I'm going to put a one here because we just want one column. Number of random numbers. Let's put a thousand again. So we're just going to generate a thousand numbers that we'll be doing. Those thousand numbers representing a thousand 100 mile tests that we did. So that might be a fairly large test. Let's put like 500, like 500, 100 mile tests. Let's do that one. And then it's not going to be a discrete. We wanted to give it to us in accordance with a Poisson distribution, which still has an element of randomness. But it's going to be distributed according to a Poisson distribution with the lambda, the mean being, we're going to say 20. Now again, in real life, we might not know the mean. We'd go out and count potholes to basically figure this out, right? We would be counting every 100 miles, the number of potholes within each 100 mile frame. Let's go ahead and say, Oh, where do I want it? The output? I wanted to go right there. So that's where the output is going to go. Okay, let's go. Okay, and see what we get. So what's this mean then? It means that for the first 100 miles that we looked at, there were 18 potholes. Then there was 26 potholes in the second 100 mile space, the third mile 100 mile space, 21 potholes, 26 potholes in the next 100 mile space, and so on and so forth. So now we can take this data and we can create our bins with it to see if we can see any patterns with that data and possibly if it corresponds to closely or relatively closely to some kind of distribution that we can make a function for like the Poisson distribution. So let's say that we have our bins are going to be anywhere from, we're going to say, let's go from, I'll say 0, 1, 2, and I'm going to bring it all the way up to 100 here. So I'm going to select those three numbers, fill handle, drag it down to 100. Remember that in theory there can be like an infinite number of potholes. The whole road could be a pothole I guess in theory, right? Because there's no upper limit if we're talking about a Poisson type distribution. But if the mean is 20, then you would expect you're not going to get, you know, it's going to be quite rare that you're going to have potholes more than like 100 potholes in that interval given if it's in a Poisson distribution format. Now remember if we want the frequency, we could try to say I want to count these numbers and I could try to use a count if function equals count if brackets and the criteria range would be this. I'm going to hold down control shift down arrow, getting down to the bottom, control backspace going back to the top and then comma count that if the criteria is zero. And I could say enter and copy that down. Now I'm not going to leave this as the way we do it because the other way is a spill function for the frequency. I just want to show that sometimes this doesn't give us the result that we would expect, right? So if I sum this up at the bottom by saying alt equals I should get to 500 and I don't. And that's because sometimes I think that the distribution might give us a number that's not exactly a whole number or something like that because it's not picking up all the numbers which either means that it's not picking up one of these numbers or that there's some numbers that are actually higher than 100. So we could use a formula to try to make these all exactly rounded numbers but I find the other way we could do this I'm going to delete this data is to use the frequency which is a spill array formula instead. So the frequency equals the frequency tab and then we want the data array which is going to be here. I'm in D2 control shift I'm holding control shift down arrow takes me to the bottom then I'm going to hold down control and backspace to bring me back up and then comma the next argument are the bends which are going to be these bends it's going to look for these bends control shift down and then again control backspace to bring me back to the top closing it up and enter and it spills it now so now we have a spill array type of system I'm going to try to remove that last bit so it doesn't do that last one double clicking up top and say what if I clip it at 101 will it then stop that it does and now let's do the total which will be alt equals there's 500 so now it picked up all 500 of them and put them into the relevant bends which is nice let's make this a header I'm going to go to the home tab and say this is going to be black and white and then we can center it so now I could graph this frequency right so I could say let's take this frequency control shift down control well hold on a second whoa k-passo control shift down I don't want the total though all right I'm just going to select the whole thing and then I'm going to insert so let's go to the insert over here and we'll say that I'm going to insert a chart let's make it a bar chart for the frequency so there's what we have on the bar chart let's click into the chart chart design I'm going to the data up top so the data is good I want to edit this side however to pick up my bends on the left so it picks up my bends and not just some generic bends control shift down and then shift up so it doesn't shift up so it doesn't pick up the total and then enter okay and okay so there we have our distribution and you can see it's a little choppy here we don't have as much data but it looks like it might be conforming to you know the poisson type of poisson type of distribution so it's it's not going to be perfect because there's still randomness involved in it but if we can approximate what is happening here with the poisson distribution then we might be able to use that now I can also do the percent of the total percent of total let's make this format paint home tab format painter here I'm going to take each of these numbers and say equals the frequency divided by the total 500 I'm going to make that second number absolute because I want to take each number divided by the total so I can say f4 and enter and then I'm going to make it a percent home tab numbers percent to find it and then double click on that copy and it's down and if I then double check that everything is done properly I'm going to delete this total and sum it up for the double check and it should come out to 100 which is going to be alt equals will give us that some function nice and easy there's the 100 so I could make my frequency my chart using this column as well so I can select a this column and I could say okay let's make our chart based on that insert and then charts bar chart boom bar chart boom let's do some formatting while I'm here on it data are this side I want to make sure that this is picking up our bends shift up and okay so so then we get the same kind of look and feel but now on the percent basis as opposed to the whole number of bases all right so then let's do let's take the a couple stats on our data I'm going to make a skinny eye here make a skinny eye and then we're going to say that this is going to be the mean uh calculation so I can take the mean calculation of our data so I'll say this equals the average shift nine go into our data the whole data set and control and control shift down arrow and enter so now we've got it the mean being 20 I'm going to add some decimals it's not exactly 20 because remember we used the mean as a condition to populate the data but there's still randomness in the data that we populated so the mean of the actual data that we have is more like 20 point you know one four uh potholes per 100 miles and then we can calculate the variance let's do the variance with a p so it equals the variance and we'll take the variance of all of our data set control shift down and enter and let's add some decimals there and you can see it's pretty close to the mean and that's another indication that we might have something that would be uh we can represent abstractly with a poisson distribution here's the variance if it was a sample equals the variance s of our data and we'll add that a little bit more here so then we've got 1952 so you could see these so now if we're looking at this data we're saying okay so now we're talking we're talking about something that might conform uh to a poisson distribution because we have something that's happening over a certain uh not time element but space in in terms of the road we would think that each instance of the pothole would be independent from other instances of the pothole and so on and so forth and when i graph it it looks kind of like a poisson distribution possibly could fit that and when i pick up the variance and the mean of the data they're pretty close to each other which means that we might benefit from making predictions based on the poisson distribution curve into the future right so next time what we'll do is we'll then do the poisson distribution which will be a more perfect curve and and do some comparisons and see how how well that relates to our actual data that we populated and then think about well will will will that poisson distribution then be useful to make predictions uh as we extrapolate this information into the future for decision making