have been calculated properly. So then we can take the percent of the total. For example, what percent of the time did we find nine potholes in our 500 counts of 100-mile spans? That happened two times, so two over 500 gives us 0.4%. And if I scroll down, we could ask, what's the percent likelihood that we had 20 potholes in the 100-mile span? That happened 56 times out of 500, so that's an 11.2% likelihood that we had the 20 potholes. So that's going to be our data set, which we can represent in terms of the frequency, meaning the count, as well as the percent of the total.
Then I took some standard calculations. The mean is simply the average of the data set, so the Excel formula would be =AVERAGE of this data set, and we get a mean of 20.14. For the variance, I'm looking at both VAR.P and VAR.S, for a population and a sample, just to practice both on this entire data set, and we get 20.49. One characteristic that's interesting about a Poisson distribution is that in a perfect Poisson situation the variance would be equal to the mean. So if the variance and the mean are pretty close, we're going, oh wait, this might be a Poisson distribution situation, in which case we might be able to use a Poisson curve to represent the data.
Before we go there, we can also plot the data. This is a plot of the frequency, so we're counting how many times we had 16 potholes, or 20 potholes, and so on. The highest bar is the 56 we saw for 20 potholes. So that's the count that's happening here.
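For anyone following along outside Excel, here is a minimal Python sketch of the same summary steps: percent of total, mean, and both variances. The small frequency table is made up for illustration and is not the 500-span data set from the walkthrough, and the helper names `percent_of_total` and `expand` are hypothetical.

```python
import statistics

def percent_of_total(count, total):
    """Share of all observations, expressed as a percentage."""
    return 100 * count / total

def expand(freq):
    """Turn a {value: count} frequency table into a flat list of observations."""
    return [value for value, count in freq.items() for _ in range(count)]

# The two figures quoted in the walkthrough: 2 and 56 occurrences out of 500.
print(percent_of_total(2, 500))    # share of spans with 9 potholes
print(percent_of_total(56, 500))   # share of spans with 20 potholes

# Hypothetical frequency table (NOT the walkthrough's data): value -> count.
freq = {0: 14, 1: 27, 2: 27, 3: 18, 4: 9, 5: 4, 6: 1}
data = expand(freq)

mean = statistics.mean(data)            # plays the role of Excel's AVERAGE
var_pop = statistics.pvariance(data)    # plays the role of Excel's VAR.P
var_sample = statistics.variance(data)  # plays the role of Excel's VAR.S
print(mean, var_pop, var_sample)
```

When the mean and the variance come out close to each other, as they do for this made-up table, that is the hint that a Poisson model may be worth trying.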
Again, the frequency represents the number of times each count happened in the 500 tests that we ran. If we're looking at 20 potholes, we had the 56, right? That's the one that's sticking way out there. It looks a bit like an outlier. I can also do it on a percent basis. If I take the percent of the total, notice you get, in essence, the same shape, but now we're looking at it as a percent of the total. So now we're saying, hey, that doesn't look perfectly like a Poisson distribution, but it looks like it might be a little skewed to the right, right? It might be looking kind of normal. A Poisson often looks like a bell shape that's slightly skewed to the right, as the general idea. So we're thinking maybe it would be a Poisson.
So now we could say, what if I took the mean that we got here, for the span of 100 miles, and built an actual curve using the Poisson distribution? This would be the perfect representation, as opposed to our data set, which is only an approximation of that curve. So I can set up the x values here, representing zero potholes in 100 miles, one pothole, two potholes, and so on. Then I do a Poisson calculation with this formula, POISSON.DIST. The x is the zero, the one, the two, and so forth. The mean is now the 20.14 we got from our example. And then do I want it to be cumulative? No, I don't want it to be cumulative, so that argument is zero. I want the percent likelihood of each of these numbers. What's the percent likelihood that we get nine potholes in the 100 miles? 0.27%. And then I can compare this to what we actually got over here. Now remember, the Poisson distribution isn't giving us an actual frequency, the count of times each number of potholes showed up. It's giving us the likelihood of each number of potholes, right?
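The non-cumulative POISSON.DIST calculation corresponds to the Poisson probability mass function, e^(-mean) * mean^x / x!. A minimal Python sketch of that formula follows, using the 20.14 mean from the walkthrough; the function name `poisson_pmf` is an assumption made for this example.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

MEAN = 20.14  # mean potholes per 100-mile span, from the walkthrough

# Likelihood of seeing exactly 0, 1, 2, 9 or 20 potholes in one span.
for k in (0, 1, 2, 9, 20):
    print(f"P(X = {k}) = {poisson_pmf(k, MEAN):.4%}")

p_nine = poisson_pmf(9, MEAN)  # roughly 0.27%, matching the spreadsheet value
```

Each value is a likelihood between 0 and 1, so it lines up with the percent-of-total column rather than with the raw frequency column.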
So this column corresponds to the percent-of-total column from our data. I can compare them: if I subtract one column from the other, here are our differences. I can look at those differences and ask, does that look pretty close? It looks like the data that we generated is pretty close to an exact Poisson distribution.
Then, of course, we could ask questions using our Poisson distribution, such as: what's the likelihood of having zero to five potholes? Let's pull out the trusty calculator. That would be a cumulative sum from here to here. It's still pretty low, about 0.006%. I can't add it up exactly here because the column would need another decimal place, but the idea is that you could sum the individual likelihoods this way, or we can use the POISSON.DIST formula. The x is now going to be the five, the mean is still the 20.14, but now I do want it to be cumulative, so I put a one instead of a zero, and that will sum it up for us. So the Poisson distribution gives you the cumulative likelihood up to that point, which is a common kind of question.
What if I want seven to 14? Here's seven, and then down to 14. I could sum those up, being careful to ask: are we including the seven or not? In a practice problem, or in any real situation, you have to be pretty careful about whether we're including the two ends or whether we really only want eight to 13. But I could then add those up here. I won't actually do it, but you get the concept.
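The cumulative and range questions can be sketched in Python as plain sums of the individual likelihoods. This is an illustration only, assuming the 20.14 mean from the walkthrough; `poisson_pmf` and `poisson_cdf` are hypothetical helper names, not Excel functions.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def poisson_cdf(k, mean):
    """Probability of k or fewer events: the sum of the pmf from 0 to k."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))

MEAN = 20.14

# Zero to five potholes, inclusive (the cumulative option set to 1 in Excel).
p_zero_to_five = poisson_cdf(5, MEAN)
print(f"P(0 <= X <= 5) = {p_zero_to_five:.4%}")

# Seven to 14 with both ends included, summed term by term.
p_inclusive = sum(poisson_pmf(k, MEAN) for k in range(7, 15))

# The same range with both ends excluded, i.e. eight to 13.
p_exclusive = sum(poisson_pmf(k, MEAN) for k in range(8, 14))

# Cross-check: cumulative up to 14 minus cumulative up to 6 keeps the 7.
p_via_cdf = poisson_cdf(14, MEAN) - poisson_cdf(6, MEAN)

print(f"P(7 <= X <= 14) = {p_inclusive:.4%}")
print(f"P(8 <= X <= 13) = {p_exclusive:.4%}")
```

Note that `range(7, 15)` stops before 15, so it covers 7 through 14. Whether the endpoints belong in the range changes the answer, which is why it pays to settle that before adding anything up.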
If I do this with the POISSON.DIST formula, it's a little tricky. The cumulative function takes me up to the upper limit, which is 14, and I'm going to assume we're including the 14. Then I have to subtract out the cumulative up to just below the lower limit. I'm assuming we're including the seven, so the value to subtract up to is six. So it would be POISSON.DIST where x is 14, the upper limit, then the same mean as before, cumulative with a one, minus POISSON.DIST up to six, with the same mean, and cumulative again. I stop at six because I don't want to subtract out the seven. I want the seven to be included in the range we're talking about. So we can do the calculation that way as well.
Now, this is a histogram. We're doing it with a bar chart in Excel, by the way, but it's a histogram kind of format. We're plotting the P of X, which is the perfect Poisson distribution curve, against this graph over here, which was our actual data. And you can see they line up pretty close. You've got that weird outlier right there, but they line up pretty close. This is the same thing with a line chart as opposed to a histogram type of chart.
So now, looking at our data, we're saying, okay, can we get some predictions about these potholes? Let's count all the potholes that come up in every 100-mile span and see if there are any trends that might help us with our maintenance policy or something like that. So we counted the potholes. We noticed that the mean is close to the variance, so we're saying, hey, that looks like it might be a Poisson distribution. The plot looks kind of like a Poisson distribution too.
And then we actually plot out the Poisson distribution, compare it, and it's like, yeah, the difference between the Poisson values and our data is pretty small. So maybe we could use the Poisson distribution. Let's plot them on top of each other. And we could say, yeah, it looks like the Poisson is approximating our data pretty well. Therefore, going forward, when we make decisions about how many potholes might show up in any 100-mile span, we can use the Poisson distribution to make some approximations and plan accordingly. Notice that if the Poisson distribution did not fit well, if it couldn't approximate the actual data, then we're left with a problem, because then we'd have to do something different to extrapolate what the data is going to mean going out into the future.
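As a rough illustration of the comparison step, the sketch below sets the two observed figures quoted in the walkthrough (9 potholes in 2 of 500 spans, 20 potholes in 56 of 500 spans) against the Poisson likelihoods for a mean of 20.14. The rest of the observed table is not reproduced here, so this is only a partial check, and the variable names are assumptions made for the example.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

MEAN = 20.14
TOTAL_SPANS = 500

# Observed counts quoted in the walkthrough: potholes -> number of spans.
observed_counts = {9: 2, 20: 56}

differences = {}
for potholes, count in observed_counts.items():
    observed = count / TOTAL_SPANS           # percent of total, as a fraction
    expected = poisson_pmf(potholes, MEAN)   # exact Poisson likelihood
    differences[potholes] = observed - expected
    print(f"{potholes} potholes: observed {observed:.2%}, "
          f"Poisson {expected:.2%}, difference {differences[potholes]:+.2%}")
```

The gap at 20 potholes is the larger of the two, which is consistent with that bar standing out as the outlier in the charts.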