middle point, and sigma is the standard deviation. The standard normal distribution is the special case where mu equals zero and sigma equals one. That's the textbook case, where mu sits right at the middle point at zero, which means half of the data lies on the positive side and half on the negative side.

Then we have the Z score, which is used to standardize and represent data points in terms of their distance from the mean, measured in units of standard deviation. Notice in the chart up top we have the x values, which we can imagine being grades in this case: we're imagining a bell-shaped curve that approximates grades. We have 34%, 55%, 61%, 67%, 70%, with the average somewhere around 74%, then above average going up to 85%, and so on. At that middle point we have the mean, which is around 74 in this case. I can then identify a point by its x value: if I have a 67, I can name it by the x, 67. But I can also name that same point by its Z score, shown down below, which is basically a number telling us how far from the middle point we are. The Z score is zero at the mean; it becomes increasingly negative as we move below the mean, and increasingly positive as we move above it.

The Z score is particularly useful when we're comparing, say, two different data sets. In our example problems we use batting averages across multiple years, which is essentially a job scenario, because baseball players are measuring their performance at a job. So you might find similar applications for other types of jobs.
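As a quick sketch of that standardization, here's the Z-score formula in Python. The mean of 74 comes from the chart, but the standard deviation of 7 is an illustrative assumption, not a value given in the lecture:

```python
def z_score(x, mu, sigma):
    """Distance of x from the mean mu, measured in units of standard deviation sigma."""
    return (x - mu) / sigma

# Illustrative values: mean grade of 74, assumed standard deviation of 7
print(z_score(67, 74, 7))   # a grade of 67 is 1 standard deviation below the mean
print(z_score(74, 74, 7))   # the mean itself always has a Z score of 0
print(z_score(88, 74, 7))   # 2 standard deviations above the mean
```

Because the Z score strips out the mean and the spread, a Z score from one year's batting averages can be compared directly against a Z score from another year, even when the two years have different means and different spreads.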
The idea is that if we compare a baseball player in one year to a baseball player in another year, we can look at their batting averages, which give us a percentage and so let us compare them even if they had different numbers of at bats. But other circumstances might be revealed: in one year the mean might be higher or lower for whatever reason, and the spread might differ from year to year. Therefore it can also be useful to compare the Z scores from year to year, as we'll do in those examples. We'll also graph this, and note it's a little fancy: we're graphing with two horizontal scales, one for the grades (the x values) and one for the Z scores.

Okay, the area under the curve integrates to one, making it a probability distribution, and probabilities of intervals can be found as the area between the interval endpoints. In other words, we're usually going to be thinking about an area-type graph here and looking at the area under the curve, which is really an integral-calculus idea. We won't be doing calculus here, though, because Excel will calculate the area for us; we just want to understand the concept so we can use Excel properly. For example, we might look at where a 91% lies, or ask: what is the likelihood of getting a 91% or above? For that, I need the area under the blue part of the curve. In Excel, the cumulative form of the formula gives the area from the left up to a point, so I can get the area of the orange part from the left up to here. If I want the blue part, I can use the fact that the whole area adds up to 100%, or one, and subtract: one minus the orange area gives the blue area.
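To make the cumulative idea concrete, here's a small Python sketch of the normal CDF, the same quantity Excel's NORM.DIST(x, mean, sd, TRUE) returns. The mean of 74 and standard deviation of 7 are illustrative assumptions:

```python
import math

def normal_cdf(x, mu, sigma):
    """Area under the normal curve from the far left up to x
    (conceptually what Excel's NORM.DIST(x, mu, sigma, TRUE) computes)."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 74, 7                     # illustrative mean and standard deviation
orange = normal_cdf(91, mu, sigma)    # area from the left up to 91
blue = 1 - orange                     # P(grade >= 91): the whole area is 1, so subtract
print(round(orange, 4), round(blue, 4))
```

The subtraction in the last step is exactly the "one minus the orange part" idea: the cumulative formula only ever gives area from the left, so right-tail probabilities come from the complement.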
So we'll take a look at some examples of those types of calculations in Excel. We could also, of course, have a problem where we want something in the middle, where we're trying to pick up the middle part of the curve. Or we might have a scenario where we're looking for the orange part as opposed to the blue, and the orange area is exactly what the cumulative NORM.DIST function gives us, as we'll see in our practice problems.

Why does the Gaussian distribution arise so frequently, you might ask. Because this curve is so famous and shows up so often, we tend to think that all data conforms to it, which isn't the case. We still run into the problem that certain data sets might not conform to any of our nice curves: the Poisson distribution, the exponential distribution, or now the bell-shaped curve. But a lot of things do, especially things that happen in nature. So we want to follow the same kind of process as before. Remember that the bell shape up here isn't representing actual data; this is not a histogram. We would typically look at a histogram of the data first and ask: does it approximate a bell curve, or possibly some other distribution, like a Poisson? If it approximates a bell curve, then we graph the bell curve out and see whether its special characteristics give us predictive power over the thing the data set represents. We also have the central limit theorem, which we'll talk more about in future presentations. Right now we just want an idea of the characteristics of the bell curve, but the central limit theorem is one reason why so many things conform to a bell-curve type of distribution.
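For the middle-of-the-curve case, the area between two endpoints is just the difference of two cumulative values. Here's a Python sketch; the `normal_cdf` helper is a stand-in for Excel's NORM.DIST with the cumulative argument set to TRUE:

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative area from the left up to x (Excel: NORM.DIST(x, mu, sigma, TRUE))."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def area_between(a, b, mu, sigma):
    """Probability that a value lands between a and b: one cumulative area minus another."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# For any normal curve, about 68% of the area lies within one standard deviation of the mean
print(round(area_between(-1, 1, 0, 1), 4))   # roughly 0.6827
```

In Excel the same middle-area calculation would be two cumulative NORM.DIST calls subtracted from each other, one at each endpoint of the interval.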
So: no matter the original distribution of the population, the distribution of the sample means tends toward a normal distribution as the sample size increases. That demonstrates the ubiquity of the normal distribution in nature and statistics, and we'll talk more about it in future presentations. As for influences on data, there's also a more intuitive way to see why certain scenarios conform to a bell curve: many of the values we measure are the result of various minor influences.
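That intuition, many small influences summing into a bell shape, can be demonstrated with a short simulation. This is a hedged sketch: the number of influences and the sample size are arbitrary choices, not values from the lecture:

```python
import random
import statistics

random.seed(0)

# Each measured value is the sum of many small, independent "influences".
# Each influence is uniform (flat, not bell-shaped), yet the sums pile up
# into an approximately normal, bell-shaped distribution.
def measured_value(n_influences=50):
    return sum(random.uniform(-1, 1) for _ in range(n_influences))

values = [measured_value() for _ in range(10_000)]

mean = statistics.mean(values)
sd = statistics.pstdev(values)

# If the result is roughly normal, about 68% of values should land
# within one standard deviation of the mean
within_one_sd = sum(abs(v - mean) <= sd for v in values) / len(values)
print(round(mean, 2), round(within_one_sd, 2))
```

Replacing the uniform influence with almost any other small random effect gives the same bell-shaped result, which is the central limit theorem at work.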