So here's a quick recap of what we've done so far. We've discussed several types of discrete probability distributions. We've introduced the idea of a random variable whose output corresponds to an outcome in the sample space. We've introduced the idea of a continuous random variable. And we've looked at using a continuous random variable whose value rounds to the output of a discrete random variable. And the important question is, why? And the answer is, because calculus is easier than algebra. Now, if you haven't taken calculus, that's not really true. But here's a crash course in applied calculus, in particular what we actually need.

A region in the plane requires a specified top, bottom, left, and right. And we can talk about the area of the region below some graph and above the x-axis, with the vertical line x equals a as its left side and the vertical line x equals b as its right side. And this area is expressed using the definite integral, written this way. Now, generally speaking, computing a definite integral is hard unless it's impossible. And in fact, for many important functions, it is impossible. What's important to understand, however, is the following. The value of an integral can be computed from its endpoints, in other words, those left and right values. Integrals are areas, and areas are additive.

What do we mean by that? Well, let's consider the following. Suppose y equals f of x is a curve entirely above the x-axis and we know the values of some integrals. Let's use them to find the value of another integral. Now, we know nothing about y equals f of x except that it's above the x-axis. But if we look at this first integral, this gives us the area under y equals f of x, above the x-axis, from x equals 5 to x equals 8. And we find that that area is 0.3. Similarly, the second integral tells us that the area under y equals f of x, above the x-axis, from x equals 8 to x equals 20 is 0.5.
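The slides aren't part of this transcript, so here is a reconstruction in standard notation of what has been described so far: the area as a definite integral, the two integrals we've been given, and the general additivity rule. The symbols a, b, and c are just placeholders for endpoints, and the layout is an assumption about what the slide shows.

```latex
% Area below y = f(x), above the x-axis, between x = a and x = b
\[
  \text{Area} = \int_a^b f(x)\,dx
\]

% The two known values from the example
\[
  \int_5^8 f(x)\,dx = 0.3,
  \qquad
  \int_8^{20} f(x)\,dx = 0.5
\]

% Areas are additive: split the region at any point b between a and c
\[
  \int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx,
  \qquad a \le b \le c
\]
```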
Now, the area we're looking for is under the graph of y equals f of x, above the x-axis, from 5 to 20. And that means it's just these two areas put together. And so this area must be 0.3 plus 0.5, or 0.8. And so that's the value of the integral.

And so this leads to an important idea. Suppose x is a continuous random variable. A probability density function, the original PDF, is a function f of x where the probability that the random variable is between a and b is expressed as the integral from a to b of f of x dx, as long as a is less than b. In other words, our probability density function is a function that allows us to find a probability using an integral, I mean, using an area.

So as a quick example, suppose x is a random variable with probability density function looking like this. We can compute the probability that x is between 5 and 10. So that probability is the integral from 5 to 10 of f of x dx. And since x is greater than 0 on that interval, we'll use the formula e to the negative x. And since this isn't a calculus class where the details of the integration would be important, we'll leave the computation of this value to a computer algebra system and we find...

Suppose f of x is a PDF for some random variable x. We note the following. First of all, since probabilities are non-negative, the integral of our PDF from a to b must be greater than or equal to 0 for all a and b with a less than b. Also, since the sample space includes all possible outcomes, if I integrate over all possible real numbers, from minus infinity to positive infinity, that integral must be 1. And finally, since integrals are areas, the integral from a to a is going to be 0. And if you want to think about this geometrically, this is looking at the area of a region where the left and right are the same wall.

And this leads to an important result. Consider the event a less than or equal to x less than or equal to b.
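Before breaking that event apart, here is a minimal sketch that checks the example and the three properties numerically. It assumes the density is e to the negative x for x greater than 0 and 0 otherwise, which is what the formula in the example suggests but is not confirmed by the transcript. The names `pdf` and `prob` are made up for illustration, and a simple midpoint rule stands in for the computer algebra system.

```python
import math

def pdf(x):
    """Assumed density: e^(-x) for x > 0, and 0 otherwise."""
    return math.exp(-x) if x > 0 else 0.0

def prob(a, b, steps=100_000):
    """Approximate P(a <= X <= b) as an area, using the midpoint rule."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

# The example: probability that X is between 5 and 10.
p_5_10 = prob(5, 10)

# Cross-check with the antiderivative of e^(-x), which is -e^(-x).
exact = math.exp(-5) - math.exp(-10)

# Total area. The assumed density is 0 for x <= 0 and the tail past 50
# is negligible, so [0, 50] stands in for the whole real line.
total = prob(0, 50)

# Left and right are the same wall, so the area is 0.
same_wall = prob(5, 5)

# Areas are additive: 5 to 8 plus 8 to 20 gives 5 to 20.
pieces = prob(5, 8) + prob(8, 20)
whole = prob(5, 20)

print(p_5_10, exact, total, same_wall, pieces, whole)
```

Under that assumed density, the probability that x is between 5 and 10 comes out to roughly 0.0067, the total area is 1 up to rounding error, and the same-wall integral is exactly 0.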
Now, this event can be broken apart into x equal to a, and x greater than a and less than or equal to b. And since these events are mutually exclusive, their probabilities can be added. But this first probability, that x is equal to a, is the same as x being between a and a. And so using our probability density function, we can evaluate these two probabilities as integrals. And since the integral from a to a is 0, that gives us... And the previous argument can be applied to the interval a strictly less than x strictly less than b, or a less than or equal to x strictly less than b. And this gives the following result. Let x be a continuous random variable. Then for the probability that x is between a and b, well, it doesn't matter whether you include or exclude the endpoints. In other words, the endpoints don't matter.

Now, you should be careful with this. Suppose x is a continuous random variable that models the number of defects in a 100 meter coil of wire. So explain why the probability of having 5 or more defects and the probability of having 6 or more are not the same.

And here's the fast wrong way. Five or more defects is x greater than or equal to 5, so we want the probability that x is greater than or equal to 5. But since endpoints don't matter, the probability that x is greater than or equal to 5 is the probability that x is strictly greater than 5. However, if x is strictly greater than 5, then there are six or more defects. And so the probability that x is strictly greater than 5 is the probability that x is greater than or equal to 6, and putting everything together, we conclude that the probability that x is greater than or equal to 5 is equal to the probability that x is greater than or equal to 6.

The reason we can't do this is that x is a continuous random variable. So for 5 or more defects, remember we have to look at values that will round to 5 or more. So this is really x greater than or equal to 4.5, while 6 or more defects is x greater than or equal to 5.5.
So while it's true that the probability that x is greater than or equal to 4.5 is the same as the probability that x is strictly greater than 4.5, since we can omit or include the endpoints, we can't get from there to the probability that x is greater than or equal to 5.5. And so our two probabilities are not going to be the same.
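To see this with numbers, here is a minimal sketch. The lecture doesn't say which continuous distribution models the defects, so the normal model below, with mean 5 and standard deviation 2, is purely a hypothetical stand-in, and `prob_at_least` is a made-up helper name. It uses `NormalDist` from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Hypothetical model for the defect count (NOT from the lecture):
# a normal distribution with mean 5 and standard deviation 2.
model = NormalDist(mu=5.0, sigma=2.0)

def prob_at_least(cutoff):
    """P(X >= cutoff), which for a continuous X equals P(X > cutoff)."""
    return 1.0 - model.cdf(cutoff)

# "5 or more defects" means values that round to 5 or more: X >= 4.5.
five_or_more = prob_at_least(4.5)

# "6 or more defects" means values that round to 6 or more: X >= 5.5.
six_or_more = prob_at_least(5.5)

# A single point carries no area, which is why endpoints don't matter.
single_point = model.cdf(4.5) - model.cdf(4.5)

# The gap between the two probabilities is the area from 4.5 to 5.5,
# in other words the values that round to exactly 5 defects.
exactly_five = model.cdf(5.5) - model.cdf(4.5)

print(five_or_more, six_or_more, single_point, exactly_five)
```

Whatever continuous model is used, the two probabilities differ by the area between 4.5 and 5.5. Dropping or adding a single endpoint changes nothing, but sliding the cutoff from 4.5 to 5.5 removes a whole strip of area.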