For reasons that will become apparent pretty soon, this lecture is mostly of historical interest. We begin with the following idea. If x is a continuous random variable, then we know that for some probability density function f, we can compute the probability that x falls within an interval by computing the value of an integral. And since, honestly, calculus is easier than algebra, we'd prefer to use a continuous random variable where possible. But many important quantities are discrete: for example, the number of heads resulting from 1,000 flips of a fair coin, the number of defects in 100 meters of wire, or the number of frivolous lawsuits by a con man trying to avoid prison. So how could we model discrete random variables with continuous ones?

There are two parts to that question. The first part is that we can use rounding to model discrete random variables with continuous ones. Suppose y is a discrete random variable. If we want to find the probability that y falls within some interval, we can approximate y with a continuous random variable x and look for the interval of values that round to the required values. For example, if x is between -0.5 and 0.5, then the continuous x rounds to the discrete value y = 0; if x is between 0.5 (included) and 1.5 (excluded), the continuous x rounds to the discrete y = 1; and so on.

So, for example, suppose you flip a coin 100 times. Let y be the discrete random variable counting the number of heads, let x be a continuous random variable modeling the number of heads, and let's compare the probability that y is between 40 and 60 inclusive to the corresponding probability using x. Since y is the actual number of heads, we must compute the probability that we get 40 heads, plus the probability we get 41, plus the probability we get 42, and so on, up to the probability that we get 60 heads. Since these are binomial probabilities, each one must be computed separately, which means performing 21 separate calculations. Meanwhile, if we are using a continuous random variable, the values of x that round to a number between 40 and 60 inclusive are those in the interval 39.5 ≤ x < 60.5, so we need to compute the integral from 39.5 to 60.5 of some probability density function. And since the value of an integral depends only on the endpoints of the interval, we can do this with a single computation.

Of course, we do have to determine what that probability density function is. Fortunately, we have the following: suppose we repeat a binomial experiment n times, where the outcome of interest has probability p. We can approximate the binomial probabilities using the normal distribution with mean np, which you can think of as the expected number of successes, and standard deviation sqrt(np(1 - p)). If both of these are greater than 5, then the approximation is considered to be reasonably accurate.
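If you'd like to check the 40-to-60-heads comparison yourself, here is a minimal sketch in Python, assuming the scipy library is available (the variable names are just for illustration). It sums the 21 exact binomial probabilities and compares that with the single normal integral over 39.5 to 60.5.

```python
# A minimal sketch comparing the exact binomial probability with the
# normal approximation using a continuity correction (assumes scipy).
from scipy.stats import binom, norm

n, p = 100, 0.5
mu = n * p                         # mean of the approximating normal: 50
sigma = (n * p * (1 - p)) ** 0.5   # standard deviation: 5

# Exact: sum the 21 binomial probabilities P(Y = 40) + ... + P(Y = 60).
exact = sum(binom.pmf(k, n, p) for k in range(40, 61))

# Approximation: a single integral of the normal density from 39.5 to 60.5.
approx = norm.cdf(60.5, mu, sigma) - norm.cdf(39.5, mu, sigma)

print(exact, approx)  # the two results should be close, both near 0.96
```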
So let's consider the following problem: a coin is flipped 100 times and 67 heads are observed. We'll use the normal approximation to the binomial to find the probability of 67 or more heads under the assumption that the coin is fair.

We note that this is a binomial experiment with number of trials n = 100 and probability of success p = 1/2, so we can approximate the binomial probabilities using a normal distribution with mean np = 100 × 0.5 = 50 and standard deviation sqrt(np(1 - p)) = sqrt(100 × 0.5 × 0.5) = 5. That is, we approximate this probability with a normal distribution with mean 50 and standard deviation 5.

Since the normal distribution is a continuous probability distribution, every discrete event must be treated as the range of outcomes that round to that event. So the event "67 or more heads" corresponds to the interval 66.5 ≤ x. Using the normal distribution curve with mean 50 and standard deviation 5, the event corresponds to the region to the right of x = 66.5. But the built-in tools for calculating normal distribution probabilities typically only handle probabilities of the form "less than some amount." So we look at the probability that x is less than 66.5, which is about 0.99952; the probability of interest is that of the complementary event, 1 - 0.99952, which rounds to 0.00048.

Now, most binomial distribution calculators will let you compute the probability that an event occurs n times or fewer. So if y is the actual number of heads in the binomial experiment, then "67 or more heads" is complementary to "66 or fewer heads." We find that the probability of 66 or fewer heads is about 0.99956, which means that our complementary event has probability about 0.00044. So we see that the normal approximation to the binomial gives a reasonably accurate approximation to the exact binomial probability.

At this point you're probably wondering: calculators, spreadsheets, and free internet apps have built-in functions to compute exact binomial probabilities, so it's not immediately obvious why you would need a normal approximation to the binomial. However, the normal approximation to the binomial can be helpful for our purposes of building up some intuition. This is especially true if you remember the 68-95-99.7 rule: if the data values are normally distributed about the mean, 68% fall within one standard deviation, 95% fall within two, and 99.7% fall within three standard deviations.

So how does that work? Well, suppose you flip a fair coin 10,000 times, and let's make a prediction of the likely outcomes. Our mean is np = 10,000 × 0.5 = 5,000 and our standard deviation is sqrt(np(1 - p)) = sqrt(10,000 × 0.5 × 0.5) = 50, so we can approximate the binomial distribution using a normal distribution with mean 5,000 and standard deviation 50. This allows us to predict that we're 68% likely to get within one standard deviation of the mean, that is, between 4,950 and 5,050 heads; we're 95% likely to get within two standard deviations, between 4,900 and 5,100; and we're 99.7% likely to get within three standard deviations, between 4,850 and 5,150. And importantly, anything outside this last interval is extremely unlikely.
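Here is a similarly minimal sketch of both calculations from this example, again assuming scipy is available. It reproduces the 0.00048 normal approximation and the 0.00044 exact binomial value, and then prints the one, two, and three standard deviation intervals for 10,000 flips.

```python
# A minimal sketch of the 67-or-more-heads calculation and the
# 68-95-99.7 prediction for 10,000 flips (assumes scipy).
from scipy.stats import binom, norm

n, p = 100, 0.5
mu, sigma = n * p, (n * p * (1 - p)) ** 0.5   # 50 and 5

# Normal approximation with continuity correction: P(X >= 66.5).
approx = 1 - norm.cdf(66.5, mu, sigma)        # about 0.00048

# Exact binomial: P(Y >= 67) = 1 - P(Y <= 66).
exact = 1 - binom.cdf(66, n, p)               # about 0.00044

print(approx, exact)

# Prediction intervals for 10,000 flips of a fair coin.
n2, p2 = 10_000, 0.5
mu2, sigma2 = n2 * p2, (n2 * p2 * (1 - p2)) ** 0.5   # 5,000 and 50
for k in (1, 2, 3):
    print(f"within {k} sd: {mu2 - k * sigma2:.0f} to {mu2 + k * sigma2:.0f}")
```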