In general, computing a probability exactly can be difficult, but it's often useful to find a bound on a probability. For example: the probability a flood will rise higher than 3 meters above the floodline is at most some amount, or the probability a cereal box will contain at least 16 ounces of cereal is at least some amount, or the probability a self-serving sociopath will become president is no more than some amount. And since calculus is easier than algebra, we can try to answer these questions using calculus. One of the basic results on bounds of probabilities is known as Markov's inequality. So let x be a non-negative random variable with pdf f of x. Now we know that the mean is the expected value of the random variable, so we can express it as an integral, and we can split the integral from minus infinity to zero and from zero to infinity. Since x is non-negative, we know that f of x must be zero for all x less than zero, and so the mean is the integral from zero to infinity of x f of x dx. Now we can further split up this integral at some point, how about a? So we get the integral from zero to a plus the integral from a to infinity. And since x f of x is non-negative over that first interval, we can just drop the first integral, and that changes our equality into an inequality. But wait, there's more. On the interval from a to infinity, x is greater than or equal to a, and so we can extend our inequality a little bit further by replacing x with a, something that's guaranteed to be no larger. And since a itself is a constant, we can pull it outside the integral. And since f of x is our probability density function, the remaining integral is the probability that the random variable is greater than or equal to a. So we have an inequality that involves mu, and if we divide both sides by a, we find that mu divided by a is greater than or equal to the probability that x is greater than or equal to a, and this gives us Markov's inequality. 
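The spoken derivation above can be written out as a single chain of equalities and inequalities:

```latex
\begin{align*}
\mu = \mathbb{E}[X]
  &= \int_{-\infty}^{\infty} x\, f(x)\,dx
   = \int_{0}^{\infty} x\, f(x)\,dx
     && \text{(since } f(x) = 0 \text{ for } x < 0\text{)} \\
  &= \int_{0}^{a} x\, f(x)\,dx + \int_{a}^{\infty} x\, f(x)\,dx
     && \text{(split at } a\text{)} \\
  &\ge \int_{a}^{\infty} x\, f(x)\,dx
     && \text{(drop the non-negative first term)} \\
  &\ge \int_{a}^{\infty} a\, f(x)\,dx
     && \text{(on } [a, \infty),\ x \ge a\text{)} \\
  &= a \int_{a}^{\infty} f(x)\,dx
   = a \, P(X \ge a).
\end{align*}
```

Dividing both sides by a > 0 gives P(X ≥ a) ≤ mu / a, which is Markov's inequality.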
Let x be any non-negative random variable with expected value mu. For any a greater than zero, the probability that x is greater than or equal to a is less than or equal to mu divided by a. Notice that we've switched the direction of the inequality to make it read a little easier. So for example, suppose the height of trees in a grove has a mean of 1.75 meters. Let's find a bound on the probability a tree's height is greater than two meters, and how about the probability the height is greater than one meter. Markov's inequality says the probability that our non-negative random variable is greater than or equal to any amount is less than or equal to the mean divided by that amount. And so the probability that x is greater than or equal to two is less than or equal to the mean, 1.75, divided by two, which is 0.875, and this is a useful result. Now the thing to remember is you get what you pay for, and Markov's inequality doesn't cost very much: we only need to know the mean. So if we want to bound the probability that x is greater than or equal to 1, we know that's less than or equal to the mean divided by 1, which is to say less than or equal to 1.75. That's not as useful, because we already know that any probability has to be less than or equal to 1, so knowing it's less than or equal to 1.75 tells us nothing. Now while a can be any positive quantity, it's easiest to think of it as a multiple of the mean mu. So if a is k mu, then we can rephrase Markov's inequality as follows: the probability that x is greater than or equal to a is less than or equal to mu divided by a, or mu divided by k mu, which simplifies to one kth. In other words, the probability of being k or more times the mean is at most one kth. Now we haven't really talked about the interpretation of probabilities, but remember that under the frequentist interpretation, probabilities are frequencies. They tell us how often something will happen. 
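The tree-height bounds above can be checked with a few lines of code. This is a minimal sketch: `markov_bound` is a hypothetical helper name, and the mean of 1.75 meters comes from the example in the lecture.

```python
def markov_bound(mu, a):
    """Markov's inequality: for non-negative X with mean mu and a > 0,
    P(X >= a) <= mu / a. Values >= 1 are uninformative, since every
    probability is already at most 1."""
    if a <= 0:
        raise ValueError("a must be positive")
    return mu / a

# Height of trees in the grove has mean 1.75 meters.
print(markov_bound(1.75, 2))  # 0.875 -- a useful bound
print(markov_bound(1.75, 1))  # 1.75  -- uninformative, exceeds 1
```

The second call illustrates the "you get what you pay for" point: for thresholds at or below the mean, the bound is at least 1 and tells us nothing.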
And so under the frequentist interpretation, we can also interpret Markov's inequality as follows: no more than one kth of the values are greater than k times the mean. So how good is Markov's inequality? Well, let's consider a case where we actually know the probability. Say a fair coin is flipped three times and the number of heads recorded. We can use Markov's inequality to find a bound on the probability of obtaining three heads, and then compare it to the exact probability. Since this is a binomial experiment with n equals three and p equals one-half, and in a binomial distribution the mean is np, the expected value of x is three times one-half, or 1.5. Since the maximum number of heads is three, the probability that x equals three is the same as the probability that x is greater than or equal to three. And so Markov's inequality gives us: the probability that x is greater than or equal to three is less than or equal to 1.5 divided by three, which is one-half. Now we can compute the exact binomial probability, and this is one-eighth, which is considerably less than the upper bound given by Markov's inequality. So Markov's inequality provides an upper bound on the probability, and only requires that we know the mean. However, it's only useful for bounding the probability a value exceeds the mean, and it's not generally a tight bound. And if you think about it, the problem is that the spread about the mean is also important. And remember that spread is measured by the standard deviation. So the question is, what if we incorporated the standard deviation? Could we use the standard deviation to get a better bound on the probabilities? We'll take a look at that next.