A misunderstanding of conditional probability leads to many abuses centered around testing. For better or for worse, modern society is based around testing. There are professional certification tests, which are probably a good idea. There are drug tests, which may or may not be a good idea. And there are loyalty tests, which, if we have them, mean we no longer have a free society. But even if a test accurately measures what it's supposed to, what do the results really tell us?
It's easiest to frame the question in terms of medical tests. A medical test for the presence or absence of a condition, such as pregnancy, diabetes, or opiate drug use, can return one of two results. A positive result indicates that the condition is present. A negative result indicates that the condition is absent. However, this is just the test result. The true state of the world might be that the condition is in fact present: you are pregnant, diabetic, or have opiate drugs in your system. Or the condition might in fact be absent: you are not pregnant, not diabetic, and have no opiate drugs in your system. A test result that reflects the true state of the world is a true result. A test result that does not reflect the true state of the world is a, wait for it, false result. And what is vitally important to keep in mind is that the test result is not the true state of the world.
We might summarize the four possibilities: either the condition is present or absent, and either the test result is positive or negative. If the condition is present and the test result is positive, then the test result does reflect the true state of the world, so it's a true result. And because the test result is positive, it's a true positive result. Likewise, if the condition is absent and the test result is negative, the test result is a true result, but in this case it's a true negative result.
On the other hand, if the condition is absent and the test result is positive, then the test has returned a false positive result, and a negative test result when the condition is present is a false negative result.
While both false positive and false negative results are errors, which is more important depends on the situation. For example, suppose you're being tested for diabetes. A false positive result might cause you to be directed to avoid carbohydrates and to monitor your blood sugar level. This is inconvenient. A false negative result, however, might allow the diabetes to progress until it causes severe health problems. In this case, a false negative result is much worse than a false positive.
Now suppose you're being tested for opiate drug use as a job requirement. A false positive result could lead to your dismissal even when you're not using opiate drugs. A false negative could lead to a person who uses opiates continuing to work at a job. And while it may be very important to make sure that the cashier is not in an opium dream, a bigger problem is that there's no way to prove that you weren't using drugs in the past, and so a false positive could have lasting consequences.
So let's take a look at how we measure how well a test performs. The sensitivity of a test is how often it returns a positive result when it should. This corresponds to the probability of a positive result given that the condition is present. We can find this by testing people known to have the condition. If 10 persons known to have the condition are tested and 9 of them test positive, then the sensitivity of the test is 9 out of 10, or 90%.
The other factor is the specificity of a test. The specificity of a test is how often it returns a negative result when it should. This corresponds to the probability of a negative result given that the condition is absent.
So suppose 10 persons free of the condition are tested and 9 of them receive negative test results. The specificity of the test is 9 out of 10, or 90%.
Confusing these conditional probabilities leads to what's known as the prosecutor's fallacy. Let C be the event that a person has a condition and T the event that the test returns a positive result. The sensitivity of the test corresponds to the probability of T given C, the probability the test result is positive given that the condition is present. But this is only relevant if we know the person has the condition and are uncertain about whether they will test positive. What we really want to know is the probability of C given T, the probability that they have the condition given that they test positive. And the prosecutor's fallacy is the belief that the probability of C given T, what we really want to know, is about the same as the probability of T given C, what we actually know. But in fact these two probabilities can be very different.
With these ideas in mind, suppose the condition is present in 1 in 10 people and a test for the condition is 90% specific and 90% sensitive. If a person tests positive for the condition, what is the probability they have it? We could apply mathematics and get the correct answer, but what if we just apply common sense? Common sense says that since this test is 90% specific and 90% sensitive, it must be very accurate, so the person almost certainly has the condition. That argument might work if you're a politician, but if you want a correct answer, we'll have to do some mathematics.
We could use a formula to calculate this probability, but let's use the frequentist interpretation and see how often this event occurs. Suppose we test 100 people. Since the condition is present in 1 in 10 persons, 10 of them will have the condition and 90 will not. Now let's test all of them. Some of them will be marked positive and some will be marked negative.
The idea is that we're going to focus on just those who tested positive, so let's see how that breaks down. First, since the test is 90% specific, 90% of the 90 people without the condition will test negative. These are the true negatives, and there are 81 of them. The remaining 10% will test positive. These are the false positives, and there are 9 of them. Next, since the test is 90% sensitive, 90% of the 10 people with the condition will test positive. These are the true positives, and there are 9 of them. The remaining 10% will test negative. These are the false negatives, and there is 1 of them. This gives us the total number who test positive: 9 who have the condition and 9 who do not, for a total of 18. It also gives us the total number who test negative: 1 who has the condition and 81 who don't, for a total of 82.
To find the probability that a person who tests positive for the condition actually has the condition, again we want to separate this into what we know and what we're uncertain about. We know that the person tests positive, so they're among this group of 18. What we don't know is whether they have the condition, so we want to know whether they're among this group of 9. And so the probability that a person who tests positive for the condition actually has the condition is 9 out of 18. That's only 50%.
So even though this test has a 90% specificity and a 90% sensitivity, in this example we can't rely on a positive test result half the time. Half the time, a positive test result is a false positive test result. It does not tell us the true state of the world. And what this means is that if we're going to use a test result for anything important, we have to keep in mind that no matter how specific or how sensitive the test is, we can't always rely on the results.
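The count of 100 people above can be reproduced with a short Python sketch. This is only an illustration under stated assumptions: the function names are made up for this example, and the second function uses Bayes' theorem, which is presumably the formula the talk mentions but does not use. Exact fractions are used so the counts come out as whole numbers.

```python
from fractions import Fraction


def count_outcomes(population, prevalence, sensitivity, specificity):
    """Split a tested population into the four possible outcomes.

    Hypothetical helper for illustration; all rates are given as Fractions.
    """
    with_condition = population * prevalence
    without_condition = population - with_condition

    # Sensitivity: share of people WITH the condition who test positive.
    true_positives = with_condition * sensitivity
    false_negatives = with_condition - true_positives

    # Specificity: share of people WITHOUT the condition who test negative.
    true_negatives = without_condition * specificity
    false_positives = without_condition - true_negatives

    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "true_negatives": true_negatives,
        "false_negatives": false_negatives,
    }


def probability_condition_given_positive(counts):
    """P(C given T) by counting: true positives among all positives."""
    all_positives = counts["true_positives"] + counts["false_positives"]
    return counts["true_positives"] / all_positives


def bayes_condition_given_positive(prevalence, sensitivity, specificity):
    """Same probability from Bayes' theorem (assumed to be 'the formula')."""
    positive_and_present = sensitivity * prevalence
    positive_and_absent = (1 - specificity) * (1 - prevalence)
    return positive_and_present / (positive_and_present + positive_and_absent)


# Numbers from the example: 100 people, 1 in 10 has the condition,
# and the test is 90% sensitive and 90% specific.
prevalence = Fraction(1, 10)
sensitivity = Fraction(9, 10)
specificity = Fraction(9, 10)

counts = count_outcomes(100, prevalence, sensitivity, specificity)
by_counting = probability_condition_given_positive(counts)
by_formula = bayes_condition_given_positive(prevalence, sensitivity, specificity)

for name, value in counts.items():
    print(name, value)
print("P(condition given positive), by counting:", by_counting)
print("P(condition given positive), by formula:", by_formula)
```

Both routes give 1/2, matching the 9 out of 18 found by hand. Changing `prevalence` while leaving the test unchanged shows how strongly the answer depends on how common the condition is: with a prevalence of 1 in 100, the same test gives 1/12.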