When talking about data sourcing and measurement, one very important issue has to do with the accuracy of your measurements. The idea here is that you don't want to have to throw away all your ideas, and you don't want to waste effort. One way of doing this in a very quantitative fashion is to make a classification table.
So what that looks like is this: you talk about, for instance, positive results and negative results. Let's start by looking at the top here. The middle two columns talk about whether an event is present: whether your house is on fire, whether a sale occurs, whether you've got a tax evader, whatever. So that's whether a particular thing is actually happening or not. On the left here is whether the test or the indicator suggests that the thing is or is not happening. And then you have these combinations. True positives, where the test says it's happening and it really is. False positives, where the test says it's happening, but it's not. Below that, false negatives, where the test says there's nothing going on, but the event is in fact occurring. And true negatives, where the test says it isn't happening, and that's correct. And then you get the column totals, the total number of events present or absent, and the row totals, which talk about the test results.
Now, from this table, what you get is four kinds of accuracy, or really four different ways of quantifying accuracy using different standards. They go by these names: sensitivity, specificity, positive predictive value, and negative predictive value. I'll show you very briefly how each of them works. Sensitivity can be expressed this way: if there's a fire, does the alarm ring? You want that to happen. And so that's a matter of looking at the true positives and dividing that by the total number of events present, the total number of fires. So test positive means there's an alarm, and event present means there's a fire, and you always want to have an alarm when there's a fire.
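To make the table concrete, here is a minimal sketch of how the four cells could be tallied from paired observations, with sensitivity computed as true positives divided by events present. The observations and the names `observations` and `classification_table` are made up for illustration; they are not from the talk.

```python
from collections import Counter

# Hypothetical data: each pair is (alarm_rang, fire_present).
# These ten observations are invented purely for illustration.
observations = [
    (True, True), (True, True), (True, True),   # alarm rang, there was a fire
    (False, True),                               # alarm silent, there was a fire
    (True, False), (True, False),                # alarm rang, no fire
    (False, False), (False, False),
    (False, False), (False, False),              # alarm silent, no fire
]


def classification_table(pairs):
    """Count the four cells of the table from (test, event) pairs."""
    counts = Counter(pairs)
    return {
        "tp": counts[(True, True)],    # test positive, event present
        "fp": counts[(True, False)],   # test positive, event absent
        "fn": counts[(False, True)],   # test negative, event present
        "tn": counts[(False, False)],  # test negative, event absent
    }


table = classification_table(observations)

# Sensitivity looks down the "event present" column:
# of all the fires, how many set off the alarm?
events_present = table["tp"] + table["fn"]
sensitivity = table["tp"] / events_present

print(table)
print(f"sensitivity = {sensitivity:.2f}")
```

With these invented counts there are four fires and the alarm rang for three of them, so sensitivity comes out to 0.75.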
Specificity, on the other hand, is sort of the flip side of this. If there isn't a fire, does the alarm stay quiet? This is where you're looking at the ratio of true negatives to total absent events, where there's no fire and the alarm's not ringing, and that's what you want.
Now, those are looking at columns. You can also go sideways across rows. The first one there is positive predictive value, often just abbreviated as PPV. And we flip around the order a little bit. This one says: if the alarm rings, was there a fire? So now you're looking at the true positives and dividing by the total number of positives. A positive is any time the alarm rings; a true positive is when it rang because there was a fire. And negative predictive value, or NPV, says: if the alarm doesn't ring, does that in fact mean that there is no fire? Well, here you're looking at true negatives and dividing by total negatives, the times that it doesn't ring. And again, you want to maximize that, so the true negatives account for all of the negatives, the same way you want the true positives to account for all of the positives.
Now, you can put numbers on all of these, going from 0% to 100%. And the idea is to maximize each one as much as you can. So in sum, from these tables, we get four kinds of accuracy. There's a different focus for each one, but the same overall goal: you want to identify the true positives and true negatives and avoid the false positives and false negatives. And this is one way of putting numbers, an index really, on the accuracy of your measurement.
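As a rough sketch of how the four measures relate to the cells of the table, the function below computes each one from the counts. The function name `accuracy_measures` and the example counts are assumptions chosen for illustration, not figures from the talk, and the sketch assumes every row and column total is nonzero.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Return the four accuracy measures from the cells of a classification table.

    tp: test positive, event present    fp: test positive, event absent
    fn: test negative, event present    tn: test negative, event absent
    Assumes every row and column total is greater than zero.
    """
    return {
        # Columns: condition on what actually happened.
        "sensitivity": tp / (tp + fn),  # if there's a fire, does the alarm ring?
        "specificity": tn / (tn + fp),  # if there's no fire, does it stay quiet?
        # Rows: condition on what the test said.
        "ppv": tp / (tp + fp),          # if the alarm rings, was there a fire?
        "npv": tn / (tn + fn),          # if it doesn't ring, was there no fire?
    }


# Hypothetical counts, invented for illustration only.
measures = accuracy_measures(tp=80, fp=30, fn=20, tn=870)

for name, value in measures.items():
    print(f"{name}: {value:.1%}")
```

Note that the same table can score well on one measure and poorly on another: with these invented counts, the alarm catches 80% of fires, but only about 73% of the times it rings is there actually a fire.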