The next step in our discussion of statistics and inference is hypothesis testing, a very common procedure in some fields of research. I like to think of it as "put your money where your mouth is" and test your theory. Here are the Wright brothers out testing their plane. Now, the basic idea behind hypothesis testing is this: you start with a question, something like, what is the probability of X occurring by chance if randomness or meaningless sampling variation is the only explanation? And the response is this: if the probability of that data arising by chance when nothing's happening is low, then you reject randomness as a likely explanation. Okay, there are a few things I can say about this. Number one, it's really common in scientific research; in the social sciences, for instance, it's used all the time. Number two, this kind of approach can be really helpful in medical diagnostics, where you're trying to make a yes/no decision: does a person have a particular disease? And number three, really, any time you're trying to make a go/no-go decision, say a purchasing decision for a school district, or whether to implement a particular law, you base it on the data and you have to come down on yes or no; hypothesis testing might be helpful in those situations. Now, you have to have hypotheses to do hypothesis testing. You start with H0, which is the shorthand for the null hypothesis. And what that is, in lengthier terms, is the claim that there is no systematic effect between groups, there's no association between variables, and random sampling error is the only explanation for any observed differences that you see. And then contrast that with HA, which is the alternative hypothesis.
And this really just says that there is a systematic effect: that there is in fact a correlation between variables, that there is in fact a difference between two groups, that this variable does in fact predict the other one. Let's take a look at the simplest version of this, statistically speaking. Now, what I have here is a null distribution. This is a bell curve; it's actually the standard normal distribution, with z-scores on one axis and relative frequency on the other. And what you do with this is you mark off what are called regions of rejection. So I've actually shaded the highest two and a half percent of the distribution and the lowest two and a half percent. What's funny about this is that even though I draw it to plus and minus three, where it looks like it hits zero, it's actually infinite and asymptotic. But take the highest and lowest two and a half percent together, and collectively that leaves 95% in the middle. Now, the idea is that you then gather your data, you calculate a score for your data, and you see where it falls in this distribution. And I like to think of that as having to go down one path or the other; you have to make a decision. You have to decide whether to retain your null hypothesis (maybe it is random) or reject it and decide, no, I don't think it's random. The trick is, things can go wrong. You can get a false positive. This is when the sample shows some kind of statistical effect, but it's really randomness. So, for instance, in this scatterplot I have right here, you can see a little downhill association. But this is in fact drawn from data that has a true correlation of zero, and I just randomly sampled from it until I got this; it took about 20 rounds. So it looks negative, but there's really nothing happening. The trick about false positives is that they're conditional on rejecting the null: the only way you can get a false positive is if you actually conclude that there's a positive result. It goes by the highly descriptive name of a Type I error.
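The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, not anything from the lecture materials: it marks off the two regions of rejection on the standard normal distribution at alpha = 0.05, checks where a (made-up) test score falls, and then simulates the null to show that false positives happen about 5% of the time, roughly the "1 in 20 rounds" the scatterplot story suggests.

```python
import random
from statistics import NormalDist

# Mark off the regions of rejection on the standard normal (z) distribution.
# With alpha = 0.05 split two ways, the highest and lowest 2.5% are shaded,
# leaving 95% in the middle.
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # roughly 1.96

def decide(z_observed: float) -> str:
    """Retain or reject the null based on where the score falls."""
    return "reject H0" if abs(z_observed) > z_crit else "retain H0"

# Hypothetical test scores for illustration.
print(decide(2.3))   # falls in a tail: reject
print(decide(0.8))   # falls in the middle 95%: retain

# When the null is true, we still reject about 5% of the time; those are
# the false positives (Type I errors).
random.seed(1)
trials = 100_000
false_positives = sum(decide(random.gauss(0, 1)) == "reject H0"
                      for _ in range(trials))
print(f"false-positive rate under the null: {false_positives / trials:.3f}")
```

The simulation makes the conditional nature of the Type I error concrete: every one of those "rejections" is a false positive, because the data really were pure noise.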
But you get to pick a value for it, and a 0.05, or 5%, risk of rejecting the null hypothesis when it's actually true is the most common value. Then there's a false negative. This is when the data looks random, but in fact it is systematic, or there's a relationship. So, for instance, this scatterplot looks like it's pretty much a zero relationship. But in fact, this came from two variables that were correlated at .25; that's a pretty strong association. Again, I randomly sampled from the data until I got a set that happened to look pretty flat. And a false negative is conditional on not rejecting the null: you can only get a false negative if you get a negative, if you say there's nothing there. It's also called a Type II error. And this is a value that you have to calculate based on several elements of your testing framework, so it's something to be thoughtful about. Now, I do have to mention one thing, a big cautionary note: there are a few problems with hypothesis testing. Number one, it's really easy to misinterpret. A lot of people say, well, if you get a statistically significant result, it means that it's something big and meaningful. And that's not true, because it's confounded with sample size and a lot of other things that just don't really matter. Also, a lot of people take exception to the assumption of a null effect, or even a nil effect, that there's zero difference at all. And in certain situations that could be an absurd claim, so you've got to watch out for that. There's also bias from the use of a cutoff. Any time you have a cutoff, you're going to have problems with cases that were just slightly higher or slightly lower: a tiny shift would have switched the dichotomous outcome. So that is a problem.
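The point that the Type II error rate "has to be calculated from several elements of your testing framework" can be made concrete. Here's a sketch of that calculation for a two-sided one-sample z-test; the effect size and sample size are made-up illustration values, and beta follows from the alpha cutoff, the true effect size, and n together.

```python
from math import sqrt
from statistics import NormalDist

# Type II error (beta) for a two-sided one-sample z-test.
# Beta depends on the alpha cutoff, the true effect size, and the sample size.
alpha = 0.05
effect_size = 0.25   # true effect in standard-deviation units (hypothetical)
n = 100              # sample size (hypothetical)

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)
shift = effect_size * sqrt(n)  # where the true mean sits, in z units

# Beta is the probability the test statistic lands in the "retain" region
# even though the effect is real; power is its complement.
beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
power = 1 - beta
print(f"beta (false-negative risk) = {beta:.3f}, power = {power:.3f}")
```

Changing any one ingredient (a stricter alpha, a smaller true effect, a smaller sample) pushes beta up, which is exactly why it's something to be thoughtful about before you collect data.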
And then a lot of people say that it just answers the wrong question, because what it's telling you is the probability of getting this data at random, and that's not what most people care about; they want it the other way around, which is why I mentioned Bayes' theorem previously, and I'll say more about that later. That being said, hypothesis testing is still very deeply ingrained, it's very useful for a lot of questions, and it's gotten us really far in a lot of domains. So, in sum, let me say this: hypothesis testing is very common for yes/no outcomes, and it's the default in many fields. And I argue that it is still useful and informative, despite many of the well-substantiated critiques.
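The "wrong question" critique can be illustrated with one application of Bayes' theorem. All the probabilities below are made-up illustration values; the point is only that P(data | H0), which the test reports, can be very different from P(H0 | data), which is what people usually want.

```python
# Sketch: "probability of the data given randomness" is not
# "probability of randomness given the data." Numbers are hypothetical.
p_h0 = 0.9              # prior: nothing is happening 90% of the time
p_data_given_h0 = 0.04  # chance of this result under the null
p_data_given_h1 = 0.50  # chance of the same result when a real effect exists

# Bayes' theorem: P(H0 | data) = P(data | H0) * P(H0) / P(data)
p_data = p_data_given_h0 * p_h0 + p_data_given_h1 * (1 - p_h0)
p_h0_given_data = p_data_given_h0 * p_h0 / p_data

print(f"P(data | H0) = {p_data_given_h0:.2f}")   # what the test gives you
print(f"P(H0 | data) = {p_h0_given_data:.2f}")   # what you usually want
```

With these numbers, the data would look "significant" at the 5% level, yet the null is still the more-likely-than-not explanation close to half the time, which is exactly the gap the critics point to.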