 I want to turn my attention now to some practical matters of calculating statistics. So one of the very common types of questions that statistics can answer is, how are two things different? And let me show you some examples. This is the unemployment rate for Pennsylvania over time, January 05 through July 2015. You can see the unemployment rate sometimes low, sometimes high. You can see the big bump from 2008 recession and then coming back down again. So that's the unemployment rate over time for Pennsylvania. And here's the unemployment rate over time for New York. And a similar thing, right? It's low, it goes back up again, it's high, it comes back down. And an interesting question that statistics can address is, which state has the lower unemployment rate? Or is there a difference between the unemployment rates at all? Now one way to approach this question, and it's the naive approach, is to simply take the average unemployment rate for Pennsylvania over this period, the average unemployment rate for New York over the period, and whichever one is lower, well that's the state that had the lower unemployment rate. What that approach ignores is the fact that the unemployment rate itself is what we call a stochastic variable, that is it has a random component to it. I can look up the unemployment rate for Pennsylvania right now. And what I get is not exactly the unemployment rate, there's noise for all the reasons that we've talked about earlier of how the thing is measured, what it's measuring, who's measuring it, all of this stuff. And so what I get back is not really the unemployment rate for Pennsylvania, but kind of one view, and it's a dirty view of this unemployment rate. I'm measuring the thing I want to measure, the unemployment rate, I'm measuring with some error, there's noise in the system. Similarly with New York, if I wanna know New York's unemployment rate, again, there's noise in the system. And the way we describe noise is with standard deviation. 
So standard deviation, if you picture the unemployment rate as this number that vibrates, right? Sometimes it's high, sometimes it's low. And the degree of vibration around this unemployment rate, this is the standard deviation. So I can use the standard deviation of the unemployment rate for Pennsylvania and the standard deviation of the unemployment rate for New York. And I can employ them in answering the question, which of these states has the greater unemployment rate? So if we look at a number line, you can see the two unemployment rates here. Pennsylvania's averaged over this period 6.4%. New York averaged, whatever that is, 6.7%. But look at this. That is plus or minus one standard deviation for Pennsylvania's unemployment rate. So roughly, very roughly speaking, you could imagine that Pennsylvania's unemployment rate, yes, it averaged 6.4%, but you could find it as low as 4.8. You could find it as high as 7.9, right? And it fluctuated around this range. New York's unemployment rate. Similarly, on average, it was 6.7%. But depending on what month you're talking about, you could find it as low as 5% or as high as 8.3%. So it wanders around. If you overlap the two, notice something. The overlaps are almost the same, right? Pennsylvania's unemployment rate wanders around the same kind of area that New York's unemployment rate does. This overlap of the wandering around, we can take into account, and when we do all of our statistical analysis, we would say something like, there's not a strong statistical difference between the unemployment rate in New York and the unemployment rate in Pennsylvania. There's not a strong statistical difference in the unemployment rates. So what I've said now are two apparently contradictory things. The first thing is that the average unemployment rate in Pennsylvania is 6.4, the average unemployment rate in New York is 6.7. So first statement, the average unemployment rate in Pennsylvania is lower. 
The second thing, which kind of sounds like it contradicts that, is that when you account for the variation in the unemployment rates, there isn't much difference between them at all. The way we reconcile this apparent contradiction is by talking about what statisticians call p-values. What a p-value measures is the extent to which the data you're observing are compatible with, that is, not contradictory to, the thing that you believe. Does the data match your belief? In this case, my belief is that there's no difference between the unemployment rate in New York and the unemployment rate in Pennsylvania. If I get a low p-value, it means that the data is contradicting my belief. If I get a high p-value, it means that the data is not contradicting my belief. There's more that goes into it than that, but as a rough gut level approximation, that's not a bad way to think of p-values. So p-values you'll see repeated, and every time you see them, think of them as the statistician's measure for the extent to which this data that I'm looking at is in agreement with the thing I believe to be true to begin with. The technical term that statisticians use is the null hypothesis. The null hypothesis is the thing I'm believing or I'm assuming to be true. And it gets a little more complicated because the statistician will also state what he calls the alternative hypothesis, the thing that he assumes to be true if the null hypothesis isn't. And you can see we're getting weighed down in the weeds. At a very high level, you can think of this p-value as accompanying this null hypothesis, the thing I believe to be true. And the question is, I'm going to look at the world, I'm going to assume that something's true, my null hypothesis, and ask the question, do I see data out there that is consistent with this thing or that contradicts it? And that p-value is what's measuring this. A high p-value means the data are not contradicting what I'm already assuming to be true.
Low p-value is rejecting or contradicting this thing I believe to be true. What types of hypotheses can you set as a null hypothesis? Can it just be any claim or does it have to be something like there is no relationship or there's no significant difference? Because null sort of implies a not. So is your default usually no relationship, no significant difference, that sort of thing? There are mathematical and statistical reasons that cause you to have to form your hypothesis a certain way. But usually the null hypothesis, the thing that you're assuming, is that there's no difference. So there's no difference between Pennsylvania and New York, or you assume that there's no effect: that if I increase stimulus spending, I should see no effect on the economy. So the null hypothesis, generally speaking, is a no effect or no difference sort of assumption. So we can look at Pennsylvania and New York and you see how the unemployment varied in each, and you can see there's a tremendous amount of overlap here, which at a gut level from what we described of p-values would indicate that the p-value here should probably be high. My going-in assumption is there isn't much difference between Pennsylvania's and New York's unemployment rates, and indeed the overlap is quite substantial here. So this would be like having a high p-value. And in fact, if you run the statistical test, you do get a high p-value here. So the statistician would look at the data that we've seen and say, there is no compelling evidence here that New York's unemployment rate is markedly different from Pennsylvania's. Now, let's look at Pennsylvania and California, which you see here. Now again, there's overlap, but look at California's. It's markedly different, right? Over the same period, on the low end California's unemployment rate dropped to like 5.5%, and on the high end it was up around 11.2, right?
It's fluctuating around this range. And that range appears to be markedly different from Pennsylvania's. So it's not just that Pennsylvania's average is different than California's average, but when you account for the range over which the unemployment rate would tend to wander on average, those ranges are different. And in fact here, you do get a very low p-value. So my going-in belief here would be, let's assume that Pennsylvania's unemployment rate is the same as California's, and let's run some statistical tests. You run the test, you get a very low p-value, indicating that the evidence is contradicting this assumption of yours. My going-in assumption is the unemployment rates are the same, the evidence contradicts this. I get a very low p-value. If you're interested in this sort of thing, I encourage you, as a contrary view, to take a look at some of Deirdre McCloskey's work. Deirdre McCloskey doesn't like p-values and makes some interesting arguments for why economists should just stop talking about them. But being a statistician and economist, this is what I do, so I'm going to talk about them. So p-values range from zero to one. And by the way, I'm going to take a moment out here, and this will be kind of as technical as we get as we talk about stats. And the reason for this is you see p-values repeating themselves over and over again. And it's something that common people should get into their vocabulary, because what people often do, and we'll talk about this later on, common people, non-statisticians, will talk about correlation. And oftentimes they will talk about correlation when really they should be talking about p-values. So if there are two things that the average man in the street needs to know about stats, I'd say those are the two, correlation and p-values. So here are p-values, they range from zero to one.
And generally speaking, what we're doing is walking in with this hypothesis that we've got two things that are the same, or that there's no effect here. That's our going-in assumption, no effect or no difference. And we look at the p-values. If you get a very low p-value, like below one percent, we say this is very strong evidence, very strong evidence from the data, that your hypothesis is incorrect. A p-value between one percent and five percent, we say is strong evidence. There's strong evidence here that your hypothesis that there's no difference or no effect, that this hypothesis is wrong. And then economists tend not to use them, but people in other business fields will use p-values in the five to 10 percent range. They say this is weak evidence that the data is contradicting you. So again, the question is to what extent is the data contradicting your hypothesis? So five to 10 percent we'd say is weak evidence. And then if you get above 10 percent, we say, look, there's just no evidence in the data to contradict your initial belief. So you might see something that's common here. This two-pronged approach of we have an initial belief and we look at p-values and ask do they contradict it. What you're seeing is the scientific method at work. So we learn in grade school the scientific method is you form a hypothesis, you collect some data, and you ask does the data confirm or deny your hypothesis. That's exactly what's happening here. The difference with economics and social sciences is that we find it very hard to, not impossible, but very hard to conduct experiments. We have to look at data that already exists. And yet, even though it is hard, sometimes impossible, to conduct experiments, we can still use the scientific method by walking in the door saying I hypothesize something and then asking does the data support or refute my hypothesis. So what can we do with this? Well, we see some interesting things.
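As a small aside, the evidence bands just described (below 1 percent, 1 to 5 percent, 5 to 10 percent, above 10 percent) can be written down as a simple lookup. This is only a sketch of the rule of thumb from the talk. The function name `evidence_against_null` is made up for illustration, and how to treat a p-value that lands exactly on a boundary is an assumption, since the talk doesn't say.

```python
def evidence_against_null(p_value: float) -> str:
    """Map a p-value to the rough evidence labels used in the talk.

    The label describes how strongly the data contradict the null
    hypothesis (the "no difference" or "no effect" starting belief).
    Boundary handling (e.g. exactly 0.05) is an assumption.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "very strong"
    if p_value < 0.05:
        return "strong"
    if p_value < 0.10:
        return "weak"
    return "none"


# Illustrative p-values only; 0.0002 is the one quoted later in the talk.
for p in (0.0002, 0.03, 0.07, 0.40):
    print(f"p = {p}: {evidence_against_null(p)} evidence against the null")
```

The labels are conventions, not laws of nature; other fields draw the lines in different places.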
We talked about inequality earlier and what you see here is income inequality measured by the Gini coefficient for countries that are less economically free and more economically free. So we're measuring economic freedom by the Fraser Institute's measure. The Fraser Institute asks questions like, what percentage of your economy is due to government spending, the question you asked earlier. How much of your economy is sucked up by the government in taxes? What's your top marginal income tax rate? How many restrictions are there on labor markets? Can you go get a job wherever it is that you want or is the government going to interfere in your negotiation with your prospective employer? What kind of protections are there for property rights? What kind of rule of law is there? Is the money sound or is your central bank inflating it away? These sorts of questions. So Fraser asks a whole suite of questions like this of every country and puts on the country a stamp that represents the amount of economic freedom that's there. What you see are all the countries that report, 127 I believe, split into two groups. On the right we have those that are above the median for economic freedom and on the left those that are below the median for economic freedom. And the height of the bar is the amount of income inequality within that country. So you can see over on the left, like the second bar in has tremendous income inequality, and as you move further in you have some countries that have much less income inequality. The question here is, is it the case that countries that are more economically free have less income inequality? Now this is a fascinating statistical question because like the unemployment rate, inequality is this stochastic process. It's a random thing. Sometimes it's high, sometimes it's low. It could be high for various reasons that have nothing to do with economic freedom.
It could be low for reasons that have nothing to do with economic freedom. It fluctuates like the unemployment rate does. What we've done is split into two parts. Like we had New York and Pennsylvania. Now we have less free countries, more free countries. And we ask the question on average, on average, am I seeing a difference in income inequality for the less free versus the more free? So I'm going to walk into this problem with the null hypothesis, my initial belief that there is no difference. My null hypothesis is income inequality is the same for less free countries as it is for more free countries on average. And so we can conduct a statistical test and if you conduct a statistical test on this, what you get is a p-value of 0.0002, incredibly low. It falls way into the range of very strong evidence that the data rejects my hypothesis. So my hypothesis going in is that there's no difference in the average inequalities here. I look at the data and the data comes back very strongly and says, no, that hypothesis is wrong. And if you look at, you can tell by looking at the bars, but if you calculate the averages, what you find is the average income inequality for the more free countries is lower than is the average income inequality for the less free countries. So I would walk away from this picture with the conclusion that there appears to be strong evidence that economic freedom is associated with less income inequality. Now notice my choice of words here. This is very important. Economic freedom is associated with less income inequality. The statistical test doesn't tell me anything about causation, right? It's not telling me that one causes the other. All I know walking away is that when I look at the more free countries, the income inequality on average is less. When I look at the less free, the income inequality is on average is more. So we use this term, they're associated. So some fun stuff you can do is look at some other things. 
This is gender inequality. Now the interesting thing here is, while Fraser is providing the economic freedom measures that let us divide the countries into more free and less free, the United Nations is providing the gender inequality data. So when the UN measures gender inequality, it doesn't ask any questions about economic freedom. When Fraser measures economic freedom, it doesn't ask any questions about gender inequality. And yet, if you cross-reference these two completely separate data sets, you get what you see here, which is that the more free countries exhibit less gender inequality than the less free countries do. And again, you get a very small p-value here indicating that if you walk in with the hypothesis that there is no difference between the less free and the more free, the statistical test tells you the data strongly rejects that hypothesis. There seems to be a very significant difference. What was the source of the data for the first graph? The first graph, income inequality, comes from the United Nations. And again, it's this interesting thing, the United Nations does not consider economic freedom when it evaluates income inequality. And Fraser does not consider income inequality when it evaluates economic freedom. So they're two completely separate data sets. And yet you find this phenomenon that the more free countries have less income inequality, the less free have more income inequality. We've talked about using statistics to determine whether two things are different from each other. So unemployment rate in Pennsylvania versus unemployment rate in New York, inequality in less free countries, inequality in more free countries. And this detecting differences in things that we've been talking about is called a difference in means test. And it's useful for determining whether two things are different.
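To make the difference in means test concrete, here is a minimal sketch that works only from the summary numbers quoted for the unemployment example. It is not the test that was actually run, which the talk does not specify. The assumptions are: 127 monthly observations per state (counting January 2005 through July 2015), standard deviations taken as half the width of the quoted plus-or-minus one standard deviation ranges, California's average taken as the midpoint of its quoted range, a large-sample normal approximation, and independent observations (generous for monthly unemployment data, which move slowly from month to month). The function name `diff_in_means_p_value` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def diff_in_means_p_value(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sided p-value for the null hypothesis that two means are equal.

    Uses a large-sample normal approximation with unequal variances and
    treats the observations in each group as independent.
    """
    standard_error = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


N_MONTHS = 127  # assumption: one observation per month, Jan 2005 to Jul 2015

# Averages for PA and NY are from the talk; each SD is half the width of
# the quoted +/- one standard deviation range.
pa = {"mean": 6.4, "sd": (7.9 - 4.8) / 2}
ny = {"mean": 6.7, "sd": (8.3 - 5.0) / 2}
# California's average was not stated; the midpoint of its quoted range
# is an assumption made only for this sketch.
ca = {"mean": (5.5 + 11.2) / 2, "sd": (11.2 - 5.5) / 2}

p_pa_ny = diff_in_means_p_value(pa["mean"], pa["sd"], N_MONTHS,
                                ny["mean"], ny["sd"], N_MONTHS)
p_pa_ca = diff_in_means_p_value(pa["mean"], pa["sd"], N_MONTHS,
                                ca["mean"], ca["sd"], N_MONTHS)

print(f"Pennsylvania vs New York:   p = {p_pa_ny:.4f}")
print(f"Pennsylvania vs California: p = {p_pa_ca:.2e}")
```

Under these assumptions the Pennsylvania and New York comparison lands above the 10 percent line and the Pennsylvania and California comparison lands far below 1 percent, which matches the high and very low p-values described in the talk.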
But it's not useful for determining the relationship between two things. And the relationship between two things is a much richer thing to talk about. To give you an example, we've seen here the countries split into two parts, the more free and the less free for gender inequality. Now here's the same data but presented slightly differently. What you see here is economic freedom measured across the horizontal axis. So to the right is more economic freedom, to the left is less economic freedom. And up and down is gender inequality. So up is more inequality, down is less inequality. And you see these dots, every dot is a country. And there are two things here. First off, there's noise in the system, right? There are dots all over the place. But there's also a trend on average. On average, as you move to the right, the dots fall. There are exceptions but it's the underlying trend that's interesting. And remember we talked about this: when you see a phenomenon in the social sciences, in economics and psychology and so forth, what you're seeing is human behavior, which is comprised of two components. One component is a random component, right? Humans will sometimes just do random things. The other is a more deterministic component. It's a component that causes the human to do certain things based on his surroundings. For example, say I go to the coffee shop in the morning and I always buy a bagel, and today the price of bagels has gone up from what it was yesterday. There are two things playing on me. One is my desire for the bagel or the habit that I always buy a bagel, this sort of thing, or maybe I had breakfast or didn't have breakfast earlier and that influences whether or not I'd like the bagel. These are kind of random things. That's one thing that plays into my decision to buy the bagel or not. The other thing that plays into my decision to buy the bagel or not is the fact that the price has gone up.
Now, what economics tells us is when the price goes up, people buy less of the thing. That's true on average. It's not necessarily true in every single case. In this case, maybe I really want the bagel and the price went up and it doesn't matter, I'm still going to buy it. This is not a violation of the laws of economics. Rather, you're looking at one person. The laws of economics apply in the aggregate. Although I might still buy the bagel anyway, you might not. Maybe he would, but he wouldn't. And maybe he wouldn't either. And what happens is when you put us all together, in total, the number of bagels we buy declines. So what's important is to keep in mind is there's this randomness when you look at the individual observations, the individual people or the individual countries, but there's also this underlying trend. And the underlying trend is the deterministic part. The thing that we as economists or statisticians were interested in seeing. So part of, and you'll see this theme recurring over the next couple of topics, one of the things that we're very much interested in is taking this messy, dirty data world and stripping out the noise. Blow it away so that what we see underlying is the phenomenon, the behavior that we actually want to examine. So what you're seeing in this picture now is the same data, the economic freedom, the gender inequality, but now we've shown each dot as a country and you see this kind of trend that as you go to the right, more economic freedom, you get less gender inequality. And I can superimpose on top of this cloud of data points a line. This line, some people call it a trend line. The statisticians call it a regression line. This is a line that comes as close as possible to approximating the data points that you see here. So think of it as the underlying average trend in the data as you move to the right, the line declines. What this line gives us is this thing we didn't have before. 
It's not just that more economically free countries have less gender inequality. This line can tell us by how much. This line tells me the relationship between economic freedom and gender inequality. The way we go about getting this line is we start off by saying, okay, let's propose that there is a relationship between gender inequality and economic freedom and that that relationship can be shown mathematically. So I propose this thing and you see here, gender inequality is a plus b times economic freedom plus u. And what are these things? a and b are numbers. We're going to put them to the side for the moment, but it basically is what is giving me some function that relates economic freedom to gender inequality. U is the noise in the system. It's all the things that would affect gender inequality other than economic freedom. So if I put all this together, what I have here is a proposed, now my proposed relationship may be faulty, but it's the relationship I propose exists between economic freedom and gender inequality. That economic, you take economic freedom, you multiply it by some number and you add some other number to it. And that plus the noise in the system gives you gender inequality. Our goal as statisticians is to find values for a and b that cause this line to come as close to the data points that we have as possible. And there are technical things that go on in the background, but for the moment, if we run this thing through the appropriate statistical analyses, what will come back is this value for a and this value for b. So the regression analysis is what we call it, would come back and tell me the best value for a is 1.38, the best value for b is negative 0.15. Those two values will give you the red line that comes as closest to the data points as many of the data points as possible. So what we walk away with then is the idea that, okay, there are two things happening here in the relationship between economic freedom and gender inequality. 
The first is that economic freedom is associated with gender inequality according to this equation that we see here. And the other thing is, oh yes, and there's all that noise that I can't account for, but it's noise on the back end. What happens is when I blow away the noise, this is the relationship that I find. So a couple of things come out of this. First, there's what many people are used to talking about, what people refer to as the R squared. The technical term is the coefficient of determination. But many people, even non-statisticians, will talk about the R squared. And they'll ask questions like, all right, fine, you've got this relationship between economic freedom and gender inequality, but what's the R squared? The R squared, or the coefficient of determination, ranges from zero to one. And the higher it is, the more closely the data points fall on your trend line. So right now you can see that the data points, they're a cloud around the trend line. The R squared in this model is 0.44. If you could imagine, picture those data points getting closer and closer and closer to the trend line, that would be the R squared going up and getting closer and closer to one. And as the data points moved away from the trend line, so they start to become just this amorphous cloud, that's the R squared going down towards zero. So R squared is one thing we want to look at. Another thing we want to look at is the value of that variable B. Actually, we call it a parameter. Parameter means this is a value in the regression model that does not correspond to data, but it's something we're going to calculate. So B, in this case, we would call a parameter. This value of negative 0.15, what does it mean? What it means is, on average, there are exceptions, but on average, a one-unit increase in economic freedom is associated with a 0.15 unit decline in gender inequality.
On average, a one-unit increase in economic freedom is associated with a 0.15 unit decline in gender inequality. So that parameter B, we get the value of negative 0.15 for it, is telling me about the magnitude of the relationship between economic freedom and gender inequality. There's a third number that isn't listed here, and it's a p-value. It's the same kind of p-value we've talked about before. We talked about p-values as measuring the extent to which data contradicts my hypothesis. So I walk in with my hypothesis that Pennsylvania unemployment is the same as New York unemployment and a low p-value says the data contradicts that hypothesis. There does seem to be a difference here. Here for regression analysis, the hypothesis is always that your parameter is zero. In other words, our hypothesis here would be that the true value of B is zero. That is, there is no relationship between economic freedom and gender inequality. So a value of B of zero means there's no relationship between economic freedom and gender inequality, and that's our going-in assumption. And when we look at the data, what comes back is a p-value of 0.0001 something, a very low p-value, meaning that the data contradict that hypothesis. The data contradict my claim that there is no relationship between economic freedom and gender inequality. Now let's pause here for a moment because we've discussed three measures, and these are the three key measures that we're going to take away from a regression analysis. First is the p-value, which is measuring the significance of the relationship. High p-value, there's no evidence of a relationship here. Low p-value, there appears to be a relationship. So the p-value is measuring the significance of the relationship. The value of B, the B parameter, that's the magnitude of the relationship. Changes in economic freedom are associated with what kind of changes in gender inequality? Big changes, small changes, positive changes, negative changes.
The B parameter gives me that, the magnitude of the relationship between economic freedom and gender inequality. And then lastly, r-squared, which you could maybe think of as the precision of the relationship. And I'll give you an analogy that ties these three together. You're at a party, and you're at one side of the room. There's lots of people in the room, and they're making lots of noise and singing and dancing on the tables and whatnot. And your friend is over there by the door, and you notice that you're running low on beer. And you yell to your friend, get more beer. First, your friend's over there, and lots of things are hitting his ears. The noise of the people dancing and the music and, you know, this table cracked because someone stepped on it wrong. And one of the things that hits his ears is your voice yelling, get more beer. R-squared measures what fraction of all the noise that hits his ears is coming from your voice. So a high r-squared would mean that the noise in the room is very low. Most of what hits his ears is your voice. A low r-squared would mean that the room is very noisy. So a small percentage of the total that hits his ears is your voice. A large percentage is other things. So notice right off, a low r-squared doesn't mean that there isn't a relationship. It simply means that there are other things in the mix besides your voice. That's r-squared. The p-value is like measuring whether your friend understood what you said. So just because he hears my voice doesn't mean that he heard and understood my words, right? I could have been slurring or, you know, speaking a language he doesn't understand, or maybe I'm not slurring and I'm speaking English, but the noise in the room and the things people are saying are such that he's only catching every other word, right? So the p-value is attempting to measure how much of the words that I am saying he is understanding as opposed to hearing.
How many is he understanding as opposed to hearing? And then finally, the value of b, this parameter, measures the magnitude of the relationship. That's like measuring the extent to which my words have spurred him to action. If he hops up and runs out the door to buy more beer, I have spurred him to action. That's like having a large b. If I don't spur him to action, that's like having a small b. The magnitude of the relationship isn't that large. So notice three things going on. I'm yelling to my friend. First, he's got to hear me. Second, he's got to understand what I've said. And third, I want to spur him to action. And those are these three measures, the r squared, the p value, and the magnitude of b. So I make a big point about this because many people who don't know statistics but do kind of at a gut level understand r squared get fixated on it. And so I'll show a picture like this one that has the economic freedom and gender inequality. And people will say, yes, okay, that looks very good, but what's the r squared? What they really mean is, but is this relationship between these two significant? Or is it simply noise? The question, is the relationship between freedom and gender inequality significant or is it noise, isn't answered by the r squared. It's answered by the p value. The p value, the did-you-understand-what-I-said measure, tells me whether economic freedom actually does have some relationship with gender inequality. The r squared simply tells me, are there lots of other things that also influence gender inequality? There may be lots of other things that also influence gender inequality. But the question I'm asking here is, this specific thing, does this specific thing influence it? That's a p value question. So like in this specific equation, what does the a stand for? And why is there no u in the second equation? Yeah, both good questions.
So the a in this example. In some regression models, the a has a more meaningful interpretation and in others a less meaningful one. This is one of those cases where it's somewhat less meaningful. Here, a would be the gender inequality we would expect to exist in the absence of economic freedom. So in other words, if economic freedom were zero, then our regression model predicts that the amount of gender inequality you should see is simply a. That measure isn't very useful here in this particular example, but we'll look at an example later in which a has a really nice interpretation. So whether or not it has an interpretation depends in part on the data you're looking at. Now your other question, what happens to u? We're looking here at two equations. The top one is the regression model. In other words, this is the thing I walk into the room assuming, that there is this relationship and I want to measure it. This equation says, gender inequality is a, I don't know what a is, plus b, I don't know what that is yet either, times economic freedom plus u. And u is all the things other than economic freedom that might influence gender inequality. So picture that equation as corresponding to the blue dots. The second equation is the estimated regression model. That is, when I run my data through my statistical package, the statistical package comes back and says, our best guess for a is 1.38 and our best guess for b is negative 0.15. And so our estimated regression model says this, we expect or we estimate gender inequality to be 1.38 minus 0.15 times economic freedom. Where's the u? The u has disappeared because our whole purpose here was to brush away the noise so that all we're left with is the relationship between economic freedom and gender inequality. The u is gone because the u is the noise. Our whole goal was to brush it away to get rid of it so we could see this underlying relationship.
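For readers who want to see the two equations side by side in code, here is a minimal sketch. The first function uses the estimates quoted in the talk (a = 1.38, b = -0.15), so it is the estimated model with the u brushed away. The second fits a line by ordinary least squares and reports R squared. The data points fed to it are made up purely for illustration and are not the country data behind the chart. The function names are hypothetical, and the p-value for b is left out because computing it properly needs the t distribution, which a statistical package would supply.

```python
def estimated_gender_inequality(economic_freedom):
    """Estimated regression model from the talk: 1.38 - 0.15 * freedom.

    There is no u term here: the estimated line is what is left once
    the noise has been brushed away.
    """
    return 1.38 - 0.15 * economic_freedom


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx               # slope: the magnitude of the relationship
    a = mean_y - b * mean_x     # intercept: predicted y when x is zero
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot   # share of variation the line explains
    return a, b, r_squared


# Made-up illustrative points, NOT the data shown in the talk's chart.
freedom = [4.0, 5.0, 6.0, 7.0, 8.0]
inequality = [0.85, 0.55, 0.50, 0.25, 0.20]

a, b, r2 = fit_line(freedom, inequality)
print(f"fitted line: inequality = {a:.2f} + ({b:.2f}) * freedom, R^2 = {r2:.2f}")
print(f"talk's estimate at freedom = 7: {estimated_gender_inequality(7.0):.2f}")
```

The gap between each made-up point and the fitted line is that point's share of the u; the closer the points sit to the line, the closer R squared gets to one.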