In the first video on linear regression, I showed you how to set up the most basic version of linear regression, which is called simultaneous entry, where you take all of your predictors and put them in the model at once. What it gives you is the table we have right here, with the t values and the p values that allow you to hypothesis test each of your variables, but only in the context of each other. If you select different variables, or sometimes if you put them in in a different order, you can get different results.

Now I want to show you some of the options that jamovi gives you for this. To do that, I'm going to come back to Regression, pick Linear Regression again, and set up a new analysis. I'm going to scroll down a little bit here. The outcome variable, the dependent variable, is openness, so I'm going to pick that right here. And this is where I can start choosing which variables I want to include in the analysis. What I did previously is I picked all of these from Instagram down through modern dance, which are Google Correlate terms; they have to do with the relative popularity of these search terms on a state-by-state basis. I can pick those. I also chose governor, which is simply whether the governor of that state was Democratic or Republican when the data was gathered. And we have this table right here.

But what I'm going to do is come down here to Model Builder, where I have an opportunity to enter the predictors in blocks. Now, a very common approach, or something that a lot of people want to do, is something called stepwise entry, where you simply tell the computer: these are all the possible variables, and I want you, the computer, to go through and pick the one that has the highest individual correlation with the outcome, stick that in the model, then pick the one that has the highest correlation after that and put it in, and so on and so forth.
The problem is that stepwise models tend to really build on the idiosyncrasies of that particular data set. So you get a model that's really well tailored to what you have, but it doesn't generalize well, and it can actually mislead you in a lot of different ways. jamovi doesn't even give you the option for that. What you do have the option for is what I personally call block regression, or block entry, or blockwise regression. That is, you can set up several models that sequentially add additional variables.

So what I'm going to do here is first take all of these search terms and get all of them out. Right now block one is just whether the state has a Democratic or Republican governor. You can see right here that, in terms of openness, it's doing the subtraction Republican minus Democrat. Because we have a negative coefficient there, that means the states with Democratic governors tend to have higher levels of openness than the ones with Republican governors.

But let's add a second block and see how things shift around. First I do want to come up here and change this to include the adjusted R squared, so I'm going to come down to Model Fit and add that on right there, and I'll close that. So now we have our model fit: simply knowing whether a state has a Republican or Democratic governor predicts almost 13% of the variance in the outcome, the openness of the state as a whole. That's kind of remarkable all on its own. But I'm going to come over here and add a new block. In block two, we're going to put in several other variables; for instance, I'll come down and pick just these last few: museum, scrapbook, and modern dance. I'm going to put those into block two, just as main effects. Then I can compare the model with just governor against the model with governor and these three in it. And that's what I have right here.
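For anyone who wants to see what the adjusted R squared option is doing, here is a minimal sketch in Python. It uses the standard textbook formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of cases and k is the number of predictors. The function name and the numbers in the loop are hypothetical, chosen only for illustration; they are not taken from the data in the video.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Shrink raw R^2 to account for the number of predictors in the model.

    Standard formula: 1 - (1 - R^2) * (n - 1) / (n - k - 1).
    """
    residual_df = n_obs - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / residual_df


# Hypothetical illustration (NOT the values from the video): the same raw R^2
# is penalized more heavily as the number of predictors grows.
n_cases = 50   # assumed sample size, for illustration only
raw_r2 = 0.60  # assumed raw R^2, for illustration only
for k in (4, 8, 12):
    print(k, round(adjusted_r_squared(raw_r2, n_cases, k), 3))
```

The point of the sketch is just that the adjustment charges you for every predictor you add, which is why it is the more useful number to watch when you compare blocks of different sizes.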
The first one is just governor, and we have an adjusted R squared of 0.129. It explained about 13% of the variance in the outcome, openness. When we add these other three Google search terms, the relative popularity of those on a state-by-state basis, it goes up to 52%. We've now explained over 50% of the variance in the outcome, which is openness. That makes a little bit of sense, because openness often includes artistic or creative openness, though it can include a lot of other things as well. And this one right here, the Model Comparisons table, tells us that there's a statistically significant increase, which makes sense because it's a huge jump.

Then here we have the variables. Now, this is for model two. If we want to look at model one, we just click on this and we get the one with just governor. You can see there's a statistically significant effect there; that's the .007 right here. But let's go back to model two. Now we can see that governor is no longer statistically significant. That suggests it may have been a spurious correlation: what governor seemed to predict is accounted for by other things. Museum doesn't seem to matter. But we have strong and statistically significant effects for both scrapbook and modern dance, two creative interests. And we see that one's negative and one's positive. In the states that have higher levels of interest in scrapbook as a search term, there are lower levels of openness, and in states that have higher interest in modern dance, there are higher levels of openness. Again, that's because we have these negative and positive coefficients. But remember, it's always within the context of one another.

Now, if we want to, we can add a third block where we tack on, for instance, these other ones, going from Instagram, and then I'll do a shift-click down to volunteering. I'll put those in the third block as main effects only; we're not looking for interactions right now.
And when I do that, we can see that adding those variables improves the adjusted R squared only from 0.525, that's 52 and a half percent, to 0.572, or 57.2%. In fact, you can see here that the increase from model two to model three is not statistically significant; the increase has a p value of 0.180. So we didn't get anything helpful by adding those extra variables. In fact, we lost degrees of freedom by doing that. That tells us we're probably best off going with model two, which had substantially better predictive ability than the one that had just governor, and the results make sense; it's that theoretical interpretation that matters.

So this is one way of looking at the relative contribution of different variables. This one's theory driven, because you decide what the blocks are, you enter them, and you interpret the results on your own to see if they make sense and how applicable they are, and really whether it's something you can put into use, whether you can do something productive with it. That is the goal of linear regression and of this blocked entry: to find the information that's going to be most useful to you in solving the problems that you're dealing with.
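The model comparison test described above can be sketched as an F test on the change in R squared between two nested models. This is the standard textbook formula, not a description of jamovi's internal code. It works on the raw (unadjusted) R squared values, and the function name and the example numbers are hypothetical, since the video does not report the raw R squared, the sample size, or the exact number of predictors in each block.

```python
def f_change(r2_reduced: float, r2_full: float,
             n_obs: int, k_reduced: int, k_full: int):
    """F statistic for the increase in raw R^2 from a reduced to a full model.

    F = ((R2_full - R2_reduced) / q) / ((1 - R2_full) / (n - k_full - 1)),
    where q = k_full - k_reduced is the number of predictors added.
    Returns (F, numerator df, denominator df).
    """
    q = k_full - k_reduced
    df_resid = n_obs - k_full - 1
    if q <= 0 or df_resid <= 0:
        raise ValueError("full model must add predictors and leave residual df")
    f_stat = ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / df_resid)
    return f_stat, q, df_resid


# Hypothetical illustration (NOT the values from the video): a block of five
# extra predictors raises raw R^2 from 0.56 to 0.66 with 50 cases.
f_stat, df_num, df_den = f_change(0.56, 0.66, n_obs=50, k_reduced=4, k_full=9)
print(round(f_stat, 3), df_num, df_den)
# The p value is the upper tail of the F distribution with (df_num, df_den)
# degrees of freedom; if SciPy is installed, scipy.stats.f.sf(f_stat, df_num,
# df_den) computes it.
```

Notice that the denominator degrees of freedom shrink with every predictor you add, which is the cost of the extra block: the same gain in R squared is harder to call significant when it is spread across more variables.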