Instrumental variable analysis rests on an assumption that is fundamentally untestable: the exclusion restriction. Let's take a look. The exclusion criterion was about a correlation: we must assume that z, the instrumental variable, is uncorrelated with u_y, the error term of y, the main dependent variable in our analysis. That covariance must be assumed to be zero. But that is fundamentally untestable, because we don't observe u_y, so we can't correlate it with z. Also, this model is just identified — we have zero degrees of freedom — so we can't really test anything. But anyway, we have these tests, the Sargan test, Sargan's J, the Sargan–Hansen test: overidentification tests that claim to be testing this exclusion criterion. So what do these tests actually test, and in which way do they test the exclusion criterion? Let's take a look. These tests require that you have more instrumental variables than you need. Here we have one endogenous variable, and we need one instrument for estimation. If we have more instruments than we need for estimation, then our model is overidentified. If we have two endogenous variables, two x's here, then we need two instruments; if we have three instruments for those two, then we have an overidentified model. So let's add one more instrument to the analysis. Now we have two instruments, z1 and z2; they are freely correlated, they explain x, and they are assumed to be uncorrelated with u_y. We have one degree of freedom in this model. And whenever we test a model that has degrees of freedom, we need to understand where those degrees of freedom come from. What is the constraint being tested? What is the meaning of that constraint? And then, how does testing that constraint inform our research?
Let's take a look at where this one degree of freedom comes from — what is being tested? We can do a little bit of algebra with covariances, and we'll see that the covariance between z1 and y is the sum of the two paths from z1 to x — directly, and through z2 — multiplied by beta_x. This is a simple application of the tracing rules. It turns out this can be simplified, because the expression in parentheses is simply the covariance between z1 and x. So there we have it: we can solve for beta_x by dividing the covariance of z1 and y by the covariance of z1 and x. We can also solve for beta_x using z2, which gives us another estimate. And our test, the overidentification test, is simply testing whether these two estimates are the same. We can write the equation a bit differently, like so, expressing that the difference between these two ratios is zero. We actually calculate that difference from our data; if it's far from zero, then we conclude that this constraint does not hold, so we reject our hypothesis. But this doesn't really tell us the meaning of the test — it's a constraint on covariances, but what exactly is being tested with this one degree of freedom? Let's take a step back and go back to this equation. The covariance between z1 and y is simply the model-implied covariance, if the model is correctly specified. Now let's assume our model is misspecified, so that z1 actually correlates with u_y. That would be a violation of the exclusion criterion. Let's call the correct population value beta*, so we get this kind of equation: the actual estimate beta_x (I'm omitting the hats for simplicity) is a function of the population value and the covariance between the instrument and the error term.
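The covariance algebra above can be checked with a small simulation. This is a hypothetical sketch (the variable names, coefficients, and sample size are all my own choices, not from the lecture): two correlated instruments z1 and z2, one endogenous regressor x, and a shared error term u_y, with the true beta_x set to 2. Each instrument gives its own estimate as a ratio of covariances.

```python
import numpy as np

# Hypothetical version of the overidentified model described above:
# two freely correlated instruments, one endogenous x, true beta_x = 2.
rng = np.random.default_rng(1)
n = 200_000

z1 = rng.normal(size=n)
z2 = 0.5 * z1 + rng.normal(size=n)       # instruments freely correlated
u_y = rng.normal(size=n)                 # error term of y, unobserved in practice
x = z1 + z2 + u_y + rng.normal(size=n)   # u_y also drives x -> x is endogenous
y = 2.0 * x + u_y

def cov(a, b):
    return np.cov(a, b)[0, 1]

# Each instrument yields its own estimate of beta_x as a covariance ratio:
b1 = cov(z1, y) / cov(z1, x)
b2 = cov(z2, y) / cov(z2, x)
print(b1, b2)  # both close to 2 when both instruments are valid
```

The overidentification test described in the lecture is, in essence, asking whether b1 and b2 differ by more than sampling error.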
So we can see that if this covariance is non-zero, then our estimate will be inconsistent. Let's work with this equation a bit more. We can simplify it: the estimate is beta* plus the ratio of the covariance of z1 with the error term to the covariance of z1 with x. The first covariance quantifies whether the exclusion criterion holds, and the second covariance quantifies the strength of the instrument. Now this is a useful way of looking at the constraint on beta_x, because if we again write an equality — we can solve for beta_x using the instrument z2 as well — we can cancel beta* from both sides, and then we have something we can actually start interpreting. So what's the meaning of this equation? We have the covariance between instrument and error term divided by the covariance between instrument and the endogenous variable, and those two ratios should be the same. There are two ways of understanding the equality of these two ratios. The first is that if one instrument is valid, the other one is as well. This is the assumption behind these exclusion tests, or overidentification tests. We need a strong theoretical reason to believe that at least one of our instruments is uncorrelated with u_y in the population. Then we can test whether the others are uncorrelated as well. So we can basically test whether the instruments are equally valid; and if we know that one of them is valid, and the others are equally valid, then we know that they all are valid. Another way of understanding this is that if one of them is correlated with the error term — if one of them is invalid — then both are equally bad. So these tests for exclusion don't really test whether the instruments are excluded in an absolute sense; they test whether the instruments are equally good. And if one of them is known to be valid, then we can infer that the others must be valid too.
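The decomposition "estimate = beta* + cov(z, u_y) / cov(z, x)" can also be illustrated directly. In this hypothetical sketch (again, my own coefficients and names), z1 satisfies the exclusion criterion while z2 is deliberately made to correlate with u_y, so the z2-based estimate picks up exactly the bias term from the equation.

```python
import numpy as np

# Hypothetical illustration of: estimate = beta* + cov(z, u_y) / cov(z, x).
# z1 is a valid instrument; z2 violates the exclusion criterion.
rng = np.random.default_rng(2)
n = 200_000

u_y = rng.normal(size=n)
z1 = rng.normal(size=n)                  # valid: cov(z1, u_y) = 0 in the population
z2 = rng.normal(size=n) + 0.5 * u_y      # invalid: correlated with u_y
x = z1 + z2 + rng.normal(size=n)
y = 2.0 * x + u_y                        # beta* = 2

def cov(a, b):
    return np.cov(a, b)[0, 1]

b1 = cov(z1, y) / cov(z1, x)             # consistent, close to beta* = 2
b2 = cov(z2, y) / cov(z2, x)             # inconsistent
bias = cov(z2, u_y) / cov(z2, x)         # the bias term from the decomposition
print(b1, b2, 2.0 + bias)                # b2 equals beta* + bias, not beta*
```

Because y is constructed exactly as 2x + u_y here, b2 matches 2 + bias to machine precision in the sample, which makes the decomposition easy to see.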
There are a few implementations of this idea. Let's take a look at the Stata User Manual. Stata has the estat overid command, which gives you the overidentification tests. The manual explains that there are a number of different tests that can be calculated, depending on how you estimated the instrumental variable model. Did you use GMM? Did you use maximum likelihood or something else? But all these tests boil down to two basic ideas. The first idea is that we take the residuals of y and regress them on the instruments. If the instruments are not equally valid, then there will be some correlation between the instruments and the residuals from this regression. We test whether the R-squared of that regression is exactly zero. If it's not, then we conclude that the instruments are not equally valid, because one of them explains the residual more than the others. So we take the residuals here, we take the z's here, and regress. This is the idea behind the Sargan test, the Sargan–Hansen test, and Sargan's J, which are very similar — a family of different tests building on this idea. The other idea is to test the overall fit of the model: we can use the one degree of freedom in this model for a model test. For example, if we estimate the model with structural equation modeling (SEM) software, we can do a chi-square test of overall model fit. If the model is rejected, then we could conclude that the instruments may not be valid. But it's also possible that the model is misspecified in some other way, as the Stata User Manual tells us: the fact that we reject the model just means that the model is not correct, but we don't know in which way it is incorrect. Either way, the thing to do if you reject the model, or if these test statistics are significant, is to treat the results with caution, because they are probably not valid anyway.
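The residual-regression idea can be sketched in a few lines. This is a minimal, hypothetical implementation assuming roughly centered variables and homoskedastic errors — not a replication of Stata's estat overid: estimate beta by 2SLS, regress the residuals on the instruments, and compare n times the R-squared of that regression to a chi-square distribution with (number of instruments minus number of endogenous regressors) degrees of freedom.

```python
import numpy as np

# Minimal sketch of a Sargan-style overidentification statistic
# (assumes centered variables; not Stata's exact implementation).
def sargan_statistic(y, X, Z):
    """n * R^2 from regressing the 2SLS residuals on the instruments Z.
    Under the null that the instruments are equally valid, approximately
    chi-square with (columns of Z - columns of X) degrees of freedom."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]    # first-stage fitted values
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]     # 2SLS estimate
    resid = y - X @ beta
    fitted = Z @ np.linalg.lstsq(Z, resid, rcond=None)[0]
    r2 = fitted.var() / resid.var()                     # R^2 of residual regression
    return len(y) * r2

rng = np.random.default_rng(3)
n = 100_000
u_y = rng.normal(size=n)
z1 = rng.normal(size=n)
z2 = 0.5 * z1 + rng.normal(size=n)            # both instruments valid here
x = z1 + z2 + u_y + rng.normal(size=n)
y = 2.0 * x + u_y
j_valid = sargan_statistic(y, x.reshape(-1, 1), np.column_stack([z1, z2]))

z2_bad = z2 + 0.5 * u_y                       # now z2 violates the exclusion criterion
x_bad = z1 + z2_bad + rng.normal(size=n)
y_bad = 2.0 * x_bad + u_y
j_invalid = sargan_statistic(y_bad, x_bad.reshape(-1, 1),
                             np.column_stack([z1, z2_bad]))

print(j_valid, j_invalid)  # j_valid small relative to chi2(1); j_invalid very large
```

Note the caveat from the lecture applies here as well: a large statistic tells you the instruments are not equally valid, not which one is the bad one.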