Modification indices are a useful diagnostic tool after structural equation model estimation, but they are often misapplied. Let's take a look at what modification indices do. Our example for this video is Mesquita and Lazzarini's 2008 article in the Academy of Management Journal. They estimated a sequence of models; their CFA model fit well, but what they called the theoretical model, which they used for testing their hypotheses, failed the chi-square test. They then proceeded to do diagnostics: they report that they applied LM tests to see whether some paths that were not in the model should be there. This is a nice paper for our purposes because the correlation matrix in the article contains all the variables, so we can simply replicate the analysis ourselves. So these are their modification indices. Based on these modification indices, the authors conclude that it was necessary to add a covariance path between the variables called horizontal and vertical. They then continued the same analysis, calculated new modification indices, and added more things to the model because, as they claim, it is necessary and it makes the results more robust. Is it necessary? Does it make the results more robust? That is questionable, but this is how modification indices are commonly applied. Let's take a look at what modification indices actually do. Kline's SEM book provides a fairly accessible explanation. He says that modification indices are also called LM (Lagrange multiplier) tests, and that they are basically univariate tests that tell us how much the overall model chi-square would decrease if we freed one parameter that is constrained to be zero. Kline explains this in terms of zero constraints, but modification indices can also be applied to parameters that are constrained to values other than zero.
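To make the LM-test idea concrete outside of SEM, here is a minimal sketch in a much simpler setting: testing whether a correlation is zero. The score (LM) statistic for that null is asymptotically n times the squared sample correlation, and, like a modification index, it is computed entirely from the restricted model (correlation fixed at zero) without ever fitting the unrestricted one. The numbers here are my own illustration, not from the video.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 200

# Bivariate data with a true correlation of 0.4 (illustrative value).
x = rng.normal(size=n)
y = 0.4 * x + np.sqrt(1 - 0.4**2) * rng.normal(size=n)

# Score (LM) statistic for H0: correlation = 0. It uses only quantities
# available under the restricted model; asymptotically n * r^2 ~ chi2(1).
r = np.corrcoef(x, y)[0, 1]
lm = n * r**2
p = stats.chi2.sf(lm, df=1)
print(lm, p)  # a large statistic and small p-value suggest freeing the constraint
```

A modification index works the same way conceptually: it predicts the chi-square improvement from freeing a constraint, using only derivatives evaluated at the constrained solution.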
For example, if you constrain a path to be one, that path can get a modification index if freeing it and letting it be estimated at some other value would improve the model. Kline also explains that modification indices are calculated with some kind of matrix algebra magic, without actually estimating any other models. They are answers to a what-if scenario without re-estimation: they tell us how much the fit would change if we freed something, without actually freeing it. So what does the calculation actually do? What is it based on? Bollen's now-classic book provides a bit more technical detail on how this works. Importantly, he tells us that it is based on derivatives, and we do not need to estimate any other model: we can just take the derivatives of the model we just estimated, and that gives us the modification indices. The first derivatives, or gradient, of the likelihood are also called the score vector, which gives these indices another name: they are sometimes referred to as score tests. Let's take a look at an example. I have generated data containing six indicators from a population model with two factors: F1 measured with X1, X2, and X3, and F2 measured with X4, X5, and X6. We then fit a misspecified model. In reality, the correct model for these data has two factors, one for X1, X2, and X3 and another for X4, X5, and X6, but we fit a single factor with all six indicators. Unsurprisingly, because our sample size is decent, the chi-square rejects the model. So now what?
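A data-generation setup like the one described can be sketched as follows. The loadings (0.7) and factor correlation (0.6) are assumptions of mine for illustration; the video does not state its exact population values.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Two correlated factors (correlation 0.6 is an illustrative choice).
phi = 0.6
F = rng.multivariate_normal([0, 0], [[1, phi], [phi, 1]], size=n)

lam = 0.7  # common loading, assumed equal across all six indicators
X = np.empty((n, 6))
for j in range(6):
    factor = F[:, 0] if j < 3 else F[:, 1]  # X1-X3 load on F1, X4-X6 on F2
    X[:, j] = lam * factor + np.sqrt(1 - lam**2) * rng.normal(size=n)

R = np.corrcoef(X, rowvar=False)
within = np.mean([R[0, 1], R[0, 2], R[1, 2], R[3, 4], R[3, 5], R[4, 5]])
cross = np.mean(R[:3, 3:])
print(within, cross)  # within-block correlations exceed cross-block ones
```

The block structure of the correlations (stronger within X1-X3 and within X4-X6 than across) is exactly what a single-factor model cannot reproduce, which is why the chi-square rejects it.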
We can see that all the indicators load approximately equally well on the factor, so the loadings do not point to the problem. But when we take a look at the modification indices, we see that we could actually improve the model fit quite a lot: our current chi-square is 39, and if we free just one parameter, the covariance between the errors of X4 and X5, the chi-square is expected to go down by about 12, roughly 30%, which is quite a lot. The modification indices also report the EPC statistic, which stands for expected parameter change: if we add the X4-X5 error correlation, it is expected to be estimated at about 0.36. So we do as the modification indices say and add the path. The model fit improves; the chi-square is now 27, but it is still statistically significant. We can also see that the expected parameter change, 0.36, is pretty close to the 0.386 that we actually got, so the prediction of the parameter estimate was pretty good. We can also compare the actual modification index to how much the chi-square changed: the change between the original model and the new model is about 12.8, and the modification index predicted about 12.5. So we actually got a slightly better model than the modification index predicted. Importantly, when we release the constraint that the X4 and X5 errors are uncorrelated, this also affects all the other parameter estimates in the model, and therefore the fit can improve a bit more than the modification index predicts. All right, so that is our modification index. We can proceed because the model chi-square is still significant. We can see that the highest modification index is now for the X4-X6 error covariance: the chi-square is expected to go down by 17 from 27, which is quite a dramatic decrease. Importantly, this modification index is different from what the index for the X4-X6 error covariance was before.
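The arithmetic of comparing the predicted and actual chi-square drop can be spelled out with the numbers from the example. Freeing one parameter costs one degree of freedom, so the observed drop is judged against a chi-square distribution with df = 1.

```python
from scipy import stats

# Figures from the worked example above: the modification index
# predicted a drop of about 12.5 for freeing the X4-X5 error
# covariance; the observed drop after re-estimation was about 12.8.
mi_predicted = 12.5
actual_drop = 12.8

# One freed parameter = one degree of freedom.
crit = stats.chi2.ppf(0.95, df=1)   # critical value, about 3.84
p = stats.chi2.sf(actual_drop, df=1)
print(actual_drop > crit, p)  # the improvement is clearly significant
```

The actual drop slightly exceeds the prediction because, as noted above, freeing one constraint also lets every other parameter estimate shift, which the univariate index does not account for.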
The reason is that it is now evaluated using this model, instead of the model where the X4 and X5 errors were constrained to be uncorrelated, from which the previous set of modification indices was calculated. In practice, this is how people use modification indices: you free one parameter, then you re-estimate, you check which parameter to free next, then you free that and re-estimate, and you iterate, and that is supposed to give you a better model. OK, so we do that and free the parameter. We now have a non-significant chi-square. But a non-significant chi-square does not guarantee that the model is correctly specified, because it could just be a false negative: we may fail to reject the null because our sample size is small, and with 200 observations the power is not ideal. So maybe this is a false negative, and we look at the modification indices again. The chi-square is now 10, and if we add the X5-X6 error covariance, the chi-square is expected to go down by approximately 8, from 10 to about 2, and this is indeed what happens. So we add these three error covariances and get a really well-fitting model. Everything is good? Well, no, because this is actually not the right model for the data. The right model for the data was a model with two factors, and this brings up two important limitations of modification indices. First, modification indices are completely atheoretical. If we had actually looked at the indicators and done some proper diagnostics, such as inspecting the residual covariance matrix, maybe running an exploratory factor analysis, or trying to estimate a model with the first three indicators and then the second three and so on, we might have realized that the correct model actually had two factors. But modification indices have no capability of discovering that. Modification indices cannot do two things. They cannot tell us that we need to add more latent variables to the model.
That is beyond the capability of modification indices. Second, they are univariate, which means that sometimes adding two paths would be the right choice, but a single, incorrect path might increase the model fit more than either of those two paths individually. So modification indices are strictly univariate, and they cannot tell us to add more factors. There is quite a lot of research showing that blindly following modification indices will hardly ever give you the correct model. This is something where you need to use your judgment; the computer results are just there to guide you, not to tell you what to do. Let's now revisit the example. The modification indices are here, and basically Mesquita and Lazzarini tell us that because the computer gives a modification index for vertical and horizontal, it is necessary to add a covariance path. Well, it is not necessary, because the computer actually gives us three options: a covariance path, the two-headed arrow, a regression path from horizontal to vertical, or a regression path from vertical to horizontal. So we have three different choices, all of which produce equally well-fitting models, and the computer cannot tell us which one is the right one. What they do then is add more things based on modification indices, which is supposed to increase robustness. This is not the case. Modification indices make the model fit your sample a bit better, but increasing fit is not the same thing as the model being better, for two reasons. First, adding things to the model purely empirically may mean adding things that are not supposed to be there: if there is no theoretical reason for a path to exist, and a modification index suggests adding it, adding it is probably a misspecification even if it makes the model fit better.
The second thing is that when you modify your model based on your current sample, you are not making the results more robust. You are making the results less generalizable, because you are fitting the model more and more to your specific sample and absorbing sample idiosyncrasies, which makes the model less generalizable. Simpler models generalize better than complicated models. When I took my first course in structural equation modeling, the instructor told me that modification indices are there to guide your judgment, not to make decisions for you, and that you should only add something to your model if you have an aha moment and realize it should have been there in the first place. If we look at horizontal and vertical, these variables basically refer to how a firm applies governance to different kinds of business relationships, horizontal or vertical, and based on theory they are supposed to be highly correlated. But their model implies that the two are basically uncorrelated, except through investment, which the model treats as a common cause. That is unrealistic, so this is basically a specification error to start with. They should have had the correlation between horizontal and vertical there from the beginning, because the variables are theoretically related. The reason for adding it is that they are aspects of the same thing, not that a modification index tells you so. A modification index might indicate that you forgot to add something, acting as a reminder, but it is not the justification for doing anything. So they have this fairly complicated model and they add a lot of stuff: covariances, more covariances, and then one regression path. But they also failed to add things that should be in the model.
For example, their competitive pressure variable is a single-indicator variable, so it is directly measured, and they initially constrained it to be uncorrelated with all the other control variables. That constraint of course does not make any sense, but it was probably their software's default, and they forgot to free the covariances, or simply did not look at the model in detail before estimating it. Then they add some of these covariances: competitive pressure is allowed to correlate with firm size and with export orientation, but it is still constrained to be uncorrelated with investment, for example. So investment and competitive pressure are supposed to be completely unrelated according to their model, which is unrealistic. If you applied modification indices in a more thoughtful way, you would go and ask: what should be in the model? What are the constraints? Why do we have a constraint that investment and competitive pressure are uncorrelated, for example? If there is no reason for that constraint, then you free the covariance. Kline also gives us some words of caution about modification indices, basically two things. One is technical: because modification indices tell us how a model is expected to change if we free something, without actually estimating that model, some of the suggested modifications might lead to unidentified models. Some suggested modifications might also be impossible, so the software might tell you to free a parameter that you cannot free. For example, it might tell us to free the scaling indicator of a factor, which would make the model under-identified, or it might point to a parameter that is not part of the parameter matrices and cannot be freed in that software. Then he also makes the general point that I made earlier: blindly following modification indices rarely gives you the correct model.
Modification indices are there to point you at what you should look at and consider; they are not there to tell you what to do. Modification indices are useful. I use them all the time when I do SEM, but they should not be your only diagnostic. I find it very useful to also look at residuals, estimate smaller models, and sometimes run an exploratory factor analysis on a scale or a couple of scales at a time to see what kind of factor structure would fit empirically. So modification indices are useful, but as the only diagnostic tool, and without theoretical guidance, they are just going to mislead you.
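One of those complementary diagnostics, the residual correlation matrix, can be sketched for the running two-factor example. The population correlations (loadings 0.7, factor correlation 0.6) and the single-factor implied correlation (loading squared of 0.38, picked by hand to sit between the within- and cross-block values) are my illustrative assumptions, not an actual ML fit.

```python
import numpy as np

# Population correlation matrix of the two-factor model
# (loadings 0.7, factor correlation 0.6; illustrative values).
lam, phi = 0.7, 0.6
R = np.full((6, 6), lam * lam * phi)   # cross-block correlations
for block in (slice(0, 3), slice(3, 6)):
    R[block, block] = lam * lam        # within-block correlations
np.fill_diagonal(R, 1.0)

# Correlations implied by a one-factor model with a single hand-picked
# loading (a real ML fit would differ slightly, but the pattern holds).
implied = np.full((6, 6), 0.38)
np.fill_diagonal(implied, 1.0)

resid = R - implied
print(np.round(resid, 3))
# Positive residuals cluster within the X1-X3 and X4-X6 blocks, and
# negative residuals appear across blocks: the signature of an omitted
# second factor, which no single modification index can reveal.
```

This is the kind of pattern-level information that modification indices, being univariate, cannot surface: each index looks at one constraint at a time, while the residual matrix shows the block structure at a glance.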