When we use the same measure in different contexts, it is important to establish measurement invariance. For example, if we measure repeatedly over time, we want to be sure that the scale works the same way at all time points. The same thing applies when we collect data from different contexts. For example, if we do cross-cultural research and collect data from five different countries, we want to be sure that the scale works the same way in all those countries, so that we can attribute any differences between the results to differences in the phenomenon instead of differences in how the scale works.
Measurement invariance has different levels, and the required level depends on the purpose of the analysis. The first level is configural invariance. This simply means that if we estimate the same factor model in different contexts, the same model fits well in each of them. So we don't need correlated errors or cross-loadings in one context but not in another. This is simply ordinary model testing and there is nothing special about it.
The actual invariance testing sequence then starts with weak factor invariance. The idea of weak factor invariance is that we test whether the scales can plausibly measure the same thing on the same scale. In practice, we constrain the factor loadings to be the same across occasions or across contexts. Then we move on to strong factor invariance. The idea of strong factor invariance is that we look at the item intercepts and ask whether differences in the item means can be attributed to differences in the latent variables, that is, the attributes being measured, instead of just being differences in how the items work across contexts. Then we have strict factor invariance, where we look at how the error variances behave across different contexts. This level is not normally required and typically not tested.
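The four levels can be written compactly as nested equality constraints on one measurement model. The notation below is an assumption added for illustration and does not come from the lecture: g indexes the context (group or occasion), \tau_g holds the item intercepts, \Lambda_g the factor loadings, \eta_g the latent variables, and \Theta_g the error covariance matrix.

```latex
\begin{align*}
x_g &= \tau_g + \Lambda_g \eta_g + \varepsilon_g,
  \qquad \operatorname{Cov}(\varepsilon_g) = \Theta_g \\[4pt]
\text{Configural:}\quad & \text{same pattern of free and fixed elements in }
  \Lambda_g,\ \Theta_g \text{ for all } g \\
\text{Weak:}\quad & \Lambda_g = \Lambda \text{ for all } g \\
\text{Strong:}\quad & \Lambda_g = \Lambda,\ \tau_g = \tau \text{ for all } g \\
\text{Strict:}\quad & \Lambda_g = \Lambda,\ \tau_g = \tau,\ \Theta_g = \Theta \text{ for all } g
\end{align*}
```

Each level keeps the constraints of the previous one and adds more, which is what makes the models nested.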
So how does the testing sequence actually work? Let's take a look at an example from Little. This example is a bit unconventional in how the latent variables are scaled. Typically we scale the latent variables by fixing the first factor loading to one and fixing the latent variable mean to zero. In this case, however, the variance is scaled by constraining the mean of the factor loadings to one. So instead of constraining a specific factor loading, we constrain the mean of the factor loadings to be one. The same goes for the intercepts: instead of constraining the latent variable mean to zero and estimating the intercepts, we estimate the latent variable mean and identify it by constraining the intercepts to be zero on average. This is not common, but it is one way to deal with the problem of how to pick which indicator is the scaling indicator and which is the reference indicator for the mean, which we need for some longitudinal models.
The configural invariance model basically fits a normal confirmatory factor analysis to the data. We look at whether the model fits well, whether we need to make any modifications, and whether the same modifications are needed for the two time points. Typically in longitudinal studies configural invariance holds if the scale is well developed. So for configural invariance we simply look at whether the model fits well, with no additional constraints.
Then, for weak factor invariance, we look at whether the factor loadings can be the same. That establishes a common scale for the variances of both latent variables. To do that, we constrain the loadings to be the same across occasions: the first loading here is the same as the first loading there, the second loading is the same as the second loading, and the third loadings are the same. So this is weak factor invariance.
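The scaling described above, with loadings averaging one and intercepts averaging zero, is only a different way of identifying the same model. The sketch below assumes a one-factor model with three indicators and made-up parameter values. The function name `to_effects_coding` and all numbers are hypothetical. It converts a marker-variable solution to this scaling and lets you check that the implied item means and covariances do not change.

```python
def to_effects_coding(loadings, intercepts, latent_mean, latent_var):
    """Rescale a one-factor solution so that the loadings average 1
    and the intercepts average 0 (the scaling used in the example)."""
    k = len(loadings)
    c = sum(loadings) / k                       # mean loading, old scaling
    new_loadings = [l / c for l in loadings]    # these now average 1
    a = sum(intercepts) / k                     # shift that centers intercepts
    new_intercepts = [t - nl * a for t, nl in zip(intercepts, new_loadings)]
    new_mean = a + c * latent_mean              # latent variable is a + c * eta
    new_var = c * c * latent_var
    return new_loadings, new_intercepts, new_mean, new_var


def implied_means(loadings, intercepts, latent_mean):
    """Model-implied item means: intercept + loading * latent mean."""
    return [t + l * latent_mean for t, l in zip(intercepts, loadings)]


def implied_common_cov(loadings, latent_var):
    """Model-implied covariance due to the factor (error variances omitted)."""
    return [[li * lj * latent_var for lj in loadings] for li in loadings]


# Hypothetical marker-variable solution: first loading 1, first intercept 0.
marker = ([1.0, 0.9, 0.8], [0.0, 0.3, 0.6], 3.0, 0.5)
effects = to_effects_coding(*marker)

means_marker = implied_means(marker[0], marker[1], marker[2])
means_effects = implied_means(effects[0], effects[1], effects[2])
cov_marker = implied_common_cov(marker[0], marker[3])
cov_effects = implied_common_cov(effects[0], effects[3])
```

Because both scalings imply the same means and covariances, the choice affects how the latent mean and variance are expressed, not how well the model fits.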
If these factor loadings were widely different, we would not be able to conclude that the differences in variances are real, or that this covariance is actually a valid estimate of the covariance between the traits, because it could be that the measures simply work differently, for example so that they are less related to the concept they measure at the second time point than at the first.
Then we have strong factor invariance. In strong factor invariance we constrain the intercepts to be the same across occasions. Typically we would constrain the first intercept to zero, which makes it the reference intercept, and then the second and third intercepts would be estimated and constrained to be the same across occasions, and we would also estimate the latent variable means. So this is strong factor invariance, and it answers the question: can we attribute differences in the levels of the latent variables to differences in the trait being measured, or is it possible that the scales simply work differently in different contexts?
In practice, say we simply take the mean of these items at the first occasion and the mean of these items at the second occasion. The mean at the second occasion could be higher, for example, because the happy indicator has a higher mean while the cheerful and glad indicators have the same means as before. That would be evidence that only the happy indicator reacts to differences over time, and not all the indicators. If all the indicators change by the same amount in their means, that is evidence for a change in positive affect. If only one indicator changes in its mean, that is evidence for the scale working differently across contexts. So the idea of strong factor invariance is that if the indicators change over time, they change in the same way, and that is evidence for change in the actual trait being measured instead of evidence for differences in the measurement process.
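The reasoning about item means can be illustrated with a small numeric sketch. Under strong invariance the model-implied change in each item mean is its loading times the change in the latent mean, so with equal loadings this reduces to the "same amount" case described above. The loadings, the mean changes, and the helper `latent_shift_fit` are hypothetical and only illustrate the logic. This is not a substitute for fitting the constrained model.

```python
def latent_shift_fit(loadings, mean_changes):
    """Least-squares latent mean change that best explains the observed
    item mean changes, plus what each item's change leaves unexplained."""
    num = sum(l * d for l, d in zip(loadings, mean_changes))
    den = sum(l * l for l in loadings)
    delta_alpha = num / den
    residuals = [d - l * delta_alpha for l, d in zip(loadings, mean_changes)]
    return delta_alpha, residuals


items = ["glad", "cheerful", "happy"]
loadings = [1.0, 1.0, 1.0]            # hypothetical, equal for simplicity

uniform_change = [0.4, 0.4, 0.4]      # every item mean rises by 0.4
happy_only_change = [0.0, 0.0, 0.6]   # only the "happy" item mean rises

shift_a, resid_a = latent_shift_fit(loadings, uniform_change)
shift_b, resid_b = latent_shift_fit(loadings, happy_only_change)

for name, ra, rb in zip(items, resid_a, resid_b):
    print(f"{name:9s} unexplained (uniform): {ra:+.2f}   (happy only): {rb:+.2f}")
```

In the uniform case a single latent shift accounts for all three changes. In the happy-only case no single shift does, and the leftover is concentrated in one item, which is the pattern that equality constraints on the intercepts would fail to reproduce.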
In practice we compare these models using nested model testing. So this is a sequence of tests: we start with the unconstrained factor analysis model, the configural invariance model, we add constraints on the loadings to get weak factor invariance, and if we need the strong factor invariance model, we add constraints on the intercepts. This is a nested model sequence. There are two approaches recommended in the literature. One is the chi-square test. The other is a rule of thumb based on CFI: if CFI is more than 0.02 lower than in the previous model, we say that the models are different, and if the CFI difference is less than 0.02, we conclude that the models fit equally well. The CFI rule is difficult to recommend; the chi-square test is actually what you need. If the chi-square test rejects your model, you need to consider the degree of difference. If you have a sample size of thousands, trivially small differences will be found statistically significant, because 0.001 is different from 0 in the population. Whether it makes a difference is something you need to interpret: you look at whether the factor loadings are similar enough, even though they are not identical, and make a call. Sometimes you don't have exactly perfect measurement invariance, and then you need to explain what that means in your particular study context.
To summarize: we need measurement invariance when we have multiple different contexts, so we have multiple groups or repeated measurements over time. What level is required? Configural is always required, so you estimate the same factor model for both occasions or both contexts. Weak is for comparing effects, strong is for comparing levels, and strict is typically not required. The way we go about testing measurement invariance is that we start from the configural model and move to the more constrained models. We always compare against the previous model, because these are nested models, using the likelihood ratio test.
So which one do you require in practice? There are rules of thumb. One is that if you have a model of time, so you are interested in change over time, you need strong invariance, because you want to attribute the changes in the means of the indicators to changes in the trait or latent variable that you measure, instead of differences in how the items work across contexts. If you want to study dynamic models or compare effects from one context to another, you need just weak factor invariance. For example, if you want to study whether happiness and sadness are correlated more in Europe than in the States, you would need weak factor invariance. So whether you are interested in correlations or effects, or in level differences, determines whether you need strong or weak factor invariance.
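The nested comparison described above can be sketched in a few lines. The example assumes plain maximum likelihood chi-square statistics, so that the difference between two nested models is itself chi-square distributed. Robust or scaled statistics would need a corrected difference test. The fit statistics are made up, and the helper names `chi2_sf` and `compare_nested` are hypothetical. The 0.02 CFI threshold is the one mentioned in the lecture.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, computed from the
    series expansion of the regularized lower incomplete gamma function.
    Intended for ordinary test statistics, not extreme values."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = a
    for _ in range(10000):
        n += 1.0
        term *= z / n
        total += term
        if term < total * 1e-15:
            break
    p_lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - p_lower)


def compare_nested(free, constrained, alpha=0.05, cfi_drop=0.02):
    """Compare a constrained model against the less constrained one.
    Each model is a dict with keys 'chisq', 'df' and 'cfi'."""
    d_chisq = constrained["chisq"] - free["chisq"]
    d_df = constrained["df"] - free["df"]
    p = chi2_sf(d_chisq, d_df)
    d_cfi = free["cfi"] - constrained["cfi"]    # positive means fit got worse
    return {
        "delta_chisq": d_chisq,
        "delta_df": d_df,
        "p_value": p,
        "chisq_rejects": p < alpha,
        "delta_cfi": d_cfi,
        "cfi_rule_rejects": d_cfi > cfi_drop,
    }


# Hypothetical fit statistics for the first two steps of the sequence.
configural = {"chisq": 10.2, "df": 8, "cfi": 0.995}
weak = {"chisq": 13.1, "df": 10, "cfi": 0.992}

result = compare_nested(configural, weak)
print(result)
```

The same function would be called again with the weak and strong models if level comparisons are the goal. A rejection by the chi-square test is then the cue to inspect how large the differences in the freed parameters actually are.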