An important distinction in structural equation models is between what we refer to as recursive and non-recursive models. All the models that we've looked at in these videos so far are recursive models. A recursive model is a model where all of the causal effects go in the same direction, that is, they are unidirectional, and the disturbances, the error terms, are not correlated with one another. We can contrast this with a non-recursive model, which is a model where we have some kind of feedback loop, where two variables are causing each other and we therefore have what we can refer to as reciprocal effects, or where we have correlated disturbances. This difference is important because it has implications for model identification, and it has implications for how we interpret, and whether we can trust, the estimates that we get from our models.
So here's an example of a recursive path diagram. We have x1 causing x2, and there's a disturbance term for that equation. x2 in turn causes y1, we have a disturbance term there, and also in that equation we have x3. So y1 is regressed on x2 and x3. Here we see that all of the causal effects are going in one direction and none of the disturbances are correlated. So this would be a recursive model.
A non-recursive model, on the other hand, will have some kind of feedback loop, and here you can see that there is such a feedback loop between y1 and x1. x1 is causing y1 and y1 is causing x1. These are reciprocal effects, and this is actually quite a plausible kind of causal mechanism. There are many situations where we would expect two variables to be causing each other. We can think, for example, of economic perceptions. The more people perceive that the economy is doing well, the more they will support the government, and the more people support the government, the more they may think that the economy is doing well.
So there are many examples where we would want to estimate this type of equation. We also see here that we have a correlation between the two disturbance terms, the errors in those structural equations. And that's indeed implied by the reciprocal causal effect between y1 and x1, which means that the disturbances must be correlated.
Now there are some gray areas, and this results in what we refer to as partially recursive models. Here we see that we have a correlation between the disturbance terms, but we don't have any direct effects amongst the endogenous variables in this model, the endogenous variables here being y1 and x1. So in this case we can treat this, in terms of identification, as a recursive model. But in this diagram we do have a direct effect amongst the endogenous variables. We have a regression of y1 on x1, and we then also have this disturbance term correlation. So this would be treated as a non-recursive model.
Now I said that recursive versus non-recursive status is important for identification, but that's not terribly interesting from an analytical perspective. Recursivity is also important because a recursive model is always identified and it's simple to estimate. We can estimate recursive models using a set of OLS regressions. But that simplicity is also rather restrictive. It means that we can't estimate the more complex kinds of models that we would often want to. So introducing a non-recursive model means that we have more flexibility in the kinds of model specifications that we can use. This is actually one of the main reasons why many analysts want to use structural equation modeling software: it's very easy to specify this kind of model.
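
The point made above, that a recursive model can be estimated with a set of OLS regressions, can be sketched on simulated data. This is only an illustration under stated assumptions: the structure follows the recursive diagram described earlier (x1 causes x2, and y1 is regressed on x2 and x3), but the path coefficients 0.5, 0.7 and 0.3, the sample size, and the helper name `ols` are all made up for the example.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slope_1, ...] for y on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical recursive model: every effect runs one way and the
# disturbances are drawn independently (so they are uncorrelated).
x1 = rng.normal(size=n)
x3 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)              # equation 1: x1 -> x2
y1 = 0.7 * x2 + 0.3 * x3 + rng.normal(size=n)   # equation 2: x2, x3 -> y1

# Each structural equation is estimated separately by OLS.
b_x2 = ols(x2, x1)
b_y1 = ols(y1, x2, x3)

print("x2 on x1:", b_x2[1])
print("y1 on x2:", b_y1[1], " y1 on x3:", b_y1[2])
```

Because the disturbances are uncorrelated with the predictors in each equation, the equation-by-equation OLS slopes should land close to the values used to generate the data.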
But we have to be aware that just because we can specify a model as a path diagram and generate some parameter estimates, that doesn't mean that we can always trust them as valid estimates. Non-recursive models, despite being more flexible, can also be challenging in terms of identification. To achieve an identified model we will often need to include other variables which may not be of direct substantive interest, but which we need nonetheless for identification purposes. And, as I said, the fact that a model is empirically identified doesn't necessarily mean that we can trust the parameter estimates. In particular, if we want to have unbiased and consistent estimates for reciprocal paths, that is, when we have arrows running in both directions between two variables in a model, we have to make some quite strict, and some would argue often implausible, assumptions about the variables in the model. In this sort of context, with reciprocal effects, we need to assume that we have some exogenous variables in the model that we can treat as instrumental variables.
This is another important idea for understanding and implementing this kind of non-recursive model: the idea of an instrumental variable. To understand what we mean by an instrumental variable in this context, it's useful first to understand another concept, which is that of an endogenous regressor. Here we've got a simple path diagram to help understand what we mean by an endogenous regressor. We have y1 regressed on x1, and we want to estimate beta, where we'd ideally like to treat beta as the causal effect of x1 on y1. But we also see here that we have a covariance, or a correlation, between the disturbance term in this equation and x1, which is the predictor. Now we know from our OLS classes that this is an assumption that we have to make in OLS: that there is no correlation between the error term and the predictors.
If there is such a correlation, then we have what's referred to as an endogenous regressor. Here x1 is an endogenous regressor. This can be for a number of reasons, but it will often be because of some unobserved variable that we should have in our model, one that is related to both x1 and y1, or it may be because of simultaneous causal effects, where x1 is causing y1 and y1 is causing x1, the sort of reciprocal effects that we're interested in here. Either would generate this correlation. So when we have this kind of situation we need an instrumental variable for x1 if we want to be able to interpret the beta coefficient as the causal effect of x1 on y1.
An instrumental variable is a variable that deals with this endogenous regressor problem, and it does this by introducing exogenous variability into the endogenous regressor. We'll refer to our instrumental variable as z in this context. To have the properties of an instrumental variable, the instrument must cause the endogenous regressor but not directly cause the outcome.
Now there are lots of different examples of instrumental variables that have been used in the empirical literature, and we'll come on to some of those, but one good way of thinking about an instrumental variable is the assignment variable in a randomized controlled trial: the randomization which determines whether someone is allocated to the treatment or to the control condition. This is a perfect instrumental variable because it's very strongly correlated with whether you are in the treatment or the control group, but it is uncorrelated with anything else that affects the outcome, so it can only influence the outcome through the treatment. That's a good way of thinking about what an instrumental variable is, and the sorts of variables that we will be looking to use as instruments should come as close as possible to that sort of randomization variable.
So this is what we're looking for in terms of a path diagram. We've got an endogenous regressor, x1, and we need an instrument, which is z1 here, which causes x1 but doesn't cause y1 other than through its effect on x1. You can see it has an indirect effect on y1 but not a direct effect. So this would be an instrumental variable.
As I said, there are many papers, particularly in economics, which have used natural variability, natural experiments if you like. One example is the Vietnam draft lottery, which determined whether US citizens were allocated to go to Vietnam or not. This was done on the basis of a random lottery. So if you want to assess the effect of going to Vietnam on later outcomes like your earnings, your education, your mental health and so on, then you can use that initial draft lottery as an instrument for going to the Vietnam War.
Another one that's been used is proximity to your nearest college, for studying the effects of education on earnings. Obviously, if you just look at the relationship between education and earnings, there are many unobserved variables that mean you couldn't take the simple correlation between education and earnings as a causal effect. But you can use something like proximity to a college as an instrument, on the argument that it has a direct effect on education but no effect on your earnings other than through its effect on education.
A third example might be variability in the compulsory schooling age. This can vary across geographic boundaries: US states, for example, have different compulsory schooling ages, and in the UK there was an increase in the compulsory schooling age from 15 to 16 in 1973. This can be used as an instrument for, again, the effects of education on later outcomes such as earnings, because the policy change introduced random variability into how much schooling people obtained, but it wouldn't have had any direct effect on earnings.
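
The instrument logic described above, where z1 causes x1 and affects y1 only through x1, can be checked on simulated data. What follows is a minimal sketch, not a real analysis: the true effect `beta = 0.5`, the unobserved variable `u`, the strength of the instrument, and the helper name `slope` are all assumptions made for illustration. The sketch compares a naive OLS slope with a simple instrumental variable estimate, computed both as a ratio of covariances and as two-stage least squares.

```python
import numpy as np

def slope(y, x):
    """Simple regression slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rng = np.random.default_rng(1)
n = 50_000
beta = 0.5                                 # hypothetical true effect of x on y

u = rng.normal(size=n)                     # unobserved variable affecting x and y
z = rng.normal(size=n)                     # instrument: affects x, not y directly
x = 0.8 * z + u + rng.normal(size=n)       # endogenous regressor
y = beta * x + u + rng.normal(size=n)      # disturbance (u + noise) correlates with x

# Naive OLS ignores the correlation between x and the disturbance.
beta_ols = slope(y, x)

# IV estimate as a ratio of covariances.
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

# The same estimate via two stages: predict x from z, then regress y on the prediction.
x_hat = x.mean() + slope(x, z) * (z - z.mean())
beta_2sls = slope(y, x_hat)

print("OLS:", beta_ols, " IV:", beta_iv, " 2SLS:", beta_2sls)
```

Under these made-up settings the OLS slope should come out noticeably above 0.5, because it picks up the shared influence of `u`, while the two IV calculations agree with each other and sit close to the true value.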
So those are some examples of instrumental variables, and they should give you an idea that a variable has to meet some quite strict requirements to be a good instrument. Even for these three quite well-known examples, there have been criticisms of whether they really are valid instruments. So again, this is something of a caution, because non-recursive models are easy to specify.
Here's an example, again using the European Social Survey data, where we're looking at the relationship between life satisfaction, happiness and social trust. Scholars have been interested in what the relationship is here, and this model specifies reciprocal causality between these variables. Now if you just tried to estimate that model without the two exogenous variables at the bottom of the diagram, whether you're married and your earnings, it would be unidentified. So these variables are acting as instrumental variables in the model. But it's not really plausible to assume that they are valid instruments, because we have to assume that neither of them has a direct effect on the other latent variable in this model. Each one only causes one latent variable, but it's not really reasonable to assume that your income is unrelated to your level of social trust. We know that's an implausible assumption.
So we have to be careful. Just because we can specify a non-recursive structural equation model and get parameter estimates for it, that doesn't mean the estimates are right. We have to check the assumptions that are needed for identification and assess whether we can really trust the estimates.
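
The caution above can be illustrated with a small simulation. This is a hedged sketch using observed rather than latent variables, and every name and number in it is hypothetical: `y1` and `y2` stand for two variables with reciprocal effects, `z1` and `z2` stand for the exogenous variables used as instruments, and `d` is a direct effect of `z1` on `y2` that the analyst assumes to be zero. The disturbances are drawn independently here to keep the sketch short. The `two_sls` helper is a plain two-stage least squares written for this example, not a library routine.

```python
import numpy as np

B12, B21 = 0.3, 0.4    # hypothetical reciprocal effects: y2 -> y1 and y1 -> y2
G1, G2 = 0.8, 0.8      # hypothetical effects of z1 on y1 and of z2 on y2

def simulate(d, n=50_000, seed=2):
    """Draw data from y1 = B12*y2 + G1*z1 + e1 and y2 = B21*y1 + G2*z2 + d*z1 + e2."""
    rng = np.random.default_rng(seed)
    z1, z2, e1, e2 = rng.normal(size=(4, n))
    # Solve the two simultaneous equations for y1 (reduced form), then get y2.
    y1 = (G1 * z1 + e1 + B12 * (G2 * z2 + d * z1 + e2)) / (1 - B12 * B21)
    y2 = B21 * y1 + G2 * z2 + d * z1 + e2
    return y1, y2, z1, z2

def two_sls(y, endog, exog, instr):
    """2SLS coefficient on endog in y ~ exog + endog, using instr as the instrument."""
    const = np.ones(len(y))
    Z = np.column_stack([const, exog, instr])    # everything treated as exogenous
    X = np.column_stack([const, exog, endog])
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]     # first stage
    return np.linalg.lstsq(X_hat, y, rcond=None)[0][-1]  # second stage

# Valid instrument: z1 affects y2 only through y1 (d = 0).
y1, y2, z1, z2 = simulate(d=0.0)
b21_valid = two_sls(y2, endog=y1, exog=z2, instr=z1)

# Invalid instrument: z1 also affects y2 directly (d = 0.5), but the
# analyst still fits the same model that leaves this path out.
y1, y2, z1, z2 = simulate(d=0.5)
b21_invalid = two_sls(y2, endog=y1, exog=z2, instr=z1)

print("true B21:", B21, " valid:", b21_valid, " invalid:", b21_invalid)
```

In both runs the model is identified and produces an estimate without complaint. Only in the first run, where the assumed exclusion actually holds, should the estimate sit near the true value of 0.4.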