Let's take a look at the next example. Deep House 1999 explains that there are two reasons why a lagged dependent variable is included in their model. The first reason is that it reflects the possibility that the effects of changes in the independent variables are distributed over multiple time periods. So basically they're saying that the effect of x on y is not instantaneous; rather, when x changes, y changes over multiple different time periods, so we have an effect that is distributed over time. The other reason they state is that adding a lagged dependent variable can control for omitted variable bias. Of course, this is not a general solution that will control for every possible omitted variable or capture every possible kind of process that is distributed over time. And while this article states the two advantages of using a lagged dependent variable, it unfortunately does not go into the specifics. If you include a lagged dependent variable, then you should really explain a bit more what the distributed process is that you are focusing on, or what the omitted variable is that you would like to have in your model but don't, and that you are therefore proxying with a lagged dependent variable. Of these two, the second is the most commonly stated reason for including a lagged dependent variable. Let's take a look at how a lagged dependent variable can proxy omitted variables. This example comes from Wooldridge (2002), who presents a regression equation where the dependent variable is the log of scrap, the amount of scrap a firm produces. Some firms get a grant for reducing scrap; other firms don't.
We regress scrap on grant, an indicator of whether the company got the grant for reducing scrap or not, and we find that the effect of grant on log scrap is positive. So the companies that got the grant have more scrap. Is that evidence that the grant was counterproductive? Did the grant actually cause scrap levels to go up? Probably not. It does not sound very logical that getting a grant for reducing scrap would make your scrap levels go up. What is probably going on instead is that the companies that initially have a higher level of scrap apply for the grant, while the companies that produce very little scrap to start with don't apply for the grant at all. So the grant, for which the company applies itself, is endogenously determined: it correlates with past values of scrap, and scrap of course persists over time because there is inertia. If your company is producing lots of scrap now, you are likely to produce lots of scrap tomorrow as well; things don't vary randomly over time. We can deal with this kind of issue by including a lagged dependent variable in the model. We take the amount of scrap from the previous time period and the grant, and we regress the current amount of scrap on the previous value and the grant. This shows that, controlling for the previous value, the effect of the grant is actually negative. It's not statistically significant, but it's negative, which is the direction we expected. So how do we interpret this kind of model? Wooldridge gives the following interpretation: beta one measures the proportionate difference in scrap rates for two firms having the same scrap rates in the previous year, where one firm received a grant and the other did not. This is very appealing: you control for the initial differences and compare two firms that initially have the same level of scrap, where one gets the grant and the other doesn't.
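The scrap-and-grant story can be sketched with a small simulation. The numbers and variable names below are illustrative assumptions, not Wooldridge's actual data: firms with high past scrap select into the grant, so the naive regression gets the wrong sign, while adding the lagged dependent variable recovers the negative effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated firm data (illustrative only, not Wooldridge's data).
# Firms with high past scrap are more likely to get the grant.
log_scrap_prev = rng.normal(2.0, 1.0, n)
grant = (log_scrap_prev + rng.normal(0.0, 0.5, n) > 2.0).astype(float)

# True data-generating process: scrap persists over time and the
# grant genuinely REDUCES scrap (true effect = -0.3).
log_scrap = 0.8 * log_scrap_prev - 0.3 * grant + rng.normal(0.0, 0.3, n)

def ols(y, *cols):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones_like(y)] + list(cols))
    return np.linalg.lstsq(X, y, rcond=None)[0]

naive = ols(log_scrap, grant)                       # scrap ~ grant
controlled = ols(log_scrap, grant, log_scrap_prev)  # scrap ~ grant + lag

print(f"naive grant coefficient:    {naive[1]:+.2f}")
print(f"grant coefficient with lag: {controlled[1]:+.2f}")
```

The naive coefficient comes out positive because the grant proxies past scrap; once the lag is in the model, the sign flips to negative, just as in the example above.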
So this sounds like a very good thing to do. There are of course some limitations, but generally, controlling for past levels of something that persists over time is something you should probably at least consider every time you do a longitudinal study. Let's take a look at the explanation that Wooldridge provides for using lagged dependent variables. This is from another book, his introductory book. Wooldridge states again that lagged dependent variables can be used for controlling for unobserved effects, modeling selection, and so on. In this example there is a crime rate and then expenditures on crime prevention, policing and other things. Crime rate and expenditures on policing are highly correlated, because you spend on policing and crime prevention when you have high crime rates: when crime goes up, expenditures on crime prevention also go up. Because they are positively correlated, you could naively conclude that spending more money on crime prevention causes crime to go up, but causality actually runs the other way around. If we want to estimate the causal effect of a dollar spent on crime prevention on the crime rate, then we must control for the initial level. The interpretation of this model would again be: for two cities with the same initial level of crime, what is the effect of an additional dollar spent on crime prevention? That sounds very logical compared to just regressing crime on spending, because spending depends on crime. So using a lagged dependent variable allows you to control for certain kinds of selection, or self-selection, and in this case it controls for the fact that cities with high crime rates tend to spend more on crime prevention. It's the same thing as in the scrap example. So is this a silver bullet? What are the limitations of using lagged dependent variables?
And what are some of the reasons why using lagged dependent variables might not be a perfect solution to an omitted variable problem? Let's take a look at an example presented by Morgan and Winship. The example comes from a paper by Coleman, so this is an actual empirical study that Morgan and Winship analyzed. The study tried to estimate whether Catholic schools are better than public schools. Catholic schools are private schools: you have to apply to them and pay tuition fees, and the paper looked at whether these schools produce better outcomes. The dependent variable y is test scores, and d, the main independent variable, is whether a student was enrolled in a Catholic school or a normal school. There were some observed control variables that can be categorized into x variables, which are causes of test scores that are not causes of going to a Catholic school, and some background variables, like parental income, that we assume affect both test scores and whether a person goes to a Catholic school. There are also some unobserved variables: we can assume that there are variables, for example a student's intelligence, that affect whether they go to a Catholic school and whether they are admitted to the school, and that also correlate with or cause differences in test scores. We cannot possibly measure every variable that could affect both of those; that's pretty much impossible. For one, we don't know all the causes of differences in test scores. If we did, we could just regress test scores on all the causes and get an R-squared of 1.0. Typically we are far below 1.0, so we don't know what all the causes of the test score, our y variable, are. Coleman's strategy here was to use a lagged dependent variable as a control. The idea is that these unobserved effects influence past test scores.
They are basically comparing students who initially have the same level of test scores, and asking whether, in a later period, the students who went to the Catholic school performed better than the others. This sounds appealing, but it has some limitations. It's not a silver bullet, and there are some trade-offs. We need to consider what these trade-offs are, because if there were no trade-offs, then we could just say that even if the lagged dependent variable does not control for everything, the downsides of including one are so small that we should always have a lagged variable in the model. Some of the downsides are related to estimation, but there are also model-related downsides that we need to consider. The two counterarguments against the causal effect estimated from this model are, first, that the lagged dependent variable can be an imperfect control: some of the unobserved factors that affect past performance also affect current performance. For example, if we don't measure students' intelligence, and we assume that intelligence determines whether they go to a Catholic school or not, and intelligence also determines both the past value and the current value of the test score, then we can't include the path from the unobserved intelligence variable to the y variable, and our estimates will be biased because the model is mis-specified. The lagged dependent variable works as a perfect control only when the effects of the unobserved variables are fully mediated by the lagged dependent variable; that is, the unobserved variables affect only the lagged value and do not affect the current value directly. Another, more subtle criticism is that there can be differences between individuals that don't affect selection but do affect both the lagged dependent variable and the current value of the dependent variable.
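The imperfect-control problem can be illustrated with a simulation. This is a hypothetical sketch with invented names and numbers, not Coleman's data: when the unobserved ability affects the current score only through the past score, the lagged dependent variable removes the bias; when ability also has a direct path to the current score, a spurious school effect remains.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Hypothetical data: assume NO true Catholic-school effect.
ability = rng.normal(0.0, 1.0, n)                         # unobserved
catholic = (ability + rng.normal(0.0, 1.0, n) > 0).astype(float)

def ols(y, *cols):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones_like(y)] + list(cols))
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Case 1: ability affects the current score only THROUGH the past
# score (full mediation) -> the lagged score is a perfect control.
past1 = ability + rng.normal(0.0, 0.5, n)
curr1 = 0.7 * past1 + rng.normal(0.0, 0.5, n)
b1 = ols(curr1, catholic, past1)[1]

# Case 2: ability also affects the current score directly
# (imperfect mediation) -> the lagged score is an imperfect control.
past2 = ability + rng.normal(0.0, 0.5, n)
curr2 = 0.7 * past2 + 0.5 * ability + rng.normal(0.0, 0.5, n)
b2 = ols(curr2, catholic, past2)[1]

print(f"full mediation:   school effect {b1:+.3f} (true 0)")
print(f"direct path left: school effect {b2:+.3f} (biased upward)")
```

In case 2 the lagged score is only a noisy proxy for ability, so conditioning on it does not fully break the back-door path from school choice to the current score.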
In either case, the lagged dependent variable model is slightly mis-specified, because there are unobserved variables that affect both the lagged value and the current value. These two diagrams correspond to different specifications: here we have an unobserved effect that is allowed to be correlated with the predictor of interest, which corresponds to the fixed effects specification, and here we have an unobserved effect that is not allowed to be correlated with the predictor, which corresponds to the random effects specification in econometrics and makes the random effects assumption. Now, if including the lagged dependent variable as a control causes model mis-specification under either one of these models, should it be included or not? That's a tricky question that the literature really does not have an answer to. So even if a lagged dependent variable is not a perfect control, should you still try controlling for it? My personal opinion is probably yes: consider having one every time, and include the lagged dependent variable unless there is a good reason not to. So that is using a lagged dependent variable as a proxy for omitted variables, but lagged dependent variables also play another role: they can be used as a proxy for the effects of past values of x. This example comes from Keele and Kelly, who come from political science. They present a model where a politician's approval rating depends on the economy, and also on the past economy. If the economy did really well for the first three years of your four-year term as president, but then there is a downturn in the final year, people will remember the first three years that were really good under your presidency when they go to vote. So it's not only the current state of the economy that matters, but also how the economy did during the first three years of your term.
So history matters; it's not just a snapshot independent of the past. If we have the data for the economy, we would normally use something called a finite distributed lag model. It is finite because we have a finite number of lags, here three lags, which basically corresponds to a four-year presidential term; the economy in every year matters. It's called distributed because we have different lags of the independent variable. But what if we don't have the economy data? We can use a lagged dependent variable under certain assumptions. Let's take a look at another model: a corresponding model with infinite distributed lags and geometric decay. The difference is that here we assume the effect of the economy, beta zero, decays at a constant rate over time. The current economy has an effect of beta zero; the effect from the previous year is lambda, the decay parameter, times beta zero; from the year before that, lambda squared times beta zero; and so on, to infinity. When we look at the infinite distributed lag with geometric decay, we can actually rewrite the equation using a lagged dependent variable. How that works is that we first write the equation for the previous year: we take approval at time t minus one, the lagged value of the dependent variable, which depends on the economy with all the indices shifted back by one. We then multiply both sides of this equation by lambda and rearrange a bit. Now we can see that the third equation and the first equation contain a common part: the whole history of the economy appears on the right-hand side of both. Therefore we can substitute the lagged approval in place of that history, and that gives us a model with the lagged dependent variable as a predictor. So what is the interpretation of this model? We can see that the current state of the economy is still there.
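The substitution described above is known as the Koyck transformation. Written out (using A for approval and E for the economy, which are symbols chosen here for illustration), the steps are:

```latex
A_t = \alpha + \beta_0 E_t + \lambda\beta_0 E_{t-1} + \lambda^2\beta_0 E_{t-2} + \dots + u_t

A_{t-1} = \alpha + \beta_0 E_{t-1} + \lambda\beta_0 E_{t-2} + \dots + u_{t-1}

% Multiply the second equation by \lambda and subtract it from the
% first: the infinite history collapses into the lagged dependent
% variable.
A_t = \alpha(1-\lambda) + \beta_0 E_t + \lambda A_{t-1} + (u_t - \lambda u_{t-1})
```

Note that the error term of the transformed model is u_t minus lambda times u_{t-1}, a moving average, which is one reason estimation with lagged dependent variables needs some care.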
Economy at time t has the effect beta zero, and then we have the coefficient on the past, the previous approval rating. That works as a proxy for the historic effects of the economy, and the regression coefficient of past approval, approval at t minus one, is an estimate of the decay parameter of the model. This way we can model effects that unfold over time, where history matters. Of course, this only works if we can assume constant geometric decay. If, for example, beta one were about the same as beta zero, with decay only starting at beta two and continuing at beta three, then this wouldn't work. We have to assume geometric decay for this approach to work. Ideally we would have all the data and use all those lagged regressors in the model to understand the nature of the dependency over time. Sometimes we don't have the data, and in those situations using the lagged dependent variable as a proxy for these distributed effects is a possibility we should seriously consider. So, a summary of the lagged dependent variable. Most commonly, lagged dependent variables are used for controlling for unobserved effects that may influence the selection or the value of the main independent variable. This is the most important reason for using lagged dependent variables. Sometimes lagged dependent variables are also used for modeling effects that decay over time, but quite often we actually have long panel data sets, so we can include the x's from different lags and estimate the finite distributed lag model directly. So the first reason is the most common reason for using a lagged dependent variable. Another reason for using a lagged dependent variable is that the dependent variable may actually have a causal effect on itself over time, and this will be the case if you have a process that is cumulative over time.
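Here is a sketch of the geometric-decay case with simulated data. The variable names and parameter values are invented, and for simplicity the series is generated directly from the recursive form with an i.i.d. shock (which sidesteps the moving-average error of the exact Koyck transformation), so OLS on the lagged dependent variable recovers both the contemporaneous effect and the decay parameter:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 5000
beta0, lam = 2.0, 0.6   # contemporaneous effect, geometric decay rate

economy = rng.normal(0.0, 1.0, T)

# Simulate approval from the recursive (Koyck) form of the infinite
# distributed lag: A_t = beta0 * E_t + lam * A_{t-1} + shock_t.
approval = np.zeros(T)
for t in range(1, T):
    approval[t] = (beta0 * economy[t] + lam * approval[t - 1]
                   + rng.normal(0.0, 0.1))

# Regress A_t on E_t and A_{t-1}; the coefficient on the lagged
# dependent variable estimates the decay parameter lambda.
X = np.column_stack([np.ones(T - 1), economy[1:], approval[:-1]])
b = np.linalg.lstsq(X, approval[1:], rcond=None)[0]
print(f"estimated beta0  = {b[1]:.2f}  (true {beta0})")
print(f"estimated lambda = {b[2]:.2f}  (true {lam})")
```

The single lagged-approval term stands in for the entire infinite history of the economy, exactly as the substitution above suggests.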
For example, if you want to explain a firm's assets, then the starting point for the assets at year t is the amount of assets at t minus one, and we just look at how much the firm gained or lost; the previous year is always the starting point. If you have this kind of accumulating process, then there can be an actual causal effect of the lagged dependent variable on the current value, and the reason for using a lagged dependent variable would not be proxying but the direct modeling of a causal effect. Unobserved effects are something that we should also consider: it's possible that there is an unobserved effect, unobserved heterogeneity, that affects both the current value and the past value of the dependent variable. This is something that you should pretty much always address in your model if you have panel data, but unfortunately it can get a bit complicated on the modeling front. Nevertheless, using lagged dependent variables is something that I would always consider, even if it makes estimation a bit more challenging, because estimation problems are basically engineering problems: they have solutions. Your model should be driven by your theoretical concerns and the nature of the phenomenon that you model, not by anything that relates to the convenience of actually estimating the model.
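The accumulation case can be sketched in a few lines. The names and parameter values are illustrative assumptions: because this year's assets genuinely start from last year's, the lag belongs in the model causally and its true coefficient is one.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 2000

# Accumulation process (illustrative names): this year's assets start
# from last year's, plus net gains driven by an observed variable x.
x = rng.normal(0.0, 1.0, T)
assets = np.zeros(T)
for t in range(1, T):
    assets[t] = assets[t - 1] + 0.5 * x[t] + rng.normal(0.0, 0.2)

# Here the lag is part of the causal process, not a proxy: the
# regression recovers a coefficient of 1 on lagged assets.
X = np.column_stack([np.ones(T - 1), x[1:], assets[:-1]])
b = np.linalg.lstsq(X, assets[1:], rcond=None)[0]
print(f"coefficient on lagged assets = {b[2]:.3f}  (true 1)")
print(f"coefficient on x             = {b[1]:.3f}  (true 0.5)")
```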