In experimental research, the key idea was that we have randomization, and because of randomization the treatment and control groups are comparable, or identical, in large samples, so we can make a valid causal claim. One of the problems when randomization is missing is that the treatment and control groups may be different to start with. If the treatment and control groups differ on an observed characteristic, for example health, then we really cannot tell whether the difference after the treatment is due to the treatment or whether it is just a difference that existed before the treatment and persists over the study. Difference in differences was designed to address this scenario.

Difference in differences applies to the quasi-experimental non-equivalent control group design. We cannot really compare the two groups after the treatment, because there could be a difference that existed before the treatment. We also cannot compare the treatment group before and after the treatment, because it is possible that there would have been a change over time anyway. Difference in differences addresses both of these problems. How it works is that we measure both groups before the treatment and after the treatment, and no randomization is required. The name difference in differences comes from the main idea of this technique.
The idea of the technique is that we first calculate the difference between the groups before the treatment. Then we calculate the difference after the treatment, and the difference in those differences, the delta here, is the causal effect. If the difference between the treatment and control groups grows over time after the treatment, then we conclude that the treatment had a positive effect. If the difference between the treatment and control groups decreases or becomes more negative over time after the treatment, then we conclude that the causal effect is negative. So here we have lambda, the initial difference; that's the first difference. Then we have the baseline change over time. We are typically not interested in this baseline change, but it is the change that prevents us from just comparing the treatment group against itself before and after the treatment. And then we have the treatment effect, delta.

So how do we do this in practice? We could calculate the two differences and then subtract them, but the problem is that if we do that in Excel, for example, how would we calculate the standard errors? In practice this analysis is typically carried out using a simple regression model. In the regression model we have d, the treatment, and t, the time. We have beta 1 for the treatment, beta 2 for time, and beta 3 for the interaction between treatment and time. How do we interpret this? Beta 1 is the initial difference: the marginal effect of d when t is zero, so that's the difference between the groups before the treatment. Beta 2 is the baseline change over time, so it is the marginal effect of time for the control group. Beta 3 is the treatment effect. It tells us how much the difference between the treatment and control groups increased when we went from pre-treatment to post-treatment. So it is very simple: you just need two variables and their interaction, and then you have a causal effect.
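As a rough illustration of the two routes just described, here is a minimal sketch that computes the difference in differences by hand and then with the regression y = beta0 + beta1*d + beta2*t + beta3*d*t. The data are made up for the illustration (the cell means, the within-cell spread, and all variable names are assumptions, not from the lecture), and the model is fitted with plain least squares in NumPy, which gives the coefficients but not the standard errors.

```python
import numpy as np

# Hypothetical cell means, keyed by (d, t): d = 1 is the treatment group,
# t = 1 is the post-treatment measurement. These numbers are made up.
cell_means = {(0, 0): 10.0, (0, 1): 12.0, (1, 0): 13.0, (1, 1): 18.0}
offsets = np.array([-1.0, 0.0, 1.0])  # within-cell spread that averages to zero

rows = [(d, t, m + e) for (d, t), m in cell_means.items() for e in offsets]
d, t, y = (np.array(col, dtype=float) for col in zip(*rows))


def cell_mean(group, period):
    """Average outcome for one group in one period."""
    return y[(d == group) & (t == period)].mean()


# Route 1: difference in differences by hand.
diff_before = cell_mean(1, 0) - cell_mean(0, 0)  # initial difference (lambda)
diff_after = cell_mean(1, 1) - cell_mean(0, 1)
did_by_hand = diff_after - diff_before           # treatment effect (delta)

# Route 2: regression with treatment, time, and their interaction.
X = np.column_stack([np.ones_like(y), d, t, d * t])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print("by hand:", did_by_hand)
print("beta1 (initial difference):", beta[1])
print("beta2 (baseline change):   ", beta[2])
print("beta3 (treatment effect):  ", beta[3])
```

With these made-up numbers the initial difference is 3, the baseline change is 2, and both routes give a treatment effect of 3, because the model with both variables and their interaction reproduces the four cell means exactly.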
Of course, if that always gave us the causal effect, the life of a researcher would be very simple, but it does not. It requires some very strong assumptions, so we need to understand those assumptions and also justify them in our article for this technique to produce any valid causal estimates. Because this is simply a regression model, it is useful to start with the regression assumptions. This list is from Wooldridge. We have six different assumptions, and not all of them are equally important. Let's take a look at which assumptions are possibly violated; one of the assumptions is always violated in a difference in differences analysis.

When you read explanations of or introductions to difference in differences, they usually mention that it relies on a parallel trends assumption. The idea of parallel trends is that the control group develops over time and the treatment group develops over time as well, and for this delta to be a valid estimate of the causal effect, we have to assume that the treatment group would have developed the same way over time as the control group did, had the treatment group not received the treatment. So the development of the control group is used as the counterfactual for the development of the treatment group had it not received the treatment, which we don't really observe. This is a critically important assumption, because if the parallel trends assumption does not hold, then our estimates will be inconsistent and the causal inference will be invalid.

So, for example, what would parallel trends mean? Let's assume that we have a medical trial with sick people and healthy people. All the healthy people decide that they will go to the control group, and all the sick people decide that they will go to the treatment group. Then we compare their health over time. If people did not get sick spontaneously, or even if people did not
get cured or get well spontaneously, then there will be no trend over time, and any difference after the treatment will be a result of the medication. In other words, if the medication is the only reason why people could get well, then we have a clean causal estimate. But that's of course not the case. If we have people who are sick to start with, some of those people will get better over time even if they did not get any medication, so the average health of people who are sick to start with is probably going to increase over time. Also, if you have people who are healthy to start with, some of those people will get sick during the study period, so on average their health will decrease over time. That's an example of a scenario where parallel trends does not hold.

So which of these regression assumptions is violated when the parallel trends assumption fails? Let's take a look at the regression equation. If we have a non-parallel trend, it means that the effect of time t would be different for the treatment group even if the treatment group did not get the actual medication, or whatever the treatment is. If we reorganize this equation a bit, we can see that because we don't observe what the effect would be for the treatment group had it not received the treatment, this term is unobserved and it goes to the error term. We can see from this simple rearrangement that the error term now includes the trend of the treatment group had it not received the treatment. If this beta 4 is non-zero, which means that the trends differ, then we have an endogeneity problem, because this term correlates with the two terms that contain t. It correlates because beta 4 is non-zero if the trends are not parallel and d is one for the treatment group. So we have an endogeneity problem unless parallel trends holds.

We also have another assumption that is violated, and it is the independence of observations. The observations
are non-independent because we are measuring each person twice, before the treatment and after the treatment, and there are probably some causes of the dependent variable that persist over time. If we measure one individual before the treatment and the same individual after the treatment, those two scores will be more similar to one another than if we measure one individual before the treatment and another individual after the treatment. So we are in violation of that assumption. These are the two critical assumptions in difference in differences, and let's take a look at how they are typically dealt with. Parallel trends has to be justified, and the violation of independence has to be dealt with.

The parallel trends assumption, or common trends assumption, can be justified in two different ways, and generally you should do both in your study. First, you should provide a conceptual argument based on theory: why do you think that the trends should be parallel? If we think about, let's say, how a person's health evolves over time, we should start thinking about what causes differences in the trends, and then whether those causes are present and also different between our treatment group and control group. The article by Wing gives a pretty good explanation of how you could start arguing for the common trends or parallel trends assumption conceptually.

Then there are empirical ways of doing it. You can test the common trends assumption, or let's say "test", because it is ultimately untestable, but you can provide some evidence that it may hold. The idea of testing the common trends assumption is that we compare how the trends develop before the treatment. This is an example from the right wheel wealth paper. They have two groups, and they show that both groups had the same trend. The treatment group was higher, but we don't care about level differences; we care about differences in direction, and
these two trends follow each other quite closely. Based on this figure, the authors argue that the trends appear to be parallel. If the treatment group had not received the treatment, it would probably have been developing somewhere over here, but because it got the treatment, the difference is actually a lot larger. So that's one way.

There is also a formal way of testing whether the two trends are the same before the treatment. How that works is that you estimate a regression model with a set of time dummies, and then the set of time dummies multiplied by d, which indicates whether the individual belongs to the treatment or the control group. Then you test that all these beta 2 t's are zero using an F test, or a Wald test if you don't use OLS regression. If all these interactions are zero, then we can infer, or we can say, that the trends are probably very similar or identical before the treatment.

So that is how you justify the common trends assumption. Two things: make conceptual arguments, and then show a graph of the trends and say that they are the same. If you want a bit more rigor, you can test it empirically by estimating the time trend for the control group and then adding the differences using these interactions between the time dummies and the variable that indicates group membership.

So that's common trends. How about the non-independence of observations? This is something that a lot of methodological research has been addressing in the last 10 or so years, and there is still no consensus on what the best approach is for dealing with this issue methodologically. If you have a large sample size, then cluster robust standard errors are probably the ideal way of dealing with non-independence of observations, because the standard error is what is affected if the observations are
non-independent. If you have a small sample size, then I recommend taking a look at Wing's article again. They provide lots of references on different techniques, so read about those and make a decision. In large samples, cluster robust standard errors are pretty much always the easiest and safest way to go, and that's how it's done in, for example, Mockon's article, which I use as an example in some of my videos. They just state that they use cluster robust standard errors to account for repeated measures, and then they cite Bertrand's paper, which explains why this is required for difference in differences. Most applications of difference in differences that I've seen in management research don't actually mention this issue, so it's possible that it's not well known and not properly taken care of by management researchers. We don't know; maybe they know about it but forgot to report it.

So let's summarize difference in differences. The important thing about difference in differences is that it's not any kind of magic. It's just a regression analysis where you have one binary variable indicating time, another binary variable indicating whether it's the treatment or the control group, and then the interaction of those two binary variables. Of course you can have control variables too. It depends critically on the parallel trends assumption: that the treatment group would have developed the same way as the control group had it not been treated. If that assumption fails, then our causal inferences will be invalid. Then we have non-independence of observations, which is typically dealt with using cluster robust standard errors. If you have a small sample size, that may not be an ideal technique, but in large samples it works well. This is applied with quasi-experimental designs where you have a non-equivalent control group, and typically how that control group is constructed is that
researchers apply some kind of matching technique. For example, if the people who receive the treatment are between 30 and 40, then you take people from the same age group from the general population as a control. Of course there are more refined matching strategies, like propensity score matching for example, that can be applied.
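To make the point about non-independence concrete, here is a minimal sketch of cluster robust standard errors for the difference in differences regression, clustering on the person because each person is measured twice. Everything in it is an assumption made for the illustration: the data are simulated, the sample size, effect sizes, and variable names are made up, and the sandwich formula is written out by hand in NumPy without any small-sample correction. In real work you would normally use your statistical software's own cluster robust option.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n_people = 200                   # hypothetical sample size

# Two rows per person: t = 0 (before) and t = 1 (after).
person = np.repeat(np.arange(n_people), 2)
t = np.tile([0.0, 1.0], n_people)
d = np.repeat((np.arange(n_people) % 2).astype(float), 2)  # half treated

# A stable person-specific component makes a person's two rows correlated.
person_effect = np.repeat(rng.normal(0.0, 2.0, n_people), 2)
noise = rng.normal(0.0, 1.0, 2 * n_people)
y = 10 + 3 * d + 2 * t + 3 * d * t + person_effect + noise

# OLS fit of y on treatment, time, and their interaction.
X = np.column_stack([np.ones_like(y), d, t, d * t])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

# Conventional standard errors, which assume independent observations.
sigma2 = resid @ resid / (len(y) - X.shape[1])
se_ols = np.sqrt(np.diag(sigma2 * XtX_inv))

# Cluster robust sandwich: add up x_i * residual_i within each person first,
# then form the outer products of those person-level sums.
scores = np.zeros((n_people, X.shape[1]))
np.add.at(scores, person, X * resid[:, None])
meat = scores.T @ scores
se_cluster = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print("beta3 (treatment effect):", beta[3])
print("conventional SE of beta3:", se_ols[3])
print("cluster robust SE of beta3:", se_cluster[3])
```

The coefficients are the same either way; only the standard errors change, which is why the repeated measures matter for inference rather than for the estimate itself.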