Regression discontinuity design is a method for causal analysis that is not very common in management research, but it is very useful in certain scenarios, and quite a few articles about how to make causal claims recommend this technique as an alternative when it is applicable. So what is regression discontinuity design, why is it useful, and under which scenarios?

The idea of regression discontinuity design is that there is a regression line, or some other kind of function, that you estimate through the data, and then there is a discontinuity, which represents a causal effect. This is a bit of an abstract presentation, and I'll get to the details a bit later. The key idea is that assignment to the treatment and control groups is non-random: we as researchers are not putting these observations in the treatment and those observations in the control. But we know that the allocation is based on some variable that we observe, and we know that there is a cut-off or threshold. If you pass the threshold, you belong to the treated, and if you don't, you belong to the control. Around this threshold, the observations that almost made it to the treatment but not quite, the last ones that belong to the control, are nearly identical to those that just got into the treatment.

Consider, for example, kids who go to school. There is a clear cut-off, at least in Finland: if you were born on December 31st, you go to school with all the kids who were born in the same year; if you were born on the next day, January 1st, you go to school with the kids who were born in the next year. The difference between December 31st and January 1st is not great in terms of age, but it makes a big difference in which year you start school, so there is a clear cut-off. Another example is voting.
If we have two candidates and we vote on who becomes the president, then a candidate who receives 49.9% of the vote and one who receives 50.1% are about equally popular, but there is a big difference in the outcome, because the threshold for winning the presidency, or whatever office you are running for, is at 50% if there are two people in the election. So we have this variable X that determines the selection, and we have a clear cut-off that separates who goes to control and who goes to treatment, and then we can estimate the causal effect.

This is easier to understand through an example, so let's take a look at one. This is a funding agency in Finland that gives grants to companies that apply for money. The grants are supposed to help these companies develop their business, grow, expand internationally and so on, and we want to understand whether the grant has a causal effect on company performance. Of course, we cannot simply compare the companies that got the grant with those that didn't, because this agency picks companies based on their potential performance. A very low potential company has no possibility of getting a grant, but a very high potential company will almost certainly get a grant if it applies for one. So if we compare the companies that didn't get the grant with those that did, we don't really know whether the difference arises because the good companies, which would have grown anyway, got the grant, or whether it is the effect of the grant itself.

So how could regression discontinuity be applied in this scenario? Let's assume, using this figure from Siväkes article, that for the sake of argument whether a company gets a grant (treated) or doesn't get the grant (control) is based on a score.
So let's assume that this agency scores all applications on a scale between 0 and 100. If you get 50 points or more, you get the grant; if you get less than 50, you don't. Of course, company performance correlates highly with the score of the funding application, because this agency takes past performance and future potential into consideration when assigning the scores. Our causal claim here depends on comparing the companies that were just below 50 points, so they didn't get the money, with the companies that were just above the threshold and got the money. That difference is the causal effect.

Let's take another example to understand the principle of regression discontinuity design. This is from Flammer's paper. She has written a couple of papers using RDD, and here she is looking at votes in a shareholder meeting. The paper is about long-term orientation, and the vote is about whether or not a CEO gets a compensation package that is focused on the long-term performance of the company. The outcome is the return of the stock, or more precisely abnormal returns, as valued by shareholders and investors. If it is certain or almost certain that the company will adopt the CEO compensation package, then that will be taken into account in the stock price well before the vote. If it is almost certain that the company will not adopt the compensation package, that will also be taken into account by the investors well before the actual vote. But if it is 50-50, investors don't know whether the package will go through or not, so they really cannot anticipate it.
So the idea here is that there is a victory margin, which tells whether the package was approved or not approved, and investors really cannot anticipate the result of a close vote. If we compare the just rejected and the just accepted proposals, then we can get the causal effect of adopting that package, and the reason this works is that the investors cannot anticipate the result. If we know that there is only a small chance that the package will be adopted, then that information has been available to the investors and is already included in the stock price. The same applies at the other end: there are no abnormal returns, because investors are already counting in the stock price on the fact that the package most likely will pass. Only in these uncertain scenarios do we have the big difference, and that big difference is the causal effect.

How do you then implement this analysis in practice, and how do you report it? Regression discontinuity is something that not all researchers are aware of, so it is a good idea to explain the idea of the analysis to your readers. That is what Flammer does in her article: she explains that between 49.9% and 50.1% there is probably not much difference, the vote could have gone either way, and chance affects whether the proposal goes through or not.

How then do we actually estimate the causal effect? In the equation, we want to estimate the coefficient for passing. Quite often we cannot assume that the relationship between the margin and the stock performance is linear; only in rare cases would that be true. In practice, in regression discontinuity design we estimate some kind of polynomial model, some kind of curve, or a non-parametric model, which can't be expressed as a simple equation. The polynomial case is easier to understand as a regression model, so I'll focus on that. This article estimates a third-order polynomial for the left-hand side, the proposals that didn't go through, and
then another third-order polynomial for the right-hand side, for the proposals that actually were adopted.

The way you run the regression model is this. First we have the third-order polynomial: the margin to the first, second and third power, each with a regression coefficient. Then we have the same first, second and third powers of margin multiplied by pass, which is a binary variable indicating whether the proposal was passed or not, and that estimates another polynomial. So we are basically estimating one polynomial and then estimating how much the second polynomial differs from the first. The actual coefficients of the polynomials are not important in themselves, but they are useful because we want to understand the trend right before the cut-off and right after the cut-off in order to estimate the causal effect, the gap between them. We want to control for the fact that investors are anticipating a certain kind of decision, and that anticipation is modeled with the polynomial.

The causal effect is the marginal effect of pass at a margin of zero. Here the margin is zero at the threshold: a positive margin means that the proposal passed and a negative margin means that it didn't pass. In some other scenarios the cut-off may be something other than zero. If this were the actual vote share and not the margin, then the cut-off would be 50, and the causal effect would be the marginal effect of pass at votes equal to 50. In practice, the easiest way to estimate this kind of model is probably to convert your data so that the cut-off point is at zero, in which case the coefficient of pass, this beta one, directly gives the causal effect. Otherwise you need to use, for example, Stata's margins command or another way of calculating the marginal effect, to work out the values of all these interactions at the cut-off point.

So how does regression discontinuity compare with other designs? This technique is sometimes compared with the randomized experiment, and they have the similarity that in
both cases we know exactly what the source of variation of the independent variable is. Recall that the problem of endogeneity arises when the source of variation of the independent variable also affects the dependent variable. So whenever you have a potential endogeneity problem, you need to think about why the independent variable x varies in your sample. In randomized experiments, the source of variation for x is our randomization. In regression discontinuity design, the selection process is known completely, and there is a small chance element that can affect whether you end up right above or right below the threshold. That chance fluctuation, that small randomness, is the reason why we can make causal claims. If we have a predicted vote share of 50%, then the vote can go either way; we don't really know, because there is randomness involved.

Both techniques make the same assumptions. Standard explanations of these techniques start with the stable unit treatment value assumption (SUTVA), which means that the value of the treatment for each observation does not depend on whether any other observation got the treatment. If you consider that from the multilevel modeling perspective, it would mean that the contextual effects are zero, so that is equivalent to the SUTVA assumption. Then there is no attrition: we assume that every person who goes to the treatment is actually measured on the outcome.

Then we have treatment contamination and treatment misallocation. Treatment contamination refers to cases assigned to the control who actually received the treatment. For example, if we give some advanced teaching materials to half of the teachers and not to the other half, it is possible that those who got the materials will share them with those who belong to the control group. That is treatment contamination. Then we have treatment misallocation, which is also sometimes referred to as non-compliance, so if we have
persons assigned to the treatment group, then maybe they forgot to take their medication or something like that. Treatment manipulation refers to the case where the subjects know that there will be an allocation into treatment and control, and they try to game the system to get into the treatment. That causes problems.

There is also a difference in what these designs estimate. The randomized experiment gives us the average treatment effect, and regression discontinuity design gives us the average treatment effect at the cut-off. Whether we can generalize from those close 50-50 votes to other votes is not clear, so generalizability to other scenarios is not that clear here. In randomized experiments we also have more power, because we don't estimate anything except the effect of the treatment. In RDD there is more complexity: we have to estimate the trend before and the trend after the cut-off, and that decreases our statistical power.

Let's take a look at how these different assumptions are investigated and justified in Flammer's paper. One useful thing to check is whether there is a discontinuity in the distribution of proposals around the threshold, between those that are accepted and those that are not. If there is a big peak right past the threshold, in the same way that p-values in published articles show a peak just below 0.05, then that is evidence that people are gaming the system to get just past the threshold, and that is a problem for this technique. To assess this assumption, you can plot the frequencies of different values, as is done here. There is also the McCrary test, which is a formal test of the assumption that there is no peak right before or right after the cut-off. So we look at the vote shares here.

Another important assumption is that there is no confounding, so no other variables in our model should depend on the treatment assignment. For example, here Flammer studies all the
variables that she had right before the treatment, and then shows that there is no effect of the treatment on these variables, which are not supposed to be affected. If the vote is today, we can study the values of our control variables yesterday. Of course, the vote cannot have a causal effect on the past, but if the analysis shows that there is a difference between the treatment and control groups before the actual day of the vote, then that indicates that there is a potential confounding factor that we did not consider in our analysis.

Regression discontinuity designs sometimes also use sensitivity analysis, where you fit the same model to different subgroups. For example, you can choose just a small number of observations right around the cut-off, or you can choose different ways of estimating the trends before and after, and so on.

If some of these assumptions fail, then you can apply something called fuzzy regression discontinuity design. The idea of the fuzzy RDD is that the decision of who goes to control and who goes to treatment is not clear-cut. It is not like a 50% vote, where the motion passes if the actual vote is above the threshold and fails if it is below; instead there is some uncertainty. This can occur if there is treatment misallocation: let's say that we assign people to receive a medication, and some of them decide not to take it. Or this could be treatment contamination: we assign some people to control, but somehow they get their hands on the medication and take it.

In Flammer's paper, they were studying the effects of the long-term orientation of the company, and they were measuring that through the CEO compensation packages, that is, whether a long-term compensation package was adopted or not. If the CEO gets a package that rewards long-term objectives, then they should be more long-term oriented, because that is how it should work. But it is possible that the CEOs ignore the package, so they will optimize short term
results even if they are rewarded for the long term. Or it is possible that the package actually does not have an effect on the CEO's behavior, because the CEO would be long-term oriented anyway. This is a scenario where a fuzzy RDD could be useful. It is very similar to how you would deal in experimental research with the scenario where some of your participants don't take the medication and so on, so you have non-compliance with the treatment, and the remedies for that scenario are the same.

In practice you apply instrumental variables. Whether the compensation package for long-term performance was adopted or not is our instrument. We then measure whether the CEO actually behaves in a way that is long-term oriented, and we use the package adoption as an instrument for the long-term orientation of the CEO. In the second stage, the instrumented variable predicts performance. So this is a simple application of instrumental variable analysis; here they have a two-stage least squares analysis.

So let's summarize regression discontinuity design. This is something that is very useful if it can be applied, but it is not very commonly applied in management research. It could be that we don't have experience with the technique, or it could be that scenarios where this technique is applicable are not so common.

So how does RDD work, and why would you use it? First, it is critically important that we have a selection variable that determines whether a case goes to treatment or control, and that there is a clear cut-off. One example is a funding application score. Or perhaps you are studying athletes: if there is a qualification for the Olympics, you have to run 100 meters under a certain time to qualify, otherwise you don't. Or perhaps there is a vote and 50% is required to pass: if you fall below 50%, even by one vote, then the proposal is not going to pass. So, compared to
randomized experiments, where we randomize the treatment, here we know what the source of variation of the treatment variable is. The assumptions are pretty much the same as for experiments: for example, those who go to control actually don't get the medication, and those who go to treatment all get the medication. If that fails, then we can apply instrumental variable techniques to deal with the problem.

Randomized experiments have more statistical power, they are easier to analyze, and their results are more generalizable. In RDD, when we calculate the difference between those who were right below the cut-off and those who were right above it, we can really generalize only to scenarios right at the cut-off, slightly below or slightly above. For example, for votes that are clearly going to fail, we don't really know whether we can generalize from RDD. With randomized experiments we could; with RDD we can't. RDD is also more complicated to implement, so it is not always obvious how to apply the technique.
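To make the polynomial estimation described earlier more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the data are simulated, the function names (simulate_votes, sharp_rdd_jump) and all the numbers are made up for this example, and this is not Flammer's data or code. The running variable is centered so that the cut-off is at zero, which means that the coefficient of pass is the estimated jump at the cut-off.

```python
import numpy as np


def simulate_votes(n=5000, true_jump=0.02, seed=0):
    """Simulate hypothetical close-vote data (all values are made up).

    margin: vote share minus 0.5, so the cut-off is at zero.
    passed: 1.0 if the proposal passed, 0.0 otherwise.
    y:      outcome, e.g. an abnormal return, with a smooth trend in
            margin plus a jump of size true_jump at the cut-off.
    """
    rng = np.random.default_rng(seed)
    margin = rng.uniform(-0.5, 0.5, n)
    passed = (margin >= 0).astype(float)
    trend = 0.05 * margin - 0.1 * margin**3
    y = trend + true_jump * passed + rng.normal(0.0, 0.01, n)
    return margin, passed, y


def sharp_rdd_jump(margin, passed, y, order=3):
    """Estimate the jump at margin = 0 with ordinary least squares.

    The design matrix contains a constant, pass, the powers of margin
    up to `order`, and the same powers multiplied by pass, so that a
    separate polynomial is fitted on each side of the cut-off.
    Because margin is centered at the cut-off, the coefficient of
    pass is the difference between the two polynomials at zero.
    """
    powers = [margin**k for k in range(1, order + 1)]
    interactions = [passed * p for p in powers]
    X = np.column_stack([np.ones_like(margin), passed] + powers + interactions)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]  # coefficient of pass


margin, passed, y = simulate_votes()
jump = sharp_rdd_jump(margin, passed, y)

# Naive comparison of all passed versus all failed proposals, which
# mixes the jump with the trend in the running variable.
naive = y[passed == 1].mean() - y[passed == 0].mean()

print(f"RDD estimate of the jump at the cut-off: {jump:.4f}")
print(f"Naive difference in means:               {naive:.4f}")
```

In this simulation the naive difference in means is larger than the simulated jump, because it also picks up the trend in the margin, whereas the RDD estimate should land close to the simulated value of 0.02. For a sensitivity check of the kind mentioned above, one could rerun sharp_rdd_jump on only the observations with a small absolute margin, or with a different polynomial order.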