Researchers in the social sciences typically do empirical studies to make, support, or refute causal claims. While running a regression analysis does not require that you understand much about causality, understanding what our causal claims are actually about is very useful in the long run, because it allows you to evaluate when a particular research design or statistical analysis technique can provide valid evidence for causality. However, defining causality is not straightforward, and there are many different ways to define what causality actually means. In this video I will explain one of the most commonly used theories of causality, the counterfactual model.
In a research methods course or a book about research methods, you may have seen this kind of explanation for making causal claims. We are supposed to demonstrate three conditions: association, direction of influence, and elimination of rival explanations. We are also told that the best way to demonstrate the direction of influence empirically is to measure the cause before the effect, and that there are two main ways of eliminating rival explanations. The first is the randomized experiment, where people are assigned randomly to a treatment group and a control group, one group gets the treatment and the other doesn't, and then we compare the outcomes of the two groups afterwards. The second is statistical modeling, which can be applied when we don't have an experimental design and are simply doing a correlational study. Further, we can have figures like this one that demonstrate the logic of the gold standard experiment. We have a population of interest, we take a sample from it, and we assign the sample randomly to treatment and control. One group receives the treatment and the other doesn't. Then we measure something from both groups and compare the outcomes, and that difference is our estimate of the causal effect.
Now there are two important questions about this model. The first is what it is actually based on: how is it possible to say that this difference is a valid estimate of the causal effect? The second is that this research design does not actually give us an individual-level causal effect, but instead gives us the average treatment effect. Why we cannot get an individual-level causal effect is something that I will address next.
But before we go into that, we need to understand what causality is and why we need to define causality in some way to do research. This is addressed pretty well in this chapter by Hitchcock, from which I've taken this small excerpt. Hitchcock explains that there are basically two reasons why we need to define causality in a way that does not involve the term cause, that is, in a way that does not require the concept of causality itself. The first reason is practical: we cannot observe causality in action. If we put an object made of wax outside in direct sunlight on a hot day, it will melt. We know that the sun is causing the object to melt, but we cannot see the causality in action. We can see that it is sunny and that the wax object is melting, but we cannot observe that it is actually the sun causing the object to melt. So we cannot observe causality directly. Therefore, for empirical research purposes, we need a way to define causality that does not require the concept of causality itself, but relies on something that is actually observable. In other words, we need an operational definition of causality. The second reason is the metaphysical argument: if we can do away with causality as one of the principal, most basic building blocks of the world, then our description of the world is simpler, which is always better. So there are a few different main theories of causality.
One of them is the regularity theory of causality, which basically says that if the effect always follows the cause, then we can say that there is a causal relationship. But that is problematic, because it is possible that the effect would have occurred anyway, and to address that we have the counterfactual model of causality. So where do the idea and the term counterfactual come from? Let's take a look at the experiment again, and let's consider that we want to estimate the causal effect on an individual. Typically, when you go to a doctor and the doctor prescribes you a medication, you are not really interested in the average effect of that medication on people. You are interested in knowing what effect that medication has on you. And that is a question that an experiment cannot answer directly. The reason is that the causal effect of a medication is defined as the difference between two of what we call potential outcomes. The idea of the potential outcomes framework, or Rubin's potential outcomes framework, which is the term used for this model in some contexts, is that an individual has two ways to go: either the individual goes to the treatment group or the individual goes to the control group. Let's assume that the individual receives the treatment. Then the outcome under treatment is the actual outcome, for example the person's health after taking the medication. The other potential outcome is what we call the counterfactual. This is the outcome that could have happened, but did not. It is counterfactual because we don't observe it. With this model, the causal effect can be defined as the difference between the actual outcome, what actually happened, and the counterfactual outcome, what would have happened if the person had not taken the medication.
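The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Person` class, its field names, and the health scores 7.0 and 5.0 are all hypothetical and not from the lecture. The sketch stores both potential outcomes for one person, which is something we can only do in a made-up example, because in real data one of the two is always unobserved.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Hypothetical individual whose two potential outcomes are both known.

    In real data only one of these can ever be observed.
    """
    outcome_if_treated: float    # e.g. health score after taking the medication
    outcome_if_untreated: float  # e.g. health score without the medication

    def observed_outcome(self, treated: bool) -> float:
        # Only the potential outcome matching the actual treatment is realized.
        return self.outcome_if_treated if treated else self.outcome_if_untreated


# Made-up numbers for illustration only.
patient = Person(outcome_if_treated=7.0, outcome_if_untreated=5.0)

# The patient actually takes the medication.
actual = patient.observed_outcome(treated=True)

# The counterfactual: what would have happened without the medication.
# We can read it off here only because the example is simulated.
counterfactual = patient.outcome_if_untreated

# Individual causal effect = actual outcome minus counterfactual outcome.
individual_effect = actual - counterfactual
print(individual_effect)
```

The point of the sketch is the line that reads `counterfactual`: in an empirical study that value does not exist in the data, which is why the individual-level effect cannot be computed directly.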
So for example, if I'm holding this remote here and I open my hand and cause the remote to drop, then the regularity theory says that if I open my hand, the remote drops, and that is enough for causality. For the counterfactual model, we actually need to make two different claims. We have the regularity claim: I opened my hand and the remote dropped. And we also have the counterfactual claim: had I not opened my hand, the remote would not have dropped. That is the idea of a counterfactual argument.
So how do we actually operationalize this model, which compares actual outcomes with counterfactual outcomes that we don't observe? We cannot see the counterfactual. We cannot observe something that could have happened but did not. In practice, researchers who use the counterfactual approach don't try to estimate the individual causal effect, because that is in most cases impossible. Rather, we focus on the average causal effect. The idea of an experimental design is that we have one group, let's say the treatment group, which provides our actual outcome. We measure the outcome for that group, and then, through the research design, we construct a kind of artificial outcome to stand in for the counterfactual. In a valid experiment, we can assume that the control group is a pretty good counterfactual for the treatment group, that is, for what would have happened had the treatment group not received the treatment. So the treatment group is the actual outcome, and the control group becomes the counterfactual outcome for the treatment group. And of course we can refine this reasoning and make designs where we estimate the average effect of the treatment on the treated, and so on.
And then we compare the actual outcome and the estimated counterfactual, and that gives us the estimate of the causal effect. This model works pretty well on paper, but it requires assumptions. We need to assume that the control group is a valid counterfactual for the treatment group. And the important thing about these assumptions is that they must be defensible. So in the counterfactual model, because we cannot observe the counterfactual, we need to estimate it, and to do that we need to make assumptions and also defend those assumptions, so that we can claim that our estimated counterfactual is actually a valid estimate. In practice, many of our statistical analysis techniques are compatible with this way of viewing causality.
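To see why the assumption about the control group matters, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration and not part of the lecture: the function name `estimate_effect`, the normally distributed baseline outcome, and a constant true effect of 2.0 for every person. It compares random assignment with a hypothetical self-selection rule where people with a better baseline outcome choose the treatment.

```python
import random
import statistics


def estimate_effect(assign, n=10_000, true_effect=2.0, seed=0):
    """Difference in mean observed outcomes between treated and control.

    `assign(y0, rng)` decides whether a person is treated. All numbers
    are simulated; the constant `true_effect` is an assumption.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        y0 = rng.gauss(5.0, 1.0)   # potential outcome without treatment
        y1 = y0 + true_effect      # potential outcome with treatment
        if assign(y0, rng):
            treated.append(y1)     # only y1 is observed for the treated
        else:
            control.append(y0)     # only y0 is observed for the controls
    return statistics.mean(treated) - statistics.mean(control)


# Random assignment: the control group is a valid counterfactual.
randomized = estimate_effect(lambda y0, rng: rng.random() < 0.5)

# Self-selection: people with a high baseline choose the treatment,
# so the control group is no longer a valid counterfactual.
self_selected = estimate_effect(lambda y0, rng: y0 > 5.0)

print(f"randomized estimate:    {randomized:.2f}")
print(f"self-selected estimate: {self_selected:.2f}")
```

Under these assumptions the randomized estimate should land near the true effect of 2.0, while the self-selected estimate is noticeably larger, because the two groups already differed before the treatment. That gap is what it means for the control group to be an indefensible counterfactual.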