So, if you are going to use social research to understand cause and effect, you need to understand that in order to demonstrate cause and effect, you must make a case. In human behavior and a number of other sciences, it is often hard to prove cause and effect. It's not as easy as it is in something like physics. And this is because there are lots of unaccounted-for variables that can affect our study of human behavior. Studying human behavior is much like studying fish while you're a fish in the fishbowl. So, you are always only going to be making a case that cause and effect exist. It's either a strong case or a weak case, right? It's on some sort of continuum, but it is never absolute proof, or it's very rarely absolute proof. So, it's really important to understand cause and effect and how one makes a strong case that there is a cause-and-effect relationship.
This is also important because in popular media, we very often hear about A causing B when in fact the studies that have been done do not demonstrate cause and effect at all. There is a popular misconception that if there is a relationship, a correlation, between two variables, then one of those variables is causing the other variable or causing changes to the other variable. It's important to understand that this is not necessarily true.
So the first thing that we have to look at is the three conditions that have to be met in order to make a case for cause and effect. These conditions are each necessary. In other words, if you're missing one of them, you're not able to demonstrate cause and effect. All three have to be addressed. But no one of these conditions is sufficient. That means that if you succeed in demonstrating one of them, but you cannot succeed in demonstrating the others, then you have not demonstrated cause and effect, no matter how strongly you demonstrate the one condition.
So again, and I guarantee you, you will see this on tests, each condition that we're about to show you is necessary. No one of these conditions is sufficient. Each condition is necessary. No one of these conditions is sufficient.
The first condition is that cause must occur in time before effect. What this means is that the sequence and timing have to be right: the variable that you are calling a cause occurred in time before the variable you are calling an effect. Now this sounds pretty straightforward. You may think, well, duh, but it is really hard to demonstrate this sometimes. Think about sickness. When does somebody get sick? Were you ill when the doctor diagnosed you? Probably before then. Were you ill when you first felt symptoms? Maybe before then. What if you have a disease that gets triggered later in life, but you were born with the disposition towards it? When did it start? When did you get sick? Were you sick when you were born? Were you sick when the trigger happened? So you can see that it's very difficult to figure out timing.
In social sciences, this problem comes up often when we think about when people form opinions or attitudes. Very often when you ask somebody a question, they formulate the answer to that question immediately after the question is asked. There may be something that they sound like they have a strong opinion about, but when was that opinion formed? Oftentimes in social sciences, when studying human behavior, we make the assumption that the opinion was formed before the question was asked. And therefore we make the assumption that behaviors that occur happened because of that opinion or that attitude. But in fact, oftentimes people form opinions simply because they were asked about a topic, which means that the behavior came before the attitude, and the timing can get very mucky. The other problem is that very often social researchers don't even address the question of timing.
So you have no way of knowing from reading the research whether the timing has been addressed, whether it has been studied, whether it has been demonstrated. So very often when we get information from social research, we are not getting enough information to determine cause and effect. That doesn't mean that we're not getting important information, but we can't make that leap of faith that one factor causes another.
The second condition is that a change in the cause must result in a change in the effect. This is called correlation. So if the cause increases, the effect increases. Or if the cause increases, the effect decreases. Or if the cause decreases, the effect increases. Or if the cause decreases, the effect decreases. It can be inverted, it can be up, down, so forth. But some sort of corresponding change has to occur. This is what correlation is. This is where statistics come in. Real data, data that has statistics applied to it, can show correlation. Correlation does not in and of itself demonstrate cause and effect. I can't emphasize this point more strongly, because this is where the problem is when we get information from the news media. They see a correlation and they use the word cause. It is not a cause. It doesn't matter how strong the correlation is. It doesn't matter if, you know, every single time you make a change in the cause, a change in the effect occurs. It may very well be that some other explanation can account for this.
And that's why the third condition is that you have to rule out all those other possibilities, all those other factors. You have to see whether or not there are alternative explanations for what you are observing. This is where the null hypothesis comes in. We do not try to prove a hypothesis. We make a hypothesis and then we try to disprove it. We try to see if there's some other explanation that can be given for what we are observing.
If we try to eliminate it every way that we can think of, and other researchers try to eliminate it every way they can think of, and it holds up, then you have a strong case for cause and effect, provided it meets the other two conditions. This is why no one study will ever tell you cause and effect. This is why studies are published. This is why other researchers do the same research and try to make sure that there aren't other explanations.
There are some particular alternative explanations that need to be tested while you're working towards demonstrating cause and effect. First of all, you need to check for data quality. One of the things that could just be happening is coincidence. You can't know whether or not data is good unless it is tested to see whether or not the result could have happened at random. But even the best data has what is called a probability value, a p-value, of, say, .01, meaning that if there were no real relationship, you would still get a result like this about one out of a hundred times by chance. It could be at random that you got the result. This is why you've got to test it more than once. So even the best and the strongest correlation could possibly be a coincidence. Things like that just happen. So you need to at least test more than once. We've talked about this in the other lecture about reliability.
Also, how big of a sample did you have? Did you have a sample in which each person in the population that you're studying had an equal chance to be in the sample, or as close to that as you can make it? There is a survey that is done every two years in the United States called the General Social Survey. It is probably the least biased sampling done in social sciences. But even it has a slight bias in it, because the people who are in the population are anybody that they have contact information for. So if you do not have published contact information that they can get a hold of to put you into the mix that the sample is pulled from, then you do not have an equal chance to be in that.
You won't be in the sample. This has created a bias in the General Social Survey against the very poor, who do not have contact information to give people, and the very, very wealthy, who are very good at hiding their contact information. So while it presents itself as a general social survey of the population of the United States, it is in fact a general social survey of the population of the United States who are not very, very poor and who are not very, very rich. So sampling bias exists almost always. So you need to check and see how extensive the bias in the data is. And if it's too extensive, then you begin to question the quality of the data and the results that you have. And then if you have tainted data, if you have people who collected it poorly, collected it with biases, or asked the wrong questions, it's not reliable, it's not valid data. If you just cheated, if you essentially put in things that you thought were good but took out things that you didn't want to hear, then you basically have garbage data, and you can demonstrate all kinds of things with garbage data and it's still garbage data. So bad data can create bad results.
If you get into looking at the data and trying to understand how other factors that were not considered originally affect the results, you can start seeing that something that looks like a cause-and-effect relationship may not be as simple as we once thought. Most of the time, people have a cause that they look at, and we'll call that the intermediate cause, the one that's most obvious, and it creates an effect. So we have this relationship that looks really good and we consider that relationship, and then along comes another factor, and it turns out that this primary factor causes the intermediate factor. And then that in turn causes the effect.
Now the problem with this situation is not so much that you can't demonstrate a relationship between the intermediate cause and the effect, but if you base anything on that without thinking about the primary cause, your interventions will not work. Because every time you try to effect a change in the intermediate cause, the primary cause can come along and change it back. And then you will not get the result that you want in the effect. So basically the primary cause causes the intermediate cause, which causes the effect. And if you only concentrate on the relationship between the intermediate cause and the effect, your interventions there will not work because you're not doing anything about the primary cause. Whatever's going on with the primary cause will change it back.
An example of this is taking a look at arrests. There are a number of studies that demonstrate that African-American males are arrested more often than any other particular gendered racial group. Now one might ask, well, why does this happen? The most immediate answer would seem to be that African-American males are committing more crimes and therefore they are being arrested, and that sounds like a good cause and effect, right? The crime happens in time before the arrest. We see it happening in a reliable way. So we think we've got something here. And so we build jails and we build education programs and we aim our programs towards African-American males, and it turns out no matter what we do here, it doesn't really make a change in the number of arrests. So there may be other factors that are creating this relationship. So you back up one step and ask, well, what is causing African-American males to be involved in crimes more often or be accused of crimes more often? There are a number of alternative explanations that could be considered. One is that African-American males are also overrepresented in poverty, and poverty does create situations where people commit crimes. Another is racial profiling.
It may very well be that African-American males are not committing more crimes than anybody else, but the police are watching them more closely and with more scrutiny, and so they get caught more often than anyone else. Or they get accused falsely more often than anyone else. And there are other explanations that could have to do with families, with race and fear, and so forth. So there is a whole bunch of alternative explanations that need to be looked at. And if you don't look at these alternative explanations, you're never going to fix this. Because if these factors are causing the arrests, every time you change the intermediate cause, all of these other factors are going to change it back. So you can take a person and teach them not to commit crimes and put them in jail and punish them for it. And they go back to the impoverished neighborhood in which the police are watching them more closely than ever. And lo and behold, they get arrested again. And it becomes a situation that just doesn't move forward because we're not considering the primary causes in our social programs.
Another alternative explanation is called a spurious cause. A spurious cause is an unaccounted-for variable that is causing both of the factors that you think of as a cause and an effect. So you've essentially identified two effects and thought of them as a cause and an effect. So basically you have what you're calling a cause, but it turns out to be an effect, and you're saying it causes this other effect. But if you take a look at the true cause, then it turns out it causes the first effect and it also causes the second effect. And so this underlying cause is disguising the relationship. It's making the relationship look like something it isn't.
So, a classic Las Vegas example. Every May, swimming pools all over the Las Vegas Valley open up. People open up swimming pools in all sorts of places, in apartment complexes and local public pools, all of that kind of stuff. And then in June, every year like clockwork, something else happens. Remember, the timing is there.
One happens in May, the other happens in June. It's reliable, it happens all the time. We can test it, we can show it. In June, ice cream sales go up. So one might conclude by looking at this data, with the correlation and the timing and the reliability and so forth, that opening swimming pools increases ice cream sales. Open the swimming pool, you'll get more ice cream sales. But we know that that's not true, because essentially the cause of both the swimming pools opening up and the ice cream sales going up is that it gets very, very hot here. So we open swimming pools in May because it's heating up. Ice cream sales go up in June because it's heating up even more. So the heat causes the first factor and the heat causes the second factor. That's a spurious cause. The spurious cause is that heat, which causes both things and makes it look like a nice, neat little cause-and-effect relationship. But it isn't, because both things are actually being caused by an unaccounted-for variable.
The last alternative explanation is not an alternative explanation for something that looks like a cause-and-effect relationship, but rather is an explanation for why a causal relationship exists when it doesn't look like it exists at first. Sometimes correlations are very weak or they seem to not exist at all. So we've got a cause and it happens in time before an effect, but when we do a correlation study on it, we find out it doesn't look like it affects the effect very much, that the changes in the cause don't really create clear, reliable changes in the effect. Well, if you add a third factor, another factor that you weren't considering the first time, a correlation can suddenly appear, and with it a stronger case for cause and effect.
Okay, so I'm going to give you an example of this. In the 1970s in the United States, there was a moratorium on the death penalty.
The moratorium on the death penalty was based in part on the question of race and how many people were being sent to the electric chair, to punishment by death. There was a belief that African American convicts were being sentenced to death more often than any other group. And then they did a study and they couldn't see the cause and effect. They looked at all of the people who had been sent to death row and they could not see a pattern of African Americans being more likely to be sent to death row. So a very interesting study was done after that, and that study took a look at the race of the victim as well as the race of the convict. Okay, so follow what we're doing. We're saying this is black-on-black crime, black-on-white crime, white-on-white crime, white-on-black crime. Okay, so these murders were sorted by whether they were intraracial or interracial. And by looking at the race of the victim, a definite pattern emerged. And that pattern was that if the victim was white and the perpetrator was black, that was the combination most likely to lead to death row. So a black convict killing a white victim was the most likely to be sent to death row. The second most likely combination to be sent to death row was a white convict convicted of killing a white person. When the victim was black, it was much rarer for the convict, whether they were white or black, to be sent to death row. Now the reason it didn't show up in the first study is that most murders are committed intraracially, meaning that you are more likely to be killed by somebody of your own race than by somebody of another race. So, without looking at the race of the victim, that had a tendency to hide race as a factor in who went to death row.
But by looking at the interaction effect, you were able to see that death row convictions did have an unequal racial aspect to them, and that actually you could make a case that in our justice system, we value white victims more than we value black victims.
So I hope that this helps you understand cause and effect. I want to make sure that you get that. It is very, very difficult to demonstrate this. It is always making a case. It is never simply a matter of correlation. You must study things much more thoroughly, and you will never really be able to definitively say, this absolutely causes that in human relationships.
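To make the swimming pool and ice cream example a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the temperature range, the size of the effects, and the names heat, pools_open, and ice_cream are hypothetical, and the numbers are simulated, not data from any real study. It has heat drive both variables, then compares the plain correlation between pools and ice cream with the partial correlation once heat is held constant.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def partial_corr(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Simulated (made-up) data: heat is the unaccounted-for variable.
rng = random.Random(0)
heat = [rng.uniform(60, 110) for _ in range(2000)]

# Both "effects" depend on heat plus independent noise.
# Pools never influence ice cream sales in this simulation.
pools_open = [2.0 * h + rng.gauss(0, 10) for h in heat]
ice_cream = [5.0 * h + rng.gauss(0, 25) for h in heat]

raw = pearson(pools_open, ice_cream)
controlled = partial_corr(pools_open, ice_cream, heat)

print(f"pools vs ice cream, ignoring heat:    {raw:.2f}")
print(f"pools vs ice cream, controlling heat: {controlled:.2f}")
```

In this simulation the plain correlation should come out strong, while the correlation after controlling for heat should sit close to zero, which is what a spurious cause looks like in the numbers. Keep in mind that this only rules out one alternative explanation. It does not by itself demonstrate cause and effect.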