We're going to look at an example right now called Simpson's Paradox. This is an example of how summarizing a whole data set with one number can sometimes be misleading. It's not necessarily lying with statistics, but I can show you just the piece of the data that will get you to think the way I want you to think. Sometimes the bottom-line number for the whole data set doesn't paint a very clear picture. Let's look at a couple of examples of this.

There's a famous case from the graduate school at the University of California, Berkeley. In the 1970s, a group of women argued that the graduate school was biased in favor of male applicants, because if we look at the total data set, it was admitting about 44% of male applicants and only about 30% of female applicants. The bottom line for males: 44% accepted into the graduate school. The bottom line for females: 30% accepted. But what if we look at the individual colleges within the university, each accepting male and female students from its own pool of applicants?

Let's look at college A. These are just sample numbers: say 825 males applied, and 511 of them were accepted. That's 62% of the male applicants accepted by college A. But look at the females: if 108 women applied to college A and 89 of them were accepted, that's 82% of the female applicants accepted. The same goes for college B in the same university. I won't read all these numbers out, but you get the idea: if we go down college by college and compare the percentage of males accepted with the percentage of females accepted, we find that the female applicants often had the higher acceptance rate when we look at individual colleges.
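The arithmetic above can be sketched in a few lines. The college A counts (511 of 825 men, 89 of 108 women) are the lecture's illustrative figures; the second, more competitive college below is a hypothetical one I've added, with the applicant pools flipped, to show how the overall rates can come out reversed.

```python
# Simpson's Paradox sketch: acceptance rate per college vs. overall.
# College A counts are the lecture's illustrative figures; "College C"
# here uses hypothetical counts chosen so the aggregate rates reverse.
data = {
    # college: {group: (accepted, applied)}
    "A": {"men": (511, 825), "women": (89, 108)},
    "C": {"men": (30, 100), "women": (272, 800)},  # hypothetical
}

def rate(accepted, applied):
    return accepted / applied

# Per-college rates: women have the higher rate in BOTH colleges.
for college, groups in data.items():
    for group, (acc, app) in groups.items():
        print(f"College {college} {group}: {rate(acc, app):.0%}")

# Overall rates: a weighted average over colleges, weighted by where
# each group's applicants concentrated -- and here the men come out higher.
for group in ("men", "women"):
    acc = sum(data[c][group][0] for c in data)
    app = sum(data[c][group][1] for c in data)
    print(f"Overall {group}: {rate(acc, app):.0%}")
```

With these numbers, women win within each college (82% vs 62% in A, 34% vs 30% in C), yet overall the men come out ahead (roughly 58% vs 40%), because most of the women applied to the competitive college where almost everyone is rejected.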
Upon further study, they realized that the more competitive colleges had more female applicants, so fewer of those women got in. The paradox is that, college by college, women were overwhelmingly accepted at higher rates, yet the bottom line made it look like a higher percentage of males was accepted overall. In fact, the male population might have countersued and said: look, in this program you only accepted 62% of the male applicants, while you accepted 82% of the female applicants.

It has to do with how the data is weighted. You may have heard of weighted averages before; that's what Simpson's Paradox is about. Where are the heavy denominators, the large applicant counts? The men had many applicants at the less competitive colleges. The women had many applicants at college C; that's where their heavy population was. But college C only admitted 34% of its female applicants, compared with 37% of its male applicants. So if we look at the college-by-college breakdown of the graduate school, we get a very different picture.

Let's take a different example, another one from history. This data is about the passage of the Civil Rights Act: how many Democrats voted in favor of it, and how many Republicans? If we look at the total numbers, only 61% of Democrats voted in favor of the Civil Rights Act, while 80% of Republicans did. However, if we break those votes down into the Northern states and the Southern states, in both regions Democrats voted in favor at a higher percentage. How can that be, when the bottom line shows Republicans more heavily in favor at 80%? What happened is that there were far more representatives from the Northern states, for both Democrats and Republicans.
Since there is higher representation in the Northern states, those Northern percentages in favor influence the bottom line more heavily.

One more example. Those were two examples from history; now let's make up an example with numbers that are easier for us to work with. This is just an example I made up about treatment for an illness. Say we have drug company A and drug company B, each promoting its drug as an effective treatment for a certain illness, and each collects data at two different times, in September and again in December.

Drug company A is going to say: look, overall 62% of our patients were treated effectively using our treatment, so you should choose it. But drug company B is going to argue with that. They're going to say: look, in September, 80% of our patients were treated effectively, while only 70% of drug company A's patients were. And look at our December data: half of our patients were treated effectively, while in December only 30% of drug company A's patients were. So if I look at September and December separately, drug company B is going to show you that data and say: look how much more effective our treatment is. But drug company A is going to say: look at the bottom line. We treated 62% of our patients effectively, while only 56% of drug company B's patients were treated effectively.

Nobody is lying; this is all true data. But each company shows you, the consumer, what it wants you to see so that you will think the way it wants you to think. So as consumers of data in the news, the media, the newspaper, Facebook, we have to ask: what lens are we looking through? Which piece of the data is somebody showing me?
That doesn't mean that somebody is lying to me, but it's always good to be on the lookout: somebody might be showing me just the piece they want me to see, to try to convince me of something. We should always read with a critical eye and a critical mind, asking: what is it that they want me to believe?
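The drug-company example above can be reproduced numerically. The per-period success rates (70% and 30% for company A, 80% and 50% for company B) are the ones from the lecture; the patient counts below are hypothetical, chosen so that the weighted totals come out to the stated 62% and 56%.

```python
# Hypothetical patient counts consistent with the lecture's percentages.
# Each company's overall rate is a weighted average of its two periods,
# weighted by how many patients it treated in each period.
patients = {
    # company: {period: (treated_effectively, total_patients)}
    "A": {"Sep": (560, 800), "Dec": (60, 200)},   # 70% and 30%
    "B": {"Sep": (160, 200), "Dec": (400, 800)},  # 80% and 50%
}

for company, periods in patients.items():
    for period, (ok, total) in periods.items():
        print(f"Company {company} {period}: {ok / total:.0%}")
    ok = sum(v[0] for v in periods.values())
    total = sum(v[1] for v in periods.values())
    print(f"Company {company} overall: {ok / total:.0%}")
```

Company B's rate is higher in both September (80% vs 70%) and December (50% vs 30%), yet company A's overall rate is higher (62% vs 56%): most of A's patients fall in the strong September period, while most of B's fall in the weak December period, so the heavy denominators drag B's bottom line down.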