Okay, Shoshanim and baby seals. Two service announcements before we get into the topic of today's video: how psychology lies to you using statistics.

The first service announcement, unfortunately, again has to do with Richard Grannon. Many of you have written to ask about my participation in some kind of product that Grannon has issued, published, or is selling. I want to clarify: I am not associated, directly or indirectly, with anything that Richard Grannon does, and definitely not with his products. Thankfully, he is out of my life for good. I am not working with him. I am not collaborating with him. Not now, not ever, in any shape, way, manner, or form; in any forum, in any setting, in any seminar, ever again. No Grannon, thank you. I hope I have made myself clear once and for all. Stop writing to me. I have no idea what this guy is selling you, nor do I want to be involved in any of his, how to put it gently, commercial enterprises.

Thank you very much for listening to the first service announcement, and now the second one: a seminar in Gdansk, a beautiful city in Poland, at the end of March and the beginning of April. The cost is about 20 euros, just to cover the venue and the recording of the seminar. If you want to participate, there's a link in the description. Click on it and it will take you to a video by Daria Żukowska, the organiser of the seminar. Write to her, communicate with her, and see how you can secure your seat. The seminar is about recent developments in cluster B personality disorders and, of course, Cold Therapy. The latest bleeding-edge, cutting-edge news.

And now let's go straight into lies, damned lies, and statistics. Anyone studying psychology nowadays knows that a huge part of the syllabus, a huge part of the curriculum, revolves around mathematics, and especially statistics.
In their desperate attempt to appear to be scientists, psychologists have adopted mathematics as a way to align themselves with more respectable disciplines such as physics. Yes, psychology is a pseudoscience, but it is still useful as a narrative of human affairs, the human condition, and various states of mind. Yet psychology uses mathematics, and more specifically statistics, in ways I find misleading, pernicious, and extremely problematic. I'm going to identify five issues with the use of statistics in psychology, but trust me, that's the tip of a very, very submerged iceberg. Starting with the fact that the vast majority of psychologists don't know how to use statistics properly, and this coming from a physicist.

My name is Sam Vaknin. I'm the author of Malignant Self-Love: Narcissism Revisited, and I'm a professor of psychology and a professor of finance in the Centre for International Advanced and Professional Studies, the outreach programme of the SIAS consortium of universities. And if that's not respectable enough for you, switch to another channel.

Let's start with the problems in using statistics. First of all, you have to know what you're doing, and as I mentioned, very few psychologists actually do. But even if you know what you're doing and have mastered the considerable number of techniques available, there's a problem. The vast majority of psychological studies rely on a tiny sample. It's not uncommon to see an N, where N is the number of participants, of, let's say, 30, or 20, or even six. This presents a problem of normative validation: when your sample is very small, or when the selection of the participants in your sample is skewed or wrong, you cannot validate the outcome. We leave aside at this stage the problem that the subject matter of psychology is human beings, and human beings are mutable. They are changeable.
It's impossible to replicate a study, because a participant in the study may have had a nightmare during the night, or may have divorced over the preceding three months. People change, in short. You can't even retest the same person, because the very act of testing changes the person. It's very reminiscent of the uncertainty principle in quantum physics.

Okay, but let's assume for a minute that, by some miracle, there's a sizable sample, let's say 600 people, or even 80,000 people. I know of at least one study with 80,000. When the sample is not large enough, which is the case in 99% of studies in psychology, the studies are useless. Moreover, most of these studies cannot be replicated, and this is known as the replication crisis.

So how do you gather a sample? First of all, you construct a profile: a profile of the cohort, of the population studied, of the demographic. Then you select people to fit the profile, to represent the cohort. This is called a representative sample, a sample that eliminates possible biases. The problem is that small samples cannot accurately represent the entire cohort or group, while assessing large samples is nearly impossible in many cases, owing also to mathematical limitations.

Another problem is the number of potential biases, which is ginormous. It's simply huge. For example, in market research, in polling, and in many branches of psychology, we use something called stratified random sampling: you break the sample into subgroups and then examine each subgroup separately. This is, of course, very inaccurate, because any division is arbitrary, and the subgroups are often very misleading when it comes to representing the entire group.

Consider, for example, conducting a door-to-door survey or poll during the day. Well, it excludes all employed residents.
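The stratified random sampling just described can be sketched in a few lines of Python. This is a minimal illustration, not a production sampler; the population and its urban/rural split are invented for the example:

```python
import random

def stratified_sample(population, strata_key, n):
    """Draw a sample whose strata proportions mirror the population's."""
    # Group the population by stratum (e.g. age band, region, income bracket).
    strata = {}
    for person in population:
        strata.setdefault(strata_key(person), []).append(person)
    sample = []
    for members in strata.values():
        # Each stratum contributes in proportion to its population share.
        # (Naive rounding; real samplers handle the remainders carefully.)
        k = round(n * len(members) / len(population))
        sample.extend(random.sample(members, min(k, len(members))))
    return sample

# Hypothetical population: 70% urban, 30% rural respondents.
population = [{"id": i, "area": "urban" if i < 700 else "rural"}
              for i in range(1000)]
sample = stratified_sample(population, lambda p: p["area"], 100)
```

Note that even perfect proportional allocation only removes bias along the one dimension you stratified on; every dimension you did not stratify on can still skew the sample, which is the point being made above.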
Consider calling people on their smartphones to conduct a poll or a survey: it excludes certain people, the poor, for example, who may not own one. If you conduct the poll in the evening, it excludes people who socialize or run errands. It's extremely difficult to get a representative sample. Indeed, it is safe to say that the vast majority of samples in most psychological studies are either way, way too tiny to give us any meaningful answers, or totally non-representative. And this, Shoshanim, is only the beginning. This is your blue professor of psychology leading you deep into the recesses of the fault lines of psychology.

The second problem: only when you identify the biases that limit a specific set of statistics can you determine the data's accuracy. You need, in other words, to identify these biases and somehow eliminate them. Consider, for example, the following bias: people lie. A minor bias. People lie. Ask them how many times they have cheated on their intimate partner: they lie. What is their body count? They lie. How many sexual partners have they had in the last year? They lie. How many abortions have they had? They lie. And people lie not only about sex, and not only about intimate matters. People lie all the time, to make themselves look good, or in order to compete with others, a process known as relative positioning. Polls are extremely sensitive, exceedingly susceptible, to bias and lying. People sometimes give you an answer because they think that you want them to give you this answer. They avoid unpopular answers. There's a lot of peer pressure. Popular opinion polls are actually pretty useless because of that.

Another common bias is the way we present an average in the statistical process. We can represent an average as the mean, the median, or the mode. The mean usually results in a larger figure when the data are skewed by outliers, because it's the arithmetical average of all the numbers. However, if we want the figure to appear smaller, we would use the median.
And the median is the middle figure, the value with half of the data below it and half above. Or you can use the mode, which is the most frequently occurring value. So as you see, even the statistical measure you choose has a massive impact on your outcomes, and statistics are therefore extremely easy to manipulate.

There's also conscious bias: only favorable data is chosen for presentation, and inconvenient data is totally discarded. Filtering out data in this way is bad, unethical practice, but more common than you know. Or emphasizing the favorable outcome at the expense of the unfavorable result. We use different types of averages, and so we consciously choose what we want to convey. Conscious bias is less difficult to deal with than unconscious bias, because with an unconscious bias we need to recreate the entire process. We need to find the source behind the statistics, and we need to understand the motivation or motivations of the scholars, the researchers, and the publishers of the statistics. Are they somehow invested in the poll or the survey or the study? What are the sources of financing? What are the conflicts of interest? Why had they chosen a particular set of statistical representations? It takes research, meta-research. It's a mess, and very few people bother. Facts in psychology, therefore, especially facts founded on statistics, are highly suspect.

And then there is the issue of graphical presentation: charts, maps, other types of graphs. They are never what they seem. They are exceedingly misleading; they are built to mislead. What do you choose, a linear scale? An exponential scale, perhaps? How do you present the data? What are your X and Y axes? A researcher or presenter of data can skew the graphical representation of statistics in a way that's very deceptive. You change the numerical scale on a given axis, you zoom in on a rise in the chart, and the publisher can easily manipulate you into seeing what he wants you to see.
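The point above about choosing among mean, median, and mode can be made concrete with a toy data set; the salary figures below are hypothetical, chosen only to show how one skewed value pulls the three "averages" apart:

```python
from statistics import mean, median, mode

# Hypothetical salaries (in thousands): nine modest figures and one outlier.
salaries = [30, 30, 32, 35, 38, 40, 42, 45, 48, 300]

avg_mean = mean(salaries)      # 64.0 -- inflated by the single outlier
avg_median = median(salaries)  # 39.0 -- the middle of the distribution
avg_mode = mode(salaries)      # 30   -- the most frequently occurring value
```

Same data, three defensible "averages", three very different headlines, which is exactly why the choice of measure deserves scrutiny.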
It's very seductive, and it's very wrong. Line charts, for example, are the most commonly used in psychological studies, and line charts are also the most easily manipulable and the most commonly manipulated. Even a small increase can be made to look like a much larger rise simply by changing the numerical increments on one or both axes. The chart looks impressive and very, very objective, but if you take the time to check the actual numbers, it's very underwhelming. Trust me, bar charts can be deceptive as well. When you change the width of the bars, or when you show a truncated version of a bar, only the top half, the information on display can easily seem to represent something that is not true, that is counterfactual. And so bar charts are a serious problem: the sizes, the lengths, the numerical axes, they can all mislead.

What about statistical fallacies? Yes, there is such a thing. Many techniques in statistics end in failure. One of the major fallacies in statistics arises when you use a specific formulation: if B follows A, then B must have been caused by A. It's a very, very ancient fallacy, known as the post hoc fallacy, and the formula can easily be turned around. It is easy to misrepresent data when you use this formulation. Both A and B could be the product of a third factor. Correlation is never causation. And so representing B as the inevitable outcome of A, or vice versa, is almost always wrong. We must look closely at the information when we are presented with such an argument. B may follow A for any reason whatsoever; it could have occurred by chance. You need to test and test again, and then B may vanish altogether. We call that an artifact. And even testing continually may yield the same result, yet it still doesn't mean that the result is valid, because the testing methodology itself has a huge influence on the outcome. Choosing the methodology often determines the outcome.
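The axis trick described above is easy to quantify: truncate the vertical axis and a modest rise looks dramatic. A minimal numeric sketch, with invented figures:

```python
# Hypothetical yearly values with a modest real increase.
y0, y1 = 100.0, 104.0   # a 4% rise

# Honest chart, baseline at zero: the second bar is 1.04x the first.
full_ratio = y1 / y0

# Truncated axis starting at 99: bar heights are measured from the cut,
# so the very same data now draws a bar five times taller.
baseline = 99.0
truncated_ratio = (y1 - baseline) / (y0 - baseline)
```

The data never changed; only the baseline did, and the visual impression went from "up 4%" to "quintupled".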
The variables A and B may be related in some way; causation is only one form of relatedness. But is it even that? There are instances of apparent causality between two factors which actually reflect an overriding third factor, as I said.

Another statistical manipulation occurs when data is presented in a way that does not accurately represent the correlation of factors. A positive correlation can suddenly become a negative correlation if it is applied beyond the information given, for example in extrapolations. The correlation of factors sometimes exists, but we must look closely to identify all the factors. Very often we connect A to B, we correlate them, we measure the correlation, and then we extrapolate: we enlarge it, expand it, and apply the outcome wrongly. For example, rain and crops. A moderate amount of rain? Healthy crops. Too much rain? Ruined crops. The correlation breaks down.

Of course, not all statistics are misleading. Statistics is a very useful tool. Believe it or not, it's even used in physics, for example in statistical mechanics. Statistics is really an amazing development of the last 250 years. It is used in a variety of settings, for example actuarial tables in insurance. So statistics is a blessing, but you need to ask yourself: Who says so? How does he know? What's missing from the data? Did somebody change the subject? Does it make sense? What was the size of the sample? Was the sample representative? Which statistical, mathematical measures were chosen, and why? Had different measures been chosen, what would have been the outcome?

Who presents the statistics is crucial. For example, if you have a feminist psychologist, that's a bias, I'm sorry to say; feminism is an ideology. Conversely, if you have a racist psychologist, that's also somewhat of a bias. And there are statistics coming from sources that have something to prove: is the study financed, for example, by tobacco companies?
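The rain-and-crops example above can be reproduced numerically: over the moderate-rain range the correlation is strongly positive, but computed over the full range, excess rain included, it collapses. The figures are invented for illustration:

```python
# Hypothetical data: yield rises with moderate rain, then collapses
# when rain becomes excessive.
rain = [10, 20, 30, 40, 50, 60, 70, 80]              # mm per week
crop = [2.0, 3.1, 4.0, 4.8, 5.2, 4.1, 2.9, 1.5]      # tonnes per hectare

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Correlation on the moderate-rain range alone is strongly positive...
r_low = pearson(rain[:5], crop[:5])
# ...but over the full range it all but disappears.
r_all = pearson(rain, crop)
```

A correlation measured on a narrow slice of the data simply does not survive extrapolation beyond that slice.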
There were numerous statistical studies financed by the tobacco companies back in the 50s and 60s. Did these studies wish to persuade us in a highly specific way? We need to check for both conscious and unconscious biases. Always scrutinize the validity of the source. Is the source reputable? Is it trustworthy? What previous work was published by this source, and how does that work tie in to the current study?

Look at the size of the sample. A sample of 1,200 participants is one thing; 120, or 12, is an entirely different picture. Now, we do have measures of confidence and measures of significance, which tell us how likely it is that the answer is valid, but even they are subject to both bias and mathematical inaccuracy.

Ask yourself: what figures are missing? For example, did anyone bother to indicate the number of cases? Yes, believe it or not, there are studies where the number of cases, N, is not indicated. And when someone says 14% of something, what is this something? When someone says 86% of something, that leaves 14%, and maybe the important message is in the 14% left out, not in the 86%. For example, if I tell you that 60% of women don't cheat, well, 40% of women do, and that's where the problem lies. That's where the pain arises. That's where the agony resides. We should focus on that.

So, selectivity in presenting data is often meant to obscure, camouflage, masquerade, and disguise. Choosing a median or a mode or a mean shifts the result substantially, and such choices should be scrutinized, should be questioned. Omitting certain factors is also very important: why are some factors not mentioned? This in itself is a misuse and misrepresentation of statistics.

In some studies, the subject suddenly changes. The study starts with a general presentation of its goals and aims, and somewhere between the figures and the conclusion, something shifts.
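The sample-size point above can be quantified with the standard 95% margin-of-error formula for a proportion, applied to the three sample sizes just mentioned; the 50% observed split is an assumption chosen as the worst case:

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for an observed proportion p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

# An observed 50/50 split, at the three sample sizes mentioned above.
moe_1200 = margin_of_error(0.5, 1200)  # about +/- 2.8 points
moe_120 = margin_of_error(0.5, 120)    # about +/- 8.9 points
moe_12 = margin_of_error(0.5, 12)      # about +/- 28 points
```

With twelve participants, a reported "50%" is compatible with anything from roughly a fifth to nearly four-fifths of the population, which is why an N of 12, or 6, tells us almost nothing.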
Suddenly, the outcomes or the results or the conclusions of the study have nothing to do with what the study purportedly set out to verify. When a long-term trend is invoked, for example, there is often no evidence to back up what is being projected.

And so: does the statistic make sense? Yes, use your common sense, because common sense is a relatively reliable guide. Not your intuition; your intuition is wrong 50% of the time. But common sense is a reliable guide. And if you come across a statistic that strikes you as nonsense, feel free to question it, to delve deeper, to pull it apart, to unearth, explore, and reveal the inner mechanisms and workings of the study.

The use of precise figures gives the erroneous impression of objectivity and validity. That you use numbers doesn't make you objective. That you use figures doesn't make your claims valid. And that you use mathematics doesn't render your study scientific, nor does it render the entire discipline scientific. People often fabricate, not intentionally, not maliciously, but because they want to support their existing biases. This is called confirmation bias. Sometimes they round figures up, and that's wrong: many statistics cannot be that precise. We should question statistics that do not make sense, for example statistics that somehow assume an infinite extension of a trend into the future. That's always wrong. This use of numbers is a fallacy, because it creates a false impression of authority.

Take, for example, IQ. IQ test results beyond 140 or 160 cannot be normatively validated. I have a 190 IQ, and I can tell you it's totally meaningless, because according to statistics there are only another eight people in the world with my IQ. That's a tiny sample; we can learn nothing from it, and therefore my IQ cannot be normatively validated.

The more we know what to look for, the easier it is to determine whether a statistic in psychology is trustworthy. We need to establish all these things.
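The back-of-envelope behind that IQ claim can be checked under the conventional normal model of IQ scores (mean 100, standard deviation 15, both standard assumptions, not figures from this lecture); a score of 190 sits six standard deviations out:

```python
import math

def tail_probability(iq, mean=100.0, sd=15.0):
    """P(score > iq) under a normal model of IQ scores."""
    z = (iq - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the normal CDF

p = tail_probability(190)       # six sigma: on the order of one in a billion
expected = p * 8_000_000_000    # expected number of such people worldwide
```

With a world population of roughly eight billion, the model expects only a handful of such scores, broadly consistent with the "eight people" figure above, and far too few to normatively validate the test at that range.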
How is the average presented? We need to look closely at line graphs. We need to identify whether first impressions are real and whether the statistic is accurate. There's a lot of play in statistics, and that is why: lies, damned lies, and statistics.