 I've been taking a long time to make this video, partly because of things going on at work, the start of a new project and my recent promotion, but also because I wanted to get this video just right. I care about the science behind this video, and it's easy for me to get carried away mixing my feelings and the evidence. In my last video, I made it clear that I don't see much of a basis for biological races in the human species, but I now realize that I was using the wrong terms, opposing the wrong movements. I spent my effort showing that races are not terribly useful, or arbitrary, and that they don't really matter much in genetic terms, a position shared by most geneticists. But my critics rightly pointed out that even arbitrary categories exist, so I don't see any reason to oppose the assertion that races exist given the proper biological definition of races of organisms, and that they have at least some small aspect that is rooted in biology. We'll call this perspective the biological race concept, or BRC. What I continue to oppose, and what I will present evidence against, is the idea that races are essential concepts in the human population, what we'll call race essentialism, or RE. This is the idea that the buckets we assign people to are somehow more meaningful than any other way of dividing human diversity. Central to this question is the number of races, the number of buckets, in which we find that the populations are homotypic, that is of a single type, a uniformity of members. Race essentialists defend the idea that there is something in the biology that suggests the categories we draw, that they are not completely arbitrary, and it's an idea that has roots in Victorian-era science, an era that was dominated in the biological sciences by ideas like phrenology, and eugenics, and genetic determinism. 
We see these concepts reborn in the language of the race essentialists, ideas like dividing the entire human species into large continental categories, and then comparing social characters like criminality, IQ, and income, as though these were entirely determined by immutable characters of inheritance, rather than complex and interconnected factors of society, environment, genetics, and non-genetic inheritance. It is a simplistic understanding of biology, a retro approach to science, and that's specifically what I want to differentiate from the simple act of defining groups in the human population, the biological race concept. The key difference is how much we become wedded to the idea that these concepts really have meaning or force in biology, that they are useful categories in how we treat people, or what freedoms or rights we grant, or how we perceive our fellow humans. In my own estimate, if we let the biology alone determine these divisions, there are either about 350 human races, about 250 of which reside exclusively in Africa, or there is one human race. I'm going to break this up into a series of videos. The first topic we'll cover is the biological versus social construction of race. The human races, as most people think of them, are unquestionably social constructs. That's not to say that there are no biological races, but they're not the ones we use in everyday parlance. The labels we stick on people are not dependent on their actual ancestry, merely on our perception of them. Haplogroups, as I mentioned in my previous video, would be one such biological race concept since it uses actual genetic markers, but is devoid of the cultural elements that we have imbued with false meaning. It's less prone to confirmation bias and confounding by social norms. What we actually use when we talk about races are socially constructed, truly skin-deep, because we judge someone's race strictly on their outward appearance, not their inward genetics. 
Any overlap with actual biological race is purely statistical and accidental. I give you Exhibit A, the current U.S. President, Barack Hussein Obama. What race is he? You can answer white, black, mixed, and be right on all counts, since his mother identifies with one group and his father another. To most people, his race is not a matter of what genetic markers he possesses. It's how he's perceived, his social identity. He may choose it, he may not. Growing up in Southeast Asia with a stepfather, who knows how he was thought of there, what group he most identified with? But to most Americans, he's the first black president, not the first mixed-race president, not the first half-black or half-white president, simply black. The one-drop rule of race identification has been the standard in the U.S. since long before DNA testing was available. It's obviously not very reflective of the actual markers I would find if I genotyped the U.S. president. He's not atypical for the American population with African ancestry. When we genotype someone who self-identifies as African American, we will typically find between 5 and 20% European ancestral markers. Many will have Asian markers as well. The same goes for those who identify as European Americans. Most will be unaware of very recent African ancestry. This is called admixture, the mixing of different ancestral populations in the genetic complement of a modern population through interbreeding. Admixture is the rule in most places, but especially in areas with high diversity or multi-regional emigration. Many of the race realists would say that the existence of mixed races doesn't mean that races don't exist biologically. But it absolutely does mean that the rules of social race identification are not based on classical taxonomy. There are no half-robin, half-sparrows; that's simply not how taxonomy works. As a general rule, if we can define two populations with a single character, then there really aren't two populations. 
If there's significant genetic overlap between two subgroups, then they aren't distinct after all. There are two exceptions, not found within strict taxonomy but still useful in population genetics, and we'll need to see if either or both of them applies to the human species. But I want to clarify early on the idea that an alien coming to Earth, with our same understanding of phylogenetics, would put all modern humans into a single species and a single subspecies. We are all Homo sapiens sapiens. Any differentiation from this point on will have to be below the level of subspecies. The first concept we need to address is a deme. A deme is a distinctive group within a species, where reproduction is still possible between the groups, but each group is subject to different selection factors. An example might be a single species of bird with two distinctive groups, say a western and eastern group, that differ a bit in their mating call, and only rarely interbreed. If the differences between the two groups vary continuously across geography, then we have our second concept, a cline. A cline is different from a deme, or perhaps we could say it is a special type of deme, where there is no demarcation between the groups, what we would call a discontinuity. Imagine for example a species of rabbit with several different coat types, depending on the local environment, where the mountain deme blends into the desert deme in small steps. In classical taxonomy, a deme or a cline would fall below the level of a subspecies, and the definitions for these groups are flexible, and only very loosely defined. So how do we define whether a group is a subspecies, race, deme, or cline? The famous geneticist Sewall Wright, the man who wrote most of the equations that we use today in the field of population genetics, developed a particular descriptive number called the FST, or fixation index. 
It was a way of describing something we call population substructure, that is whether a large group is genetically homogeneous, composed of a single large population, or heterogeneous, composed of diverse groups of breeding individuals isolated from each other and genetically distinct. This is clearly the best statistic for describing whether or not races are genetically distinct. The FST for modern humans is approximately 0.110, or we could say 11%. What that means, in essence, is that 89% of all the variation in humans is shared across all groups. Only 11% of human genetic diversity can best be explained by the presence of distinctive subgroups. Wright himself proposed that a subspecies should be considered valid when this FST value exceeded 0.25. Humans don't even make it halfway to the standard criterion, so objectively, by the standard taxonomic practice governing subspecies, human populations don't qualify as subspecies. If we do decide to assign subcategories to the human population, they cannot be called subspecies. We might still use the term deme or cline. We might still talk about races. There are no criteria that a population needs to meet for these categories. They are completely arbitrary in division. We can have 3 million human biological races, or one. Both are equally valid because both are completely arbitrary buckets in which to put diversity. I want to make it clear that I don't object to arbitrary buckets. I only object to the essentialist concept that the buckets were there before we created them, and that dividing up diversity in this way reveals something significant. It's just an arbitrary division, like dividing up world history into the classical era, the Middle Ages and the Renaissance period. It's a way of simplifying a continuum, breaking it up into understandable chunks. 
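To make the fixation index discussed above a bit more concrete, here is a minimal sketch of how an FST value can be computed for a single two-allele marker, using the common textbook form FST = (HT - HS) / HT. The allele frequencies are made up for illustration, the subpopulations are assumed to be equally sized, and the function names are hypothetical; real estimates average over many markers and correct for sample size.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for a two-allele marker with frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(subpop_freqs):
    """FST = (HT - HS) / HT, assuming equally sized subpopulations.

    HT is the heterozygosity expected if everyone belonged to one pooled
    population; HS is the average heterozygosity within the subpopulations.
    """
    n = len(subpop_freqs)
    pooled = sum(subpop_freqs) / n
    h_t = expected_heterozygosity(pooled)
    h_s = sum(expected_heterozygosity(p) for p in subpop_freqs) / n
    if h_t == 0.0:
        return 0.0  # marker does not vary at all, so there is nothing to partition
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in three populations (illustrative only).
example = fst([0.3, 0.5, 0.7])
print(f"FST = {example:.3f}")            # about 0.107
print("Above 0.25 threshold:", example > 0.25)
```

With these invented frequencies the result lands near the 0.11 quoted for humans: the populations clearly differ in allele frequency, yet roughly 89% of the variation at the marker is still found within each population, well short of the 0.25 threshold.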
The essentialism in our history example would be to treat 400 BC and 350 BC as though some real division exists between them beyond the 50 intervening years, simply because one falls into our arbitrary division of the classical period and the other does not. My central question in this video is, does the biology, the genetics of the human population suggest that there is a good place to draw divisions, these buckets or categories? Are we defined best in terms of the Victorian-era ideas of race, or rather the modern concept of arbitrary but objective divisions like haplogroups or clines? For that we're going to have to look at some real data. The type of graph we'll look at first is called a principal component analysis. On each axis, or more properly each eigenvector, we're going to use a combination of lots of genetic markers. I don't want to focus too much on the nature of these markers, except to say that they're from non-coding regions, not from genes, but from the vast distances between genes. They haven't been subjected to specific selection, so what we're measuring has nothing to do with adaptation to the local environment, merely the natural genetic drift of two populations with some reproductive isolation. Each dot on the PCA graph represents a group of individuals drawn from a different population. The goal here is to see how much of the variation in the dots can be accounted for by our two marker sets. If human races are distinctive demes, this is what they would look like. You'll find graphs just like this one in papers showing how biological races differ from each other. So why would I bring it up? Doesn't data like this destroy the idea that the races are distinctive? Yes and no. The problem here is one of how we select our populations. If you choose one population from Central Africa, one from Central Asia, and one from Australia, this is indeed what you see. 
Data like this suggests that these populations must have been isolated for a long time, with no intermarriage and hence a great deal of genetic drift and differentiation. Africans and Europeans didn't share a gene pool for a long time. However, what happens when we sample from all those geographic regions in between these distant populations? Now we can see what looks like a continuous change from one region to another. This is consistent with a model where gene flow is fairly continuous across geography. The only reproductive isolation was from geographic distance, so that markers found in populations in Africa remain distributed in populations in the Middle East. This is the classical presentation of a cline, the continuous distribution of alleles that we discussed earlier. Human diversity is best represented by a continuum of change across geography, with the occasional gap where a physical separation existed between two peoples, producing what is called in genetics a discontinuity. We see these clines on the global scale, say from Portugal to Siberia, but we also see them on the very, very small scale, even within a single ethnic group or national population. For example, geneticists can map a cline of Spanish markers into western France, or differentiate French and German-speaking Swiss people. To me, this suggests that dividing up diversity into homotypic groups is probably doomed from the beginning. If you attempt to draw dividing lines at each steeply sloped cline or discontinuity, you will find that there are anywhere from 120 to 600 of these clinal sets in the global population. If you attempt to keep your divisions at sharp discontinuities, you'll find that you can't have more than one or two subgroups. That's why I say the number of human homotypic races suggested by biological diversity is either about 300 or one. There just are no other divisions that account for the continuous changes, both between and within populations. 
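The sampling effect described above can be reproduced with a small simulation. This is only a sketch under simple assumptions, not the analysis behind the graphs in the video: populations sit along a single geographic axis from 0 to 1, each marker's allele frequency changes linearly along that axis with a little random noise standing in for drift, and the first principal component is found by plain power iteration. All names and parameter values are hypothetical.

```python
import math
import random


def simulate_populations(positions, n_markers=200, noise=0.01, seed=0):
    """Allele-frequency vectors for populations placed along a linear cline."""
    rng = random.Random(seed)
    base = [rng.uniform(0.3, 0.7) for _ in range(n_markers)]
    slope = [rng.uniform(-0.25, 0.25) for _ in range(n_markers)]
    pops = []
    for x in positions:
        row = [min(1.0, max(0.0, a + b * x + rng.gauss(0.0, noise)))
               for a, b in zip(base, slope)]
        pops.append(row)
    return pops


def first_pc_scores(rows, iterations=100, seed=1):
    """Score of each population on the first principal component."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(m)]
    for _ in range(iterations):
        # Power iteration: repeatedly apply X^T X and renormalise.
        scores = [sum(c[j] * v[j] for j in range(m)) for c in centered]
        v = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return [sum(c[j] * v[j] for j in range(m)) for c in centered]


def largest_gap_fraction(scores):
    """Largest gap between neighbouring dots, as a fraction of the full spread."""
    s = sorted(scores)
    return max(b - a for a, b in zip(s, s[1:])) / (s[-1] - s[0])


# Three distant populations versus 21 populations sampled all along the axis.
sparse = first_pc_scores(simulate_populations([0.0, 0.5, 1.0]))
dense = first_pc_scores(simulate_populations([i / 20 for i in range(21)]))
sparse_gap = largest_gap_fraction(sparse)
dense_gap = largest_gap_fraction(dense)
print(f"largest gap, 3 distant populations: {sparse_gap:.2f} of the spread")
print(f"largest gap, 21 populations in between: {dense_gap:.2f} of the spread")
```

The underlying cline is identical in both runs. The three distant samples sit far apart on the first component and look like separate clusters, while filling in the regions between them shrinks every gap and leaves no obvious place to draw a line.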
Let's take our PCA graph and try some other arbitrary divisions. What we're going to alter here are called inferred populations, and we assign the capital letter K to this number. If we assume two homotypic groups exist in the human population, that is K equals 2, this is what the analysis looks like. Perfectly valid. We've created two lobes. Set K equals 3, and now we have three distinctive populations, suggesting a central population that two groups migrated from. Set K equals 4, and we have the basic continental populations. Again, the data isn't changing, only how many inferred populations we assume. If we increase K to 9 inferred groups, we get an analysis that still looks quite valid. There's nothing in the data here that tells us what value K should be, unless we already have a goal in mind. So how does my simulated data compare to actual PCA of human populations? Let's look at a few. Depending on how we set up the markers and the populations, we can get tightly clustered demes, or we can get clinal variation. That information by itself can tell us something about the history of our species. In the next video, we'll take another look at admixture data using a program called Structure, and we'll explore the question of IQ, criminality, and the presence of genetic markers from ancient hominids in modern humans. Thanks for watching.
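A note for readers of this transcript: the point about K can be tried out with a few lines of code. The sketch below is an assumption-laden stand-in, not the method used for the graphs in the video. It uses plain one-dimensional k-means rather than a population-inference program, and the data are 100 evenly spaced points standing in for positions along a cline. The function names are hypothetical.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain k-means on a list of numbers, with a deterministic starting point."""
    s = sorted(values)
    # Start the centroids at evenly spaced quantiles of the data.
    centroids = [s[int((i + 0.5) * len(s) / k)] for i in range(k)]
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = sum(members) / len(members)
    # Final assignment against the final centroids.
    labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
    return labels, centroids


def within_cluster_ss(values, labels, centroids):
    """Total squared distance of each point from its own cluster centre."""
    return sum((v - centroids[lab]) ** 2 for v, lab in zip(values, labels))


# A perfectly continuous "cline": 100 evenly spaced positions from 0 to 1.
cline = [i / 99 for i in range(100)]

results = {}
for k in (2, 3, 4, 9):
    labels, centroids = kmeans_1d(cline, k)
    results[k] = (len(set(labels)), within_cluster_ss(cline, labels, centroids))
    print(f"K={k}: {results[k][0]} groups, within-group spread {results[k][1]:.3f}")
```

Every value of K returns exactly K tidy groups, and the within-group spread simply keeps shrinking as K grows. Nothing in the output singles out one K as the right one, which is the same situation as with the inferred populations above.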