Are there any questions from the last lecture? Nothing? OK. So the notes for this week's lectures, Monday's, today's, and Friday's, were all posted as of 7 or 8 last night, so you can actually see the notes that I'm reading off of. Now, this week, the lectures are all supposed to be on population genetics. So I want to finish up with Hardy-Weinberg and then some of the implications of Hardy-Weinberg equilibrium. And then we'll go into the forces that cause allele frequencies to change in a population. Remember, we're looking at the world in sort of an idealized way, where we have populations, and individuals within populations mate randomly. And we're looking at evolution as changes in gene frequencies through time. That's the modern view of evolution. It's a view that was formulated in the 1940s, looking at genes in populations, and it's called the neo-Darwinian synthesis. OK. And I thought I'd sort of speed things up by writing everything we've done up to this point in terms of this one very simple example from your book, where we had flower color that was determined by two alleles, the red and the white alleles, CR and CW. The phenotypes you can have are red, pink, and white flowers. And the phenotypes were completely determined by the genotypes, of course: if you're homozygous for the CR allele, you have red flowers. If you're homozygous for the CW allele, you have white flowers. And if you're heterozygous, you have pink flowers. And this is a situation where you can actually just look at the phenotype and determine the genotype, right? That's not always the case. Not all traits work like this. So for instance, if it were the case that the red allele were dominant, then you would have red, red, and white flowers.
And so in that case, looking at the phenotype would not be able to tell you the genotype. If you had red flowers, you could be either homozygous or heterozygous for the CR allele, right? But in this case, we don't have to worry about that. So that's just the genetics of this situation that's in your book. Now we're actually looking at a population, some population of flowers. And you go out in the field and you count that there's 320 individuals that are red, that have the CR CR genotype, 160 individuals are heterozygous, and 20 individuals are homozygous for the CW allele. From these, you can pretty easily calculate the genotype frequencies, right? If you total up the number of individuals that you counted, there's 500. And so the frequency of the CR CR genotype is just 320 divided by 500, which is 0.64. In the same way, the heterozygotes are 160 over 500, or 0.32, and the CW CW genotype is 20 over 500, or 0.04. So presumably, nobody's having trouble up to this point. I believe I stopped here, where we were just going to calculate the allele frequencies. Correct me if I'm wrong. I don't think I covered this. And there's two ways of calculating the allele frequencies. So I'm calling F with the subscript CR the frequency of the CR allele in the population. And F subscript CW is the frequency of the CW allele in the population. But we can also call those frequencies P and Q, if you wish. And that's what we'll probably do from here on. Just a generic P and Q for those two frequencies. And of course, P plus Q has to sum to one, right? Okay. All right, so there's two different ways of doing this. They're both easy. So I'm just going to tell them both to you, because depending on the situation, you might want to use one method or the other. And in either case, you have to put on your accountant cap, so to speak, right? Pretend you're an accountant for a while. So just keep track of things.
So over here, if we want to get the allele frequencies, we have to note that each individual has two alleles, right? So there's 500 individuals in the population, but there's 1,000 alleles floating around in the population, so to speak. Okay, now if we want to count up the number of CR alleles, how many are in the CR homozygous individuals? Well, there's 320 times two, right? There's 640 CR alleles found in these individuals. How many of the CR alleles are found in the heterozygous individuals? 160, right? There's 320 alleles found in the heterozygous individuals, but half of them, 160 of them, are CR. So plus 160. And then we can ask the question, how many of these CR alleles are found in the CW homozygous individuals? Zero, right? So we have 640 plus 160, that should be 800, divided by the total number of alleles, which is 1,000, and that gives you 0.8. Right? If I didn't make a mistake in my math. And we can do the same type of thing over here. We can say, how many of the CW alleles are found here, in the CR homozygotes? Well, zero. How many of the CW alleles are found here in the heterozygous individuals? Well, it's 160. Plus, how many CW alleles do we have among these CW homozygous individuals? We have 40 of them, right? Because each one of those 20 individuals has two of them. And we just divide by the total number of alleles. And if we did our math right, 160 plus 40 is 200, so that's gonna be 0.2. And you can always do the obvious check, which is you add up these two frequencies and they should sum to one. If they don't, you'd better go back and redo things. You've made a mistake somewhere.
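The counting method just described can be sketched in a few lines of Python. This is only an illustration using the lecture's flower counts; the function name `allele_freqs_from_counts` and its argument names are made up here and do not come from the book or the notes.

```python
def allele_freqs_from_counts(n_rr, n_rw, n_ww):
    """Return (p, q), the CR and CW allele frequencies, from genotype counts.

    n_rr, n_rw, n_ww are the numbers of CR CR, CR CW and CW CW individuals.
    (Hypothetical helper, named for this example only.)
    """
    total_alleles = 2 * (n_rr + n_rw + n_ww)  # every individual carries two alleles
    cr_alleles = 2 * n_rr + n_rw              # two per CR homozygote, one per heterozygote
    cw_alleles = 2 * n_ww + n_rw              # two per CW homozygote, one per heterozygote
    return cr_alleles / total_alleles, cw_alleles / total_alleles


p, q = allele_freqs_from_counts(320, 160, 20)
print(p, q)  # 0.8 0.2

# The "obvious check": the two frequencies should sum to one.
assert abs((p + q) - 1.0) < 1e-12
```

Counting both alleles explicitly, rather than getting q by subtraction, keeps the sum-to-one check meaningful as an error detector.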
And it's always tempting, of course, if you're very confident, which I'm not, and you're pressed for time, maybe like in an exam, to say, well, I'll just calculate the frequency of this allele and then use subtraction to get the frequency of the other one, right? And that works perfectly fine as long as you didn't make a mistake here, right? You're using this fact that these frequencies have to sum to one. Anyways, that's the method that uses the genotype counts. And that would be appropriate in situations where somebody tells you the genotype counts. There's another method which is just as easy, which uses the genotype frequencies. And so we're gonna use these. So what you do is, if you wanna get the frequency of the CR allele, you take the frequency of the CR homozygous individuals, which is 0.64, plus one-half the frequency of the heterozygous individuals, which is one-half times 0.32. So 0.64 plus 0.16, that should equal 0.8. And similarly over here, you take the frequency of the CW homozygous individuals, which is 0.04, plus one-half the frequency of the heterozygotes, 0.16. So that's gonna be 0.04 plus, oh, 0.32, thank you. One-half times 0.32 is 0.16, so that's gonna be 0.2. I would have quickly discovered my error, right? All right, hopefully that's clear to you. But anyways, those are the two different easy methods for calculating allele frequencies. Now, all we've done at this point is just characterize the frequencies of the genotypes and alleles in the population. No higher math was involved, and none of the stuff I'm gonna tell you involves hard math. Hardy-Weinberg predicts the frequencies of the alleles and the genotypes in the next generation. That's the key. And the prediction is, maybe I'll go over here.
The prediction is that if we have CR CR, CR CW, and CW CW, then in the next generation the frequency of the homozygous CR individuals should be P squared, the heterozygotes should be 2PQ, and the CW homozygotes should be Q squared. That's the prediction. So we can actually predict using our allele frequencies. Remember, the allele frequencies are P and Q. So P was 0.8, right? So what's 0.8 squared? That's 0.64. And then Q was 0.2, so Q squared is 0.2 times 0.2, which is 0.04. And then the predicted frequency of the heterozygous individuals is 2PQ: P is 0.8 times Q is 0.2, that's 0.16, times two is 0.32. What we just found is that this population is in Hardy-Weinberg equilibrium. And the reason why that's the case is that the observed genotype frequencies, which we calculated just by counting and dividing by the total, are the same as those predicted frequencies. Now we could imagine a situation where, let's say, the genotype frequencies were 0.8, zero, and 0.2. What would the allele frequencies be? Well, they would be the same. They'd still be 0.8 and 0.2. So if it were the case that there were no heterozygous individuals, 80% were homozygous for CR and 20% homozygous for CW, then you would still get the same predicted Hardy-Weinberg frequencies. And yet you would say that this population was not in Hardy-Weinberg equilibrium, because there was a strong deviation. You'd have expected a lot more heterozygous individuals. Okay, so that is it. That's all there is to Hardy-Weinberg equilibrium, Hardy-Weinberg theory. It basically just predicts what the genotype frequencies should be in the next generation based on the allele frequencies in the current generation. That's all there is to it. So why is this an interesting result? That's probably something you might be wondering yourself. So there's three reasons why we'll talk about Hardy-Weinberg being interesting. One is the evolutionary implications.
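The frequency-based method and the Hardy-Weinberg comparison can be put together in a short sketch. The helper names below are hypothetical, and the tolerance `tol` is there only to absorb floating-point rounding; it is not a statistical test, in keeping with the lecture's "exact match or not" framing.

```python
def allele_freqs_from_genotype_freqs(f_rr, f_rw, f_ww):
    """Method two: homozygote frequency plus half the heterozygote frequency."""
    p = f_rr + 0.5 * f_rw
    q = f_ww + 0.5 * f_rw
    return p, q


def hardy_weinberg(p):
    """Predicted genotype frequencies (p^2, 2pq, q^2) for allele frequency p."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


def matches_hw(observed, tol=1e-9):
    """True if observed genotype frequencies equal the HW prediction (up to rounding)."""
    p, _ = allele_freqs_from_genotype_freqs(*observed)
    expected = hardy_weinberg(p)
    return all(abs(o - e) <= tol for o, e in zip(observed, expected))


print(matches_hw((0.64, 0.32, 0.04)))  # True: the flower population
print(matches_hw((0.80, 0.00, 0.20)))  # False: same allele frequencies, no heterozygotes
```

Both populations give p = 0.8 and q = 0.2, so the prediction is identical; only the observed genotype frequencies differ.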
So why is it interesting evolutionarily? What Hardy-Weinberg says is that in the absence of anything else, such as natural selection, and if you assume you have a very large population and so forth, random mating by itself does not cause allele frequencies to change. That sounds like a trivial result, but it's a very important one. By itself, just random mating doesn't change allele frequencies, okay? And we can show that. I'm thinking we have two alleles, big A and little a now, okay? So let's call P prime the frequency of big A in the next generation. Hardy-Weinberg says that if you want to get the frequency of the allele in the next generation, you take the predicted frequency of the homozygous big A, big A individuals, P squared, and add one half the predicted frequency of the heterozygotes, 2PQ, right? Remember, P squared and 2PQ are the Hardy-Weinberg predictions of the genotype frequencies in the next generation, using the allele frequencies in the current generation. P and Q are the frequencies of the big A and the little a alleles in the current generation. You see that? So let's do some math. So this is gonna be P squared plus, well, the two and the one half cancel, so it's gonna be PQ. And then we can factor out the P, so you have P times P plus Q, right? Does anybody know what number P plus Q is? It's the number one, so you get P, all right? So what does this mean? It means that the frequency of the big A allele in the next generation is the same as the frequency of the big A allele in the current generation. Frequencies don't change, which is just another way of saying what I just said, but here's a mathematical proof showing it. So this is maybe your first mathematical proof in a biology class, right? Pretty exciting, no? Okay. So that is an evolutionary implication of Hardy-Weinberg. There's also some implications for human genetics, okay?
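Written out, the spoken derivation looks like this, with p and q the current frequencies of the big A and little a alleles and p' the frequency of big A in the next generation (the symbols follow the lecture; the layout is just one way of setting it down):

```latex
\begin{align*}
p' &= p^2 + \tfrac{1}{2}\,(2pq) \\
   &= p^2 + pq \\
   &= p\,(p + q) \\
   &= p \qquad \text{since } p + q = 1 .
\end{align*}
```

The same steps with q in place of p give q' = q, so neither allele frequency changes under random mating alone.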
And one implication has to do with DNA fingerprinting. So this is something you might have heard of. It's often now used in courts of law, and the scenario is this: you have some crime that was committed in which some sort of biological sample was left at the crime scene, hair, blood, semen, something like that, okay? And you have a suspect, some person that was in the area who might have been involved in the crime. You take a sample of his or her blood and you ask, does that blood match the blood found at the crime scene? Okay, so the basic idea is you're gonna use Hardy-Weinberg theory to predict the frequencies of genotypes in a population. So let's say, for instance, that at the crime scene, you found that the sample was little a, little a at some locus, and that you know that the frequency of the little a allele is 0.01, so it's at low frequency. And then you have the suspect over here, and you look at his genotype or her genotype and you find it's also little a, little a, okay? Well, that's intriguing, in that you have this match, that they have the same genotype, but it's not really putting any numbers on the probability of that match now, is it? And the way you can put a probability on that match is to use Hardy-Weinberg, all right? So if the allele is at a very low frequency like this, 0.01, what is the Hardy-Weinberg prediction for the frequency of this homozygous genotype? 0.01 times 0.01, right? Which is one in 10,000. So the frequency of this genotype under Hardy-Weinberg is predicted to be one over 10,000. Now you're actually putting a number on this, the probability that a person randomly drawn from the population would have that same genotype, right? And it's one in 10,000. So you can imagine the prosecuting attorney telling the jury and the judge, look, you know, this fellow was found in the area.
He was acting suspiciously and he happens to have the same genotype as the sample left at the crime scene. They probably wouldn't tell you the theory behind it, but they would just say the probability of that match is one in 10,000. You know now what the theory is that is used to predict that frequency or that probability. Now this is quite commonly used now. There's actually a database called the CODIS database, which is basically people that pass through the criminal justice system. They now give a blood sample. And what they do is they accumulate the frequencies of the alleles at 13 different loci. So it's not just one locus like this, but you would look at 13 different loci and get the genotype for all 13 of them. And so what you can do then is accumulate frequencies, over a large number of individuals, for the different alleles at these 13 different genes or 13 different loci. And now I think the size of this database is approaching five million individuals. So when people now present evidence in court, they'll calculate the probability of a match at each one of these 13 different loci and then multiply those probabilities together. So you may have read in the newspapers about these very low match probabilities. They say this individual has a match probability of one in 10 million. It comes about because you're multiplying a lot of small numbers together. So if you multiply a bunch of numbers between zero and one together, the product has to be smaller, right? Okay, that's the idea. Are there any questions about this? Okay, so that's an important implication of Hardy-Weinberg. I don't think any of you will ever find yourselves going through the criminal justice system, but presumably most of the people who do go through it don't know this theory, right?
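To make the multiplication concrete, here is a minimal sketch of the match-probability arithmetic. The single-locus number is the lecture's; the list `per_locus` is entirely made up for illustration (it is not real CODIS data and has only four entries rather than 13), and multiplying across loci assumes the loci are independent of one another.

```python
from math import prod


def homozygote_match_probability(q):
    """Hardy-Weinberg frequency of the little-a homozygote when the allele has frequency q."""
    return q * q


# The lecture's single-locus example: allele frequency 0.01.
single_locus = homozygote_match_probability(0.01)
print(f"single locus: 1 in {1 / single_locus:,.0f}")

# HYPOTHETICAL genotype match probabilities at four loci (invented numbers).
per_locus = [0.05, 0.10, 0.02, 0.08]
combined = prod(per_locus)  # multiply the per-locus probabilities together
print(f"combined: 1 in {1 / combined:,.0f}")

# Multiplying numbers between zero and one can only shrink the product.
assert combined < min(per_locus)
```

Even with only four moderately rare genotypes the combined probability drops well below any single-locus value, which is why 13 loci give the very small numbers reported in court.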
But anyways, another implication is for rare genetic diseases, especially those that are only expressed when you're homozygous, okay? So the implication is this. Take a disease such as cystic fibrosis. Cystic fibrosis is a disease caused by a defect in the CFTR gene. It's the cystic fibrosis, what is it, transmembrane conductance regulator gene. It's a gene where, if you have two copies of the defective allele, you get cystic fibrosis, okay? The allele that causes cystic fibrosis is at low frequency in European populations. Oh, by the way, the allele that causes this is called Delta F508, all right? So this is a mutation in this gene, okay? I don't know much about this particular mutation, but I can guess from the name that it's a deletion. Usually when you see deletions, there's a Delta put in front of it, and it's probably a deletion at the 508th amino acid, or position 508, is my guess, all right? But anyways, there is a mutation in this gene. If you have two copies of the allele that has this particular deletion mutation in it, then you're gonna get cystic fibrosis, that's the story, okay? So what is the frequency of the Delta F508 mutation? It's about 2%, okay? So the frequency of this mutation in the population is 2%. So what does that mean in terms of the fraction of the individuals in the population that are gonna come down with cystic fibrosis? Well, it's gonna be that squared, right? Should be .0004, if I do my math in my head correctly. So the frequency of the individuals that are actually affected is quite small. How about the frequency of the people that have just one copy of the disease allele? So these are the so-called carriers. You're a carrier for a genetic disease if you're heterozygous, that is to say, you have one normal copy of the gene and one disease variant of the gene, okay?
A lot of us are probably carriers for all sorts of genetic diseases that aren't expressed because they're only expressed if you're homozygous for them, right? So what is the frequency of the heterozygous individuals, the carriers? Well, Hardy-Weinberg says it's 2pq, right? Now let's think about this for a second. If q is the frequency of the disease causing allele, in this case .02, then q is a small number and p is a number near one, right? So p times q is about q, about .02, roughly speaking. You can do this in your head. So generally speaking, the frequency of carriers for rare genetic diseases is roughly twice the frequency of the disease causing allele. So in this case, the frequency of carriers is about 4%. Now do you see how I did that? If you actually do 2pq, you'll get a number that's slightly different, a little bit smaller than 4%, but it's roughly 4%, roughly twice the frequency of the disease causing allele. So the implication of course is that even for a rare genetic disease, a fraction of the population that's much higher than you might think are actually carriers. For cystic fibrosis, roughly 4% of Europeans are carriers. So in a class like this, 700 people, well, about 28 of you probably are carriers. Or maybe it's not a purely European or Caucasian class, but 10, 20, 30 of you might be carriers. That's kind of an interesting way of thinking about it. And of course with genetic testing, you actually might know this, right? You might actually know whether you're a carrier or not. So you have to make some decision based on that information. This isn't a class where we'll tell you how to make any of those decisions, but it is something that you should be aware of, that you could know about it.
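The carrier shortcut is easy to check numerically. The sketch below compares the exact Hardy-Weinberg value 2pq with the in-your-head approximation 2q, using the lecture's 2% allele frequency and 700-person class; the function and variable names are invented for this example.

```python
def carrier_frequency(q):
    """Exact Hardy-Weinberg heterozygote (carrier) frequency, 2pq, with p = 1 - q."""
    p = 1.0 - q
    return 2 * p * q


q = 0.02                      # frequency of the disease allele, from the lecture
affected = q * q              # homozygotes who have the disease
exact = carrier_frequency(q)  # 2pq
approx = 2 * q                # mental shortcut: p is close to one

class_size = 700
print(f"affected: {affected:.4f}")
print(f"carriers: exact {exact:.4f}, shortcut {approx:.4f}")
print(f"expected carriers in a class of {class_size}: about {class_size * exact:.0f}")
```

The shortcut always overstates the carrier frequency slightly, because it replaces p with one, and the gap shrinks as the disease allele gets rarer.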
You might have to make a decision based on that knowledge. Okay, I'm gonna skip the APOE example. I'll probably modify my notes; if I skip something in lecture, you're not gonna be responsible for it. So I'm just gonna make a note to myself that I'm not gonna talk about the APOE example. It's sort of the same business. Okay, the last thing I want to mention in terms of the importance of the Hardy-Weinberg theory that I just described is that deviations from Hardy-Weinberg are interesting. So many evolutionary biologists, and many biologists or geneticists in general, think of Hardy-Weinberg as what you might call a null hypothesis. How many people have heard the term null hypothesis before? All right, good. I just wanted to make certain. So as you know, a null hypothesis is the default hypothesis that you're interested in testing. And so if you test the null hypothesis of Hardy-Weinberg equilibrium and you find that there isn't a deviation from Hardy-Weinberg, then you say, eh. All right, it's in Hardy-Weinberg equilibrium. Nothing interesting's happened, right? If you test the null hypothesis of Hardy-Weinberg equilibrium and you reject that idea, that is to say, the observed frequencies are much different than what Hardy-Weinberg would predict, then you say, what's going on? What is causing this deviation from Hardy-Weinberg equilibrium? Okay, now of course, I'm not gonna discuss how you actually detect deviations from Hardy-Weinberg equilibrium. That involves a statistical test where you have to reject the null hypothesis with, say, 95% confidence. And there's lots of ways of doing that. This is not a class where I'm gonna be teaching you about that, okay? So in all the examples I could possibly give you, the frequencies I give you will either exactly match the Hardy-Weinberg expectations or not, right?
So I'm not gonna ask you, are these frequencies significantly different from the Hardy-Weinberg expectation? You need a little bit more statistics to do that. So what types of things could cause deviations from Hardy-Weinberg? Oh, I should also say that remarkably, when you go into real populations, say human populations, and you count up genotype frequencies and allele frequencies and you calculate the Hardy-Weinberg expectation, most genes are pretty close to Hardy-Weinberg equilibrium, okay? That's kind of interesting. So just as an example, one thing that can cause deviations from Hardy-Weinberg equilibrium is species that self, that is to say, self-fertilize. There's lots of plants out there that self-fertilize. As an example, wild oats, a type of plant, self-fertilize. And so the example I've got in your notes is that at some particular gene, the genotype frequencies, we'll just call them big A big A, big A little a, and little a little a, were found to be 0.548, 0.071, and 0.381. These are the observed frequencies. The Hardy-Weinberg predictions, I think I wrote them down over here. They're not difficult to calculate, so I'm gonna leave this as an exercise for you to figure out how I calculated this, or make certain that you get the same numbers as I do. But the Hardy-Weinberg prediction is that these should be at frequency 0.34, this should be 0.49, and these should be 0.17. That's of course assuming I didn't make an error, but I don't think I did. That's the Hardy-Weinberg prediction. And the point here is that Hardy-Weinberg predicts many more heterozygous individuals than you actually observe in the population. And that might make sense for a selfing species, right? If you self and you're homozygous, you're gonna produce genotypes that are just like your own, and if you're heterozygous, only half of your offspring are heterozygous.
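For anyone wanting to check the wild oats exercise, here is one possible way to reproduce the predicted frequencies from the observed ones. It is a sketch of the calculation, not necessarily how the numbers in the notes were produced; the dictionary layout and variable names are choices made for this example.

```python
# Observed genotype frequencies for wild oats, as given in the lecture.
observed = {"AA": 0.548, "Aa": 0.071, "aa": 0.381}

# Allele frequencies: homozygote frequency plus half the heterozygote frequency.
p = observed["AA"] + 0.5 * observed["Aa"]
q = 1.0 - p

# Hardy-Weinberg predictions from those allele frequencies.
expected = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

for genotype in observed:
    print(f"{genotype}: observed {observed[genotype]:.3f}, "
          f"predicted {expected[genotype]:.2f}")

# Heterozygotes are far rarer than predicted, as expected under selfing.
assert observed["Aa"] < expected["Aa"]
```

Rounded to two decimals the predictions come out as 0.34, 0.49 and 0.17, matching the values quoted in the lecture.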
All right, so that's one example where we have a deviation from Hardy-Weinberg equilibrium. Another example has to do with a disease called sickle cell anemia, which is at high frequency among many Africans, and among many African Americans as well, okay? And the basic idea is this involves a particular gene for the hemoglobin protein, which is involved in carrying oxygen through your body. There's two alleles involved. The normal variant of the allele we'll call A, and the variant of the allele that causes sickling we'll call S. So what causes the S allele? Well, it's a single mutation, I think it's from an A to a T, at position six, that changes one amino acid to another. It goes from glutamate to valine or something like that. But it's a single mutation in this beta-globin gene. And the three genotypes are normal, pretty normal, and sick, okay? The SS individuals have a much lower life expectancy. I think they usually have a life expectancy about my age, in the 40s. Basically what happens is the cells sickle. Normally your blood cells are kind of like discs. These individuals get these sickle-shaped cells, and they basically have problems with the cells becoming stuck in their small capillaries and causing all sorts of problems. These individuals are sick. Now it turns out that if you happen to live in an area that has a high prevalence of malaria, the heterozygous individuals are more resistant to malaria. This is a story I'll get to probably in the next lecture. Okay, just keep that in mind. The genotypes, of course, I'll write those over here. We have AA, we have AS, these are the carriers, and you have SS. Now in a particular study where they counted up the number of individuals in a population in equatorial West Africa, they found that there were 25,374 individuals that were homozygous for A.
They found 5,482 individuals that are heterozygous, and then they found 67 individuals that were homozygous for the sickle cell allele, which means that if you go ahead and do the total, there's a total of 30,923 people that they surveyed, which means that the frequencies of these genotypes are 0.821, 0.177, and 0.002. And we can go ahead and calculate the frequencies of the two alleles, and I get the allele frequencies to be 0.909 for A and 0.091 for S. Now, you should be able to calculate these numbers yourself. This would be something you might try out on your own when you get back to your dorm or your apartment: using these genotype counts, make certain you get these genotype frequencies, and then make sure you get these allele frequencies as well. And if everybody in the class gets a different result, then come to me and I'll admit I made a mistake, okay? I don't think I did. So what are the predicted frequencies under Hardy-Weinberg for these individuals? So let's say the HW prediction. Here's our genotypes, and it's gonna predict that the frequencies are 0.826, 0.165, and 0.008. Now, once again, you have to do a statistical test to determine whether these frequencies are different. I'll tell you that they are, and one way of seeing the implication of that is to change these predicted frequencies into expected numbers, okay? Call it the expected number of people in each class. So for instance, if we take 30,923, was that the exact number? Was that the total? Yes, it was. Times 0.008, we predict about, well, we predict 247.4 people, all right? We can have expected numbers like that which include a fraction of a person, just like the mean household has what, 2.2 kids or something like that, or 1.8 kids, I forget what it is. It's not like they literally have 8 tenths of a kid around, that's gross.
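The sickle cell numbers worked through here can be cross-checked with a short sketch that goes from the genotype counts to Hardy-Weinberg expected counts. The variable names are invented for this example. Because the sketch keeps unrounded frequencies, its expected counts differ slightly from the board figures, which multiply the total by frequencies rounded to three decimals (for example 30,923 times 0.008).

```python
# Genotype counts from the West African survey quoted in the lecture.
counts = {"AA": 25374, "AS": 5482, "SS": 67}

n = sum(counts.values())  # total number of people surveyed

# Allele frequencies by the counting method: two alleles per person.
p = (2 * counts["AA"] + counts["AS"]) / (2 * n)  # frequency of A
q = 1.0 - p                                      # frequency of S

# Expected number of people in each genotype class under Hardy-Weinberg.
expected = {"AA": n * p * p, "AS": n * 2 * p * q, "SS": n * q * q}

print(f"n = {n}, p = {p:.3f}, q = {q:.3f}")
for genotype in counts:
    print(f"{genotype}: observed {counts[genotype]:>6}, "
          f"expected {expected[genotype]:>9.1f}")
```

The pattern to look for is a shortfall of SS individuals and an excess of AS individuals relative to expectation.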
That's the average, right? So let's go ahead: 30,923 times 0.165, that's 510.3, and then 30,923 times 0.826 equals 25,542, okay? Now, the real deviation is right here, where you actually would predict about 250 people to have the SS genotype, and yet you only see 67, okay? Yeah? That's off by a factor of 10, of course, it's gotta be about 5,000. What did I do wrong? 30,923 times 0.165, ah, I misread it, it's 5,102. Don't do that in an exam, you lose points that way. Fortunately I don't have to take exams anymore. Anyways, the real deviation is right here. There's too few of these SS individuals, and for the heterozygotes we have about 5,100 expected versus about 5,500 observed. You actually have more heterozygous individuals than you'd expect, okay? And we'll discuss this, like I said, in the next lecture, but this is a pattern that's predicted for this particular gene in Africa, given that the heterozygous individuals have a fitness advantage and the SS individuals are sick, okay, they're selected against. Okay, are there any questions? This is the end of lecture three, which took me much longer than I thought. Are there any questions up to this point? All right, no more statements of error on my part, okay? So what I wanna do now is this. We basically talked about what happens when nothing interesting's happening, right? We talked about when you just have random mating, but you don't have natural selection or anything else that would change allele frequencies, and what we expect then is the Hardy-Weinberg expectation. But it is worth asking, what types of processes cause allele frequencies to change in a population? And so I'm gonna list them and we're gonna discuss them one by one. One process we'll discuss is called mutation, okay? It's the change in an allele from one form into another, and this is the process that generates new alleles. And ultimately, mutation is what introduces variation into populations. We'll also talk about natural selection.
This is another process that will cause alleles to change frequency in a population. I'll talk about genetic drift, okay? Which is kind of the random element of evolution in terms of how allele frequencies change over time. And then we'll talk about migration. And remember, we're always thinking with respect to some particular population. So if you have a population of all big A individuals, for instance, then a mutation from a big A to a little a, obviously that'll change the frequency of the alleles in the population. If you have a population with several alleles floating around in it, and one of the alleles confers upon its bearers a fitness advantage, that is to say, maybe they survive better or they produce more offspring, then natural selection will change the frequency of that allele. Genetic drift, like I said, is a random component of evolution. And what was left unspoken in all this discussion of Hardy-Weinberg is that it assumes, I said, very large population sizes. Well, formally speaking, it assumes an infinitely large population size. So it's something I swept under the rug. Genetic drift is a factor that occurs in finite size populations, which is what all populations are. But it's basically random changes in frequencies. And then migration is basically when an individual from another population migrates into your population. And if it has a different allele than what's found in your population, then the allele frequencies change. It's that simple. So let's start off with mutation. There's just a few things I wanna say about mutation. There's different kinds of mutations. The ones that probably come to mind when you think about a mutation are single base pair changes or nucleotide changes. So here's my chromosome, right? So this is a stretch of double-stranded DNA. And it has a specific sequence of nucleotides, A's, C's, G's, and T's.
And let's say that most of the individuals have a T at some position in the genome. One type of mutation is changing that nucleotide to another type. So for instance, you have some sort of mutation where the T changes to an A. It's a problem or error in the replication of the chromosome, of the DNA. That's called a point mutation, or point nucleotide change. And you can ask, what is the rate at which these occur? So every time DNA replicates, every time you have a cell division, you have an opportunity for errors to be introduced into a chromosome, right? So the question is, what is the error rate for replication of DNA? It turns out in humans and lots of other eukaryotes, it's roughly 10 to the minus nine per site per replication. What does that mean? It means that every time the DNA replicates, about one in a billion nucleotides has a mutation, a flip. That's the error rate. And our genomes are roughly three billion base pairs, so that means basically every replication, you have roughly three or four mutations introduced into the new genome. Ultimately, it's things like this that cause cancer, right? Often people divide your cells into two kinds: those that are called somatic cells, which don't ultimately reproduce, and then your germ line cells, which are the ones that ultimately give rise to sperm or eggs. If you have a mutation in your germ line cells, then of course those are the types of mutations that can be passed on to your offspring. If you have a mutation in your somatic cells, like your skin cells or something, that's the type of mutation that could cause cancer, ultimately, if it happens to be a mutation in the regulatory genes that regulate your cell division. Anyways, that's roughly the mutation rate in eukaryotes. Viruses have a much higher mutation rate. For instance, HIV has a mutation rate which is about three times 10 to the minus five, I believe.
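The step from a per-site error rate to "mutations per replication" is just a multiplication, sketched below. The two rates are the ones quoted in the lecture. The genome sizes are round figures supplied here for illustration (about three billion base pairs for a human genome and roughly 9,700 for HIV), and the function name is hypothetical.

```python
def expected_mutations(rate_per_site, genome_size_bp):
    """Expected number of new point mutations per genome per replication."""
    return rate_per_site * genome_size_bp


HUMAN_RATE = 1e-9       # per site per replication (lecture figure)
HUMAN_GENOME_BP = 3e9   # ASSUMPTION: about three billion base pairs
HIV_RATE = 3e-5         # per site per replication (lecture figure)
HIV_GENOME_BP = 9_700   # ASSUMPTION: roughly 9.7 thousand bases

human = expected_mutations(HUMAN_RATE, HUMAN_GENOME_BP)
hiv = expected_mutations(HIV_RATE, HIV_GENOME_BP)

print(f"human: about {human:.1f} new mutations per genome per replication")
print(f"HIV:   about {hiv:.2f} new mutations per genome per replication")
print(f"per-site rate ratio (HIV / human): {HIV_RATE / HUMAN_RATE:,.0f}")
```

Per site, the HIV rate is tens of thousands of times higher than the eukaryotic rate, even though its genome is so small that each copy carries well under one new mutation on average.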
So one of the problems with HIV infections is that the viruses are evolving so rapidly that they can escape drug treatments: there's likely to be a variant in the body that is resistant to some particular drug treatment. This high mutation rate also helps the virus evade the immune system, right? The immune system is always trying to remove these viral particles, and the high mutation rate means that there's probably gonna be some fraction of the individual's HIV viral particles that will be invisible to the immune system because they have a different, say, coat protein. So anyways, that's roughly the mutation rate in something like HIV. Other types of mutations you can have, I already alluded to one, a deletion, right? The Delta F508 mutation that causes cystic fibrosis. That's caused when you have some segment of DNA that's no longer there, okay, that's removed. And along with deletions, you can have bits of DNA that are inserted, right? So you can have insertions, you can have deletions. There's an interesting deletion in a protein in some human populations, called the Delta 32 allele. It's basically a mutation in a receptor protein that HIV uses to get into a cell. And individuals that are homozygous for this deletion, it's a 32-base pair deletion, Delta 32, are more resistant to HIV. They don't get the disease as easily as people that are heterozygous for it or that don't carry it at all, okay? Anyway, so you can have insertion and deletion type mutations. You can have duplications of chromosomes. So you can have an entire chromosome become duplicated. And so you may have heard of trisomy 21. This is having three copies of the 21st chromosome. And individuals that have trisomy 21, they have Down syndrome. And there's a number of diseases that are caused by trisomies of different chromosomes. This one is of the 21st chromosome.
With some trisomies, we can imagine that what happens is you have a problem with the dosage of the genes. You have more of the gene products than you should. And there's trisomies of certain chromosomes that so disrupt the development of the individual that they don't survive. Yes, that's a point mutation. Yeah, so we can call that a point mutation. So you can have entire chromosomes duplicated. It turns out that, at least in terms of trisomy 21, in the US population, roughly one in 1,000 births have the trisomy, okay? And it's actually a mutation that's related to maternal age as well. So it's at higher incidence among older mothers than among younger mothers, okay? And there's another thing that can occur. You can have not just one chromosome be duplicated, but you can have the entire genome duplicated. And this is something that's especially prevalent in plants. So many plant species are very similar to their close relatives, but they just have a different number of chromosomes. And so for instance, bananas, they're triploid. That is to say they have three copies of each chromosome. And this triploidy can be caused by a mutation during meiosis. Normally during meiosis, you have a reduction in the ploidy of the cell, going from every cell having two copies, one from mom, one from dad, to having your gametes, your sperm and eggs, have one copy of each gene, either from mom or from dad. But if you fail to have this reduction during meiosis, you can have gametes then that are diploid instead of being haploid like they should be. And if you have a successful fertilization between, say, a diploid gamete and a haploid gamete, then you would get a triploid individual being formed. And like I said, that happens much more frequently in plants. So that's about all I want to say about mutation. I want to give you a flavor for the rates of mutation, at least point mutations.
Those are probably numbers you should at least remember, because it's a hard fact you can latch on to that will help you interpret the literature, even newspaper articles. So I wanted you to know something about the different types of mutations that can occur, and their rates, at least for point mutations. And just remember that ultimately mutation is what causes genetic variation. Remember, evolution doesn't occur unless you have variation in a population on which natural selection or these other processes I mentioned can act. If you don't have variation, you don't have evolution. Okay, what I'd like to do now is turn to natural selection. This was Darwin's big idea, and in the 1920s, people started to generate the mathematical theory behind how natural selection should operate in a population. And a number of things were discovered at the time. One thing that was discovered is that even very small differences in fitness between individuals, even having like a 1% advantage over the other guy, can cause significant changes in allele frequencies over the course of a couple hundred or a couple thousand generations. Now to us, a couple hundred or a couple thousand generations seems like a really long time, and of course it is to us. But from a geological perspective, in terms of how much time evolution has to play with, that's nothing. A couple hundred generations is just a blink of the eye in terms of the history of the earth, right? So that was one thing that fell out of the mathematical theory: even small differences in fitness can cause large differences in allele frequencies given enough time. I think what we'll do is we'll start with that. We just have like one or two minutes to go, and so it's probably not worth starting the discussion yet. So we'll start with natural selection next time.
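As a rough illustration of the claim that a 1% fitness advantage adds up over many generations, here is a minimal sketch of a deterministic selection model. It is not the derivation from the lecture or the book. It assumes the simplest possible case: two alleles, an infinitely large population (so no drift), and selection acting directly on alleles, where the favoured allele has relative fitness 1 + s and the other has fitness 1. The starting frequency of 1%, the value s = 0.01, and the names `next_frequency` and `trajectory` are all assumptions chosen for the example.

```python
# Deterministic selection on two alleles (simplified, assumed model).
# The favoured allele has relative fitness 1 + s, the other allele 1.
# With frequency p, mean fitness is p*(1 + s) + (1 - p) = 1 + p*s.

def next_frequency(p: float, s: float) -> float:
    """Frequency of the favoured allele after one generation of selection."""
    return p * (1 + s) / (1 + p * s)

def trajectory(p0: float, s: float, generations: int) -> list[float]:
    """Allele frequency at generation 0, 1, ..., generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs

# Assumed values: the allele starts rare (1%) with a 1% advantage.
freqs = trajectory(p0=0.01, s=0.01, generations=2000)

for gen in (0, 200, 1000, 2000):
    print(f"generation {gen:4d}: p = {freqs[gen]:.3f}")
```

Under these assumptions the favoured allele goes from 1% to roughly 7% after 200 generations and is above 99% by generation 1,000, which is the sense in which a small advantage matters given enough time.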