Okay, so let's start, so we can finish before lunch. Before I forget: the ever-important sheet. If you can't hear me, or you hear an echo, that's feedback on the microphone; if somebody here is regulating this, please fix it. What Andrea did yesterday is reduce the volume, so let's see if this is better. There's still a bit of it, but we can figure it out.

So yes, we're talking about evolution, and we're going to try to understand what the different forces of evolution that we defined last time do. We're going to start with mutation and selection, and first we'll try to understand the simplest case: the deterministic dynamics of mutation and selection, just to get it down. That means we're in the limit of large N, the limit of large population sizes.

I'm going to start using the word allele; I don't think I can stop myself from using it, so I should explain what it means. This is really the way we think about it: we have many, many genomes, the genome from organism one, two, three, four, and so on. Most organisms have one given base pair at a given position; say we have an A. The way this notation works is that you don't write the A, you just leave it blank. And if somebody has something else at that position, say a C, then you don't write the C either; you just put a cross, saying it's a mutation,
which means that individual has something else. What it really means is that everybody has an A and this one has a C. The A will be called the dominant one, here meaning the most common one, and the C will be the mutant. So there's a dominant allele, say the A base pair, and the C is the mutant allele.

Now, historically: today we know that these are base pairs, and that mutations are actually changes in base pairs. But when people first started thinking about this, they didn't know about DNA, they didn't know about base pairs, they didn't know about any of this. There was Mendel crossing his plants; you've all heard about him, the Czech monk. He just said there are two types: the dominant type, which is a capital A (not the same thing as this A, the base; it just so happens to be the same letter), and the mutant type, which is a little a. So genotype just means it's something genetic, but you can really think of it as a type.

In a way this reduces the problem to a binary problem, because you have the main one and then you have the other one, which is less frequent, at least in principle. Then you define the frequencies of the two types. The idea is that, at least at the beginning, this one will be the more common one, and the question is whether the other one can take over and become the more common one.

Just for notation, and we'll get back to it: traditionally people called the frequencies p and q. So sometimes in the literature you'll just see a p and a q, and sometimes they won't even bother to explain that it's a frequency, because if you're a geneticist, p and q automatically mean frequency to you. Just a notational aside.

Of course it can happen that somebody else here has not a C but a T or something else, but in the large majority of cases
you really have the dominant one and then a mutant that appears, and at any given time there'll be a small number of mutants. At least, that's the approximation we're going to work in today. So we only need to worry about one other type, and the problem is really binary.

Of course this is just one position, and you have many positions. So you can imagine somebody has AA, which doesn't mean they have the same base pair next to it; it just means that at the second position they also have the dominant one. And somebody else will have Aa, and so on. It's just a reduced notation. You can define frequencies of all these things, and we'll get there in a second, but let's first think about just one position. This genotype is also called an allele; it's just a genetic word meaning type, at least for the purposes of us physicists.

We are not going to worry about the fact that we have two sets of chromosomes. We as humans are diploid: we have two copies of each chromosome. Haploids have one copy. There are haploid cells in our body too: all our gametes are haploid (gametes are the reproductive cells, so sperm and oocytes). But there are also many organisms that have more than two copies; goldfish, for example, have, I don't know how many. That's just a sort of fun aside.

Let's get to working out these frequencies. We have some population that carries the big A, and there are n1 of them, and some population, n2 of them, that carries the little a. They can mutate into each other: if the big A has a mutation it becomes little
a, and an organism with little a, if it has a mutation, becomes big A. This happens with rates μ1 and μ2. Other than that they reproduce, because that's what organisms do, and they reproduce with rates γ. So γ_i is the growth rate, and it's an effective growth rate: it includes both birth and death. We're not going to worry about the details; it's just how fast this population grows on average.

So then we can write down simple deterministic equations for how these numbers change. How does n1 change? You can mutate out of n2 to become n1, you can mutate out of n1 to become n2, and you grow: dn1/dt = μ2 n2 − μ1 n1 + γ1 n1. Same thing for n2: you mutate from n1 to become n2, you mutate from n2 to become n1, and you grow with your rate: dn2/dt = μ1 n1 − μ2 n2 + γ2 n2.

We're going to make an assumption that is very often made in population genetics: that the population size is constant, so n = n1 + n2 is constant. This is called constant population size. We can discuss whether this is reasonable, and for what kind of population, but for example in the experiments that I'll tell you about it is roughly true, because you can keep it fixed in an experiment. And each individual has to fall into one of these two groups.

What we're going to do now is actually enforce this on the equations, and we enforce it through the form of the rates. We assume that each rate has a part that really comes from linear growth and a part whose only purpose is to keep the population size constant: γ_i = r_i − f(n1, n2)/n, where r_i is the linear growth and f is chosen to keep n constant. So let's figure out what f has to be for n to be constant. So n constant means that
dn/dt is zero: the sum of n1 and n2 doesn't change in time. So we have to calculate what that is, and to do that we sum the two equations. You may have noticed that the mutation terms cancel out, so we're just left with γ1 n1 + γ2 n2, which by our definition is r1 n1 + r2 n2 minus the f terms; for those I have (n1 + n2)/n, so n over n, and I'm just left with f. That directly gives me an expression for f(n1, n2): it has to be r1 n1 + r2 n2.

So now I can plug this in. Also, because of the constant population size, we realize that we don't actually need both n1 and n2. We only need one of these variables, because the other one is then fixed. So I can define x = n1/n and rewrite the equation for x: dx/dt = μ2 (n − n1)/n − μ1 n1/n + r1 n1/n − (n1/n) [r1 n1 + r2 (n − n1)]/n.

Let's rewrite it in terms of x. From here I have μ2 − (μ1 + μ2) x, so that takes care of all the μ terms, and then I have r1 x − r1 x² − r2 x + r2 x². Let me collect terms: x (r1 − r2) − x² (r1 − r2), and collecting further, x (r1 − r2)(1 − x). So what appears here is a difference of growth rates, and a difference of growth rates is selection: how much better some one
individual grows than the other is the selective advantage it has, because that means it'll spread faster, it'll grow faster, and so there'll be more of them. So we'll call s = r1 − r2 the selection coefficient, or the selection strength if you prefer; same thing. And then we have our equation, which is now simple: dx/dt = μ2 − (μ1 + μ2) x + s x (1 − x).

This is the basic deterministic equation for mutation and selection. The important term is the last one, which has a very characteristic form: the x(1 − x) nonlinearity. It comes essentially from the constant population size, from the fact that if the n1 allele increases, if x goes up, then the other one has to go down. That introduces competition. This term will come up over and over again wherever you go in population genetics. This is a simple derivation to help you understand where it comes from, but it essentially reflects the fact that alleles have to replace each other.

So we can analyze this equation. First, what happens if there's no selection? That's easy (well, all of this is easy). If there's no selection we get rid of the last term, and at steady state, dx/dt = 0, we have μ2 − (μ1 + μ2) x = 0. So we get a steady-state solution given by the relative mutation rates, x = μ2/(μ1 + μ2). You just get a balance of mutation rates: you have the big A and the little a, you have the μ rates between them, and you find an equilibrium for x.

Then, no mutation. We have dx/dt = s x (1 − x). The function x(1 − x) looks like this: it's symmetric, and zero at both ends. So you have two steady states, one at zero and one at one. The one at zero means extinction, and the one at one is called fixation.
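As a rough numerical check of the two limiting cases just discussed, here is a minimal sketch that integrates dx/dt = μ2 − (μ1 + μ2) x + s x (1 − x) with a simple forward-Euler step. The function names, the step size, and the parameter values are illustrative assumptions, not something taken from the lecture.

```python
def dxdt(x, mu1, mu2, s):
    """Right-hand side of dx/dt = mu2 - (mu1 + mu2) x + s x (1 - x)."""
    return mu2 - (mu1 + mu2) * x + s * x * (1.0 - x)


def integrate(x0, mu1, mu2, s, t_max, dt=0.01):
    """Forward-Euler integration from x0 up to time t_max; returns the final x."""
    x = x0
    for _ in range(int(round(t_max / dt))):
        x += dt * dxdt(x, mu1, mu2, s)
    return x


# No selection (s = 0): x relaxes to mu2 / (mu1 + mu2), whatever the start.
x_mut = integrate(x0=0.9, mu1=0.02, mu2=0.01, s=0.0, t_max=1000.0)

# No mutation, s > 0: a rare type sweeps towards fixation (x = 1).
x_sel = integrate(x0=0.01, mu1=0.0, mu2=0.0, s=0.1, t_max=300.0)

# No mutation, s < 0: the type declines towards extinction (x = 0).
x_ext = integrate(x0=0.5, mu1=0.0, mu2=0.0, s=-0.1, t_max=300.0)

print("mutation only :", x_mut, "vs mu2/(mu1+mu2) =", 0.01 / 0.03)
print("selection, s>0:", x_sel)
print("selection, s<0:", x_ext)
```

Forward Euler is used only because it is the shortest thing to write down; with rates this small compared to 1/dt it is stable, but any standard ODE solver would do the same job.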
Right: if you die, you die, and in this framework you can't come back. The same for fixation, which is also an absorbing boundary, because once you take over the population there's no other type left. And which one you reach depends on s: dx/dt is larger than zero if s is larger than zero, and then you have certain fixation, which means that n1 dominates. Conversely, if s is smaller than zero, n2 dominates.

And then you have the case where both mutation and selection are different from zero. Then at steady state you essentially have a quadratic equation to solve. You can solve it; the solution is not incredibly insightful. But in the limit μ1 = μ2 = μ, with s positive and μ much smaller than s, it goes roughly as 1 − μ/s. This fixed point is called mutation-selection balance, and that's what you have in a lot of cases. So that's just a simple analysis of these equations.

Now we're going to do something quite algebraic, but the result is useful to know, so we're going to do it. It's an incredibly simple calculation, but it explains the basis of the rest of what we'll be doing.

Yes? Sorry, yes: I set μ1 equal to μ2 just to give you an idea of the scale. The reason it's called mutation-selection balance is that it gives you a mutation rate over a selection coefficient. And obviously this thing is between zero and one, which is maybe the only interesting thing about it.

Okay, now what happens if we have diploids? Diploids are organisms that have two copies of each chromosome, like ourselves. You have two parents and you get one allele from each of your parents, so you can have any of these three combinations, AA, Aa and aa, and
there are frequencies for all of them: f_AA, f_Aa and f_aa. Does it actually matter what these are? Maybe not. So we're now going to try to calculate what the frequencies are in the next generation if these individuals mate. We're going to assume random mating, discrete generations, and infinite population size, which allows us to stay in this deterministic regime. And we define two frequencies: p, which is the overall frequency of the dominant allele, capital A, so p = f_AA + f_Aa/2, and q, which is 1 − p.

Then we're going to actually make these things make babies, allele babies. So we write out the mating table. These are the assumptions of our mating game, which are again completely unrealistic, probably even if you're a yeast in a test tube. (Sorry, I misspelled "mating", which probably won't help.) The table has the parents, the frequency of the pairing, and the progeny; progeny means kids.

If AA and AA meet, the only possible outcome is AA, and the probability of these two meeting is f_AA squared. Everybody agrees? Good. Some vocabulary: AA is called a homozygote and Aa is called a heterozygote, hetero because the two alleles are different and homo because they're the same.

Now AA meets Aa, and things can get more interesting. It's like eye color: if both parents have blue eyes, the child is going to have blue eyes, but if they meet and one of them doesn't have blue eyes, life becomes more interesting. You can still get an AA, and you can get an Aa, and you get each of them with probability one half. But now the probability of this pairing is f_AA times f_Aa times two, because either parent can be either type: this can be parent one and this can be parent two, or
this can be parent one and that can be parent two, so there are two ways of getting this combination.

Same thing here: now two homozygotes meet, AA and aa, but they're two different homozygotes. There's only one possible outcome, Aa, but since the parents are different there are still two ways of this happening, so the probability is 2 f_AA f_aa. I'm going to fill in the rest of the table; you get the idea. When Aa meets Aa we have all possible outcomes, AA with one quarter, Aa with one half, and aa with one quarter, but since the parents are the same the probability is just f_Aa squared. As I said, this is a silly calculation, it's not very profound, but the result is actually interesting. So this is our mating table. (This one is one half? Yes, you're right. Good, so you know how the game works.)

Now we want to find the frequency of AA in the next generation; I'll call it f_AA(t + 1). We just have to sum up every place where an AA appeared. We have the square, f_AA²; we have one half times 2 f_AA f_Aa; and the last one is one fourth f_Aa squared. And that's it. (I was confused about the factor of two myself for a moment, but it makes sense: it works with the two.) So f_AA(t + 1) = f_AA² + f_AA f_Aa + f_Aa²/4, which is (f_AA + f_Aa/2)², and from the definition of p this is p squared.

Let's do the second one, and let's not do the third one, because you get the idea. Now we're looking at the frequency of the heterozygote, Aa. That's 2 f_AA f_aa, plus one half times 2 f_AA f_Aa, plus one half f_Aa
squared, plus one half times 2 f_Aa f_aa. If you bring this together, this is 2p(1 − p). And if you do the third one, f_aa(t + 1), the same way, which we won't do, you get, not surprisingly, (1 − p)².

So why is this interesting? Here we got p², here we got 2p(1 − p), and here we got (1 − p)². So all the frequencies in the next generation are described by one parameter. The frequency of any genotype at t + 1 is some function of p, the frequency of the A allele in the previous generation (p or q, it doesn't matter, since it's symmetric). So although these are diploids, and although there's mating and all this, and of course this is only true because of random mating, all you need to know is how many big A's there are in this generation to know how many AA's, or how many heterozygotes, there will be in the next generation. To predict the frequency of any genotype in the next generation, all you need to know is the frequency of the allele in the current generation.

This is a result which has an extremely fancy name for what it is: it's called Hardy-Weinberg equilibrium, or just HW equilibrium. The reason it's important is that we can forget about all this mating and all these complications, at least in models like this one, and just look at the frequency of the allele. That's why from now on that's all we're going to do. And essentially, for nearly the last hundred years, that's all population genetics has been doing. Of course there are more complicated things that can happen; nobody, not even yeast, mates randomly, and so on. But for many explanations of what goes on this is a good model, and that's why we're
sticking to it, and that's why we went through the pain of this, as I said, essentially boring calculation: it's an important result that motivates what will come later.

Soon we're going to move to a stochastic description of all of this, now that we've done the basics. But before we do that, before the break, I'll show you some results. Another important thing, as we've sort of shown with the previous calculation, is that a very likely outcome for any allele is that it'll die. In fact you'll see that when we take stochasticity into account, that is essentially the fate of every mutant in time: a mutation appears, it grows, and unless it fixes it will most probably die after some time. So the fate of each individual mutation is definitely not a steady state. But we can still look at steady-state distributions of mutant frequencies: we can look at the distribution at the population level, and we'll see that there are things that on average are in steady state.

Just to show you some of the things that can actually happen. The first one up there is what's called a simple selective sweep, and that's what we described with the deterministic model, when things either go extinct or fix. There's the A allele, and then suddenly an S mutant comes about. The S mutant is good, and it starts to spread. Up there you see a picture of it actually spreading, and below you see the frequency of that mutant. You see that it fixes, and that means that A has to go extinct. These are cartoons drawn based on yeast experiments that we'll talk about in a second.

But then another thing can happen. This is called clonal interference, and we'll talk about it tomorrow in more detail. The same thing happens, S appears, and then before S manages to
take the whole population and fix, another mutant appears, C, and C is way better than S, and it outcompetes S. It's called interference because there are now two clones, C and S, directly competing. And then you can have more complex dynamics, because another mutant can appear on the background of C or on the background of S, and if you look at the frequency traces you start seeing funky things like that.

Then there's something called frequency-dependent selection, where the two mutants, C and S, can in fact coexist. They're comparably good; one is slightly better than the other, but not good enough to take over the whole population, and you see this steady equilibrium.

All of these things happen, at least in the lab, and we'll talk a bit about the differences between the lab and the not-lab. But one important parameter, to motivate what's going to come now, is population size. The population size here is depicted by the width of the lines, the number of individuals you see. If the population size is relatively small, then it's quite easy to sweep: if you're good, you're going to take over. But if the population size is larger, then other things can happen before you manage to sweep, and you get this clonal-interference type of dynamics, which is what happens in the other three panels.

Time-wise, I guess it's break time. So we take ten minutes and we come back.

So, before we go back to calculations, one last thing, about an experiment. You can ask, and people did ask in the past: how do we know that this picture is right, that mutations happen and then selection acts on these mutations? How do we know, specifically, that it isn't the other way around, that there's some selection pressure and that pressure induces the mutations? So in the 40s a physicist called Delbrück and
a biologist called Luria did an experiment to show this. In the 30s people basically had no problem with this being true for higher organisms, like us or giraffes, which is what Lamarck was considering: that the giraffe's neck grows because the giraffe wants to eat. People said, okay, we know it doesn't happen that way for them, but for bacteria and viruses maybe it actually does; maybe when there's a selective pressure the bacteria react. So they did this experiment, which is known as the Luria-Delbrück experiment, in the 40s. It's really one of the most classic experiments, and it's also a beautiful example of how probabilistic thinking, or physics, can be used to understand a real-life phenomenon.

They said there are two possibilities: one is induced mutations and the other is spontaneous mutations. Spontaneous mutations means that mutations happen all the time, at any given time. So in this population they can happen late, here; in this population they can happen earlier. These are dividing bacteria: here nothing happens, here a mutation happens late, and here one happens early. Then at the end of the experiment they apply some selective pressure, and the number of individuals that have the mutation, and so survive, depends on when the population had its mutation. Here nobody survives; here this one survives; but here the mutation happened early on, so many individuals survive, because all the offspring of that individual carry the mutation.

In the induced-mutation scenario, nothing happens unless there's actually stress. Only at the end, when the stress comes, the antibiotic in this case, do the bacteria start to produce the mutation. And intuitively, the difference between these two scenarios shows up if you look at how many bacteria survive.
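One way to get a feeling for the two scenarios is to simulate both and compare how the number of surviving mutants varies from culture to culture. The following is a toy sketch, not the analysis of the original experiment. It assumes each culture starts from a single cell and doubles synchronously, that a daughter cell mutates with probability `mu` at division (spontaneous case), and that every final cell mutates independently with probability `p` when the stress arrives (induced case). All names and parameter values are hypothetical; `p` is set to `G * MU` only so that both scenarios have roughly the same mean.

```python
import random
import statistics


def spontaneous_culture(rng, generations, mu):
    """Mutant count when mutations arise at random during growth."""
    sensitive, mutants = 1, 0
    for _ in range(generations):
        mutants *= 2  # existing mutants divide and breed true
        daughters = 2 * sensitive
        new_mutants = sum(rng.random() < mu for _ in range(daughters))
        mutants += new_mutants
        sensitive = daughters - new_mutants
    return mutants


def induced_culture(rng, generations, p):
    """Mutant count when each final cell responds to the stress independently."""
    final_cells = 2 ** generations
    return sum(rng.random() < p for _ in range(final_cells))


def variance_over_mean(counts):
    """Ratio of sample variance to mean; close to 1 for Poisson-like counts."""
    return statistics.variance(counts) / statistics.fmean(counts)


rng = random.Random(1943)          # fixed seed so the run is reproducible
G, MU, CULTURES = 10, 0.002, 500   # illustrative values only

spont = [spontaneous_culture(rng, G, MU) for _ in range(CULTURES)]
induc = [induced_culture(rng, G, G * MU) for _ in range(CULTURES)]

fano_spont = variance_over_mean(spont)
fano_induc = variance_over_mean(induc)

print("spontaneous: mean", statistics.fmean(spont), "variance/mean", fano_spont)
print("induced    : mean", statistics.fmean(induc), "variance/mean", fano_induc)
```

In the spontaneous case a single early mutation founds a large clone, so a few cultures end up with very many mutants and the spread between cultures is much larger than in the induced case, where the counts stay close to their mean.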
There's a striking difference between these two scenarios. In the induced-mutation case you don't have a lot of time to produce the mutations, so in every colony, on every one of your experimental plates, you'll have roughly similar numbers of mutants, and in fact they'll be Poisson distributed, because they're just random appearances. But in the spontaneous case, if you look at the different colonies you'll have very different numbers of mutants, because the number depends on the history of when the mutation happened. So in the induced, or acquired, mutation hypothesis you'll have a low variance between plates, and in the spontaneous mutation hypothesis you'll have a high variance. That's the basic idea.

You can quantify that; you can do the calculation for fun if you're interested. In the induced scenario, at the moment the antibiotic is applied every cell has to decide: will I have a mutation or will I not? And every cell does this independently. So if you ask every cell independently, do you have a mutation or not, what distribution is that? A cell is like a coin being flipped, so it's binomial. Right, so you have some instinct in you. And the limit of a binomial distribution, for many cells and a small probability, is a Poisson distribution. So in this case you'll see a Poisson distribution of bacteria that had a mutation.

Now the other case, where mutations appear randomly in time. The mutations are still Poisson distributed, because it's still the limit of the binomial: every cell, at every point in time, asks itself the question, will I mutate or will I not mutate, and some of them mutate. So at every point in time there's a Poisson distribution of
mutations. But you're not asking the cells at every point in time; you're asking at some later time. As we saw in the diagram, for the cell that had a mutation early, all of its offspring now have the mutation. So you need to propagate this Poisson distribution in time, and at the end of the day, when you ask the cells with the antibiotic, they'll have a different distribution from a Poisson distribution. This distribution is called the Luria-Delbrück distribution, so named because this was the first time it was derived, and it's a long-tailed distribution. We're not going to derive it, because it takes quite a bit of time and it's actually very tricky. But the basic idea is that this one is Poisson and this one is long-tailed, with large variance.

So if I now do the simplest thing possible and look at the ratio of the variance to the mean, here I should get one and here I should get something larger than one. If you actually do it on the example in this diagram, you can calculate the variance over the mean here and here: here you'll get one, and here you'll get something larger than one. You won't get the completely right answer for the Luria-Delbrück distribution, because it's just two generations, but it works out.

This is what the plates look like; this is a picture from the 1943 paper, where they took real-life photographs of the plates. The first thing they did is test for plating bias: when they actually take the bacteria and apply the antibiotic, do they have any sampling bias? Again, if everything is done right, sampling should be Poissonian, so the ratio of the variance to the mean should be roughly one. And it is, within experimental error. You see that the middle one isn't perfect, but the other ones are actually pretty amazing, given that these are live experiments. So then they actually did the experiment and they
looked at the average per sample and the variance, corrected for sampling. If you now look at the ratio of the variance to the mean in these experiments, you see that it is very different from one: the variance is really much larger than the mean. And this is the distribution that they measured, compared to the theoretical distribution, in white.

Then they did something else based on this experiment: they wanted to calculate the mutation rate of the bacteria, and they did it in two ways. One way was from the distribution they derived, and we won't go into that. But the other way was pretty nice and extremely simple, so I'll show you how it works. As we said, mutations, when they appear, are Poisson distributed, and the expected number of mutations is the mutation rate times the number of cells. So the probability of seeing a plate with zero mutants is just the zero term of the Poisson: p0 is e to the minus that expected number. That means I have an explicit expression for the mutation rate. Here a is the mutation rate, and p0 is the ratio of the number of zero-colony plates to the total number of plates; they repeat this on many plates.

They can't use any of the information from the plates that had a colony with one mutant, two mutants, four, and so on, because those have already grown in time, so they don't know when the mutations appeared, and those counts are not from a Poisson distribution. To use them they do what's called method two, where they take the full distribution they derived. But without even doing that, they know what to expect for the plates with zero mutants: given a mutation rate, what's the probability of seeing a plate that has zero mutants.

Yes? Sorry, speak up. Yes, exactly: they need a huge number of plates. So this is a
less accurate method. (I don't actually remember which one they call method one.) I'm not going to argue this is the best way; it's an easy way of doing it. The problem is that you need a lot of zero-colony plates to get good statistics, and that's very hard to do. In fact this is so easy that you could do it yourselves, if we had a lab here, or on your day trip, if you can convince somebody to let you do it. And I guarantee you that the number you get from doing it this way and the number you get from the Luria-Delbrück distribution will be quite different. But in principle it's a neat idea. The reason the Luria-Delbrück distribution is hard to derive is that you actually have to figure out the probability distribution of the times when the mutations happened, and there's always some unknown about that. You can have a model, but you put in assumptions about how the colony grows: you just assume that it grows exponentially, and it doesn't always have to grow exponentially. So it's just to say that if you had a huge number of plates, this is a very clean and simple way. But you're completely right that that is the main problem with doing it this way.

Okay, another experiment. Fast forward to the 1980s, and a guy called Rich Lenski. People have been saying things for a long time like: oh, if we could replay the tape of life. This is a big question: we're here, we've evolved, but if we could replay the tape of life and start evolution from the beginning, would we look the same? And Rich Lenski said, oh, shut up and stop asking, let's just do it. Of course he's not going to evolve us, or dinosaurs, or anything like that, but he can
do it with bacteria. So he took E. coli K-12, the world-famous K-12, which I can tell you more about in a second, and it's a strain of E. coli, and he said, I'm going to evolve them in my lab. He started in the 80s and he's still going, so now I think we're at 60,000 or so generations. What he does is he lets them grow, then he samples them and freezes the sample; basically he takes the samples from the top, so the ones that grew the best, puts them in a new flask, lets them grow, puts them in a new flask. But of course he does not do this himself, so if you're considering doing your graduate work or a postdoc in the Lenski lab, you should know that you will live on the bacteria schedule. And there are these huge fridges, you can Google this, and he freezes them. So now if you go and say, hey Rich, I want a sample from you from 1993, which you're not going to say like that, but, I want one of your two-thousand-generation lines, he puts it in a FedEx envelope and mails it to you, okay? And then you can start your experiment. So yes, now we're over 60,000 generations. The amazing thing about 60,000 generations is that for humans that's 1.5 million years, so that predates the emergence of Homo sapiens on Earth. Okay, so his bacteria have had a lot of time to do funky stuff. So this is called, well, by everybody it's called the Lenski experiment; by Lenski himself it's called the long-term evolution experiment. And the other thing he does is he then competes his strains. So he takes the ancestral strain and he takes some evolved strain from whatever generation, and he puts them in one flask, and he comes back after some time and he plates them, so he puts them on a plate, and then he sees which one grows better. Okay, so from that he can actually measure fitness. Fitness is a word for growth rate in this field, basically how fast they grow. So he can measure fitness as a
function of time; that's what these lines show for two strains. And his initial idea was also: this will allow me to measure mutation rates and selection coefficients and understand what's going on in evolution. What I think he's understood, and the community has understood, is that it's way more complicated than anybody imagined in the 1980s. Here are some really simple examples of probably what they did imagine. So there's the idea of a fitness landscape. Okay, so there's a landscape, you can think of it as essentially an energy function, and if you're higher on this landscape then you're fitter, you'll outgrow the others. But of course you can imagine having a rugged landscape with different peaks, so that there's no one global optimum. And so this is what goes on here: the red one and the green one have found different solutions, and the red one is still a bit better. Then there is the concept of generalists and specialists. A specialist is somebody who's good in one environment; a generalist is somebody who's good in many environments. So A, the blue one here, is a specialist of environment A, and B, the orange one, is a specialist of environment B, but A is bad in B and B is bad in A. But then there's this third one, C, the green one, who's not the best possible one in either of these environments, but it's pretty good in both, okay? And so he's seeing the evolution of both specialists and generalists in his data. So this is essentially the adaptive evolution experiment. Another picture of the long-term evolution experiment, this is what I've been talking to you about, is that you start off and you basically explore this landscape, and at least in a constant environment they're trying to find the best possible solution. The idea of all of this is that the Lenski lab is like bacteria
heaven, right? At least it's a place where it's a constant environment; it's not like anything has changed since 1986. Okay, I'll tell you tomorrow about one of the really funky things that they have evolved. But the idea is that generally the phenotype changes and the fitness increases with generations, and we do see the increase of fitness. We also do see a slowing down of the increase of fitness, and now people are trying to understand where exactly that comes from. And then you have the mutation accumulation experiments, which are something slightly different. That is where you say, I don't want to select for something, I don't want to select for the best thing, but I want to accumulate as many mutations as possible. And to do that people try to eliminate selection and impose randomness. So we're going to go to a limit that we're going to talk about right now, to basically make things as random as possible. And randomness in all of this comes, again, from small numbers, small population sizes, okay? And so what they do is they let the bacteria grow, but not to get to high population sizes; they stop it at still relatively small population sizes, and then they take one bacterium out, or a few, but a very small number, and put it on a different plate and let that start up the colony. And before any mutation actually manages to spread and take over, they do the same thing. So they really try to randomize by taking it very quickly and taking just one, so having a very, sorry, large bottleneck, it should say. Basically, well, it depends how you define bottleneck, but you want to take as little as possible. And then at the end of the day you end up with a population that has sampled many different mutations, because you've eliminated selection and enhanced randomness. So then if you look at the distribution of mutations, you have a distribution in that. So, okay, why don't we now talk a bit about drift. So how much time do I have? What's the official
until one? Okay. Okay, so what we're going to talk about now is called genetic drift, which is the world's worst name, okay? Because genetic drift is actually diffusion. You'll see it's the random term, but because the term comes from population genetics, which is more a field of math than of physics, it's been called that. But you'll see it's essentially a diffusion term. Okay, so in the previous hour we talked about large population sizes, but populations of course are not infinite. And in fact the most important thing, even more so than that populations are not infinite or very large, is that each allele has to start at a small frequency, because when a mutation happens it has to start with one mutant, right? There's no other way. So somewhere in the evolutionary process we always have the number one, and the number one is always much smaller than the population size. There's no way around it, okay? So that's why evolution is inherently a random process. Okay, so we're going to go back, we're going to get away from our big A's and little a's, but just to do that: so we had n, and big A was n over N, and this one was one minus n over N in frequency, okay? We're going to still keep a constant population size, which is going to be N. And we're going to look at discrete time, well, maybe that's not so important. We're going to have overlapping generations, and we're going to choose one individual to die in each generation and one individual to reproduce. So let's imagine we start off with n plus one individuals and we want to get to n individuals. So the way we can do it is, well, we pick one of these n plus one individuals to die, and we have to make sure that nobody else gets born in this time. So this is death and this is no birth, which is equally important, because if in this period of time that we're looking at somebody else is born, then we're back to having n plus one. Okay, if we start off with n minus one, same
thing: somebody has to be born, so we have to pick that somebody, and then nobody can die, and that'll give us n. And then we can also have the situation where somebody is born and somebody dies from the same type, so that nothing changes, and it can be either one of the alleles, either the big one or the small one: somebody from that fraction can die and be born. Okay, so then we can calculate, based on this graph, the probability of the A allele having count n at time t; that's what this is. So given that it had n plus one at t minus one, then we have this scenario: n plus one over N, times one minus n plus one over N. Given that it was n minus one, then we have, sorry, this is t minus one, yeah, the same thing with n minus one. And then this is the middle line, when essentially nothing happens, but nothing can happen by both birth and death. Okay, so as you can suspect, we're going to write this in terms of frequencies now, and we're also going to do something else: we're going to measure time in the number of generations. So that means that P sub n of t is just going to go to P sub x of t, that is simple, but P sub n plus or minus one will go to P sub x plus or minus one over N, at t minus one, essentially, in the equations that we have. So what that means is that now, well, okay, not yet, this is just a corollary of that, but we're going to deal with this in a second. So we have, sorry, P sub x of t, and so I'm just going to rewrite everything. Okay, I'm going to try and not screw this up. Okay, and then the last term, which is easy: P sub x at t minus one, times x squared plus one minus x squared, and this we can rewrite, I'll write it here so that you see it more easily, as one minus two x times one minus x. Just algebra. Okay, so now we're going to do the same thing we were doing yesterday: we're going to Taylor expand. So I'm going to define a function f sub plus or minus one over N, which is P of x plus or minus one over N, at t minus one, okay? And I'm going to expand it, I'm going to Taylor expand around one
over N equals zero. Will it bother you tremendously if I put this x in parentheses now instead of as a subscript? No? So it's just a change of notation, but it'll be easier. I'm just going to do this; it's the same thing, nothing changed. Okay, I'll write it out: notation change for laziness. Okay, so I'm expanding what I have here. I have first the plus one, then the minus one, and then I have whatever's left over from here, so I have P of x, t minus one, minus two times, and actually I have x times one minus x, which is the same thing I have here, so this is two f of x. Okay, so now I'm going to proceed to cancel things. So the f's cancel, and the first-derivative terms, and this is t minus one, and I have P of x, t minus P of x, t minus one equals one over N squared, because these add, times the second derivative of f with respect to x. And so now I'm going to divide this by one over N, which is the same thing as multiplying here by N, and one over N is the generation time, okay? It's the time of one generation, right? I pick an individual, so it's basically the characteristic time scale of the problem: I have N individuals, I pick one from these N individuals, so that gives me a time scale of one over N. And I'm taking one over N to zero, so that means I take N to infinity. So then this limit just becomes dP of x, t by dt, and the square goes away, okay? And I can write out f precisely, so let me do that. So this is what we call genetic drift. And so what genetic drift is, is the randomness coming from small numbers in evolution and population genetics, from the fact that basically you always start with one individual. And even if nothing interesting really happens, we don't have any mutations here, we don't have any selection yet, we just have them reproducing, just from the small numbers you get a stochastic term, and this stochastic term is a diffusion term. That's why I said initially that it's like
diffusion. And now we can add to this, let me rewrite the full equation, we can add the terms we derived before. You can derive them in a stochastic way by adding them to the diagram we just wrote down and then redoing the calculation if you want, but what you will get if you actually add them, and you can see them because they're there in the stochastic version, is that you get selection, you get mutation, so you get the real drift, the physics drift. And the other thing to notice is the reappearance of our favorite term, x times one minus x, of course, because again the population size is constant, so if somebody dies, somebody has to be born. So this is the full equation with mutation, selection and drift. And you can solve this in steady state; you can actually solve it even out of steady state. Sorry, what do I mean by that? You can solve the full equation for the dynamics, and there's a special function solution, in terms of Gegenbauer polynomials. But okay, in steady state there's a simple solution. Let me set mu 1 equal to mu 2 equal to mu just for simplicity, but let me first write out the general thing so you see it. So then P of x is C x ..., and if we set mu 1 equal to mu 2 equal to mu, then we get ... So let's plot this. If we take the limit of small mutation rate, then, similarly to what we had before, the x times one minus x terms will dominate. This is N mu much less than one, and between zero and one we have a distribution P of x that looks like this. So most of the time the allele is either extinct or fixed, and then you have occasional excursions between the two, and then the other one can fix. In the opposite limit, when the mutation rate is large, we get a distribution like this. If s is smaller than the mutation rate, remember deterministically we got mu over mu plus mu, so we would get about a half; here we get a half plus s over 8 mu as the mean of this distribution, but
this is essentially, this is called near neutrality, because selection doesn't matter in this limit. Yeah, and this one is the extinction-fixation scenario. And then we also have N mu larger than one, but where selection is now strong, and this gives us back the mutation-selection balance again. These are all P of x's, okay? And so this is what we're going to talk about next week: this is a limit where you now have a lot of mutations and they're all being selected upon, and you essentially end up with many lineages at the same time, many mutants coming up and fixing. And for those of you that mentioned it, this is just a slide of a simulation showing that the effects of demography and selection, demography meaning a bottleneck, are very hard to distinguish. This is a distribution of how many times you see a derived allele, so the little a. In neutrality you see the neutral prediction, that it sort of decays and flattens out. If you have selection or a bottleneck, it will go up like that, so you get these characteristic U shapes, and these characteristic U shapes are what you actually see in data. But there is a huge question of where they come from and what their origin is, and since many factors give you the same thing, that's very hard to distinguish. Okay, so maybe one last thing. We won't talk about this much, but I'll maybe mention it in some experiments tomorrow: another big thing is of course recombination, so mixing of chromosomes. I just want to put that out there so that you are aware this is a major force of evolution that is present in every organism, including bacteria. Bacteria do share DNA and recombine the DNA; it's called horizontal gene transfer. Okay, so we'll end here for today, and tomorrow we'll continue with adaptation.
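
The birth-death scheme used earlier to derive genetic drift (one individual picked to reproduce and one picked to die per step, constant population size, no mutation, no selection) can be sketched as a short simulation. This is only a minimal sketch under those assumptions, not code from the lecture: the function names `moran_step`, `run_until_absorbed` and `fixation_fraction` and the parameter values are made up for illustration. The count of allele A goes up by one with probability x(1-x), down by one with probability x(1-x), and stays put with probability 1 - 2x(1-x), where x is the current frequency.

```python
import random


def moran_step(n_a, pop_size, rng):
    """One birth-death event; returns the new count of allele A."""
    x = n_a / pop_size
    birth_is_a = rng.random() < x  # the parent is drawn at random from the population
    death_is_a = rng.random() < x  # the individual that dies is drawn independently
    # Up by one: A born, a dies. Down by one: a born, A dies. Otherwise unchanged.
    return n_a + int(birth_is_a) - int(death_is_a)


def run_until_absorbed(n_a, pop_size, rng):
    """Iterate until allele A is extinct (0) or fixed (pop_size)."""
    while 0 < n_a < pop_size:
        n_a = moran_step(n_a, pop_size, rng)
    return n_a


def fixation_fraction(n_start, pop_size, trials, seed=0):
    """Fraction of independent runs in which allele A ends up fixed."""
    rng = random.Random(seed)
    fixed = sum(
        run_until_absorbed(n_start, pop_size, rng) == pop_size
        for _ in range(trials)
    )
    return fixed / trials


if __name__ == "__main__":
    # Hypothetical numbers: 5 copies of A in a population of 20.
    print(fixation_fraction(n_start=5, pop_size=20, trials=2000))
```

With no mutation every run ends with A either extinct or fixed, which is the extinction-fixation picture from the small-mutation limit. For this neutral scheme the fraction of runs that fix should come out close to the starting frequency, 5/20 here, up to sampling noise.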