 Okay, so we're going to do one last thing. So remember we're talking about evolution. Okay, I can start with showing you pretty pictures just to warm you up. We were talking about, we're talking about evolution of bacteria yesterday. Remember the long-term evolution experiment. So you can do the same thing on yeast. This is just a summary table of what people do in lab evolution. So we were mainly talking about this and climbing up the landscape starting from some initial individual who tries to mutate and do that. So you can do something similar with just DNA, mRNA, or just protein, some molecule. You generate a random library of different mutants, and so they fall in different places on this rugged landscape. And then you select them for, say, binding to something, like a protein, whether it's a good enzyme. And the ones that are left behind are left behind. And as time goes on, you find the winner. You find the one that lies here on this landscape. You can also do these kinds of in-lab experiments with breeding, so with mating. Yeast do mate in general. In the lab, they don't like to mate. Okay, it's kind of like animals in the zoo. So what they do is they mate them beforehand in happy conditions, and then they usually, what happens is it turns into more of evolution like that. And flies, on the other hand, if they want to reproduce, they have to mate. So people are now starting to do these kinds of experiments also with flies. Flies are difficult because they fly, okay? And so, you know, this is actually a problem. So already with E. coli and yeast, you have a problem with contamination. How do you make sure that something from this dish didn't somehow come into the other? Now imagine that this thing flies and tries to get from one test tube where it's supposed to be breeding with your controlled population to another one. So these are very hard experiments to do. And of course flies also reproduce much more slowly. So it takes more time. 
But just so that you know that people are doing things like that. And this is what you get as a result of these experiments. So for the in vitro ones, as I said, you really select for a winner. This is like picking the best protein out of a randomly generated library. So you start off with a lot of diversity, and at the end of the day, you get one winner. In asexual organisms, you get these patterns of what we talked about yesterday, clonal interference, where somebody wins for a while, then somebody else that's better comes along, and so on. And this goes on, and you're generally getting more diversity at the end of the day than you started off with. In yeast, the time scale here is shorter, so it only goes up until here. You start off with large diversity because you outbred them. You made them mate in the beginning. And then on short time scales, there's one that will win before the process starts looking like that. And this is just flies where, of course, they mate. So chromosomes get mixed up. That's what mating is. So you start off with specific chromosomes, and then you get these mosaic patterns, which is what happens in real life. So this is just to tell you that these kinds of things happen. But let us now go back to what we were talking about. As you see, in a lot of these cases, you get this fixation phenomenon. So somebody wins. So yesterday, we derived this equation that I will just write back. This is the equation with selection and mutations. So we have the mutations and then genetic drift. So I think it looks like this. So now we're going to ask a question about fixation. We're going to say, given that at time zero our initial condition is one individual of the new mutant, what's the probability that this mutant will fix? OK, so at time zero, you have one individual out of N. And so we want to solve this equation.
So of course, we could go ahead and solve this equation. We're after the question of what the fixation probability is. But that's rather difficult. There's an easier way of doing it. So physically, what is fixation? Remember, we were drawing these. This is the steady-state probability. Fixation is when you get to one, and extinction is when you get to zero. That's symmetric. So maybe extinction is easier. In terms of boundary conditions, does that look like anything? Both of these are absorbing boundary conditions. Once you get to one of these, you don't leave. So how do we solve these kinds of problems? If I'm telling you I start off with this frequency, and I ask what's the probability of me getting here, what kind of problem is this in physics? First passage time, yes. So we're going to solve it as a first passage time problem. And how do we solve first passage time problems? With what kind of equation? This is a forward equation, and we solve them with the backward equation. So we're going to write down the backward equation. Everybody's seen a backward equation in their life, right? OK. If you didn't, you will see one now. So the general idea of backward equations: I'll write it down and then I'll explain it. I'm sure you've seen it. You may not recognize the word backward. If you've been overly tainted by mathematicians, you'd want to call it Kolmogorov something. So this is an equation for p(x, t | y, t0), where x is the frequency at time t, which is the end of the interval, given that y is the frequency at time t0, the initial time, OK? It's called the backward equation because we know what we want x to be. We want it to fixate, so we want x to be one. But it's an equation for y, and we're going to solve for, depending on what y is, what is the probability that you get to x equals 1, OK? That's why it's backward.
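For reference, the generic pair of equations being contrasted here can be written as below. This is only a sketch in standard textbook notation: the drift and diffusion coefficients A and B are placeholders introduced for illustration, since the transcript does not record the specific coefficients written on the board. Here p is the probability of frequency x at the final time given frequency y at the initial time, and t is the elapsed time for a time-homogeneous process.

```latex
% Forward (Fokker-Planck) equation: derivatives act on the final frequency x
\frac{\partial p}{\partial t}
  = -\frac{\partial}{\partial x}\bigl[A(x)\,p\bigr]
    + \frac{1}{2}\,\frac{\partial^{2}}{\partial x^{2}}\bigl[B(x)\,p\bigr]

% Backward (Kolmogorov) equation: derivatives act on the initial frequency y,
% and the coefficients sit outside the derivatives
\frac{\partial p}{\partial t}
  = A(y)\,\frac{\partial p}{\partial y}
    + \frac{1}{2}\,B(y)\,\frac{\partial^{2} p}{\partial y^{2}}
```

In the forward form the coefficients sit inside the derivatives and the first-derivative term carries a minus sign; in the backward form they sit outside and the sign is flipped.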
We're writing an equation for the initial condition instead of the final condition. In a normal equation, you're given an initial condition, and you're asked, what's the frequency at the end? Now I'm going to tell you, I know what my frequency at the end is; what's the probability of getting there given my initial condition? This is the adjoint equation to the forward equation, and this is how you derive it. I'm not going to prove that, so, OK, if I tell you it's the adjoint, this form should be obvious to you. You write this in matrix form, and you take the adjoint of the matrix. So that's why these things come out of the derivative, and that's why the sign changes here. This is like a momentum operator in quantum mechanics, right? It's the first derivative, and the adjoint of the first derivative gives you a minus sign. I'm giving you analogies, right? I'm waving my hands, because in principle you should have seen it. But if you haven't, that's where this comes from, OK? If you want to, you can write this out as a matrix and take the adjoint, and I guarantee you, this is what it will look like, OK? So specifically, we're actually going to solve for the extinction and then say that the probability of fixation is one minus the probability of extinction, in the approximation that you're in this limit where you either go extinct or you fix. So we're going to say we want x equals 0, so extinction. If you want a title, OK, fixation probability is what we're after. And so that means that we want p(x = 0, t | y), and I'm going to shortcut my notation and just call it P_0(y, t), OK? So maybe I don't need to rewrite this equation, OK? Let me just rewrite it for the sake of being general. So I'm just rewriting what you had, OK? I'm even going to drop this 0 from the notation.
We're going to look at the long time limit. We're going to look at t goes to infinity. So we're going to ask what's the probability of extinction at long times, which is essentially asking what's the probability of it going extinct at all, OK? So we're solving the steady-state form of this equation. And now this becomes easy. I can define an auxiliary function, which is just the derivative, R(y) = dP/dy, OK? I'll try and keep my capitals and non-capitals the same. So first of all, if I have a 0 here on the left-hand side, I can get rid of these prefactors. So then I just have dR/dy equals, with a minus sign, Ns times R(y): dR/dy = -Ns R(y). Which is just an exponential equation, right? Everybody agrees? This is solved by an exponential, good: R(y) = R(y = 0) e^{-Nsy}. So now we have to undo the substitution. So dP/dy equals R at y equals 0, which I'm just going to call R0, times e^{-Nsy}. I can do this by direct integration of the two sides. On the right I get -(R0/(Ns)) (e^{-Nsy} - 1), and on the left I get P(y) - P(y = 0). Nothing interesting happened. But now we know that P(y = 0) is equal to 1. Because P(y) is the probability of going extinct, so basically I'm asking now the question, what's the probability that I end up at 0 given that I started at 0, right? If I'm already there, and it's an absorbing boundary, I won't move. So this is 1. So that means that P(y) = 1 - (R0/(Ns)) (e^{-Nsy} - 1). Sorry, I missed the sign here. Negative. Now all my signs are OK. OK. And then I happen to know another thing, that P(y = 1) is 0. Because if I start off fixed, so if I start off at 1, again, another boundary condition, I'll never get to 0, right? This is the probability of getting to 0. If I start off on the other sticky side, I'll never get to 0. So that's 0. So that gives me a way of calculating R0, which is 0 = 1 - (R0/(Ns)) (e^{-Ns} - 1), right?
So this means that R0/(Ns) = 1/(e^{-Ns} - 1). So if I put all of this together, then the probability is P(y) = 1 - (1 - e^{-Nsy})/(1 - e^{-Ns}). And this was the probability of extinction, and I'm interested in the probability of fixation. As I said, I'm working in this limit of weak mutation, so I should say that N mu is much less than 1. I'm considering a case where I just had one mutation, and there'll be no other mutation before this process manages to go through, right? So the idea is that basically all of the probability is either here or here. So I can approximate the probability of fixation as 1 minus the probability of extinction. So that means that the probability of fixation, for N mu much less than 1, is P_fix(y) = (1 - e^{-Nsy})/(1 - e^{-Ns}). OK. Sorry? [Audience question about the time it takes the system to reach one of the two end points.] That's a very good question, and that's what we'll do in the second part of today. It's nearly as if I paid you, you know? But in order to do that, we need the answer from this. So just to be clear about the approximations: there's clearly this approximation, that nothing interesting happened in the time that it takes for this to happen. You can calculate the time to fixation from this formalism too, but we won't do that. OK, so we'll start off by analyzing what this gives us. OK, so the first limit: let's look at the limit that selection is strong. So in this limit, P fixation, which is given on that board over there, I'm going to expand in the limit of Ns large. So basically, in the denominator, the one will dominate, and I'm left with 1 - e^{-Nsy}.
And if we say now that y is 1/N, so initially we have one individual at the time we're looking at, then that means that this P fixation becomes 1 - e^{-s}, which, for s small (even though Ns is large), I can expand as 1 - (1 - s), and so this is equal to s. OK, so this has a name. It's called the Haldane result: the probability of fixation of a beneficial mutation starting from one individual is proportional to the selection coefficient. OK, if we start from k individuals, which essentially means we start looking at it when there are k individuals, which is often what happens in experiments because before that we may not see them, we may not be able to detect them, well, then the probability of fixation is 1 - e^{-sk}. And there's a threshold here, because it's roughly 1 if k is larger than 1/s, and it's roughly ks if k is less than 1/s. So that tells you that there's a threshold at 1/s: if you see k individuals where k is larger than 1 over the selection coefficient, that's almost certainly going to fix. However, if you see fewer individuals than 1/s, well, they still may fix, but their fate is much less certain. OK, so that's case 1. But this is a result to remember: generally, if selection is strong, then the probability of fixation of a beneficial mutation is proportional to the selection strength. And this is beneficial because it assumes that s is positive, so it's a mutation that's good for you. OK, we're going to do the negative one in a second. However, if selection is weak, so this is either a deleterious or a beneficial one, but with a weak selection coefficient, then we take the general result, we expand in this limit, and we get that it's simply y.
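To make the threshold at k of order 1/s concrete, here is a minimal numerical sketch. It assumes only the strong-selection expression quoted above, P_fix = 1 - e^{-sk}; the function names and the value s = 0.01 are hypothetical, chosen for illustration and not taken from the lecture.

```python
import math

def p_fix_strong(k, s):
    """Strong-selection limit (N*s >> 1) starting from k copies: 1 - exp(-s*k)."""
    # expm1 keeps precision when s*k is small
    return -math.expm1(-s * k)

def regime(k, s):
    """Label a starting count k relative to the 1/s threshold."""
    return "below 1/s" if k * s < 1 else "above 1/s"

if __name__ == "__main__":
    s = 0.01  # hypothetical selection coefficient, so 1/s = 100 individuals
    for k in (1, 10, 100, 1000):
        print(f"k={k:5d}  P_fix={p_fix_strong(k, s):.4f}  "
              f"k*s={k * s:.2f}  ({regime(k, s)})")
```

Well below the threshold the printed probability tracks k*s, and well above it the probability is indistinguishable from 1.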
So now, if selection is weak, the probability of fixation doesn't depend on the selection strength. It actually just depends on how many individuals you had in the beginning. So really, so what's interesting here is that whether you're beneficial or a deleterious mutation, so your mutation that's good or bad for you, your probability of fixation is completely the same. Right? It just depends on how many of you are there at the beginning. So if I now ask you this question in a different way, can a deleterious mutation fix? What's the answer? Yes? Yes. With a certain probability, but this tells you yes. It tells you it's not impossible. So just because you see a mutation somewhere, and like all of us, we can all have the same deleterious mutation. And all the yeast in the Petri dish can get a deleterious mutation which can spread and just by chance. So it's not impossible for bad things to happen and survive. And it's even stronger than that because you can imagine having strong deleterious mutations. Okay, so this is deleterious, which means bad. Okay? Beneficial means good. Deleterious means bad. Okay, so in this limit, we do the same game and we expand and we find a certain probability, which is arguably less pretty, but it's an exponentially decaying probability with the selection coefficient. Now remember that this is this, you know, so okay, so you can say that, so first of all, again, the answer is even a very bad deleterious mutation can fix. It decays exponentially with the strength of how bad it is, but it can still fix. And so, okay, so basically this is exponentially small. So it is small, P fix. I write it here, but you know what I'm saying. It's exponentially small unless 1 minus y is much smaller than 1 over s. Okay? So unless you're already very close to winning, you're probably as a very bad deleterious mutation. You are going to go extinct, but still this is probability, so anything can happen and you may actually fix. So let's draw this. 
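A rough numerical version of this drawing, as a minimal sketch. It assumes the closed-form result of the derivation, P_fix(y) = (1 - e^{-Nsy})/(1 - e^{-Ns}), evaluated for one initial mutant, y = 1/N. The population size N = 1000 and the sample values of s are hypothetical, and the function name `p_fix` and the rewritten branch for negative s (algebraically the same formula, rearranged only to avoid numerical overflow) are not from the lecture.

```python
import math

def p_fix(y, N, s):
    """Fixation probability (1 - e^{-N s y}) / (1 - e^{-N s}) from initial frequency y."""
    a = N * s
    if abs(a) < 1e-12:          # neutral limit: the ratio tends to y
        return y
    if a > 0:                   # beneficial mutation
        return math.expm1(-a * y) / math.expm1(-a)
    b = -a                      # deleterious: same formula, rewritten to avoid overflow
    return math.exp(-b * (1.0 - y)) * math.expm1(-b * y) / math.expm1(-b)

if __name__ == "__main__":
    N = 1000                    # hypothetical population size
    y = 1.0 / N                 # one mutant individual
    for s in (-0.05, -0.01, -1e-5, 0.0, 1e-5, 0.01, 0.05):
        print(f"s={s:+.0e}  N*s={N * s:+8.2f}  P_fix={p_fix(y, N, s):.3e}")
```

For |Ns| much smaller than 1 the output sits on the plateau at 1/N, for Ns much larger than 1 it approaches 1 - e^{-s}, which is close to s, and for strongly deleterious mutations it falls off roughly like e^{-N|s|(1-y)}.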
The probability of fixation as a function of the selection coefficient. If you start with a frequency of 1/N, so 1 individual in your population, then between minus 1/N and 1/N you basically have a constant probability of fixation, regardless of whether you're good or bad. Here you have an exponentially decaying probability of fixation. Here is 1, where you're certain to fix, and here is this intermediate regime. This is for, again, N mu much less than 1. So to connect this again to experiments, there was a pretty nice experiment from another guy who got annoyed at all this. We all think in these abstract terms about fitness landscapes and reproducibility and how you evolve from point A to point B. He went into the lab and he just did it. So this is Dan Weinreich, in a paper from 2006, where he took a protein, beta-lactamase, that confers antibiotic resistance. So you've probably all heard about antibiotic resistance. We as a society have been taking too many antibiotics, and now if you're old and you go into hospital, the thing that is actually likely to kill you is the hospital, because of the bacteria. I'm being recorded, so maybe I shouldn't be saying things like this. This doesn't mean you should not have your grandparents or your parents go to hospitals. Hospitals are very good and all that. But antibiotic resistance is a problem, especially for the elderly and the very young, so people are very, very interested in this. Okay, but essentially it doesn't matter for what's going to happen. So what they knew about this protein is that if it gets five mutations, so this is the fifth mutation over here, it's going to be very resistant to antibiotics. And if it doesn't have any mutations, then it gets killed by antibiotics.
Antibiotic resistance is measured in terms of something called the MIC, the Minimum Inhibitory Concentration, which is essentially the lowest concentration of antibiotic that prevents visible growth, so what essentially kills the bacteria, okay? So you put in different concentrations, and at some point the bacteria die, and that's what this is. Just as a point of interest. So if you have five mutations, that means you have 120 mutational trajectories to get from the beginning to the end, right? You have 120 different ways in which these five mutations could have appeared. And so they built all these, sorry, they didn't build the mutational trajectories. You have two to the five, that means 32 different mutants, right? So you can have all the point mutants, that's five, then you have all the double mutants, so those that have two out of the five, and so on, okay? So they know exactly what these five mutations are, and then they combinatorially built all possible versions of these mutants, okay? So that's 32. And then they measure the fitness, which means they measure how resistant they are to antibiotics. They put in different concentrations and they watch when they finally die. And here is a table for all of these mutants, so this is no mutations, one mutation, and so on, up to all five, and it gives you the concentration of the antibiotic at which they die, so the higher the number, the more resistant they are. And the first thing they notice is that the combinations of mutations actually matter. It's not an additive process. So if you have this mutation, you essentially don't get any resistance, and if you have this mutation, you don't get any resistance either, but if you have both of them, you suddenly get a huge increase, okay? So you go up from this 0.088, which is nothing, to this, which is something. So what you're seeing here are clear interactions, right?
You have pluses and minuses, so you can think of these as spin variables, and if you think about an Ising model, you have a J_ij term. This is what this is telling you: the effect of having the two is not the additive effect of having each one of them separately. In evolution this is called epistasis. Epistasis is just the word for interactions, and you see that a lot. This is another example of things like that. So basically the background on which the mutations happen matters a lot, and it also turns out the order of these mutations matters. What they did then is they actually used the sort of formalism that we were just looking at, and they said, well, for every trajectory I can calculate the probability of this trajectory taking place. So they say mutations happen independently, so if I want all five mutations to happen, then I need all five of them to happen independently, and it factorizes. And then they make a very strong assumption. They worked in the weak mutation, strong selection limit, so here, and they say, well, in fact if the mutation is deleterious or neutral then the probability of it surviving is zero, okay? So they basically equate this to zero, because, well, okay, they don't know that, but that's what they do. And they assume that everything else is strong, and they consider two models, one where there's a constant fixation rate and one where it depends on the resistance the mutation confers. And doing that, basically, if they get rid of all the trajectories where you would have something be deleterious, so that means the fitness goes down when you add that mutation, you're left, instead of 120 trajectories, with only 18 possible trajectories, okay?
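The counting in this argument can be sketched in a few lines. The code below enumerates all orderings of five mutations and keeps only those in which every step strictly increases fitness, following the rule described above that deleterious or neutral steps never fix. The landscape `toy_fitness` is entirely hypothetical (it is not the measured resistance table from the experiment): it is additive except that one single mutant is made worse than the wild type, just to show how a single interaction removes trajectories.

```python
from itertools import permutations
from math import factorial

N_SITES = 5  # five mutations, as in the experiment described above

def additive_fitness(genotype):
    """No epistasis: fitness is just the number of mutations present."""
    return float(len(genotype))

def toy_fitness(genotype):
    """Hypothetical landscape, NOT measured data: additive, except that
    mutation 0 on its own is worse than the wild type."""
    if genotype == frozenset({0}):
        return -1.0
    return float(len(genotype))

def is_accessible(order, fitness):
    """True if every step along this mutation order strictly increases fitness."""
    genotype = frozenset()
    current = fitness(genotype)
    for site in order:
        genotype = genotype | {site}
        nxt = fitness(genotype)
        if nxt <= current:      # deleterious or neutral step: treated as never fixing
            return False
        current = nxt
    return True

def count_accessible(fitness, n_sites=N_SITES):
    """Count mutation orders whose every step is beneficial."""
    return sum(is_accessible(order, fitness)
               for order in permutations(range(n_sites)))

if __name__ == "__main__":
    print("genotypes:", 2 ** N_SITES)               # 32
    print("trajectories:", factorial(N_SITES))      # 120
    print("accessible (additive):", count_accessible(additive_fitness))
    print("accessible (toy epistasis):", count_accessible(toy_fitness))
```

On the additive landscape all 120 orders survive; in the toy landscape the 24 orders that begin with mutation 0 are removed, leaving 96. The real landscape has many more such interactions, which is how the count drops as far as 18.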
So the take-home message of this is that not all of the 120 trajectories are possible, and the reason they're not all possible is because of this epistasis, because of these interactions, right? So this should somehow be ringing bells for you, because if you think about the network graph, right, there are some moves that are forbidden because they would decrease your overall function, which here is fitness, so you can't go there. So that severely constrains the number of paths you can take. And in fact, if you plot the cumulative probability of the trajectories, if you want this cumulative probability to get to a half, you don't need many trajectories. Depending on the model, two or four trajectories basically capture half of the total probability, so they take up most of the space. So you can take this down even from 18 to essentially two most likely trajectories. Okay, so this is plotted on this graph, where the yellow line is the most likely trajectory and these are all the 18 ones, but basically the first and the second most likely trajectories take up most of the probability space of all trajectories. So when we think about this, I started off with telling you that evolution is the most random of all processes, right?
Because all mutations are random, whether they arise and so on. But it turns out that if you're actually evolving something specific, the number of ways of getting there is much more limited than you would have initially thought. We got down from a hundred and twenty to essentially two possible paths. Okay, now how are we doing on break time? I still have some time. Okay, so let me continue. This is another antibiotic resistance experiment, from 2015, where they built something called the morbidostat, okay, which is basically a bacterial killing machine. As we saw in the previous experiment, we have these bacteria, and if we give them a higher concentration of antibiotic, there's some concentration of antibiotic at which they'll die. So these people said, well, we don't want them to die, we want them to constantly be on the edge. So we're going to put in some concentration of antibiotics and let them evolve resistance to this, and if they manage, then we'll pump it up and give them more, okay? So the bacteria are constantly pressured to find new solutions. They did this with three types of bacteria, and unfortunately the bacteria do very well in this experiment. They do manage to find newer and newer solutions, and this is in a way what's happening in our world, right? We've been giving ourselves more antibiotics, and bacteria have been finding new solutions. I won't go into all the details of these graphs, but what you actually see here is that when they do replicates and then sequence the final population, these bacteria in different replicates find very similar solutions. So again there's some notion of reproducibility. Okay, so
the way to read this is: these are four replicates, each one fourth of the pie chart, and this is time, and as time goes on you see which fraction of the bacteria have found the same solution at the gene level, okay? They don't find exactly the same mutation, but they find the same phenotypic solution, and you see that in many cases they do find these solutions. These are different kinds of bacteria, and you see that the fitness goes up, so they keep on finding things. Okay, so this is for the next part. So before the break let's do one simple estimate, and then we can take a break. Okay, so this is supposed to be a sort of crazy estimate time, and the question is: how many cell divisions have there been from the beginning of life? Okay, I'm sure I'm kicking you way out of your comfort zone, because you're mostly theoretical physicists, which means you like to solve equations and you don't like to do estimates, because estimates are kind of bullshitty, right? So that's what we're going to do. Okay, beginning of life. When did life begin? Who knows? On this planet, yeah. Life, any life. Any ideas? Yes, about three point five billion years, so essentially ten to the nine years ago. Well done, your check's in the mail. Okay, then the next thing, now that you know when life began: how many cell divisions per year? So okay, let's think about E. coli, because that's the only thing we can think about,
so E. coli in the lab (in the wild this is of course not true, and we don't really know) divides about twice per hour, okay? So two times per hour means forty-eight times per day, which means forty-eight times three hundred and sixty-five days, which is of the order of ten to the four per year, okay? This is what you get from multiplying these numbers. If you say this is about ten and this is about ten to the three, you get that, or if you say this is ten squared and this is ten squared, you get that. But if you actually multiply these numbers, you get something that's close to ten to the four, not that it matters for the level of bullshitty estimate that we're doing. So we have ten to the nine years times ten to the four divisions per year, which gives us ten to the thirteen divisions since life began, okay? Well, sorry, this is just the number of divisions for one lineage. So now we have to multiply this by the number of organisms we have on earth. Any ideas? Ten million? It's a log question, I'll help you there. I don't know the name of this number, if this is a hint. Thirty-five, is that what you said? Yeah, that's a very good estimate. And you said twenty-five? Okay, so you're roughly in the ballpark. People estimate ten to the thirty. Okay, this is obviously a completely bullshit number, let's agree on that, right? I mean, in principle you could try to count this, and I think they have done it in some sort of sensible way. But as I said, it's a log question, so whether you say twenty-five or thirty-five, it doesn't really matter, right? [Audience comment about the number of organisms changing over time.] Yes, of course. No, you're right, you should probably be integrating an exponential. Okay, as I said, I'll get back to that. So let me tell you how you can estimate this number in
a fast way, because it'll give you fun numbers. So let's think about bacteria, okay, because bacteria make up essentially ninety-eight percent of, say, marine life. Anyway, we all agree that there are more bacteria in the world than mammals or vertebrates, right? No question about that. Okay, so bacteria rule. So where do we find bacteria? We find them in our gut. So how many bacteria do you have in your gut right now, after a few wonderful days of cafeteria food? You said a billion? Very close. About ten billion, ten to the tenth, okay? How many people on earth? This one you should know, or you really have to go to that sustainability session. Yeah, okay, so again we're in the same ballpark. So that gives us already ten to the twenty bacteria. And then I'll give you a number that you probably have no idea of. I mean, yes, you can estimate it in some weird ways, or you could also do this estimate using how many bacteria there are in one liter of ocean water and stuff like that, but then you have to know that number. Okay, so it turns out that humans account for ten to the minus four percent of non-marine biomass. Ten to the minus four percent is ten to the minus six as a fraction, okay? So if we take the ten to the twenty and divide by that, with ten to the minus six, ten to the minus seven, whatever, you get ten to the twenty-seven
bacteria, okay? So as I said, order of magnitude, it's quite close, but we take of the order of ten to the thirty organisms, modulo the problems of the population expanding and so on. I'm just going to multiply ten to the thirty by ten to the thirteen, which is the number of cell divisions per lineage, and this gives us approximately ten to the forty-three cell divisions since life began, okay? Apparently a more clever estimate gives you ten to the forty-six. As I said, it completely doesn't matter. I don't believe either one of these numbers, but it just gives you an order of magnitude, right? So if you were to say off the top of your head, it's like ten to the forty, ten to the fifty cell divisions since life began. But now the question is, what if that number was ten to the hundred cell divisions, okay? What if it was the square of that number? Would we look different? Would life look different? Sure, so your thinking is exactly along the right lines. I mean, the honest answer again is we have no idea, right? But the problem is that we don't have an idea partly because we haven't mastered the theory of evolution well enough. So one thing I can tell you: think back to the Lenski experiment yesterday. Lenski's experiment has been going on for sixty thousand generations, okay? At around generation thirty thousand, Lenski's bacteria, E. coli, learned to utilize citrate, which is a food source,
and this means nothing to you, but I'll tell you that E. coli as a species are defined as bacteria that cannot utilize citrate. So at generation thirty thousand, one strain out of the twelve strains of the Lenski experiment learns to utilize this thing that they are by name forbidden to use, and so they form a new species, right, if we believe in bacterial species. Okay, so they've done something funky. It's like, I don't know, us growing a trunk and suddenly being able to take a shower by ourselves, right? Okay, but then the question is, should we be surprised? How do we gauge the element of surprise? That's what statistical physics is about, right? We estimate surprise. We calculate probabilities, and that gives us a concrete estimate of surprise. So on one hand you're going to say, yes, I'm surprised, because we gave these bacteria a name that says they cannot do this, so they've really done something out of the way. But on the other hand you can ask the other question: it was one in twelve lines of the experiment, so why didn't more of them do it? So should we be surprised at only one, or should we be surprised at even one, right? And again, I'm not going to answer this question for you, because we're kind of stuck on the theory. But forget about the question of whether we would look different and so on; we would at least like to know whether we should be surprised by this one line of Lenski's. I think that's an achievable goal. Okay, so with that let's take a break and come back. It's a J-Con Mat, and I don't remember whether it's two thousand eleven or two thousand twelve, something like that. Okay, so if you look into a biology textbook, you're going to have way too many details, and you probably don't want to do that. There are textbooks. There's Physical Biology of the Cell by Rob Phillips and
collaborators. But it's more that the physics is easy most of the time, and it's this thick, okay? So you're not going to just read it, and most of it is things we haven't talked about. But again, reading maybe an introduction to a relevant chapter may help you. And then there's Biophysics by Bill Bialek, where you may find some of the things you'll hear here, and, I don't know, some of the things you'll hear from Terry. But again, it's this thick. This one is actually aimed at physicists, and it has a lot of information. Okay, but I have a feeling those are not the textbooks you're looking for. So I would recommend checking out Wikipedia, seriously, for a basic idea of the biology quickly. Okay, so now we're going to talk about the rate of adaptation. So organisms evolve, and they evolve through this mutation and selection, and the question is how fast the population will evolve. That basically means: what is the speed with which the fitness changes? And we're going to think about a constant environment, similar to Lenski's. There's another question of how similar these mutations will be, and I've sort of been telling you that in fact they may be more similar than you expect. Although on one hand they're similar, and on the other hand they're not, because these Lenski strains do different things. Okay, but we'll talk about that. And we've already said that deleterious mutations can fix, right? And there's another question, and you have not seen the answer to this one, so this is really a question
which is very nonintuitive. So I can ask you a question: if you increase the size of the population, will more or fewer deleterious mutations fix? Okay, so you have, say, $10^5$ bacteria, and now you make it $10^7$. Will more or fewer of the bad guys fix? Fewer? More? I hear both. Who says more? Yes, okay, somebody says more. So the answer is more, but it's completely nonintuitive. The intuition would be that fewer will fix, because it's harder for them to sweep through the whole population. But in fact more will fix, because they hitchhike with the good guys. You're going to get more beneficial ones, and the deleterious one is going to be next to the good one, and it's just going to hitchhike, right? It's going to go along with whatever happens to the good guy. It's completely not obvious, but this you can actually get out of the formalism. Okay, so we're going to consider something like the Lenski experiment: constant population size N and mutation rate U. We're going to say that each mutation is beneficial, so we're going to make things simple, and that they each have a selection coefficient s. We're not going to talk about recombination. As I said, completely unrealistic. And so the first thing we're going to do is consider the case where mutations are rare. That's the same limit in which we worked out the fixation probabilities. And the first question is, how often does a mutation occur in some individual in the population? You have N individuals and you have a mutation rate U, so mutations occur at rate $NU$, and the time that you have to wait for one mutation is $1/(NU)$. Okay, so for a time $1/(NU)$ nothing happens.
Then a mutation occurs, and most of them, of course, quickly go extinct due to genetic drift. But what is the probability that the mutation actually survives genetic drift? We say that if it survives genetic drift, it establishes. So just for simplicity, for the estimate, let's say that it fixes. What's the probability that a mutation fixes in this limit? From the first part of the lecture today: strong selection, weak mutation, strong beneficial selection. Remember that result, s, that I underlined and said is important? Now you know why it's important for other reasons. So mutations establish at rate $NUs$, and after a time $1/(NUs)$, well, we have these ones here that go extinct, but we'll have one here that will go above one, and then it will drift some more, and after some time it'll become larger than $1/s$. And after this time it'll grow deterministically, and then essentially it will fix. So can we calculate this deterministic time? Okay, so when it grows deterministically, how does it grow? Exponentially, thank you. So that means that the number of these mutants, which started off as $1/s$, increases exponentially with a coefficient s: $n(t) = \frac{1}{s} e^{st}$. And we want to ask what is the time after which they'll take over the whole population, which is N. So we can solve this equation, $\frac{1}{s} e^{st} = N$, and get that this deterministic time is $\frac{1}{s} \ln(Ns)$. So that's one picture; this is the picture for rare mutations. The next question is, when does this whole picture break down? Exactly: when the mutation is not rare. And so what does it mean that the mutation is not rare? Okay, I didn't draw this perfectly well. That the mutation is rare means that the time for a mutation to actually appear and come up to here is much longer than the time it takes for it to fix, right? Because it means that this
happens very rarely, and then this is deterministic, so it'll go. Okay, so that means that this time is much longer than this time. That means that $1/(NUs)$ is much larger than $\frac{1}{s}\ln(Ns)$. The s's cancel, and we get that this means that $NU \ll 1/\ln(Ns)$. This is what rare mutation means. Okay, so maybe first let's see how possible it is for this to break down. We know you have about 10 billion E. coli in your gut, so let's say N is of the order of 10 billion. And let's say they divide more rarely than in the lab, so let's say they divide 20 times a day. This actually doesn't matter. Okay, and the mutation rate of E. coli, U overall, is about $10^{-3}$ per genome, if you're being conservative. But not all of those mutations are beneficial. In fact, about one in 10,000 mutations is beneficial. So if you want the U of beneficial mutations, which is what you want, then you have to divide $10^{-3}$ by $10^{4}$, one, two, three, four, is that right? Yeah. So you get $10^{-7}$. And if you multiply that now by N, let's say N is $10^{8}$ to $10^{9}$, so $NU$ is 10 to 100. Okay, selection coefficients are about 0.1 to 0.01, and so this gives us an $Ns$ of about $10^{6}$, so $1/\ln(10^{6})$ is $10^{-2}$. Okay, so what we're comparing here is $NU$, which we got to be 10 to 100, and we need it to be much less than $10^{-2}$. So that's clearly not true, and this is even being conservative. What we get is that for typical numbers that we have in bacteria, mutations are really not rare, which in a way we've also seen from those experiments. That's why we see those clonal interference patterns. Okay. Yeah, that's a very good question. In general, what we mean by a beneficial mutation is a mutation that will increase the growth rate of an organism. But as you saw in the example earlier today, it depends on the context. So there can be a mutation happening in a specific position that, if it happens in a given background,
it's good, and it'll increase the growth rate of the organism. But if it happens in another background, it'll be bad. So exactly the same mutation, the same change going from A to G, can be good in one context and bad in another. So that's beneficial and deleterious. On top of that, that's even in the same environment. Now, reality happens in fluctuating environments, right? The environment we find ourselves in is constantly changing. So something that's good in one environment can be bad in another. Just as a simple thing, imagine you find yourself in an environment where there's tons of one food source. If you develop the ability to use that food source, that's good. But then you find yourself in an environment where that food source doesn't exist at all, and now you're really specialized in using this one food source. In fact, that mutation you got basically said, I can only eat this. Like, I can only consume milk, and then you find yourself in rural China. You're not going to be doing very well, right, in an environment where there is no milk. Okay. Yeah, so that's what people are doing now: the question of how a changing environment influences evolution. I'll show you an experiment in a second, but it's also, theoretically, what we're doing and what many other people are doing. There are questions of matching the time scales of the environment to evolution. All of those questions are out there. So this is part of why we don't understand anything, but these are all very good questions. And this is also why, in simple terms, deleterious means bad and beneficial means good, but what is good and bad really depends both on the genomic context and on the environment, on everything. So it's a phenotypic definition; it's an effective definition. Okay. Yeah? No, sorry, this says mutations are rare;
we're first doing the limit where mutations are rare. This is $n(t)$; $n(t)$ is the number of bacteria, or whatever, it's the number of organisms. Because you've survived genetic drift, right? So your population size has grown enough that you're less susceptible to the fluctuations coming from small numbers. An estimate, yeah. Okay, so the philosophy of this is: I did an estimate for the time scales of what happens assuming mutations are rare. Then I asked the question, what does it mean for mutations to be rare? How do I define the limit of rare mutations? And based on these estimates, a mutation is rare if the time for whatever happens deterministically is much smaller than the time for a new mutation to happen. Okay. So now I'll ask the question of what happens when mutations are not rare. What happens biologically? Here we had a situation where one mutation appeared and it was left alone until it fixed and took over the whole population. Now if mutations are not rare, what will happen? Yeah. So what will happen is basically, before I manage to fix this one, a new mutation will appear, right? So I'll have many mutations at the same time. And that's what we've been seeing in these pictures where we see these different colors, right? A mutation appears here, but before it manages to sweep, this one appears, and this one, and this one. So there are new mutations coming in all the time. So we're going to have many lineages, right? A new mutation forms a lineage, and the lineages are going to come in and they're going to interfere with each other. That's why it's called clonal interference, because a clone is a lineage, essentially. Okay, so that's what we're going to work out now. But before we work that out, we can still ask a question about this limit: what is actually the speed in the rare mutation
limit? So let's think about this picture a bit differently now, in terms of the fitness instead of time. I started off with what I'll define as the initial individual, which had zero fitness. This axis is the fraction of the population. And I got a mutation and started a new lineage that has a fitness of one, right? It has one mutation, so in terms of mutations it's higher by one, and it's better. So what will happen is this one will grow and this one will decrease, right? This one is better, so it'll overtake that one as time goes on. So at a later time my graph will look like what I drew in green. Okay, and so the speed of adaptation is how quickly you change from this picture to that picture. To calculate the speed, you have to calculate the rate of new mutations establishing, which we've already determined, right? This is the rate at which they overcome drift, $NUs$. And once they establish, the population grows with rate s; that's its growth rate, or selective advantage. Okay, so that means that the speed of adaptation is $v = NUs^2$. I'll write it down here in case I delete other things. So in the rare mutation limit, $NUs$ is the rate of new mutations establishing, right, this is when they overcome drift, and the other factor of s is because once they establish, they grow with this rate, since this is their selective advantage. So the overall speed of going from this picture to that picture is given by that, on average. Okay, I should also say, for those of you with a mathematical inclination, or maybe a rigorous inclination, what I'm doing now is a back-of-the-envelope calculation. There are rigorous calculations to do here based on generating functions, and they're very beautiful calculations, but we just don't have time for them, and I want to get to a result. If you're interested, I can point you to the sources. But I am waving my hands, and I am waving my hands because I
have the power of the rigorous calculation behind me, so I know what the answer should be. Okay, so now let's go to the non-rare regime. If we look at that same picture now, we start off with zero mutations, and then a first mutation establishes. Same thing: it starts off small, but it starts to grow because it's better, right? So it starts to grow, whereas this one starts to decrease. But as it grows, mutations are no longer rare, so a second mutation will come about, right? And this one grows even faster than this one. So it grows, grows, grows, and this one keeps on decreasing. And before this one gets very far, a third one comes about, right? That's what it means that mutations are not rare: they keep on happening all the time. So this one will grow even more, but at the end of the day this one will have shrunk, and you'll get some sort of picture like this, and a fourth one, and so on. So essentially, whoops, I lost my chalk, you will get something that looks like this. Okay, because the ones at the top grow the most quickly, the ones here shrink the most quickly, and all of these grow slightly less than the average. The arrows should be symmetric, but you get the idea, right? So if you're at the mean, we can now renormalize and call this zero fitness, and then there are the ones that are better than the mean and the ones that are worse than the mean. Okay, you intuitively see that this happens, right? And so you get this distribution, which you can actually work out from mutation-selection balance in the probabilistic setting, and it's roughly a Gaussian. But the other thing you may notice is that this never stops. This is a wave that will move like that, because there'll always be a fitter mutation, so we're assuming there'll always be a fitter mutation, and there'll be a wave that travels through this fitness picture. Okay, so you're going to get
this traveling wave. You're always translating to higher fitness. And this is what we can now calculate: the speed with which the wave moves. So maybe one control question: what happens if the population size increases? What happens to this wave, to the Gaussian? Yeah? Okay, so you establish these things that are more and more fit, and the fitter they are, the more they grow. That's the definition of them being more fit. But the faster the fit ones grow, since we have constant population size, the faster these ones have to die. And as this happens over time, you'll find that most individuals are at the mean of the distribution. There's the head of the distribution, where you get the new ones that are very fit, but they just came about not too long ago, so they haven't had time to grow. And the further they are in this direction, the older they are, so the more time they've had to grow, but the slower they grow. So this axis is fitness, and this one is the fraction of the population. So you'll find that most of the population is not very fit but not very bad, because then there are the ones that are very bad, which are the ones that were here a very long time ago, and eventually they go extinct. Okay, so you get this shape. I'm not proving to you that it's a Gaussian, but, I mean, in a limit it's a Gaussian; it's more like a Poisson. You can do the calculation and work out exactly what this distribution is, but it looks like a little Gaussian envelope. The important thing is that it will continue moving, because we're making the assumption that there's no ceiling. There's no optimal solution; you can always get better. So it'll continue moving, and since it's the mutations that make it move, and the rate of mutations is constant, it'll move on average at a constant rate. Yeah, but there's a
different calculation where you can do the same. So if all the mutations are deleterious, then you have something like that; it's called Muller's ratchet, and it's the same thing. Actually, you can build the formalism, and people have, with both beneficial and deleterious mutations. Okay, right, so now if the population size increases, what happens to the distribution? It gets wider, exactly, because it'll take more time for you to kill the bad guys. And what if you increase the mutation rate? Same thing, right? It also gets wider, because now you're going to be producing more and more of them more quickly. Okay, so the question: what is the speed of this wave? In 15 minutes. So we're going to break this up into two regimes. The first thing we're going to do is call the fittest class q. So we're going to call this distance q, and we're going to first assume we know what q is, and then we're going to calculate it self-consistently. But we're going to notice that there are in fact two regimes to this distribution. There's the bulk, where there are a lot of individuals, and we can treat the bulk deterministically, because there are no stochastic effects here, right? Everything's been around for a while; everybody's there, probably in fairly large numbers. However, there's a lot of non-linearity here, because, I drew it like this, but in fact there are very many lineages. All the individuals here are not the same, right, because they came from different ancestors. So there are many lineages interacting with each other and competing. So there's high non-linearity but no stochasticity. And then we have the head, where there's a lot of stochasticity, because they have very small numbers, they just arose, but there's no non-linearity, because we're probably dealing with one lineage. So we're going to treat these two regimes differently, and we're going to couple them. Okay, so
this is q, so the head of the distribution will have an advantage of $qs$. Okay, so let's assume we know q for now. The number of individuals in the nose, so in this head of the distribution, at time t is just like the previous time: deterministically, given that they've established, they start at a size $1/(qs)$ and then grow with rate $qs$, so $n(t) = \frac{1}{qs} e^{qst}$. Okay, so the question is how long it takes for an even fitter individual, so one that has $(q+1)s$, to arrive. And we know the answer to that, because it's the same thing as we did before: it's the mutation rate times the number of individuals times the probability that they establish. So we're going to ask how long it takes for a new fitness class to establish. This is the probability of them appearing and establishing in a time dt, $U \, n(t) \, qs \, dt$, and we ask how long before they reach a significant number. We're going to say that the significant number is one. It can be five if you want; it really doesn't matter, right? It's just an order of magnitude. So we plug in the result for the nose, and we have $\int_0^\tau U \, \frac{1}{qs} e^{qst} \, qs \, dt = 1$. The $qs$ cancels, and we get $U \int_0^\tau e^{qst} \, dt = 1$, which is $\frac{U}{qs}\left(e^{qs\tau} - 1\right) = 1$. $qs\tau$ is probably larger than one for any reasonable value of q, so I can neglect the one, and so $\tau = \frac{1}{qs} \ln\frac{qs}{U}$. Okay, so this is the time for a new fitness class to establish. I'm going to mark this with a dot. And so every $\tau$ we're going to get yet a new, fitter one. Okay, now let's try to calculate q from the fact that the mean also changes, right? Because as this wave travels, the mean also changes. So after q of these $\tau$'s, this will become the mean, right? Because we will have clicked q times, and what is here will now come here. So on average, if you're growing at a rate
zero here and at a rate $qs$ here, and it takes you a time $q\tau$ to get from here to here, then on average you're growing at a rate of $qs/2$ during the time $q\tau$. During this time, you go from being at the nose to being at the middle. We're going to demand that every time the nose increases by one, the mean has to increase by one, so that the wave is really traveling symmetrically. And we're going to say, well, if I start off with $1/(qs)$ individuals, just as I do in the nose, and I'm on average growing with this rate as time goes on, so I'm averaging over the growth rate in the bulk, what's the time I need to get to here, where essentially I take up most of the population, right? This is the bulk, so I take up nearly all of the population: $\frac{1}{qs} e^{\frac{qs}{2} t} = N$. Okay, of course I really take up N minus the integral of the wings, but the majority of the population is still there. So I'll write it out for you: the time to go from the nose to the mean is $t = \frac{2}{qs} \ln(Nqs)$. Okay, so this is just from solving this equation. Yeah? Yeah, because we're saying I want to estimate the time it takes me to get from being the best to being average. Okay, like you start a new trend, you're cool, and then everybody else is doing it, right? This is the time it takes for you to stop being cool. Your growth rate here is q times s, right? Your advantage is the number of mutations you have. If you're here, then it's $(q-1)s$, $(q-2)s$, $(q-3)s$, and so on. So over this time, on average, you're growing with a rate of $qs/2$, because here it's zero. So assuming that that's your growth rate on average, and you started off with $1/(qs)$ individuals, how long does it take you to basically dominate the population? You solve for t and you get that, and this by definition is $q\tau$. Yes? Okay, as I said, I agree, this should be N minus
the integral of the distribution. Okay, but if it's a peaked distribution, it's not that bad of an approximation. Look, as I said, I'm doing this because I know it'll come out roughly right. I agree with you, it's a strong approximation. But you'll see why it's not so bad of an approximation: you're going from zero to some q, and it has to fall off. Once you see what q is, maybe you'll feel a bit more comfortable. But first we have to calculate it. It's a self-consistent argument, so if it was a bad approximation, then I would have to go back and add this, and maybe I would have to worry about whether I haven't cheated too much here, and there are many points like that. But you know that there's a hardcore calculation of this. Okay, so this, by definition of $\tau$, has to be equal to q times $\tau$, because we defined $\tau$ as how long it takes a new fitness class to establish. So by the time we've lost our superiority, somebody else has established as a new fitness class. So this is another estimate for $\tau$. I'll write it out explicitly: this gives us a new estimate for $\tau$, which is $\tau = \frac{2}{q^2 s} \ln(Nqs)$. And then we want these two to agree, and from having these two estimates for $\tau$ agree, we can calculate q. And so from that I get, well, why don't I go up there again, $q = \frac{2 \ln(Nqs)}{\ln(qs/U)}$. So this gives us a solution for q; this is where the tip of the distribution is. The only thing is that, you see, this is an implicit equation for q. So how do we solve these kinds of equations? Graphically is one answer, right? And there's another way we can solve them, which is even more brute force and you're going to like even less, but we can make approximations, right? We can do it recursively. Recursively means that, since q enters logarithmically here and linearly here, as a first approximation we ignore
the logarithmic dependence. Sorry, I messed up, didn't I? No, no, it's okay. So at zeroth order we ignore the q dependence in the logarithms and we get $q \approx \frac{2 \ln(Ns)}{\ln(s/U)}$. And then at first order we take this solution for q and plug it in here, and get a first-order solution for q, and so on, right? And we can do this until it starts to converge. But let's stop at zeroth order; it'll turn out it's already pretty good compared to doing the full calculation. And if we look at an in-lab E. coli population with N of $10^{6}$, s of $10^{-2}$, and U of $10^{-4}$, that gives us a q of $2 \log 10^{4} / \log 10^{2}$, which is 4. And I don't actually know if I have it, but 4 is a pretty good estimate for what goes on in the yeast in the lab. Okay, I'm going to go a little bit over, but then I finish and we can stop with this and not get back to it. So that gives me q. Then I can calculate the speed, which is just the growth divided by $\tau$, $v = s/\tau$, right, which is the rate at which this whole thing moves. And I can take $\tau$ from either one of these, so I can take it from here. But let me make the same approximation I made over there to get rid of the logarithmic dependence on q. So I have a q here and a q here; I'm going to get rid of the logarithmic dependence, and I'm going to plug in my q from over there without the logarithmic dependence. So I get $\frac{1}{\tau} = \frac{2s \ln(Ns)}{\ln^2(s/U)}$, and so I have $v = \frac{2 s^2 \ln(Ns)}{\ln^2(s/U)}$. And as I said, this turns out to be roughly right. And now the punchline of all of this. If we plot the velocity, the speed of adaptation, as a function of the population size, for rare mutations it's linear in N, right? But for the interfering mutations, for the non-rare mutations, it's logarithmic in N. Okay, so it's much smaller. And if you do this correctly, it goes up a bit if you include, um, yeah, so
but essentially it still goes as the log of N. So what that means is that if you want to adapt quickly, you're actually better off having rare mutations than having non-rare mutations, right? This goes like the log of N, and this goes like N. So the question is, how can you do better than this if you're a bacterium and you do have many mutations, so you're forced to be in the non-rare regime? Is there something you can do? Exactly, you can recombine. Okay, the answer is recombination. In fact, recombination doesn't get you as far as here; it does slightly better, but it definitely does increase your speed. And so this is one of the reasons that people give for the emergence of recombination, or sexual reproduction: it increases your speed of adaptation. Okay, yes? Yeah, so what if you have a distribution? You can do this with a distribution. I mean, it's not as easy, but once you go to this generating function framework that's behind all of this, you can do that. You can put in a distribution of mutation rates; the general conclusion doesn't change. Okay, and if you find these fitness waves cool, you can work out the speed. I mean, you basically can write down an equation, like the diffusion equation that we've been writing down, and show explicitly that this is a traveling wave in the sense of a traveling wave, so $x - vt$, and work out what v is, and so on. So if you like that kind of thing, you can go ahead, but it's a whole different set of 20-hour lectures to do all this, so we won't do it. Okay, so I'll see you after lunch.
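For readers who want to play with the numbers, the back-of-the-envelope estimates from this part of the lecture can be sketched in a few lines of Python. This is only an illustration under the lecture's own approximations, not the rigorous generating-function calculation. The function names are made up for this sketch, the formulas are the ones quoted on the board (rare-mutation criterion NU << 1/ln(Ns), rare-mutation speed N U s^2, the implicit equation q = 2 ln(Nqs) / ln(qs/U), and the interference speed 2 s^2 ln(Ns) / ln^2(s/U)), and the natural logarithm is assumed throughout. Note that with a natural log, 1/ln(10^6) comes out nearer 0.07 than the rounded 10^-2 quoted in the lecture; the conclusion that gut E. coli are far from the rare-mutation regime is the same either way.

```python
import math

def is_rare_mutation_regime(N, U, s):
    """Rare-mutation criterion from the lecture: N*U much less than 1/ln(N*s).
    Here "much less" is simplified to a plain comparison (an assumption)."""
    return N * U < 1.0 / math.log(N * s)

def speed_rare(N, U, s):
    """Speed of adaptation when mutations are rare: v = N * U * s^2."""
    return N * U * s**2

def q_zeroth_order(N, U, s):
    """Zeroth-order lead of the nose: q = 2 ln(N s) / ln(s / U)."""
    return 2.0 * math.log(N * s) / math.log(s / U)

def q_self_consistent(N, U, s, iterations=50):
    """Solve q = 2 ln(N q s) / ln(q s / U) recursively, starting from the
    zeroth-order guess and plugging each estimate back into the logarithms."""
    q = q_zeroth_order(N, U, s)
    for _ in range(iterations):
        q = 2.0 * math.log(N * q * s) / math.log(q * s / U)
    return q

def speed_interference(N, U, s):
    """Speed with many competing lineages, dropping q inside the logarithms:
    v = 2 s^2 ln(N s) / ln(s / U)^2."""
    return 2.0 * s**2 * math.log(N * s) / math.log(s / U) ** 2

if __name__ == "__main__":
    # Gut E. coli numbers quoted in the lecture: N ~ 1e8, U_beneficial ~ 1e-7, s ~ 1e-2.
    print("gut E. coli in rare regime?", is_rare_mutation_regime(1e8, 1e-7, 1e-2))

    # Lab population quoted in the lecture: N = 1e6, s = 1e-2, U = 1e-4.
    N, U, s = 1e6, 1e-4, 1e-2
    print("q (zeroth order):     ", q_zeroth_order(N, U, s))
    print("q (self-consistent):  ", q_self_consistent(N, U, s))
    print("v (interference):     ", speed_interference(N, U, s))
    # The rare-mutation formula does not apply at these parameters; it is
    # printed only to show how much clonal interference slows adaptation
    # relative to a naive linear-in-N extrapolation.
    print("v (naive N U s^2):    ", speed_rare(N, U, s))
```

Running the recursion a few more times changes q only slightly from the zeroth-order value, which is the sense in which stopping at zeroth order is already pretty good.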