Yeah, thank you, Matteo and Alicja, for the invitation. This has been a very interesting and educational meeting for me. So I want to talk about a story mostly of experimental evolution. This is a collaboration with the lab of Arjan de Visser, and it came out in Nature Ecology & Evolution just a couple of weeks ago. This work has been spearheaded by two former postdocs in Arjan's lab, Martijn Schenk and Mark Zwart. And Mark Zwart actually already appeared in the talk of Sarah Duxbury earlier this afternoon.
So the general context that we're interested in is the factors determining parallel evolution, in particular, of course, the evolution of antibiotic resistance. Very broadly, there are various aspects that are important. On the one hand, there is selection, as typified here by the selection coefficient: the strength of mutational effects, the distribution of mutational effects, and possibly their interactions. Then, of course, there are the rates at which these mutations appear, so the mutation rates. But another important factor that I want to emphasize in this talk is the size of the population. And of course, all these factors interact with each other. There is the competition between mutations that are prominent because of high selection effects and mutations that occur at a high rate; this is the theme of the effects of mutation bias. The mutation rate and the population size together determine the supply of mutations. And there is, in addition, the important effect of clonal interference, which picks out mutations of particularly strong effect in large populations.
So all these factors, you'll see, are interacting in this story that I want to tell you. And a plasmid also appears in the story; otherwise, I wouldn't be here. But you'll see that the plasmid is maybe not the most important thing.
So the model system that we have been working on for a number of years is a well-known antibiotic resistance enzyme, TEM-1 β-lactamase, which originally arose as a resistance enzyme against ampicillin. What our studies and many other studies are looking at is how this enzyme can adapt to novel antibiotics, in particular how it can raise its low activity against cefotaxime by mutations. And we have spent a lot of time characterizing these mutations, determining their properties, their interactions and so on.
But in the experiment that I want to describe, the scope was a bit broader. The question here was: how do populations evolve resistance against cefotaxime in a more open-ended way? So these experiments looked at the adaptation of E. coli to increasing levels of cefotaxime in standard serial transfer experiments over 500 generations. These bacteria had the TEM-1 gene on a multi-copy but non-conjugative plasmid. So they did have TEM-1, but, as I said, this is a very bad TEM-1, which doesn't do much against cefotaxime. In order to maintain the plasmid, there was also a tetracycline resistance gene on the plasmid, and there was tetracycline in the medium. So in this experiment, resistance against cefotaxime could evolve through mutations occurring either on the plasmid or on the chromosome, corresponding to different kinds of resistance mechanisms. This was what we were interested in, and, particularly since we're interested in the effect of population size, these experiments were carried out at two different population sizes: two times 10 to the six and two times 10 to the eight. Altogether, about 100 parallel lines were studied.
Just a word on the experimental protocol. I said the cefotaxime concentration is increasing, and the way this was done was to simply increase the concentration by a fixed factor whenever the OD reached 75% of the ancestral value.
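The concentration rule just described can be written down as a small sketch. The talk gives only the 75% OD threshold; the function name, the step factor of 2, and the example OD readings below are hypothetical placeholders, not values from the experiment.

```python
def next_concentration(conc, od, ancestral_od, step_factor=2.0, threshold=0.75):
    """Return the antibiotic concentration for the next transfer.

    The concentration is multiplied by a fixed factor whenever the measured
    OD reaches the threshold fraction of the ancestral OD. step_factor is a
    hypothetical value; the talk does not state the factor that was used.
    """
    if od >= threshold * ancestral_od:
        return conc * step_factor
    return conc


# Hypothetical OD readings over five transfers (ancestral OD set to 1.0).
conc = 1.0
trajectory = [conc]
for od in [0.9, 0.4, 0.8, 0.6, 0.76]:
    conc = next_concentration(conc, od, ancestral_od=1.0)
    trajectory.append(conc)
```

A line that keeps growing well is stepped up at every transfer, while a line that struggles stays at the same concentration until its density recovers.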
So in this way, we tried to keep these populations more or less at a constant selective pressure. What you can see in this plot are the concentration trajectories of the small populations in red and the large populations in blue. You see that the blue populations are essentially already evolving at the maximal rate, whereas in the small populations there is more variability. And of course, the small populations reach lower resistance levels than the large ones, simply because they have access to fewer mutations.
So this experiment was carried out for 500 generations, and then we looked at what had come out. We were interested in the mutations that were present at the endpoint, and the sequencing revealed about 1200 mutations. What was particularly interesting here is that there are mutations of different classes. There are SNPs, and there are different kinds of structural variants. We decided to distinguish here between small-scale insertions and deletions, less than one kilobase, and large-scale variants, which are larger than one kilobase. Among these structural variants, a particularly prominent one, which was actually unexpected, was a deletion of TEM-1. So a lot of these populations actually got rid of TEM-1 because it didn't do them any good. This is a separate story that I will come back to at the end.
On average, there were about 10 mutations per line, and this number was essentially the same in the large and the small populations. That already tells you that clonal interference played an important role: the bigger mutation supply in the large populations didn't mean that they ended up having more mutations; they just ended up having different mutations. So this is a picture of the panel of mutations that we found. This is of course not easy to read, certainly not in this kind of format, but just to explain a little bit.
So here is the chromosome, here is the plasmid. These are the small populations in red and the large populations in blue. The one thing that I want you to notice here are these red bars on the right-hand side, which correspond to the deletion of the TEM-1 gene. So in particular the small populations often deleted the TEM-1 gene. Since this is a plasmid that is present in multiple copies, in some cases some of the copies had it and others not, but this was a very prominent event, and I'll get back to it at the end of the talk.
Okay, so as I said, the thing that we want to focus on now are these different mutation classes. This is, to begin with, a histogram of the number of mutations per line in these different classes: SNPs, small-scale indels, large-scale structural variants. You see that there are some differences. Most importantly, in the small populations you have fewer SNPs and many more structural variants, which is hinting at the main storyline that I want to get to.
But since we were interested in parallel evolution, we wanted particularly to look at the similarity of these endpoint clones. So we introduced a repeatability index. The difficulty here is, of course, that you're comparing apples and oranges: you have SNPs and you have these large-scale structural variants, and they occur at the same time. So how can you define similarity on that basis? What we did was basically to define a kind of normalized similarity based on the overlap of these mutational events. For a pair of genotypes A and B, with M versus N mutations, you look at the overlap between all the events, their sizes, and you normalize this by the size of either one or the other. So this index is not symmetric under exchange of A and B. In order to symmetrize it, you just take the average of H_AB and H_BA.
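To make the idea of an overlap-based index concrete, here is a minimal sketch of one possible reading of that definition. It assumes each mutational event is a half-open genomic interval (a SNP has length 1), measures overlap in base pairs, and normalizes by the total size of one genotype's events. The interval representation, the base-pair weighting, and all function names are assumptions made for illustration; the published index may weight events differently.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in genome coordinates


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_size(intervals: List[Interval]) -> int:
    """Number of base pairs covered by at least one event."""
    return sum(end - start for start, end in merge(intervals))


def overlap_size(a: List[Interval], b: List[Interval]) -> int:
    """Number of base pairs covered by events in both genotypes."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def directed_similarity(a: List[Interval], b: List[Interval]) -> float:
    """H_AB: overlap normalized by the size of A (not symmetric in A and B)."""
    size_a = total_size(a)
    return overlap_size(a, b) / size_a if size_a else 0.0


def repeatability(a: List[Interval], b: List[Interval]) -> float:
    """Symmetrized index: the average of H_AB and H_BA."""
    return 0.5 * (directed_similarity(a, b) + directed_similarity(b, a))


# Hypothetical genotypes: a shared SNP plus a large deletion in A that
# contains a smaller deletion in B.
genotype_a = [(100, 101), (1000, 3000)]
genotype_b = [(100, 101), (2000, 2500)]
h_ab = directed_similarity(genotype_a, genotype_b)
h_ba = directed_similarity(genotype_b, genotype_a)
h_sym = repeatability(genotype_a, genotype_b)
```

In this example everything in B is also hit in A, so H_BA is 1, while H_AB is much smaller because most of A's large deletion has no counterpart in B. That asymmetry is why the two directions are averaged.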
So this is just a convenient way of being able to determine similarity between mutational events of very different scales. Now, if you do that, then you can quantify the repeatability. Essentially, you take your endpoint genotypes, you determine all pairwise comparisons, and then you average over all the pairs. That gives you a kind of overall repeatability, a pairwise similarity between the endpoints. You can do that on the nucleotide level or on the gene level. On the gene level, you just ask whether a particular gene is affected or not.
One very conspicuous pattern that you see here is that there's more repeatability on the gene level than on the nucleotide level. This is of course expected, because it's a more coarse-grained scale. You also see that repeatability is stronger in large than in small populations. This was also expected, because you have more clonal interference, so things become more deterministic and you expect to have more repeatability. The really striking and unexpected finding, however, is this picture, which now compares the similarity for different mutation classes. Here I just show the SNPs and the structural variants. And you see that repeatability in the large populations is driven by the SNPs. So you have a couple of SNPs that are common and that occur over and over again. But in the small populations, it's much more strongly driven by the structural variants. This means that different classes of mutations are important for adaptation in small versus large populations.
So where does this come from? Our hypothesis was that the population size mediates a transition from structural variants to SNPs, and that this happens because the structural variants tend to have higher rates and smaller effects than the SNPs. And why would that lead to this pattern? Well, this is due to clonal interference. This is schematically shown here.
So if you compare a small population and a large population in a Muller plot like this, and you have two types of mutations, high-rate, small-effect mutations and low-rate, large-effect mutations, then in the small population the first type of mutation that appears is likely to be a high-rate, small-effect mutation, and that will then fix. Now, in the large population, not too much later, a low-rate, large-effect mutation can appear, and this can then out-compete the high-rate, small-effect mutation, so the blue mutation fixes instead. So this is essentially a kind of competition between mutation bias and clonal interference. You have mutations that are favored because of a high rate and mutations that are favored because of a large effect, and the balance between these two is shaped by population size.
Now, this is a kind of theoretical problem that has been discussed in the literature quite a bit; in particular, a minimal model of this was introduced by Yampolsky and Stoltzfus. The minimal version is the following. Suppose you just have some wild-type genotype, and then you have two alternative mutations with rates mu_A and mu_B and selection coefficients s_A and s_B, and let's assume that the A mutation has the higher rate but the lower effect. Then you can ask: what is the probability that A fixes first? This was studied numerically by Yampolsky and Stoltzfus. Now, it turns out that you can develop a rather complete analytic treatment of this problem, based on earlier work that we did with Kavita Jain, and I don't want to go into that here. This is work in progress with Su-Chan Park, and this picture is just supposed to show you that we understand this problem. So this is a comparison between simulations and the analytic formula, and you should just see that they match.
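For orientation, the small-population end of this two-mutation model can be sketched with the textbook origin-fixation approximation: each mutation fixes at a rate proportional to its mutation rate times Kimura's fixation probability, and the chance that A fixes first is its share of the total rate. This is not the analytic treatment mentioned in the talk, and the function names and parameter values are hypothetical. The approximation ignores clonal interference, so it levels off at a nonzero value for large populations instead of going to zero.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Kimura's fixation probability of one new mutant with selection
    coefficient s in a haploid Wright-Fisher population of size n."""
    if s == 0.0:
        return 1.0 / n  # neutral limit
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * n * s)


def prob_high_rate_first(mu_a: float, mu_b: float,
                         s_a: float, s_b: float, n: int) -> float:
    """Origin-fixation estimate of the probability that mutation A fixes
    before mutation B. Only meaningful when mutations are rare enough that
    they do not compete with each other (no clonal interference)."""
    rate_a = mu_a * fixation_probability(s_a, n)
    rate_b = mu_b * fixation_probability(s_b, n)
    return rate_a / (rate_a + rate_b)


# Hypothetical parameters: A has the smaller effect, B the larger one.
s_a, s_b = 0.01, 0.1
for n in (1, 10, 100, 1000):
    equal_rates = prob_high_rate_first(1.0, 1.0, s_a, s_b, n)
    biased_rates = prob_high_rate_first(100.0, 1.0, s_a, s_b, n)
    print(n, round(equal_rates, 3), round(biased_rates, 3))
```

In the limit of very small populations the estimate reduces to mu_A / (mu_A + mu_B), so it is one half for equal rates and close to one for a 100-fold rate bias. The drop toward zero in large populations requires competition between co-occurring mutations, which this sketch leaves out.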
So what you see here is always that the probability for the high-rate mutation to fix goes to zero as the population size increases. These are different pairings of selection coefficients, and these are different ratios of mutation rates. Here the mutation rates are the same, so in the small-population, neutral limit the probability is just one half. Here the ratio between the mutation rates is 100, so you start out essentially only seeing the high-rate mutation, but at population sizes of 10 to the six you only see the high-effect mutation. So this is how population size drives you from mutation-dominated to selection-dominated behavior.
So this is the hypothesis. Now, is this actually true? How can you check that? Of course, it's not easy to determine the rates and effect sizes for these thousands of mutations. So we have two kinds of evidence for this. One is that we can look at time-resolved data. We didn't do that for all the 100 lines, but for five small and five large populations we sequenced the populations at multiple time points and tried to reconstruct Muller plots of these mutational events. Of course, the information that we have is only at these time points, so these plots that Mark Zwart created should be taken with a grain of salt, but this is roughly what we think is happening. And if you do that, then you see that in the small populations the large-scale duplications and deletions, which are red and green here, tend to appear first, whereas the SNPs, which are blue, tend to appear later. In the large populations, the order tends to be the other way around: the SNPs appear earlier and tend to dominate the evolution. This is statistically significant; you can show that the order of events is different in large and small populations.
The other piece of evidence, and this was work by my postdoc Sungmin Hwang, who is now in Paris, was a kind of simulation approach. What Sungmin did was to use standard Wright-Fisher simulations with three classes of mutations, all of which had exponentially distributed fitness effects but with different rates and different mean strengths. So this means that you have six parameters, and there is in principle a kind of map that takes these parameters to the mean and standard deviation of the number of mutations in the respective classes at the endpoint. Sungmin used a neural network to learn this relation, and this then allowed us to estimate the selection coefficients and the mutation rates from the endpoint data. This confirmed the hypothesis in the sense that the SNPs turned out to have the largest selection coefficient and the structural variants the smallest one, and the rates were ordered in the opposite direction, with a very significant difference in the rates between the SNPs and the structural variants. I think the absolute values of these selection coefficients shouldn't be taken too seriously, because there are lots of things about this model that are not realistic, but their ratios are consistent with an analysis based only on the MIC values using a generalized linear model. Okay, so this is basically why we think that this hypothesis is confirmed.
So let me finally say a little bit about the TEM-1 deletion. What we observed here is that TEM-1 is often deleted from the plasmid unless it is rescued by a point mutation. So basically you have this costly and useless TEM-1, and you either have to get rid of it or you have to mutate it to make it work. And this is one of the large populations where this is happening.
So after a hundred generations, to a large extent, TEM-1 has been deleted, but then there is a kind of rescue mutation that is called G238S. This is one of the most powerful activating mutations, which saves the day, and then TEM-1 is maintained. So why does this deletion occur? This is apparently related to the construction of the plasmid, which has repeated regions that tend to promote this kind of deletion. We see that this happens more frequently in small populations. And this is again because the SNPs that you need to rescue TEM-1 are more likely to occur in large populations.
Remarkably, we do not find that this TEM-1 deletion really has a cost or a benefit. These are fitness ratios based on competition assays, and here you see that the TEM-1 deletion is essentially neutral. On the other hand, the G238S mutation, this activating mutation, has a strong benefit at large enough cefotaxime concentrations, and it has a significant cost at low cefotaxime concentration. So it's clear that G238S is selected for, but there is no cost or benefit of the deletion, which implies that the reason why it occurs has to be related to the fact that it has a very high rate. And so, in some sense, this alternative of TEM-1 activation or loss defines alternative pathways to resistance. This is again small and large populations, comparing the MIC increases that you achieve if you activate TEM-1, if you lose it, or if you maintain it. What you can see is that activation gives you a significantly higher MIC than if you get rid of it or if you maintain it. And this activation happens much more frequently in the large populations.
Okay, so with that I come to the end. What I have told you here is a story about the contributions of mutation bias and clonal interference to parallel evolution.
So we saw how the population size selects for different classes of mutational events. Maybe methodologically interesting is this approach of inferring mutation rates and effect sizes. And you see that in order to evolve high resistance, the pathway using this TEM-1 activation is needed. The experiments were done, as I said, by Arjan de Visser, Martijn Schenk and Mark Zwart, and the theory by Sungmin Hwang, who is now in Paris, and also my long-term collaborator Su-Chan Park from Seoul. And so with that, let me thank you for your attention.
Thank you very much, Joachim. So maybe we have time for one or two questions. Any questions? Otherwise we postpone to the discussion session.