Part two, I believe it's called. There is an actual title, but hello again. As I told you last time, the first hour was really there to set the stage for the second hour, so I thought it would be useful to recapitulate what we talked about. Last time we saw some pairwise co-culture experiments and, despite some disagreement on the model, what we saw in the data is that when we increase mortality, faster growers are favored. And when we increase temperature, you can basically reverse the outcome of the competition, and you see slower growers being favored. The second part of the talk tries to extend this prediction to natural settings. What I will show you today is that when we look at the distribution of fast and slow growers in natural settings, we see that it changes predictably with temperature, and we think that it also changes predictably with salinity. The second part is more of an ongoing project that we are actually working on, so it's just a few preliminary results, but the first part is published and you can look it up. So I wanted to circle back to the question that Otto asked yesterday, which was also part of the discussion, and I think it came up again this morning with Terry. The big question is: how do we coarse-grain communities? Ecologists have tried to do that. Otto showed you this nice cartoon where you can see the different roles. But I think one trait that is particularly important is whether you are a fast or a slow grower, because it's something that people can measure, at least for bigger organisms, and in the lab we can in principle measure it for bacteria. For example, looking at these two pictures, if you know anything about plants, you can see that one picture is clearly dominated by slower-growing species. This is a stable forest.
And this instead is a grassland, where in principle you have faster-growing species. These are all annual plants, while those grow very slowly and are perennial. This is just to say that the abundance distribution of fast and slow growers is an interesting descriptor for communities. But the problem is that outside the lab we cannot really measure the growth rates of all the bacteria in a community. First, many of them are not cultured, which is a huge problem. And second, it's not feasible, let's say, if you want to do it in less than 100 years. So the question is: are there any proxies for growth rate, for example in the genome of bacteria? Several years ago now, a paper came out showing that there is a direct proportionality between maximum growth rate and the number of copies of the 16S gene operon. Different species have different numbers of copies of this operon, which basically regulates how many ribosomes the cell can produce. For many bacteria these values are tabulated. There is a huge database, called rrnDB, from the University of Michigan, where you can find a list of thousands of species for which they have complete genomes, so they were able to count how many copies of the 16S gene operon they could find. Then they made some studies. This is a study with growth rates measured in the lab, and you see that they correlate decently with the copy number. We also started measuring growth rates in the lab and trying to correlate them with copy number. It's a decent correlation even for our bugs, which are maybe not the ones that you find there. Those were bugs that were pretty common in the lab; we have some bugs that we find in the soil or in marine environments, and the correlation is still decent. Very nice.
So we can use this copy number to try to get one number that describes the community, and this one number is the weighted mean copy number. Basically, you take all the bugs in your community, you say, okay, this is the copy number of each one, then you assign a weight based on the relative abundance of each species, and then you calculate the mean. Yes?

With temperature changes, this question comes in, right? Because you take a given organism, E. coli, Bacillus, the copy number is not going to change, but the growth rate can change with temperature, more or less. So when you change temperature, what are we talking about here? Are we talking about the potential maximum growth rate? When you measure growth rate, are you varying temperature? I mean, of course, at higher temperature you have faster growth.

Yeah, what I'm saying is that the way I'm describing the community is just based on the copy number, which is a proxy for the maximum potential growth rate. So I'm not actually measuring anything.

Yeah, at some optimal temperature.

Yes, I know that it is different from the realized growth. This is an important point.

So basically the idea is that you can have, for example, a faster-growing species that has copy number three and a slower-growing species that has copy number one. If you have a community with more abundant red bugs, it's going to have a mean copy number that is higher than the mean copy number of this other community, which has more abundant blue bugs that are slower growers. Why do we think we can use the mean copy number as a good community descriptor? Because it has been used in several studies and we know that it responds to changes in the environment. For example, these are studies done with some natural communities.
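As an illustration of the weighted mean copy number just described, here is a minimal sketch. The function name and the two-taxon abundances are hypothetical and chosen only to mirror the red-bug and blue-bug example from the talk; they are not values from any data set.

```python
def weighted_mean_copy_number(copy_numbers, abundances):
    """Abundance-weighted mean of 16S operon copy numbers for one community."""
    if len(copy_numbers) != len(abundances):
        raise ValueError("need one abundance per taxon")
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    # Dividing by the total means raw counts and relative abundances both work.
    return sum(c * a for c, a in zip(copy_numbers, abundances)) / total


# Hypothetical two-taxon communities: red bugs (copy number 3), blue bugs (copy number 1).
copy_numbers = [3, 1]
red_rich = weighted_mean_copy_number(copy_numbers, [0.8, 0.2])
blue_rich = weighted_mean_copy_number(copy_numbers, [0.2, 0.8])
print(red_rich, blue_rich)
```

The community dominated by the copy-number-three taxon ends up with the higher mean, which is the sense in which a single number summarizes the balance of fast and slow growers.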
Basically, we are describing the mean copy number along a succession. For example, the people in this first paper here placed some nutrient broth in natural environments and measured how the copy number changes, and you see it behaves predictably, as we would expect for succession in plants. At the beginning you have more faster growers, and at the end of the succession you have more slower growers. They also measured it in some real microbiomes, so in samples from the field, and these are all post-disturbance: early vegetated, late vegetated, this is after a fire, and this is years of soil development. The idea is that as the community matures you get more slower-growing taxa. So you can detect changes in the mean copy number, and this might be a good description of what's going on in your community. The other study that I wanted to show you is on anaerobic digester communities. In this other paper the scientists supplied manure to the anaerobic digester communities and measured the mean copy number as a function of how much manure they were putting in. You can see that the mean copy number responds to changes in nutrients, and it does what we would predict: with fewer nutrients you have more slower-growing species, and as you increase the nutrients you get more faster-growing species. So I would say that we can use the mean copy number as a good community descriptor. You might wonder if there is any other way to estimate the growth rate from genomic data. Recently a new method has been proposed that is based on codon usage bias. Codon usage bias depends on which synonymous codons the microbes use to encode their amino acids.
You can quantify this bias if you have the full genome, and these people found that the codon usage bias correlates pretty well with the empirical minimal doubling time. In this part of the plot we have the fast growers, and you can see that this codon usage bias, which is quantified only on ribosomal proteins, decreases with the empirical minimal doubling time. The only thing I want to point out, and I think this is true for the mean copy number too, is that in general when we use these descriptors we are not doing a very good job at describing slow growers. Here you can see that the relationship flattens, and it's true for the copy number as well, because many, many slow growers have only one copy of the 16S gene operon. This is a huge number of bacteria, and we know that when people actually measure their growth rates, they are not all the same. So just to tell you that this is a decent proxy, but not the best we could have, because we are doing a poor job in describing slow growers. As a side note, slow growers are probably the microbes that are least well characterized, because most of the time we can't grow them in the lab. The question that we had is whether this proposed generic effect of temperature that we see in the lab can help us understand what we see outside of the lab. This is a project I did in collaboration with Claire, who is now a postdoc at Stanford and is on the job market. We were trying to answer a question that is very dear to me: does what we do in the lab matter when we try to explain the patterns that we observe in nature? So the plan for today: I think I already showed you that the abundance distribution of fast and slow growers can be a very interesting community descriptor.
What I will show is that we think the generic effect of temperature can explain something that we see in the real world: slow growers are more abundant where and when temperatures are higher. We also predict that faster growers are favored by increasing salinity, and we see this in marine microcosms. So what we're doing here, and as I was telling you this is something that I really like, is trying to go from simple experiments in the lab to generate a hypothesis that we can then test, for example using the huge amount of microbiome data that is available. What I will show you today is a project that takes advantage of the amount of data that you can find in databases, thanks to projects like Tara Oceans. If you remember, the idea was that slower-growing bacteria are more abundant when temperature increases. In the ocean we can actually test this, because temperature changes predictably along several axes. Temperature changes predictably during the course of the year, so we know when temperature is going to be higher if we are in the boreal hemisphere. Temperature also changes predictably with latitude, so we know where temperature is higher. And finally we have another axis of variation, going from the surface to the depths of the ocean. So if our predictions are true, we should see that slower-growing bacteria are more abundant in the summer, at the equator, and at the surface of the ocean. This is a super clear hypothesis. So how do we go about testing it? Well, we started by collecting data from data sets that are available.
We found in total seven data sets that report the relative abundance of microbial communities in the ocean across different axes of variation in temperature. In this map I'm only showing you three, and the results that I will be showing you are just for these three data sets, but I can show you that these results hold even if we go to the other data sets. The first one, this green dot here, is a time series that has been collected in the Baltic Sea. Karina and Yaron are two of the scientists who are still collecting this data, and they are our collaborators in the project. This time series has been going on for about eight years, and they are still collecting it. It tracks the abundance of a marine community and it samples pretty frequently: about every two weeks for many years, and then it switched to every month, but it's a huge amount of data. The second data set that we looked at, in yellow, is a transect that goes across the Atlantic, basically from Patagonia to the coast of the UK, and it comes from these two papers. During this cruise across the Atlantic they collected samples not just at the surface but also at different depths, mostly in the photic zone, where there is not that much variation in temperature and it's not too deep. Finally, the purple dots are the published data set from Tara Oceans. Otto talked about Tara Oceans yesterday, and you can see that this data set spans a huge range of latitude and longitude, but also depth, because they covered from the surface to 1,000 meters. The coverage is not perfect at all these points, but there are some, let's say, slices of this data set where you have a lot of points along the depth axis. Each of these dots represents a community, so we could calculate a mean copy number for each of these communities and correlate it with temperature.
So what I'm showing you here, well, we can start with temperature, and you can see that across these data sets temperature changes predictably, as you might expect. As I told you before, the LMO station is a time series in the boreal hemisphere, so temperature is higher in summer and lower in winter. There is some variation, but it fluctuates pretty much the way you would predict. Then this is the transect across the Atlantic, and as you might expect, the highest temperatures are around the tropics and the equator. There is some variation, which is due to depth and is mostly in this area here. And finally, this is a slice of the Tara Oceans data set going from 40 degrees south to the equator. This is the slice where we have the best coverage in terms of depth, and you can see that temperature does exactly what you might expect: it is higher at the surface, then it dips, then you have this change again, but it generally decreases as we go down in the water column. What I'm showing you now is how the mean copy number changes across these axes of variation. You can see that even if there is variation in mean copy number, generally the lowest values of the mean copy number, indicating that you have more slower growers, correspond to the highest values of temperature. This is true for the seasonal data, for the Atlantic transect, and for Tara Oceans with depth.
And again, I think this is, you can see that even if there is variation, even if so, I would say that the mean copy number variation follow pretty nicely the variation in temperature in these different datasets and we can also visualize it in a different way so we can see that this is what we would predict from our previous experiments that the mean copy number decreases with temperature and the other thing I want to draw your attention on is that the slope of this temperature relationship is pretty consistent, almost disturbingly consistent honestly because this is real data and when you see that the slopes are pretty much the same you start scratching your head but the good thing is that at least the intercept of these lines are actually different so which means that the mean copy number is not of course just a function of temperature but there are different things that might be important because these different basins are, so these different datasets come from very different environments so the Baltic Sea is almost brackish and we know that we have more nutrients, yes. Sorry, this is super cool data. My question is about the first plot here. Do you, or maybe this is just a feeling here that sort of weird temperature that you got in 2014 with the LMO? Yes, and the temperature where it never really went up but sort of stuck down and then we have these really long-term effects of very odd growth or maybe also like working at the wrong time scales when we look at communities if it takes it like three years to recover. Well, yeah, that's a good point. Actually to be, so if you look closely to these, to how they are correlated, there is a lag and it's about two to three weeks so and actually you can try to quantify it and you can try to insert some, if you approach it from a modeling standpoint, you can actually quantify these variations, you can insert them in the model I will show you later and you can reproduce a bit. 
But yes, there is some time lag at least in the seasonal data for how you see a change in temperature and what we see in the mean copy number. The other thing is that there are many other things going on that could change the time scale. But yeah, it's a fair point. Yeah, I had a quick question about the mean copy numbers. Is there a reason or could you provide some intuition why we do not see any values about 3.5 or so or 3.3? Well, because for example in the marine environments, you have a lot of slow growers. So one thing that, so the average is actually, so the distribution of copy number is heavily skewed towards slower growers. So you have a lot of copy number one. That is one reason is, well, we are not very good at estimating the growth rate of slow growers with the copy number or the other method. And the other thing, but they're also very abundant. Maybe less in the Baltic Sea. You see the Baltic Sea is a bit higher because it's the basin where you actually, you can have more nutrients, but yes. So the LMO data is really just from one location, whereas the other two are essentially global, at least they span an enormous part of the globe. So my first intuition would be that the yellow dots and these sort of dark reddish dots would be essentially the same line because you're sampling similar communities from an enormous variety of places. Okay. So I can understand that the green ones are above it, but I'm so kind of surprised that the yellow and the dark red ones are actually systematically different. Do you have any idea why? I might have, well, I don't have only one idea. I think that it also, so this is actually, so the yellow ones was sampled very quickly. If I remember correctly was over a season while the TARA data is collected across several years in different seasons. So I think one possibility is that in this data you also have some effects of seasonality while you have less effects of seasonality on the yellow data. 
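The time lag between temperature and mean copy number mentioned in this exchange could be estimated, for example, with a lagged correlation. The sketch below is only an illustration under stated assumptions: the sinusoidal temperature series, the linear response, and the built-in lag of two sampling intervals are all synthetic, and nothing here is taken from the LMO data or from the authors' own analysis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def most_negative_lag(temperature, mean_copy_number, max_lag):
    """Lag (in samples) at which copy number is most anticorrelated with earlier temperature."""
    best_lag, best_r = 0, pearson(temperature, mean_copy_number)
    for lag in range(1, max_lag + 1):
        # Pair temperature at sample i with mean copy number at sample i + lag.
        r = pearson(temperature[:-lag], mean_copy_number[lag:])
        if r < best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


def synthetic_temperature(i, samples_per_year=26):
    """Made-up seasonal cycle in degrees Celsius, sampled 26 times per year."""
    return 10.0 + 8.0 * math.sin(2.0 * math.pi * i / samples_per_year)


# Synthetic series: copy number falls linearly with the temperature two samples earlier.
n_samples, true_lag = 104, 2
temperature = [synthetic_temperature(i) for i in range(n_samples)]
mean_copy_number = [2.0 - 0.03 * synthetic_temperature(i - true_lag) for i in range(n_samples)]

lag, r = most_negative_lag(temperature, mean_copy_number, max_lag=6)
print(lag, r)
```

On real data the minimum would be much shallower, since nutrients and other seasonal drivers change at the same time, so the recovered lag should be read as a rough estimate.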
But of course, this one, so when you look at the, so the thing that to me is striking is that you can still recover a pattern from this data that is collected not with the intent of being consistent with season, with timing, with depth. Even the depth is, you don't have at 200 meters, at 300 meters, this is, well, we're throwing this and then more or less this is the depth, but yeah. So the red point seemed to correlate most strongly, which is just kind of looking at visually. But even though I could also look at the data as a sort of two cluster, one at the low mean copy number and another cluster at the high copy number. But anyway, but no, that's the one taken at the different depths, right? And many things are happening. You mean here? Yeah, yeah, yeah, yeah. Many things are happening when you vary the depth. Of course, temperature is one of the things, right? Yes. Anyway, so that the, But it's true even for seasons, right? I think it's true, not just for depth, but for seasons, because you can have for example. The temperature is changing, but many things are changing. Of course. Yeah, yeah, yeah. That's just one comment. But the, and then the blue guys or green guys, whatever the color is, yeah, it's much more spread out. Even though there, you probably have the most direct correlation with temperature. Okay, I'm a bit lost with colors. Do you mean, are we talking about this one now? Yeah, yeah. Okay. Right. Yeah, and so I'm a bit wary about the concluding. This is, certainly you see a correlation, right? But then the blue ones is probably the most direct response to temperature. Well, I mean, given previously you showed the correlation between gross rate and this measure is like tenfold. I mean, the correlation is really at the order of magnitude scale, right? I mean, with the growth rate. That first slide you had when the correlation was drawn, right, that's really a big, big spread. So, and here you're only changing this number by 50%, something like that. 
Yes, there's a correlation, but there's really a lot of scatter, right? So what I'm wondering is, of course it's a statistical statement, but when you measure this mean copy number, do you take, say, species abundance into account?

Yes, it's weighted by the abundance, the relative abundance.

So, for example, I could have a situation where SAR11, of course, has one copy, and it's all over the place, and just by varying the amount of this thing you could have a, you could have a,

Hopefully I will respond to all of your questions. Hopefully. But yeah, these are all things that we, you know, you see this, it's a correlation, there are many things going on, then you have a ton of SAR11. So we try to be systematic. Of course, this is not 100% proof. We're trying to address all these points because these were also the points that were bothering us. I didn't believe this correlation for a year, basically. Then we started digging more, and I'm more confident now, or at least I think it's there.

All right, so before going into all the caveats that were raised, and they're absolutely fair, I just want to show you a phenomenological model. This is the morning of phenomenological models. The simplest thing was to take the Lotka-Volterra model that I showed you two days ago and extend it to, I don't remember, 100 or 500 species, I think we tried both. So it's basically the generalized Lotka-Volterra model, where you have a matrix that describes the competition coefficients. It's exactly the same thing. We still keep mortality in, because from what you read in the literature, mortality is an important factor in the ocean.
We consider it again a global mortality, because you can have viral lysis, you can be flushed away by currents, there can be many things. So we still have a global mortality. The way we are modeling the dependence of growth rate on temperature is with an Arrhenius term. And here we have a factor in front of it, this r, which is the way we are distinguishing between fast and slow growers, so we get a distribution of growth rates. The way we get this distribution of growth rates is that this r represents the copy number. The distribution that we're using to draw these copy numbers is a geometric distribution, which has biological meaning, which is something we liked: in the ocean you see that there are a lot of copy-number-one organisms, and the p of the geometric distribution represents the abundance of the organisms with copy number one, which is the highest abundance. What I wanted to show you here is that p is the only parameter that you use for each data set to fit a GLV model to the data. So here I'm showing you all the points of the mean copy number as a function of, oh yeah, sorry, this hundred species was supposed to be here, it's there. So as a function of time, of latitude, and of depth. The points represent the measured mean copy number, and the black line is the fit of the model, which, for being a phenomenological model, for being as disgusting as you can think, does a pretty good job of recapitulating the data. And you can fit a linear regression to the data, I'm coming there, and the fitted slope is very similar to the observed slope. Yes?
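To make the ingredients of the model just described concrete, here is a minimal sketch of a generalized Lotka-Volterra simulation with global mortality, Arrhenius temperature scaling, and copy numbers drawn from a geometric distribution. Everything specific in it is an assumption made for illustration: the exact equation form dN_i/dt = N_i [r_i(T) (1 - sum_j alpha_ij N_j) - delta], the growth rate r_i(T) = a times copy number times an Arrhenius factor, and all parameter values (a, delta, activation energy, reference temperature, interaction range). It should not be read as the authors' implementation.

```python
import math
import random

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_factor(temp_c, ref_c=15.0, activation_ev=0.6):
    """Growth-rate multiplier relative to a (hypothetical) reference temperature."""
    temp_k, ref_k = temp_c + 273.15, ref_c + 273.15
    return math.exp(activation_ev / BOLTZMANN_EV * (1.0 / ref_k - 1.0 / temp_k))


def draw_copy_numbers(n_species, p, rng):
    """Geometric draw with 0 < p <= 1: P(copy number = k) = p * (1 - p)**(k - 1)."""
    copies = []
    for _ in range(n_species):
        k = 1
        while rng.random() >= p:
            k += 1
        copies.append(k)
    return copies


def simulate_glv(copies, alpha, temp_c, a=0.5, delta=0.3, dt=0.01, steps=5000, n0=0.01):
    """Euler integration of the assumed gLV equations; returns final abundances."""
    scale = arrhenius_factor(temp_c)
    rates = [a * c * scale for c in copies]  # growth rate proportional to copy number
    n = [n0] * len(copies)
    for _ in range(steps):
        updated = []
        for i, n_i in enumerate(n):
            crowding = sum(alpha_ij * n_j for alpha_ij, n_j in zip(alpha[i], n))
            growth = n_i * (rates[i] * (1.0 - crowding) - delta)
            updated.append(max(n_i + dt * growth, 0.0))
        n = updated
    return n


def mean_copy_number(copies, abundances):
    """Abundance-weighted mean copy number; NaN if every species went extinct."""
    total = sum(abundances)
    if total == 0:
        return float("nan")
    return sum(c * x for c, x in zip(copies, abundances)) / total


def demo(seed=0, n_species=15, p=0.5):
    """Small run at two temperatures with random (hypothetical) competition coefficients."""
    rng = random.Random(seed)
    copies = draw_copy_numbers(n_species, p, rng)
    alpha = [[1.0 if i == j else rng.uniform(0.0, 0.5) for j in range(n_species)]
             for i in range(n_species)]
    for temp_c in (5.0, 25.0):
        final = simulate_glv(copies, alpha, temp_c, steps=3000)
        print(temp_c, round(mean_copy_number(copies, final), 3))


if __name__ == "__main__":
    demo()
```

In this sketch p plays the role described in the talk: it sets the fraction of species drawn with copy number one, while the interaction matrix is drawn at random rather than fitted.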
I think I have missed like a small piece, I understand how the P comes into play in the full model, like through the growth rate, but then you also have the competition coefficients between the species and the mortality rates, so how, since you say you have only one parameter to fit the model, how are those expressed in terms of this? So it's not expressed in terms of this, so you can basically only, you can use a random set of parameters and the result doesn't change. So the only thing that you really need for like fitting the data, fitting the model to the data is this parameter P that comes from the original data set. And actually here, this is what I'm trying to show you, is that with the colored bars, you see the measured abundance, relative abundance of each copy number in the original data set, and the black bars are the, after the, at the end of the simulations, what the, the abundance distribution of the copy numbers in the simulations, yes. So I'm a bit lost, maybe I just understood something wrongly, but how I understood that you pitched this whole thing is that we can use the mean copy number to kind of infer the potential maximum growth rate. And then you... The distribution of growth rates. The distribution of growth rates. So if you have abundance of the species, why can't you just infer kind of the actual growth rates of different species if you have a time series over time? Yes, you can. I mean, you have the copy number for that. You can follow the copy number, each copy number in time, that's what you're saying. No, the copy number is like the abundance of species or do you only do this over the copy number? I use the abundance of species to, let's say, give a weight to the copy number of that species for each point in time or in space. Yes, and if I have, so maybe this is a super question, but if I have a time course and I know the abundance of species, of each species, why can't I just infer the growth rate from this? 
I don't think I get the question, sorry.

Okay, I will ask you afterwards.

No, no, but from the abundance, I can tell you, yes, this species has this abundance, and I have the distribution of copy numbers, and that's the inferred growth rate. But the idea was to get a description of the community. You can follow the abundance of each species in time, but that is not coarse-grained. Here you're asking: how do I coarse-grain? How do I describe the community instead of taking the abundance of each species?

I have another question. So in this fit...

I don't know if I answered her question, but...

I think I have an answer, I'll try to answer it.

Fair enough.

So in this fit, you are using this parameter p in essence to explain the offset between the three sets of points in the right plot.

Yes.

And does that make sense? What does it mean, sort of?

It means that, for example, in the Baltic Sea you have fewer slow growers compared to the other two data sets, which are oceanic, where from what we know there is a lot of SAR11, or very slow-growing bacteria. So I think it makes sense from what we know.

I have a question about the model, because you say that trying different realizations of the interactions doesn't change the results, okay? But doesn't this make the interactions actually useless? My point is, why shouldn't one try to fit the data with, I don't know, something simpler, like logistic growth or whatever?

The interactions, you can throw them away, it's true.

Okay, but doesn't that seem scary to you?

It seems a bit scary, but I think that if you go back to your single bugs in your scammer stuff, the logistic model with the growth rate as a function of temperature describes what is going on if you grow them in isolation. So I think it's not that scary. It seems that it's actually so general that you don't need interactions.
So it's just how each species is fighting against mortality as a function of temperature. I think that it's reassuring rather than scaring. Okay, but scaring maybe wasn't the best word. No, no, but it's fine. I mean, why one should retain the interactions once you see that they are not relevant at all? Perhaps this is for historical reasons. We started with the Lotka-Volterra model and I think that if I show you the logistic model, I think people, I mean, already with the Lotka-Volterra model there was a lot of pushback. If I show you the logistic model with no interaction, I don't know if it might be even worse. Okay, okay, so from, this is more social, I don't know, motivation to retain the alphas. We retain it because for historical reasons. Okay, okay, I'm fine with that. Okay, thank you. But in the paper we actually have, in the supplement, we do it without interactions as well. Okay, I agree with you at this point. Thank you. There was someone, something else. It was along the same lines. I wanted to ask how many parameters do you fit? Do you fit the alphas every time? Do you have a distribution of the alphas after you fit them? If you simulate the model with Monte Carlo, do you, with these probability distributions, do you obtain the same results? Is the relative species abundance? For example, for Tara, fat-tailed, as it is in the Tara dataset, or does it not have fat-tails anymore if we use this model? Does it respect the global properties? I have no clue. So about Monte Carlo, I don't know. The only thing that I can tell you is that Claire, who did the simulations, we have a sort of sensitivity analysis. So you can see for which range of parameters of mortality, interactions, and yeah, mortality interactions, and number of species you can change. You basically change a bit the fit, but it doesn't change that much. That's the only thing I can tell you. Every time all the alphas arise, how many parameters do you fit per data point? Like, how many? 
Data point, this is because the number of species, it depends on the number of species. I think we had 100 species in these simulations, and so you have the matrix corresponding to all the possible interactions. Because if there are all the alphas... But not every time. So the alphas are drawn once for each simulation. Yeah, because it's... They don't change over time. So they change based on growth rates. It's not that they... Okay, so you have data with time, and you fit once all the time. No, we fit the temperature. So you fit based on temperature, and then of course you have one data point. Then you get, you can plot it this way because we are using the temperature that were measured in the data, and then you get from the simulations the value of the mean copy number that you get based on a matrix of... It's a distribution of growth rates. Honestly, now I don't remember what was the mean and variance of the distribution. I think we have a ton of this in the supplement, but I don't remember all the details. I know that when you look at the... So mortality was an important... So mortality was the one for which it was more sensitive too, but again, it was a decent... So the parameter range that you can use is pretty wide, and you get always the same thing. I think we still have two burning questions, but after that we'll have to let the speaker get back to the next slides. I was just wondering what's the role of this global mortality in the equations because you could rescale everything and then you can just wash out its parameter. Don't you worry, it's just... Yeah, but it's rescaled then in the model. It's just... So it's again... So it's the same model that I showed you the other days. So in the end, what matters is how the... Again, this mortality burden is the ratio between the mortality and the growth rates. And it's on this ratio that depends how the competition coefficient change with temperature. 
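The role of the mortality burden, the ratio between mortality and growth rate, can be illustrated with the interaction-free version of the model that came up earlier in the discussion. The sketch below assumes logistic growth with global mortality, dN/dt = r(T) N (1 - N) - delta N, whose steady state is N* = max(0, 1 - delta / r(T)). The proportionality of growth rate to copy number, the truncation of the geometric distribution at a maximum copy number, and all parameter values are hypothetical choices for illustration only.

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_factor(temp_c, ref_c=15.0, activation_ev=0.6):
    """Growth-rate multiplier relative to a (hypothetical) reference temperature."""
    temp_k, ref_k = temp_c + 273.15, ref_c + 273.15
    return math.exp(activation_ev / BOLTZMANN_EV * (1.0 / ref_k - 1.0 / temp_k))


def geometric_weights(p, max_copy):
    """Geometric probabilities for copy numbers 1..max_copy, renormalized after truncation."""
    raw = [p * (1.0 - p) ** (k - 1) for k in range(1, max_copy + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def steady_state_mean_copy_number(temp_c, p, a=0.5, delta=0.3, max_copy=10):
    """Mean copy number when each species sits at its own logistic steady state."""
    scale = arrhenius_factor(temp_c)
    numerator = denominator = 0.0
    for k, weight in enumerate(geometric_weights(p, max_copy), start=1):
        burden = delta / (a * k * scale)      # mortality burden delta / r(T)
        abundance = max(0.0, 1.0 - burden)    # extinct if mortality exceeds growth
        numerator += weight * k * abundance
        denominator += weight * abundance
    return numerator / denominator if denominator > 0 else float("nan")


cold = steady_state_mean_copy_number(5.0, p=0.6)
warm = steady_state_mean_copy_number(25.0, p=0.6)
print(cold, warm)
```

Under these assumptions, warming raises every growth rate and so lowers every burden, but the relief is largest for species with low copy number. The mean copy number therefore falls with temperature, in the same direction as the data shown in the talk.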
Similar kind of question, but my main concern is this parameter A, that's the relation between copy... It's a tenfold change in growth rate. I mean, based on the raw data, the correlation between growth rate and the mean copy, whatever, the copy number, has a really big variance. So if you generate a model where this A varies by 2, what will happen?
We tried that. So we also have that. It's true that even in the experiments, the differences in growth rates between the species that were tested were not picked. They actually measured A in the first paper, the one with Simon and Claire, in which they were looking at the pairwise co-cultures. And again, we tried different ranges of A.
No, no, we can't.
So we tried, you know, again, a distribution of values. And there is a range of A for which things don't change. I can show it to you later.
Yeah, possibly.
But for a range of A's, you still get the same things. And we went to the literature to look at the possible A values, so we didn't actually do it randomly. Like in casey review, there was a range of A's.
Yeah, okay. I don't know where I am on that curve.
But more or less, when we were also checking the bugs, what we made sure of is that we were still within the thermal limits. So for all these species, you can kind of have an idea of where you are. And all these points are actually in the range of mesophiles.
I know, but it actually matters where I am on the curve. If I have the growth rate as a function of temperature, it matters for the result at which point of the curve you are. And there's also a natural death rate, right? In the ocean, people say there's a natural turnover of everything in a matter of a day or two.
Yeah, the delta.
Yeah, okay. But that's roughly the same for everybody, right?
Yeah, well, we don't have the...
So if we want to look at the data, you don't actually have the delta for each species. Even when they measure it in the ocean, they say it's more or less 80% with a certain frequency. So the simplest thing you could do is just take what they are able to measure. It's a bulk measurement of death rate, and that's what we're putting in the model.
And it varies.
Okay, I understand that it can change a bit. But I think that even in the simulations, if you make the delta change with each species, you get the same result in the end. And I actually thought with someone at a certain point. Well, I can tell you this later. But anyway, I think that the global delta is what approximates these bulk measurements.
No, okay, so you're not going to someone else. All right.
Okay, now let's get to the point that temperature is not the only thing that is changing, which is a very fair point. And here I want to make a disclaimer before I go into this: we can measure temperature very well. We can measure salinity really well. We can measure many things really well. Nutrients are a bit of a pain because, in principle, what you're measuring is the balance between what is produced and what is consumed. So the actual concentration of nutrients is always the outcome of two processes, and what we're measuring is that outcome and not exactly what is excreted or what is consumed. So, disclaimer. But for the LMO data, for which we have the most data available, they measured dissolved organic carbon and the concentrations of nitrate, ammonium, and phosphate. And you can see that each of these nutrient variables has some seasonal changes. For example, it's pretty clear that the DOC is pretty high in summer compared to winter.
So given the disclaimer that I gave you, one point is that, in principle, we should see more faster-growing species here because there are more nutrients. This is actually what they saw in the previous paper that I showed you, where they supply manure and they see more faster-growing species. So I think this is a good data set to test that prediction. What we did is a statistical analysis. We just asked: if we take into account all these different variables, is the negative effect of temperature on mean copy number still there, and is it still significant? And the answer is basically yes. These are the three data sets I showed you before. Here is temperature, and in this table the color represents the magnitude and the sign of the parametric coefficient that we estimate from this generalized linear model, multiplied by the standard deviation of the predictor so that we can compare across predictors. And again, you can see that the parametric coefficient for temperature is always negative and significant, but that doesn't mean that other things are not important.
So no, no, this is with everything. Every time, it's all in the model. These are separate, so I take the LMO and I... Tara is with that, I'm sorry. So the Tara is with... You can specify that, but you can also take latitude into account in these models. You can have it, for example, in the random part, where you don't get a parametric coefficient but just how much the slope or the intercept of the relationship is affected by latitude or depth. So you take into account the axes of variation of temperature, and then you ask which of these variables is important. And what this analysis is telling you is that temperature is kind of always important. The other thing is that you have to take into account whether the different variables are correlated among themselves, but apart from this...
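The scaling described for the table, each coefficient multiplied by the standard deviation of its predictor, can be sketched as follows. This is a simplified illustration: it swaps in ordinary least squares for the generalized linear model with random effects mentioned in the talk, the function names are hypothetical, and the numbers are invented only to exercise the code. They are not data from any of the studies.

```python
from statistics import pstdev


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def scaled_effects(predictors, response):
    """Least-squares slopes, each multiplied by the SD of its predictor."""
    names = list(predictors)
    n = len(response)
    # Design matrix with an intercept column followed by one column per predictor.
    X = [[1.0] + [predictors[k][i] for k in names] for i in range(n)]
    p = len(names) + 1
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)] for a in range(p)]
    Xty = [sum(X[i][a] * response[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)
    return {k: beta[j + 1] * pstdev(predictors[k]) for j, k in enumerate(names)}


# Invented toy values, for illustration only.
temperature = [2.0, 8.0, 14.0, 20.0, 26.0]
nitrate = [5.0, 1.0, 4.0, 2.0, 3.0]
mcn = [3.0 - 0.05 * t + 0.1 * n for t, n in zip(temperature, nitrate)]
effects = scaled_effects({"temperature": temperature, "nitrate": nitrate}, mcn)
print(effects)
```

Multiplying by the standard deviation puts every predictor on the same footing: each scaled effect is the change in mean copy number for a one-standard-deviation change in that predictor, so variables measured in different units can share one color scale.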
So you can see that there are other factors that might be important, like ammonium or phosphate, which affect the mean copy number, but the effect of temperature is still there. And I want to show you another two data sets for which we have other variables that are extremely important. This is a Mediterranean data set. We have a very, very short span of temperatures, and they are always high temperatures. And you can see that there is another important factor that affects the mean copy number, which is how many hours of light you have during the day. So what I'm trying to tell you here is that of course there are many other things that can affect the mean copy number, but temperature is there and is kind of always exerting a negative effect. All right. That was one concern.
The second concern is that there are tons of oligotrophs. One comment that I received once is, well, probably you're just tracking the distribution of these oligotrophs, and in particular SAR11, which is super abundant in the ocean. And I remember Otto telling me, you should do this analysis. So what we did was exclude the oligotrophs, so everything that had copy number one, from the analysis. And what you can see is that the signal is actually stronger. So the negative relationship between mean copy number and temperature is, I think, stronger. Or at least it's still there. This is the same analysis that I was showing you before. In each data set you see the relative abundance of the species with copy number one, and you can see that LMO actually has fewer oligotrophs compared to anything else. And you still see that the effect of temperature is negative. The other thing I wanted to show you is that you can change the method with which you estimate growth rates.
This is based on the codon usage bias, from which you get a predicted growth rate based on a data set that the authors produced. And here the signal is there, but it is less strong. It actually becomes stronger when you don't include the slow growers in the analysis, I think because, especially with this method, we are really bad at capturing slow growers. All right. So these were the concerns we had: oligotrophs and different nutrients.
And then we started to say, okay, can we see this in other places, not just the ocean? Recently I came across a study done in compost communities. In these communities, temperature actually rises because of the activity of the microbes inside the compost. And you have this strong increase. This is the beginning, when they're still not doing anything. Then temperature increases, and then it starts decreasing over the course of the days. They measured temperature across all these days, and I think there were 10 compost communities. They decided to also measure the mean copy number, and you can see it even in this study, which covers a different range of temperatures. Here we are not dealing with mesophiles anymore but with more thermophilic species, and we still see that the mean copy number decreases with temperature. It's actually a very strong correlation, probably better than what we saw. It's a difference of one and a half copies over almost 50 degrees of temperature, which is, I think, proportionally the same decrease that we saw over our, let's say, 30-degree span of temperature in the ocean. And the other thing that I wanted to show you is that this effect might also help to explain other things that have been bothering us. For example, the distribution of Prochlorococcus and Synechococcus.
So I'm not saying that this is the only explanation, but it's consistent with what we see. I don't know if you're familiar with Prochlorococcus and Synechococcus. These are the two most abundant cyanobacteria in terms of biomass across the oceans. And those bugs contribute a lot to primary productivity and the production of oxygen. And we have known their distribution for quite a while. Prochlorococcus is more abundant in the tropics and at the equator, and its abundance starts to decrease as soon as we reach more temperate latitudes. Synechococcus, instead, is fairly abundant everywhere, but its distribution across latitudes is broader, and it is also present at more temperate latitudes. I don't know why I cannot say it. So it turns out that Prochlorococcus, from growth rates measured in the lab, is a slower grower compared to Synechococcus. Yes?
Just out of curiosity, how does one sample the distribution of these species across the entire ocean?
Well, these are extrapolated data. They have measurements from cruises across the ocean, and then they can extrapolate the data. So this is fair. Prochlorococcus was discovered, I don't know, many years ago by Penny Chisholm, and since then it has been studied consistently, so there are many, many groups sampling Prochlorococcus and Synechococcus. I think in this paper they were using such a combined data set to interpolate depending on the latitude.
What's the difference in lab growth rate and copy number between Prochlorococcus and Synechococcus?
One for Pro, two for Syn. And then they have measurements of growth rates. Of course, then it depends on the strain. So let's say the distribution for Prochlorococcus is centered around one and the distribution for Synechococcus is centered around two.
And then let's move away from the ocean. We saw that it works in these compost communities. And here, in this study that is still going on in the Harvard Forest, they are heating plots of soil and then measuring the composition of the communities. And you can see that, after 20 years of warming, the copy number decreases over the course of the years in the plots that have been warmed. How much time do I have? Three minutes? Four, great.
So okay, we had so much fun doing this, or at least I had a lot of fun. So we said, well, maybe we can predict something else. Yana had just joined the group, and so we decided, well, maybe we can work with salinity, which is another fundamental variable in the ocean. We know from several papers that salinity might be more important than temperature or any other variable in governing the distribution of bacteria. In this paper they examined many, many different samples from different environments, and then they did a PCA, which is what you do when you have a lot of data, and they saw that the axis that best separated these samples was the one between non-saline and saline environments. It was super striking. So salinity actually had a larger effect than temperature. So we said, well, maybe we should look into salinity because it might be important. I showed you before the growth rate as a function of temperature. Well, it turns out you can also look at growth rates as a function of salinity. This is actually what we are measuring in the lab. We isolated several marine strains from samples that we got from the ocean, and we started growing them at different salinities. And you can see that there is a pretty conserved relationship, which has also been shown in textbooks: at the beginning the growth rate increases with salinity, but then it starts decreasing past a threshold.
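The rise-then-fall shape just described can be sketched with the Ratkowsky square-root model, a standard curve for growth rate against temperature. Using it with salinity in place of temperature is an assumption made here for illustration, and the parameter values are hypothetical, not fitted to any measurements from the talk.

```python
import math


def ratkowsky_rate(x, b, c, x_min, x_max):
    """Growth rate from the Ratkowsky square-root model.

    sqrt(rate) = b * (x - x_min) * (1 - exp(c * (x - x_max)))
    for x_min < x < x_max, and zero outside that window.
    x is temperature in the original model; here it stands in for
    salinity, which is an assumption of this sketch.
    """
    if x <= x_min or x >= x_max:
        return 0.0
    root = b * (x - x_min) * (1.0 - math.exp(c * (x - x_max)))
    return root * root


# Hypothetical parameters, chosen only to produce a hump-shaped curve.
params = dict(b=0.05, c=0.1, x_min=0.0, x_max=60.0)
salinities = [0, 10, 20, 30, 40, 50, 55, 60]
rates = [ratkowsky_rate(s, **params) for s in salinities]
for s, r in zip(salinities, rates):
    print(s, round(r, 3))
```

Fitting the four parameters to measured growth curves would need a nonlinear least-squares routine, which is left out here; the sketch only evaluates the curve to show its shape.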
So basically, I think you can use either Arrhenius or Ratkowsky, and instead of using temperature you can use salinity and still fit these models to these curves. I'm not going to show you any of this. I just want to tell you the idea, which is that the effect of salinity is to favor the faster-growing species. If you think about temperature, when you increase temperature you increase the growth rates, while when you increase salinity you are decreasing the growth rates. So the idea is that salinity has the opposite effect of temperature in our phenomenological model, with the growth rate as a function of salinity.
So in this case we decided, well, we tried pairwise co-cultures, we looked into the real data sets, so we can try a different approach, which is using marine microcosms. We took an approach that is very similar to what Terry showed you about the Goldford paper. We went to different places. We went to the Charles River, we went to another place in Boston, and we went to Nahant, which is a very nice beach. We collected water in bottles. We concentrated it and then filtered it to get the bacteria. And then we used these different inocula that we collected in different places to start a culture in marine broth. This marine broth had different concentrations of salt. And we did daily dilution experiments with a community that we don't know. What we do next is 16S amplicon sequencing to get the final community. And since we have the species, in principle we have the copy number, and we also have several growth rates. So we can actually estimate the real growth rate of the community, because we isolated a lot of the bugs that are present here. And we can calculate the mean copy number. And these are the results. We just looked at three salinities: 16‰, 31‰, and 46‰. We want to expand a bit more the range of salinities that we can look at experimentally.
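As a rough sketch of the last step, the mean copy number of a community can be computed as an abundance-weighted average of per-taxon 16S copy numbers. The exact pipeline was not described in the talk, so the function name, the weighting by relative abundance, and the taxon names and values below are all assumptions for illustration. The optional filter mirrors the earlier check of leaving out taxa with copy number one.

```python
def mean_copy_number(abundance, copy_number, min_copies=1):
    """Abundance-weighted mean 16S copy number of a community.

    abundance:   dict taxon -> relative abundance (any positive weights)
    copy_number: dict taxon -> 16S operon copies, e.g. looked up in a database
    min_copies:  taxa with fewer copies are dropped and the remaining
                 abundances are renormalized (2 excludes copy number one)
    """
    kept = {t: a for t, a in abundance.items() if copy_number[t] >= min_copies}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no taxa left after filtering")
    return sum(a * copy_number[t] for t, a in kept.items()) / total


# Hypothetical community, for illustration only.
abundance = {"taxon_a": 0.5, "taxon_b": 0.3, "taxon_c": 0.2}
copies = {"taxon_a": 1, "taxon_b": 4, "taxon_c": 7}
all_taxa = mean_copy_number(abundance, copies)
without_single_copy = mean_copy_number(abundance, copies, min_copies=2)
print(all_taxa, without_single_copy)
```

Because amplicon reads themselves scale with copy number, a real analysis may first convert read counts to cell abundances before averaging; that correction is not shown here.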
But even though we have different effects with different inocula... These are the starting inocula. This one comes from the Charles River, where the salinity of the community at the origin is basically zero, while for Nahant and the other site the salinity is that of the ocean, so it's around 31‰. We can see that in general the mean copy number increases with salinity. So what I think we are showing here is that we found a phenomenological explanation, and we can discuss again whether it's true or not, but we see that communities change predictably with salinity and temperature across the ocean and possibly also in other environments. And I think that the bottom line is that slow growers tend to thrive when the environment is more benign in general, so when there is not much salt and when the temperature is cozy. And if you think about many of the patterns that we see in the wild, like why we see increased diversity toward the equator, it might be because these are the more favorable conditions that allow the coexistence of slow growers and fast growers. But this is just speculation, because the models that describe the relationship between temperature and diversity say that, well, at the tropics the environment is just better. And I think this is going in that direction. So there are many reasons why the environment is benign, and I think there is a reason why we can then see slow growers thriving. And I think that's it.
We will give her an extra three minutes for questions. All right, it seems everything has been asked before already. Dan. Do you also need the charger? Okay, then I'll leave it here, just so I can get everything later.