Great, so we are about to stop, but before I introduce the next lecture, let me remind you to check the program frequently for the next two days. In particular, tomorrow we are going to have a colloquium by Ned Wingreen on modeling microbial diversity, and I strongly suggest you attend it. Again, a reminder: if you go to the program on the website, you will find a link to register for the separate Zoom meeting. So follow the link and register, and you can also follow it on YouTube. Then tomorrow there will be the first of three round tables on the pandemic, also with an interface with economics. We have a great set of panelists. And then on Thursday we will have the last three lectures and two round tables. Just to remind you what to expect from the round tables: this is going to be a sort of free, informal discussion between the panelists, and there will be time for you to ask questions and express your opinion, so it aims at being an informal conversation. That said, I'd like to introduce again Alvaro Sanchez, who is giving the second of three lectures on the assembly and evolution of microbial communities. So thank you, Alvaro, for being with us. When you're ready, you can share the screen and start the presentation. I think you're muted.
Am I muted? Can you hear me okay now?
Yes, perfect.
Perfect, perfect. Okay. Let me see if I can move this here. Okay, so I'm going to continue what I started yesterday and tell you about the work we're doing in our lab to try to understand the rules that govern the assembly of complex microbial communities using enrichment cultures. Before I get into today's material, I wanted to give you a brief summary of yesterday's most salient points, just to emphasize what the basic results are and what it is that we're trying to explain. A big question that I'm very fascinated with is how reproducible microbial community assembly is.
This is a question that can be explored in natural systems, and in fact it has been. I gave you one example among many, one that I believe is particularly interesting: the work by Stilianos Louca, Michael Doebeli, and their collaborators, who examined the assembly of microbial communities in the water tanks that form at the base of the foliage of bromeliad plants. These are tropical plants. I think this particular study was done in Brazil, looking at water tanks in plants that are in close proximity to one another, so that many of the physical components of the environment are shared among those habitats (temperature, humidity, etc.), and so is the regional pool of species. When these folks looked at the microbiomes of these different plants, they found that they were very variable from plant to plant. Most of the OTUs found in one plant were absent from the others, and only about 1% or less of all the OTUs were shared among all plants, so there was a lot of taxonomic variation in this microbiome. Yet when they looked at the metagenome and examined the abundance of different genes involved in known metabolic pathways, they found that the fraction of the metagenome devoted to, for instance, fermentation, respiration, carbon fixation, nitrate respiration, and so on and so forth was very consistent from plant to plant. They found very similar quantitative ratios of these metabolic functions in all of these habitats, despite the substantial taxonomic variation across them. I also mentioned that this same finding has been made in a wide range of other habitats: in marine environments, in the human microbiome, and in other communities, and the list is long. However, in natural habitats it is difficult to disentangle the underlying mechanisms.
What are the mechanisms underlying this phenomenon? There are many ecological forces that shape the assembly of microbial communities. Some of them are governed by chance, such as mutations, or the arrival of species into a habitat through dispersal. Other ecological processes are more deterministic, such as selection, which you would expect to be a force that generates more homogeneity in community composition across habitats, at least those that are very similar. You can also have the confluence of the two, which we can call historical contingency. There are many different mechanisms that can lead to historical contingency. One of them is environmental modification: habitats that are very similar to one another get colonized by bacteria, and the metabolic activity of those bacteria changes the habitats in subtly different ways. That causes an inevitable level of historical contingency, because when different habitats get colonized by different taxa through random dispersal, habitats that were originally very similar become less and less similar as a function of time. So the problem in understanding this question of reproducibility is that, in addition to there being many different ecological forces that shape the assembly of microbial communities, the selective pressures present in most natural habitats are very difficult to know. We may infer them, or we might try to guess what the selective pressures might be through data or just through knowledge of the environment, but there are many of them. If you think about it, we really don't know all of the physical, chemical, and nutritional niches that might be occurring in a particular habitat, because they're so small; many of these are molecular.
And really understanding in detail what forces shape selection in different habitats is something that is very, very difficult to do in nature. So the approach I'm following in my lab is to ask whether it is possible to study the process of microbial community assembly in synthetic habitats, where we can know the selective pressures. We can know the chemical composition and the nutrient composition of these habitats, so we know exactly what niches we are supplying. We can also know for a fact all of the spatial determinants of microbial community assembly, because we can fix them. We can know the temperature and the pH; we can control all of these factors. And we can also control ecological factors and ecological processes that contribute to community assembly: we can control the rate of migration from the regional pool and the composition of the regional pool. We can control, to some extent, the population size, the connectivity between habitats, and so on and so forth. So the question we're asking is: if we know all of the things that we do not know, or that are very difficult to know, in natural environments, can we then understand more mechanistically the origins of these patterns of convergence at the functional level but variability at the taxonomic level that are found across a large number of natural habitats? That's what my lab is trying to do. In other words, we're trying to understand more mechanistically the reproducibility of microbial community assembly. This is just to give you an idea of the experimental pipeline we're following in my lab. We're using high-throughput enrichment communities. We take samples from the environment, for example soil samples. We put that in a bottle with a saline solution. We filter this to eliminate all the larger particles, so that we're left only with the bacteria.
And then we sample from that large bottle of bacteria into a small, miniature test tube with a defined synthetic medium. Here we are supplying nutrients to the bacteria that we have a lot of control over. In the first part of today's talk that nutrient is going to be glucose; later on, I'm going to tell you what results we get when we use other resources as well. After we inoculate by random sampling from the regional pool, we let the bacteria grow for a period of typically 48 hours, and after that period of growth we apply a bottleneck. We take a small sample from here and add it to a new little test tube where we replenish the nutrients, and we let that grow again, and we can iterate this process, as I showed you. At the end of every 48-hour period we do 16S sequencing to quantify the abundance of different species in our communities. I also showed you the results of a typical experiment. This is the relative abundance of different genera in one of these enrichment communities as a function of time, that is, as a function of the transfer we're looking at. This is relatively shallow sampling: if you think about it, we're just sampling 10,000 random individuals from each community and identifying them. That's essentially what we're doing. Each color here represents a different genus, and the width of the column represents the relative abundance of that genus in the community. What we see in this plot is that after about eight to nine transfers, community composition stabilizes and becomes quite constant through time. So what we are asking here is how reproducible community assembly is. The main question is: if we do the same experiment multiple times, do we find identical outcomes? That is the ultimate question we're asking.
To answer it, what we have done is inoculate multiple replicate communities from the same regional species pool. We can take, say, eight different test tubes, fill them all with the same medium, put them in the same incubator, and inoculate them from the same species pool. After we propagate those eight communities in identical environments for about 80 generations or so, 12 transfers, and examine the composition of these communities, we find two things. One is that we see very strong convergence at the family level: all of these wells contain very similar ratios of two main dominant families. The other is much more variable community assembly when you look at the taxonomic level of genus or lower. So this is reminiscent of what has been observed in the studies I mentioned before: there's a lot of variation at the species level, but a lot less variation and much more convergence at the level of family. Well, what they were seeing was actually not family but function, so our idea is that perhaps the two are connected, and that this family-level convergence we're observing reflects a convergence at the functional level too. And just to give you an idea of what this looks like when we repeated our experiment with 12 different regional species pools: this is across 12 different inocula, eight replicates for each, and I'm plotting the family-level composition of the communities for all of those experiments. What you find is that there's very strong reproducibility. Most of them are dominated by the same two families: in blue, Enterobacteriaceae, and in red, Pseudomonadaceae. You have other, rarer families that appear in some but not others of the communities. And even the quantitative ratios of these two are quite similar.
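As a side note on the figures quoted above (about 80 generations over 12 transfers), here is a minimal sketch of the arithmetic that links the number of generations to the dilution applied at each transfer. It assumes, and this is an assumption not stated in the talk, that each community regrows to roughly the same density in every cycle, so that the number of doublings per cycle is the base-2 logarithm of the dilution factor. The function names are made up for illustration.

```python
import math

def generations_per_transfer(dilution_factor: float) -> float:
    """Doublings needed to regrow to the pre-dilution density (assumes full regrowth)."""
    return math.log2(dilution_factor)

def implied_dilution_factor(total_generations: float, n_transfers: int) -> float:
    """Dilution factor per transfer implied by a total number of generations."""
    return 2 ** (total_generations / n_transfers)

# Figures quoted in the talk: about 80 generations over 12 transfers.
per_transfer = 80 / 12
dilution = implied_dilution_factor(80, 12)
print(f"{per_transfer:.2f} generations per transfer")
print(f"implied dilution of roughly {dilution:.0f}-fold per transfer")
```

Under that assumption the quoted numbers correspond to a dilution on the order of a hundredfold at each bottleneck; the actual dilution used in the experiments is not given in this part of the talk.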
I just wanted to give you another example of this kind of phenomenon of family-level convergence. This is work by my colleagues at the Connecticut Agricultural Experiment Station, Blaire Steven and Quan Zeng. In this study, what they did is look at community assembly on the stigma of the flowers of the apple tree. They looked at the communities that assembled on a hundred different flowers. One of the things they found is that, again, community assembly in this habitat is fairly consistent at the family level of taxonomy, with Enterobacteriaceae, again in blue, and Pseudomonadaceae, in red, found in all of these flower communities. Yet they find a lot of variation at the species level. If you look at the different OTUs within the Pseudomonadaceae and the OTUs within the Enterobacteriaceae, you find that even though at the family level these communities are very similar to one another, they contain different specific members of each family. This result is again consistent with this pattern, but in this case they're finding the pattern not at the level of function but at the level of family, which is what we found too. I'm just plotting these two things side by side so you can get an idea of how similar the results are. Of course they're not identical, but these environments are very different: we're talking about a flower and a sugar-based synthetic medium. It's interesting because we think that the environment these microbes experience in the flower, at least the nutritional environment, is not that different from what we have in our experiments. Anyway, all of this is painting a picture that presents a number of questions. The first one is: why do so many species coexist on a single limiting resource? This was the subject of yesterday's lecture. The second is that we need to understand why community assembly is so convergent at the family level.
Again, in our experiments we see family-level convergence, while others have found functional-level convergence, so we need to try to understand that better mechanistically. And in tomorrow's lecture I will be talking about why community assembly is so variable at the species level. So again, this was yesterday's, and this is going to be today's. What I wanted to go through today is the work we've done to try to understand why we're finding these results: why do we see community assembly so convergent at the family level? Matching this with the other results that I've been hammering into you yesterday and today, our first guess is that this is actually reflecting some form of functional convergence. To test that idea, what we did is very simple. We took 13 or 14 (I can't remember the exact number) of the communities that I had shown you before, which we had assembled with glucose as the only carbon source. We did dilutions and spread them over agarose Petri dishes at very low densities of cells, so that when cells grew on these Petri dishes they would form little colonies, and cells from different species form colonies that are different from one another. So we could take those colonies and sequence their ribosomal DNA to understand who they were, what their identity was, and compare those with the 16S ribosomal RNA sequences that we had previously obtained from community-level 16S sequencing. The two matched very nicely, so we were able to isolate a large number of the members of 13 different communities, for a total of around 100 different Enterobacteriaceae and Pseudomonadaceae strains. So we were able to take the members of those communities, separate them, and grow them in isolation from the community they came from.
The first thing we wanted to ask is whether the Pseudomonadaceae and Enterobacteriaceae, these two families, differ at the family level in their growth rate on glucose, because it's the only carbon source we're supplying and Enterobacteriaceae are about two to three times more abundant than Pseudomonadaceae in our communities. So we wondered if that was because Enterobacteriaceae are actually better at growing on glucose than Pseudomonadaceae are. What we did is take each one of those isolates and grow it on exactly the same medium in which our communities had been assembled. This is minimal medium; for those of you who are knowledgeable about this, it is M9 medium with glucose as the only carbon source. We determined the growth rate by just looking at the growth curve over a period of about 48 hours. We determined the maximum growth rate of the Enterobacteriaceae strains and the Pseudomonadaceae strains individually, and we're plotting them here. In blue, I'm showing the growth rates of the Enterobacteriaceae; in purple, the growth rates of the Pseudomonadaceae. As you can see, the Enterobacteriaceae grow better and reach higher growth rates than the Pseudomonadaceae, as a family. So what we find is that at the family level there is conservation in the growth rate on the supplied resource, which may explain why Enterobacteriaceae are observed at higher abundance than Pseudomonadaceae. The next thing: I told you yesterday that an important component of coexistence in our communities is metabolic cross-feeding. The Enterobacteriaceae, when they grow on glucose, are known to secrete various metabolic byproducts, including acetate, succinate, and lactate, as well as others. We used mass spectrometry to determine the most abundant byproducts, and these three are the ones that came out as dominant.
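The talk says only that the maximum growth rate was read off the growth curve, without giving the procedure. One common way to do it, shown here as a minimal sketch and not as the lab's actual method, is to take the steepest slope of the logarithm of optical density over a short sliding window. The window length and the synthetic logistic curve below are assumptions made up for illustration, not data from the talk.

```python
import math

def max_growth_rate(times_h, od, window=4):
    """Largest least-squares slope of ln(OD) vs time over any run of `window` points (per hour)."""
    log_od = [math.log(x) for x in od]
    best = float("-inf")
    for i in range(len(times_h) - window + 1):
        t = times_h[i:i + window]
        y = log_od[i:i + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        slope = (sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
                 / sum((a - t_mean) ** 2 for a in t))
        best = max(best, slope)
    return best

def logistic_od(t, od0=0.01, k=1.0, r=0.5):
    """Synthetic logistic growth curve (hypothetical parameters)."""
    return k / (1 + (k / od0 - 1) * math.exp(-r * t))

times = [float(h) for h in range(0, 49)]   # hourly reads over 48 hours
curve = [logistic_od(t) for t in times]
mu_max = max_growth_rate(times, curve)
print(f"estimated maximum growth rate: {mu_max:.3f} per hour")
```

The estimate comes out slightly below the intrinsic rate of the synthetic curve, because the window averages over a period in which growth is already slowing down.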
Here I'm showing the amount of acetate produced over a period of 48 hours, as a function of time, for our collection of isolates. In blue I'm plotting those results for the Enterobacteriaceae and in purple for the Pseudomonadaceae. You can see that as the Enterobacteriaceae grow on glucose, the amount of acetate they produce, particularly after 16 hours (this is the relevant timescale, and I'll tell you why in a minute), is very conserved. All of these bacteria produce very similar amounts of acetate in this habitat. Now, the reason why 16 hours is the most important timescale is that this is the time it takes for glucose to be exhausted. What happens between this point and that one is that the single strains growing in isolation end up metabolizing some of their own secretions. In many cases, for instance for Citrobacter, which is these guys that you see over here, we find that acetate secretion continues for a little bit longer. That's probably a byproduct of organic acid metabolism, because the glucose has been exhausted by that time. So on the relevant timescale, which is what happens with the glucose while they consume it, all of these Enterobacteriaceae produce very similar amounts of acetate, which is the dominant metabolic byproduct. They also produce similar, though less so, amounts of succinate and lactate. What you can see is that there is conservation of quantitative niche construction across all of these species of Enterobacteriaceae, in comparison with the Pseudomonas. What's more, when you plot the amount of acetate produced as a function of the maximum growth rate that these bacteria can attain, you find a fairly strong correlation between the two. This is something that has been documented before for other Enterobacteriaceae, such as E.
coli and Salmonella: the faster they grow, the more acetate they release. That is because more and more of their metabolism is shifted to fermentation rather than respiration; it's what is called overflow metabolism. More of the metabolic flux stops at acetyl-CoA and then leads to fermentation and secretion of acetate, and less and less of the glucose flux is directed to the TCA cycle and to full respiration. So what that leads to is that the faster a bacterium needs to grow, the more overflow it will have to do, and that leads to stronger secretion of acetate and other organic acids the faster these bacteria grow. That is also what we observe here. Each point represents a different strain. These are from many different genera of the Enterobacteriaceae, including Raoultella, Citrobacter, Enterobacter, Klebsiella, Serratia, and others. You can see that this kind of trade-off between yield and growth rate applies broadly to all of the strains of Enterobacteriaceae, regardless of the specific species or genus they are. By contrast, Pseudomonas doesn't engage in that: as far as is known, acetate production by overflow metabolism is not documented to occur in Pseudomonas, which is primarily a respiratory bacterium. It is believed that this trade-off emerges from physiological constraints in protein allocation, but that's a subject beyond the scope of this talk. The point I wanted to make with this slide is to illustrate the picture emerging from the two slides I've shown you before. First, glucose is selecting for fast and strong growers on glucose, and we find that Enterobacteriaceae grow faster than Pseudomonadaceae. Second, we find that the faster the cells grow, the more acetate they produce. So strong selection for fast glucose growers leads to bacteria that secrete organic acids, and the amounts secreted are relatively similar.
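To make the correlation described above between maximum growth rate and acetate secreted concrete, here is a minimal sketch that computes a Pearson correlation coefficient for a handful of isolates. The numbers and units are invented placeholders, chosen only so that the example runs; they are not measurements from the talk, and the talk does not say which correlation statistic was used.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical isolates: maximum growth rate (per hour) and acetate secreted (arbitrary units).
growth_rate = [0.30, 0.38, 0.45, 0.52, 0.60]
acetate = [6.0, 8.5, 9.0, 12.0, 14.0]

r = pearson_r(growth_rate, acetate)
print(f"Pearson r = {r:.2f}")
```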
The variation you see here on the y-axis is not that high; it's around 10 plus or minus five. But we also find that the secretion of organic acids is correlated with the amount of growth you have. The next thing we did is repeat those first measurements of growth rate on the dominant fermentation byproducts that are released by the Enterobacteriaceae: acetate, succinate, and lactate. What we find is that in this case the Enterobacteriaceae grow, on average, less well on those byproducts than the Pseudomonadaceae do. For acetate, succinate, and lactate alike, the Pseudomonas grow faster on average than the Enterobacteriaceae. So in addition to there being phylogenetically conserved growth on the supplied resource, where Enterobacteriaceae grow better than Pseudomonadaceae on glucose, there is convergence in niche construction, and there is also convergence at the level of growth on the byproducts of glucose metabolism, where in this case the Pseudomonadaceae grow better than the Enterobacteriaceae. I just wanted to show you the evidence for what I'm telling you. Let me see if I can move this here. What I'm showing here, for a specific community, is the concentration of glucose as a function of time, and the ratio between Pseudomonas and Enterobacteriaceae. This R/F ratio I will explain in a minute, but it is the ratio of Pseudomonas to Enterobacteriaceae. We see that glucose is depleted, and after 21 hours there's none left. It's gone a little bit earlier than that, but in this plot that's what we could quantify. Acetate goes up as glucose is being depleted, peaks at around 21 hours, and then drops to zero by 48 hours for this one community. At the same time, we find that the ratio of Pseudomonas to Enterobacteriaceae declines in this first phase, when glucose is being depleted.
This is consistent with the idea that the Enterobacteriaceae are consuming the glucose. And in the last 27 hours or so, you find that as the acetate is being depleted, the ratio of Pseudomonas to Enterobacteriaceae in this community goes up. That is consistent, again, with the idea that the Pseudomonas are primarily eating these organic acids. Again, this doesn't mean that none of the glucose is being eaten by the Pseudomonas or that none of the organic acids are eaten by the Enterobacteriaceae. We have every reason to believe that both of those things are actually happening, but the primary consumers of the glucose are the Enterobacteriaceae and the primary consumers of the organic acids are the Pseudomonas. We've done other experiments that confirm this point. We have truncated the growth time at 24 hours, and when you do that the Pseudomonas go away and the community becomes entirely dominated by the Enterobacteriaceae, with organic acids that accumulate because no one eats them. So we have substantial evidence that that's what's happening in our communities. Let me see if I can put this back in here. All right, so let me give you a summary of what we think is happening in our communities at the functional level. We have these communities that assemble into very similar ratios of these two dominant families, Enterobacteriaceae in blue and Pseudomonadaceae in red. What's happening is that the Enterobacteriaceae are being selected for by the glucose because they grow faster. And one thing we observe is that the faster they grow, the more organic acids they produce. So as the Enterobacteriaceae grow on the glucose, they release organic acids like acetate, succinate, and lactate. As these accumulate in the environment, in the second half of this 48-hour incubation period the environment is no longer a glucose environment. It is primarily an organic acid environment.
And in those environments, the Pseudomonas have an advantage, and that's what we're seeing here. So the Enterobacteriaceae are occupying a functional niche, which is that of respiro-fermentative bacteria that specialize on glucose, whereas the Pseudomonas are occupying a functional niche which is that of respiratory bacteria that specialize on organic acids. We call the respiratory functional group R and the respiro-fermentative group F. The coexistence between these two groups is stabilized by metabolic cross-feeding that primarily goes in one direction, although that's only the first-order effect; there are other factors too, and we have evidence for that. Other byproducts are being released by both. So we're not saying that the only thing that happens is that the glucose goes entirely to one group and the acids to the other. We know that that is not true. We're just describing a first-order effect, and who the primary consumer of each of the two niches is. What all of this shows you is that the family-level convergence we observe does represent functional convergence, and it does so through the evolutionary conservation of the functions relevant for growth and fitness in our environments. The next question we wanted to ask is whether it is possible to take this ratio between R respirers and F fermenters that we're seeing here and explain it quantitatively from the known physiological and biochemical processes occurring at the cellular level. Again, we get a ratio of Pseudomonas to Enterobacteriaceae that is around 0.27, and that is the ratio of respirers to fermenters in our systems. To see if it's possible to recapitulate that finding, what we have been using is genome-scale metabolic models based on flux balance analysis.
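The R/F ratio mentioned above is simply the summed relative abundance of the respirer families divided by that of the fermenter families. A minimal sketch of that bookkeeping follows; the function name and the example community are hypothetical, with abundances picked so that the ratio lands near the 0.27 quoted in the talk.

```python
def rf_ratio(family_abundance, respirers, fermenters):
    """Ratio of summed respirer (R) abundance to summed fermenter (F) abundance."""
    r_total = sum(family_abundance.get(name, 0.0) for name in respirers)
    f_total = sum(family_abundance.get(name, 0.0) for name in fermenters)
    if f_total == 0:
        raise ValueError("no fermenters present; R/F ratio is undefined")
    return r_total / f_total

# Hypothetical family-level relative abundances for one community.
community = {
    "Enterobacteriaceae": 0.74,
    "Pseudomonadaceae": 0.20,
    "Other": 0.06,
}

ratio = rf_ratio(community, respirers={"Pseudomonadaceae"},
                 fermenters={"Enterobacteriaceae"})
print(f"R/F = {ratio:.2f}")
```

Passing the groups as sets leaves room for more than one family per functional group, which is a design choice of this sketch and not something the talk specifies.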
I'm not going to go into the details of how FBA works, but suffice it to say that this is a genome-scale metabolic model where you take a metabolic network. You can take, for instance, the network of a bacterium like E. coli, or, as we did for this particular project, you can build a super metabolic network that contains all of the known metabolic reactions in prokaryotes and put them together into a big matrix. What you do then is give that network an input, which is a set of nutrients. With flux balance analysis, you calculate the vector of metabolic fluxes that would optimize growth, that is, that optimizes a given biomass function that you give the model, in the environment that you're supplying. What FBA will do is find the vector of fluxes that maximizes growth. That vector of fluxes gives you one output, which is the amount of biomass produced per molecule of nutrient consumed, if you want to put it that way, and another output, which is the byproducts that are released in the process of growing optimally on that substrate. So what we did is take two models, one of an Enterobacteriaceae, E. coli, and another one of, I think this is wrong, this is actually P. putida. So the other is of a Pseudomonas, in this case P. putida. What we did is very simple. We calculated, per glucose molecule, how much E. coli biomass would be produced and how much acetate would be secreted. Then we took the acetate that had been secreted, fed it to this model of P. putida, and calculated how much biomass P. putida would produce. This was done basically by taking these off-the-shelf models and evaluating their growth. What we find with E. coli and P. putida is that the predicted ratio of P. putida to E. coli biomass per glucose molecule that enters this trophic chain that we created is actually around 0.3. So this is if all the glucose goes to E.
coli and all of the acetate goes to P. putida, right? But even with these very simple assumptions we get a ratio of P. putida to E. coli that is very similar to the average ratio we found before in our experiments, which is 0.27. We then repeated the same exercise for a large number of previously curated metabolic models for both Enterobacteriaceae and Pseudomonadaceae. For each of those models we did exactly the same exercise that I showed you before, and we calculated the ratio of Pseudomonas biomass to Enterobacteriaceae biomass. Here we're plotting every possible pair of those 100 models. You find that the value of 0.3 that we just found is not an outlier. It sits very, very close to the average ratio that we find between Pseudomonas and Enterobacteriaceae biomass. By comparison, we show you here the ratios of Pseudomonas to Enterobacteriaceae in all of our experiments, and the two are fairly close. This is really not a prediction per se. What we're trying to argue here is that one can explain these ratios from very simple arguments, just making the simplifying assumption that all of the glucose goes to the Enterobacteriaceae and all of the byproducts go to the Pseudomonadaceae. When you do that simple exercise, what you get is something that seems very similar to what we find in our experiments. Okay, so the last thing I wanted to show you is results that we collected a while ago; the paper on this is not out yet. What we did is the same experiments, but with a large collection of other sugars. Here the question was: okay, we found this ratio of around 0.3 of Pseudomonas to Enterobacteriaceae, or respirers to fermenters.
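The two-step calculation described above (glucose to fermenter biomass plus acetate, then acetate to respirer biomass) reduces to a short piece of arithmetic once the yields are known. The sketch below shows only that arithmetic. In the talk the yields come from flux balance analysis of genome-scale models; here they are placeholder numbers invented for illustration, in arbitrary but consistent units, and they are not outputs of any model. The function and parameter names are likewise hypothetical.

```python
def predicted_rf_biomass_ratio(f_yield_on_glucose, acetate_per_glucose, r_yield_on_acetate):
    """Respirer-to-fermenter biomass ratio per unit of glucose entering the chain.

    Assumes, as in the talk's simplified exercise, that all glucose goes to the
    fermenter and all secreted acetate goes to the respirer.
    """
    if f_yield_on_glucose <= 0:
        raise ValueError("fermenter yield on glucose must be positive")
    fermenter_biomass = f_yield_on_glucose                       # per unit glucose
    respirer_biomass = acetate_per_glucose * r_yield_on_acetate  # per unit glucose
    return respirer_biomass / fermenter_biomass

# Placeholder yields (NOT model outputs): chosen only to make the arithmetic easy to follow.
ratio = predicted_rf_biomass_ratio(
    f_yield_on_glucose=0.08,    # fermenter biomass per unit glucose
    acetate_per_glucose=1.2,    # acetate secreted per unit glucose
    r_yield_on_acetate=0.02,    # respirer biomass per unit acetate
)
print(f"predicted R/F biomass ratio: {ratio:.2f}")
```

Repeating the same function over every pairing of a fermenter model with a respirer model would give the distribution of ratios that the talk compares against the experimental values.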
We find it for glucose, but how different would it be if we had used other sugars? What if we had used, I don't know, galactose, which is just another hexose, or ribose, which is another sugar, in this case a pentose, or any of the sugar alcohols, like inositol, mannitol, or glycerol? So we did that experiment. We repeated the enrichment experiment that I've been discussing, using two different inocula. These were from two different potted plants in Josh's house. From that soil, we established communities in environments that contain each one of these sugars in isolation. Again, not all of them mixed, but one at a time. Here on the x-axis I'm plotting the identity of the sugar we're adding, and on the y-axis I'm plotting the ratio between respirers and fermentative bacteria. This dashed line marks the average of the glucose communities that I have been showing you before, and this gray zone here is the 95% dispersion around that mean for the data. When we plot the data on this plot, what you find is that for all of the sugars we find a very similar metabolic structure: the ratios of respirers to fermenters are extremely close to the value that we found for glucose. There are some outliers, but by and large the results are very, very consistent. And just as a control, we wanted to see: okay, what if we don't use a sugar? What if we use a metabolite, a nutrient that cannot be easily fermented and would be more likely to be respired?
So then we used a collection of different organic acids, many of which are fermentation byproducts, but others are just components of the TCA cycle. We also used some organic alcohols and a bunch of other nutrients, and we repeated the same experiment from the same two inocula, but now we assembled the communities on each one of this collection of nutrients. What we find is that the ratio of respirers to fermenters when you do not use sugars is very different from what you get when you do use sugars. And this experiment is quite interesting. For instance, I particularly like this data point here, which is pyruvate, an organic acid that sits in between glycolysis and the TCA cycle. It's kind of intriguing that you get something that is a transition between the kind of ratio of respirers to fermenters that you see in the sugars and this cloud that you see here when you use organic acids. Okay, so I guess this plot would make the case that when you have similar nutrients you expect to see similar community compositions. But can we define more quantitatively what nutrient similarity is? I mean, we say the sugars are similar and the organic acids are similar to one another, but I'm basically waving my hands so far. Can we make a more precise definition of how similar two different nutrients are? We again resorted to flux balance analysis, and what we're doing here is very straightforward. We're taking, again, our metabolic model and we're feeding it different nutrients. So for instance we're feeding it one nutrient, we're calculating the vector of metabolic fluxes, and we can plot this vector in the space of metabolic fluxes. Now we can do the same through the same metabolic network. In this case what we're doing is constructing a universal metabolic network that contains all the biochemical reactions, all the metabolic reactions, that are known.
We're asking what would be the optimal way to metabolize each nutrient. So this is the vector of metabolic fluxes for nutrient A, this is the vector of metabolic fluxes that would be optimal for another nutrient B, and the distance between the two can quantify how different those two nutrients are. If two nutrients are very similar, they're going to be metabolized optimally in very similar manners, and if they're very different, they're going to be metabolized in very different manners. For instance, glucose and galactose are metabolized very similarly, and they're going to give you a very small distance between the two, whereas glucose and, I don't know, leucine are going to be metabolized more differently, and they're going to give you a vector with a larger distance. The next thing we did is we took the library of carbon sources that we had studied experimentally, and we calculated the metabolic distance between all of them. Then we did a simple hierarchical clustering, and one of the things that was very reassuring is that the results we got from this exercise made sense. We find in this dendrogram that all the sugars are clustered together. All the carboxylic acids and organic alcohols are also clustered together. We also find that within these two groups there's structure. Here you find, in this neighborhood, most of the hexoses and all the glucose-containing disaccharides, and here you find most of the sugar alcohols and the pentoses. And here it's the same thing, right? In this cyan group over here you find all the TCA cycle intermediates, these are organic alcohols, and so on. So even within these groups there is structure that makes sense, or that one would perhaps expect, in which nutrients are more or less similar to one another.
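To make the distance and clustering steps described above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the flux vectors are made-up toy numbers, not outputs of any real flux balance model, and the names `TOY_FLUXES`, `euclidean`, and `average_linkage` are hypothetical. The talk only says "simple hierarchical clustering", so average linkage is assumed here.

```python
import math
from itertools import combinations

# Toy "optimal flux" vectors, one per nutrient.
# Hypothetical numbers for illustration only, NOT from a real metabolic model.
TOY_FLUXES = {
    "glucose":   [10.0, 8.0, 1.0, 0.5],
    "galactose": [9.5, 8.2, 1.1, 0.5],
    "succinate": [0.5, 1.0, 9.0, 7.5],
    "fumarate":  [0.4, 1.2, 9.2, 7.0],
}


def euclidean(u, v):
    """Euclidean distance between two flux vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(vectors, n_clusters):
    """Naive agglomerative clustering (average linkage is an assumption).

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance until only n_clusters remain.
    """
    clusters = [[name] for name in vectors]

    def cluster_distance(c1, c2):
        total = sum(euclidean(vectors[a], vectors[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


groups = average_linkage(TOY_FLUXES, n_clusters=2)
print(groups)
```

With these toy vectors the two sugars end up in one group and the two TCA cycle intermediates in the other, which mirrors the qualitative picture in the dendrogram; real flux vectors would have thousands of components and would come from the metabolic model itself.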
Right, so then we asked to what extent this similarity between carbon sources can quantitatively predict similarity in community assembly on those nutrients. And what we find is that it actually does a very good job. Here I'm showing you the composition at the family level for all the sugars and sugar alcohols. In blue you find Enterobacteriaceae, and in green [unclear]; this orange guy here is Alcaligenaceae. When you compare that with the organic acids and the alcohols, you find that it's quite different. In most cases Enterobacteriaceae is gone and the community is dominated by respiratory bacteria like Pseudomonadaceae, Alcaligenaceae, and others. But even within the sugars there is structure. In this first group, made up of glucose-like sugars, we only very rarely find Alcaligenaceae, whereas in this other group of pentoses and sugar alcohols Alcaligenaceae is much more common. It's interesting that galactose sits here; this is probably just a misclassification due to the nature of the hierarchical clustering algorithm, because galactose is a hexose. The same is true for the organic acids and the alcohols. If you look at the TCA cycle intermediates, you find that all of them contain Enterobacteriaceae, here in blue. And even within those there is structure. This group here is formed by fumarate, malate, and succinate, which sit side by side in the TCA cycle, and all three of these carbon sources recruit Rhizobiaceae, and they are the only ones that do. It's an endemic species for this group of carbon sources. Likewise, there's an endemic species in this group here, which is not found anywhere else, and there is also convergent community composition within this other group. So another question is whether the distance between nutrients can explain the distance between compositions at the family level. And what we decided to do is just plot one against the other, right?
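The comparison just described, distance between nutrients against distance between family-level compositions, can be sketched in Python as follows. This is only an illustration under stated assumptions: the flux vectors and the family-level fractions are invented toy numbers, the names `FLUXES`, `COMPOSITION`, `pairwise_distances`, and `pearson` are hypothetical, and Pearson correlation is assumed as the summary statistic since the talk does not say which one was used.

```python
import math
from itertools import combinations

# Hypothetical optimal-flux vectors per nutrient (toy numbers).
FLUXES = {
    "glucose":   [10.0, 8.0, 1.0, 0.5],
    "galactose": [9.5, 8.2, 1.1, 0.5],
    "succinate": [0.5, 1.0, 9.0, 7.5],
    "fumarate":  [0.4, 1.2, 9.2, 7.0],
}

# Hypothetical family-level composition per nutrient (toy numbers):
# fractions of Enterobacteriaceae, Pseudomonadaceae, and everything else.
COMPOSITION = {
    "glucose":   [0.70, 0.25, 0.05],
    "galactose": [0.68, 0.26, 0.06],
    "succinate": [0.10, 0.60, 0.30],
    "fumarate":  [0.12, 0.55, 0.33],
}


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(table):
    """Distances for every pair of nutrients, in a fixed (sorted) pair order."""
    return [euclidean(table[a], table[b]) for a, b in combinations(sorted(table), 2)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


flux_dist = pairwise_distances(FLUXES)
comp_dist = pairwise_distances(COMPOSITION)
r = pearson(flux_dist, comp_dist)
print(f"pairs: {len(flux_dist)}, Pearson r = {r:.3f}")
```

Because the toy data were built with two clean groups, the correlation here comes out much stronger than anything one should expect from real communities; the point is only the bookkeeping, with both distance lists computed over the same nutrient pairs in the same order.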
At the family level you can take the communities assembled in nutrient A and the communities assembled in nutrient B, calculate the family-level composition for both, and calculate the distance between them. When we did that, we plotted the Euclidean distance in metabolic fluxes against the Euclidean distance in community composition. And what we find is that there is a fairly decent correlation between the two. This is just a very crude measurement. We're now trying smarter ways to extract information than simply taking the distance over the entire metabolic space. Maybe there are some components that are more telling of how different two nutrients are than the entire distance in metabolic fluxes, and we want to see if there's more signal that can be extracted this way. But this is where we are at the moment. We've also been using machine learning algorithms to see if it is possible to predict what community is going to assemble in a new carbon source from a new inoculum. So, very basic, just doing cross-validation at this moment. And what we find is that for the families that are most represented, like Enterobacteriaceae and Pseudomonadaceae, the agreement is very decent: a very simple machine learning model is capable of predicting the family-level composition. The same is true for function, which is even better. So you now group the taxonomic composition by whether they are fermentative or respiratory bacteria. And then again, you train a model with one inoculum and with a set of carbon sources, and then you investigate what would be the expected
community composition from another inoculum in one of the carbon sources that you left out of the training set. What we find is that it is possible to predict the fermenter-to-respirer ratio: we get results that make sense, and the model is predictive to some degree. This is all very preliminary work, but I just wanted to give you a sense of where we're going with all of this. Okay, so what I wanted to tell you today is what we think are the reasons behind the fact that community assembly is so convergent at the family level. We had guessed that this could reflect a functional convergence, because people have seen metagenomic convergence when binning by metabolic function. So our original guess was that the family level was representing the evolutionary conservation of metabolic traits that are functionally important in our ecosystems, and that's why we're seeing the signal at the family level. But because our environments are so simple, one is capable of resolving these questions mechanistically, and that's what we've been trying to do. We are finding that when you have sugars, and not only glucose but other sugars, there's a very predictable ratio of bacteria that are fermentative, that specialize in the sugars, and another group that are respiratory, that specialize in consuming the byproducts released by the former. One of the things that is interesting about this result concerns how we tend to think of metabolic traits, in particular the consumption of carbon-based nutrients. Previous work has found that this is a highly variable trait, a trait that is not supposed to be conserved at the family level. By contrast, there are other metabolic traits that are much more deeply conserved evolutionarily than the use of specific sugars.
And I just want to give you one example. This is an experiment that was done with the bacterium E. coli (I'm missing the reference here for some reason), which was grown in a collection of different nutrients, and it was assessed whether E. coli could grow or not. The authors took, I think it was 150 different strains of E. coli and close to 100 different nutrients, and they just measured growth or no growth in each one of those. I'm just zooming in because this plot is a bit of a mess. But here you have, for instance, a subset of these strains, and you have serine, raffinose, and sucrose, which are three different nutrients. As you can see, some of those strains can grow on serine but others cannot, and the same thing is true for raffinose, sucrose, and most of the other carbon sources in this group. Only seven of the carbon sources tested could be used by all of the E. coli. Okay, so what this suggests is that metabolic traits are typically thought of as being shallow, and that it's easy for bacteria to gain or lose the ability to metabolize a specific substrate. That seems to contradict what we found, that there's family-level conservation of these metabolic traits. But I just wanted to bring up again that what we've been studying is not the ability of a microbe to use a substrate or not, but rather how good that microbe is at using it. We're measuring quantitatively how well a bacterium can grow on a given substrate, what byproducts are being released, and how well other bacteria can grow on those byproducts. It's not just that you can use glucose; in fact, we find that glucose can be used by both Enterobacteriaceae and Pseudomonadaceae.
It is how well you can grow on those that determines whether you will be found in that environment or not. As you may also intuitively think, just because a species exhibits a trait doesn't necessarily mean that the trait is going to be relevant for the ecological role the species plays in nature, and that is true for bacteria as well. I think this is making a case for measuring quantitatively, basically measuring the realized niche and quantifying how competitive microbes are on different carbon sources, if we really want to understand whether a microbe will be found or not in a given habitat. And that's it. Again, this work was done by the amazing people in the lab. I think we may still have time for some questions, so do we have time for questions?

Yes, there is one from Sylvia.

Yes, hi. I wanted to ask a very general question on the patterns that you showed us in the previous lecture and at the beginning, the fact that the functional composition and the family composition are constant. I was wondering, can we compare these patterns with a null model that would tell us that we would expect more variability in the absence of a mechanism that keeps this constant?

Right. Yeah, the question is what would be the null model. You may imagine multiple different models for community assembly. You could have neutral models for community assembly. You could assume that all of these species are equally good at eating all the nutrients that we're providing them. So I guess, yes, it is completely possible to compare the expectation to null models. I guess the simplest one is that you sample species randomly from the regional pool and put them together.
The question you may ask then is, do you see the patterns that we observe? And the answer is no, you don't: there is no family-level conservation of any kind. In fact, depending on what nutrient you add, you're going to find different species on each. So that really tells you that there's very strong selection shaping the assembly of these communities, if you compare this to very neutral models. And then of course, if you're trying to compare these patterns to other null models, as always with null models, the model needs to reflect a null assumption or null hypothesis. So you would need to define what that null hypothesis is. And then, absolutely, you could create a null model that predicts what that expectation would be, and you could quantitatively compare the findings, or the patterns observed, with that model. For the very simple ones you may imagine, just randomly drawing from this regional pool and asking whether the same is true, we have checked, and that is for sure not the case. As for other null models, again, it would depend on the null hypothesis that you're trying to disprove.

Okay, thank you.

So then, a question from Kiseok.

Hi, thank you for a great talk. My question is, when you're doing the simulation with the FBA model and the experiments with this metabolic model for Pseudomonadaceae and Enterobacteriaceae, do you use like 100 strains for each of these families, or how do you represent the family?

Right, so we just took models that were well benchmarked, that people have done a lot of work with before. I could send you the list of the models we used if you're curious.
They're basically published models by other people, and then we had to adjust them a little bit to the specific environment that we had. But these were not our models; we're taking models from other groups. And they were not 100. I can't really remember the actual number for each of the two. I think in total it was around 70-something models, I believe, but they weren't equally represented, so I think there were more Enterobacteriaceae than Pseudomonads. In fact, the Pseudomonads were not sampling the entire phylogeny of the Pseudomonadaceae. So there's some degree of non-homogeneous sampling of the two families in terms of the models. We focused on those metabolic models that had been well benchmarked experimentally before. And again, I'm happy to send the list over if you're interested.

Thanks. So for the experiments you did, did you use different kinds of strains for each family?

Again, we were not trying to sample the entire phylogeny of the Pseudomonadaceae or Enterobacteriaceae. We were really biased toward those bacteria that came up in our communities, so we took bacteria from communities that had assembled in glucose, both Pseudomonadaceae and Enterobacteriaceae. We also did similar experiments with random soil isolates that were not in our communities, and we included them in the set. I'm sorry, I haven't shown that set of experiments. The results are not that different. There are some patterns, but it's not very obvious that the randomly selected bacteria from soil are very different in their patterns of secretions, or maybe they even have a slightly lower growth rate
than the ones that were selected in our communities, on average, but the patterns of secretions were similar and we didn't really appreciate any significant difference. But we have not really tried to do a very exhaustive search across Enterobacteriaceae and Pseudomonadaceae to see how conserved these traits are throughout the entire taxonomic group.

Okay, thank you very much.

Great, there is a question from Martina.

Hi, thank you, this was all very interesting. I have actually two questions, if I can. I was looking at your plots before: how do you justify the fact that Pseudomonads don't really have different growth rates in acetate, when you show the different plots?

Let me see if I can get your question right.

Yeah, yeah, here. It seems that you have definitely two strains that grow much better, but the rest are not really different from one another, or am I getting it wrong?

Each of these is a different strain, right. The strains have different 16S. Every single one of these has a different 16S sequence, and we sequenced the entire region, so it's high-quality sequence. So they are all different strains.

Yes. No, but my question is that there is not a big difference between Enterobacteria and Pseudomonads. That's why I was wondering how you then justify these changes in acetate, even though the growth rate doesn't appear to be completely different.

No, but it is different. I think even if you remove those two strains, it's still statistically significant. And this is just the maximum growth rate. We also find the same if you look at the average growth rate, which is not just the maximum but basically accounts for the lag phase and how long it takes for them to reach half of their maximum growth. Pseudomonas also grows better than the Enterobacteriaceae.
And the same thing is true for lactate. Even though we're focusing a lot on acetate, the concentration of lactate, let me show you here, is actually not that much lower than what we see for acetate, and there's more carbon per molecule, so lactate is more valuable than acetate metabolically. So we think it's actually something you cannot neglect, and we're now trying to understand exactly how. It's not just the acetate itself; it's a combination of acetate, lactate, and other organic acids that together recruit the Pseudomonadaceae. So it's not just one. And it's also not just a growth rate. The growth rate changes over the entire growth period. On the one hand there is the maximum growth rate, which is important, but of course it's only sustained for a brief period of time by the entire population. How rapidly you get out of lag phase and the average growth rate over that period are also important parameters. But just to clarify, this statistical significance is not just because of these two guys; it's the entire distribution. This is only reflecting the maximum growth rate, but the same patterns would be seen if I were showing you the average growth rate over the first part of the curve. I think we truncate it at t one-half, until it reaches half of the maximum.

Okay, and if I can link this to another thing: I was wondering if these two strains that have a higher maximum growth rate are then those that are more abundant in your experiments. And related to this, I was wondering, when you find this convergence, whether it's driven by the most abundant strains, or do you think that this has nothing to do with abundance?
No, that is a good question. I really don't think we have correlated the growth rates of these bacteria in isolation with their abundance at the community level, and it's something we've been meaning to do for a while. This is an idea we've been floating for a while but I haven't quite done it yet, although we have all the data, so it shouldn't be very difficult to do. The other thing that you brought up, I think, is quite interesting. If you look at this data, we see there's convergence, but there's still substantial variation among the communities. There's still variation in the ratio of Pseudomonadaceae to Enterobacteriaceae, and I absolutely agree. In fact, we have some evidence that points to that, which is that the identity of the dominant bacterium impacts the actual R/F ratio you see. I'm going to show some data tomorrow that will cement that point: when we have alternative stable states from the same inoculum, we have looked at the specific ratio of respirers to fermenters in each one of the two. Depending on the identity of the respirer, whether it's a Pseudomonas or an Alcaligenaceae, they don't quite reach the same abundance; there's variation between the two. I think it's because these communities are very small. We're still talking about five to 20 species, but most of them are very rare, so there's variation in exactly which Pseudomonas you find in each community. If these communities were larger, I think all of these things would average out and you would probably have more convergence. That would be my guess, but we don't know. But because you only have a few species, it matters which one happens to be the dominant member of its taxonomic group.
So, you know, adaptation to the specific dominant carbon source is not the only selective force we have, and any other differences, as well as variations in adaptation to a carbon source, which we also find, should matter for this. That's why we're focusing on averages rather than the fluctuations, because this is an area we haven't yet explored. But I think you're making a very fair point. I think it's very likely that if you examine this in more detail and try to understand the fluctuations around this average value, they could be correlated with traits of the specific taxa that are found in those communities. Maybe there are correlations between them; we just haven't looked.

Okay. Thank you very much.

You're welcome.

Great. I don't see any other questions in the chat or in the participants list. So, if there are no other questions, thanks again for today's lecture. Just a reminder that the next lecture by Alvaro is going to be in two days, on Thursday. So thanks again, and now we are going to split in.