So as soon as everyone joins back from the breakout rooms, we'll be able to start. Okay, I guess everybody's back, so we can start again with the lecture by Daniel Segrè. So please, Daniel, if you could share the slides, but first, thanks for being here.
Okay, thank you. Okay, should I start?
Yes, yes, please.
Okay, great. So hi everyone again. We're going to continue talking about dynamical modeling of communities. And if you remember, our third part is going to be on spatio-temporal modeling and the long-term history of metabolism. But before we can talk in detail about this, I want to remind you where we stopped last time: one of the issues that arises when you try to build flux balance based models of communities using this multi-compartmental approach, where you have different cells containing different metabolites, and the metabolites just have different labels based on the compartment they're in. We said that this would require some assumptions about an ecosystem-level objective, and it has issues: it doesn't really allow you to predict the abundance of taxa, and it also cannot accommodate predictions of the spatio-temporal dynamics of these communities. And we started showing this figure, which I'm now going to discuss in detail, and which, as you'll see, will solve all of these problems at once, in a way that opens up a number of possibilities. This is something called dynamic flux balance analysis. It's an extension of flux balance analysis that adds back the temporal aspect, in a way that is a little bit different from what you would see when you build a standard kinetic model. And the idea is the following.
So remember that when you have a metabolic network with a definition of the boundary conditions, that is, the molecules that come in, these are described by inequalities that tell you what is available and at what rate. And you have the biomass reaction that defines growth of the cell. You can then solve the flux balance problem, which will give you a prediction of all the fluxes that allow the cell to produce, say, its own biomass in an optimally efficient way. The outcome of the simulation is a vector of fluxes, which includes the rate of uptake of each nutrient, the rate of production of biomass, and the rate of production of each of the byproducts. Now, think of dynamic flux balance modeling as building a stepwise approximation of the growth curve. You start from an initial amount of nutrients in the environment, so this blue curve is the environmental nutrient available, and you start with a very small amount of biomass. In your first step, you solve flux balance analysis under those conditions (we'll talk in a second about how we translate nutrient availability into an uptake rate, which is what you need for flux balance), and you infer the fluxes for nutrient uptake and biomass production. These essentially give you the slopes of these curves; they tell you how fast the organism grows at this instant in time. What you do is assume that it's reasonable to extend this initial slope for a certain time interval, delta t. So you update to the new biomass after this time delta t. In a similar way, you can predict how much nutrient is being consumed by just multiplying the rate of consumption by this time interval, to get a prediction of the new, updated nutrient abundance. And you can keep doing this again and again, so here you'll solve a flux balance problem again.
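Written out, the update step just described might look like the following. This is only a sketch of the linear extrapolation described in the lecture; the symbols and the sign convention (negative exchange flux for uptake, positive for secretion) are assumptions introduced here, not notation taken from the slides.

```latex
% X(t): biomass (gDW); S_i(t): amount of metabolite i in the environment (mmol)
% \mu(t): growth rate from the FBA biomass flux (1/h)
% v_i(t): FBA exchange flux of metabolite i (mmol/gDW/h), assumed negative for uptake
\begin{aligned}
X(t+\Delta t)   &= X(t) + \mu(t)\, X(t)\, \Delta t \\
S_i(t+\Delta t) &= S_i(t) + v_i(t)\, X(t)\, \Delta t
\end{aligned}
```

After each update, the new amounts S_i set the uptake bounds for the next flux balance solve.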
And you'll end up with a new level of the consumed nutrient, the biomass increases, and so on and so forth. What you can also see happening here is that there may be no byproduct present at the beginning, but as the organism grows, there is secretion of this byproduct. So if you update the amount of the byproduct in the environment, this will start increasing. By doing this dynamic FBA, you get a piecewise linear approximation of the growth curve, in green, and of the abundance of each of the different nutrients. For example, this initially available nutrient could be glucose, and the byproduct acetate, in the E. coli example that we saw before.
One thing I want to mention right away is why this is helpful for modeling ecosystems. You can imagine this being useful for modeling the abundance of a certain organism in an environment; it is very similar to flux balance analysis, but it adds this temporal component. But what is additionally very important is this: imagine you put in the same simulation a second organism, with a biomass which we call biomass prime. It's a different organism that has a similar resource allocation problem. But imagine that this second organism only grows on this pink byproduct and cannot grow on the initial nutrient. What could happen here is that this organism, which could not grow at the beginning because it didn't have its preferred substrate, can now grow on this pink byproduct, this new metabolite that is being secreted. And this process now becomes an emergent phenomenon: we didn't know a priori whether the second organism could grow or not. It could only grow after this byproduct accumulated because of the activity of the first organism. And what is most interesting here is that there is no assumption about an ecosystem-level objective.
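To make the stepwise loop and the two-organism scenario concrete, here is a minimal toy sketch in Python. It is not COMETS and does not solve a real flux balance problem: the function `toy_fba` is a hypothetical stand-in for the linear program, and the organisms "A" and "B", their yields, the kinetic constants, and the 1 L volume are all illustrative assumptions. The point is only to show each organism optimizing for itself while sharing one environment.

```python
def uptake_bound(conc, vmax=10.0, km=0.01):
    """Saturating upper bound on uptake (mmol/gDW/h) from a concentration (mM).

    vmax and km are illustrative values, not measured parameters.
    """
    return vmax * conc / (km + conc)


def toy_fba(max_uptake, growth_yield, secretion_yield):
    """Hypothetical stand-in for an FBA solve on a one-substrate toy network.

    With a single limiting substrate, maximizing growth means using the full
    uptake bound; a genome-scale model would solve a linear program instead.
    Returns (uptake, growth_rate, secretion), all per unit biomass.
    """
    uptake = max_uptake
    return uptake, growth_yield * uptake, secretion_yield * uptake


# Hypothetical organisms:
# name -> (substrate, byproduct, gDW per mmol substrate, mmol byproduct per mmol substrate)
ORGANISMS = {
    "A": ("glucose", "acetate", 0.05, 0.5),  # grows on glucose, secretes acetate
    "B": ("acetate", None, 0.03, 0.0),       # can only grow on the byproduct
}


def simulate(steps=200, dt=0.1):
    """Stepwise dFBA loop. Amounts in mmol (1 L assumed, so mmol == mM), biomass in gDW."""
    env = {"glucose": 10.0, "acetate": 0.0}
    biomass = {"A": 0.01, "B": 0.01}
    history = []
    for step in range(steps):
        change = {met: 0.0 for met in env}
        growth = {}
        for name, (substrate, byproduct, y_growth, y_secr) in ORGANISMS.items():
            # Kinetic bound, capped so one step cannot consume more than is present.
            bound = min(uptake_bound(env[substrate]),
                        env[substrate] / (biomass[name] * dt))
            uptake, mu, secretion = toy_fba(bound, y_growth, y_secr)
            growth[name] = mu
            change[substrate] -= uptake * biomass[name] * dt
            if byproduct is not None:
                change[byproduct] += secretion * biomass[name] * dt
        # Extend the current slopes over dt (the piecewise linear update).
        for name in biomass:
            biomass[name] += growth[name] * biomass[name] * dt
        for met in env:
            env[met] = max(0.0, env[met] + change[met])
        history.append({"t": (step + 1) * dt, **biomass, **env})
    return history
```

Calling `simulate()` returns one record per time step. In this toy setup organism B does not grow in the first step, because no acetate exists yet, and only takes off once A has secreted some; no community-level objective appears anywhere in the code.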
Each organism is trying to maximize its own fitness, its own growth capacity, but as an outcome of this process we can still see, as shown in this example, the emergence of cross-feeding: one organism is secreting a product that another organism can use as a food source. So you can see how powerful this paradigm can be, because it allows you to model communities, exchange, and competition (two organisms might compete for the same nutrient, so that nutrient will run down faster) without having to assume any of this explicitly. It all depends on the intracellular circuits of each organism, on what each organism can and cannot do. So it's really a way of observing and predicting the emergence of competition and cooperation, based on exchange of metabolites, straight from the organisms' genomes. Now, I hinted at the fact that this requires some additional care in terms of predicting the uptake rates. What is nice in the end is that this is essentially a hybrid approach, which uses some components of standard kinetics and some components of FBA for intracellular metabolism. If you think about it, FBA requires the incoming fluxes, but here we start from an initial concentration of the extracellular metabolites, not a flux. So how do you convert the concentration to a flux? Well, the obvious way is to use Michaelis-Menten kinetics, the classical saturation curve we mentioned before, where you can predict the uptake rate, in this case the upper bound on what the cell can take in, as a function of the concentration. You'll need to know the parameters that define this curve, the traditional Michaelis-Menten constant and the kcat of enzyme kinetics, but just for the boundary conditions: you need these kinetic parameters only for the reactions of uptake of the different nutrients from the environment.
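As a worked form of this concentration-to-flux conversion, the uptake bound can be written as the standard Michaelis-Menten expression below. The notation is an assumption introduced here for illustration: [S] is the extracellular concentration of the nutrient and [E] stands for the amount of transporter or enzyme.

```latex
% Upper bound on the uptake flux passed to FBA, as a function of concentration [S]
0 \;\le\; v_{\text{uptake}} \;\le\; \frac{V_{\max}\,[S]}{K_M + [S]},
\qquad V_{\max} = k_{\text{cat}}\,[E]
```

When [S] is much smaller than K_M the bound grows roughly linearly with concentration, and when [S] is much larger than K_M it saturates at V_max, so the cell cannot take up nutrient arbitrarily fast even in a rich medium.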
So, again, there is a kinetic component in the uptake rate, but once the molecules are inside the cell you assume that the cell is at steady state, and you predict the intracellular fluxes and the growth rate using the standard steady-state approximation, but with environmental conditions dictated by the concentrations of the metabolites rather than by arbitrary bounds on the fluxes. This allows you to monitor the change in the environment and see how the environmental composition is modified by the presence of whatever microbes you have there, and how this in turn can affect the future of the community.
When we developed this idea of using dynamic flux balance for studying microbial communities, we also wanted to embed in it the capacity to model the spatial structure of communities. So in addition to implementing this dynamic flux balance engine in a given region of space, we added processes of diffusion. Initially we modeled the pressure generated by cells when they grow and divide, and potentially fluctuations in the environment. We do this by looking at the local neighborhood; in the end we numerically solve the PDEs describing these processes, and we end up with discrete simulations in time and space, where each region in space contains a certain average amount of the biomass of a given organism. We can do simulations of, say, colonies growing on a Petri dish. In this case it's just one single organism, but of course you can do this for an arbitrary number of organisms, and again model the dynamics of communities in space and time, which is why we call this Computation Of Microbial Ecosystems in Time and Space (COMETS). The first work presenting this was from 2014, but the platform has evolved significantly. Let me show you some examples of how we first tested this platform. A lot of the parameters, such as the Vmax and the Km for the uptake rates, were taken from the literature.
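The spatial part described above, a grid of regions with diffusion between neighbors, can be illustrated with a generic explicit finite-difference step. This is a textbook scheme chosen for illustration and an assumption on my part; it is not the solver COMETS actually uses, and the function name `diffuse_step` and its parameters are hypothetical.

```python
def diffuse_step(grid, d_coeff, dx, dt):
    """One explicit (forward Euler) diffusion step on a 2D grid with no-flux edges.

    grid    : list of rows, each a list of amounts per grid cell
    d_coeff : diffusion coefficient (length^2 / time)
    dx      : grid spacing; dt : time step
    """
    rows, cols = len(grid), len(grid[0])
    alpha = d_coeff * dt / dx ** 2
    if alpha > 0.25:
        # Standard stability limit for this explicit scheme in two dimensions.
        raise ValueError("time step too large for a stable explicit update")
    new = [row[:] for row in grid]
    for i in range(rows):
        for j in range(cols):
            here = grid[i][j]
            net = 0.0
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                # Skipping out-of-range neighbors gives a no-flux boundary.
                if 0 <= ni < rows and 0 <= nj < cols:
                    net += grid[ni][nj] - here
            new[i][j] = here + alpha * net
    return new
```

In a spatial dFBA simulation one would alternate a step like this, applied to each metabolite layer, with a local flux balance update in every grid cell. Because each exchange between two cells is symmetric, the total amount on the grid is conserved.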
Similarly, we could implement a death rate that was known from previous measurements, metabolite diffusion, and biomass diffusion. There are other parameters that are essential for the modeling, but it is a very limited number of parameters relative to what you would have if you had to model the whole kinetics of the cell. Also, there are no internal kinetic parameters: internally everything is based on FBA. This is a snapshot of the simulation of simple organisms growing. One first test used the fact that colonies growing on a surface are known to increase in size linearly, and the growth rate obtained with the COMETS simulations was very similar to the growth rate obtained experimentally in prior observations. You can see that it strongly depends on the carbon source that is available: highest with glucose, and lower with lactate and acetate. So this was an initial testing, or calibration if you wish, of the model.
Now, the model has evolved into a much broader platform. For those of you who are interested, this is what we call COMETS 2, a much enhanced version, which is now available at this website, runcomets.org. This is a collaboration between our lab, the lab of Kirill Korolev at BU, the lab of Alvaro Sanchez at Yale, and Will Harcombe at Minnesota. What is nice is that this was initially, and still is, an open source platform, with the idea that different groups could add different modules. The hope is that people will be interested in using this platform, reporting anything they find that is not working properly, and also consider adding different modules. It is written in Java, but we now have Python and MATLAB interfaces. Just to give an idea of what this can do: you can predict, as shown before, a colony, or let's say colonies, growing on a surface or a Petri dish.
At any given time you have all the variables that flux balance and dynamic flux balance analysis give you. So at any given time you could look, for example, at the growth rate: you can see here that the colonies tend to grow at the sides, at the perimeter of the colony. You can look at the amount of the metabolites left on the plate at any given time; for example, glucose is being depleted and acetate is being produced (these are E. coli colonies). There are a number of other features we are adding now: nonlinear diffusion and finite population effects, so you can see these dendritic structures and sectoring happening, and the nutrient dependence of colony morphologies. Thanks to Alvaro's input, COMETS now has the capacity to implement evolutionary processes, and there are a number of other features being added: extracellular enzymes, secretion functions and so on. This is for now available as an arXiv preprint. Again, the manual and instructions should all be on the website. So this is the platform, but let me show you a little bit more of the kinds of things we did early on, and are doing now, using this approach to really try to understand community dynamics and interactions.
When we first implemented this, we were lucky to have an exciting collaboration with the group of Chris Marx. Will Harcombe, who at the time was a postdoc in his group, had developed this very nice artificial community of two strains. One was an E. coli strain that lacks the capacity to produce methionine, so this organism cannot grow on its own unless you provide methionine in the medium. The other partner was a Salmonella strain, where the Salmonella could grow on acetate but not on lactose. So if you were to grow the system on lactose, the Salmonella would not be able to grow on its own, but as you can already see, E. coli secretes acetate that can feed the Salmonella. And if the Salmonella could produce methionine to help this E. coli, then this could be a stable community of two obligately syntrophic bacteria. It turns out that, in addition to engineering this strain, Will had to evolve the system in order to make sure that the Salmonella could really produce the methionine to feed the E. coli, and this ended up working beautifully.
For us this was an opportunity, as we wanted to test COMETS. We incorporated the model of the E. coli mutant and the model of Salmonella, and we asked whether the model would recapitulate the experimentally observed proportion of the two species. So this was the experimentally observed proportion of the two species. Interestingly, this proportion, with higher E. coli and lower Salmonella, was converged to irrespective of the initial conditions. So this was a stable composition reached from different initial conditions, and COMETS recapitulated those proportions quite well. Now, you could think that this is a little bit of an overkill, and in fact you could imagine making much simpler models of these organisms that would recapitulate these observations based on the uptake and secretion of compounds. But first of all, in this case there is no tuning of internal parameters, and the Michaelis-Menten parameters were taken from the literature, so it's still quite nice to see this agreement. And what was somewhat surprising is that this also worked for a three-species community; this is again experimental work done in the Marx lab. In this case, in addition to these two organisms, they added a third bacterium called Methylobacterium extorquens. This is a bacterium that typically grows on plants, so it's interesting that this community is really a synthetic community, composed of organisms that do not come from the same biome. They just have different origins, but you can make them coexist.
And in this case, the way Methylobacterium was added to the system is by providing methylamine as the only nitrogen source. Of course, each of these organisms needs nitrogen, so if they don't have ammonia in the medium, they need to get the nitrogen from the Methylobacterium, which can produce ammonia and in effect feed these two. So this three-species community is now a community where each of the species needs the other two: none of the individual organisms and none of the pairs can grow on its own, but the three species can grow together. And again, this was the experimentally observed proportion of the three species after a number of passages, and it was recapitulated reasonably well by the COMETS simulations. What is also interesting here, by the way, is that, somewhat counterintuitively, Methylobacterium, which was the slowest growing organism, was the most abundant in the population. This is probably because it was producing the needed nitrogen only in small amounts, so the only balance that the community could find was with a higher abundance of that organism. This is promising, and it was really the first indication that COMETS might be a valuable resource for modeling ecosystem-level metabolism. More about this later, but let me first show you some examples of how one can use the spatial aspects of COMETS to also address questions about the spatial structure of communities and interactions on a plate.
Oh, before going there, one thing I want to highlight that is actually important: remember we said that some of these secretions are spontaneous. For example, the acetate produced by E. coli that feeds the Salmonella is a natural byproduct that E. coli secretes while maximizing its own growth rate, so this is one of the costless secretions we discussed last time. But this other secretion, the secretion of methionine, was really an evolved trait.
So it's somehow imposed: even if it's a costly trait, it's imposed by the necessary interaction between these two organisms, which were co-evolved on plates. In our simulations we had to impose this methionine secretion flux, because the flux balance model could not naturally take into account the mutation that made the Salmonella strain overproduce methionine. So this is material for future research: how, and whether it's possible, to take into account these evolutionary mutations that could give rise to the production of costly methionine. There is a very nice paper by a former postdoc in the lab, which I'm not going to discuss here in detail, that looks exactly at how this costly secretion in dynamic FBA or FBA can be combined with game theory to address questions about the stability of microbial communities connected by leakage of compounds.
But let me go back, as I was saying earlier, to the spatial structure of these communities. There is one simple experiment that Will did with the two strains, the Salmonella and the E. coli, just growing them at different distances. As one might expect, because they depend on these diffusible molecules, the closer together they are, the better they can help each other, and therefore the faster they grow. This is recapitulated in the COMETS simulations as well, but it was somewhat trivial. Will, however, had the idea of testing a slightly more complicated scenario. The idea was the following. Imagine you put these two colonies on a dish: you have an E. coli strain, the methionine knockout strain, and the evolved Salmonella strain on a plate. As shown before, they will grow and exchange by diffusion probably acetate and methionine (there may be other metabolites involved, but likely these two would be key) and be able to grow. But now the question is what happens if you put a second Salmonella strain in between these two.
The expectation we had, and one of the reasons we modeled this, is that we expected what we called a kind of eclipse effect: we expected that this Salmonella strain would take a lot of the nutrients, the acetate secreted by E. coli, and leave the initial Salmonella a little bit in the shadow, not allowing it to grow as efficiently as it did before, or maybe at all. So this was the expectation; we wanted to model this metabolic eclipse on a Petri dish. And we actually did the modeling first. What we found was quite surprising. This is showing the growth of what we call the distal Salmonella, so this colony, in the presence and in the absence of the intermediate colony. What we saw from the model predictions was that this Salmonella could grow faster in the presence of the eclipsing intermediate colony, and this was somewhat puzzling. We weren't sure whether this was an artifact of the model. But then Will did the experiment and confirmed that this also happens experimentally: the Salmonella strain in the middle of the plate ends up helping the distal colony rather than harming it. As you can probably imagine, the reason for this is that, even if the intermediate Salmonella is potentially using some of the acetate that the E. coli is secreting, of course diffusion goes around it. And it turns out that this Salmonella is closer to the E. coli, so it helps the E. coli grow more efficiently and produce more acetate. The net effect on the distal colony is that its growth rate is increased by the extra acetate produced by E. coli more than it is reduced by the eclipse effect of the intermediate colony. So the idea is that this intermediate colony, which in our minds was originally going to be a competitor of this one, ends up helping it because it helps its partner.
So this is just to clarify very quickly: the intermediate one can also excrete methionine, right?
Yes. And thank you for the question. We did do the control with a non-secreting Salmonella, and in that case you really do observe this eclipse effect. Thanks for the question.
So, one could obviously explore different geometries. There are some nice follow-up studies to this. But I think the main take-home message from this example, which for me was quite revealing, is this: we often tend to think of interactions as being positive or negative, but when you look at them in the spatial context, things can get quite a bit more complicated. I think that's something important to keep in mind and take into account when you look at communities in spatial settings.
One thing I want to show you is that, because of the capabilities of COMETS, you can look at different aspects of the simulations at any given time. For example, again, these are the three colonies, and you can look at the intensity of the acetate secretion flux. As expected, you can see that the E. coli is secreting acetate, whereas the two Salmonella strains are using up the acetate, and you can see that this changes as the colony grows. And most interestingly, recapitulating what we discussed early on, at some point the E. coli at the periphery of the colony is still secreting acetate, but in the interior of the colony, where the lactose, which is the main carbon source here, is probably running out, the E. coli cells start to take up the acetate that they secreted before and grow on that acetate. So there is this phenotypic change that happens within a colony. There are independent confirmations of this happening in E. coli, but it's interesting that it happens also in this case. And again, it shows you the potential insight that one can get by looking at these different layers of metabolites in COMETS simulations.
So there are a few things that COMETS and its different applications can help with. One thing we started doing, and this is feasible through a network visualization software called VisANT, developed by Zhenjun Hu and Charles DeLisi, is to combine this with our COMETS simulations in order to map the outcome of the simulations onto a network where you have the individual organisms; again, this represents Salmonella and this E. coli. It's not immediately obvious, and not really shown in an intelligible way, but this represents the whole metabolic network of Salmonella and this represents the whole metabolic network of E. coli. And these are the metabolites that are being exchanged. In red are the metabolites that are used by both organisms, so these are sources of competition between the two organisms: these are the oxygen, nitrogen, sulfur and phosphorus sources. There are metabolites that are produced by both, such as CO2, and then there are metabolites that are being exchanged, in gray, such as methionine and acetate. And, for example, here the model predicts that for some reason galactose could also be an exchanged metabolite, something that is a new prediction of the model. But this is just to highlight that, in addition to representing the simulations as dynamical graphs showing the change in abundance of different species, one can start building ecological networks containing both the microbial species and the metabolites, potentially getting insight into what aspects of the internal network are responsible for the exchange and utilization of the different resources in the environment. And in an early attempt, which is probably not really representative of what happens in the real gut microbiome, we started doing simulations of some key taxa from gut microbial communities, including the famous, or infamous, Clostridium difficile.
And you can see that there are a number of metabolites that are exchanged or that organisms can compete for. Again, this is just the tip of the iceberg of the kinds of things that we and others are doing, and that can be done in the future, to use these dynamical models to try to predict the structure of communities. I will end by showing a couple of examples of how we're using COMETS for a number of other applications. One is the use of flux balance modeling to try to predict what environmental composition could give rise to a desired community structure. You saw this in other talks, and we talked about this: there is a lot of interest in understanding how environmental composition affects microbial community structure, and the question Alan Pacheco in the lab asked recently is whether we can use this dynamic flux balance modeling to try to induce a desired composition based just on the medium composition, on the molecules that you feed to the community. So what Alan did was to use a genetic algorithm to try to design communities with a specific proportion of taxa. Let me illustrate how we use the genetic algorithm in this case. These squares represent different nutrients given to the community in dynamic flux balance COMETS simulations. Based on the set of nutrients you give (it could be three, five and so on) you get a certain dynamics for the community. And you can score or rank the communities based on how close they are to a desired composition, a desired structure. For example, if you want all species to be equally abundant at the end of the simulation, this would be a very good simulation, which means that this nutrient set is a very valuable one. This one would also be quite good.
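The medium-design loop being described (score each nutrient set by how close the resulting community is to a target composition, keep the best ones, then mutate and recombine them) can be sketched as a small genetic algorithm. Everything specific here is a hypothetical placeholder: `simulate_community` stands in for a COMETS run, and the `BENEFIT` table, population size, and mutation rate are made-up illustrative values, not the settings used in the work discussed.

```python
import random

N_NUTRIENTS = 8

# Hypothetical growth benefit of each nutrient for each of three species.
BENEFIT = [
    [3, 0, 1, 0, 2, 0, 0, 1],
    [0, 2, 0, 3, 0, 1, 0, 1],
    [1, 0, 0, 0, 0, 2, 3, 1],
]


def simulate_community(medium):
    """Placeholder for a dynamic FBA run: returns final relative abundances.

    medium is a tuple of booleans, one per nutrient (present or absent).
    """
    raw = [0.1 + sum(b for b, present in zip(row, medium) if present)
           for row in BENEFIT]
    total = sum(raw)
    return [x / total for x in raw]


def fitness(medium, target):
    """Higher is better: negative squared distance to the target composition."""
    abundances = simulate_community(medium)
    return -sum((a - t) ** 2 for a, t in zip(abundances, target))


def evolve_medium(target, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Search for a nutrient set whose predicted community matches the target."""
    rng = random.Random(seed)
    population = [tuple(rng.random() < 0.5 for _ in range(N_NUTRIENTS))
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(m, target), reverse=True)
        parents = ranked[: pop_size // 4]
        children = list(parents[:2])  # elitism: keep the two best media unchanged
        while len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, N_NUTRIENTS)  # one-point recombination
            child = mom[:cut] + dad[cut:]
            # Mutation: flip the presence of each nutrient with small probability.
            child = tuple((not g) if rng.random() < mutation_rate else g
                          for g in child)
            children.append(child)
        population = children
    return max(population, key=lambda m: fitness(m, target))
```

For example, `evolve_medium([0.6, 0.2, 0.2])` searches for a medium that favors the first species while keeping the other two present. In a real application the expensive part would be the simulation inside `simulate_community`, which is why the fitness is evaluated only once per candidate medium per generation in practice.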
So you can select those two and do mutations and recombination of these genomes, so to speak, that represent the nutrient composition of the media, to obtain a new set of media, which can then be fed to the algorithm again for a new round of the optimization process. So this is essentially just an optimization process, performed using a genetic algorithm, based on fitness calculations for the community that are obtained with COMETS. And this is an example of the outcome of the simulations, where you can ask, for example, for high abundance of one of the species, in this case out of three. There are subtleties: if you ask for a high abundance of this species, but also for survival, so that the other species do not disappear, you'll get a certain set of nutrients and a certain structure of the community that is predicted to achieve this composition. And this will of course change if you change which organism you prefer to be the most abundant. There is a lot more that one could do, and a lot more data in this bioRxiv preprint, but I want to switch soon to something else.
I'll conclude this part by saying that I think one of the goals, and one of the exciting parts of using this dynamic modeling in COMETS, is that one can start thinking of making predictions for natural communities and engineered communities, and try to see whether we can gradually reach the capacity to design communities with specific properties. Of course this will require extensive experimental testing. We and others have started doing some of that, and I think it will be exciting to see how this progresses. One last thing I want to highlight before concluding this part goes beyond the kind of mechanistic modeling that we discussed, where you try to predict the interaction networks in a community and the mechanisms of interaction, the exchange of metabolites, starting from the genomes.
There are a lot of datasets that come from metagenomic sequencing, and predictions of co-occurrence networks. One of the exciting endeavors in the future will be to try to understand more of the interplay between these two types of networks. It's clear that these co-occurrence networks do not necessarily mean actual interaction between the species. But I think it will be very useful and interesting to try to understand the connection between the two, because we'll be able to build more and more of this type of network, and there is already a high abundance, and there will be more, of co-occurrence networks based on sequencing of multiple communities. Let me pause here for a second before we move to a very different topic.
There is a question. So there is a question from Miguel Rodriguez.
Yes. Thank you, Daniel. Two very quick questions. One is, I saw you had in many of your plots either error bands or error bars, even in the simulations. Is that derived from stochastic simulation, or is it an actual measurement of error derived from there?
Yeah, that's a good question. I think the error bars I showed in the older models here were based on uncertainty in the parameters. At that time we didn't yet do stochastic simulations, so we could only vary the initial conditions or the parameters, based on the uncertainty in the parameters. But now we can put stochasticity in the simulation, so we can also generate error bars based on the stochasticity of the simulations themselves. So both are possible; early on we didn't have stochasticity, and now it is part of the model, yes.
Just a quick follow-up. One of your slides has a bunch of different organisms from the gut for which you modeled the potential interactions. Yeah, this one. It's clear from here that E. coli has the richest metabolism of all, but obviously that's probably just because we know more about E. coli than we know about the others. Is there a way to measure the incompleteness, the effect of the incompleteness of a model on the simulation?
That's a good question. And yes, I absolutely agree. I think the fact that we know much more about E. coli is what causes this abundance of different arrows. I'd have to think about this. I don't know that there is a clear individual metric that can tell you how complete we expect a model to be. Certainly, from the gap-filling algorithms one can estimate, let's say, how many reactions were missing early on and how many have been added by the process of gap filling. I would imagine that could give an idea of how much missing knowledge there may be for a given organism, but I haven't seen anyone do this, and we haven't done it. I think it's a very good idea and an interesting point, and it would be helpful to have some standardized metrics. I know there is actually a beautiful approach called MEMOTE. It's a suite of tools to analyze genome-scale models, built by a large group of people. I think that may have some quantification of this uncertainty, so it's worth checking. But that's a very good question. I think it will be important to have metrics like this.
Thank you.
Any other questions? From Keith.
Yes, it's sort of an extension of Miguel's question. If I want to use this COMETS platform, do I need to get the growth rate of each species, the genome of each species, and a good annotation? So do we need those three components?
You do not need the growth rate. You do need the annotated genome, or an already built stoichiometric model. I don't have it here, but if you look at some of the slides from the first or the second presentation, there are pointers.
So there are a number of resources where you can download already built stoichiometric models. There is a database called BiGG that has a number of organisms; that's from Bernhard Palsson's website. There are other groups that have their own models. There are a number of publications with already built models, and resources like KBase that can automatically build draft models from genomes and do gap filling. So there are a number of tools to generate these models from genomes. Once you have that information, in addition to the stoichiometry itself, in order to do dynamic flux balance modeling you do need the kinetic parameters for the uptake rates. I should say that in the simulation I showed before, for example, we assumed a standard Km and kcat for all metabolites based on glucose, which is certainly not accurate, but it was good enough to give these predictions, maybe because the main carbon sources were still organic acids. But yes, in principle you do need those parameters. They're not very difficult to measure, much easier than intracellular parameters, but they're still necessary for running dynamic flux balance analysis. You do not need the growth rate, though; the growth rate is an outcome of the simulation. And again, for these kinetic parameters, if you know nothing you can assume some uniform parameters from the literature. But the more you put in, of course, the more accurate the predictions can be.
So if I want to create this natural community, how many species can this model take? Is there a limit or threshold?
I think you can certainly put in tens or, I think, hundreds of models easily. I don't know exactly what the largest number we tried is, but I believe it's in the hundreds.
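The role of the uptake kinetic parameters in this answer can be illustrated with a minimal sketch. It assumes a single organism growing on a single nutrient, so the flux balance solve is replaced by a one-reaction stand-in (take up nutrient at its upper bound, convert it to biomass with a fixed yield); a real genome-scale model would call a linear programming solver at that point. The function names and all parameter values (`vmax`, `km`, `growth_yield`) are hypothetical and are not the values used in the simulations shown in the lecture.

```python
def mm_uptake_bound(conc, vmax, km):
    """Michaelis-Menten upper bound on the uptake flux (mmol/gDW/h)."""
    return vmax * conc / (km + conc)


def dfba_step(biomass, conc, dt, vmax, km, growth_yield):
    """Advance biomass (gDW/L) and nutrient (mmol/L) by one interval dt (h)."""
    # Stand-in for the FBA solve: with a single nutrient and a single biomass
    # reaction, the optimum is simply to take up nutrient at its upper bound.
    uptake = mm_uptake_bound(conc, vmax, km)
    # Do not consume more nutrient than is present in the environment.
    uptake = min(uptake, conc / (biomass * dt))
    growth_rate = growth_yield * uptake  # 1/h: an output, not an input
    new_biomass = biomass + growth_rate * biomass * dt
    new_conc = max(conc - uptake * biomass * dt, 0.0)
    return new_biomass, new_conc, growth_rate


def simulate(biomass, conc, dt, steps, vmax=10.0, km=0.01, growth_yield=0.05):
    """Return the trajectory as a list of (time, biomass, nutrient) tuples."""
    trajectory = [(0.0, biomass, conc)]
    for i in range(1, steps + 1):
        biomass, conc, _ = dfba_step(biomass, conc, dt, vmax, km, growth_yield)
        trajectory.append((i * dt, biomass, conc))
    return trajectory


if __name__ == "__main__":
    for t, x, s in simulate(biomass=0.001, conc=10.0, dt=0.5, steps=40)[::8]:
        print(f"t={t:5.1f} h  biomass={x:.4f} gDW/L  nutrient={s:.4f} mmol/L")
```

Note that the growth rate is never supplied: it falls out of the uptake bound and the (here trivial) optimization at each time step, which is the point made in the answer above.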
So I think the limitation is not so much the simulation. The simulation time will grow linearly with the number of species, because each model has to do its own flux balance calculation, but it doesn't grow more than linearly with the number of species. So I think the limiting factor for modeling complex natural communities is actually having good, accurate stoichiometric reconstructions from the genomes, not the simulation engine itself. And I should say, we have tested the simulations on small communities. Whether and how accurate this will be on more complex communities is still unknown, and it's an interesting question that we are interested in addressing.
Thank you so much.
Yep, thank you. I think it's a good question.
Great, I don't see any more hands raised. I'll move forward to tell you a little bit about a slightly different approach, which is related to something we mentioned before: the whole-ecosystem-level approach to metabolism. This is something we applied to the study of the early evolution of metabolic networks. This is work done by Josh Goldford, Hyman Hartman, Temple Smith and Bobby Marsland. The starting point for this is the following. We tend to think of fossils as something that you see in rocks, right? You can see something like this. This is actually fossil evidence of some of the first multicellular organisms. This is from Canada. And we also got used to the idea that DNA sequences are somehow fossils of early life. They contain information about the ancient history of life. And the question one can ask is: why not networks? Do metabolic networks contain, in their structure, in their architecture, information about the ancient history of metabolism and of life itself?
The question is how we can tease out information about this. And this is related again to the idea we expressed early on: when you model a community, you can think of it as a set of organisms, each of which has its own internal circuits, and which can exchange things. You can ask whether you can predict ecological interactions based on the intracellular circuits of these organisms. But we also raised the question of whether perhaps, at some point, communities are complex enough, and we know now from Alvaro's lecture and from evidence in other contexts that functions are so important in determining the fate of a community, that perhaps it doesn't even matter which organism performs what function. We could treat the whole community as a soup of enzymes, right? And look at the set of all the reactions as if they belonged to a single compartment. Now, when you study ancient life, there is a very special meaning to this concept, and this meaning is related to horizontal gene transfer. We know that microbes can exchange enzymes with each other, and enzymes can be transferred from one organism to another. So metabolic networks that at a given time seem to be stable and associated with a given species can, if you look at the long-term history of life, change and move from one organism to another. You can imagine this being a very plastic process, where it really makes more sense to think of metabolism as a property of the whole ecosystem and not of individual organisms. When you think this way, you can start asking not just what an organism can do with its metabolism, but also what an ecosystem can do. What are the metabolic capabilities of an ecosystem?
Except that now, when you ask questions about the ancient history of life, you can make hypotheses about what was possible in the presence of a few specific molecules that might have been present on early Earth. So you can ask questions about the expansion of metabolism from an early subset of metabolites. It's as if you can try to get some historical record of the growth of metabolism starting from its early seed of possible compounds. The way we did this was by using a beautiful, simple but very powerful algorithm that was developed by Oliver Ebenhöh in Reinhart Heinrich's group several years ago. It is called the network expansion algorithm, and the idea is the following. I'm going to illustrate this on a very simple toy model of a network, but imagine this representing the huge network we saw before. So this is the network of all possible metabolites, shown as circles, and reactions, shown as arrows. Imagine now you start with a seed of possible compounds. For example, these two molecules are present. You can ask simply: given that these two substrates are present, what new substrates could possibly appear in our world, given that these reactions are possible? And of course, this reaction could in principle occur, and these two new metabolites could be added to the network. You can define the scope as the total set of metabolites that are producible. This reaction cannot occur because you don't have this initial substrate. So the scope that you obtain is the set of these four metabolites. If you add this initial molecule to the seed, then you can get these two, and then further metabolites through the next reaction, and the whole network now becomes feasible. Now, in generating this network, we don't say anything about the presence of the enzymes that are needed to catalyze these reactions.
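The expansion procedure just described can be sketched in a few lines. This is a minimal illustration, assuming each reaction is given as a pair (substrates, products) and treated as irreversible; the function name and the toy network are invented here and only loosely mirror the example on the slide.

```python
def network_expansion(reactions, seed):
    """Return the scope: every metabolite producible from the seed.

    reactions: iterable of (substrates, products) pairs of metabolite names.
    seed: iterable of metabolite names assumed to be available initially.
    """
    scope = set(seed)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            # A reaction can fire only when all of its substrates are in scope.
            if set(substrates) <= scope and not set(products) <= scope:
                scope |= set(products)
                changed = True
    return scope


# Hypothetical toy network (not the one shown in the lecture).
toy_reactions = [
    (("A", "B"), ("C", "D")),  # possible as soon as A and B are present
    (("C", "E"), ("F",)),      # blocked unless E is available
    (("F",), ("G",)),
]

if __name__ == "__main__":
    print(sorted(network_expansion(toy_reactions, {"A", "B"})))
    print(sorted(network_expansion(toy_reactions, {"A", "B", "E"})))
```

With the seed {A, B} the scope stops at four metabolites because the second reaction lacks E; adding E to the seed makes the whole toy network reachable. As in the lecture, nothing here models whether a catalyst for each reaction is available.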
Let's assume somehow that catalysts, enzymes or proto-enzymes, are present. We'll get back to this later when we look at the early history of life. So I hope this is clear. Again, this is a very simple topological algorithm that allows you to know what portions of a network can be reached from an initial set of compounds. We can play the same game for the real network by taking some initial compounds and asking which of the 10,000 or so metabolites present in the community can be reached.
Is there a question?
I think it was not a question.
Oh, okay.
Please.
So one can ask what part of this network can be reached based on the initial set of compounds. I'll first show you the way we applied this algorithm a few years ago, in work done with Jason Raymond, asking a question related to one of the things we discussed in my first lecture, which is the transition from an anoxic to an oxic world. If you remember, about 2.2 billion years ago the atmosphere, which had been anoxic, started to fill with molecular oxygen. This was due to the activity of bacteria. Now, oxygen can be very toxic, of course, and cause a lot of changes in metabolism. So we asked, based on this network, what could happen to metabolism if it initially does not involve oxygen and after some transition it does. What changes would you expect in metabolism because of the presence of oxygen? There are a lot of aspects to this, so I'm just showing a snapshot of one of the outcomes of this analysis. In blue, you see the anoxic network. Again, this is not the network of an individual organism; it's the network expanded from an initial set of plausible early Earth metabolites, and it involves reactions that are present in multiple organisms.
But I want to point out that what is striking is that there are these additional branches that become possible only when you add oxygen to the initial seed of the network. So when the network expands in the presence of oxygen, there are new molecules that become available. And interestingly, there are very few changes at the core of the network. We know, of course, that oxygen has a role as an electron acceptor for oxidative phosphorylation, for the more efficient metabolism that we discussed. But oxygen has a lot of other roles, which are known but are diffuse across different pathways. Here you can see the impact of oxygen availability as a molecular substrate that enables the production of some of the more complex molecules, such as flavonoids; sterols, which include cholesterol; and a lot of molecules involved in communication, such as hormones, and terpenoid metabolism. So there are a lot of molecules that can in fact be signatures of eukaryotic and multicellular life and that are really associated with the rise of oxygen in the atmosphere. So this is one possible use of the network expansion algorithm, but I want to show you a more recent result, also obtained with the same algorithm, addressing what is known as the phosphate problem in origin-of-life research. This is a mineral called apatite, which contains phosphate. And this is an example of what can happen to a marine ecosystem when phosphate is added in large amounts, in this case due to pollution. It causes a huge rise in the growth of algae and photosynthetic bacteria. Now, what is interesting about these rocks is that they are the kind of rocks where you expect phosphate to be found, except that it's very poorly bioavailable. So it's difficult to extract phosphate from these rocks.
In fact, it can be extracted by bacteria through the secretion of organic acids, but it's hard to imagine how phosphate could have been available for early metabolic processes. And this is a problem because, as we saw the first time, there are molecules central to life, such as coenzyme A and ATP, that contain multiple phosphate groups, in orange here, and there are plenty more. In fact, one cannot really imagine life without phosphate, because DNA and RNA are phosphate-containing molecules. So a life without phosphate would have no nucleic acids; it would be a life without genomes, without transcription and translation, and without this energy currency that we discussed, ATP. We started asking, though, whether it's possible that an early metabolism could have emerged prior to the availability of phosphate, and therefore whether living systems could have emerged as an early metabolic process that could later on have given rise to life as we know it today, with Darwinian selection based on genomes and transcription and translation. So we asked what happens if you start from a seed of metabolites that contains some carbon sources, sulfur, which is known to have been present on early Earth, and different sources of nitrogen. So these are all carbon, nitrogen and sulfur compounds, but there are no phosphate-containing molecules. The question is, could you get any metabolism at all from this phosphate-free seed of metabolites? Our expectation when we ran the network expansion algorithm was that phosphate is so strongly embedded, it's present everywhere: ATP, which drives reactions, contains phosphate, and there is phosphate everywhere in metabolism today. So we thought it would be pretty much impossible to get anything but small pieces of this network.
But we were very surprised to find that instead there is a core network, counting 315 reactions and 260 metabolites, that is fully connected and does not contain any phosphate at all. This is the expanded network. It starts here: you can see CO2 and ammonia and some simple molecules. As the iterations of the network expansion algorithm progress, you add more and more molecules. It turns out that 10 of the 20 amino acids are part of this network, as are some of the precursors of nucleotides. And again, there is no phosphate at all. But there is this core network that is embedded in present-day metabolism. Again, it is not necessarily present in any individual species, but at an ecosystem level this could be a snapshot of an early metabolism before phosphate became available. I don't have time to go into all the details of this. But I want to highlight first of all that we don't really know, right? We're speculating about something that could have happened 3.8 billion years ago, and nobody really has any idea of what exactly happened at those times. But what is interesting about this network is that it really exists. The network is there; whether or not it tells us something about ancient history, we're not sure. But it is potentially a fossil of this early metabolism. One additional piece of evidence that could provide some corroboration for this idea is the following. Those reactions would have been catalyzed by simple mineral surfaces and maybe small molecules, but there were no proteins at that time. So how could we get some insight into the possible catalysts at that time? What one can do is look at the enzymes that catalyze those reactions today, those 315 reactions.
The protein itself is a very complex structure that could not have been present early on, but the cofactors are molecules that sit in the active sites of these enzymes. Some of these are very ancient; for example, some enzymes contain iron-sulfur clusters, which are minerals known to be associated with early Earth environments. We looked at how frequently these iron-sulfur clusters appear in this core network, the network of phosphate-independent metabolism, versus the full expanded network. It turns out that there is a very strong enrichment of iron-sulfur clusters in the core network relative to the complete network, indicating that perhaps this network really captures some of the early activity of metabolism on our planet. There is follow-up work, which I'm not going to have time to go into, exploring in a much more systematic way how different assumptions about the early Earth environment could give rise to different proto-metabolisms. This takes into account not just the carbon sources, nitrogen sources and so on, but also the electron donors, where the electrons came from, the pH and the temperature. Importantly, these come into play through the thermodynamic feasibility of the reactions. In the network expansion that I just described, we initially did not take thermodynamics into account, but you can imagine that in addition to the topological feasibility of the network you want to look at its thermodynamic feasibility, and this will depend on pH and temperature in a way that can be estimated based on the distribution of each of these components. So we did these calculations, and one can again find which conditions allow the network to expand to a large network and which do not, identifying parameters that seem to be conducive to an initial metabolic network.
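How temperature and concentrations enter a feasibility check can be sketched with the textbook relation ΔG = ΔG° + RT ln Q, where Q is the ratio of product to substrate concentrations. This is a generic illustration, not the procedure used in the follow-up work: it assumes unit stoichiometric coefficients, ignores the pH correction to ΔG°, and uses made-up numbers for the example reaction.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def reaction_free_energy(dg0, substrates, products, temperature):
    """Free energy change (kJ/mol) at the given concentrations (mol/L).

    dg0: standard free energy change in kJ/mol.
    substrates, products: lists of concentrations, unit stoichiometry assumed.
    temperature: absolute temperature in kelvin.
    """
    ln_q = sum(math.log(c) for c in products) - sum(math.log(c) for c in substrates)
    return dg0 + R * temperature * ln_q


def is_feasible(dg0, substrates, products, temperature):
    """A reaction can proceed in the forward direction only if dG < 0."""
    return reaction_free_energy(dg0, substrates, products, temperature) < 0.0


if __name__ == "__main__":
    # Hypothetical reaction with dG0 = +5 kJ/mol, one substrate, one product.
    for product_conc in (1e-3, 1e-6):
        dg = reaction_free_energy(5.0, [1e-3], [product_conc], 298.15)
        print(f"product at {product_conc:g} M: dG = {dg:.2f} kJ/mol")
```

A filter of this kind could be applied to each candidate reaction before it is allowed to fire in a topological expansion, so that the reachable network depends on the assumed temperature and concentration ranges.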
And I will conclude by saying that we started doing something which I think is also a seed of what could be done more in the future: we started applying flux balance modeling to protocell systems. We took the networks coming from the network expansion algorithm and tried to model them with the same tools that we use to study present-day cells. One can look at whether these protocells could sustainably produce a simple biomass, composed in this case of much simpler molecules than the biomass of organisms today, such as simple lipids and keto acids that could be precursors of present-day proteins. So I will stop here and just acknowledge my group and thank all of you for listening from all over the world. I'm happy to take any questions.
Thanks a lot, Daniel, for these three fantastic lectures. We have time for a few questions, so if anyone wants to ask, please use the raise-hand feature or type it in the chat. I have a question myself, regarding the result on the network scoping and reconstructing this primordial metabolism. Perhaps I missed it, but you see that there is this backbone, and you are using it to infer something that, as you say, happened three point two billion years ago, and there is somewhat of a growing process on top of this network, right? So is there a null model you can compare this result to? Is this something you wouldn't expect to happen by chance?
That's a great question. One example of this, which I didn't go into in detail, is that these are all the different conditions: you can add or remove different molecules and ask what happens if you didn't have this compound or this other compound. You can see that many initial conditions do not result in an expanded network, so it's not obvious that all possible conditions will lead to an expanded network; certain conditions need to occur. And when you add the thermodynamics, you can also see that there are ranges of temperature, free energy and so on that do and do not give rise to this network. But I think what you're pointing out is a broader question: this is based on the chemistry we know today, right? How do we know that we're not biased? This is based on all the reactions that are present today, but there may have been other chemistries that appeared and disappeared throughout the history of life. One thing that I think is very helpful is to do exercises going back to what I introduced at the very beginning, artificial chemistry. That gives the opportunity to generate null hypotheses with arbitrarily complex chemistries, and to ask how likely it is to obtain an expanded network under different assumptions about the nature of the chemistry and the number of reactions that were lost throughout this process. I'm not sure this addresses your question. I think it's a hard question to ask what a null hypothesis is when you talk about these early metabolic processes.
Yes, I mean, I guess that's the question about the origin of life. There is another question from Matteo, who is asking whether you see a connection between this work and recent theoretical advances in stochastic thermodynamics of biochemical networks.
Yes, I mean, the short answer is yes. I think there are a lot of interesting connections. When flux balance modeling started early on, there was no thermodynamics in it, but I think that adding thermodynamics opens up a lot of new possibilities. In particular, not all the metabolic flux states that are feasible based on FBA are necessarily feasible thermodynamically. In the classical example, you can have three reactions running in a circle, which is balanced in terms of fluxes but of course infeasible thermodynamically. There have been a number of studies trying to add thermodynamic constraints to flux balance modeling. And I think the works you're mentioning are somehow complementary to that literature, but there is a lot more work to be done in bringing these two pieces together. I think there are also interesting questions of whether we can revise the objective functions based on thermodynamic principles. So I think it's an open area, but I definitely think there are connections, and a lot of these are still to be explored.
So, I don't see any other questions, so let me thank Daniel again for these three great lectures.
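The three-reaction cycle mentioned in the last answer can be checked with a few lines of arithmetic. This is a minimal sketch assuming a closed cycle A → B → C → A with unit stoichiometry; the chemical potentials used in the example are arbitrary made-up values.

```python
# Stoichiometric matrix: rows are metabolites A, B, C;
# columns are reactions A->B, B->C, C->A.
S = [
    [-1, 0, 1],
    [1, -1, 0],
    [0, 1, -1],
]


def net_production(S, fluxes):
    """Return S * v, the net production rate of each metabolite."""
    return [sum(s * v for s, v in zip(row, fluxes)) for row in S]


def reaction_free_energies(S, potentials):
    """Return dG of each reaction as products minus substrates potentials."""
    n_reactions = len(S[0])
    return [
        sum(S[i][j] * potentials[i] for i in range(len(S)))
        for j in range(n_reactions)
    ]


if __name__ == "__main__":
    # The cycle flux is perfectly balanced: no metabolite accumulates.
    print(net_production(S, [1, 1, 1]))
    # For any chemical potentials the dG values around the loop sum to zero,
    # so they cannot all be negative at the same time.
    dgs = reaction_free_energies(S, [-10.0, -25.0, -3.5])
    print(dgs, sum(dgs))
```

The flux vector (1, 1, 1) satisfies the steady-state constraint, so a purely stoichiometric FBA would accept it, yet at least one of the three reactions would have to run against its free energy gradient. Constraints that exclude such loops are one way thermodynamics has been added to flux balance models.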