Okay, welcome back everybody. We are live again. This is Antonio Cianani speaking, and it's my pleasure to chair the final part of today's session. We have another first of three lectures, by Daniel Segre from Boston University, and you can read the title by yourself: from genome-scale to ecosystem-level modeling of metabolism. Thank you, Daniel.

Thank you, Antonio. Welcome everyone. It's a great pleasure to be talking to all of you. I'm talking from Boston, where it's mid-morning. I'm going to start by showing you the agenda. We're going to talk today mostly about the logic of the cell, how to think about metabolism as a resource allocation problem. And we'll see next how to scale from genomes to ecosystems, looking at spatio-temporal modeling and the long-term history of metabolism. My hope for today is really to convey the way I think it's interesting to think about metabolism and, by addressing some of these questions, hopefully to motivate deeper insight. I'll keep this fairly broad, to make sure that everybody's on the same page. We'll address questions such as: what is metabolism, why does it matter, why should we be interested in studying it, and why do we need mathematical models to understand it? And we'll also start seeing why we can think of it as an ecosystem property. I'll start by sharing something that I always find stunning, which is that microbes, despite being so small, can really change the destiny and fate of a planet. And in fact they did. What you see here is a curve of the amount of oxygen present in the atmosphere of our planet throughout the history of life, from about 3.8 billion years ago until today. One thing that is clear here is that early on there was barely any oxygen in the atmosphere. And the reason oxygen started rising and becoming what it is today is the activity of bacteria such as this one.
These are oxygenic cyanobacteria, which photosynthesize and in doing so produce oxygen. It is because of these microbes that we have oxygen today, and it is because of these microbes that the planet changed completely, allowing the rise of multicellular, complex biological systems. So it's really important to understand how the metabolism of these organisms can affect other systems at such a global scale. Another reason to study metabolism, as of course all of you know, is that we are largely made of microbes, that is, many of the cells in our body are microbial cells. This is an estimate from Ron Milo and colleagues showing that there are slightly more bacterial cells inside and around us than there are human cells of our own. These are mostly in our gut, but there are microbes everywhere, and we're just starting to understand what role they play and how they interact with our body and with each other. This also happens largely through metabolism, as we'll see. And there are many other reasons to be interested in metabolism, and microbial metabolism in particular. Here are some examples. Some of you may be familiar with this, but there is a lot of interest in trying to understand how microbes can help produce biofuels and useful molecules from plant biomass, how they can affect global biogeochemical cycles, and whether they can help plant and crop production. And there is a rising interest in how microbiomes in the built environment affect human life at different scales. Now, if you zoom inside and try to understand what it is that makes all of these processes happen, this is what it is. This is metabolism. Many of you have seen this chart hanging on the walls of biochemistry labs. One of the things that is beautiful about metabolism is that it really spans all scales of biology, from individual cells to the metabolism happening across the biosphere. Each line in this graph is a chemical reaction.
We'll see some examples soon. This is a global biochemical chart, that is, the collection of all chemical reactions happening across all living systems. These are all now available in databases, but one of the questions is how we start understanding the history of this system and how it translates into all these processes that microbes are involved in. I like to think of the hierarchy of biology in a way that maybe is a little bit unusual. We tend to think of how you go from molecular reactions, and this would be a simple chemical reaction, to how these build complex networks, for example the intracellular circuits in a cell. This in turn leads to cellular dynamics, growth and division and so on, and then scales up to ecosystem-level processes. But what I hope you'll appreciate is that this process can actually go back and forth. In fact, a lot of the modeling we'll discuss has to do with starting from the molecular structure, but also looking at the ecosystem dynamics and cellular processes and trying to understand, based on what we know at the higher level, how those functions constrain the way molecular reactions should work. So we can navigate this hierarchy back and forth, and not necessarily just build upwards towards more and more complex systems. By the way, I should say, feel free to interrupt at any time. I don't know if I'll be able to see the chat, but feel free to step in if you have any question or anything is unclear. I have an agenda for today, but if we don't cover everything it's fine, and I'd rather be happy for everyone to have these concepts clear. So one of the questions we'll delve into between today and the next two hours is whether and how we can predict ecological interactions, that is, these green arrows, based on intracellular circuits, that is, what happens inside each cell, and how we do that. I will start by actually stepping back into something much simpler.
It's what I like to think of as a sandbox for playing with toy metabolic networks. I think this is a really useful exercise to start thinking about what metabolism is all about in a much simplified way, but it also raises a lot of interesting questions, as you'll see. This was motivated by work we did a few years ago with Sid Redner, Paul Krapivsky and Bill Riehl, and it's inspired by other work on string chemistries. This is an artificial chemistry, a toy chemistry, that is very simple. In this case there are just monomers, a, that can be combined. For example, one a plus two a's will give three a's, and you can express it in this way. In general you have these joining processes where a polymer of length i joins a polymer of length j to give rise to the combined polymer of length i plus j. You can write the diagram for the complete chemistry, for example for all polymers up to length four, and you can see that this is fairly simple. And if you ask, for this kind of chemistry, questions such as what is the most efficient way of producing an a4 from a1, this is a pretty straightforward problem which we can solve manually. We want to have no waste products and just use the minimal number of reactions. The solution, of course, is this logarithmic growth: from two a1's you produce an a2, and from two a2's you produce an a4. But things get complex quite quickly. For example, if I show you the chemistry up to a7 and ask you how you would produce, in an optimal way, an a7 from a4, this is something that would require a bit of thinking. You may have to break an a4 into a2's and so on and so forth, and gradually build up to an a7. And I'm pointing you here, if anybody's interested in playing with this chemistry.
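To make the "minimal number of reactions" question concrete, here is a minimal sketch in Python. It is not the tool or the method from the work mentioned in the lecture. It assumes a simplified version of the problem: only joining reactions are allowed, the only starting material is the monomer a1, and "optimal" means the fewest distinct joining reactions (which is the classic shortest addition chain problem). The function name `shortest_join_pathway` is made up for this illustration.

```python
def shortest_join_pathway(target):
    """Return a shortest list of joins (i, j, i + j) that builds a polymer of
    length `target` starting from monomers of length 1.

    Assumption for this sketch: only joining reactions a_i + a_j -> a_(i+j)
    are allowed, and every reactant must already have been produced.
    """
    if target == 1:
        return []

    def search(chain, limit):
        # `chain` holds the polymer lengths produced so far, in increasing order.
        last = chain[-1]
        if last == target:
            return []
        steps_used = len(chain) - 1
        if steps_used == limit:
            return None
        # Pruning: the longest polymer can at most double with each join.
        if last << (limit - steps_used) < target:
            return None
        for i in reversed(chain):
            for j in reversed(chain):
                if j > i:
                    continue
                new = i + j
                if new <= last or new > target:
                    continue
                rest = search(chain + [new], limit)
                if rest is not None:
                    return [(i, j, new)] + rest
        return None

    # Iterative deepening: try 1 reaction, then 2, and so on.
    limit = 1
    while True:
        result = search([1], limit)
        if result is not None:
            return result
        limit += 1


for n in (4, 6, 7):
    pathway = shortest_join_pathway(n)
    print(n, len(pathway), pathway)
```

For a4 this recovers the doubling pathway described above (two a1's give an a2, two a2's give an a4). Questions that start from a longer polymer, such as a7 from a4, also need cleavage reactions and are outside the scope of this sketch.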
There is this tool we generated to produce arbitrary string chemistries with multiple monomers and different lengths, and you can play with this kind of scenario. But I want to show you how this connects back to real metabolism. In order to do so, I will start by showing how one can find these optimal pathways for going from any initial molecule to any final molecule, again in this simplified chemistry. Just to give you an example, if you want to go from a molecule of this string chemistry of length one to one of length six, you can make a two, from a two and a one produce a three, and so on, and this is easy. But there are some interesting patterns that emerge. For example, for some of these processes, such as going from a7 to a8, the optimal solution includes cycles like this, where you need to have as an input a molecule that is generated by the process itself. So this is a little bit like an autocatalytic cycle, where you need some of the internal molecules in order to bootstrap the activity of the cycle. This is interestingly similar to a cycle that is present in real biochemical networks. This is a representation of the carbon backbone of the TCA cycle, which we'll see shortly. It's a fundamental biochemical pathway present in almost all living systems, and it has this interesting property that is very similar: you have an input of certain molecules, these are molecules with two carbons, but it needs the cyclic behavior in order to sustain itself. So this was interesting, but let's see, if we go back to real metabolism, why this is so important and how we translate this analysis from our artificial chemistry to real chemistry. Of course I don't have time, and I'm covering here in a superficial way material that could take full courses, but I want to just give you an idea, for those that haven't looked at the chemistry in recent times, of what kind of molecules we think about and take into account when looking at metabolism for microbes.

This is methane. It turns out there are microbes that can survive on methane as the only carbon source and the only energy source, which is quite striking given the simplicity of this molecule. This of course contains just carbon and hydrogen. But the chemistry of life requires a lot more types of atoms. This is glucose, which includes oxygen and is a main source of carbon and energy for our own metabolism and for many microbial cells. And you can add to it nitrogen, which of course is an essential atom of all amino acids, such as tryptophan. I'm pointing this out because sometimes the chemistry is so complex that one gets easily lost, myself included, but it's always useful to remember that just by looking at what atoms are present in different molecules you can figure out what the demands and needs of the cell are in order to produce a certain category of molecules. So nitrogen is essential for the production of amino acids and proteins. And there are molecules such as ATP that contain phosphorus. These are the three phosphate groups that can be hydrolyzed to release energy. In fact ATP, as you probably know, is a fundamental molecule that stores energy. This is the energy currency of the cell. It's very important because that's how the cell transfers energy between processes and drives reactions that would otherwise be thermodynamically infeasible. The other atom that I want to point out is sulfur. It hasn't appeared yet, and it appears in this molecule, which is called coenzyme A. You'll see, by the way, that this molecule is similar to ATP, and there's this chain that now contains a sulfur atom. And here we have pretty much completed the main elements that are essential for living systems. There's more, there are metals and so on.
But by and large, these are the atoms one worries about when thinking about basic metabolism. Coenzyme A is a cofactor that is used for transferring groups between reactions, and it is again a very widespread molecule across living systems. Now I'll jump to a different scale to show you a different type of molecule. This is a protein, or rather a very large multi-protein enzyme complex, called ATP synthase. This is a stunning machine. I'm always amazed when I see this. This is a molecule that is made of many, many atoms, as you can see, and it's one of the enzymes that catalyze, that is enable, the reactions that transform the small molecules we saw before. In this case, this molecule is what enables the production of ATP across the membrane through a process called respiration, which we'll see in a very brief overview very soon. This one is from mammalian cells, but what is amazing is that every single cell that contains these molecules has to produce these proteins in sufficient amounts to carry out the reactions, and the reactions themselves are needed to fuel the production of these molecules, in a very complex set of feedback loops. Notice, by the way, that proteins don't contain phosphorus. So again, it's interesting to look at the elemental composition of different classes of molecules. Proteins make up a large part of the mass of our own cells, and they don't contain phosphorus. So one can start asking questions, which we'll see later on, about when and how different elements took part at different stages in the history of life to make it possible for living systems to jumpstart these metabolic processes. I want to now jump from this to starting to tell you how and why these metabolic processes can really affect ecology. We want to get to ecology fairly quickly, although today we'll mostly talk about metabolism at the single-organism level, but I want to start giving some insight into why and how this is so important ecologically.
So these are two basic metabolic pathways, and I'm drawing them again in this very simplified way, that is, as the carbon backbone. These are the numbers of carbon atoms in each molecule. This first molecule, which is actually glucose, is broken down ultimately into pyruvate, which is a three-carbon molecule. And in this case there is another cleavage, leaving a two-carbon molecule, which we'll see later is a fermentation byproduct. In this process the cell can produce two ATP. So the energy currency production is two ATPs per glucose that is being broken down. This is a fermentation process, which gives rise, for example, to ethanol in yeast and so on. Now, there is a different pathway, and many cells can carry out both processes. In fact, if you continue feeding some of these two-carbon molecules, in a number of different ways, through the citric acid cycle, which is the kind of semi-autocatalytic cycle we saw before, then this process, in a much more complicated way which I'm not going to go into now, can lead to the production of 32 ATPs per glucose that is consumed. So this is an addition that makes a big difference in terms of the production of ATP. And again, many cells have the option of just stopping metabolism here, doing fermentation, or to keep going and respire the molecules. Now, of course, things are much more complex in real life, but this is just to give you an idea, for those that are not familiar with metabolism, of the kind of questions one can ask. So what are the implications for ecology, and how can one think about this? One thing that may be already obvious from these very different yields is that there may be a rate-yield trade-off that could be very important for competition and cooperation across different bacteria.
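The arithmetic behind this trade-off can be sketched in a few lines of Python. The two yields (2 and 32 ATP per glucose) are the numbers quoted above. The glucose uptake rates are purely hypothetical values, chosen only to illustrate that a low-yield pathway can still produce ATP faster if it processes glucose fast enough.

```python
# ATP yields per glucose, as quoted in the lecture.
ATP_PER_GLUCOSE_FERMENTATION = 2
ATP_PER_GLUCOSE_RESPIRATION = 32


def atp_production_rate(atp_per_glucose, glucose_uptake_rate):
    """ATP produced per unit time = yield (ATP/glucose) x rate (glucose/time)."""
    return atp_per_glucose * glucose_uptake_rate


# Fermentation matches respiration in ATP per unit time only if it
# consumes glucose this many times faster.
break_even_ratio = ATP_PER_GLUCOSE_RESPIRATION / ATP_PER_GLUCOSE_FERMENTATION

# Hypothetical uptake rates in arbitrary units (assumptions, not measurements).
respirer_uptake = 1.0
fermenter_uptake = 20.0

respirer_atp_rate = atp_production_rate(ATP_PER_GLUCOSE_RESPIRATION, respirer_uptake)
fermenter_atp_rate = atp_production_rate(ATP_PER_GLUCOSE_FERMENTATION, fermenter_uptake)

print("break-even uptake ratio:", break_even_ratio)
print("respirer ATP rate:", respirer_atp_rate)
print("fermenter ATP rate:", fermenter_atp_rate)
```

With these made-up rates the fermenter makes ATP faster while using glucose far less efficiently, which is the "fast but inefficient" versus "slow but efficient" contrast.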
Of course, this respiratory kind of metabolism is much more efficient: you produce so much more energy currency per glucose consumed. But there is evidence that it is somehow more cumbersome, potentially slower in terms of rate, and it certainly requires a lot more proteins in order to carry out these processes. In fact, the ATP synthase I showed you above is exactly what allows the production of these 32 ATPs. What is also interesting is that this requires oxygen, as you might imagine for respiratory pathways. Oxygen is the final electron acceptor for this process. What is important is that without oxygen or other electron acceptors this respiratory metabolism cannot occur, whereas cells can run fermentation in the absence of oxygen. And it turns out, and this is quite interesting and still poorly understood, that some cells decide to use the fermentative pathway even in the presence of oxygen. In fact, this is one of the hallmarks of cancer, for those of you that are interested in mammalian metabolism, and there are also questions about the ecology of how different human cells interact with each other and with tumor cells. There is this phenomenon called the Warburg effect, where human cells will decide to ferment even in the presence of oxygen. There is a beautiful paper, if anybody's interested in reading more, on this rate-yield trade-off and its possible consequences for the competition between fast but inefficient and slow but efficient organisms. This is the paper by Pfeiffer, Schuster and Bonhoeffer. There is another important possible ecological impact of the dichotomy between these different metabolic pathways. As we said, this six-carbon molecule could be glucose. The C1 molecule that is secreted is CO2, and typically the fermentation byproducts are organic acids.
These are small molecules with two or three carbons, such as acetate, ethanol and lactate. Lactate is the fermentation byproduct produced by human cells, for example, and by cancer cells. Acetate is produced by bacteria such as E. coli, and ethanol is produced by yeast, as we all know. What is interesting is that if cells choose to use this fermentative metabolism, they will secrete these byproducts, and these byproducts are perfectly usable carbon sources for other organisms. So you can imagine that the decision of individual species to carry out one metabolism versus another can have important consequences for their capacity to interact with other organisms, enabling cross-feeding and the exchange of molecules across different organisms. Keep that in mind, and we'll get back to this later on. So what I told you about in the previous slide is how metabolism is important for generating the energy currency, ATP, and also the redox equivalents, which we haven't talked about. But there is another key function that metabolism carries out, which is no less important and is very complicated, and that is the production of all the different monomers, the different molecules that are used for producing proteins, DNA, RNA and all the components of the cell. What you see here is a histogram of the proportions of the different biomass components in a cell. These are the different amino acids, nucleotides and so on. If you were to take a snapshot of the dry mass of a cell and measure how much there is of each of these compounds, this is what you would get. This is in millimoles per gram of dry mass. It turns out that this builds on the same pathway I just showed you. You can recognize here the fermentation or glycolysis pathway going from glucose to pyruvate and then feeding into the TCA cycle.
So this same pathway, in addition to producing ATP, can be tapped along the way to produce other compounds. A lot of the amino acids and the precursors for nucleotides and lipids all come off these different pathways. So the same pathway that serves the energy needs also has to carry the production of all these other molecules. And all of this has to be balanced in a very delicate way, because of course you need to have the right amount of each of these molecules. It doesn't help if you are very good at producing glycine but cannot produce alanine. These are needed in the right proportions, and all of this has to be accomplished while at the same time the cell also produces the right amount of ATP, because ATP is used for degrading molecules and building molecules. So it's all a very complex balance of different reactions that have to fit together in order to efficiently produce all the biomass components. We start seeing the flavor of a resource allocation problem for the cell, and a problem that in fact is amenable to mathematical analysis. We're going to start talking now about genome-scale models, but actually let me pause here, just to give an opportunity if there are any questions so far. Okay, I'll keep going. So we're going to start introducing genome-scale constraint-based models of metabolism. Again, some of you may already know about these. They are known in different ways: for example, they're known as genome-scale models, sometimes as constraint-based models or stoichiometric models, and they're all of the above. Genome-scale because one tries to model the whole metabolic network of a cell. Constraint-based because, as we'll see, we'll rely strongly on constraints, mostly related to mass conservation. Stoichiometric because they are based on the stoichiometry of the different reactions in the cell.
One of the first problems in trying to make a genome-scale model of metabolism is to construct the network, starting from this universal metabolism I showed before, which contains again all the reactions known across all living systems. Actually the numbers are larger now. This is an outdated number. There are probably more than 20,000 molecules that are part of these databases now. But you need to filter this through the genome of an individual organism and ask, for example, which among all these different reactions is E. coli capable of doing. This is written in the genome. You have to read it and see what reactions are present, encoded by enzymes in the genome of this organism, and translate this into a smaller network, which is the metabolic network that is specific for this organism. A typical organism has on the order of 1,500 reactions and about the same number of metabolites. This is what is called metabolic network reconstruction. This is in itself now a whole process, partly because we don't know the function of all the genes. When you're trying to read the DNA of an organism, there are a lot of genes with unknown, partially known or uncertain functions. This is a very complex set of steps, and it's really interesting. It involves literature curation, sometimes manual curation, and there is a lot of effort now to try and automate this process. But we're not going to go into the details of this. We'll just assume that we have a genome-scale network, the network of all metabolic reactions occurring in a specific organism, and then we'll ask the next question of how to model this. If you're interested in where to find these networks, these are some pointers. Bernhard Palsson at UCSD has a list of curated models, and he's one of the first to bring this field into systems biology. ModelSEED is a database of models automatically reconstructed from genomes. These are all available in KBase, an open platform from the Department of Energy that has a number of tools for reconstructing models from genomes.

I want to just remind you, if you want to go into this, that there is a wide variability in the completeness and accuracy of these models. Some may have undergone thorough experimental testing for many years and may be very well tested and accurate. Others are just straight from the annotated genome, and they may not be as accurate and complete, but it's still always interesting to have something to start from and to build upon. So what is it that we're really talking about? How do you go from this network, how do you represent the network, and how do you translate this into a prediction of what an organism can and cannot do? I will start by illustrating what used to be, or still somehow is, the classical approach to addressing the question of what an organism can do, which is kinetic modeling. Again, a typical prokaryotic cell may have about 1,500 reactions and metabolites. This involves about 10,000 kinetic parameters. You could write differential equations describing the change of each metabolite in this network as a function of the reactions that produce and consume that metabolite. I assume many of you are familiar with this: this is the Michaelis-Menten equation, telling us how the rate of the reaction depends on the amount of substrate, the incoming molecule, and on this Vmax, which is the maximal attainable reaction rate and depends on the amount of enzyme present for that reaction. What is important here is that this is a nonlinear function of the substrate. If you know the concentration of the substrate you can calculate the flux, but there isn't a simple linear relationship between the two. At some point, if you keep adding substrate, the reaction rate will not keep growing, because you are limited by the amount of enzyme.
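As a small illustration of the saturation just described, here is a sketch of the standard Michaelis-Menten rate law, v = Vmax * S / (Km + S), in Python. The constant Km (the substrate concentration at which the rate is half of Vmax) is part of the textbook form of the equation but was not named in the lecture, and the numerical values below are arbitrary assumptions.

```python
def michaelis_menten_rate(substrate, v_max, k_m):
    """Standard Michaelis-Menten rate law: v = Vmax * S / (Km + S).

    substrate: substrate concentration S
    v_max:     maximal attainable rate (set by the amount of enzyme)
    k_m:       substrate concentration at which the rate is Vmax / 2
    """
    return v_max * substrate / (k_m + substrate)


# Arbitrary illustrative parameters (assumptions, not measured values).
V_MAX = 10.0
K_M = 0.5

for s in (0.05, 0.5, 5.0, 50.0, 500.0):
    print(f"S = {s:7.2f}  ->  v = {michaelis_menten_rate(s, V_MAX, K_M):.3f}")
```

At low substrate the rate grows almost proportionally with S, while at high substrate it flattens out near Vmax, so doubling the substrate no longer doubles the flux.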
And the important thing is that you could write an equation like this for every single flux, every single rate in this network. The real reaction rates would look more like this one than like this one, because they may have two incoming molecules and two outgoing molecules, multi-substrate and multi-product. So you really have a lot of different parameters and complex nonlinearities that make this kinetic modeling approach very complicated. We're not going to go into this, and we're going to abandon the kinetic modeling approach for a much simplified version of metabolic modeling, which is going to come next. But let me tell you first briefly, with this example, how you represent the network. This is an illustration of a simple network where you have metabolite A, for example in a cell, that is being imported through this reaction v1, so it is produced by reaction v1, and then it is also consumed by v2 and produced by v3, back from this metabolite C. So you could write for metabolite A this differential equation, where there is a term for each reaction that consumes or produces that metabolite. This is a linear relationship between these different fluxes, but remember that each of these fluxes has this nonlinear dependence on the substrate, based on the Michaelis-Menten equation we showed above. So the symbols look simple, but it's actually a fairly complex differential equation. You can write such a differential equation for each metabolite, and you can easily represent this in the form of a matrix. You'll have a vector of all the fluxes, a vector of all the metabolites and their derivatives in time, and this matrix.

The matrix that converts the set of fluxes into the changes in metabolites is what is called the stoichiometric matrix of the network, and it's really a compact and very valuable representation of the structure of metabolism. This stoichiometric matrix is essentially what you need to have as an outcome of the metabolic reconstruction in order to start modeling metabolism using flux balance analysis. As I said, this started as a resource allocation problem. It started in the field of chemical engineering, with Terry Papoutsakis and others, and was brought to the forefront of systems biology by Bernhard Palsson and colleagues. Now it's a very widespread approach, and it's often called flux balance analysis. We'll see in a second why. This is again a representation of metabolism for E. coli. There are nutrients, and remember that the nutrients have to contain at least one source of carbon, nitrogen, phosphorus, sulfur and so on, all the elements that are needed to produce the different molecules. Then these molecules flow into the proteins, DNA, RNA and lipids through the construction of the precursors, the monomers that are essential for building those molecules. And the cell, by putting these molecules together in the right proportions, as we showed before, produces what we call biomass: new cells that have again the right proportion of biomass components. There could be production of byproducts, and what we need to do now is try to find a way to simplify this process. So if I zoom in on one of these reactions, this is glucose 6-phosphate, again the beginning of that fermentation or glycolysis pathway I showed before, with glucose coming in. There is a very simple mass conservation, which we impose by assuming that the system is at steady state, and this is the flux balance approximation, which is very helpful.
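The stoichiometric matrix can be sketched for a toy network like the one on the slide. Only three facts come from the lecture: metabolite A is produced by the import reaction v1, consumed by v2, and produced by v3 from metabolite C. Everything else in the Python sketch below (metabolite B, the product of v2, and the reactions v4 and v5) is an assumption added to make the network closed and self-consistent.

```python
# Toy network (only the role of A in v1, v2, v3 is taken from the lecture):
#   v1: (outside) -> A      import of A
#   v2: A -> B              assumed product B
#   v3: C -> A
#   v4: B -> C              assumed reaction
#   v5: B -> (outside)      assumed export of B
METABOLITES = ["A", "B", "C"]
REACTIONS = ["v1", "v2", "v3", "v4", "v5"]

# Rows are metabolites, columns are reactions.
# Entry S[i][j] is +1 if reaction j produces metabolite i, -1 if it consumes it.
S = [
    [1, -1,  1,  0,  0],   # A
    [0,  1,  0, -1, -1],   # B
    [0,  0, -1,  1,  0],   # C
]


def metabolite_derivatives(stoichiometry, fluxes):
    """Compute dx/dt = S * v as a plain matrix-vector product."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, fluxes))
            for row in stoichiometry]


# With these fluxes, A accumulates: dA/dt = v1 - v2 + v3 = 3 - 2 + 1 = 2.
print(dict(zip(METABOLITES, metabolite_derivatives(S, [3, 2, 1, 1, 1]))))

# With these fluxes, production and consumption cancel for every metabolite.
print(dict(zip(METABOLITES, metabolite_derivatives(S, [1, 2, 1, 1, 1]))))
```

The first row of the matrix encodes exactly the differential equation for A described above, dA/dt = v1 - v2 + v3.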
Of course, one could argue about whether or not this is reasonable, and we can talk about this more later if anybody's interested. But for now, imagine that you're keeping a population of cells in a bioreactor in steady conditions. It seems reasonable to assume that on average, across all the different cells, the net amount of each metabolite in the population stays constant. There is no net accumulation or depletion of compounds. This translates into the fact that the sum of all the fluxes producing and consuming this molecule is now balanced to give zero, and this is the flux balance part. Now remember again that each of these fluxes would depend on the concentrations, but this is where we abandon the metabolite concentrations and focus just on the fluxes. And if we just focus on the fluxes, if all we're interested in is knowing the rates of these reactions, then we're really just dealing with this linear equation. So the world of flux balance analysis is a world of fluxes, where we forget about the internal metabolite concentrations and just try to understand the fluxes of these reactions. Of course this makes the problem much simpler. There is a constraint like this for each metabolite in the cell, but there are also more constraints that you can add. In particular, there are constraints on the capacity of what is coming into the cell, and these are very important constraints, because they define the specific conditions under which you're running a certain experiment. For example, if a population of cells is growing in a bioreactor with glucose available, and you know how much glucose you're putting in the bioreactor, the cells could not possibly take up more glucose than you're providing. So this will give rise to an inequality on the flux of glucose intake.
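Written out as equations, the two kinds of constraints introduced so far look as follows. The notation is an assumption chosen for this sketch, since the slides are not reproduced in the transcript: $S_{ij}$ is the stoichiometric coefficient of metabolite $i$ in reaction $j$, $v_j$ is the flux of reaction $j$, and $v_{\mathrm{glc}}^{\max}$ is the maximal glucose uptake allowed by the experimental conditions.

```latex
% Steady state (flux balance): one linear equation per metabolite i
\frac{dx_i}{dt} \;=\; \sum_{j} S_{ij}\, v_j \;=\; 0
\qquad \text{for every metabolite } i

% Capacity constraint on nutrient uptake: one linear inequality
0 \;\le\; v_{\mathrm{glc}} \;\le\; v_{\mathrm{glc}}^{\max}
```

Both are linear in the fluxes, and the metabolite concentrations $x_i$ no longer appear once the steady state is imposed.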
So again, this is a linear relationship, and you'll see in a second how the linearity of all the constraints on the fluxes makes it possible for us to have a mathematically tractable problem. There are other constraints. For example, all metabolic reactions are in principle reversible, but some reactions may be so unbalanced thermodynamically that they are effectively irreversible at physiological concentrations. If this is known, one can impose additional constraints on some reactions going only in one direction. So for example, the flux for this irreversible reaction would be set to be positive, and this is again another linear constraint on the fluxes. All of these constraints together, the linear constraints for the conservation of mass at each node, that is, at each metabolite, the capacity constraints for each molecule coming into the network, and the possible irreversibility constraints, define a region in the multidimensional space of fluxes, which is called the feasible space. And as you can already see, this is a convex polyhedron. Why is this a convex polyhedron, if this is not obvious? Well, because you have just hyperplanes, right? Constraints like this are hyperplanes of dimension n minus one in the n-dimensional space of fluxes. Of course, it's difficult to represent this, so I am representing here a projection of the space onto two dimensions, two arbitrary fluxes. You have one hyperplane for each constraint, for each metabolite that is conserved. These hyperplanes will intersect each other and form a subspace, whose dimensionality depends on whether or not these constraints are linearly independent. And when you add the capacity constraints, you take half-spaces, and you end up with these polyhedral structures, convex polyhedra, that represent the feasible space for the network.
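The feasible space and its convexity can be checked numerically on a small example. The Python sketch below uses a hypothetical five-reaction toy network (import of A, A to B, C to A, B to C, export of B), with irreversibility expressed as lower bounds of zero and an assumed uptake limit of 10 on the import flux. The helper names `is_feasible` and `convex_combination` are made up for this illustration.

```python
# Hypothetical toy network: rows are metabolites A, B, C; columns are v1..v5.
S = [
    [1, -1,  1,  0,  0],   # A: made by v1 and v3, used by v2
    [0,  1,  0, -1, -1],   # B: made by v2, used by v4 and v5
    [0,  0, -1,  1,  0],   # C: made by v4, used by v3
]
# All reactions irreversible (lower bound 0); uptake v1 capped at 10 (assumed).
LOWER = [0, 0, 0, 0, 0]
UPPER = [10, 1000, 1000, 1000, 1000]


def is_feasible(stoichiometry, fluxes, lower, upper, tol=1e-9):
    """True if S * v = 0 (steady state) and lower <= v <= upper (capacity)."""
    balanced = all(
        abs(sum(s * v for s, v in zip(row, fluxes))) <= tol
        for row in stoichiometry
    )
    within_bounds = all(
        lo - tol <= v <= hi + tol for v, lo, hi in zip(fluxes, lower, upper)
    )
    return balanced and within_bounds


def convex_combination(point_a, point_b, weight):
    """Return weight * a + (1 - weight) * b, for 0 <= weight <= 1."""
    return [weight * a + (1 - weight) * b for a, b in zip(point_a, point_b)]


p = [10, 10, 0, 0, 10]   # all imported A is converted to B and exported
q = [4, 9, 5, 5, 4]      # part of the flux goes around the A -> B -> C -> A loop
midpoint = convex_combination(p, q, 0.5)

print(is_feasible(S, p, LOWER, UPPER))                     # feasible
print(is_feasible(S, q, LOWER, UPPER))                     # feasible
print(is_feasible(S, midpoint, LOWER, UPPER))              # also feasible
print(is_feasible(S, [12, 12, 0, 0, 12], LOWER, UPPER))    # breaks the uptake limit
```

Any weighted average of two feasible flux vectors again satisfies all the linear equalities and inequalities, which is what it means for the feasible space to be convex.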
And this is, again, a simple, idealized projection into two dimensions of this space. So this is in itself interesting, and there is a lot of work now on just sampling this space. Even if you know nothing more, you have these constraints for intracellular metabolism, and you know the capacity constraints related to the specific conditions under which you're running an experiment. So we have this characterization of what the cell can and cannot do, and this is already quite intriguing and quite interesting. Think about this in contrast to kinetic models, where you have differential equations, you solve the differential equations, and you have a specific solution. This has a very different flavor. Here we don't have a specific solution, but we have this algebraic representation of the space where the cells can be found. So it's an interesting geometrical object that enables a lot of subsequent analysis. And one thing that has become kind of the standard approach in stoichiometric modeling is the idea of using optimization. You might wonder why optimization might be helpful. But if you think of a cell as a system that has undergone long evolutionary selection for being efficient at producing its own biomass, at growing efficiently and growing fast, you can imagine that an objective function such as the maximization of the growth rate might be a reasonable hypothesis for what a cell might be trying to do. And the advantage of having an objective function is of course that now, within this space of possible fluxes, you can look for a flux that is optimal for a given objective function. For example, if you're trying to find, within this feasible space, the point that maximizes v_j, this would be the point up here. More generally, the objective function is represented by a hyperplane that you can imagine sliding along the space. 
And when this encounters an extreme point of the space, this will be the optimum for the function. In this way you can find the maximum for the growth rate: among all the feasible points for the cell that balance all this consumption and production of molecules, you can find the point that allows the cell to grow in an optimally efficient way. Okay, what is important about this specific point is that this is now a prediction that one can test experimentally. You'll have as an outcome a vector of all the fluxes of the cell, for every single reaction, as well as a prediction of the growth rate. So you'll have a value for each of the fluxes, as well as the value of the growth rate, which will tell you how fast, given all these constraints, you expect a cell to be able to grow. And by the way, something that we'll see more of next time: you can also predict whether a cell will produce byproducts. If you were to somehow embed this organism into an ecosystem with other organisms, that byproduct could now be the source of a cross-feeding interaction between multiple species, which is why one at some point realizes that this kind of model can be really helpful for modeling the ecology of microbes. I want to point to some practical resources. If you have never seen this, of course, this is not really enough information to get started right away, but I want to point to a couple of things that might be helpful. This is a paper from Jason Papin's lab that I think is a really nice overview and a practical way of getting started with these flux balance models. I think it has some Python scripts that you can start using right away, first for simple models of simple networks and then going into increasingly complex networks. There is a very nice Python toolbox called COBRApy. COBRA stands for COnstraint-Based Reconstruction and Analysis. 
So this is one of the names by which you'll find these flux balance models. COBRApy is a freely available resource for doing all sorts of things with flux balance modeling, loading models, running optimizations and so on, and it's quite convenient. So this would be a good starting point. I also put here, in a GitHub repository linked from our lab webpage, some basic scripts that I've been using for doing simple FBA, and some models, including the human cell model that is now available. And it's a big world, so there are a lot of possibilities out there. There is a MATLAB toolbox. There are different resources. Feel free to ask me later on if you need pointers to specific resources, but these should be a good starting point. Okay, in the next few minutes, I want to point out some of the applications of flux balance modeling before going back to the ecological side. And you'll see actually that there is an interesting connection between looking at what happens inside individual cells and scaling this up to the ecosystem level. One of the typical applications of flux balance modeling in the past has been to try and understand what happens if you delete a gene from a network. So imagine having E. coli: you can predict the growth rate using flux balance modeling and ask what happens if you remove one gene, one reaction, say, from the organism. Will the organism be able to survive? And one thing I realize now I forgot to mention is one of the reasons this method is so valuable and efficient. Some of you may be already aware of this. Let me go back here for a second. Solving this problem is in itself a very efficient process. This can be done through a number of linear programming packages and algorithms, starting from the simplex algorithm to more advanced methods that use heuristics, but essentially in a fraction of a second, I think a hundredth of a second or so, you can have a solution to a single flux balance model. 
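To make the linear programming step concrete, here is a minimal sketch of a flux balance calculation on a made-up four-reaction network, solved with SciPy's general-purpose `linprog` rather than a dedicated toolbox such as COBRApy. The network, the reaction names, and the bounds are all hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not a real organism).
# Columns are reactions, rows are internal metabolites A and B.
reactions = ["uptake", "A_to_B", "growth", "secretion"]
S = np.array([
    [1, -1,  0, -1],   # A: made by uptake, used by A_to_B and secretion
    [0,  1, -1,  0],   # B: made by A_to_B, used by growth
])

# Capacity and irreversibility constraints as (lower, upper) bounds.
# Uptake is capped at 10 (arbitrary units); all reactions are irreversible.
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]

# linprog minimizes, so maximizing growth means minimizing its negative.
c = np.zeros(len(reactions))
c[reactions.index("growth")] = -1.0

# Steady state: S v = 0 for every internal metabolite.
res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")

fluxes = dict(zip(reactions, res.x))
growth = -res.fun
print("predicted growth flux:", growth)
print("predicted fluxes:", fluxes)
```

In this toy case the solver routes all of the uptake into growth and nothing into secretion. A genome-scale model has the same structure, only with thousands of columns in `S`.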
So yes, there are caveats, and there are things to be careful about in the simplifying assumptions we made. But on the other hand, in a fraction of a second you get a prediction of all the fluxes in the cell. And because it's so fast, one can use it to address questions such as doing all possible perturbations of the environment or of the internal circuits of the cell, to see how the cell responds. And we're going to go back here to this slide, where the process of finding how a cell responds to a perturbation can be viewed as a problem of reducing the space and finding again a point that is physiologically relevant in the reduced space. Let me just give an intuition for why that is the case. This green region represents again the feasible space for the wild-type, unperturbed organism, where you can find the optimum of its objective function. Let's leave aside the fact that there may be a complex mapping between genes and reactions, and let's assume for now you just remove a reaction from the network. A reaction is made impossible because of a mutation in a given gene. So that flux will suddenly only have the option of zero flux. There is no flux through that reaction anymore. So we have an additional constraint in this multidimensional space, which will reduce the space to a subspace, represented here in yellow. And now you can find within the subspace what is, for example, again the maximal capacity for the cell to grow. In this way you can find, first of all, whether or not the cell can still grow after that perturbation, and how fast, and you can predict and compare all the different knockouts in the genome of an organism relative to each other and relative to the wild type. 
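The knockout procedure just described, forcing one flux to zero and re-optimizing in the reduced space, can be sketched as follows. This is again a hypothetical toy network with made-up reaction names, solved with SciPy's `linprog`; it has a direct route from A to B and a less efficient two-step alternative, so that some knockouts are lethal, some reduce growth, and some have no effect.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: a direct route A -> B and a lower-yield
# alternative A -> C, then 2 C -> B. Rows are metabolites A, B, C.
reactions = ["uptake", "direct", "alt_step1", "alt_step2", "growth"]
S = np.array([
    [1, -1, -1,  0,  0],   # A
    [0,  1,  0,  1, -1],   # B
    [0,  0,  1, -2,  0],   # C
])
wild_type_bounds = [(0, 10)] + [(0, 1000)] * 4


def max_growth(bounds):
    """Maximize the growth flux subject to S v = 0 and the given bounds."""
    c = np.zeros(len(reactions))
    c[reactions.index("growth")] = -1.0  # linprog minimizes
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    return -res.fun if res.success else 0.0


wild_type = max_growth(wild_type_bounds)

# Knock out each reaction in turn by fixing its flux to zero.
knockout_growth = {}
for i, name in enumerate(reactions[:-1]):
    ko_bounds = list(wild_type_bounds)
    ko_bounds[i] = (0, 0)
    knockout_growth[name] = max_growth(ko_bounds)

print("wild type:", wild_type)
for name, g in knockout_growth.items():
    print(f"knockout of {name}: growth {g:.2f} ({g / wild_type:.0%} of wild type)")
```

Each knockout costs one extra linear program, which is why scanning every reaction, or every gene once a gene-to-reaction mapping is included, is practical even for large models.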
And again, this enables a lot of different downstream applications, from metabolic engineering to evolutionary algorithms, and lets you start thinking about the evolution of metabolic pathways and so on. There is one thing that some of you may be asking yourselves, and which is a question we were interested in many years ago, which is whether metabolism should really be optimal also for knockouts. First of all, the question of whether or not we know what the objective function is, is in itself an interesting open question, but it's particularly tricky when you think of a knockout organism. If you remove a gene from an organism that maybe did undergo long-term evolutionary optimization for being efficient at growing, there is no reason for the perturbed organism to be efficient in its own subspace, right? So it's entirely possible that this perturbed organism will not be able to perform in its optimal way, given that you just performed this perturbation. What is interesting is that you can look at alternative points in this space that may better represent what you expect a perturbed organism to do. And one possibility is to look at the projection of the wild-type optimum, the optimal point on the original space, onto the space of the knockout. Why might this be a better prediction for the knockout? If you think about it, the implication of assuming that the wild-type organism is tending towards this optimum is that its internal regulatory pathways allow it to upregulate and downregulate the different genes to achieve this optimal production of the biomass components in a balanced way. But once you remove a gene, the organism still has that same regulatory circuit, so it's still trying to go towards this optimum. 
And you can ask what is the point that is as close as possible to this wild-type optimum but still constrained to the space of the knockout. This would be the point in the yellow region, the knockout space, that is at minimal distance from the wild type. And you can solve this by minimizing this distance. There is no obvious reason whether one should choose the Euclidean distance, or L1, or other distances; all of these have been tried. You can use quadratic programming for minimizing the Euclidean distance, and you can find this prediction for the knockout, which turns out to be in many cases a little bit more accurate than the prediction given by the knockout optimum. And of course one could imagine that, through evolutionary processes, mutated organisms could go from this suboptimal initial point to the optimal point in evolutionary steps. Let's see, I think I have a few more minutes. So this method is called minimization of metabolic adjustment (MOMA), and it is one of many methods now that people use to probe metabolic networks under different scenarios. One thing that I haven't told you about, and I want to give you a glimpse of, is whether and how one can test these models, and also the caveats that one has to keep in mind in making these models. This is an example from a comparison I did many years ago based on data from Uwe Sauer's lab at ETH. So this is E. coli grown in a chemostat, in a bioreactor kept at constant flow. And by the way, you can recognize here again the pathways from before, glycolysis and the TCA cycle. What was done here was to compare experimentally measured fluxes with fluxes predicted with FBA. And first of all, I want to mention that measuring fluxes experimentally is a very tricky and laborious process. Typically this is done with carbon-13 labeled metabolites that go through the network, and the carbon-13 atoms are dispersed through the network in different ways. 
And one can then figure out the actual fluxes by mapping where the carbon-13 went. But this is very complicated. And by the way, we know how to do this, or rather, not we, but experts in flux measurement can do this for individual organisms. It's still an open challenge to try and measure fluxes for communities, and a very important one, as we'll see later on. But what I want to highlight here is that there is good agreement overall if you compare experimentally measured fluxes with predicted fluxes for this E. coli grown under carbon-limited conditions. So this is one mode of running this chemostat, carbon limited, and there is very high agreement, which for me, when I first saw this, was quite striking. By now there has been a lot of testing of different models; some work better, some not as well. But this allows me to illustrate the fact that even the same organism under different conditions can have very different degrees of agreement with the flux balance prediction. For example, if you take the same organism and just compare experimental fluxes with predicted fluxes under nitrogen-limited conditions, you can see there's still some correlation, but there's clearly something that we don't understand here, or there is something that the model doesn't predict correctly. And one could speculate on why, and on what could be going wrong here. Why is the model working under one condition and not the other? One possibility, for example, is that the assumption of maximal growth rate is reasonable for carbon-limited E. coli but not for nitrogen-limited E. coli. Perhaps evolutionary adaptation and the regulatory circuits in E. coli are really compatible with this hypothesis of maximal growth rate when carbon is the limiting resource, but not when nitrogen is limiting, and there may be other strategies that the cell may choose to pursue. 
So this is one example, just to illustrate again that I view flux balance modeling as a hypothesis-testing tool, as a way of asking interesting biological questions. Sometimes you can use it for valuable predictive modeling, for metabolic engineering applications, but one always has to keep in mind that some of these assumptions may not be true under all conditions. I think I have a few more minutes. I will just point to other papers that I think are representative of how people have tested these models. There is very nice work showing how adaptive evolution can lead from organisms that are initially suboptimal under a certain condition to gradual optimization. There's other work showing that this is, again, not always necessarily the case. This is from Bernhard Palsson's lab, and this is from Chris Marx and Will Harcombe. This is actually based on data from the Lenski evolved E. coli lines. Very, very interesting work. So there is now a lot of work using these models, and I bet you'll see more. There's a lot of exciting work from Alvaro Sanchez's group that also has more of an evolutionary flavor. I will conclude just by mentioning some of the good and bad aspects of flux balance. I want to highlight again that there are some really valuable implications of using this modeling approach for a number of applications, including, as we'll see, looking at ecological interactions. It's very fast. It's very scalable: you can easily look at larger organisms, or multiple organisms together. And very importantly, you do not need the kinetic parameters, right? Once we made this transition from the world of metabolite concentrations to the world of metabolic fluxes, we really forgot about the metabolite concentrations, and therefore, for the steady states we compute, we don't really need to know the kinetic parameters. 
It gets more complicated, as we'll see later, when you want to go to communities, but we'll leave this for next time. But one thing I want to highlight, which again is on the positive side here, is that concentrations are obviously important, but there is something really unique about fluxes. I think the cells care about the fluxes, and we care about the fluxes: if you know how much is produced of a given compound and how much is consumed, this is going to be super important for ecological interactions. So it's somehow fortunate that these models, through the simplifications, are good at predicting fluxes, because they will be very helpful for embedding the single-organism model into ecological models. But there are some limitations, and sometimes you really would like to know metabolite concentrations inside the cell, because that's what is more easily measurable now with metabolomics approaches. And because of the lack of intracellular metabolite concentrations, you cannot really explicitly model regulation. Regulation, in particular allosteric regulation, where small molecules bind to the enzyme, for example, is strongly dependent on the internal metabolite concentrations, and this is really beyond what flux balance models can do. There is very interesting work now being explored where, if you incorporate some thermodynamic constraints in these networks, it's possible to put back the concentrations of metabolites. But I think that's where the field needs a lot of creative energy, from people trying to think how to go beyond the kinetic paradigm and build hybrid models that have the best parts of both. There are other limitations. You cannot easily model fast dynamics because of this inherent steady-state approach. And what you predict is really population and time averages, not single-cell fluxes. 
The other thing that is important to remember is that how accurate these models are will depend also on how well you know the boundary conditions, for example the uptake rates of different nutrients. Now for the challenges, some of the open directions, which again will be relevant for ecology as well. I showed you a snapshot of the biomass composition of a cell. This is often treated as a fixed vector of numbers, but in real life this is a condition-dependent composition, right? E. coli and all living cells will change their biomass composition as a function of the environment. There are examples of marine bacteria that, instead of phospholipids, use sulfolipids when they're under phosphorus-limited conditions. So this is really important, and we barely know how the biomass composition of cells changes, so this is very interesting. We know very little about maintenance, how much energy is spent by the cell on non-metabolic processes. And there is interest again in, as I was saying, mapping the global effects of thermodynamics on this network. But this is an ongoing challenge, partially because we don't know the chemical potential of a lot of these molecules. The last and not least important challenge is that, as I said, we know quite well the models for some organisms, even despite the limitations. But when you start thinking about extending these approaches to modeling a whole gut microbiome with thousands of different species, or the ecology of microbes in soil, this becomes a much bigger challenge. And what a lot of people struggle with is how to efficiently build models for all these different species and scale up these modeling approaches to really start looking at interactions between species in complex communities. And this is where we'll start from next time. So I will pause here. And I think we have time for questions. Thank you. Thank you. Thank you very much, Daniel. 
Yeah, I think we have a raised hand from Ashish. Ashish, you can ask the question. Hi, hey Daniel. That was a very interesting talk. I had a question actually about the first part, in terms of this kind of acetate cross-feeding in E. coli, or the respiration versus fermentation kind of thing. What's so special about, I guess, acetate, or I guess, stopping at this point just before the TCA cycle? Why not cross-feed somewhere else, why not in the middle of the TCA cycle or a little further above? Is there something special about this point that E. coli likes, or that cells like? So that's an excellent question. And I think this is still a highly debated and hot area of research. One of the classical examples, which is true both for yeast metabolism and for cancer metabolism, is that we really don't know why certain cells, despite having the possibility of doing the TCA cycle and despite having oxygen around, choose to ferment. For yeast, for example, one of the hypotheses is that in a competition for survival with other organisms, it may actually be beneficial to stop here and secrete ethanol. As you know, ethanol is a good disinfectant, right, so it can kill bacteria around. It's a good strategy for yeast to first secrete the ethanol and kill competing organisms, and in fact, yeast is capable of taking up the ethanol again, doing the respiration of ethanol after consuming all the glucose. This is called the diauxic shift. And so there is a lot of complexity in these processes, where cells may decide to do first the first half of the pathway, and then take up again the compound that was secreted and use it through complete respiration. So acetate and glycerol or something could also have inhibitory effects? I guess E. coli chooses to excrete acetate, or glycerol, or something. 
Yeah, so let me tell you something else that happens. Again, this can vary very much from system to system, and people have come up with different reasons for why organisms sometimes ferment even if they could respire. But let's say in the case of acetate for E. coli, for example, what might happen is that E. coli might actually be oxygen limited. And if you're oxygen limited, right, you cannot run this pathway, or you can run it only partially. So it is possible in principle that when you're limited by oxygen you have no option but to run this pathway to keep going, and then you produce acetate, and if oxygen becomes available, you could in principle respire. But in other cases, such as lactate production in cancer cells, this is really highly debated, and there are many different reasons people believe this might be happening, one of which has to do with efficiency, and again this trade-off between rate and yield. I think this paper I was mentioning by Pfeiffer, Schuster and Bonhoeffer has some really interesting hypotheses about the fact that maybe there are some deep thermodynamic reasons. And there are some old papers proposing that fermentation might indeed be inherently faster than respiration, so there really is a trade-off between rate and yield, and you can imagine a population with both strategies. 
The fast but inefficient will take over, typically, and that is what cancer cells might do, whereas the slow but efficient organisms or cells would be able to survive in the competition with the fast and inefficient if there is spatial structure. So this is one of the hypotheses of this paper, and it's potentially a hypothesis that could explain, for example, the competition between planktonic cells, cancer cells, versus the structure of the body that is based on this efficient, well-organized metabolism, but it also says something about the rise of multicellular organisms. And this all seems to match, in the sense that when oxygen becomes available, you can do this more efficient metabolism, and cells are more thoughtful about using resources efficiently, enabling the rise of complex multicellular systems. So I don't know if this addresses your question completely, but the complete landscape can be very complicated, and some of this has to do with deeper biochemical reasons, which we cannot go into now but which I'm happy to talk about. Thank you. That was good. Okay, we have a question from Martina. Thank you for your talk. Could you come back to the slide where you were talking about the usable byproducts? This one. Yeah, so my question is: here it seems that the cells release into the environment usable byproducts that can be used by other cells only if they ferment, but isn't it a bit of a huge assumption? Yes, so this is definitely not the case. This is one way in which cells could secrete something that is usable by other cells, but by no means the only option. I think this is observed often, but there may be many, many different ways in which cells secrete byproducts that are used by other organisms. And we'll see this extensively next time, but there is a lot of work to measure the molecules that are spilled out of a cell. 
They can be very different and complex. There is also a really interesting and, again, fairly open question of how much of this cross-feeding happens through byproducts that are secreted by live cells, as opposed to cells dying and spilling out everything inside. In that case, everything that a cell has inside becomes a usable byproduct for other cells, with huge ecological implications, definitely in ocean cycling and so on. So, again, I think you point to a very interesting question. In some cases we know, but in most cases we really are just starting to scratch the surface of mapping these usable byproducts. Okay. Thank you. Any more questions? From Monday, yeah, please. You talked about limited oxygen in the two-part pathway of fermentation and respiration, but I'm wondering: if there's a limit in oxygen, would this pathway still exist, or can there be a reversal or a distortion? So the pathway itself, right, if an organism has this pathway, this pathway is there, meaning that the genes for the proteins that make these reactions happen are in the genome; they're staying there. And, of course, the organism in the absence of oxygen could decide not to express those proteins. Although, as you remember from this slide, right, there are other reasons for running the TCA cycle. In fact the cycle itself can run even in the absence of oxygen; what needs oxygen is some of the downstream processes. But in order to produce these byproducts, the cell may need to run some or all of the reactions in this cycle. And it's also true, and I think this is what you're hinting at, that these pathways can also run in reverse, and the diversity of metabolic pathways across microbes is, you know, incredible. 
And some organisms will have only some portions of the cycle. And, for example, if you feed an organism ethanol or acetate, it will not have the glucose that is necessary for building other molecules, so it will run this pathway backwards. So different organisms will have different options, and there are two aspects to this. One is what capabilities they have in their genome, and some may or may not have all of these capabilities. But even if they have the capabilities, based on the presence of oxygen, and through sensing and signaling and so on, the organism may decide which of them it ends up expressing. Does this answer your question? Yeah, thank you. Thank you very much. Okay, I don't see any more questions. So maybe it's time to take a short break before we start with a new lecture. Thanks. Thanks again, Daniel, and see you soon. Thank you. We'll take a five-minute break, more or less, and we can meet again at quarter past six CET.