Hello, and welcome back to Beyond Networks: The Evolution of Living Systems. Today we come to the core of the argument, and this is why the lecture series is called Beyond Networks and not, say, Against Networks. What I want to do is talk about how we can use network analysis to find out what a network does, and about the central problem with that approach, which is a mantra I want you to repeat over and over for the rest of this lecture. That mantra is: structure does not determine function. And that is also the title of today's lecture. Structure does not determine function. Repeat with me: structure does not determine function. If you remember one thing from this lecture, please let it be this. What we did last time is we looked at network analysis, network graphs, and graph theory, and I tried to give you a quick outline of what this structural perspective gives you in terms of understanding the underlying system. Remember when we talked about models as epistemic tools, Tarja Knuuttila's approach: your approach constrains what you can do. So how does the network or graph approach constrain what we can do? First of all, it is a statistical analysis of sets of nodes and interactions, the edges of networks, and it analyzes the structure of those sets. It is ideal for the big data sets we have available today: omics data sets, interactions between proteins, transcription factors, gene regulatory networks, neuronal networks, social networks, and so on. Lots of data available. Remember that the data are always fit to the model just as much as the model is fit to the data; it is a symbiotic relationship. And the main aim of this type of analysis, as we saw, is to understand the robustness of these networks against perturbation, and also their connectedness, for example, how fast a signal or a virus can travel in a global network.
But what these methods don't do is tell you what the system does. Remember, this is what we set out to explain in the first place. I said much earlier that there are real systems out there, patterned processes, and we want to understand why and how these patterned processes are happening. That is a different thing from what network analysis gives us. So we need to move from a statistical, correlational analysis to something that gives us a causal explanation. We'll be busy with that for a few lectures, actually. So let's get back to this idea. Maybe a good strategy for understanding what a network does is to return to the idea that nature is somehow decomposable, or nearly decomposable. This is Herbert Simon's notion of near decomposability, where he says everything is connected, but some things are more connected than others, and we want to home in on those. The world is a large matrix of interactions in which most of the entries are very close to zero, and in which, by ordering those entries according to their orders of magnitude, a distinct hierarchic structure can be discerned. And we already arrived at a bit of this structure last time: biological complex adaptive systems are modular, and you can sometimes see this modularity at the structural level. This is called community structure. It means that some nodes in a network are more densely clustered amongst each other than connected to other clusters of nodes. You can see a nice example here from an actual data set, of protein-protein interactions, I think. Modularity not only holds the key to making networks robust against perturbation by limiting the spread of perturbations; it also holds the key to understanding what a complex network does, by letting us subdivide it into smaller chunks that we can actually understand. This is a very common strategy. Here's a very famous example from my own field of research.
Eric Davidson and Doug Erwin, in a very famous 2006 paper in Science, introduced this method. It looks at a big network that describes the early development of a sea urchin: the specification of its endoderm and mesoderm. This endomesoderm specification network can be subdivided in different ways. One way is already indicated on the graph itself: the different colored boxes indicate genes that are active at different times in different tissues during development. But you can do more than this. For example, you can home in on those components and interactions of the network that are most conserved in evolution (we'll come back to that later) and that are the core factors at the very center of the network. Davidson and Erwin called these the kernels of the network. And you can see a little trick in this network here: the same genes are repeated in different places. Here's blimp1/krox, blimp1/krox again, and blimp1/krox also here. That's because this is a transcription factor so central and important that it is redeployed in different contexts during development. So these parts of the network are the most conserved and the most central to the functioning of the network. We'll get back to what that means, the function of the network. Other parts of the network mediate between those different parts, these different colored boxes, for example. Here you have the Delta ligand of the Notch signaling pathway, and you can see there's a connection that goes from the pink box here to the blue box over here. Davidson and Erwin call these sub-circuits switches or input/output devices. You already get the point here that computer metaphors are very important to them.
And so it's not surprising that they call further downstream parts of the network plug-ins, just like the plug-ins for your computer. In this case, for example, they lock certain cells into a specific fate, which is to become the skeleton of the larva. These are transcription factors, and they mediate the effect coming from the kernel, via these switches, down to what they call the skeletogenic differentiation gene battery. A differentiation gene battery is the set of genes that actually enact the differentiation changes in a cell. So they are structural proteins, for example the cadherins you can see here, and other factors that cause the actual differentiation. This, then, is a non-mathematical way of subdividing a network. But we want to be a bit more rigorous than that, maybe. And there's a very interesting approach you can take, which looks not only at the local structure of the network, but also at how often a certain structure recurs in a network. This is work from about 20 years ago, which introduced the idea of a network motif. To identify a motif, you first enumerate, for example for three nodes in a network, all the different connection patterns you can imagine. In the lower part of the slide, you see the 13 different types of connected subgraphs you can have: this is the complete set of directed graphs on three connected nodes. There are 13 different combinations. And up here they are telling you that these networks can stand for anything: transcriptional regulation, neuronal connections, or being eaten in a food web. It doesn't matter; this is a very abstract analysis again. So you can take this set of subgraphs and say, okay, let's look at all the different subsets of three nodes in my network and count how many times each of those 13 types of subgraph occurs. Okay.
For what follows, we're going to focus on this specific motif here, which is called the feed-forward motif. Quick rant: this is sometimes called a feed-forward loop. Do not ever call this a loop. It is not a loop. There is no such thing as a feed-forward loop. Look here at number nine: this is a loop. There's an arrow going from this node to the other nodes and coming back to this node. That is a feedback loop; there's a very clear mathematical definition. The feed-forward motif starts here, flows away from this node, and never comes back. There is a branch in the pathway, but there is no loop. So don't ever, ever, ever call this a feed-forward loop. Feed-forward means there is no loop, by definition, so the idea of a feed-forward loop is a contradiction in terms. It doesn't make any sense at all. Please, everyone, stop calling it that. End of rant. So we're looking at the feed-forward motif, and we want to see how often it occurs in a specific network. Let me give you a definition of network motifs first. Network motifs are patterns of interconnections that recur in many different parts of a network at frequencies much higher than those found in randomized networks. What does that mean? Not only do you go and count, for example, these feed-forward motifs in your network here (they're indicated by the red dashed lines), but you then generate a set of randomized networks, which have the same in-degrees, out-degrees, and the same number of nodes and connections as your network, but are randomly rewired. Then you count how many of those motifs you have in those randomized networks. And then you can do some statistics and check whether the number of motifs in your particular case is significantly different from the average number you would expect in a random network. That's been done here; this is obviously a toy example. There are a lot of them here, compared to only two feed-forward motifs in the randomized network.
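To make the counting step concrete, here is a minimal sketch of the idea (my own illustration, not the authors' actual tool; dedicated motif-finding software handles all subgraph classes and does this far more efficiently). It counts feed-forward triads in a small directed graph:

```python
from itertools import permutations

def count_ffl(edges):
    """Count feed-forward motifs in a directed graph given as
    (source, target) pairs: ordered triples (x, y, z) with edges
    x->y, y->z, and x->z, and no backward edge (which would turn
    the triad into a different subgraph class)."""
    edges = set(edges)
    nodes = {n for e in edges for n in e}
    count = 0
    for x, y, z in permutations(nodes, 3):
        forward = {(x, y), (y, z), (x, z)}
        backward = {(y, x), (z, y), (z, x)}
        if forward <= edges and not (backward & edges):
            count += 1
    return count
```

For a full motif census you would match every one of the 13 triad classes the same way: enumerate node triples and compare their edge pattern against each template.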
So clearly this is probably statistically significant, otherwise they wouldn't show it in the paper. But this is the way you go about it: not only do you identify a specific structure, but it is only a motif if it is enriched, if it occurs more frequently than you would expect in a random network. The first study to use this approach looked at the transcriptional regulatory network of the bacterium E. coli. Here you can see this absolutely fantastic graph showing the entire regulatory network of E. coli and all the different motifs in it; every triangle here is a feed-forward motif. It has a relatively flat structure; other networks are very different. This is what these authors, in the group of Uri Alon, found out around the turn of the century. What they did is they sub-sampled the transcriptional network of E. coli: they took different sections of it, subnetworks of increasing size, and looked at how many times the feed-forward motif occurs compared to what you would expect. Down here, with the error bars, is a series of measurements in these randomized networks. As the network grows bigger, you can see that you would expect less and less frequent occurrence of the feed-forward motif, and of course the error gets smaller as your sample size increases. On the other hand, the real occurrence of this particular motif stays more or less the same across all these subnetworks. So, here is the threshold where the result becomes significant: if you have a network that is sufficiently big, you can do the sort of statistics that allows you to detect significantly enriched network motifs.
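The randomization step can be sketched as degree-preserving edge swaps, followed by a z-score of the real count against the randomized ensemble. This is a simplified illustration of the general procedure, not the published implementation:

```python
import random
from collections import Counter
from statistics import mean, pstdev

def rewire(edges, n_swaps=1000, seed=0):
    """Randomize a directed graph while preserving every node's
    in-degree and out-degree: repeatedly pick two edges (a, b) and
    (c, d) and swap their targets to (a, d) and (c, b)."""
    rng = random.Random(seed)
    edge_list = list(edges)
    present = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.randrange(len(edge_list)), rng.randrange(len(edge_list))
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if a == d or c == b:
            continue  # swap would create a self-loop
        if (a, d) in present or (c, b) in present:
            continue  # swap would create a duplicate edge
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return present

def z_score(real_count, random_counts):
    """How many standard deviations the real motif count lies
    from the mean of the randomized ensemble."""
    sd = pstdev(random_counts)
    return (real_count - mean(random_counts)) / sd if sd > 0 else float("inf")
```

You would call `rewire` many times with different seeds, count the motif in each randomized network, and feed those counts to `z_score`; a large positive score means the motif is enriched.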
So again, it's a big-network approach, but not based on structure alone: from this enrichment you infer that these motifs may have some sort of function, though that is very highly debated. And what's interesting is that if you look at different types of networks, you see different types of motifs emerge. This is a beautiful graph from a 2004 study, also from Uri Alon's lab, that shows all 13 combinations of subgraphs among three nodes, down here on the x-axis. What they did is look at how much each is enriched; the y-axis is a statistical significance score for the enrichment of a specific motif. The whole thing is called the triad significance profile: triad because it's three nodes, and then how significant the enrichment of each of those subgraphs is. Then you have four different classes of networks. Up here are transcriptional regulatory networks, not just from E. coli, but also from other microorganisms like yeast and Bacillus subtilis. Down here you have signal transduction networks and (much smaller datasets at the time) transcriptional regulatory networks in fruit flies, Drosophila, and sea urchins (again the sea urchin here), and also the neuronal network of the worm C. elegans that had been reconstructed at the time. Down here are technological networks; these WWW networks are obviously subsets of the World Wide Web. Dataset number two is websites related to Shakespeare, and I forget what the others are. Social networks as well, and they are very similar to the World Wide Web networks: networks between prison inmates, sociology sophomores, and other rather strange slices of society are depicted here. And down here you have a very interesting class of networks: adjacency networks of words.
So, word adjacency networks in English, French, Spanish, and Japanese, plus a very abstract bipartite model in which two different groups of entities preferentially associate with each other. You can see that the significance profiles differ for each class, each superclass, of these networks, and you can group the networks based on this profile. What's striking here is that the transcriptional regulatory networks of E. coli and these other microorganisms have only one enriched triad, and that, surprisingly, is the feed-forward motif, which is also enriched in eukaryotic transcriptional, signaling, and neuronal networks. That really piqued people's interest, because it seems that biological regulatory networks have that motif enriched while technological networks don't. But the situation is a bit more complicated than that. Remember that we had undirected and directed graphs; to study transcriptional regulation with a graph-theoretical approach, we need to add a little more complexity. That's because transcriptional interactions can be positive (activation) or negative (repression). So we need to add signs to our interactions, and we get a signed directed graph. The way this works is you label all the positive interactions with a plus, indicated in blue here, and all the negative ones with a minus, indicated in red. Okay, just to introduce some notation: from now on we're not going to use the plus and minus notation, because it's a bit clumsy, but arrows for positive interactions and T-bars for repressive, negative interactions. This gives us an additional dimension to look at the structure of motifs, so we can revisit the transcriptional regulatory network of E. coli: here you have positive regulation in blue, negative regulation in red, just like in the graph I showed you before. So they did this already for the E.
coli regulatory graph. On these signed directed graphs you can then identify a whole range of different motifs. A bunch of classical examples are shown here. You could have negative auto-regulation, which can be direct or can go through intermediate steps (not shown here, by the way), and positive auto-regulation; so auto-activation and auto-inhibition. There's the feed-forward motif, of course. They also identified what they called single-input modules, where one transcription factor is responsible for a whole range of targets, maybe with different sensitivities, and then what they called dense overlapping regulons, which allowed them to partition the transcriptional network of E. coli into big chunks where a shared set of transcription factors regulates an overlapping set of targets. There are more, and obviously it's a little arbitrary how you set up different classifications of motifs. We'll encounter a few more motifs as we go along. But if you zoom in on the feed-forward motif itself, now that you have signed interactions, there's a whole zoo of network motifs within the feed-forward motif, because every interaction can be activating or repressive. You can subdivide these feed-forward motifs, very roughly, into two classes: coherent and incoherent. The coherent ones have no contradiction between the two pathways. For example, the type 1 coherent feed-forward motif here has a direct activation of Z by X, and an indirect activation of Z by X through Y on the other branch. In the incoherent motifs, you have an activation of Z by X and then an overall repression of Z by X via Y, which doesn't seem to make sense at first, but we'll have a look at that.
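The coherent/incoherent distinction boils down to comparing the overall sign of the indirect path with the sign of the direct one. A tiny sketch (my own illustration) that enumerates all eight signed variants:

```python
from itertools import product

def ffl_class(s_xy, s_yz, s_xz):
    """Classify a signed feed-forward motif X->Y->Z with direct
    shortcut X->Z. Each sign is +1 (activation, arrow) or
    -1 (repression, T-bar). The motif is coherent when the indirect
    path's overall sign (product s_xy * s_yz) matches s_xz."""
    return "coherent" if s_xy * s_yz == s_xz else "incoherent"

# All eight sign combinations: four coherent and four incoherent types.
all_types = [ffl_class(*signs) for signs in product((+1, -1), repeat=3)]
```

For instance, the type 1 coherent motif is all-activating (+1, +1, +1), while the type 1 incoherent motif activates Z directly but represses it through Y (+1, -1, +1).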
So within this single feed-forward motif there are at least eight different types of motifs, and it gets worse, because where the two branches meet you can have different functional forms. For example, here we assume that the two branches reunite through an AND function: both the direct activation by X and the indirect activation by Y need to be present to activate Z. What we can do now with such a simple motif is study what it does. We can model it, and often these circuits are so simple that we can even get an intuition for how they work. The idea is to get that knowledge of what each motif does and then put them together again and learn what the whole network does; basically, Lego bricks for networks. You build the whole network up again after you've dissected it, and the assumption is that the whole network will just be a combination of the behaviors of these motifs. Okay, but back to the coherent type 1 motif here. What does it do? It seems completely redundant, right? There are two activation pathways, but one of them is indirect. So imagine that you have an input signal coming in here at the top, in two pulses: a very short pulse of signal that immediately induces a very short pulse of expression of X, and a longer pulse that gives a more sustained expression of X. Now, X will immediately try to activate Z, but it can only do so if it also signals through Y. It needs to activate Y first, and so if you have a very short pulse of X, there's a gradual build-up of Y, but it never reaches the threshold that it needs to activate Z. So the short pulse is filtered out, and only a sustained pulse will build up enough Y to get expression of Z.
So here is the output of Z, and you can see that it only responds to the sustained pulse while filtering out short ones. This motif can act as a persistence-detecting filter. It will filter out noise, brief signals that the organism doesn't need to respond to. And indeed there are systems, like the arabinose metabolizing system in E. coli, that work like this, and you can experimentally show that the principles this simple toy model implements apply to the induction of arabinose processing in E. coli. Okay, so let's turn to the incoherent feed-forward motif, which is a little weirder, right? The two branches seem to contradict each other. So what happens if you have a signal that switches on and stays on? You get, of course, an induction of X, and at the same time Y is building up, and you get an activation of Z. But at some point, Y reaches the threshold at which it starts to inhibit Z. So Z builds up, and then, as Y continues to build up, Z is degraded again down to a certain level. This has two effects. The main effect is that the incoherent type 1 feed-forward motif can act as a pulse generator: you end up with a pulse of Z. And interestingly, the initial response you see here is also faster than with a plain, single-step transcriptional activation. So this is an interesting type of behavior, and again you can find it in the galactose processing system of E. coli, which implements such an incoherent feed-forward motif. So, wow, cool. This is great. All we have to do now is take a complicated network, let's take a more complicated example: here is the sea urchin early developmental network, and we need to subdivide it into small enough motifs so we can understand what the whole network does.
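Both feed-forward behaviors just described can be reproduced with a deliberately crude simulation: a threshold gate on Y plus linear decay, integrated with the Euler method. The parameter values (threshold, decay rate, time step) are illustrative, not fitted to any real system:

```python
def simulate_ffl(signal, incoherent=False, decay=0.5, theta=0.5, dt=0.1):
    """Type-1 feed-forward motif with X driven directly by the input.
    Coherent (AND gate): Z is produced only while X is on AND Y >= theta.
    Incoherent: Z is produced only while X is on AND Y < theta,
    so once Y accumulates past theta it shuts Z production off."""
    y = z = 0.0
    z_course = []
    for x in signal:
        gate = (y < theta) if incoherent else (y >= theta)
        z_production = 1.0 if (x and gate) else 0.0
        y += dt * (x - decay * y)          # Y builds up while X is on
        z += dt * (z_production - decay * z)
        z_course.append(z)
    return z_course

# A 3-step blip of input versus a sustained input.
blip = [1] * 3 + [0] * 97
sustained = [1] * 100
```

The coherent version filters the blip out completely (Y never crosses the threshold), while the incoherent version turns the sustained input into a transient pulse of Z that decays again: exactly the persistence filter and pulse generator described above.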
This was actually attempted for the whole sea urchin network in a paper by Isabelle Peter and Eric Davidson, published posthumously after Davidson was already deceased. Okay, so this is great. Divide and conquer. We exploit the near decomposability of complex adaptive systems to get down to their basic parts. If we understand those parts, we understand the behavior of the whole system. Seems like a great strategy, right? So what could be the problem? What is the problem with this approach? The problem is quite simple: it does not work. Well, sometimes it does; you get lucky. We saw that, especially in a very flat transcriptional regulatory network like that of E. coli. These motifs, once you've identified them, really are strongly separated from the rest of the regulatory apparatus. It's a very flat network; there is not a lot of feedback interaction. So you can separate those little modules, and they actually do what you predict them to do. But if you try to do this for an animal like Drosophila or a sea urchin, or any more complex system, the economy, whatever, it won't work, because these complex systems are heavily feedback-driven and much harder to decompose than an E. coli regulatory network. And the main problems are two-fold. Let's have a look at our little toy model here again and find feed-forward motifs in it. There is an incoherent one, type 1 as it happens, between V6, V2, and V3. But you see immediately that there's a complication: there is an additional backward interaction here. This happens often, of course. So what is immediately clear is that network context matters. We can't just count all the different instances; we need to keep track of all the interactions and the context, because they will affect how a motif works. We'll come back to that in a second. Just one more point.
The second problem: well, let's look at this feed-forward motif here, a type 1 coherent one between V4, V6, and V2. And what we find is: whoa, actually this is a feedback loop, not a feed-forward motif at all. The main point is this: in this case we have identified all the interactions between those nodes. We'll come back to discussing what that means, "all the interactions"; it is really hard to actually identify all the relevant interactions, and remember that it is always somewhat arbitrary which interactions and nodes you include in a system. Here we should be fine, but the problem is that even this simple circuit exhibits many different types of dynamics. Even this simple circuit will behave in many different ways, depending on the precise strength and also the nature of the interactions. It matters a lot whether an interaction is transcriptional, translational, protein-protein binding, whatever. The details matter in all these cases. And, of course, if you get a different dynamic behavior depending on the nature and strength of those interactions, you usually also get an effect on function. If a circuit, a motif, behaves in a different way, it also has a different function. Not always, not necessarily, but most of the time. So let me illustrate these two problems, the structure-does-not-determine-dynamics problem and the network-context problem, with a simple example. I'll introduce two more motifs here. One is called the repressilator. It was one of the first synthetic circuits ever built, by Elowitz and Leibler in 2000. Beautiful paper. You can model it very exhaustively, and it can do two things. It can oscillate; what I'm showing you here is a mouse presomitic mesoderm, but that doesn't matter.
But you can see oscillating pulses of gene expression. This sort of circuit can produce sustained, stable oscillations; it goes on oscillating on and on. Or, depending on the strength of the interactions, it can produce an oscillation that dies away after a while; this is called a damped oscillation. Basically, you cycle through the expression of these three genes, the green one, the red one, the blue one, and at some point it stops, or it doesn't. Okay. And I'm not even mentioning the default behavior of this circuit: for most interaction strengths, for most parameter combinations, this circuit doesn't do anything. No expression at all. And that's the case for most circuits: you need to tune quite finely to get the behavior you would expect to be the default, in many, many cases. So that's one problem. But now, if you add just one more interaction, and that's what we're going to do, just like in the example I gave you before, if you add one backward interaction between two of the factors, you get a really cool circuit. It's called the AC-DC circuit. The reason for the name is that it has both a positive and a negative feedback loop. You see, the three negative interactions form a negative feedback loop here. And two negative interactions, the double-negative inhibition of an inhibitor, make a positive interaction overall. So this circuit has a positive and a negative feedback loop. And it can do a lot more, based on those two feedback loops. It can do everything the repressilator did: the oscillations, whether damped or stable. But it can also switch stably between the green and the red genes, either green on or red on, if these two interactions are strong. Or it can do something really cool: it can implement a switch between green and red that keeps flipping back, just like in this little movie. This is called a relaxation oscillation.
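As a minimal, protein-only sketch of the repressilator (the original Elowitz-Leibler model also tracks mRNA; the parameter values here are illustrative, chosen to sit in the oscillating regime):

```python
def repressilator(beta=10.0, n=3.0, steps=5000, dt=0.01):
    """Three genes in a ring, each repressing the next via a Hill
    function, each protein decaying at unit rate. Euler integration.
    Returns the time course of the three protein levels."""
    a, b, c = 1.0, 0.5, 0.25  # asymmetric start, off the symmetric fixed point
    history = []
    for _ in range(steps):
        da = beta / (1.0 + c ** n) - a
        db = beta / (1.0 + a ** n) - b
        dc = beta / (1.0 + b ** n) - c
        a, b, c = a + dt * da, b + dt * db, c + dt * dc
        history.append((a, b, c))
    return history
```

With these values the circuit keeps oscillating; lowering the production strength `beta` or the Hill coefficient `n` pushes it toward damped oscillations or a dead steady state. Same wiring diagram, qualitatively different behavior, which is exactly the point being made here.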
The name doesn't really matter. The important thing is that by adding just one interaction, one piece of context, even if that interaction is weak, you can create a whole new repertoire of behaviors that the circuit can possibly implement. So it is extremely difficult to just pick out motifs, say they do this, and then put them together. Because remember, even just the choice of how many nodes you look at (three, or four; four-node motifs have also been studied, and so on) is pretty arbitrary, right? At which level are you going to look? So network context and the nature and strength of the interactions are crucial for the behavior. Depending on those parameters of the system, you get all kinds of different behaviors, and even a very, very simple network like the repressilator has three different behaviors: nothing at all, plus two different types of oscillation. And this simple AC-DC circuit can do five qualitatively different things: three types of oscillation, switch-like behavior, and of course, most of the time, nothing at all. Okay, so this is the crux of the matter. You cannot infer what a complex network does just by looking at its structure. It is impossible to understand the function of a network that way: you need to understand what it does and what it can do, and this is what we use dynamical models for. So what you need to do is switch from one type of explanation of networks, the one based on graph theory, which has been called topological because it is structural, to a causal, that is, a mechanistic, explanation: what does this network do? I will tell you in the lecture that comes up next what I mean by mechanistic explanation. I hope you tune in again next time, and as always, thanks for listening.