Welcome, everyone, to the new academic season of the virtual SIB Computational Biology seminar series that we started last year. Today we have the pleasure of hosting Rudi Gunawan, who is from the Institute for Chemical and Bioengineering at ETH Zurich. Very briefly: he obtained his PhD in chemical engineering in 2003 at the University of Illinois Urbana-Champaign in the US. From 2003 to 2006 he was a postdoctoral research fellow at the University of California, Santa Barbara, in the US. He was then a fellow in Chemical and Pharmaceutical Engineering at the Singapore-MIT Alliance for Research and Technology in Singapore, and was also appointed assistant professor at the National University of Singapore. In 2011 he moved to Europe and became assistant professor at ETH Zurich, where he now leads the Chemical and Biological Systems Engineering Laboratory. His research interests lie primarily at the intersection of systems engineering and biology, more specifically systems modeling and analysis of cellular networks. His group's activities focus on developing methods for model identification and analysis of gene regulatory networks, signal transduction pathways, and metabolic networks. Today he will share with us some of the tools developed in his group, as well as their work on biological network inference. So, Rudi, thank you again for accepting our invitation, and the floor is yours.

Well, thank you very much, Dan. Thank you for organizing this and inviting me, and thank you for bearing with my incessant emails; sorry about that. For those of you in the audience, thank you for coming, and for those of you watching the webcast, thank you for listening. I have to be honest, this is my first time delivering a webinar, so I will probably make a few faux pas here and there; I hope you can bear with me.
As you can see, the title of my talk today is "Ensemble-Based Design of Experiments for Biological Network Inference." The word "ensemble" here refers to the idea of ensemble modeling, where we track a family of mathematical models of a biological network and use that family to represent uncertainty. The title also suggests that we are going to use this ensemble to design experiments, and you'll see a bit about that. I'll start with a short introduction to my lab and its scope; as Dan said, it's the Chemical and Biological Systems Engineering Lab. Full disclosure: I'm an engineer by training, so you'll see some evidence of that in the way we look at our problems. As was also mentioned, our research lies at the intersection of systems modeling and analysis, building mathematical models, in particular of biological networks. There are two parallel research tracks in my lab at the moment. On one side, we develop methods and tools to extract, or what we call reverse engineer, models from data. On the other side, we are interested in one particular application, the biology of aging, in particular mitochondrial DNA mutations. Today's seminar will mainly cover the first track: the development of methods for biological network inference. I really don't have to motivate this problem much for this audience, because you have been at the forefront of taking biological data, omics data, and extracting networks from it, the activity called network inference. What I do want to stress here is that a lot of the time we are dealing with a problem that is, in mathematical terms, underdetermined.
This problem goes by different names: because of the dimensionality, you have high-dimensional data but a low number of samples, or what Mike Stumpf from Imperial College calls "data rich, hypothesis rich." We do live in a situation with a tsunami of biological data, a data-rich environment, but we also have a very rich hypothesis space: there are many, many different hypotheses one could formulate that fit the same biological data. And of course the activity of network inference is not linear like what you see on this slide; it's an iterative process. A lot of the time it starts with some prior knowledge, whatever your collaborators gave you or what you can find in the literature. Based on that existing knowledge, you start writing down the model equations. Often some of the parameters are unknown, so you do parameter estimation; again, I stress that these steps are mathematically underdetermined. The next step is model validation. Well, I put "model invalidation" here, because you can't really fully validate a model. If the model is sufficiently good for your purpose, you can use it; otherwise you go back and do more experiments. This is where I think model-based design of experiments can be of value, and this is the topic of today's lecture. I'm going to talk about two strategies we have developed. Some of this is under review, so it's not published; perhaps some of it is unpublishable, that remains to be seen. The first strategy is the design of experiments, particularly deciding the best gene knockout experiments for the purpose of gene regulatory network inference.
I'm going to motivate this problem by looking at gene regulatory networks as a simple graph, a simple directed graph in this case. This is a problem people face very often when they have gene expression data, and by the way, a lot of these problems fall under the umbrella of causal inference, that is, extracting causal relationships among nodes, where the nodes here are genes and the edges reflect regulatory interactions. So if you have an arrow from node i to node j, you are saying gene i regulates the expression of gene j. The inference of gene regulatory networks from expression profiles is still an unsolved problem even after all these years. This is something the DREAM challenges have looked at, and I think some of the people in SIB have participated in them too. The finding is that this inference problem is underdetermined, and in particular that differentiating direct versus indirect regulation is quite challenging. Why is that? This is one of the things we are trying to address in our tool called TRACE, and you'll see in a little bit what that is. Let's go back to the idea of knocking out, or breaking, the network as a way to infer it. Breaking the network in this case means performing a gene knockout. For illustration purposes, consider the following simple example with five genes and about five arrows. The idea is that if you were to knock out A, then under steady-state conditions the perturbation of A would lead to differential expression not only in genes B and D, which are directly regulated by A, but also in all the genes downstream of A. So what you get from the differential expression, potentially, are all the genes that are directly and indirectly regulated by A. In the same way, if you were to knock out B, you would see C being differentially expressed, but also the genes downstream of C.
If you repeat this exercise, what you get out of steady-state gene knockout data is essentially the transitive closure of the graph, the accessibility (reachability) relationships among the nodes. And that is a fundamental problem with the data; it has nothing to do with the method. This is what we are trying to convey: it's not a problem with the method, but rather with the data and the way the experiments are done. Later on, of course, we will use this as motivation to design better experiments. So how do we cope with an underdetermined problem? In this case, we know that for a given transitive closure there can be many, many directed graphs with that same closure; mathematically, this is what describes the indistinguishability problem. The idea, then, is that you have an ensemble of graphs with the same transitive closure. Keeping this in mind, the strategy we developed recently, published last year, is called TRACE, which stands for TRAnsitive reduction and Closure Ensemble method. What we want to address here is the ensemble nature, the multiplicity, of the solution. The solution TRACE produces is an ensemble; in particular, we produce the upper and lower bounds of the ensemble. The upper bound represents the most complicated graph that agrees with the data, and the lower bound is the least complicated, the simplest in terms of the number of edges. So that's what TRACE does: it takes in differential gene expression data from gene knockout experiments and produces the upper and lower bounds of the ensemble. If you want to construct all members of the ensemble, there is an easy way to do it, but we don't really advise that, because the number of graphs can be unmanageable. Let me describe a little how we do this.
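To make the closure idea concrete, here is a small sketch (this is not code from TRACE; the gene names and edges are a hypothetical five-gene network) of what steady-state knockout data can reveal: only reachability, not which edges are direct.

```python
# Sketch: a knockout of gene g differentially expresses every gene
# reachable from g, so steady-state knockout data only pins down the
# transitive closure of the network, not the network itself.

def transitive_closure(edges, nodes):
    """Return the set of (i, j) pairs where j is reachable from i."""
    adj = {n: set() for n in nodes}
    for i, j in edges:
        adj[i].add(j)
    closure = set()
    for start in nodes:
        stack, seen = [start], set()
        while stack:
            n = stack.pop()
            for m in adj[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        closure |= {(start, m) for m in seen}
    return closure

# Hypothetical 5-gene network: A->B, A->D, B->C, C->E, D->E
edges = {("A", "B"), ("A", "D"), ("B", "C"), ("C", "E"), ("D", "E")}
tc = transitive_closure(edges, "ABCDE")
# Knocking out A perturbs B, C, D, E -- direct and indirect targets alike.
```

Note that the pair (A, E) appears in the closure even though there is no direct edge A to E; from the knockout data alone, the two cases are indistinguishable.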
I've already described how you can construct accessibility relationships from a single gene knockout. What we did with TRACE is generalize this to multiple, n-fold gene knockouts. We treat an n-gene knockout simply as a single gene knockout in the background of n minus one gene deletions; that's essentially the trick. So given single, double, or triple gene knockouts, I can construct accessibility relationships from them, but not just the accessibility relationships of the gene regulatory network you are interested in: for the n minus one background deletions, you get the gene regulatory network associated with deleting those n minus one genes. What do I mean by that? For example, consider the complete set of double gene knockouts involving B for this small example; these are the knockouts AB, BC, and BD. Essentially, you then have single gene knockouts of the following gene regulatory network, which we call GB here. The difference between this network and the original one is that the edges out of node B have been removed. What you get from the AB, BC, and BD expression profiles is the transitive closure of this network; that's what we did earlier. In the second step, we perform what is called transitive reduction. The transitive reduction is the directed graph with the minimum number of edges among all graphs that have the same transitive closure as the accessibility relationships you obtained earlier. This is something for which an algorithm has existed since the 1970s. We know that the transitive reduction of a DAG, a directed acyclic graph, is unique: if you don't have cycles, you have a unique transitive reduction. But if you have cycles, which is something you have to deal with in real gene regulatory networks, there are multiple solutions.
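For the acyclic case, the transitive reduction step can be sketched as follows (a naive illustration of the classic idea, not the implementation used in TRACE): drop any edge whose endpoints remain connected by a longer path.

```python
def transitive_reduction(edges):
    """Minimal DAG with the same closure: drop (i, j) if j is still
    reachable from i through some longer path (DAG case only)."""
    def reachable(src, dst, es):
        stack, seen = [src], set()
        while stack:
            n = stack.pop()
            for a, b in es:
                if a == n and b not in seen:
                    if b == dst:
                        return True
                    seen.add(b)
                    stack.append(b)
        return False

    reduced = set(edges)
    for e in sorted(edges):
        if reachable(e[0], e[1], reduced - {e}):
            reduced.discard(e)
    return reduced

# The closure of A->B, B->C contains the shortcut A->C;
# the reduction removes it again.
print(transitive_reduction({("A", "B"), ("B", "C"), ("A", "C")}))
```

This naive version is quadratic in the edge count; the point is only the principle, that the reduction keeps the fewest edges consistent with the observed reachability.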
What we did there is simply remove the edges in cycles because, as you'll see in a little bit, we essentially assign these edges as uncertain edges, yeah? Just to give a bit of background: to really determine the direction of a cycle, you need to break that cycle. Okay, and the third and last step: now that we have the accessibility relationships, the transitive closures, and for each closure its transitive reduction, we merge them. This is a simple exercise: for the upper bound we take the intersection of the edge sets of the transitive closures, and for the lower bound we take the union of the edge sets of the transitive reductions. So you see here that the upper bound GU is the most complicated network that agrees with the data you have, and the lower bound is the least complicated network that also agrees with the data you have, yeah? For this particular example, it just so happens that the lower bound is the actual gene regulatory network. However, you can never be sure from the experiments which one is the true network. So this is what TRACE does: it produces upper and lower bounds, and we didn't think one needs to enumerate the whole ensemble; the bounds are a rather compact way to represent the whole family of solutions. What we did recently is take this idea of the upper and lower bounds of the ensemble and use them to design experiments. The question here is: can you tell me the best gene knockout to do next to guide the network inference, yeah? For this purpose, we define what we call uncertain edges from the upper and lower bounds. I'm going to illustrate the principle we came up with for finding the optimal gene knockouts using a simple exercise. For the moment, imagine we have the following upper and lower bounds.
In this case, there are edges that appear in the upper bound but not in the lower bound, and those are what we call uncertain edges, yeah? For this particular example, we have the edge A to F, the regulation of F by A, and the edge B to G, the regulation of G by B. These can be easily identified from the upper and lower bounds just by taking the set difference. Now, the basic idea for finding the optimal gene knockout is the following: disconnect all the indirect paths associated with an uncertain edge. For example, for the edge AF, imagine we were to knock out C; if C is not there, then confirming whether AF exists in the network is a simple experiment of perturbing A, yeah? So it's about disconnecting the indirect paths associated with an uncertain edge. For BG, by the same principle, we can knock out a node on the indirect path from B to G; confirming whether the BG edge exists is then a simple perturbation of B, because we no longer have the indirect paths, yeah? For this particular example, there is actually an optimal knockout, which is E: when we knock out E, we can verify both AF and BG. Of course, this involves an individual perturbation of A and a separate perturbation of B. With that basic idea, we then define what we call the edge separatoid of an uncertain edge. Given an uncertain edge from node I to node J, the separatoid of this edge is a set of nodes whose removal enables the verification of that edge, by the same principle you saw in the examples, yeah? We came up with three such sets. There are many, many ways to define the edge separatoids of a given uncertain edge; the reason we chose these three is purely computational.
They are easy to identify, and there are not too many of them to store, yeah? You'll see in a little bit that during the optimization we have to store all of these separatoids while doing the online optimization. So for the edge BG, what are these three separatoids? The first is the intersection of the children (direct progeny) of B with the ancestors of G. The second separatoid we consider is the intersection of the descendants of B with the parents of G. And finally, we also consider the intersection of the descendants of B with the ancestors of G. All of these are possible nodes that you can knock out to remove the indirect paths and verify the uncertain edge correctly. That's the basic principle. We then wrapped this in an optimization, yeah? It's a simple integer programming problem. We actually didn't use anything fancy: we used a genetic algorithm. Of course, you could use a more advanced integer programming algorithm, but we were okay with the genetic algorithm; it was the first thing we tried, and for the E. coli network we could get a solution in about 90 seconds, so we were quite happy with that. Nevertheless, we realize that much better integer programming algorithms could be used. The idea here is very simple: find the optimal combination of gene knockouts that verifies the highest number of uncertain edges. That's the basic principle behind the design of experiments here. And we can put in constraints. For example, you don't want to knock out essential genes, genes whose knockout would be lethal. You could also limit the number of simultaneous gene knockouts, because obviously the likelihood of synthetic lethality from knocking out 10 genes simultaneously could be very high; but two or three, perhaps, is something you can work with. So the tool is available.
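As a rough illustration of the separatoid idea, here is a sketch (the talk's tool solves an integer program with a genetic algorithm; a greedy pick of the single most-shared node stands in for it here, and the network below is hypothetical):

```python
from collections import Counter

def descendants(n, closure):
    return {b for a, b in closure if a == n}

def ancestors(n, closure):
    return {a for a, b in closure if b == n}

def separatoid(i, j, closure):
    """One of the separatoid sets from the talk: descendants of i
    intersected with ancestors of j, i.e. nodes lying on the indirect
    i->j paths. Knocking them out leaves only the direct edge."""
    return (descendants(i, closure) & ancestors(j, closure)) - {i, j}

def greedy_knockout(uncertain_edges, closure):
    """Illustrative stand-in for the integer program: pick the node
    appearing in the most separatoids, i.e. the knockout that helps
    verify the most uncertain edges at once."""
    votes = Counter()
    for i, j in uncertain_edges:
        votes.update(separatoid(i, j, closure))
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical closure: indirect paths A->E->F and B->E->G exist
# alongside the uncertain direct edges A->F and B->G.
closure = {("A", "E"), ("A", "F"), ("A", "G"),
           ("B", "E"), ("B", "F"), ("B", "G"),
           ("E", "F"), ("E", "G")}
best = greedy_knockout({("A", "F"), ("B", "G")}, closure)
# knocking out E verifies both uncertain edges at once
```

The greedy single-node pick recovers the talk's example (E verifies both AF and BG); the real formulation additionally handles combinations of knockouts and constraints such as essential genes.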
The manuscript is under review; hopefully it will be published very soon. The idea here is to iterate: we begin with some prior data, and using TRACE we generate the upper and lower bounds. Using REDUCE, we then design the experiments. We feed this back to the experimentalists, who do the experiments for us; once we get the data, we update the ensemble. That's the loop in the next couple of slides. We do these iterations. As a proof of concept, we applied this to the E. coli gene regulatory network, first under an ideal scenario where we don't consider noise. This is unrealistic, but I'll show an example later where we actually put in noise. This network has about 1,600 nodes and about 3,800 edges. The constraints are that we exclude essential genes, a set we got from the literature, and we also put a limit on the number of simultaneous gene knockouts. In the first exercise, we limited the number of simultaneous knockouts to 10. I know 10 is very high; we actually got a reviewer comment saying so, and I'll show you a modification of that in a moment. With up to 10, we can get to the true network. What I'm showing here is the Jaccard distance of the upper bound from the true network, and of the lower bound from the true network, plotted with a negative sign. So when you see this curve coming down, the distance between the upper bound and the true network decreases; in the same way, when this one comes up, the distance between the lower bound and the true network decreases. When they meet, you have converged to a unique solution; this is the condition where we say the network is inferable. It took 160 iterations involving about 430 knockout experiments. Then the reviewer comment came, saying you probably can't do 10.
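The convergence metric used here, the Jaccard distance between edge sets, takes only a few lines to compute (a generic sketch with toy edge sets, not the E. coli data):

```python
def jaccard_distance(edges_a, edges_b):
    """1 - |intersection| / |union| of two edge sets; 0 means the
    inferred bound coincides with the true network."""
    union = edges_a | edges_b
    if not union:
        return 0.0
    return 1.0 - len(edges_a & edges_b) / len(union)

true_net = {("A", "B"), ("B", "C")}
upper    = {("A", "B"), ("B", "C"), ("A", "C")}  # closure: one extra edge
lower    = {("A", "B"), ("B", "C")}              # reduction: exact here
d_upper = jaccard_distance(upper, true_net)      # 1 - 2/3
d_lower = jaccard_distance(lower, true_net)      # 0.0
```

In the plots described above, both distances shrinking to zero (one plotted with a negative sign) is exactly the condition for the bounds to meet at the unique, inferable network.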
So we did something different, where we perform the iteration while gradually increasing the number of knockouts. We start with double gene knockouts and iterate until REDUCE no longer gives a solution, which happens when the separatoid sets all involve more than two genes. You can see the steps here: up to two, up to three, up to four, five, six, seven, eight, nine; we did this all the way to convergence. Not surprisingly, we now require more iterations and more knockout experiments. However, it's notable that by limiting the gene knockouts to up to three genes, we can already verify about 95% of the uncertain edges; I think that's quite useful in that regard. We also applied this to the DREAM challenge 100-gene networks, for which we can generate data using GeneNetWeaver, a product from not too far from here, I think. Yeah, exactly. So we used that platform to generate benchmark data. These are the results, the number of verified edges, for networks one, two, three, four, and five; there are five gold-standard networks in that challenge. The colored bars show the results from the iterations, which we ran until convergence. All of the iterations converge to a single solution, but notice here, I don't know if the cursor is there, okay, that we don't get only true positives and true negatives: we have false positives and false negatives. I think that's unavoidable once you deal with real or noisy data. So the iterative procedure can give us a unique solution, but that unique solution is not the true network.
Yeah, so I think this is something one has to deal with as soon as noise enters: because of statistical significance, you have to apply a certain cutoff when deciding whether an edge exists or not. We also noted that the errors were mostly associated with fan-in motifs. A fan-in motif is when you have multiple regulators of a single gene. In that case, if you knock out just one of the regulators, you may not see any response in the downstream gene because of compensation effects, yeah? So this is a limitation in the data, not really a problem with the method; it comes from the fact that you have multiple regulators of a single gene. Okay, just to compare a little with double gene knockouts: the gray bars here show the number of edges verified by double gene knockouts; I forgot to show you those results earlier. This one is a bit more complicated, so maybe I'll skip these results for the seminar, but compared to the complete set of double gene knockouts, we were able to verify more edges. This table also shows that we can verify those edges in fewer experiments: the total number of knockout experiments, as you see in the last column, is much, much lower than if you were to systematically perform all possible double gene knockouts. So with this design we can provide more informative data per knockout experiment. Okay, so that's the first half of my talk. In the second half, I'm going to switch gears and talk about a different type of modeling of gene regulatory networks, what I would call an engineering model. Here we consider writing down differential equation models for gene regulatory networks of this type.
So here, x-dot is the derivative, the change in concentration as a function of time, if you come from the engineering side. S is the stoichiometry, if you are dealing with metabolic networks; you can also call it the connectivity or structure of the network. v encapsulates the kinetics: for the edges in this example, it describes the nature and dynamics of each edge. And p contains the parameters. When you deal with this type of model, you potentially have not only structural uncertainty, but also kinetic uncertainty and parametric uncertainty. So the underdetermined problem comes in at multiple levels once you start doing more detailed modeling of this kind. For the illustration in this case study, we're going to consider the myeloid progenitor cell differentiation network, the gene regulatory network shown here on the left. In this regard, we consider uncertainty in both the structure and the parameters, and I'll show you a little how we do that. For the parametric uncertainty, we actually have a tool already available on our website, so a bit of an advertisement: it's called REDEMPTION, which stands for reduced-dimension ensemble modeling and parameter estimation. It's written in MATLAB; that's the engineering side of us. We have had requests from people asking whether it is available in Python or R; at the moment we have no plans to port it to Python yet, but if you have any problems, let us know. Nevertheless, let me go over a little of what REDEMPTION offers. It gives you a user interface, which hopefully makes it easy for people to write mathematical models. It's really just a matter of entering, sorry, I need to get the cursor out,
the stoichiometric matrix here, the parameter values, the bounds on the parameters, and the initial conditions, and loading the data, which hopefully is dynamic, time-series data in this case. If you want to do preprocessing, for example smoothing, you can do that here as well. One of the key modules estimates parameters by fitting the data with the model equations you have written down; so there's a module for parameter estimation. Now, for the purpose of this talk, where we are really pushing the ensemble side, it also has a module to generate a parameter ensemble. What this gives you is a family of parameter combinations for which the model provides a reasonable fit to the data, reasonable up to a certain threshold that you specify on the likelihood function value. Yeah, so the idea is that with a click of a button and some simple data entry, you can do this. By the way, we use a tool called HYPERSPACE to generate the parameter ensemble. HYPERSPACE is a parameter-space exploration tool written by another group in the SIB, Jörg Stelling's group, together with Andreas Wagner, and we have had really good success with it in combination with our tools. Here is an example where we applied REDEMPTION to the very simple network shown here, a branched metabolic pathway. The idea is that if you write down the equations in REDEMPTION and provide the data points, shown here as crosses for all the metabolites, then this is the output: the ensemble of models you get. This is a three-dimensional projection of the parameter ensemble; it consists of, in this case, 60,000 blue dots, and each blue dot is a parameter combination.
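The model form introduced above, x' = S v(x, p), can be sketched with a hypothetical two-species, three-reaction system (plain Euler integration as a stand-in for a real ODE solver; all names and values here are made up, not from REDEMPTION):

```python
# Hypothetical example of the x' = S * v(x, p) decomposition:
# reactions: production of X1, conversion X1 -> X2, degradation of X2.
S = [[1, -1,  0],   # one row per species, one column per reaction flux
     [0,  1, -1]]

def v(x, p):
    """Kinetic rate laws; p holds the (uncertain) parameters."""
    k_in, k_conv, k_deg = p
    return [k_in, k_conv * x[0], k_deg * x[1]]

def simulate(x0, p, dt=0.01, steps=2000):
    """Plain forward-Euler integration of x' = S v(x, p)."""
    x = list(x0)
    for _ in range(steps):
        rates = v(x, p)
        x = [xi + dt * sum(S[i][r] * rates[r] for r in range(len(rates)))
             for i, xi in enumerate(x)]
    return x

x_end = simulate([0.0, 0.0], p=(1.0, 1.0, 2.0))
# Steady state: x1 -> k_in/k_conv = 1.0, x2 -> k_conv*x1/k_deg = 0.5
```

The point of the decomposition is that the structural uncertainty lives in S, the kinetic uncertainty in the form of v, and the parametric uncertainty in p, so each can be treated separately in the ensemble.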
If you take five random points from this ensemble, this is their fit to the data. So the idea here is that you don't get a single model, but rather a family of models, each differing in its parameter values, and all of them fitting the data reasonably well. For this particular task, what we want is to account for both structural and parametric uncertainty simultaneously by doing ensemble modeling; let's see how we do that. And we're going to use this to design experiments as well, which is the theme of today's seminar. As I mentioned earlier, we're going to look at the myeloid progenitor network, published by a group in Lund, Sweden. They proposed a network that looks like this, which essentially consists of all possible arrows you can put on three nodes. The question is: given this data, can you isolate the model? They did that already, and we're not going to repeat it. Essentially, with the experimental data they had, time-series data of normalized expression levels, they were able to fit the models and narrow down from the 30 or so candidate structures to 16 structures. So they could not narrow down to a single structure in this case, and this again, I would say, comes back to the underdetermined nature I mentioned earlier. So what do we do here? We decided to expand this a little to look not just at the structures, but also at the parameter values that each structure can take. We repeated the data fitting: given a structure out of the 16 possible, I do a global optimization to get the best parameters that, quote unquote, fit the data. And for each one of these, I then have a simple rejection criterion.
If the simulated profiles are within plus or minus three standard deviations of the data, roughly a 99% confidence interval, then we accept the structure; otherwise we kick it out. Yeah, so that's the first thing we did. For whatever structures remain, we perform ensemble modeling, looking at all possible parameter combinations that provide a reasonable fit to the data by the same plus-or-minus three-sigma criterion. Again, we use HYPERSPACE, and you can use REDEMPTION to generate this. From this, for each of the 16 structures, numbered one to 16, we get a family of parameters. What I'm showing here on this plot is the volume of each parameter space; that volume represents, for us, the uncertainty in the parameters given the data, whereas the 16 structures represent the structural uncertainty. The number of parameters varies between 19 and 25, depending on which edges you add or remove. Immediately, two of the 16 structures can be rejected, so we now have 14 structures to start with. Now, in the design of experiments, we're going to consider driving the system through erythropoietin (EPO), which is something you can do by putting the cells under hypoxic conditions. This is just to show that if you put the cells under 1% oxygen for different lengths of time, you can up-regulate EPO to different levels. For the design of experiments, we borrowed certain concepts from the work of Michael Stumpf at Imperial College. The idea here is to find the experimental design, the EPO level, that maximizes a utility. What is that utility function? The utility is something you can define, a metric that quantifies the informativeness of the data.
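The three-sigma acceptance rule combined with parameter sampling can be sketched as follows (a crude uniform rejection sampler standing in for HYPERSPACE-style exploration; the one-parameter linear model and the numbers are made up for illustration):

```python
import random

def within_bounds(pred, data, sigma):
    """Accept a parameter set only if every simulated point lies
    within +/- 3 standard deviations of the measurement."""
    return all(abs(p - d) <= 3 * sigma for p, d in zip(pred, data))

def parameter_ensemble(model, data, sigma, bounds, n_samples=20000, seed=0):
    """Crude stand-in for HYPERSPACE: uniform rejection sampling
    over a parameter box, keeping only acceptable fits."""
    rng = random.Random(seed)
    lo, hi = bounds
    kept = []
    for _ in range(n_samples):
        k = rng.uniform(lo, hi)
        if within_bounds(model(k), data, sigma):
            kept.append(k)
    return kept

times = [1.0, 2.0, 3.0]
model = lambda k: [k * t for t in times]   # toy linear model y = k*t
data  = [2.1, 3.9, 6.2]                    # toy "measurements", sigma = 0.3
ens = parameter_ensemble(model, data, sigma=0.3, bounds=(0.0, 5.0))
# every member of the ensemble fits the data to within 3 sigma
```

The retained set is the parameter ensemble: not one best fit but the whole region of parameter space consistent with the data, whose volume is the parametric-uncertainty measure shown on the slide.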
For us, this is borrowed from the Fisher information matrix, where we look at the covariance matrix: either one over the determinant of the covariance matrix, or the maximum eigenvalue of the covariance matrix; we tried different things here. The formulation looks like this; this is the mathematics behind it. There is a utility function that captures, again, the informativeness of the data for a given EPO level, and we integrate this over the posterior distribution given the new experimental data. The prior distribution is given here; these are the priors we get from constructing the model ensemble earlier. The way we solve this is to treat the terms in the argument of the integral as an augmented probability density function, using a surrogate density function called h, and then perform a Monte Carlo random walk; in this case we actually had better success with simulated annealing. To calculate the posterior distribution, we use a technique called approximate Bayesian computation (ABC), which is essentially also a Monte Carlo way to construct the posterior distribution. I don't have time to describe ABC in detail, but there is a very nice introduction to constructing posterior distributions with this method in a PLOS Computational Biology paper. Of course, with any Monte Carlo method there is a certain amount of noise to handle, and you only get nice asymptotic behavior. What you see from us here is also something we are still dealing with; by the way, this is still unpublished work, and we are still modifying a few things. We have to deal with issues regarding the noise and stochasticity of the objective function we formulate.
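Approximate Bayesian computation in its simplest rejection form can be sketched as follows (a generic toy example, not the authors' implementation: draw a parameter from the prior, simulate data, and keep the draw when the simulation lands close to the observation):

```python
import random

def abc_rejection(simulate, data, prior_sample, eps, n=50000, seed=1):
    """Minimal ABC rejection sampler: draw theta from the prior,
    simulate a dataset, keep theta when the simulation lies within
    eps of the observation. The kept draws approximate the posterior
    without ever evaluating a likelihood."""
    rng = random.Random(seed)
    posterior = []
    for _ in range(n):
        theta = prior_sample(rng)
        sim = simulate(theta, rng)
        if abs(sim - data) < eps:
            posterior.append(theta)
    return posterior

# Toy example: one observation 2.0 from Normal(theta, 1), with a
# Uniform(-5, 5) prior; the accepted draws should centre near 2.
post = abc_rejection(
    simulate=lambda th, rng: rng.gauss(th, 1.0),
    data=2.0,
    prior_sample=lambda rng: rng.uniform(-5, 5),
    eps=0.3,
)
mean = sum(post) / len(post)
```

Real applications replace the scalar comparison with a distance between summary statistics of time-series data, and shrink eps as far as the simulation budget allows; that trade-off is one source of the Monte Carlo noise mentioned above.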
Nevertheless, here is what we saw. On the x-axis I am showing the different EPO levels returned as solutions by the simulated annealing, and on the y-axis how many times the simulated annealing gave us that solution. You get a high frequency for the higher EPO levels, but unfortunately there is no single clear optimal experiment in this case. However, we did see that for an EPO level of 10 the information is presumably not that high, because the simulated annealing rarely ended up there.

For a proof of concept, we took one structure, structure 10, as the true structure and generated data from it; this is just an in silico test at the moment. We pick one of the optimal experimental conditions, the high EPO level of 730, simulate the model, and add Gaussian noise to generate the data. Then, using this new dataset, we do a second iteration of the model inference, applying the same steps as before, but now using not only the original data but also the new data. Let's look at how many of the structures remain. For the EPO level of 730, we can kick out quite a large number of the structures, with only three remaining, and it is encouraging that structure 10 remains, because that is the true structure we generated the data with. We also tested a different EPO level: the other most visited solution, 360. There, too, we were able to reject all but three structures, and again it is encouraging that structure 10 still remains, because we know in this example that it is the true answer. Now, if we were to use the EPO level of 10, the one we suspect gives us little information, we can only kick out two additional structures from earlier.
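The in silico proof of concept — simulate the assumed true structure, corrupt it with Gaussian noise, then re-apply the plus or minus three sigma criterion to every candidate structure's ensemble — can be sketched as below. Again a toy under stated assumptions: the data layout, function names, and the rule that a structure survives if any of its ensemble parameter sets still fits are my own simplifications of what the talk describes.

```python
import numpy as np

def synthetic_data(simulate_true, theta_true, sigma, rng=None):
    """Generate an in silico dataset: simulate the assumed true structure
    and add independent Gaussian measurement noise of scale sigma."""
    rng = np.random.default_rng(rng)
    clean = simulate_true(theta_true)
    return clean + rng.normal(0.0, sigma, size=clean.shape)

def surviving_structures(structures, data, sigma, k=3.0):
    """structures: dict name -> (simulate, list of ensemble parameter sets).
    A structure survives if at least one of its parameter sets still
    reproduces the new data within +/- k sigma at every time point."""
    keep = []
    for name, (simulate, thetas) in structures.items():
        if any(np.all(np.abs(simulate(th) - data) <= k * sigma)
               for th in thetas):
            keep.append(name)
    return keep
```

Iterating this — design an experiment, collect (or simulate) the data, prune the structures, rebuild the ensembles — is the loop that took the example from 14 candidate structures down to three.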
So there we still have 12 remaining structures. To summarize this structure identification exercise: with the optimal design of experiments, we were able to reduce from the 14 structures we started with down to three, whereas with a less optimal experiment you still have to deal with quite a large number of structures.

Okay, so that's all; this is really the take-home message. Sorry, I'm rushing things a bit at the end, but I hope I can keep to the 45 minutes. What I'm trying to convey in this seminar, something you can perhaps take home with you when you approach a network inference problem in your own work, is to think about uncertainty. There are many challenges in inferring biological networks beyond their under-determined nature, but I hope you will start to think about uncertainty when you build a model of your biological system. What I'm showing you here is that by embracing that uncertainty, you can actually use it to go back to your experimental partners and tell them what the best next experiments to do are. So what I'm showing is one strategy using ensembles to tackle the under-determined nature of the problem. The benefit, quote unquote, is that it is unbiased, although clearly this assumes that the model equations are correct in the first place, so even that is not fully valid. Nevertheless, by embracing this uncertainty, you get a nice framework from which you can also directly approach the design of experiments. There is a drawback, though.
The fact is that you need to track and identify a large number of models. In some applications, for example using traces, you can have a compact way to represent the ensemble, but that is not necessarily always the case. Looking at the literature, there are publications on ensemble modeling approaches; some of these are still, I would say, rather ad hoc, and some are a bit more organized. I hope more people jump onto this bandwagon of ensemble modeling, and that perhaps, as a community, we can come up with more tools to build and use ensembles as a viable way to address biological modeling.

Finally, last but not least, I need to acknowledge the funding agencies that pay the bills: we have received grants from ETH Zurich and also from the SNF. This is the group at the moment, as of August, so two months ago. I need to acknowledge Minhaas, who is working on the gene regulatory network design of experiments, and Luoyang and Erika, who are doing the myeloid differentiation, differential-equation-based design of experiments. With that, I thank you again for coming, for your kind attention, and for listening to me.