Hi everybody, and welcome to this eLife Symposium on Computational and Systems Biology. My name is Aleksandra Walczak, I'm a senior editor at eLife. We're here to showcase some really exciting work that's been going on in some sub-areas of computational and systems biology. Of course, even saying this is a huge field is a terrible understatement, because these couple of words encompass essentially all fields of biology, from neuroscience to evolution, from behavior to metabolism, and everything else I haven't mentioned. What these areas share is a common approach to how people tackle problems in biology and try to find answers, using theoretical, computational, and data-analysis methods as well as experimental approaches. We're going to have four talks today that I know will be very exciting, but I just want to say that in no way do they cover the breadth of this field. There's plenty that will not be covered, also on the technical side, such as image analysis, pure machine learning, sequence analysis, and genomics data, to name a few. By no means does that imply these are not part of the field; we just had to make some very hard choices. So with that, let me introduce our speakers today: Gautam Reddy, Kanaka Rajan, Erik van Nimwegen, and Hélène Morlon. They'll be covering topics from behavior and neuroscience to gene regulation and biodiversity, and hopefully they'll show us how we can actually go from a biological problem, through data, with theoretical ideas and computational approaches, to answering these biological questions. There are some participation guidelines. We want everyone to enjoy the symposium, but we also want everybody to behave the way they would in any other circumstances; if you want to find out more about that, you can read about it. The questions will be at the end of all the talks.
So please post questions as they come to mind while you're listening to the talks, but we'll be asking them collectively after the four talks, so hang around for the end. And if you have any problems during the symposium, please email an eLife staff member such as Maria Guerrero, who's here, or Shane Alston. So without further ado, let's move on to the first talk; let me stop sharing. Gautam Reddy is our first speaker, and he's going to talk to us about odor tracking. So take it away. Hi, Aleksandra, and I'd like to thank the organizers for putting this symposium together. I hope you can see the slides and the pointer here. So I'll get right into it. This talk is going to be about how animals track odor trails. What I mean by this is how animals track surface-borne odor trails, the kind that you would expect dogs, ants, or rodents to track. For example, here you see this beautiful photograph, a long-exposure shot of a dog tracking a pheasant trail. Basically, this bird has been dragged along this yellow line, and you can see the dog zigzagging around it. That's what we want to understand. The goal for this talk is just to give a flavor of what theory can offer in answering behavioral questions of this sort. Most of the details are in this preprint over here, and this is joint work with Boris Shraiman at the KITP and Massimo Vergassola at the École Normale Supérieure. So what do we know? It turns out that we know quite a bit about trail tracking, and this comes from training dogs to track trails and so on. But we know surprisingly little in terms of actual scientific experiments, and what little we know comes essentially from experiments with ants and rodents. What you're seeing here is a video of an ant tracking a pheromone trail. This is happening in the dark, so the ant has to use its antennae to track the trail. And what one can do with these videos is to use deep-learning methods to extract the body movements over time.
And that's what you see below here: the two antennae, the body, and the trail in black. As you can see, the ant is really good at tracking this trail; it's pretty much right on the trail. And it's not using some random strategy. It's actually using a criss-cross sampling strategy here, alternating between its left antenna and its right antenna, and we don't really know what exactly it's doing there. One can do similar experiments with rats. This is an experiment from Upinder Bhalla's group back in 2012. The idea here is that you have a rat on a treadmill whose speed is controlled, and you have a person drawing a trail by hand. This is a chocolate trail, and the rats are trained to follow these chocolate trails. You record the rat and the trail at the same time, and over the course of days the rats get pretty good at it; they're able to track the trail, as you can see here. And you see some interesting behavior. For instance, if you break these trails, the rats still track the trail, but they tend to exhibit these casting trajectories, oscillatory zigzag trajectories of increasing amplitude. I'll get back to these trajectories later on. Now, regardless of what the ant is doing here or the rat is doing there, any kind of trail-tracking strategy should require the animal to continuously reorient itself along the trail based on its past experience. This kind of decision-making problem is common to many behavioral problems which involve solving a task. Typically, you have some sensory input and the animal's self-motion cues. These are being transformed in some way; it's a black box that we don't know about and would like to find out about. They are transformed into some decision that the animal takes, and this decision, in turn, feeds back on the input. So there's some kind of active feedback control going on here.
Now, of course, this black box is going to be informed by whatever physical or computational constraints the animal is experiencing. So we'd like to understand both the black box and these constraints, as well as the sensory input that is most useful for the animal to perform this task. Trail tracking could in principle be arbitrarily complicated. The issue here is that you have a very long history dependence: the decision-making strategy of the animal can, in principle, depend on the entire history of the animal. But we are going to try to simplify this problem into its basic elements and hopefully identify some general principles. If you dig deeper into the problem, as I was saying, what you'd like to do is to follow the trail, so you want to figure out where the trail is headed. The simplest way of doing that is to extrapolate, given that you have two points. You can't do it with one point, of course, because you have to extrapolate a line; you can do it with more, but two is the simplest version. So what do you get when you have two? A clear answer is that the trail is most likely headed along the line joining these two points. But of course, it's not just the line joining these two points; you also have some uncertainty, and this comes from the fact that trails curve. Trails are not always straight: they can curve, they can break. So you have the most likely heading and you have some uncertainty, and this forms an angular sector. To give an analogy, imagine you're walking in the woods and you see a curved trail over here and you can't see beyond that, and someone asks you to extrapolate where the trail is headed. Of course, you would say that it's most likely headed along this direction, but there's some uncertainty because the trail could curve. And that's the idea.
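To make this two-point picture concrete, here is a small sketch in Python. This is not the model from the preprint; it simply assumes, for illustration, that the trail's heading wanders diffusively with some persistence length `lam` (a made-up parameter), so that the angular width of the sector grows with the separation between the two contact points:

```python
import numpy as np

def heading_sector(p1, p2, lam=10.0):
    """Extrapolate the trail heading from two contact points.

    p1, p2 : (x, y) contact points, with p2 the more recent one.
    lam    : assumed persistence length of the trail (same units as
             the points) -- purely illustrative, not from the paper.

    Returns (most_likely_heading, angular_spread) in radians. The
    spread grows as sqrt(separation / lam): the further apart the
    two contacts, the wider the sector the trail could be in.
    """
    d = np.asarray(p2, dtype=float) - np.asarray(p1, dtype=float)
    L = np.hypot(*d)
    heading = np.arctan2(d[1], d[0])   # line joining the two points
    spread = np.sqrt(L / lam)          # diffusive heading uncertainty
    return heading, spread

# Two contacts one unit apart along the x-axis:
theta, sigma = heading_sector((0.0, 0.0), (1.0, 0.0), lam=10.0)
```

With the contacts further apart, `sigma` widens, which is the "sector" the talk describes.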
But what this suggests is that you can break down this presumably complex trail-tracking problem into simpler parts. The idea is that every time you make contact with the trail, you soon lose contact with it again; you then search and try to reestablish contact with the trail, as you're seeing here. You execute some kind of trajectory, reestablish contact with the trail, and then repeat this process. So you can break down the full problem into these smaller chunks, and each chunk is a search task. In this reduced, simplified framework, we can ask more specific questions. For instance, how should the animal sample in order to reestablish contact with the trail? We can ask if there are limits on how fast an animal can track a trail, and of course there are going to be limits, because you can't track infinitely fast if the trail is curving; so there's a relationship between trail geometry and speed. And the third question, which is relevant for animals, is how do you integrate past information? How do you take into account the whole history of detections you have made? I won't focus on the speed question in the time that we have; I'll briefly touch on the first and third questions here. To answer the first question, we're going to use the reinforcement learning framework. It was developed some 30 years ago in computer science, inspired by neuroscience, of course. Reinforcement learning provides algorithms to solve the feedback control problem I described before. The idea is that you have an agent which interacts with its environment by taking certain actions. Every time it takes an action, it gets an observation back from the environment, and it also gets a reward. This loop forms a perception-action cycle, and this generates behavior. The agent simply chooses the actions that maximize some kind of long-term reward that is specified externally.
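The perception-action loop just described can be sketched in a few lines. This is a deliberately tiny stand-in, not the actual trail-tracking simulation from the talk: the agent occupies one of six positions on a line, the "trail" sits at position 0, and a reward arrives only on contact, which is the same sparse-reward structure the talk describes.

```python
import random
import numpy as np

# Minimal tabular Q-learning on a toy 1-D search task.
N, ACTIONS = 6, (-1, +1)            # positions 0..5; move left or right
Q = np.zeros((N, len(ACTIONS)))
alpha, gamma, eps = 0.5, 0.9, 0.2   # learning rate, discount, exploration

rng = random.Random(0)
for episode in range(500):
    s = N - 1                        # start far from the trail
    while s != 0:
        # epsilon-greedy action choice (the "decision")
        a = rng.randrange(2) if rng.random() < eps else int(np.argmax(Q[s]))
        s2 = min(N - 1, max(0, s + ACTIONS[a]))
        r = 1.0 if s2 == 0 else 0.0  # reward only on contact with the trail
        # Q-learning update from the observation/reward fed back
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
        s = s2

# The learned greedy policy walks toward the trail from every state:
policy = [int(np.argmax(Q[s])) for s in range(1, N)]
```

The point, as in the talk, is that nothing about the solution is hard-coded; the agent discovers it purely from the reward signal.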
So in the trail-tracking problem, for instance, what we're going to do is simply ask the agent to reestablish contact with the trail once it has lost it, and we're going to give it a reward whenever it finds the trail again. What you're going to see here are the trajectories that come out of this learning simulation. The agent starts from the origin, and you have to imagine that it's going from left to right, and that the trail is somewhere along a cone within this region; the distribution of angles is given by this prior distribution. You see the learned trajectory in blue. The unlearned trajectory initially is completely random; it just wanders off. But as you can see here, the agent learns to perform these oscillations, and the posterior turns into a bimodal distribution which shifts between left and right. These oscillations persist until the agent finds the trail. You can do something similar with two sensors. The real power of reinforcement learning is that it's completely model-free: you set up your simulation, you give the agent a reward whenever it succeeds, and you just run it over many trials. So you're not imposing much of your own structure on the problem, and it's able to learn new strategies. You can do the same thing with not a single sensor but two sensors, similar to an ant, and that's what you see here. The trajectory is quite different from the single-sensor case: instead of the oscillations, what you see is this criss-crossing strategy. And if you stitch these trajectories together on an actual simulated trail, concatenating all these subunits, you can see that it qualitatively reproduces the kind of behavior that we saw in experiments. For the single-sensor case, it follows the trail over here; it's pretty much exactly on the trail.
And once it loses the trail, it performs this casting trajectory; in the two-sensor case, it alternately uses the left and the right sensor. The intuition behind this strategy is quite simple. You have a sector, so you have to search over some angular space, and it turns out that the most efficient way of searching over this angular space while going forward at the same time is, for a single sensor, to do this casting motion and, for two sensors, to do this criss-cross sampling. This rationalizes the behaviors that we saw. I'm not going to go into the details of how past information is integrated, but one can ask: if you have multiple contacts with the trail, how do you integrate this information in order to find where the trail is headed? You can use ideas from statistical physics to do it. The idea is that you have a stochastic model of the paths that would go through these various points. The key point I'm trying to make here is that when we write down these models, we get an effective description of what the sector looks like, given the distance between points of contact and the history of what the animal has detected. This is going to inform how the animal behaves, which we can measure experimentally, and that will tell us something about what the animal is actually doing. I'd like to end by saying a little bit about the general approach here, which I think is relevant for behavior, because it's much easier to collect data than to interpret the data. You can split the problem into: what is the problem, what is the solution to the problem, and how is the solution implemented? We are squarely in the first two places, the computational and algorithmic aspects. What we'd like to do is build theories to explain the computational and algorithmic aspects of behavior, and hopefully generate some feedback with experiments.
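As an aside, the "integrate multiple contacts" idea can be sketched as a posterior over the trail's heading. The stochastic path model below is a deliberate simplification of the one in the talk: here the trail is a straight ray from the origin at angle theta, with contacts sitting off it with Gaussian lateral noise `sigma` (the real model lets the trail curve), but it shows how several contacts sharpen the sector.

```python
import numpy as np

def heading_posterior(points, sigma=0.3, n_grid=721):
    """Grid posterior over trail heading given contact points.

    Toy model (an assumption, not the paper's): trail = straight
    ray from the origin at angle theta; each contact lies off the
    ray with Gaussian perpendicular noise sigma.
    """
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_grid)
    pts = np.asarray(points, dtype=float)
    # unit normal of each candidate ray, shape (n_grid, 2)
    normals = np.stack([-np.sin(thetas), np.cos(thetas)], axis=1)
    d = pts @ normals.T                    # signed perpendicular distances
    log_post = -0.5 * np.sum((d / sigma) ** 2, axis=0)
    post = np.exp(log_post - log_post.max())
    return thetas, post / post.sum()

# Three contacts scattered around a trail headed roughly 30 degrees up:
contacts = [(1.0, 0.6), (2.0, 1.1), (3.0, 1.8)]
thetas, post = heading_posterior(contacts)
best = thetas[np.argmax(post)]             # most likely heading
```

The width of `post` plays the role of the sector's angular uncertainty, and it narrows as more contacts are added.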
And ultimately figure out what neural circuits actually implement these algorithms. Thank you. Thank you so much, Gautam. Kanaka, would you like to share your screen? Okay, wonderful. Thank you so much for having me. I will try to conclude on time, but please... Okay, sorry, I actually forgot to introduce you. Our next speaker is Kanaka Rajan, and we're going to move more into neuroscience, with neurons this time. Okay, yeah, no worries. Thank you for having me. I understand that many of us haven't done these 10-minute talks since our days in physics, so I will do my best. All right. In life, we're constantly evaluating actions, and Gautam set me up perfectly for this: we evaluate whether those actions are worth the effort or not. If actions are repeatedly fruitless, ultimately you become kind of dejected, and in the extreme this can manifest as hopelessness, also known as learned helplessness in the depression literature. This kind of phenomenon is seen in many nervous systems, and it involves essentially a response to persistent and inescapable stress, as perceived by the animal. It's seen in smaller nervous systems like the larval zebrafish and in larger nervous systems like the mouse, the rat, humans, everywhere. So one of the questions we can ask as theorists is: are there circuit mechanisms that are conserved as you go from these smaller nervous systems to larger ones, and can we identify key divergences? One of the ways we approach such a problem is to build computational models. In my lab, we build neural network models that are constrained directly by experimental data, and I'm going to show you an example of that in a minute. We analyze, or reverse-engineer, these data-constrained models, and the idea is to infer circuit mechanisms that are inaccessible to measurement alone. So let's look at one of these examples in detail.
I've written a few papers on the subject, so you can always ask me about them at the end if you like. The experiment I'm going to be talking about today involves larval zebrafish, and it's an ongoing collaboration between my lab and Karl Deisseroth's lab; all the data I'll be presenting were collected there. Larval zebrafish are exposed to stress over a prolonged period of time, around half an hour. When the stress first comes on, and these are very mild electric shocks, the fish whip their tails trying to get away. But their heads are fixed, so they can't get away, and it's an open-loop experiment: there's no correlation between their escape movements and the shocks coming in. Eventually, the fish go from active coping to the state you saw in the cartoon before, where they stop struggling entirely. That phase transition is the passage from active coping to passive coping, and I've quantified it here for you. In blue are the shocked fish's tail whips as a function of time, and in black the control fish's. You can see those two epochs playing out in this pink bar here: an active coping phase, which involves an increase in this vigorous evasive movement, followed by a phase where it decreases below the control level. Now, the Deisseroth lab is expert at monitoring brain-wide activity, and that's what they do in the larval zebrafish system: they image individual neurons across the fish's brain, in fact the whole brain we have access to, and record activity over this entire half-hour to 45-minute experiment. So of course this raises the question: what actually happens in this neural activity to cause the behavioral transition that you see, both in the cartoon and in the quantification to the left of this movie? So we return to our approach, right?
We're going to answer the question of what brain-wide mechanism mediates active-to-passive coping in larval zebrafish, using our approach of building neural network models constrained by data. The first thing we do is go from a single-region recurrent neural network model to multiple regions. When I refer to RNNs in this talk, I mean recurrent neural networks where neurons connect to one another bidirectionally, with both feedforward and feedback connections. The only thing we do here is wire multiple such networks together, one for each area of the larval zebrafish brain we're talking about. These types of models are simple and interesting because they have a mathematically tractable parameter set that controls how they behave. In the single-module RNN, it's this connectivity matrix, also known as the matrix of directed interactions. In the multi-region case, this interaction matrix has an interesting shape, because it can now have block structure: within-area connections going down the principal diagonal, and across-area interactions in the various off-diagonal blocks. This is a feature we'll return to in a second. The second thing we do, as I hinted at in the beginning of my talk, is to train the activity, the activation of each unit, directly to match experimental data. This is not a realistic plasticity mechanism or anything; it's a trick to get these types of artificial systems to behave like the neural system. So these are the two things we do. And what do we get out of such a procedure? We get a multi-region RNN in which, once it's trained, every unit behaves exactly like the experimental data it was trained to match. This by itself is not super surprising: people who work with neural networks like these know about universal approximation, so they'll recognize that this is not that surprising.
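The multi-region architecture just described can be sketched in a few lines. Everything here is illustrative: the region names and sizes are placeholders, the interaction matrix J is random rather than trained to data, and the dynamics are a generic rate model, so this only shows the block-structured wiring, not the lab's fitted networks.

```python
import numpy as np

# Three rate-model "regions" wired through one block-structured
# interaction matrix J (within-area blocks on the diagonal,
# across-area blocks off it).
rng = np.random.default_rng(0)
sizes = {"habenula": 20, "raphe": 20, "telencephalon": 20}  # placeholder sizes
N = sum(sizes.values())
g_within, g_across = 1.2, 0.3       # stronger within- than across-area coupling

J = np.zeros((N, N))
bounds = np.cumsum([0] + list(sizes.values()))
for i in range(3):
    for j in range(3):
        g = g_within if i == j else g_across
        blk = rng.normal(0.0, g / np.sqrt(N),
                         (bounds[i + 1] - bounds[i], bounds[j + 1] - bounds[j]))
        J[bounds[i]:bounds[i + 1], bounds[j]:bounds[j + 1]] = blk

# Simulate dx/dt = -x + J @ tanh(x) with simple Euler steps:
dt, T = 0.1, 200
x = rng.normal(0.0, 0.5, N)
rates = np.empty((T, N))
for t in range(T):
    x = x + dt * (-x + J @ np.tanh(x))
    rates[t] = np.tanh(x)
```

In the talk's actual pipeline, it is the unit activations that are trained to match recordings, and J is what you read out afterwards.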
The surprising thing is our ability to infer from these models the matrices that give you the interactions within and across regions, by reverse-engineering the trained models. Finally, and this is the thing I'm going to tell you about on the next slide, the product of those two objects gives you the currents due to recurrence, both within and across areas. These last two things are quantities that you cannot get from measurements alone, and that's where the power of these types of models lies: they provide a platform for us to better leverage existing data and to design experiments that test predictions made by these models. So let's look at the third one in detail. Fitting these types of models to time-varying data gives us this matrix, where along the diagonal are the within-area connections, their directionality, magnitude, and type, and off the diagonal are the across-area connections, for example from region B to region A, or from region C to region A. Now, the dot product of those two objects gives you the currents due to recurrence, and that's why we call this process current-based decomposition, or CURBD for short. That's what we end up doing in the larval zebrafish system. We've done this in the whole brain, and we're analyzing those trained networks now, but today I'm going to show you a three-region model: one region that looks like the habenula, one that looks like the raphe, and one that looks like the telencephalon. These are three regions in the larval zebrafish system that are historically known to mediate this type of behavior, inherited from the mouse literature, where a lot of this work has been done previously. We're doing exactly the same exercise I told you about before: we're training every unit in this model to match data collected from the respective region, and from that we infer this connectivity matrix.
The product of those two objects gives you currents, and depending on how you take the dot product, you can get currents within a region as well as across regions. So what we're going to do now is decompose the activity of the habenula into these component currents. Instead of one N-by-time matrix giving you the outputs of the habenula, you can look at the sources of this activity, decomposed into within-habenula interactions, currents into the habenula inherited from the raphe, and currents inherited from its interactions with the telencephalon. So essentially, by doing this decomposition you get one within-area current and two between-area currents, which you could not have gotten from measurements alone, because you needed the matrix that came from fitting the RNNs in the first place. Now that you've decomposed this N-by-time matrix into three of them, the sum of the three still gives you the recorded output, but now you can do dimensionality reduction or state-space analysis in the current space. That's what I do here. I derive a coordinate system by doing principal components on these three currents, which gives you this coordinate system with habenula-to-habenula currents in blue, raphe-to-habenula currents in red, and telencephalon-to-habenula currents in yellow. Then I take the recorded outputs and project them into this new coordinate space. In addition, I've marked the timing of the various electric shocks given to the fish as dots. The warm colors are the early part of the experiment, which corresponds to an increase in the activity of the habenula, and the later shocks are in cold colors. What we see here is a separation of time scales.
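The decomposition step just described can be sketched as follows. This is a schematic of the idea as presented in the talk, not the lab's code: the interaction matrix J and the activity r here are random stand-ins for the fitted quantities, and the region sizes and ordering are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 20, 100                        # units per region, time points
regions = ["habenula", "raphe", "telencephalon"]
J = rng.normal(0.0, 1.0 / np.sqrt(3 * n), (3 * n, 3 * n))  # stand-in for fitted J
r = rng.normal(0.0, 1.0, (3 * n, T))                        # stand-in for activity

hab = slice(0, n)                     # habenula units come first (by assumption)
currents = {}
for k, name in enumerate(regions):
    src = slice(k * n, (k + 1) * n)
    # current into the habenula sourced from region `name`:
    currents[name] = J[hab, src] @ r[src]        # shape (n, T)

# The source currents add back up to the total recurrent drive:
total = J[hab, :] @ r

# State-space view: leading temporal principal component of each current
def top_pc(c):
    c = c - c.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(c, full_matrices=False)
    return vt[0]                       # unit-norm temporal trajectory

trajectories = {name: top_pc(c) for name, c in currents.items()}
```

The point is that the split into one within-area and two between-area currents needs the matrix J; the recorded activity alone only gives you their sum.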
So you see that, rather than activity being concentrated within the habenula, the early part of the shocks, active coping, is mediated by rotations in the current space external to the habenula, specifically from the raphe into the habenula, and it's only later that the habenula-to-habenula and telencephalon-to-habenula currents come on. This point of view gives us something that the traditional point of view doesn't, but I was taught never to trust a 3D plot, so let me show you what these quantities look like as a function of time. On top you see the activity derived from the exact same analysis, over time, for control fish. These are models that have been fit to individual fish that have not received shocks, so there are two sources of variability here: first, the initialization of these matrices, and second, variation across individuals. In black are the tail whips, which I've made into a continuous trace just to give your eye something to look at. In blue is the habenula-to-habenula current, in yellow the telencephalon-to-habenula current, and in red the raphe-to-habenula current. In control fish the three currents seem to mirror one another, but if you look at the exact same plot for the shocked fish, you observe two things. One is that the movement, in black, goes up and then down to zero, indicating the active-to-passive coping transition, as we expect. But here there's an early ramp that is driven by the raphe-to-habenula currents, and it's only later that the habenula-to-habenula and telencephalon-to-habenula currents start to ramp up. This is conserved across individuals, and we're now probing it through causal experiments in our collaboration: the prediction here is that if you eliminate this current, passive coping should be either delayed or abolished. So this is the kind of prediction that a study like this can make for you. Kanaka, I'm sorry to interrupt, but we really need to wrap up and move on to the next talk. Right, we're nearly done.
So this, basically, is something we're now trying to publish as a powerful alternative to the traditional ways of looking at neural activity: correlating it with behavior, sorting it, averaging it, or looking at it in principal component space. This type of decomposition is a much more powerful alternative, or at least a complement, to the traditional point of view. That is all we wanted to say here; these are just the conclusions, and with that I would like to say thank you for having me, and thanks to our funding sources for their support of our ideas. Thank you. Thanks so much, Kanaka. So let's move on to our next speaker, Erik van Nimwegen, if you could share your screen; we're shifting topic quite a bit now, to gene regulation. Thanks, Aleksandra, for inviting me to take part in this. I would like to start my talk by reminding you that gene regulatory networks are, in a sense, responsible for the intelligence of cells. They allow microbes to respond to changing environments and adapt to them; they allow multicellular organisms to express a single genome into a sort of society of diverse cell types. And for 20 years now, we've also known that in bacteria the regulatory capacity grows quadratically with the size of the genome: the bigger the genome gets, the bigger the fraction of the genome that is devoted to these regulatory networks. The question I want to focus on today is: how does gene regulation evolve de novo?
So typically, in bacteria, a transcription factor responds to some external signal and then up- or down-regulates a set of target genes by the appropriate amounts, depending on the strength of the signal. But if you think about it, quite a few things would have to happen in parallel in order to evolve such regulation de novo: you need a regulatory protein to become responsive to the strength of a particular signal; then all the appropriate target genes have to evolve binding sites for this regulator, and no other gene should; and then you also have to tune the strength of these target-regulator interactions to produce the appropriate induction levels. So you may ask: how does evolution accomplish this so readily? I want to share with you some insights that I think we gained about this question from analysis of gene expression noise in E. coli. For about a decade now, people have been using fluorescent reporters, where you stick a promoter sequence in front of a fluorescent gene, either on a plasmid or on the chromosome, in combination with either flow cytometry, which is what we used, or microscopy, to measure single-cell distributions of gene expression on a genome-wide scale. Here is an example: what you get for most genes is that the log expression is roughly Gaussian distributed, so you can characterize the expression distribution of a single gene by the mean and the variance of its log expression. All right, now in this plot here, all the black dots correspond to native promoters from E.
coli, with the mean expression along the x-axis and the noise of all these native promoters along the y-axis. Now, in this project, what we did is evolve random synthetic promoters, by taking a large library of random sequences of about 100 to 150 base pairs and screening from them those sequences that drive expression at similar levels as native E. coli genes. To our surprise, what we observed is that the noise levels of these synthetic promoters, which I'm showing you here in red, are generally lower than the noise levels of native promoters. Now, we know that these synthetic promoters were not selected for their noise properties, so what this shows is that if you just take random sequences that express in E. coli, their default behavior is to have low noise. This means that the noisy native promoters we see must have been under some pressure of natural selection that increased their noise levels. So this raises the question: what is special about these noisy native promoters? To make a long story short, what we found is that the more noisy a promoter is, the more regulatory inputs it tends to have. Here we sorted promoters from low to high noise, and you see that as you go to higher and higher noise, the average number of regulatory inputs that these promoters have, as known in the literature, increases systematically. Our interpretation of this is that the noise in these native promoters comes from the propagation of noise from regulators to their targets, so that the noise of a gene in a sense reflects how noisy its regulators are. Now this raises the next question: why is it like that? Is this just an unavoidable cost of gene regulation, such that cells don't like this noise but simply cannot avoid it, or is there maybe an evolutionary benefit to this noise propagation? To look at that, we developed a general theory of how this noise propagation affects the fitness of genes that are asked to
do certain gene regulation. To give you a flavor of this theory, this cartoon tries to show what happens when you couple a gene to a regulator. We're imagining that cells go through three environments: a red, a gold, and a green environment. These little shaded regions show that, in order to survive and grow, the gene has to express in this range in the red environment, in this range in the gold environment, and in that range in the green environment. Now, if you start with a gene without regulation, it might have an expression distribution across single cells that looks like the blue distribution. This shows you that in the gold condition this gene is doing very well, because its distribution of expression levels overlaps well with what the environment asks, but in the green and red conditions it does very badly, and the organism will not survive, because none of the cells are expressing at the appropriate level. Now, ideally there would be a transcription factor in the genome that is upregulated in the green condition and downregulated in the red condition, and if this gene now evolves a binding site for that regulator, its expression in these different conditions will look like this, and so will its fitness. What is important to realize is that coupling the regulator to the target has two effects. First, there is the condition response: the mean of the target becomes dependent on the mean of the regulator that regulates it, and this is of course how one normally thinks about gene regulation. But second, because the regulator itself has some noise in each environment, that noise is propagated to the target, which thereby increases its own noise. So we worked out a general theory of how this affects the fitness of coupling a regulator to a gene, and it turns out that, under some fairly weak assumptions, you can show that it depends on only four effective parameters. One is the variation in the desired levels of the target gene, sort of a
measure on how unhappy the unregulated gene is second the signal to noise in the regulator how much the regulator varies across conditions versus how noisy it is in each condition how well the levels of the regulator correlates with the desired levels of the targets in this condition and finally the coupling standard and in terms of these you can really get an analytic expression of the fitness effect of coupling a regulator to a target to explain to you what this does this shows you what would happen if you take an unregulated gene that starts out quite unhappy and coupling to a regulator as a function of how noisy the regulator is so from low signal to noise to high signal to noise and how well the expression levels of the regulator across conditions match with the desires that this target gene has going from no correlation at all to perfect correlation obviously the best you can have is if you find a transcription factor that has high signal to noise and that almost perfectly correlates with the desires of this target gene if you couple to such a transcription factor you will have the highest possible fitness and basically this is the standard condition response effect right that the regulator is pushing the target up precisely the way it wants but if you take instead a regulator that is very high noise and it doesn't correlate at all with the desires of the target you find that you also much increase the fitness for this target gene but here the origin of this increase in fitness is not regulation in normal sense basically the target gene uses the regulator as a source of noise to implement a bad hatching strategy that already makes its fitness much better but the crucial insight is now that there is a continuum of solutions that interpolates between these regulatory strategies so there is an essential you can have regulatory strategies anywhere in this plane that combine this condition dependence with noise propagation and so this line here shows you the sort 
of optimal combinations of noise propagation and condition dependence, and that you can start your gene regulation by coupling to a regulator that only increases your noise. Can you wrap up, Erik? Time's up. So, to summarize what I've tried to show you: we showed that noise propagation is functional and can act as a rudimentary form of regulation; it allows regulatory strategies that smoothly interpolate between signal and noise; and it allows a crude regulator, combined with noise propagation, to obtain a fitness benefit that is close to the optimum you could achieve. And it looks like E. coli is implementing such noisy regulatory strategies. Alright, thank you very much. So thanks so much, and let's move on to our last speaker, Hélène Morlon. Hélène will be talking about biodiversity. Yes, thank you for organizing this and for having me here and giving me the opportunity to present some of the research we're doing in my group, where we develop computational approaches for understanding the deep-time evolution of biodiversity, if I can move on one slide. So we're interested in understanding how biological diversity evolved on geological time scales, to explain patterns of species and phenotypic richness as we see them around us today. To give you specific examples, we're interested in big overarching questions: why some species groups are more species-rich than others, why some regions of the planet are more species-rich than others, and why phenotypic diversity is richer in some species groups than in others. The overarching approach we take to address these questions is to develop stochastic models, and there are two types of stochastic models that we, and other people in the field, are developing. To give you an easy example with the simplest model, we can think about a model of diversification with constant rates of speciation and extinction. The underlying process is just that of a clade that starts with an ancestral lineage and
diversifies with speciation rate lambda and extinction rate mu, and we can develop statistical approaches to fit these types of models to phylogenetic data, that is, phylogenies representing the relationships between extant species. The type of results we get are parameters that are biologically relevant and interesting for us, such as rates of speciation and extinction, and by comparing different variants of these models we can also compare their relative statistical support, and so in the end get an ability to compare different modes and tempos of species diversification. The same principle applies to studying how phenotypes evolve to produce current patterns of phenotypic diversity. If we think about modeling quantitative traits, we can use stochastic diffusion models, where diffusion processes run on phylogenetic trees. So, same thing: one of the simplest examples is a Brownian process that we run on a phylogenetic tree, determined by the diffusion coefficient sigma. If we fit this model to data, which in this case combines the phylogenetic relationships between extant species with the phenotypes that we can characterize today, we can get estimates such as the diffusion rate, which we can interpret as a rate of phenotypic evolution, and we can compare the support of different types of models and draw some conclusions, or at least some ideas, about which modes and tempos of phenotypic evolution are the most likely to have occurred given the data we observe today. Because this is obviously a pretty broad field, what I did was to pick a couple of examples of models that we can develop, with applications to data, to show you the type of questions that we can address in macroevolution. If we develop models where the rates of speciation and extinction can vary through time, we can try to get an idea of how these rates changed through time, and potentially obtain an estimate of how species richness has varied in the deep past
from data on extant species. Here, for example, applying these approaches to cetacean phylogenies, we see one case of a clade where the speciation rate may have declined through time while the extinction rate remained constant, and another case where the opposite occurs, where the speciation rate may have been constant through time while the extinction rate increased, generating patterns of diversity that first increase through time and then decline, patterns that have also been observed in the fossil record. Hélène, may I just ask you if you could speak a little bit louder, please? Yeah, sorry, I hope you could hear so far; I wonder if my volume maybe is not high enough. Is that better, like this? Yeah, if you could just try to speak a little louder. So, similarly, we can develop models where rates vary not only through time but also across lineages. Here is an example where, at each speciation event, we draw a new speciation rate from a distribution that is centered on the ancestral rate, with a potential temporal trend, and with a variance given by another parameter that controls how heritable the rates are: low sigma values model speciation rates that are strongly heritable across lineages, and large sigma values ones that are much more labile. We can also develop inference machinery that allows us to estimate the hyperparameters of the model, which give us an idea of the overall trend in diversification rates through time and of the variability of the rates across lineages, and we can also get estimates of the speciation rate for each branch in the phylogeny. What we have been able to show using this model, fitting it to the bird radiation, is that there can be several orders of magnitude of difference between diversification rates across lineages, even within fairly constrained taxonomic groups. These types of approaches can also be useful for trying to understand not only how these rates vary across
phylogenies, but also what drives these variations. We can try to find correlates of diversification rates with a lot of different variables, from the ecological characteristics of species to their biogeographical characteristics. Here is an example where we looked at whether we could find an association between species-level genetic diversity and speciation rate, with the expectation that we might see a positive relationship, whereby genetic diversity provides material for speciation to occur. What we see is the opposite trend: we found a consistent negative association between genetic diversity and speciation rate across mammals, which we interpret as an effect of fast diversification of lineages limiting the amount of genetic diversity that can be accumulated by species, rather than an effect of genetic diversity on speciation rates. We can also develop models that link past speciation and extinction rates to past variations of the environment, for example past temperature, for which we can get estimates. Using these models on empirical data, for example for body-size evolution, we have found faster rates of body-size evolution during cold climatic periods across birds and mammals, and in diatoms we have found contrasting effects of different environmental variables in different diatom groups, where the main environmental driver of diversification varies across groups and potentially the direction of its effect can vary as well. Another class of models we have been pushing forward are models for testing the potential effect of interspecific interactions on diversification and trait evolution, and I will give two examples of the type of results we can get here. In one example, we tested the strength of interspecific competition on traits that are used in particular for resource use.
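As a toy illustration of the environment-dependent diversification models just described, the sketch below simulates the number of lineages in a clade under a birth-death process whose speciation rate depends on a temperature curve. This is only a minimal sketch, not the group's actual inference software: the linear cooling curve and the exponential link between temperature and speciation rate are assumptions made up for this example.

```python
import math
import random

def temperature(t):
    # Hypothetical cooling trend over 40 Myr (illustration only).
    return 4.0 - 0.1 * t

def speciation_rate(t, lam0=0.1, alpha=0.15):
    # Assumed exponential dependence of speciation rate on temperature.
    return lam0 * math.exp(alpha * temperature(t))

def simulate_diversity(t_max=40.0, mu=0.05, seed=1):
    """Gillespie-style simulation of the lineage count through time.

    Rates are treated as constant between events, a reasonable
    approximation when they vary slowly relative to the event rate.
    """
    random.seed(seed)
    t, n = 0.0, 1  # start from a single ancestral lineage
    while n > 0:
        lam = speciation_rate(t)
        t += random.expovariate(n * (lam + mu))  # waiting time to next event
        if t >= t_max:
            break
        if random.random() < lam / (lam + mu):
            n += 1  # speciation event
        else:
            n -= 1  # extinction event
    return n

print(simulate_diversity())  # number of extant lineages at t_max
```

Fitting such a model to a real phylogeny would of course require a likelihood rather than forward simulation, but the forward process makes the logic of the waxing-and-waning diversity patterns easy to see.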
Compared with traits that are involved more in social interactions, we found a stronger effect of interspecific interaction on trait evolution for the traits involved in resource use. And on the right, we tested how the effect of competition varies with latitude, addressing a hypothesis that has been put forward as an explanation for species richness in the tropics, namely that species interactions are stronger in the tropics and tend to spur diversification. Testing this hypothesis using this class of models, where we are able to estimate the strength of interspecific interaction in tropical versus temperate species, we do not find such an effect. We will need to wrap up whenever you can, please. Yes, this will be short. A last example shows that these types of models of phenotypic evolution can be extended to high-dimensional phenotypes, such as morphometric data. To wrap up, I will just say that these kinds of computational tools, stochastic models fitted to phylogenetic and phenotypic data, allow us to improve our understanding of how biotic and abiotic factors have shaped diversification and phenotypic evolution. We have made these tools available to the community through software packages, and with that I will wrap up by thanking everybody and my funding sources. Thanks for listening. Thank you very much, Hélène, and thank you very much to all the speakers for these very stimulating talks. So let us now go into the question session. The first question is for Gautam: are there examples of animals where you can effectively remove the sensor and then show that the strategy shifts in a way predicted by reinforcement learning? So the answer is yes to the first and no to the second question. Yes, people have clipped an antenna off an ant, and it freaks out, though eventually it learns; but we haven't really tried that in our own work. Alright, thank you.
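As a toy illustration of the kind of reinforcement-learning model this answer refers to, the sketch below trains a tabular Q-learning agent to stay near a one-dimensional odor trail. Everything here is an assumption made for illustration, not the model from the talk: the 1-D world, the binary odor sensor, and the reward scheme are all invented for this example.

```python
import random

ACTIONS = (-1, +1)  # step left or right; the trail sits at position 0

def sense(pos):
    # Crude binary odor sensor: detects the trail only within one step of it.
    return 1 if abs(pos) <= 1 else 0

def train(episodes=2000, steps=30, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning; the state is (current odor reading, previous action)."""
    random.seed(seed)
    q = {}  # maps ((odor, previous action), action) -> estimated value
    for _ in range(episodes):
        pos, prev = random.randint(-3, 3), random.choice(ACTIONS)
        for _ in range(steps):
            s = (sense(pos), prev)
            # epsilon-greedy choice between exploring and exploiting
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q.get((s, x), 0.0))
            pos += a
            r = 1.0 if pos == 0 else 0.0  # reward only when exactly on the trail
            s2 = (sense(pos), a)
            best_next = max(q.get((s2, x), 0.0) for x in ACTIONS)
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (r + gamma * best_next - old)
            prev = a
    return q

q = train()
print(sorted(q))  # the (state, action) pairs the agent has visited
```

Even in a setup this crude, the interesting regime is the "odor lost" states, which is exactly where the distinction between within-trial learning (estimating where the trail is) and across-trial learning (how to search when it is lost) shows up.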
So the next question is for Kanaka: how do you know that this problem is identifiable, that you cannot get equally well-fitting models that have very different direct interactions between regions? It is a totally fair question; this is the sort of degeneracy question that I suspect a biophysicist is asking. The more data we have to constrain these models, the smaller the dispersion of the solution space. What I mean by that is that in a system like the larval zebrafish we have between 10,000 and 40,000 neurons, which means we're building very large networks, so there the dispersion, or the sensitivity to initial conditions, is much less than in a system like the macaque or the human, where there will be sensitivity to initial choices. Added to that, we're simplifying away a lot of the complexity of real biology, like the influence of glial cells, other cell types, and neuromodulatory influences. What we're capturing is the matrix of connectivity that gives you the full dynamical system, so it is better than the correlation matrix of the outputs, but it is not capturing the full complexity, which means we could be inferring indirect or disynaptic interactions, or even neuromodulatory influences, as direct connections. So yes and no is the answer to your question. Okay, the next question is for you, Erik: does the model work for independent gene networks or for interdependent ones? Yeah, you're right, that's why I didn't understand the question at first. And in a network system, how would you define noise?
Okay, so these are two different questions. The answer to the first question is yes, to an extent, although we're looking at the effect of adding one regulatory interaction at a time. You imagine the network is working in some way, and then you say, well, let's put in an extra binding site that couples this gene to a regulator that already exists in the network, and depending on how that regulator behaves, and on what the fitness function is for the expression of that gene across conditions, this will have some effect on the fitness of the gene. So in that sense it also works for networks. What do I mean by noise in a network context? I don't think I totally understand this question. There is just cell-to-cell variability in the levels of transcription factors, and also in the binding and unbinding, which is stochastic: there is thermal noise in the cells, and all these molecules are doing Brownian motion. So I also don't know exactly what the question was, but we could imagine the question being: how do you combine the noise of various genes into a fitness function? Ah, so basically, the assumption we make here is that, in a given environment, the global survival, death, and reproduction rates of a single cell form a multivariate peaked function of the expression levels of all the genes it has. So basically, if you assume that you can characterize each gene by an optimal level in this environment and a standard deviation over which fitness is affected, then the model goes through. Okay, thanks. Our next question is for Hélène: when it comes to assessing diversity, is phylogeny more beneficial than functional characteristics of the species? So I guess it depends on the question we're interested in. If we're interested in understanding ecosystem function, obviously having the functional characteristics of the species is going to be more useful, and in that case phylogeny is sometimes used as a proxy
when we cannot measure functional characteristics. When we're interested in questions such as the ones I presented, looking at past dynamics of diversity, then this type of inference is done with phylogenetic data. Okay, thanks. So let's go back to Gautam: how much of a role does learning from past experience play, how much better does the animal get at tracking with experience, and can you incorporate this into your models? Learning does matter, right? In the rat experiments, they learn over days to do it well, though of course you don't know whether they're learning the task itself rather than how to track. But for dogs, for sure: veteran trail-tracking dogs are much, much better than novices. Incorporating this into models is quite hard, because when you train a reinforcement-learning model, it is a learning process, but it may not really reflect how the animal actually learns. Maybe a follow-up here: in the examples you're giving, is it just long-term learning, from one experience to the next, or does the short-term learning that happened a few seconds ago, in the same trial, also matter? Do we know? In trail tracking, both matter: learning within a trial is about estimating where the paths are, whereas learning across trials would be about how you search for and estimate the different paths over the long run. Okay, thanks. So the next question is for Kanaka: would you comment on how you plan to perturb the patterns you see? These regions have a mix of cell types at many levels, including the simplest one, which is a mix of excitatory versus inhibitory cells. I know you can do it optogenetically, but how do you know what cells to target? Tricky question; I honestly don't know at this point. The one thing we can make a prediction for is blocking feedback from all cells into the raphe, or from all raphe units into the habenula. The advantage of using this kind of
current-based decomposition is that the matrix comes with positive and negative elements, so rather than putting in cell types up front, we can infer the excitatory versus inhibitory currents just by looking at the model fits, and they don't necessarily have to be symmetric; the matrix doesn't have to turn out to be symmetric. This was a zeroth-order exercise for us, so we're waiting for them to come up with a refined experimental design, and then we will design experiments and add more complexity in terms of sparse inter-area connectivity, excitatory and inhibitory cells, and so forth. A third point is that we have fish that do not lapse into passivity because they have been exposed to ketamine, so in those fish the prediction, if I am right, is that this feedback is eliminated, and that is something that we do find. But, as you had asked in your first question, I imagine we are also inferring indirect connections as potentially direct connections. Still, given the complexity of real biology, I think we have a powerful method in our hands. Okay, I think we're going to be cut off right now; there are more questions coming in for everybody. There is also a very broad question for Erik about evolutionary explanations, asking how drift comes into this, and how drift as an explanation of the experiment is rejected. Is there a short answer? Thank you to whoever asked that, because I guess it didn't come across that this whole development of the theory started from the fact that we observed that if you take random sequences that express, but have not been selected for their noise properties, they are systematically lower-noise than many native promoters. That means that under drift alone, noise would go away, so you need some explanation for why the noise in the native promoters went up. We then found that what characterizes these high-noise native promoters is that they have many regulatory inputs, and so the question is: why would that be? And basically, all that
the model shows is what the effects on fitness would be if you combine noise propagation with the effects of regulators. So it's not that we wanted to make up an evolutionary explanation that invokes potential benefits; we were forced to by the data. Okay, thanks. And the last question, for Hélène: is there an expected diversity scaling law for different species? Okay, so if I understand the question well, there is a scaling law along the body-size axis, whereby small-bodied species are going to be more diverse than large-bodied ones, and there is a diversity scaling law for this relationship. I don't know of a scaling law involving diversification rates, that is, the type of processes I talked about for explaining differences in species richness across species groups. Okay, great, thank you. So thank you again to all the speakers, thank you everybody for your wonderful questions, and if you have any more questions please feel free to reach out. See you next time, and thank you to the staff for organizing this. Alright, bye everybody, thank you.