Today, I would like to share with you my thoughts about how ideas and methods from physics can help us broaden our understanding of the brain. There are two broad ideas. One is that we are embedded in the physical world, and therefore our sensors have to obey physical principles, because we interact with photons and collect sound waves and other signals from the environment. Today I will tell you some broad ideas about why I think hyperbolic geometry is relevant for many aspects of organization in the nervous system, and also how symmetry breaking can be used to categorize biological complexity. There is a viewpoint that one of the distinguishing features of biological systems is that they are, quote-unquote, complex: they are composed of many heterogeneous elements, and maybe that is one of the things that distinguishes them from physics, where we have many-body systems but the units are homogeneous. So I will describe how some ideas about phase transitions can be applied to understand and categorize biological complexity. I am at the Salk Institute, in the Computational Neurobiology Laboratory, but half of my talk today will actually be about plants: tomatoes, strawberries, blueberries, and others. The reason I became interested in plants is that there is a long-standing idea in neuroscience that in order to understand the brain we cannot study it in isolation, only coupled to the natural environment. And for olfaction, the sense of smell, which is one of the least understood senses, the natural world is to a large extent defined by plants. So that is how I started with physics; we will talk about a little bit of chemistry, and we will see how hyperbolic coordinates can help us categorize the sense of smell. A general idea, in smell but also in the visual system: our goal is to figure out what is happening, what is out there; that is how we use our senses.
And there is a certain hierarchy within natural scenes. You can imagine that there are hidden causes, such as trees, bears, or other objects. They all reflect light and sound waves, and they result in rather detailed measurements that are spread out in the world. The job of our nervous system is to reconstruct this: we receive detailed measurements corresponding to, say, intensities of light across our sensors, and the job of the brain is to figure out what hidden causes led to those observations. So one of the recurrent themes for today's discussion is that both natural scenes and the nervous system have matching, opposing hierarchical organizations. When it comes to the sense of smell: when preparing for the talk, I read an article written by Alexander Graham Bell more than a hundred years ago, where he lamented the fact that we have a theory of light because we have a way of measuring light, and we have a theory of sound because we can measure sound waves. But how do we measure distances between odors? He said it is very obvious that we have many very different kinds of smells, from violets to roses and others, but until you can measure their likenesses and differences, you have no science of odor. So that is one of the problems we would like to address. It is still an unresolved problem, and I will describe our approach, our way towards it. If we think about human odor perception, say you are buying wine: they describe it to you as rosy with, well, I am not a big wine drinker, but you know there are a lot of qualitative terms, and sometimes they tell you where it comes from. So we characterize a smell in terms of whether it is natural or chemical, whether it is rosy, whether it smells like a mushroom, and that just begs for something more quantitative.
So the goal is to find the dimensionality of olfactory perception, and ultimately to produce a set of coordinates such that you can say: my preferred wine has coordinates 3.5, 2.4, and 7. Somebody else can prefer a different wine, but just as in the science of color, I can say I want to paint my walls pink, but I can also go to a store and ask for the color with particular RGB coordinates. Color is a little simpler because we only have three receptor types. In olfaction we have hundreds of receptors, but that still does not exclude the possibility that the space is low-dimensional, and this is what we found. In thinking about this problem, it was interesting to learn that the way we classify odors is apparently very similar to how species were classified before Darwin. There were these games played in Europe for classifying species based on sets of overlapping descriptors, in ways that are analogous to how we describe odors: what is the shape of the hoof, what is the shape of the petal, all kinds of qualitative descriptors. Apparently Darwin knew that there is a mathematical mapping from a set of descriptors like these to a hierarchical network, and that is the mathematical fact he used to reconstruct the tree. The idea is that I can represent the data structure as categories within categories, in terms of overlapping circles, but I can equivalently represent it as a tree. For each circle here, the X and Y coordinates stay the X and Y coordinates, but the Z coordinate is the size of the circle. The largest category gets assigned the highest Z coordinate and the smallest circle the lowest, so the all-encompassing category is the root of the tree, which then branches toward smaller and smaller circles.
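As a side note, this circles-to-tree mapping is easy to make concrete. Here is a minimal sketch in Python (the categories, names, and function are mine, made up for illustration, not from the talk): if each category is a set of members, the parent of a category is its smallest proper superset, which recovers the tree with the all-encompassing category as the root.

```python
def build_tree(categories):
    """Map nested categories (name -> set of members) to a tree.

    The parent of each category is its smallest proper superset,
    mirroring how nested circles map to levels of a tree: the
    all-encompassing circle becomes the root (parent = None).
    """
    parent = {}
    for name, members in categories.items():
        supersets = [(len(m), n) for n, m in categories.items()
                     if n != name and m > members]  # proper supersets only
        parent[name] = min(supersets)[1] if supersets else None  # smallest one
    return parent

# Hypothetical example of categories within categories:
categories = {
    "animals": {"horse", "cow", "sparrow", "eagle"},
    "ungulates": {"horse", "cow"},
    "birds": {"sparrow", "eagle"},
}
print(build_tree(categories))  # ungulates and birds hang off the root "animals"
```

This is just the containment relation made explicit; a real taxonomy would, of course, have many more levels.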
So there is a mathematical mapping from a set of nested descriptors to a tree. These are some of the facts I used to propose that hyperbolic coordinates might be relevant for categorizing odors. The idea is that the way we use smell is, you know, I want to figure out whether something is good for me to eat or not; we use it in a very correlative way. Odor molecules are produced in a correlated manner by hierarchical biochemical pathways within a plant, and so our observations of individual odor molecules will be like leaves of that tree. I have roughly organized the talk into three parts. I will talk about the hierarchical organization of natural odors; this is an example of a study of natural scene statistics. Then we will talk about neural representations, going inside the nervous system, and there I will switch modalities to vision, one of the best-studied sensory systems in neuroscience. The third part will be an application of symmetry breaking to understand cell types that encode a one-dimensional variable. Moving on to natural odors: as I mentioned, the structure of olfactory space is very complicated and hard to understand. Think about a strawberry. For insects, for example, plants are a very important source of natural odors. One of our collaborations is on bees, and it turns out that a bee can navigate to a flower, but then it has to make a very important decision: whether to land on the flower or not. It will land or not depending on whether she needs pollen or nectar at that moment, and when a flower is pollinated it changes its smell. It is also very dangerous for the bee to land, because a spider can hide under the flower, and if there is a spider, that is the end for the bee. So she uses olfaction to make important decisions.
In any given plant or flower there is a complicated mixture of molecules; people have measured approximately 80 molecules per sample. We took advantage of datasets that are publicly available from the food industry: they are interested in creating the perfect strawberry, the perfect blueberry, the perfect tomato. As you know, when you grow your own strawberries they smell nice, but when I go to the supermarket, sometimes they do not smell at all. And smell is a big component of how we perceive taste; the two senses are very coupled. This is a dataset from one such paper, where they look at different genetic varieties of strawberries. There are approximately 80 different genetic varieties of strawberries in this dataset, and each variety is measured in terms of approximately 80 different monomolecular odors: here is the concentration of this molecule in this sample, of the same molecule in a different sample, and so on. So my way of thinking about it: how are we going to define distances between molecules? Previous attempts defined distance between molecules based on chemical structure, but it turns out that molecules with similar chemical structure can smell differently, and so on, so that has not been particularly successful. In fact, one of the leading models of the olfactory system is that it is randomly organized: there is a set of receptors, each with more or less random selectivity for the odorants that are out there, and one can use ideas from compressed sensing and other techniques to figure out from these sensors the composition of odors. Our approach was instead to define distance between molecules statistically, meaning that if two molecules are more correlated across these samples, they are assigned a smaller distance.
The idea is that this is how we use olfaction every day. I smell a sandwich: does it smell good or bad? If it smells bad, then maybe, as I have learned by prior experience, there is a certain bacterium that is poisonous, and maybe I should not eat the sandwich. We use olfaction in a very correlative way, and that was the motivation to define distance between molecules based on statistics: if they often co-occur in the natural environment, they are assigned smaller distances in our approach. So now imagine that I have 80 molecules and I give you the distances between them. The next question is: on what surface can I embed these molecules? Suppose I give you distances between major cities on Earth, say from London to Brisbane, from London to New York, from London to Beijing. At some point you will figure out that the distances are not consistent with flat geometry. In our case the distance matrix is between, say, monomolecular odors, and we try to figure out what geometry those distances are consistent with. For each odor we measure distances to the other odors and we ask whether they are consistent, for example, with spherical geometry or not. So the mathematical approach is that we take the molecules, in this illustration six but in reality about 80, and we measure pairwise distances between all molecules.
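As a rough sketch of this step (the data here are synthetic and the function is mine, not code from the study): starting from a samples-by-molecules concentration matrix, molecules whose concentrations co-vary across samples get small distances.

```python
import numpy as np

def correlation_distance(X):
    """Pairwise distance between molecules from co-occurrence statistics.

    X: samples x molecules matrix of measured concentrations.
    Molecules whose concentrations are strongly correlated across
    samples are assigned a small distance (here 1 - Pearson r).
    """
    r = np.corrcoef(X, rowvar=False)  # molecule-by-molecule correlations
    return 1.0 - r                    # correlated pairs -> small distance

rng = np.random.default_rng(0)
base = rng.normal(size=(50, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(50, 1)),   # molecule A
               base + 0.1 * rng.normal(size=(50, 1)),   # molecule B, co-occurs with A
               rng.normal(size=(50, 1))])               # molecule C, independent
D = correlation_distance(X)
print(D[0, 1] < D[0, 2])  # co-occurring molecules are closer: True
```

As mentioned later in the talk, one can also take a logarithm of the concentrations before computing the correlation; the choice turned out not to change the results.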
Then we generate points randomly from different candidate geometries, measure distances between them, and compare; to do that, we use a topological analysis that was developed by Vladimir Itskov and colleagues. Basically it says: given this matrix, I am going to threshold it at a given level, and if the correlation is higher than my threshold, I assign a link between, say, nodes 3 and 5, and so on. Then I measure the set of cycles, or holes, in this network (this is my joke with the Swiss cheese). We can measure these cycles and holes in many dimensions; for those who work with topological analysis, these are the Betti numbers in one, two, and three dimensions: in one dimension a circle, then two-dimensional voids, and also three-dimensional ones. I view this topological analysis as the first step in performing nonlinear dimensionality reduction, because we only have 80 molecules; we do not have that many samples of the surface. So we want to find out roughly, in a way that is invariant to nonlinearities in the measurement, whether the data are consistent with a sampling of a given surface. In this particular case the data points will be the black triangles, and the variability from the model geometry will be shown in color, in yellow. [Audience: Is it guaranteed that you can always find a consistent geometry?] No, it is not guaranteed. It is, I would say, a way of ruling out geometries: I will show you that some geometries we can rule out, and for some we can say they are consistent, but that does not exclude that with more data those geometries might also be ruled out. Yes, say it again? [Audience: Have you measured the distance?]
Yes, the distance matrix, it is a key question, thank you. The distance is the Pearson correlation coefficient between two odors, like P1 and P2 here, across different samples. Going back here: this is the distance between these two molecules; you take the correlation between their concentration values across samples. One can try different measures, for example taking a logarithm prior to computing the correlation coefficient; we did it both ways and the results did not change, so there is some flexibility in how to define the distance. Is that okay? So it is a correlation across samples: the distance between two molecules is their correlation across samples. In principle this is just for the strawberry, but it is a segment of the natural world, and then we will try to broaden it to bigger categories. Starting with the strawberry data, we tried three candidate geometries of constant curvature: spherical, Euclidean, and hyperbolic. The hyperbolic one is our favorite because it provides a continuous approximation to hierarchical, tree-like networks. What we found was that the hyperbolic geometry matched the strawberry data, and we could rule out spherical and Euclidean uniformly distributed points. This is an example of the analysis I showed on the previous slide: the triangles are the strawberry data, and these are Betti numbers one, two, and three. For the Euclidean model, we first fit the parameters of the model, such as dimensionality, to match the first Betti number, but then use these fitted parameters to see how they match Betti numbers two and three, and they are clearly not consistent with the data; same thing for the spherical model. For the hyperbolic model it was consistent. That does not exclude the possibility that with additional measurements the curvature might turn out not to be constant; we optimized the curvature here, so in this model we optimize the curvature, and also the
geometry was such that, to match the data, we had to take points near the edge of the space; I will show you a visualization of the space. It turns out that there are more datasets. I asked the student who did this work to take a look at the strawberry, and he came back and said: I found more datasets, with samples of mouse urine, blueberry, and tomato. Each dataset has many samples, and each is described by approximately 80 different monomolecular odors. It turns out that hyperbolic geometry in three dimensions, with points lying near the edge of the space, describes all four datasets, and the optimized curvature value was also similar across the datasets. So now I am going to visualize this. We want to find the points in the 3D hyperbolic space; instead of a two-dimensional tree we go to a three-dimensional tree. Many of you know that hyperbolic space is difficult to visualize: a 2D hyperboloid can be squeezed into the Poincaré disc, where the distance between two points has to be evaluated along a path that goes closer to the center. In 3D, instead of the Poincaré disc, we talk about the Poincaré ball, and we used multidimensional scaling modified with the hyperbolic metric to place the points, the actual odor molecules, into that ball. Here is a visualization with two example molecules. Before I started, these names did not tell me anything, but maybe you know more chemistry; this one turns out to be a molecule characteristic of berries from Finland, and these are all characteristic smells from berries. And just to remind you, when I said we ruled out spherical geometry, that is true even though this representation of hyperbolic space is squeezed into a ball; to emphasize this fact, I am showing you the geodesic between the two points, which goes closer to the center.
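The metric behind this visualization can be written down directly. A short sketch (this is the standard Poincaré ball distance formula, not code from the study): for points u and v strictly inside the unit ball, the hyperbolic distance is arccosh of 1 + 2|u - v|^2 / ((1 - |u|^2)(1 - |v|^2)), which is why points near the boundary are far apart and geodesics bend toward the center.

```python
import numpy as np

def poincare_distance(u, v):
    """Hyperbolic distance between points u, v inside the unit Poincare ball."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    uu, vv = u @ u, v @ v
    assert uu < 1 and vv < 1, "points must lie strictly inside the ball"
    delta = (u - v) @ (u - v)
    return np.arccosh(1 + 2 * delta / ((1 - uu) * (1 - vv)))

# Two points near the boundary: their hyperbolic separation dwarfs the
# Euclidean one, because the shortest path detours toward the center.
a, b = [0.95, 0.0, 0.0], [-0.95, 0.0, 0.0]
print(poincare_distance(a, b))          # ~7.3
print(np.linalg.norm(np.array(a) - b))  # Euclidean: 1.9
```

Hyperbolic multidimensional scaling then amounts to placing the molecules in the ball so that these distances match the measured distance matrix as well as possible.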
The way I think about it, this is a continuous approximation to a tree: in order to compute the distance, we have to go closer to the center of the tree and then come back up. Is that okay? And we can visualize this ball in 3D. [Audience: How did you choose three dimensions?] We fit the dimensionality based on the Betti curves; higher dimensions could be ruled out by the Betti-curve analysis. As I mentioned, performing nonlinear dimensionality reduction is difficult in general, so we use the topological analysis as a first clue about the dimensionality of the space and the metric, and once we have determined the dimensionality and the metric, we can embed points in that space. [Audience: What are the Betti numbers?] Betti number one is the number of cycles in one dimension; with four points, it is like a square. Betti number two is like a pyramid, the number of cycles in two dimensions, and Betti number three is the number of three-dimensional cycles in the network. And what happens is, going back to that network approach: as I said, you threshold the matrix at some level, and for a given threshold you get some number of cycles in one, two, and three dimensions. Then you adjust this threshold, and what you get is a Betti curve (I guess I am not showing the Betti curves here). In the beginning, when the threshold is very strict, the network is not connected, so there are no cycles; then you add more edges to the network and cycles appear; and at some point the network becomes fully connected and again there are no cycles. So generally, as a function of the threshold, or the edge density, the number of cycles increases and then decreases. If you integrate that curve, that is the first Betti number statistic; same thing for Betti number two, the number of cycles in two dimensions: it grows and then drops to zero, and Betti three grows and drops to zero.
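A minimal sketch of this thresholding procedure in Python, for the first Betti number only (for a plain graph, the number of independent 1-cycles is edges minus vertices plus connected components; the higher Betti numbers in the actual analysis require building clique complexes, which I omit here, and the function below is mine, not from the study):

```python
import numpy as np

def betti1_curve(D, thresholds):
    """Sweep a threshold over a distance matrix D and count 1-cycles.

    At each threshold an edge links nodes closer than the threshold;
    for the resulting graph, b1 = E - V + C (C = connected components).
    """
    n = D.shape[0]
    curve = []
    for t in thresholds:
        edges = [(i, j) for i in range(n) for j in range(i + 1, n)
                 if D[i, j] < t]
        parent = list(range(n))          # union-find for components
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        for i, j in edges:
            parent[find(i)] = find(j)
        components = len({find(i) for i in range(n)})
        curve.append(len(edges) - n + components)
    return curve

# Four points on a square: no edges at a strict threshold, then the four
# short edges form a single cycle, then the diagonals add more cycles.
pts = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
print(betti1_curve(D, [0.5, 1.1, 1.5]))  # [0, 1, 3]
```

Integrating such a curve over the threshold, as described next, gives a single summary number per dimension that does not depend on any one threshold choice.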
Then we integrate this curve to get one number, and in doing so the result becomes independent of the setting of the threshold; that is one of the advantages of this method, because it is invariant under one-to-one nonlinear transformations of the distance matrix. [Audience: What is the difference?] Well, in our case it is x^2 + y^2 + z^2 - t^2 = constant, with the corresponding constraint, so it is a Poincaré ball model. [Audience: But I thought Minkowski spacetime was flat.] The surface is curved; it has a curvature and everything, so I do not know exactly what you mean by flat or curved here, but it has a curvature; it is not a Euclidean metric. Maybe we can discuss it later, because I think I can map it to both spaces. Technically it is a Poincaré ball model, so the metric has one minus and some pluses; I am not going to argue whether it is three pluses and one minus or vice versa, but it is curved, and we have a curvature of seven. So far the space I showed you had data points placed purely based on statistics, but we actually have ratings of how much people like these strawberries and these odors. Now the color of the points represents how much people like the odor, and one can see that, even though we said nothing about the nervous system, there is a continuous mapping here: there is a region of the space that people prefer most. The color here is the normalized pleasantness value, and we can define this red axis that goes from the most pleasant odors to the least pleasant odors. This gives a prediction of the pleasantness of new odors: separating the odors into one set used to find the axis and another used to test predictions, one can have significant
predictive power for new monomolecular odors. This was actually one of the open problems in olfaction: how to predict how an odor will smell based on its physico-chemical properties. [Audience question.] Yes, thank you, that is right: we did it for both individual molecules and for mixtures. In this graph each point is a molecule, and there are two ways to evaluate it. Either we assign pleasantness to a molecule by how much its concentration correlated with the reported pleasantness of the mixtures it was part of, or, instead of these points, you can plot the mixtures that correspond to different odor samples: a mixture has around 80 components, and we take a weighted combination of the positions of those components to place the mixture into the space. In the paper we have one sphere where each point is a molecule and another sphere where each point is a sample. [Audience: What about intensity?] We have not systematically studied intensity. One answer is that, to some extent, within a certain range, our perception is invariant to intensity, though of course there are violations of this. We did not specifically look at intensity here because we used normalized correlations. The intensity matters, but so does the odor identity: I can be at different distances from the odor source, and we would like to know what it is, and then it is a separate question how far away it is. Because the space is low-dimensional, actually three-dimensional, and most of the points lie near the boundary of the space, we can look at other axes. So in this case,
molecular boiling point, that is, how easily the molecule evaporates, is the green axis, and the acidity of the sample, again the correlation between odor concentration and acidity across samples, is another. One can see that these axes predict these values for new odors, but more interestingly, because the space is curved, if you know the coordinates along these two other axes you can predict the coordinate along the pleasantness axis. So there is a way of predicting how pleasant an odor is based on its other properties. This is close to the summary of the first part, and I would like to summarize it with the finding that hyperbolic coordinates are consistent with the odors, implying a hierarchical organization in which all the odors are, in a sense, leaves of a tree. The idea is that odor molecules are volatile and therefore do not react much with each other to produce new molecules, so they all sit roughly equally distant from the root of the hierarchy of chemical reactions. Now, how are they perceived in the brain? The olfactory system is one of the oldest sensory systems, and in some ways the simplest: it has what from a machine learning perspective would be a three-layer neural network. Some of you know that there are theorems going back to Kolmogorov showing that this type of architecture can approximate any function, and perhaps this is what the olfactory system is doing. We have receptors with some sensitivity to the molecules, so the receptors are like a hidden layer, and the neural responses in the cortex combine responses from the receptors, from the glomeruli, to reconstruct what smells were present. The theorem by Kolmogorov says you can approximate any function: if it smells like this, maybe it means a predator and you run away; if it smells a different way, maybe it is food and you can hide it for the
winter. So one can encode any function, but of course the size of this layer can be very large, and that is the catch. It turns out that each receptor requires a separate gene, and the family of olfactory receptors is one of the largest gene families in our body. If I want to increase the accuracy of the olfactory system, every cell in my body has to carry an extra gene, which is of course a very expensive solution. We do not rely on olfaction that much in everyday life, or rather we do, but we rely on vision much more. Anyway, one can quantify the information capacity of the olfactory system, but in vision and other senses there is a different solution. Many of you know there is a revolution in machine learning based on the observation that the expressive power achievable with these architectures can be increased exponentially if, instead of relying on three layers, we stack the layers, and that is essentially the solution found by the newer sensory systems, including vision and audition. So in the middle part of my talk I will briefly describe our progress in understanding what happens across these hierarchical transformation stages in vision. It is a very old problem. Right now one of the challenges in machine learning is that we can train a deep network to predict many things given the data, but we would like to understand how it does it; the focus is on explainability. It is perhaps instructive that this is the problem we have been facing in systems neuroscience for about 50 years, because we have the brain, a deep neural network, and we would like to know what exactly it does. In the visual system there is a hierarchy of areas. The first processing stage is primary visual cortex, in the back of the brain, and this
was where the first breakthrough came from: Hubel and Wiesel, who received the Nobel Prize for demonstrating that neurons in V1 are sensitive to edges, to orientations within visual scenes, decomposing the visual scene into little edges. Here is an example neuron from primary visual cortex, from their paper. If the edge is horizontal there are no responses; if the edge is at 45 degrees, and this is the voltage across the neuron's membrane over time, one can see these stereotyped events in the voltage trace, called spikes. When the edge is at 45 degrees this neuron produces lots of responses, and when the edge is horizontal it does not. So for each place and each orientation there is a corresponding edge detector in primary visual cortex. From primary visual cortex, signals go to secondary visual cortex, and it is still not known what it does; then they go through a series of stages, and at the end, in inferior temporal cortex, we find very different kinds of neurons. When the animal sees a hand, the neuron responds with lots of spikes (this is the number of spikes per second); if you simplify the hand to a mitten-like shape, the response is reduced; if you show a face, the response is further reduced; but if you show the hand rotated 90 degrees, the response comes back. Now, how do we wire up signals from V1, which represent the edges of individual digits, in ways that are invariant to the position, rotation, and size of the hand, without confusing a mitten with a hand? It is a very complicated problem, but what gives me hope, and I have spent a lot of effort analyzing these signals, is that there is an enormous amount of convergence even from area V1 to area V4: each neuron in visual area V4 ultimately collects signals from about 1/6 of the V1 surface. It is a huge amount of convergence, and if it
were without any pattern or order, I think all hope of selectivity would be lost. So I have spent a lot of effort understanding how signals propagate within the visual system. It is very technical work, so I will just summarize briefly. Think of layers as in a deep network, from, say, the retina or the thalamus, which is a relay nucleus, to primary visual cortex, secondary visual cortex, and V4. But here comes the trick: each unit here, which in machine learning is just a single threshold-like unit, is in reality an intricate network in which signals are processed across layers in specific ways. How to think about this? We tried to find a model of a circuit element (the methods are described here) that can capture some of this complexity but at the same time remain manageable from a fitting perspective. The way we did it is to summarize each unit as a threshold-linear unit applied to the argument of a logistic function that depends on multiple input components through one matrix, and then we diagonalize that matrix. The problem is semi-convex, so the unit can be fit in a convex way, but we can also then find the eigenvectors, the relevant modes, for each unit; each unit has multiple features. I would be happy to go into more detail, but in the interest of time I will just summarize the results. If you are looking at this koala bear, signals go from V1 to V2 and then split into V4 and area MT, the middle temporal area, sometimes known as V5. In V1, neurons roughly encode edges, such as the edge here. In V2, recent work shows that neurons represent textures and what have been called second-order edges. What is a second-order edge? It is like this one here: an edge with similar orientation and similar luminance on both sides, but with different texture. Then in area V4, neurons are sensitive to curvature and color. So this is a broad description, and in area MT it is something like
pattern motion. I will just summarize what we found for V2, fitting these rather complicated semi-deep networks, with each unit modeled taking into account what we know about neurobiology. Here is one example analysis for area V2. Instead of being sensitive to one edge, as in V1, the neuron now collects information from a number of edges. The edges that excite the neuron are in blue, but it is also sensitive to orthogonal edges, in red, that suppress the neural response. That is important, because what does it mean to have an edge? It means having an edge of the correct orientation and the absence of an edge of the wrong orientation. This is one example V2 neuron, and this is another; they are sensitive to different types of textures. And that is only the first part of the model; the second part is how the signals are pooled. This element is actually very small, and the model says it is tiled across visual space over some range that we determine by fitting. This is the pooling mask that we construct from the data, and for most neurons it is uniform, so the corresponding neuron encodes a patch of texture. But for some neurons the mask is biphasic, going from negative values to positive, and such neurons (a) can encode second-order edges, and (b) their pooling masks are very similar to what Hubel and Wiesel described for V1 neurons in luminance, except that here these are the masks V2 neurons apply to V1 outputs. So there is a kind of approximate, hierarchically repeating computation going on across the different stages of cortical processing. And this summarizes maybe another six years of work: in area V4 we find that neurons are tuned to curvature. Here are example relevant features for V4 neurons, and there is a range. For each neuron we could find at least two features, and usually they had a
Curvature varied from neuron to neuron. Here are one, two, three, four, five, six neurons, each quantified in terms of its two features; some are more curved and some are less curved. It turns out that the neurons tuned to tighter curvature were less position-invariant. One can understand this as follows: if I want to encode a contour, I need to be more precise about its position when its curvature is tight. And linking back to the hyperbolic coordinates from the first part of the talk, there is actually evidence that our visual perception is also hyperbolic, and one of our ongoing efforts is to place these features as elements of a hyperbolic system.

So, I have maybe seven minutes? Okay. Now to phase transitions and characterizing biological complexity. What I described so far was the hierarchical organization of the natural world, how it can be captured with hyperbolic coordinates, and hierarchical visual representations, with different neurons sensitive to different features of the environment. But there is another type of hierarchy: a set of neurons devoted to encoding the same feature, the same signal, but at different thresholds. The reason is that the brain needs to perform a kind of analog-to-digital conversion. Suppose this is the feature, say contrast; it could be temperature or any other environmental variable of interest, for example the degree to which a natural scene matches the pattern we are seeking, and it fluctuates in time. Now I have a neuron that is supposed to report the value of this coordinate, but a neuron, to a first approximation, is a threshold-like device. If I have just one neuron encoding one feature, the most it can convey is one bit of information, so I will not recover the full value of the underlying signal: one bit per unit of measurement.
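A quick sanity check of the single-neuron limit just stated (a toy demonstration of mine, not from the talk): however the threshold is placed, a binary response can carry at most one bit about the signal.

```python
import numpy as np

rng = np.random.default_rng(2)
signal = rng.normal(size=100_000)       # fluctuating analog variable

def binary_entropy_bits(p):
    """Entropy of a binary variable with firing probability p."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return float(-(p * np.log2(p) + (1 - p) * np.log2(1 - p)))

# A single threshold neuron: response is 0 or 1.
response = signal > 0.0

# Information about the signal is bounded by the response entropy,
# which for a binary variable can never exceed 1 bit.
info_bound = binary_entropy_bits(response.mean())
```

Placing the threshold at the median (as here, for a zero-mean signal) makes the bound tight at one bit; any other placement only lowers it.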
But if I have many receptors, and they are all very precise, I can stack their thresholds according to the distribution of input signals; then, from the pattern of their responses, one can reconstruct the underlying analog variable. In the limit, the optimal distribution of thresholds should match the distribution of input signals, a result derived by Simon Laughlin. But now comes the trick: no neuron has a perfect threshold. There is some variability, either noise in the input signal or variability in the threshold itself. It turns out that if each of these threshold-like devices has a finite uncertainty associated with its threshold, the optimal solution is no longer like this: the thresholds clump into groups. The number of groups and the exact positions of the groups depend on metabolic constraints and on the amount of noise, and ultimately, if the noise is too big, it is optimal for all neurons to share one threshold and simply average their responses.

That is the qualitative picture; now I can show you the quantitative analysis for two neurons, and I will explain why it leads to cell-type specialization. I have simplified the problem very much: I have an analog variable, and I am going to encode it with two thresholds, each carrying a certain amount of noise, which is plotted on this axis; on the other axis is the threshold difference between the neurons. The question is: which is better, setting the thresholds equal or separating them? For example, if there are two workers and they are unreliable, sometimes I have to assign them the same task and average the results; when they become more reliable, I can assign them different tasks to get the overall job done. It is the same idea here. So we compute how much information the two neurons jointly provide about the underlying analog signal as a function of their threshold difference.
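This calculation can be sketched numerically. Below is my toy version, with assumed ingredients: a standard Gaussian signal, and two units that each fire with probability given by a logistic function of (signal − threshold) / noise; the talk's actual noise model and constraints may differ.

```python
import numpy as np

def joint_info_bits(delta, sigma):
    """Mutual information (bits) between a Gaussian signal and the joint
    binary response of two noisy threshold units at -delta/2 and +delta/2."""
    s = np.linspace(-4, 4, 801)
    w = np.exp(-s**2 / 2)
    w /= w.sum()                                    # discretized N(0, 1)
    p1 = 1.0 / (1.0 + np.exp(-(s + delta / 2) / sigma))
    p2 = 1.0 / (1.0 + np.exp(-(s - delta / 2) / sigma))
    # distribution over the four joint response patterns, given s
    pr_s = np.stack([(1 - p1) * (1 - p2), (1 - p1) * p2,
                     p1 * (1 - p2), p1 * p2])
    pr = pr_s @ w                                   # marginal over patterns

    def h(p):
        p = np.clip(p, 1e-300, 1.0)
        return -p * np.log2(p)

    return h(pr).sum() - (h(pr_s).sum(axis=0) * w).sum()

# Scan the threshold separation at two noise levels.
deltas = np.linspace(0, 3, 61)
best_high_noise = deltas[np.argmax([joint_info_bits(d, 2.0) for d in deltas])]
best_low_noise = deltas[np.argmax([joint_info_bits(d, 0.2) for d in deltas])]
```

With large noise the optimal separation sits at zero (the redundant solution); with small noise the optimum moves to a nonzero separation, which is the bifurcation the talk goes on to describe.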
When the noise is large, the information is maximized when the threshold difference is zero: that is the redundant solution, where you just average the responses. But the moment the noise falls below a critical point, the information surface undergoes a bifurcation, and the single maximum splits into two maxima. One can see that this actually matches the Landau mean-field theory of phase transitions, and we can identify all the parameters: noise is equivalent to temperature, and instead of maximizing information you minimize free energy. What is the symmetry that breaks? It is the exchange symmetry between the neurons: initially I have two identical detectors with the same amount of noise, and I have to randomly assign one of them a threshold closer to the mean while the other moves further from the mean. In biology, as you might know, there are events called gene duplications: a gene is duplicated and one copy is mutated, typically yielding a high-sensitivity receptor and a low-sensitivity receptor.

We can actually test these ideas in the retina, the sheet of neurons in our eye. There are neurons sensing light increases, the ON cells (the yellow cell here), and neurons sensing light decreases, the OFF cells. It turns out that the OFF neurons have a smaller amount of noise, and they split into subpopulations. It was very puzzling why for each ON type there is a pair of OFF cell types, and now we can explain this, actually with no adjustable parameters: if I know the average threshold position, we can predict the location of the critical point, and it separates the values for the OFF cells from those for the ON cells, explaining why the OFF cell types split into subtypes.
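The correspondence to Landau mean-field theory stated above can be written out explicitly (my reconstruction of the mapping): with order parameter m = the threshold difference, noise sigma playing the role of temperature, and negative information playing the role of free energy,

```latex
F(m) \;=\; F_0 \;+\; a\,(\sigma-\sigma_c)\,m^2 \;+\; b\,m^4 , \qquad a, b > 0 ,
```

so that for sigma > sigma_c the only minimum is m = 0 (the redundant solution), while for sigma < sigma_c two symmetric minima appear at m* = ±sqrt(a(sigma_c − sigma)/(2b)): the pitchfork bifurcation in which the exchange symmetry between the two neurons is broken.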
An everyday illustration of these ideas: you might have noticed that we prefer to read black letters on white paper and not vice versa. It turns out we have higher acuity in the OFF channel than in the ON channel, so it is much easier to read a mostly white page with a little black ink; in the reverse arrangement, the lower-resolution ON channel would have to carry the detail. This illustrates why the OFF channel, with its smaller noise, has higher acuity, more cell types, and more neurons. There is more I can say about this: the two cell types actually have slightly different noise levels, and in that case the picture changes from a second-order phase transition to a first-order phase transition, because the information surface tilts. The analog of the magnetic field is the difference in noise between the two neurons: the one with less noise moves its threshold toward the mean of the distribution, and the one with more noise moves further from the mean.

As a summary, the work I presented had three parts. The olfaction work was done in collaboration with Brian Smith, who studies olfaction and was interested in the natural statistics of odors; the analysis was done by Yuansheng Zhou, a student at UCSD, and Ilan Zhang did the information-theoretic calculations for random coding in the olfactory system. The visual work in V4 was done in collaboration with John Reynolds's laboratory; Jude Mitchell, Anirvan Nandy, and Minchenko are former postdocs who now hold independent faculty positions. Ryan Rowekamp, a student who is now a postdoc in my lab, did the analysis of the V2 neurons, using the data set made publicly available online by Jack Gallant's laboratory. A lot of this work rests on a large body of background work on computational methods for fitting these deep networks, with multiple components at each unit, to neural data.
Some of the ongoing work is the analysis of recurrent circuits: I mentioned that each computing unit in the cortex has a stereotypic architecture, and we are interested in those recurrent networks. We also think a lot about search: I think there is a match between infotaxis solutions, inspired by Massimo Vergassola's work together with Boris Shraiman, and how worms search for food. We also think about eye movements, and even about search within cells, in yeast, in collaboration with Vicki Lundblad at the Salk Institute. I did not talk at all about auditory perception, but prediction is a very important part of auditory perception and of music. The retinal work I mentioned was done in collaboration with Stephen Baccus's laboratory at Stanford.

This is my qualitative summary slide on why I think hyperbolic geometry is applicable to neuroscience. The one-word summary is speed: tree-like networks allow the fastest communication between different parts, so that the system can respond best to internal and external perturbations, and this is true for different kinds of organisms, for plants as well as for us. Finally, I would like to thank Terry, who is a great mentor at the Salk Institute, and my group. Thank you for your attention.