Hello and welcome everyone to ActInf Lab Livestream number 38.1. It's February 16, 2022. Welcome to the ActInf Lab. We are a participatory online lab that is communicating, learning and practicing applied active inference. You can find us at the links on this slide. This is a recorded and archived livestream, so please provide us with feedback so that we can improve our work. All backgrounds and perspectives are welcome, and we'll be following video etiquette for the livestream: just release an unending torrent of emojis if you have to speak, or just raise your hand; I'm sure that we'll get to it. If you're watching live, please feel free to write questions in the live chat, and we'll have enough time to hang out and discuss during this dot-one, where we'll be opening into the paper. Check out activeinference.org for updated information on participating in any of the lab's activities. I hope you'll find something that resonates with you. Today in ActInf Livestream number 38.1, we're going to be learning about and discussing this cool paper, "The evolution of brain architectures for predictive coding and active inference" by Pezzulo, Parr and Friston, from December 2021. And we're just going to enjoy discussing it and opening up any ideas or questions that we here on the panel have, or those who are live chatting with us. We have some ideas and thoughts prepared, a few things that we know we can go into, and I hope that everyone has brought some other prepared seeds, and also, of course, is open to new things arising spontaneously. So we'll just start with some introduction and warm-up. We can each say hi, and then maybe it would be cool to also mention what got you excited about the paper or what made you want to discuss it. What's something that stayed with you? So I'm Daniel, I'm a researcher in California, and I was very excited by the evolutionary focus. A lot of my research over the last years has been in evolution and ecology, so it's always awesome to see how people are thinking about how active inference, the free energy principle, and evolutionary studies can all be learning from one another. And I'll pass to Stephen. Hello, I'm Stephen Sillett. I'm in Toronto. I'm really interested in how this paper connects to my work with spatial meaning-making and social topographies, because this paper talks about a more biological, nascent stage of development rather than the higher-order meaning people often think about, and I'm really interested in how these grounded, bottom-up, enactive, ecological approaches can be thought of as something in which we can actually ground a lot of our meaning-making. So I'm interested in how this might match with some of that thinking. And I will pass this over to Dean. Thanks, Stephen. I'm Dean. I'm in Calgary. What I found interesting about the paper is that, given that I've worked with a lot of young learners for much of my life, it was really interesting to see how this affirmed a lot of the thinking I was doing when I was trying to get people past the idea that science is only about the biology and the chemistry and the physics: there is a statistical and predictive component to this, and the relationships in that realm can build a certain sense of what the underlying architecture is. So, as I said, a lot of this is affirmation. And it's kind of seen as being not part of what is typically addressed as core learning, but I think it should be. Back to you, Dan. Cool. Nice intro. Let's just start with a big question.
And then either of you can share any reflections on the big question, or we'll go over to a blank slide and just bring up some questions we have. And also it might be good to go over just some of the key points in the paper, like what did they actually do in terms of their contribution? So the big question, or at least one way to phrase something that might bring someone to approach this paper: what is the evolutionary neurophysiological basis of cognition, and how do complex cognitive phenotypes arise? Like, you don't go from zero to colony in one time step. How does it happen that complexity arises from precursors, sometimes simpler, but also sometimes more complicated, precursors? Stephen mentioned the basis for sense-making, and cognition and sense-making are very related. And what's been fun over these last weeks and months is we've explored basic or simple or reduced or basal cognition from a variety of perspectives. Like, we talked about the bioelectric components of thinking about basal cognition with Mike Levin. Now we're thinking about a slightly different approach to understanding basal cognition, which we'll be discussing here. And then also we've looked at some of these more complex cognitive phenotypes like mental action, counterfactuals, deep temporal inference, all the abductive logic, which we'll get into probably later. How can we use integrative models of perception, cognition, action and impact, like active inference, to study this whole continuum and diversity of relatively simpler (from our perspective) cognition, and also relatively more complex cognitive phenotypes, and everything in between? So yes, Stephen, and then on with it. Thanks, Daniel. Yeah, I mean, I suppose also, what are some of the paradigm impacts of that, of taking that all the way through? One question that might come up, which you may or may not have an answer to or thoughts on: you know Mike Levin's work with the teleological light cones at the different nested levels, and how this could be thinking about what kind of action-control potential might exist at these different nested levels. I would be interested to think about how this may be able to connect to that, whether I can see it connecting philosophically, or whether there's another bridge needed between this modeling and that type of modeling. So it's just something I've mentioned. Dean, anything to add, perhaps? I'm going to wait till you get to figure one. Okay. So we won't go through all of the minor points in the paper, for the reading of the paper itself, nor will we even go over all of the overview, which is what the dot-zero video, 38.0, is for. So, you know, turn back time and watch that one if you haven't, or pause the video if you're not watching it live. But what this paper does, as evidenced by their roadmap, is introduce a few basic principles of cognition, predictive regulation and control, and then also structure learning in generative models. That's not how every neurophysiology text is going to begin the building blocks of cognition. It's not how every evolutionary neurophysiology text is going to be framing cognition, but that's what they do here. And that will play into how it's similar to and different from other approaches. Then, with those building blocks in hand, or on the floor, they provide three examples of motifs that ancestral brains may be modeled as having, or may have actually had.
With those three examples in hand, it's then possible for them to state a more general way of thinking about the transitions amongst different structures. These structures represent brain design as structure learning in generative models. That is called an evolutionary algebra, and they introduce five operators that can basically either leave unchanged or change the structure of brain design with respect to generative models. And each of those is explored in terms of what are the architectural changes that that evolutionary transition is, and then what are the functional consequences of that kind of an evolutionary transition. Then they have some discussions about how sequences or patterns of application of this evolutionary algebra could lead to different evolutionary phenomena. So, for example, increasing temporal depth into the future means that models are increasingly prospective; increasing temporal depth into the past is like memory, etc. And then they close with mapping to a phylogenetic tree, and thinking about that evolutionary algebra of state transitions being mapped onto the bifurcating tree structure that's called a phylogenetic tree, which shows the relatedness of different life forms. That's the roadmap. Let's go to the long-awaited figure one and talk about the action-perception cycle and predictive regulation. And so they're again discussing this in the context of predictive regulation, anticipatory regulation, cybernetics, and control as a basic design principle for the brain. So, Dean, what do you see here, or what would be cool to think about? So what this follows, I think, is the usual pattern when we're trying to explain where we're going to go. And so you just had a roadmap up, and I think a roadmap is a way for the people who are reading the roadmap to find their way. But I think what this diagram shows is that in the middle of there, there's something called a discrepancy. And that discrepancy is later going to be given a label, Y. And what I think is really interesting here is there's a flip, potentially, that can happen here. That discrepancy is what I would describe as a rule factory, rule in the sense that we find patterns. That's kind of our place where, as it says, the prediction and the observation come together. And I believe that that's different than a roadmap, which is find your way. I believe that discrepancy, or that rule factory, is wayfinding. And I think what we have the potential to do with this paper, because of the way that they have presented the information, is the words eventually become explicated rules, which then become action in the phenomenological space. That's where it gets really, really interesting. Because I think most of the time when people look at things as a subject, they've taken the world and they've collapsed it down to words and diagrams and models. What this potentially allows us to do is flip that and move from the words back out into the space with a little bit more confidence. So yeah, that's why I wanted to kind of start here, because I think that word discrepancy, it's the first time I've seen it in this kind of figure model, and I really like it. Cool. Yeah, it makes me think about setting off on the road trip, on the mental actions that reflect the paper. And there's the roadmap, which is super informative, but it is like instructionism. It's saying you're going to go two streets and then take a left turn at the stop sign. And then when this happens, then you'll do that.
And if you've seen this, then you've gone too far. Kind of classic instruction-type sequences. Here we have a figuring-out, because it's visually arranged with some local connectivity that's suggestive of a causal connection through time. But by no means is there only one way to read this simultaneous figure. And so that allows potentially for more of a figuring-out, including a figuring-out of rules. Yes, it is indeed a little different, with discrepancy at the intersection here. Stephen? Yeah, so this ties in with the lower-bound evidence control approach of action. So, you know, the idea is that action is what we can tractably approximate, and perception is something that we can't directly access. We can try to make more sense of it, but it is a harder piece of the equation to get a handle on. So I'm wondering, how do you feel about that separation in terms of its use across other areas of applied active inference? I think it could make things a little bit more digestible for a number of contexts. Okay, I'm thinking about this in the context of recognizing that it's different than other action-perception loops that we've seen, which is just really important to keep in mind, so that we're not projecting too much onto what we're perceiving. Because other times, the outgoing arrow from the entity is what? If this were a Markov blanket type diagram, which they often are, the outgoing arrow would reflect active states and then the statistical dependencies that are outgoing. But where is action? It's on the bottom right. So let's really try to understand why the pieces are placed this way. The outgoing feature of the cognitive entity is the prediction, and here's observation coming in. If this were a Markov blanket diagram, we'd have the world, and then the incoming statistical arrow would be the observation; that would be the sensory states. So here it's the outgoing prediction of the entity and the incoming sensory data or observation. Those two are being differentiated to form a discrepancy, which is just a qualitative term, but it could then be brought a little bit more formally into a prediction error, or even a little bit beyond that, into something like a free energy differential. This discrepancy has two arrows coming out of it. From the discrepancy arises perception and the changing of beliefs. Does perception always involve changing beliefs? And discrepancy is also giving rise to action. So what kind of a thing is discrepancy, such that the inputs are prediction and observation and the outputs are perception and action? So at the very least, this is not the Bayesian graph representations that we've seen before, or that we'll see in just a few slides, with a more traditional interpretation of nodes as random variables and edges as statistical dependencies. This is a little bit more like a thought map that then connects to the variational free energy equation, which we talked about in number 37 a lot more. But just to recall, there's the red and the blue lines, these two different components of the variational free energy, and those are shown again here. So check out 37 to learn more about the red and the blue and about variational and expected free energy. But for here, we're starting with this action-prediction-discrepancy motif and then connecting it to perception and action as variational free energy minimization. Stephen? I think one thing that's useful with this more nascent representation is that discrepancy can go in different directions.
It can be a discrepancy in terms of temporal occurrence: when was the prediction predicted to happen, and when was it observed? But it could also be what kind of prediction and what kind of observation, or whereabouts was the prediction and whereabouts was the observation? There are many different kinds, and at different scales, so it may be that at the kind of lower levels there's quite a big jump between when I predict the baseball coming to my hand, when that prediction was made, and when that observation occurs, and also when that is chained up at different slower, bigger steps of the nested Markov blanket sequence. So this idea of discrepancy is probably the biggest bucket they could find, I would imagine, at that spot, and that may be partly why it's there. Can I just add to that, Stephen? I think one of the things that's interesting, especially after just doing the 37 paper, where we were talking about guides, to me the timing of this was absolutely immaculate, because now all of a sudden we've gone from guiding to almost taking, potentially, a referee's position on the world, which is a whole different thing than taking up the position of being a guide. And that's not nuance, that's not subtle. That's quite an identity shift, and we're going to actually get into identity when we look at some of the evolutionary steps later on. But I wanted to slow down on this, because I thought that's not a minor change. That's a whole different perspective shift, and I think we need to really make note of that, because I think it colors a lot of what's to follow. What makes you say that we're a referee in this situation? Discrepancy. There's a difference between what the world is telling us and then how we rule on that. So think of any game where you get into an argument with the refs. Are they wrong? Are you wrong? It doesn't really matter. The point is that there's a difference. That's where graphing that differential is maybe one of the ways to be able to make it explicit. But that's not how it's typically framed out when we're talking about Bayesian or Markovian stuff. So I think this is a big move. I know it might seem like a tiny one, but I think as we go, again, as we go deeper into this paper, it affirms a lot of the stuff that I actually saw when you're trying to go out there and forage and figure. So I'm excited to keep going. Yeah, it could be like the observation: the baseball player is running towards the base and the coach is observing, and there's this ongoing prediction and observation with no discrepancy, because part of the generative model includes the person moving through time. And then, given the observation and the generative model, the coach predicts slash expects and prefers, which we'll come to again later with the three Ps, that the baseball player is safe. You rarely see super animated protests when a call has gone towards somebody; it's when it has gone against, when it's violated their fitness, that there's a discrepancy between what the referee has called and what the interested coach has called, and that is going to lead to some consequences. And again, we're still not at the Bayesian graph level, but that's going to be the next figure. Stephen? This can then build on the idea of what's cognitive, what can you have a perspective on?
So I think what you're saying there with this referee position is, as well as the kind of swarming dynamics and the kind of enactive processes, much of which is beyond our ability to sense or integrate, what is it once it starts to hit, what's the big bucket at the kind of cognition level that still isn't too much of an inflation? So in this case here, discrepancy: there is a sense of being able to distinguish perception and action at some level and find a discrepancy, which in some ways, I would imagine, covers a lot of what cognition would require to be thought of in a cognition way. For other types of action, for instance how we heal or grow, maybe action and perception won't be so separated, but this is trying to come at the idea of cognition. So maybe that also informs this process. Cool. So section two, again, was just about the two concepts of predictive regulation and control. That's what we saw here, predictive and control. So it's just conceptually laid out. In figure two, they formalize brain design as structure learning in generative models. So here we have a different figure. We still have the cognitive entity and the world. We add in one layer of absolutely essential active inference terms, which is generative model, generative process. The generative model is the cognitive entity's model of the world. The generative process is the underlying phenomenon that gives rise to observations. It's the difference between the cognitive model of vision, a generative model of vision, and the generative process of visual input, which is like photons and the sun and all of that. So these are very different. They're complementary, but just so that we're really clear going forward, we're going to be using those in their specific sense. Not using them like, oh, well, it's a generative model because this is a model of cognition that makes me excited and think of ideas. That's not how we're using generative model here. So just to be clear on that. Now we see action as changing the world, represented by u, and the cognition involves partially action selection, policy selection, planning as inference, but we're not going into all those details yet, which can have some influence on actual, unobserved hidden states of the world, x star. X is the cognitive model of that hidden state of the world that is being inferred, and Y is the observation that then feeds back into the cognitive model. So that's what these nodes mean. It's about partitioning the cognitive model from the generative process, separating the generative model from the generative process, and we're starting to see the traditional blanket form, with observations having incoming statistical dependencies and actions having outgoing statistical dependencies. What else do either of you see in this model? Dean, let's see. So let's go back to being a referee for a second. The assumption is that you're already attending, and most of these models that we have taken up in the past, especially in 37, what they try to focus on is getting from A to B. What this is essentially saying is, let's add something to that translation from A to B. Whatever we're trying to incorporate in this representation, let's maybe look at the interpretation part of it now and the recitation part of it. So if I'm the umpire, and we can all be umpires, we don't have to do that just in a baseball game, this is incorporating now a certain critical process. I was attending and I saw this, but then there were 60,000 other people also paying attention, and they saw that.
That's where I think this is getting really interesting now. This is actually bringing it back to, I still think it's an instrumental piece, but I think now we're going to incorporate some of the reality. Were the 60,000 wrong? Or is the one person who yelled out, safe, wrong? That's what I see in this, because that u is definitely embedded in the generative process, not the generative model. Great, thanks. Stephen? So the question that comes to mind is where the body is in all of this. And I sense that the generative process is the bigger process with the world. The generative model gives a way to access those, and to get at the hidden states. And the cognitive model, well, generally speaking, cognition is thought of more in terms of deductive and inductive reasoning and logic. And it could be that there are elements of the abductive that could be held within the kind of body. And I suppose there's a question there as to how and where that is. I don't think anyone quite knows the answer to that, but that's my thought. So let's remember that this partitioning is specific to a given instantiation of model-based science, just like Majeed was talking about. So we're not going around assigning aspects or phenomena of the world to either generative model or generative process in general. So where does the body fit in? Well, with respect to a model of the body, being a structurally real thing that gives rise to observations, it's a generative process. Looking at the coin from the other side, and thinking of the body as a generative model of its niche, doing inference on certain things or acting as if, in that sense, it's a generative model. So different kinds of entities are not going to be just assigned simply to one side or the other. It's going to come down to what is specifically being discussed. There are a few other, not complexifiers exactly, but first off, just note that the notation here is not the same as is used elsewhere. So we will move towards better and cleaner, or more reformattable, notation, but, like, x star as an external state and x as an internal state might be clearer for some people. It also carries a little bit of baggage, that the hidden state is exactly what is being inferred about the external world. Like, there's a temperature parameter in the brain and then there's a temperature parameter outside in the world. As we explored in the representations paper, it doesn't necessarily have to be that way. There could be a hidden state internally having to do with movement left or right, and then there's a temperature variable outside, and then the generative model of the cognizer is about movement conditioned on temperature observations, but not necessarily simply a thermometer being instantiated in the head. It's good to look at this graphically and think about what is being connected to what, without worrying too much about all of these side questions, but this is what they're setting up as the basis of their further discussion, which is about prediction and control, and we can use Bayesian graphical approaches to represent that. Stephen? And that inferred state, as you mentioned, in some ways that's always slightly hidden from us. What does it mean to be an inferred state? Is the cognition that's coming out of that some sort of deductive thing, or would it be an affective sense of how well something's going? Again, it's not directly asked here, but accessing that is one of the big problems that happens.
How do you access what has been inferred, when it's not necessarily something you can access through direct reporting from a subject or from a participant? Right. Another example of that might be somebody who has skilled action with respect to investing, but they may not be able to give an estimate for the number that a certain asset is expected to be at, because it's not like they're doing the asset price prediction and then doing a strategy; the cognitive model may have a very different structure. It's not a representation of the actual stock market, because it's not the same variable, but then the aboutness of the investment decision would be a representation with respect to what was happening in the generative process. So those were some of the side avenues that we've looked at previously, but they're all in play at once, and so the question is just how to structure this linearly, to respect the specific contributions that are made here and the insight that can be gleaned, without every single time pulling back to some of these questions. But it's great that we have specific papers and memes and core terms that we can refer to, and then carry on with what they actually contribute. Okay. Any thoughts on figures one or two before we get into the structure of the allostat, or the homeostat first, I guess? Let's go. Let's dig in. We're going to go from this black-and-white, Dorothy-still-in-Kansas mode to some predictive motifs of ancestral brains. And the three motifs, the red, green and blue, are homeostasis, so returning to a set point; allostasis, anticipating or approaching a set point in the future, which could be a fixed or a changing one; and then implementing behavioral control, not just scalar homeostasis or allostasis on a single interoceptive variable. Okay. Citation 20 in the paper, Tschantz et al., from the future, March 2022, is where to look for more details on the homeostatic formulation that they're using here, but we can see it in terms of their figure three. Okay. So the left side, just for reference: there's figure two, so that we can remember the structure of perception, cognition, action, impact that the authors are working with here, and we're going to connect this black-and-white figure two to figure three A. This is a graphical model, graphical in both senses: meaning visual, like we're perceiving it through computer graphics, and graphical meaning like a network, so nodes and edges. Because there are computer graphics that aren't network topologies, but this happens to be a computer graphic that we're perceiving visually that also reflects a graph in terms of nodes and connected edges. It's a generative model for the regulation of a single interoceptive variable. So here's A and B with the homeostat, and C we split out to talk about later, but first A and B. This generative model includes an interoceptive thermoreceptor and a belief about body temperature. The prior over X, which is body temperature, is kept fixed, and hence it acts as a cybernetic set point. Any discrepancy between the predicted thermoreceptor activity given beliefs about X, so Y conditioned on X, and the measured Y is registered as a prediction error that is canceled out by an autonomic response U, for example a thermoregulatory response. So: hidden state on temperature, beliefs about temperature; Y, the thermoreceptor reading; and then there's the selection of action, so some sort of vasodilation or thermoregulatory response, and then that's going to change the underlying unobserved true temperature, but that's not needing to be shown.
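To make that loop concrete, here is a minimal numerical sketch of the homeostat motif as just described, assuming Gaussian beliefs, a fixed prior acting as the set point, and action as simple gradient descent on the sensory prediction error. All parameter values and names are illustrative, not from the paper:

```python
import numpy as np

# Minimal homeostat sketch: a fixed prior over body temperature acts as a
# cybernetic set point, and an autonomic action cancels the interoceptive
# prediction error. Gaussian beliefs; all parameters are illustrative.

prior_x = 37.0    # fixed prior over body temperature x (the set point)
pi_y = 1.0        # precision of the thermoreceptor observation y
pi_x = 4.0        # precision of the prior (a "strong" prior resists updating)

x_true = 33.0     # hidden state of the generative process: actual temperature
mu = prior_x      # posterior expectation of x (the red circle in figure 3A)

rng = np.random.default_rng(0)
for t in range(25):
    y = x_true + rng.normal(0.0, 0.1)   # thermoreceptor observation
    eps_y = y - mu                      # sensory prediction error
    eps_x = mu - prior_x                # prior prediction error
    # Perception: gradient descent of mu on precision-weighted errors;
    # with pi_x >> pi_y the expectation stays close to the set point
    mu += 0.1 * (pi_y * eps_y - pi_x * eps_x)
    # Action: an autonomic response u (e.g., thermogenesis) that changes
    # the world so as to cancel the sensory prediction error
    u = -0.3 * eps_y
    x_true += u
    # Variational free energy here reduces to precision-weighted squared
    # errors (the "red and blue" components mentioned from livestream 37)
    F = 0.5 * (pi_y * eps_y**2 + pi_x * eps_x**2)

print(round(x_true, 2))   # has drifted back toward the 37.0 set point
```

Running this, the true temperature is pulled back toward the set point even though the model never changes its prior; that fixed prior doing all the work is the homeostat's whole repertoire.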
So any comments on that first part? We're looking just at the top half of 3A and connecting X, Y to U, action. Can you still hear me? Can you take the cursor now and just reinforce the feedback and the feed-forward parts of this? Because the authors spent a bit unpacking it, like the figure 3 unpacking that we have on the bottom of the slide. Can you just run the cursor over all the examples of feedback and feed-forward going on concurrently? Because I think that's really important, to see that it's happening in both directions at once. Let's label everything, and then we definitely can do that. Perfect, thank you. Okay, so light blue is action. The red circles represent expected values of X, which are used to make predictions about Y; these are subtracted to form a prediction error. Because those lines with the arrows on the end of them are just dependencies, they don't really show both the feedback and the feed-forward. Yes, agreed, like having some rounded edges and other directed edges; let's see what kind they are. Let's start with action. Action influences the state of the world, from which an edge can be drawn to how that changes the state of the thermoreceptor. And Y here was the observation. Yeah, the state of the thermoreceptor is the observation. Okay. Yeah, so action changes the observation, the state of the thermoreceptor, which is being contrasted with the belief about temperature. Yes. From there, a prediction error is generated. It could be zero if there's no difference, or it could be higher. Can you just show with a green, some colored line that has an arrow on each end, a connection that demonstrates within that diagram the feedback and the feed-forward at once? Is that possible using this diagram? Let's see. This representation, so a single line that has an arrow at both ends, in a different color, so we've superimposed it over this, but it shows that there's a feedback, feed-forward loop going on at the same time. Okay, so here's the temperature information, the measurement, flowing in to contrast with the beliefs. And it is in general really important to label the edges, not just use color coding. Okay. We'll let it roll for now. The information is flowing from the measurement to the belief, and then that gives us the prediction error. Thank you. Then the prediction error is used in the selection of action, which then influences future observations. Now, the red circles represent expected values of X. I'm actually not sure what exactly the bottom larger red circle means, because the prediction error... Well, there has to be some way of representing that discrepancy, right? So they needed the second ball on the bottom to show the difference between prediction error and expectation; I think that's all it's trying to show. Yeah, or another possibility might be that the prior is staying fixed. That's X. Yeah, yeah. Priors can also... They can be flexible, but in this case it's a fixed prior. That's because we're talking about homeostasis. Then the observations are diverging from the prior, and the posterior is kind of like the realized perception, which is a compromise between the sensory data coming in and the prior. And so, yes, there are a lot of degrees of freedom depending on how parameters are weighted. This green line might approximate the red a lot more sharply; we call that a weaker prior, because sensory data updates the posterior to be more like the data.
Or it could be the case that having a lot of observations different from your prior doesn't change it. That's a strong prior, where sensory data do not change it as much. But then here is the prediction error in relationship to X. I don't know. We can look at the paper, but is mu shown in an equation? I don't have it copied out if it is. Let me look here while you're doing that. Yeah. But that is the... Perhaps here. Okay. Over to Stephen. Anything? Yeah. I'm just noting how the effect of the belief on temperature flows through the thermoreceptor down into the... to get the expectation and prediction error. So it's, you know, it's like you've shown there: there is a dynamic going down from the belief through to prediction error, being mediated. It's like the thermoreceptor is kind of a mediator between belief and prediction error on what to do for action. And another point: the reason why you can't find the mu is because it's not actually pointed to in the description. It's not in the caption. Yeah. And I remember reading and looking and looking and not being able to find it, and then looping back up the paper to go, okay, so how about feedback and feed-forward? Now, there are cases where mu is used to describe internal states. That may be implicitly how it's being used, but it's super important that all variables are defined in a paper. I wish they all had a table for every single variable and expression that is used. Yeah. It would make the dot-zeroes easier, but also it would reduce uncertainty. For example, is this epsilon even described? Okay, Stephen. I mean, I suppose in a thermostat, in some ways, the internal states of the bimetallic strip, in some ways it kind of holds the way that the expectations of action can happen, because in some ways it dictates the way that the thermostat will behave, even if it's an analog process. So I'm thinking, where would the kind of, in a thermostat, I know this is a homeostat, so it's a bit more sophisticated now, but extrapolating that out, you've kind of got a belief of what something is, and then there's an expectation of what you can do about it. For instance, my beliefs can go bigger than what I can do in my actions. I could have beliefs about temperatures which exceed where I could even exist or where I am able to change them; it depends on the scenario. So it's sort of held in both, in both the body and the context and the kind of probabilities available. Okay, so a non-living thermostat doesn't have a cognitive belief about temperature, but there could be something that's computationally like that, reflected by just a digital prior on temperature. And again, these aren't cognitive, personal, affective, experienced beliefs. A Bayesian belief, this is just saying a random variable, a variable in a model, on temperature. So even a sincerely held incorrect psychological belief is not the Bayesian belief. They might coincide at times, if it were a parameter, but the belief here, being the prior, must be adaptive. That's the evolutionary twist that actually helps resolve a lot of this, because otherwise, right, the design space is all edges by all nodes and then any variable. I mean, it's just like saying, here are all the words in the dictionary. And so evolution helps restrict the discussion to cases that actually do manage to achieve adaptive control. Dean?
And this is where that translation from the silhouette of a head to a statistical density, it's assumed that the person who's following along with this just sees that. But you can also see how easy it is to slip into the idea that, oh, wait a second, we've gone from the physical space, and we just held on to the physical space, when really now we're talking about a statistical density space. And again, if you're not really, really careful, you can see how people can carry forward something, but the actual thing that they're talking about has changed. That's why the word discrepancy was such a big deal to me, because I normally gloss over things, but this time, I went, oh, okay. So there are going to be some moments here where we're actually talking about different things, even though, in the continuum, we kind of think that we're talking about the same thing. No, we're not. Yep. It's the travails of realism and instrumentalism for biological active inference, episode 55, because are these terms, is this an example of a model being used to discuss a real system? But this, suffice to say, is the architecture of the homeostat. It undertakes action to reduce discrepancy relative to a prior belief, held fixed in this case, about what temperatures are expected slash preferred, the dialectic of the first two Ps from livestream number 37. This is an expectation and a preference, because expectations having to do with survival are as good as preferences for survival over evolutionary time. Okay. This is going to be contrasted with figure three B, which is the allostat. And so we see the same stack of X1, Y1, light blue, dark blue, red, except there's now a second column next to it, and there's some cross-connectivity. So again, the X's are going to be beliefs about, so priors on, and then these are observations. And so they're saying this generative model extends the homeostat by including a second set of exteroceptive variables that correspond to light intensity, Y2, observations of light, and beliefs about sunrise. So, beliefs about the generative process, a generative model of the generative process. Furthermore, the model includes a predictive relationship between sunrise, X2, and body temperature, X1. So again, I wish we could label every edge, because the edges are meaning different things, even at different times. Like, this is a predictive relationship, so it's an anticipatory one, but then for the one about light intensity and the belief about sunrise, there would be ways to make that an anticipatory or an instantaneous relationship. But the result of this sketched architecture is that inferring a sunrise, which will only happen with high posterior confidence if the visual observations are consistent, so low prediction error, inferring a sunrise, not, quote, seeing a sunrise, that is not what is happening in the model, inferring a sunrise and finding that visual observations are compatible with it can trigger the autonomic response, the behavior U, of thermoregulation in an anticipatory manner. That is, before sunlight actually increases body temperature. Well, how would the parameters be set that way? Because if that's adaptive, then other parameter combinations have been weeded out already by evolution. So that is where we tuck the thread back into the ball of yarn, which is: the parameter combinations that are non-adaptive die. They fail to exist. They're not going to be measured as things in the future empirically. However you want to take it.
So the quote, like, how does it work? It's the same as it was 50 years ago, which is, it has to do with survival of the persistent. Stephen? And these graphs, what they're useful for as well is they do show beliefs tied to the observation space, the space available around the observation. Whatever the sensorium that is available, that's what gives the beliefs their scope. And then the expectations are tied in here more clearly to action. So the errors, what's important, the actual dynamic that's driving action to change something, is coming out of the prediction error that's feeding into action. And I think that's quite important, because this is like the proto-animal. This is like the proto piece here that ties into a lot of how we think about knowledge and meaning. I'm not saying that; I'm just extrapolating a lot. But I think it does show how beliefs are tied to the type of sensorium that is being used to shape the observation space. I think that's quite useful. Yeah, and these are just... Yeah, Dean first, go ahead. I just want to get both of you guys' opinions about that orange barbell at the bottom of the diagram, because we're talking identity, but we're also still talking about dependency. We're talking about duplication. We're talking about anticipation. So that's a lot of stuff packed into one orange barbell. What do you think? Yeah, I'm going back to the full caption. The red circles represent the expected values of X. So we're interpreting these as the posteriors on X, because the expectations, the priors, are the top blue ones, so it wouldn't have to be the red circles. And they don't mention the word orange. Note the lateral... It's there, and it's identity. Note the lateral modulatory connections in the allostatic network, see 24 for details. And 24 is The Graphical Brain, the Friston, Parr and de Vries paper. So yeah, again, this is just a sketch model. They're not using it to fit any data or even simulate any data. And it does relate to what Stephen said about the inter-sensory or intermodal inference, which is, one could imagine a cognitive model where, if there are noises that are associated with sunrise, then beliefs about sunrise can be a variable with edges coming from different sensory modalities. So it gives us a separation of the organs of sense and internal cognitive modeling. Stephen? Yeah, it gives a sense of when something would be important to act upon. So there are many, many things that I could believe I'm seeing or perceiving, in terms of this is sunrise, or this is the type of light coming in, this is the nature of the light, this is where the light's coming from. There's all sorts of stuff. The stuff that's really filtering down is what is important, and this is where the affordance is coming in. What affords me to make some action? If it was expanding out, there can be lots of things I notice. I might notice that there's a line there, and I may notice and believe it's orange, but I may not act upon it until I take my attention to it and think it's useful, as there are many things going on on the page. So when we have beliefs, there's then what's being acted upon out of all of that to create some sort of change. So I think that is also quite useful in this architecture. Can I add to this? Because I think it's really important.
So we're talking about it in the context of a sun coming up, but let's see whether it still is true, and I think it is, just to prove a point, whether that orange barbell still makes sense if we're talking about Daniel anticipating 61-year-old Daniel, and Stephen anticipating 61-year-old Stephen, and Dean anticipating 61-year-old Dean. I'm much closer to 61 than the two of you, but do I need the Monte Carlo, do I need the playout, to be able to make that anticipation, as long as I have the identity and the parallelism and the backwards- and forwards-looking dependency that we see between X1 and X2 and Y1 and Y2? That's what I think makes this a very interesting way of being able to build on the previous homeostatic example. Thanks. Stephen? Yeah, I think also this speaks to how traditionally we think about the meaning. What does this mean? All the meaning out there, as you mentioned, all the data, the stuff that it could mean to relate all these variables. But ultimately, and this is actually what active inference gives us, that barbell is where the meaning comes in. It's like, the way to know how being 61 is for you, Dean, is for you to imagine what it would be like to make choices and to act in the world as a 61-year-old Dean, not necessarily to go out and look at all the data, so to speak, all the beliefs and the things we think about the world, but to actually bring that in. And of course, you've got a better chance to do that, because you're closer to being that kind of Dean. So I think it's quite interesting, that barbell at the bottom, in terms of pairing together the kind of belief about what actions and prediction errors are available. And it's more in that meaningful action, meaningfulness realm, rather than what does it mean, what does the data tell us. We might get there, and I'm not pushing back on what you're saying, Stephen. We might get there, but I think for now what we're saying is that in order to get there, we have to have more than a single stack of dependencies. We have to have a double stack in order to be able to extend the homeostatic nature of how we've evolved to where we are. Again, I'm not pushing back on you. I'm just saying I don't want to jump that firing pistol just yet, because I think that orange barbell is going to turn into a green arrow, and then all hell's going to break loose anyway. So we should carry on. I don't want to take Stephen's stuff away, but I think I want to park it, because we're going to go somewhere in a minute. I'm hesitant to ascribe too much specificity to something that wasn't labeled in the caption and doesn't have a clear labeling in the figure either, but I think it is already demonstrated that, using graphical frameworks and partitionings like we have in active inference, we can start to approach some of these questions. How would different kinds of variables be connected? How does that relate to future inference on action, counterfactuals, et cetera?
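As a companion to the homeostat sketch earlier, here is an equally minimal allostat sketch: a second, exteroceptive column (light observations updating a belief about sunrise) cross-links to the interoceptive one, so the thermoregulatory response fires before temperature is actually perturbed. The likelihoods, threshold, and timings are invented for illustration; the paper specifies none of this as code:

```python
# Minimal allostat sketch: an exteroceptive channel (light -> belief about
# sunrise) cross-links to the interoceptive column, triggering the autonomic
# response *before* body temperature is perturbed. Illustrative numbers only.

set_point = 37.0
x_temp = 37.0          # actual body temperature (generative process)
p_sunrise = 0.05       # prior belief that sunrise is occurring (x2)

def update_sunrise_belief(p, light_observed):
    # One step of Bayes' rule: light is far more probable under "sunrise"
    like_sun, like_night = (0.9, 0.1) if light_observed else (0.1, 0.9)
    return like_sun * p / (like_sun * p + like_night * (1.0 - p))

for t in range(10):
    y_light = t >= 4                   # light intensity (y2) rises at t = 4
    p_sunrise = update_sunrise_belief(p_sunrise, y_light)
    # Cross-link: a confident sunrise inference predicts a coming heat load,
    # so cooling (u, e.g., vasodilation) starts early -- allostasis
    u = -0.5 if p_sunrise > 0.9 else 0.0
    heat = 0.8 if t >= 7 else 0.0      # sunlight warms the body only at t = 7
    x_temp += heat + u
    print(t, round(p_sunrise, 2), round(x_temp, 2))
```

In this toy run, the sunrise belief crosses threshold at t = 6 and cooling begins one step before the heat arrives; a pure homeostat in the same loop would only respond after x_temp had already deviated.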
These sections 4, 5, 6, 7 were just laying out the relationship between an evolutionary or a physiological function, like homeostasis, allostasis, or simple behavioral control, models of things that are core evolutionary features and functions, and connecting them to these kinds of graphical representations, which are a little bit of a hybrid of a Bayesian graph and a factor graph, because there are some kinds of computations being implied here, whereas in a Bayesian graph the edges only reflect statistical dependencies; we're bringing in a little bit more of a nuanced type of relationship that one could imagine being unpacked a bit more, were it to be specified. But it's just to show how different graphical models relate to different functions. Architectural underpinnings, whether you think this is the actual architecture or whether it's just an instrumental architecture, like a model structure on a given phenomenon: how do model architectures relate to evolutionary functions? And where is evolutionary time in this model? We started out by asking how cognitive phenotypes arise, specifically over evolutionary but also over developmental time, and that's where we get to one of the main pieces and contributions of the paper, which is figure four: the five main dimensions of elaboration of generative models introduced in this paper. So there are other kinds of changes and ways that this structure could evolve, but they're going to focus on these five. There's starting with the homeostat. Does that have to be our starting point? Maybe not. Starting with the homeostat, there's the I operation, which is identity, unchanged. There's I plus I, which is just a parallel, isolated duplication. That's like going from one photoreceptor to two photoreceptors, so to speak. There's the allostat, which is a duplication as well as a cross-linking. So these are just sketches; one could also probably write that one as a duplication followed by a linking, and there's no change-in-linking operator, for example, here. So this is a few entries from a taxonomy of operations. There's increase in temporal depth within a level of the hierarchy, so looking one more unit further in time at a given time scale. And then there's hierarchical nesting, with an H. And the figure is shown like this because these are the things that can happen to the homeostat, and then they can have a second round, and so on and so on. There are so many other ones that could happen: where is reduction? Where's loss? Where's deduplication? Where's the uncoupling of the allostat? Reduction of temporal depth, reduction of hierarchy? So there's a broad space, but one of the main contributions of this paper is to connect functionally oriented graphical models of homeostatic and physiological function to the operations that result in the elaboration of simpler model architectures into different architectures. This is an interesting part. I know Stephen wants to say something too, but really quickly: there's no plus, minus, multiply and divide. There's an operation, but there's no symbolism for that. And again, that keeps it safe on a statistical level. I think that's a nice tell there. Again, for somebody who's just looking at this for the first time, pointing that out, it's not there, and because it's not present, that tells us something, to your point about maybe, how do we get deduplication? Yes. Well, I plus I, it's a suggestive use of the addition operator, but we're not adding these like integers.
So yes, they're kind of like categorical operations that constitute this evolutionary algebra. Okay, Stephen, anything on figure four? Yeah, I was just saying that they've set a kind of paradigm from the homeostat, and they've taken that through and shown that it plausibly can carry through to cognition. And then, like you say, there can be other ways, but what they've now given is some legs, another way to think of the legs that can be attached to that paradigm, because with this modeling approach, you've excluded a lot of others by taking it. There are certain things that are hard-baked into that early-stage choice-making, which then means other paradigms wouldn't fit with it, right? Even with the choice to go instrumental or realist, your instrumental models or your realist models don't make sense unless you take on some of the paradigms that this structuring creates. Yeah, agreed, thanks. Maybe think, like, is this functional model realism? Like, the function of the model is what is being tracked through time. And so there are many other alterations that could occur, and then there are the even more granular, like parametric, changes, or changes in connectivity or edge types. And then I think, as you said, Dean, that's when all hell breaks loose. So then, what's interesting here, let's look at the allostat in a little bit closer detail. Okay, so previously we looked at the allostat and we were like, okay, there's an orange bar. Now the allostat has a green bar; it's a green arrow, and it's a unidirectional one. Well, this one, the allostat, has the green bar. Oh, it's okay. Well, those are my bad eyes, because I saw the little arrow from the blue actually turning; I didn't actually see that there were two circles. Thank you. There's my 60-year-old eyes for you. Yeah. Okay, cool. And so each of these is in quite a different domain, or at least a different aspect of cognition. And these are the transitions that it can engage in. So, like, duplication retains the exact same function, but it's a very important step. And that's a little bit what I explored in the dot-zero with genetic duplications, like on the left side here of the screen, having a region of the genome duplicate. Yes, one of the duplicate copies can just degenerate. That may be like opening up space or creating motifs for other kinds of later evolutionary steps. So even this is not a failure, it's just a change. There's neo-functionalization, where those two copies can start to sub-specialize. Like, if you only have one photoreceptor, it has to be doing what it's doing. But duplicating it then allows one to focus on a different wavelength than the other, for example. And then there's sub-functionalization, which is where an initially composite function, AB, becomes two, like enzymes starting to sub-specialize on one of the two functions of the ancestral gene. So this is kind of like a division of roles. Okay, so duplication is an important event. Then there's the allostat, and multiple things happen with A, but we're seeing it as: there is a duplication occurring, and there's also cross-linking occurring between beliefs in one modality, or beliefs about one type of thing, influencing observations about another kind of thing, as well as this lateral modulatory green-orange bar.
And then we have two kinds of expansions into temporal depth: a sort of local expansion with the T operation, just pushing the model one step deeper, and then the H, hierarchical nesting, operation, which brings it to another time scale of analysis. Dean? And this is where I really struggled, and that is, when they were at allostat, it wasn't just comparative. I felt like you had to build in time, especially if we're talking about identity. But then it was like, no, we're just going to leave it comparative, and then we're going to introduce the when-ness of this. Now, I mean, I understand why they're doing that, to try to make it tractable, but I don't know how they can say it can be comparative, meaning I know the difference between A and B right now, but not address the fact that in order to differentiate into A and B, some amount of time had to have been applied to that split. So I was saying you have to go back in time just to be able to do a comparison, so it's not temporal only going forward in time. That's what I meant by all hell breaking loose, and I did ask this question. You're right. Like here, there's the triggering in an anticipatory manner. And so in one view it's like, but this is a single-time-step model, so how could that occur? Because it doesn't have a temporal depth. On the other hand, again, there are those two senses of representation. In the structural sense, if the model is a single-time-step model, it cannot have a representation through deep time; ergo, it cannot be anticipatory functionally. The other side of the coin would be, if it's enough at this time step that beliefs about sunrise trigger a response, the model can be functionally anticipatory without a temporally deep representation, which is why the representation discussion was very important leading up to this one, because we're all over those eight quadrants here. Because in evolution, and this is something that cognitive psych people talk about, our cognition is shaped towards effectiveness, which is to say productive slash reproductive success, not towards ultimate truth discovery. And so should we be too surprised that our instrumentation, which has had a certain objective function, is then susceptible to being convinced about things that are not true? Well, I don't socialize with those people, so I don't know. I just knew that when I was looking at it, I was thinking of it in real practical terms. What is built into this in order for it to still make sense? And that wasn't necessarily parsed out in the diagram; you kind of had to fill that in through your own interpretation of what was going on. So thus, back to discrepancy. That discrepancy thing is going to come up again and again, and I'm really glad that they inserted it at the beginning. Thanks. Stephen? I think it's useful here to have this temporal depth and hierarchical depth being spelt out, because I think sometimes it's a little confusing in other papers quite where it fits. And this again might be where some of this ontology work can be useful, just to see if that can be kept consistent. But if temporal depth is, you know, you've got variables for past, present and future states, but they're operating maybe within the same kind of dynamics, and then if you've got different temporal rates operating simultaneously, then you get that hierarchical depth. And I had sometimes thought in the past that temporal depth would have covered that more hierarchical depth.
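For what it's worth, here is one toy way to render the five operators just discussed, I, I+I, A, T and H, as functions acting on a description of a model structure. The paper presents these diagrammatically, not as code, so the data structure and names here are invented for illustration:

```python
# Toy rendering of the paper's five operators as transformations on a
# model-structure description. Purely illustrative; the paper defines
# these diagrammatically, not as code.

def homeostat():
    return {"columns": [{"modality": "interoceptive", "T": 1}],
            "cross_links": [], "levels": 1}

def identity(m):                  # I: leave the architecture unchanged
    return m

def duplicate(m):                 # I + I: parallel, isolated duplication
    m["columns"] = m["columns"] + [dict(c) for c in m["columns"]]
    return m

def allostat(m):                  # A: duplicate plus lateral cross-linking
    m = duplicate(m)
    m["cross_links"].append((0, len(m["columns"]) - 1))
    return m

def deepen_time(m):               # T: one more time step within a level
    for c in m["columns"]:
        c["T"] += 1
    return m

def nest(m):                      # H: hierarchical nesting at a slower scale
    m["levels"] += 1
    return m

# One possible "evolutionary trajectory": homeostat -> allostat ->
# temporally deep allostat -> two-level hierarchy
m = nest(deepen_time(allostat(homeostat())))
print(m)
```

Composing the operators in different orders gives different trajectories through design space, which is the sense in which it is an algebra; and, as noted above, the inverse operators (deduplication, uncoupling, loss of depth or hierarchy) are conspicuously absent from the set.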
So, you know, this is interesting to think about, whether this might actually be the way that that differentiation is best made. Differentiation is probably best made with reference to a specific model. And this gives us the design language and the grammar and the motifs. So if we want to predict 100 years into the future, should we have a decade model with 10-year models, that's a temporal depth of 10 decades and then a nested model with a depth of 10 years, or will we have a one-layer model with a depth of 100? They're very different. And it's not to say that one is more accurate or less, and depending on how the exact situation is set up, maybe the computational requirements of one are higher or different than the other. But that's the discussion to have. Do we want to have a B transition matrix? Let's look at a nested model with temporal depth. So here we have, on the top level of the model, S state two; this is temporal depth happening at the upper level of the model. Three discrete time points in an upper level: that is temporal depth. There's also nesting within each time step; there's a cognitive rollout on three time steps. This is one type of nested model, but there are other nestings that can exist. So here we see a depth of two, and then at each level of the nesting there's a temporal depth of three, but they're totally different in architecture and in function. So it is an interesting discussion, especially when we have to do multi-scale prediction, like Fermi estimation. If the prediction is within one time scale, there isn't necessarily a need to nest. But once we get into larger time scales, or where there's recurrence over multiple time scales, it makes sense to have a nesting model. If we wanted to predict someone's activity over the next seven days, it might make sense to have a day model with a nested model inside of that, because then within each day there could be parameter setting, versus just trying to fit one time series and find the parameters that make that one time series oscillate in a circadian way. Instead, we just need a day model with a very specific kind of simple transition, and then a transition at a nested level with a depth that's measured at the hourly and minute time scale rather than in days. And that becomes especially important when doing family-based model fitting, like variational inference. So we're not just getting all the data points and fitting a spline through them; we actually need to have appropriate model structure. Otherwise, free energy minimization will drive right off a cliff, just like we talked about with Axel's bacterium. So if we want to do free energy minimization and model selection on model structures, we can find the best model relatively easily given the families that we've specified, but there are no guarantees, once we step outside of that spotlight, that we're even in the right category structurally. And so that's why humans having these design motifs in mind, and many, many, many more, helps us not get into what seems to be a global optimization based upon free energy minimization but actually is a relatively local optimization based upon unimaginative model structure learning. Stephen? Yeah, thanks. That's really helpful. And it also ties into something I've been really thinking about in terms of the modeling itself, these scales, these steps. You know, you're talking about the multi-year, decades. When we're modeling, it's like, what do we have access to?
So we tend to ascribe that in terms of what we can externalize, what we can put into a model instrumentally. And then there's the question of what the realist perspective on that is, and what is there within our conscious awareness of being in the world, something that can be reported as an event, a story, and what time scales are present there. When you get into 100 years, it's beyond that. So then you're saying, okay, what are you going to extrapolate? What sort of temporal structure? But at the other scale, if I'm asking what I expect you to do next, or Dean to do next, and that might seem intuitive, I can have a whole series of temporal and nested scales going on which isn't even available to me in a cognitive, conscious way. I might have a sense, a feel, for what I think is going on, but a lot of my predictive processing will be happening at rates unavailable to my conscious awareness. So that also ties in. I'm not saying it's an answer; it adds to the challenge, I suppose. But it is the challenge of how much ends up being what we can get a measure on to model, as much as what it is. And that's the same challenge, in a way, that comes from the realist perspective at some level: as an organism, what is it that I can even access in some realistic way? Not saying we can model it or know it, but what could be put into something like this? Sorry, Dean, you were going to say something. Well, all I was going to add is that the paper is talking about how we can use certain math to understand, maybe, the evolution of cognition, right? That's essentially what the paper is trying to give us, to point us in that direction. So again, in order to make this accessible, in order to lower the barrier for people who actually want this to make sense but don't necessarily spend sums and sums of hours trying to figure this stuff out like the authors, or maybe even us, I want to point out something that I think is obvious to us, maybe not so obvious to people who are looking at this and going, I have no idea what they're saying. First of all, there's a chronology which we keep going back to, not just at the temporal-depth level, so that's relativity math. There's an evolutionary aspect to this, so that's algebraic math. And there's a dependency, so there's statistical math. Do we have to be polymaths in order to be able to feel our way through this? And I'm going to go back to what I said at the end of episode 37. Karl Friston said, you're not going to be a polymath, but you'd better be at least somewhat comfortable with the math, because there's no one math that's going to get you here. It's a blending of all of them, and then it's how you feel about that that's probably going to be the thing that lowers the impediment to really being able to use this in real practical terms going forward. So what's the next evolutionary step, right? It's how you're able to turn this into something. So I think that's one of the great gifts of this paper. If you don't want to identity now, or duplicate now, or allostat now, as a polymath, maybe that's something we owe the people who want to get into this.
Maybe the British pronunciation, maths, helps reflect that there's already a plurality of maths. Kirby Urner, coming more from the synergetics side, talks a lot about this, actually: how the discourse around math as a universal language makes it sound like math is one singular language, when actually math is a pluriverse, and maths reflects that a little better. It's not like these are just slopes on one mountain, and that's Mount Math, and it's so high and only a few people get to the top having scaled every side. We're using maths of all different kinds. Stephen? And often you see a lot of this in the papers published around active inference. There's often a group of people where maybe one member has that higher-level understanding of the math, but the other question is understanding what it would even mean to make an observation. The mathematician may not be the one to work that out. In some of Ryan Smith's work on the gut, it was like, well, how do you do gut inference? How do you get a tractable signal on what the gut is doing? They created some electrode arrangement, I'm not quite sure exactly how they did it, but they had some way of getting sensory information, or effectively some sort of information, to put into their model. So knowing the implications of action and observation is also a big part of this. In some cases it may be a trivial part, because it's fairly obvious; in other cases it may be the biggest barrier, and often in organisms it is: how do you even plausibly access an approximation that isn't so approximate that it's just chaos, so that it stays tractable? Yeah, the rate-limiting step for impact and improvement in the real world is unlikely to be any single individual's conceptualization of math on a team, if the team is structured appropriately; and for those pushing the frontiers of math, their understanding of math quite literally is their rate-limiting step, or maybe it's something very mundane like time availability. But when it comes to thinking about real model-based science and translational applications of active inference, I think we're working toward new ways of combining skills and having shared knowledge resources that help make that make sense, using ontologies, narratives, formal documents, tools, because we're doing it on teams and we're online. Let's, in the last 20 to 30 minutes, go through the final pieces of the paper, so that this dot-one will have been a first sweep, an initial pheromone deposition, and then we can return and take some cul-de-sacs, et cetera. Section nine explores a little bit how the duplication operation allows multiple behaviors to arise, by analogy to the genomic duplications and specializations and all those different routes that can occur. Here they connected duplication a little more directly to the factor graph models, in the sense that duplicated motifs have dynamics that are conserved over different sensorimotor domains. So let's just say we had a visual model, a column of vision, and we duplicated it, so now we have two photoreceptors with parallel columns. Now the photoreceptor in the second one changes into a chemoreceptor, because instead of expressing a rhodopsin protein it's expressing an olfactory receptor protein. There can be a conservation of the dynamics of inference even when the observation has changed, because the observation was just a slot. And so we could go from monocular vision to binocular vision in one jump, just duplicate and transpose; not that it's likely to happen exactly that way in an evolutionary context.
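A minimal sketch of that duplicate-then-specialize move; the module structure, matrices and names below are ours, purely illustrative, and not the paper's implementation:

```python
# Hypothetical duplication (D) motif on a factor-graph-style module:
# copy a sensory "column" wholesale, conserve its transition dynamics (B),
# and let only the likelihood mapping (A) specialize to a new receptor.
import numpy as np
from copy import deepcopy

def make_column(A, B, name):
    """Minimal sensory module: likelihood A (observations x states),
    transition B (states x states)."""
    return {"name": name, "A": np.asarray(A), "B": np.asarray(B)}

# Ancestral visual column: rhodopsin-style likelihood over two hidden states.
visual = make_column(
    A=[[0.9, 0.1],   # row: "bright" observation
       [0.1, 0.9]],  # row: "dark" observation
    B=[[0.8, 0.3],
       [0.2, 0.7]],
    name="photoreceptor",
)

def duplicate_and_specialize(column, new_A, new_name):
    """D operator: copy the whole module, then overwrite only the
    observation slot; the inference dynamics are conserved."""
    child = deepcopy(column)
    child["A"] = np.asarray(new_A)
    child["name"] = new_name
    return child

# The duplicate swaps rhodopsin for an olfactory receptor: a new A,
# the same B, so the conserved dynamics now serve a chemosensory stream.
chemo = duplicate_and_specialize(
    visual,
    new_A=[[0.85, 0.2],   # row: "odor present"
           [0.15, 0.8]],  # row: "odor absent"
    new_name="chemoreceptor",
)

assert np.allclose(visual["B"], chemo["B"])  # dynamics conserved
print(chemo["name"], chemo["A"], sep="\n")
```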
And we can see the structure of these graphical models as equivalent to factorized probability distributions, which is to say that the sparse connectivity among variables means we can make parts of the model that can be fine-tuned independently, in a way that can be fit very tractably. That's the factor graph; that's factorized Bayesian inference, and we've talked about that in other places. But here, all we need to say is that duplication enables a Control-C, Control-V copy-paste, and then editing the other version, except now that can happen with evolutionary features and functions. Dean? Real quick: your hand gestures in this section point to the orthogonal mesh beginning, and in the dot-two I'd like to pull that apart a little bit. But let's carry on. That's what I took away from this: the mesh began. In the beginning, there was a mesh. We'll return to that for the dot-two. Okay, section 10 talks about temporal depth. It's about time, and about the endowing of generative models with temporal depth and the way that supports prospective inference, anticipation, or retrospective inference, which is like memory. Here's what the operator looks like: we have X sub tau, X at a time point, and now there's X sub tau plus one, the next time point. One could be now, and then capital T could be the last moment, spanning memory and now (this is figure four, slide 29). So the structures of anticipation and of memory are very similar; it just depends whether the stream of observations is happening on the left side, in which case inference over the rest of the chain is prospective, or on the right side, in which case it's retrospective. Yeah, I mean, it ties in with what they say the police do. It's very hard for someone to lie backwards, so you always get them to tell their story backwards, because, as you say, if we're creating things retrospectively but playing them forward, so to speak, it would tie in with that. I want to mention that I looked for police in the paper and I see policies; is that polices? It's a great opportunity for my favorite joke, but I won't. So this section also asks where temporal depth comes from. Various researchers have speculated that a major driving force for the development of deep temporal models was foraging, and there are a lot of interesting empirical and conceptual reasons to think about foraging in terms of the temporal and spatial depth of a model. That's true in the vertebrates, where they're discussing mainly the hippocampal and entorhinal system, and in the invertebrates, which don't have a hippocampal-entorhinal system. It's one reason comparative neuroscience is so important: it prevents us from getting fixed on specific anatomical realizations of given functional attributes of evolved systems. If the story of memory is just about some brain region in humans, it may be a useful model; it's not even to say it's inaccurate, or that it's a partial model, it just is a model of that. Whereas if we want to understand a given cognitive function in a broader context, we have to pull back somewhat beyond, or in complement to, the anatomy. We need the empirical anatomy to have anything specific to talk about, but it isn't the case that vertebrate anatomy is the only way to do active inference.
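Going back to that left-side/right-side point for a second, the symmetry between anticipation and memory can be written schematically like this; again, this is our gloss, not the paper's equations:

```latex
% One deepened chain p(s_{1:T}); observations attach at one end or the other.
% Anticipation: observations up to now constrain beliefs about the future
\[
  q(s_{t+1:T}) \;\propto\; \sum_{s_{1:t}}
  p(s_{1:T}) \prod_{k=1}^{t} p(o_k \mid s_k)
\]
% Memory: recent observations constrain beliefs about the past
\[
  q(s_{1:t-1}) \;\propto\; \sum_{s_{t:T}}
  p(s_{1:T}) \prod_{k=t}^{T} p(o_k \mid s_k)
\]
```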
Stephen? I think this is really helpful, because it shows the traps you were just mentioning that people fall into: humans have the frontal cortex, so it's all about the new parts of the brain. But we don't say that with a Formula One car, that it's just got a primitive engine in there and the rest of it evolved around it; the engine itself is different too. And the same with the Mark Solms conversation: the evolution of what's often dismissed as the primitive parts of the brain. Those parts have themselves become more sophisticated; they're doing more things, maybe doing more than they once were, as opposed to there being something else that does all the heavy lifting, which I think is a common misconception, in psychology anyway. Dean? Can I read quickly from the paper? Sure. "The evolution of temporally deep models from simpler models could have been realized during evolution via the progressive", keyword here, "parcellation of an initially undifferentiated model", so a relative sense of invariance, "i.e., a model that does not distinguish present from past and future, into a model that features separate latent states for the past, present and future." And this is the part I love: "A key drive for this factorization (or parcellation) may have been the observation and progressive internalization of the sensorimotor sequences that the animal creates and experiences while acting; in other words, the self-modeling of one's own sequential behavior patterns. See 53 for a computational example." I didn't open up 53, but again, if we want to talk about the polymath piece of this, and the fact that our hex cells have to do some parcellation, it's right there, in terms of the next layer on top of this as we move up through that evolutionary cycle. Yeah, 53 is Stoianov et al., "The hippocampal formation as a hierarchical generative model supporting generative replay and continual learning." There's a ton that could be said about that from a computational and a neuroanatomical perspective. The internalization of the sensorimotor sequences relates to the sensorimotor detachment we talked about in the representation paper. Picture some motor region in the brain like a marionette handle, with the fingers and then the motor plant below it: that motor region is coupled to the activity of the plant, either in a one-directional way or maybe even a bi-directional way. So as that system thinks, so it does, and vice versa. It cannot engage in counterfactuals, because whatever direction the neural system turns, the motor system is simply doing that. And over evolutionary time it cannot get away with maladaptive action; the lineages that engage in it no longer persist. That always ties it back to reality and to the finite number of entities in the finite space of Earth, like Darwin's famous calculation: even elephants, reproducing slowly, would cover the Earth unless their population levels were kept in check. Once there's sensorimotor detachment, though, some brain region, a motor region or some supplemental, ancillary area, is able to intervene in that process, or somehow play a role that's detached from the motor activity itself. Now there can be motor planning, and that opens up the affordance of temporal depth, of counterfactuals; all these other cognitive functions arise via the sensorimotor detachment. And so foraging is an awesome place to look at that, for a lot of reasons, across different life forms and in computational foraging.
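To pin down the parcellation Dean read out, the move can be written as a factorization, in our shorthand rather than the paper's notation:

```latex
% Undifferentiated model: one tenseless latent x explains the stream
\[
  p(o, x) = p(o \mid x)\, p(x)
\]
% Parcellated model: separate latent states for past, present and future
\[
  p(o, x_{\mathrm{past}}, x_{\mathrm{now}}, x_{\mathrm{future}})
  = p(o \mid x_{\mathrm{now}})\,
    p(x_{\mathrm{future}} \mid x_{\mathrm{now}})\,
    p(x_{\mathrm{now}} \mid x_{\mathrm{past}})\,
    p(x_{\mathrm{past}})
\]
```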
Just a few of the notes were: what are the real cognitive demands of foraging for different creatures, and what about internal foraging, mental foraging? These papers are very good, "Foraging in mind" and "Foraging in semantic fields," both very useful, and they have some nice maths too. They have to do with how actions that are spatial can have dynamics that are conserved structurally, just like we explored here, when carried over to mental actions, and then we see the two come together in something like a memory palace. So foraging is cool; it's good to study. Section 11: endowing generative models with hierarchical depth affords multi-scale inference. We kind of addressed that earlier with the decades and years. Nested temporal modeling is not the same as just deep temporal modeling, and another way to look at that is how many operations you would need to get to 100 years. If you're only going to use temporal depth, it's going to take about 99 applications of T, one per added step, versus one nesting and then ten steps at the higher order and ten at the lower order, roughly 21 operations. So there's a shorter sequence of events to achieve a higher-performance model, assuming the scales are appropriate. Section 12... yeah, go ahead. Just one thing on that model: it's also interesting to think about how we think of time, because indigenous approaches tend toward cyclical time, right? Things in the future will be a repeat of the cycle, so time is thought of in a cyclical motion, the passing of the day, the passing of the seasons, whereas we have this assumption of time stretching out. So what will things be like in four generations? If my world or my ecology follows certain cyclical patterns and it's very stable, maybe that's quite a good way to think about it: say, plant the tree now to help the people in 150 years. So it comes back to there being different ways to construct how we think about time. Cool. So section 12 looks into a phylogenetic tree of the evolution of generative models. What is a phylogenetic tree, and what's the relationship between active inference slash FEP and evolution? Well, we'll have more to share in the coming papers, as always, but how do they address it specifically here? In figure five they map the operators onto, well, let's say they root the tree, though technically, as shown right here, it's an unrooted tree; there is an implicit rooting. It's a very interesting feature of phylogenetics, one that few outside the field know about, that computationally the tree is often inferred in unrooted fashion, by clustering the taxa that are close and finding the relationships, et cetera. However, that leaves an ambiguity as to where the root of the tree is. So, for example, if this is the overall connectivity based on the phylogenetic inference, this is what the tree's topology has been inferred to look like, somebody might say, well, it looks like this one, the red one, and the green one below it are clearly very closely related. But actually, if the root comes in here, that's not necessarily the case. So the rooting of a phylogenetic tree is very important, and a tree that's rooted inappropriately, or has an inappropriate outgroup selection, is worse than illegible, because it's highly legible, it can have high statistical confidence, and it can still have an absolutely non-biological topology.
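A toy way to see that rooting ambiguity; the four taxa and the Newick strings below are invented for illustration, not taken from figure five:

```python
# One unrooted quartet topology is fully described by a single internal
# split, AB|CD. Rooting it on different edges yields different rooted
# trees, each making a different claim about sisterhood.
unrooted_split = ({"A", "B"}, {"C", "D"})  # all the data constrains
print("inferred (unrooted) split:", unrooted_split)

# Root placement is extra information the split alone does not carry:
rooted_readings = {
    "root on the internal edge": "((A,B),(C,D));",  # A+B form a clade
    "root on the branch to A":   "(A,(B,(C,D)));",  # B is sister to (C,D)
    "root on the branch to C":   "(C,(D,(A,B)));",  # D is sister to (A,B)
}
for where, newick in rooted_readings.items():
    print(f"{where}: {newick}")
# Same statistical support for the split; incompatible evolutionary stories.
```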
But assuming they meant the tree to be rooted here, the branches are where the different operators occur, including I, the identity, unchanging one, and the internal nodes reflect bifurcations, speciation events. So here there's an ancestral homeostat, and the lineage leading to orange did not undergo any changes that were non-conservative. On the branch leading to these sister species there was a duplication, and then it stayed the same, and stayed the same, so the purple node is going to have two duplicated homeostats. They're overlaying an evolutionary algebra, the operator steps they gave, onto the topology of speciation. And that, as they discuss, opens up a way to think from this graphical, functional perspective, which is compatible with active inference, about how, for example, from the species alive today we could infer some of the schema of the early vertebrate brain, or the early primate brain, or push it back further to the common ancestors of the vertebrates and the invertebrates. That's kind of what they lay out here. And these are familiar moves to evolutionary biologists, like phylogenetic ancestral state reconstructions of phenotypes, which are often even done in a Bayesian way; there's software like BEAST for Bayesian evolutionary analysis and estimation of ancestral states. This too is a Bayesian model that allows us to do reconstruction of ancestral states, but in a very, very different way than it's been approached outside the active inference world. So that's figure five, and that's where we see the evolutionary algebra, the design patterns, the pattern language for structural changes in generative models, superimposed on species relatedness in a phylogenetic tree. Any thoughts on five? I tried to convert this and reapply it to my picking of teams in the March Madness table, and I wasn't able to take phylogenetic trees and convert them into winning thousands of dollars betting on college basketball. So despite my best efforts, it's kind of contained to what it represents. Not to foreshadow too hard, but I think livestreams number 39 and 40 might constitute March Madness. And also, for those who are watching, okay, here, now we have a little bit of a March Madness bracket going, don't we?
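Before the bracketology, here's one way to encode that figure-five overlay in a few lines; the tree, the node names and the operator strings are all invented for illustration, not the authors' code:

```python
# "Evolutionary algebra" on a tree: each branch carries a sequence of
# structure operators, and a tip's model schema is the composition of
# every operator on its root-to-tip path.
OPS = {"I": "identity", "D": "duplicate", "T": "deepen time", "H": "nest"}

tree = {                     # child -> (parent, operators on that branch)
    "root":   (None, []),
    "nodeA":  ("root", ["D"]),   # duplication event before a speciation
    "orange": ("root", ["I"]),   # conserved lineage: still the homeostat
    "purple": ("nodeA", ["I"]),  # keeps its two duplicated homeostats
    "blue":   ("nodeA", ["T"]),  # one copy deepens in time
}

def schema(tip):
    """Compose branch operators from root to tip (ancestor -> phenotype)."""
    ops, node = [], tip
    while node is not None:
        parent, branch_ops = tree[node]
        ops = branch_ops + ops
        node = parent
    return ops

print("operator legend:", OPS)
for tip in ("orange", "purple", "blue"):
    print(tip, "<-", " o ".join(schema(tip)))
```

Read root-to-tip, this is the ancestral-state-reconstruction framing: given the tips' schemas and the topology, infer the operators, and hence the ancestral models, on the branches.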
Here's Bracketology. So here we go: the root could have been here, and then here are two big species clades, like this is the vertebrates and this is the invertebrates, and then there's some rooted clade. That's why it's important to root the tree: because here it looks like all of these are sister to each other to the exclusion of this one, but if we were to have rooted the tree over here, then actually this is a small sister outgroup to all of these. So the rooting and the contextualizing are really important. It's super interesting, because it seems like you put so much data into these phylogenetic models that literally, how could this not converge on a super obvious answer? I mean, isn't it clear how these taxa are related phylogenetically? But then a little bit of nuance enters the picture. Stephen? Yeah, I was curious whether the allostatic can recombine with the homeostatic to give back a base-level temporal model; it's like, does it ever deflate back from being allostatic? I suppose it could have; they just don't show that. If it did, I suppose you just wouldn't talk about it. Reading the node labels: it's got A, then it goes A, then A, then I goes to H and T. Yeah, so we have a homeostat here, then two homeostats, two homeostats, and now here we have one homeostat and one allostat in this blue node. But they're just giving general examples. It would have been cool to see, here's where we're talking about mammals and here are non-mammal vertebrates, here is a specific brain region. And that is in this Chakraborty and Jarvis 2015 paper, which gets more into the nuances of neuroanatomy and neurogenetics, and how duplication of brain regions can be underpinned by developmental neurogenetic changes. They link that a little more closely to studying the bird brain and the song and motor system in parrots, which is also cool because birdsong has been studied with active inference, so there could be a nice building upon this work and connecting it to some of the structure fitting in figure five. So it could almost be that you've got parts of the brain which are more allostatically oriented or homeostatically oriented, and in a way the structuring of that gives some sort of differentiation in terms of how information and control are carried out. Totally. And extending temporal depth within a model is like a brain region getting a little bigger, let's just say, not that size is correlated with function in that specific way, and duplicating brain regions is like duplicating laterally, or nesting a brain region hierarchically. Think about how eyes developed. The insect eye has a relatively simple module, but insect eyes can change in size over evolutionary time very radically, from taking up half of the head to being just one photoreceptor, or even being lost, because they exist as a more duplicable motif. Whereas the binocular eye system that mammals have is not as amenable to just doubling into four eyes, or splitting into two in the middle, or something like that; the insect compound eye has some of those evolutionary affordances. So understanding parameter change within a model, and then structure learning on populations of models, and what the mutational adjacencies are: there are so many awesome areas, and it's truly just beginning for evolutionary bio and active inference. Steven?
Yeah, it was really helpful having someone like yourself, who's got that background, to go through that, because it's a little bit intimidating, actually, seeing this for me. And I see the logic of what you're saying; it's quite exciting, actually, seeing that. Again, one of the things this paper does is tie together threads which seem very distant in traditional terms. It's a bit overwhelming, often. But of course that's the beauty of ActInf: it has that unifying potential. So thanks. Cool, yeah, thank you. I agree, and labeling like this is rarely included unless it's a time-calibrated phylogenetic tree; we're looking at the passage of time here, but that's not even always shown. So yes, Dean, we'll have our final thoughts. Can you just go back to that? Because I really, and I didn't prompt you to put that up there, I really like that, because at some point, in the dot-two, we're going to have to talk about that sort of basal gating part of this. Most things are organized as if-then causality, but you just said it: it's then-if. Go or don't-go is very dependent on the "then" before we choose to go or don't go, right? Go or don't-go is still held open until one of the others is decided. So another thank you for that, because in the dot-two maybe we can talk a little bit about that. Cool, yeah. There are so many ways to think about this garden of forking paths, branching paths, and how that relates to parameter updating, Bayesian belief updating, and structure learning. What about when the structure of a model is a parameter in a nested model? Then that sort of blurs the line: from the bottom looking up it looks like structure learning, but from the top looking down it looks like parameter fitting, right? And if only we had some mathematics to describe that. Okay, any final thoughts, or what are we excited about for the dot-two next week? Well, my last thought is, yeah, I'm excited to keep this going, because I think it's got a lot of legs to it. And one thing that comes to mind, with all this progressively increased temporal and hierarchical depth, is where some of the more basic elements still have a really useful part to play, which is what Mike Levin said about when do we stop. So we've got all these ways we can think; how do we know when to stop and sit down? Ultimately, maybe it's when our tummy tells us; maybe that's where the gut is so useful. At the end of the day, the brain can go running off, and maybe the gut and the heart need to say, okay, sit down, get some food. On which note, I'm going to get a drink of honey now, but I'll bid you farewell and thank you for a great dot-one. But I'll hear your last thoughts as well, Dean. I hate having the last word, so make sure I won't have it. Don't worry, you'll have it. I just appreciate what Stephen just said, because I think that's a big part of this: the ability to go from the word back into the phenomenological space. And I think there are other brains that we know of now that play a significant role in that, not just the one that turns everything into symbols. Great. My last thought is that there's no better drink after a foraging trip than honey. All right, thank you, and talk to you later. Bye bye.