 Hello everyone. It is June 9th, 2022, and it's chapter 3, week 1, and it's the first cohort of the textbook group. We're going to go to the questions. There weren't any ideas added for 3, but if people have any key themes or any ideas that they picked up on with 3, they can add them here. Otherwise, we'll go to the questions, and also if people have questions in the chat, or it would be also great just to take notes or add questions and continue to upload them. And then next week, we'll continue the discussion on 3, so this is just the opening set of questions. Maybe just the first ideas of the book turning to questions now that we've at least gone through chapter 2 once and part of chapter 3 or the whole thing of chapter 3. What is the high road to active inference? What does anyone think about the high road to active inference? Or about this low road, high road construct that is used in chapters 2 and 3? Just broadly, how did the approach of chapter 3 feel different than the approach or the topics of chapter 2? J.F. and then Ali. I just want to make the general comment that the high road approach is what first grabbed me when I encountered the free energy principle. That what really fascinated me that there was an attempt at a unifying answer to self-organization, cognition. And that this could all be brought together around some relatively sparse number of concepts. That's what grabbed me, much more than the low road. Thank you. Ali and then Ben. Well, yeah, reading chapter 3 actually clarified my previously held, in fact, wrong belief that previously I thought that a low road was a kind of bottom-up approach and the high road is a top-down approach. But now I see that low road was basically tackling the problem of active inference from the inferential point of view and from the statistical and inferential point of view. 
But in the high road, we have a much more generalized and, let's say, all-encompassing view to tackle the problem of active inference, namely by way of defining Markov blankets and all the properties of living organisms and so on. Awesome. Thank you. Ben, and then anyone else who raises their hand, and you can also take notes on this idea of the book's high road generally. Ben and then Mike. Yeah, I think for me as well, chapter 3 and the high road has been an easier road in. And the thing that I noticed about it was, so they say in the introduction that active inference is a normative theory. To me there's much more normativity in the high road, in that it's an account of what organisms have to do. And compared to the previous chapter, I think chapter 2 felt a lot more descriptive for me. I'm not sure if other people felt that. Maybe that's just the way that I was reading it. But it felt like chapter 3 had a more normative force to it that I really enjoyed. Thank you. Mike. There we go. Sorry. My computer was not responding. So I went through a similar experience to what Ali described, in terms of going into it thinking about bottom-up and top-down and coming out of it thinking about it differently. I feel like in chapter 3 there's still a lot to unpack around the Markov blanket. So that is still capturing this inferential nature of things. It still has a Bayesian quality like what we saw in chapter 2. But for me at least it still feels a bit opaque at this point in terms of this sort of indirect influence that's taking place across the Markov blanket. Awesome. Okay. Any other comments? Otherwise, we're going to go to the questions and we'll approach many of those topics. Okay, first question, regarding the Markov blanket explanation on page 43 and box 3.1.
They write no additional information about the future is gained by finding out about the past, assuming we know the present. How can this be true? That's one question. They write they define regarding the Markov blanket as the set of variables that mediate all statistical interactions between a system and its environment. What is a statistical rather than a non statistical interaction? They write that they have supplemented conditional independencies with dynamical constraints so that the flows do not depend upon states on the opposite side of the blanket. Why does independence between the flows on the two sides of the blanket matter so much? Who would like to give a thought on this first question? How can it be true about Markov blankets that no additional information about the future is gained by finding out about the past? So I added a note in there. You can tell me if this is correct or not. With regard to Markov chains that as I understand the Markov chain in its present state is effectively capturing all the information about the past. So I assume that translates into the Markov blanket. Yes. Great comment and Markov chain is used very broadly. And just to give one example, it's like the current state of the chessboard knowing more about previous moves doesn't tell you more about the future. And so it's like it's a one Markov chain because the dependency is only about that one time step. Now, if it were to violate that condition of being a one Markov blanket through time and have like a three Markov blanket, that would mean that the same board position, depending on what move happened three before was still having some influence on how the system evolved. That wouldn't violate the Markovian property. It would just make it a different kind of Markov chain. And that is commonly used in the temporal modeling, like in time series modeling and in like transitions of stochastic processes. 
The Markov blanket is abstracting it from this like time sequence idea, like the now is a blanket between the past and the future. And we could think about spatial systems, like a Newton's cradle or something like that, as well as like spatial temporal systems or just abstract causal systems. So how can this be true? It's true by definition because that's what the definition of a Markov blanket is, whether that description is adequate or maps on to carving nature of the joints and all these other things are respectively statistical model comparison and ontological philosophical questions. But it's true by definition about Markov blankets. Yeah, I mean, I think actually you just pretty much covered it, but I would just say this is the same condition if you're building RL models of different kinds of autonomous systems. This is the same restriction. I think your comment about the nested blankets is interesting because in the RL space, we typically don't have that kind of richness. But what we do then would be, for example, we want to know the distance. And in that case, we would just create a basically create a state variable, which is the, the, you know, the, the history of the, of the state transition. So we fudge it a little bit. But your, your explanation that in, in, in this conception with nested Markov blankets, then that, that would be the more comprehensive solution to that issue. Thanks just to kind of clarify there, like one hack or workaround would be like you make the list of all of the last possible three moves. And then you make a one Markov blanket with all the last combinations of three. That's right. So it still uses the machinery of a one Markov chain, which has the simplest transition matrix, but it could encompass the last states, but it sort of compresses them into a one Markov chain format. Yeah. I mean, very often you want to know the, you know, across, you want to know the last, the average of the last, you know, 20 time steps. 
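The workaround described here, folding the last few states into one composite state so that a first-order Markov chain still applies, can be sketched in a few lines. This is only an illustration under stated assumptions: the two-symbol alphabet and all probabilities in `p_next` are made up, and the names `pairs`, `index` and `T` are hypothetical, not anything from the book.

```python
import itertools
import numpy as np

# Hypothetical second-order chain over two symbols: the next symbol
# depends on the last TWO symbols (made-up probabilities).
symbols = [0, 1]
p_next = {
    (0, 0): [0.9, 0.1],
    (0, 1): [0.5, 0.5],
    (1, 0): [0.2, 0.8],
    (1, 1): [0.6, 0.4],
}

# Composite states are pairs (previous, current). On these pairs the
# process is an ordinary first-order Markov chain.
pairs = list(itertools.product(symbols, repeat=2))
index = {pair: i for i, pair in enumerate(pairs)}

# First-order transition matrix on the composite states.
T = np.zeros((len(pairs), len(pairs)))
for (prev, cur), probs in p_next.items():
    for nxt, p in zip(symbols, probs):
        # (prev, cur) can only move to a pair that starts with cur.
        T[index[(prev, cur)], index[(cur, nxt)]] = p

print(T)
```

The cost of the trick is visible in the matrix size: with k symbols and a memory of n steps there are k**n composite states, which is why a summary statistic such as a running average is often used instead of the full history.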
You're just trying to smooth something. Right. And it's just way too much work to build all that machinery in there. So you, you make, you make that average actually a present variable is what you do. Great example. If something really does depend on the last 20 time steps, or you want to do model comparison as to whether it's a better model that is conditioned on the last 15 or 20 or 30. You can just condense it with summary statistics and descriptions. Again, into this one Markov framework. Otherwise the combinatorics of how the 20 influence each other is fast. Brock. I guess that's related to it. My comment was just going to be that the way they worded it. If you're not rigidly thinking about a singular Markov blanket. That, you know, I mean, if just practically speaking, finding out about the past, usually does give you information about the future. If you're talking about outside of a singular Markov blanket. So if you, you know, adding the words, no additional information, like about the Markov, you know, about the future of the Markov blanket is gained by finding out about the past Markov blanket. Like, you know, balance that. But in the case of like the machine learning example and comparing models, you're talking necessarily talking about a much larger Markov blanket than just a singular one. Right. So same thing. Yes. And just like many concepts have like a broader conversational informal sense and a narrow more technical sense, assuming we know the present means fully knowing the present fully knowing the blanket. So it's like, but I could still learn things about the past. So that's not to say that there isn't novel information that you could discover about the past. Like you could still go back in that chess game and learn information. So it's not that there isn't information in the past. It's not that it wouldn't even be interesting. 
It's that for the purposes of how the present goes into the next time step, the present blanket in this temporal Markov chain, which is like a blanket through time, contains the information you need for the transition frequencies. Okay. They define it as the set of variables that mediate all... and also, on the term Markov blanket: using last names, in my personal opinion, is not helpful, because not only are there multiple Markovs, but the name Markovian property doesn't itself tell you anything technical or defined. So maybe someday we'll have better ways to describe and be more specific that don't involve invoking ambiguity around names. That being said, it was worked out analytically by Markov and son and others. But most recently it was Pearl 1988 and causal inference with a Bayesian perspective. So this is not like an active inference-ism. This is drawing upon a total Bayesian statistical framework. Any Bayes graph that's not fully connected is going to have some nodes that intermediate. So that's going to relate to this question. So what is a statistical rather than a non-statistical interaction? Ali, and then anyone else who would like to address that. Well, I think it has something to do with the way these interactions are parameterized and modeled, because every phenomenon can be parameterized and modeled from many different perspectives and within various frameworks. But here, and the reason I think they put the term statistical in parentheses, is that they mean that these variables mediate all interactions that have been parameterized statistically. So I think it refers to the nature of parameterizing and modeling these interactions, as opposed to distinguishing between statistical and non-statistical interactions. So anyone else can raise their hand, but here's a few thoughts on that. A statistical interaction is the edge in a causal Bayes graph.
And people might be more familiar with or have seen structural equation modeling, where nodes are variables and edges are the correlations between or among variables, conditioning on the structure of the graph. In the causal Bayes graph, the nodes are variables and the edges are statistical causal influences. So causality in statistics is not always the same as what people mean by cause in the real world. But like Granger causality and just this notion of coarse graining and cause, it may line up with the sort of narrative, natural language description that somebody has when they're speaking. It might be narrower than that. It might be a different thing than that. It might be more general than that. What would a non-statistical interaction be? I mean, there's various ways to approach that. It's kind of pointing towards something like, a touch could be an interaction. But what if things are touching but those variables don't change as a function of their touching? Then do those things interact or not? That's one question. And then with cause there's just tremendous real world and philosophy questions, like what causes what, what's the difference-making cause, what's the cause that makes a difference. Necessity and sufficiency. There's so many questions in this causal philosophy. And this is highlighting that we're talking about the model. We made a Bayes graph with height, weight and shirt color. And here are the statistical effectors. And so we're not saying that there's no interaction between this and that in the real world. Now we're within the model and we're talking about the causal graph. And again, unless the Bayes graph is fully connected, there's going to be some partitioning scheme that will result in some set of nodes being conditionally independent from another set of nodes when conditioned upon a blanket. So it's not like features of the real world or even variables in a statistical equation can just be unilaterally tagged as blanket states.
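To make the idea of a blanket in a Bayes graph concrete, here is a minimal sketch that computes a node's Markov blanket in Pearl's sense (its parents, its children, and the other parents of its children). The graph below is a made-up five-node example, and the function name `markov_blanket` and the node names are hypothetical labels chosen for illustration only.

```python
def markov_blanket(parents, node):
    """Return the Markov blanket of `node` in a directed acyclic graph.

    `parents` maps every node to the set of its parents.
    The blanket is: parents + children + other parents of the children.
    """
    children = {n for n, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]} - {node}
    return set(parents[node]) | children | co_parents


# Hypothetical graph, written as node -> set of parents.
parents = {
    "external": set(),
    "sensory": {"external"},
    "internal": {"sensory"},
    "active": {"internal"},
    "outcome": {"active", "external"},
}

print(markov_blanket(parents, "internal"))
print(markov_blanket(parents, "active"))
```

Note that the blanket is computed relative to a chosen node (or set of nodes): the same variable can be a blanket state under one partition and an internal or external state under another, which is the point made above about tagging.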
It's always a partition that co-instantiates two sides, internal and external, but there's a symmetry, and the blanket. So it's not like features of the world or even features of the model are intrinsically internal, external or blanket states. But that's a partitioning scheme that can be applied to Bayes graphs of a vast variety of structures, and Bayes graphs can be applied to a vast number of real world situations. And many, many papers and live streams discuss this, ranging from sort of all-encompassing Markovian monism, Markov blankets all the way down, nested Markov blankets are the world, to the critical perspectives, not necessarily detractors, but just those who believe that the usage of the concept is out of alignment with ways that people have used it. Does it make sense to think of a Markov blanket in the context of an internal agent state and an external environment? Is the Markov blanket serving as this kind of a decoupling component because of the way that things are conditionally independent based on the variables inside the blanket? What do people think? Yes, I mean, for the internal state to be different than the external state, or to have some other equilibrium or non-equilibrium state than the one the external environment is in, you would necessarily have to have, I'm not sure if decoupling is the right word, but some sort of conditional independence zone that separates those two regions. So here's figure 3.1. Thank you, Brock. The active and the sensory states compose the blanket. Now, various live streams and papers cover the history of the concept, again from the analytical, pre-computational phrasing of insulation of fluctuation of random variables to Pearl's Bayesian and computational framing. However, Pearl's Markov blanket concept doesn't have a delineation of active and sensory states. It's only like there's one kind of fabric in that blanket. It's just blanket states.
With Friston, and maybe this will also be shown to trace to other places in some ways, one of the core things that Friston et al. have brought into the picture was interpreting, within the blanket states, the states that have incoming dependencies to the system of interest as sensory, and the states that have outgoing dependencies from the system of interest as active states. So people may already see challenges and complexities that arise with the arrows and the directions. And a lot of that is addressed in the "How particular is the physics of the free energy principle?" paper of Aguilera et al. And so we're going to continue talking about this entity partitioning. Ali and then Lyle. Well, I think we should point to a very important typographical error in this picture here, because in this figure, active states are expressed in terms of external states and Markov blankets, and sensory states are expressed in terms of internal states and Markov blankets. It should be vice versa. I mean, the flows of internal and active states are independent of external states, and the flows of external and sensory states do not depend on internal states. So we should swap x and mu in those two equations. Yes, sometimes with a possible typo like that, it's so egregious slash blatant that it's challenging to even know, but we can absolutely ask the authors. But just to read the equation and think about why it is that way: the dot is the rate of change of a variable. The internal states are mu. So this is the rate of change of internal states, the rate of change of external states x, the rate of change of u, the active states, and of y, the sensory states. So we're looking at how these partitioned states in a Bayes graph change through time. The states that are the blanket and the internal are the particular states.
Those are referring to the particle that's like figure versus ground, the particle that's moving around the curious particle, the active entity is the particular states that is partitioning the thing, literally the thing away from the niche. Then there's the blanket states that are intermediating that interface, and then we'll also talk about autonomous states, which are just the internal and the active states. So the equation is saying the rate of change in, for example, active states is a flow function f and a noise function. Both the flow and the noise function, the subscript is referring to like that one. It's kind of like ipso in Latin, like it's the flow on mu and the noise on mu. The flow, which is what the principle of least action converges us towards in the limit when the statistical fluctuations from the omega are low, is a function of certain variables. And so this is saying the flow of active states is a function of external sensory and active states. However, as Ali has pointed to, the arguments that should be guiding the flow of active states are actually the blanket states and internal states, not external states. And then, analogously, sensory states should have a flow that's defined in terms of blanket states and external states, not internal states. Lyle and then Ali. Yeah, so actually, I think you're getting right at the question, because I was confused about the bidirectional arrows, particularly between internal states and active states. And I think you're drilling right into that. I still don't completely understand what that picture should look like. But I have trouble conceptually, I sort of get, I feel like I get the Markov blanket, but that representation, I have trouble working through the details on that. Cool. So guest stream number seven, how particular is the physics of the free energy principle? It explores different entity-niche-causal relationships. Like, for example, we could imagine a simple around-the-clock entity model. 
External states induce sensory states, one directional arrow. Sensory goes into internal, internal to active, active to external, like an OODA loop or something like that. But just an around-the-clock, no bidirectionality, no connection between active and sensory. But given this partitioning, one could imagine that topology of cause. One could imagine all kinds of topologies of cause, some that violate the Markov blanket condition, like internal states and external states that are no longer internal and external because they would have a causal link. But even preserving the Markov blanket condition, we can imagine different topologies, different sparsities of couplings between these different states. A bidirectional arrow is kind of like there would be a non-zero cause, and in the other side of the matrix, it would also be a non-zero cause. Versus a unidirectional relationship where like variable two has a causal influence on three, but three doesn't have it on two. In this paper, in guest room 7.1 and that paper in further line of research, they explore what sparse coupling structures, causal architectures of the entity-niche relationship and interface would grant different statistical capacities. So I could be wrong on this, but active inference is not a commitment to this specific coupling architecture. It may be possible to design or describe or imagine systems that have different causal architectures, maybe even true at this basal, most kernel loop, but certainly true when we start thinking about nested structures and cognitive structures and so on. So yes, it is a little unfortunate about this. However, like, this is just thinking about causal relationships of the particular partition. And we're describing statistical relationships here. Causal influence as inferred from like time series data with Granger causality or other methods on specific random variables. 
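The around-the-clock topology and the idea of comparing different sparse couplings can be sketched as a small dependency matrix. This is a toy illustration, not the analysis from the guest stream paper: the boolean matrix `C`, the helper names `coupling`, `respects_blanket` and `bidirectional`, and the "leaky" counterexample are all assumptions introduced here. The blanket check encodes only the condition discussed above, that flows on one side do not depend on states on the opposite side.

```python
import numpy as np

names = ["external", "sensory", "internal", "active"]
idx = {name: i for i, name in enumerate(names)}


def coupling(edges):
    """Boolean matrix C with C[i, j] True when the flow of state i depends on state j."""
    C = np.eye(len(names), dtype=bool)  # assume every state influences its own flow
    for src, dst in edges:
        C[idx[dst], idx[src]] = True
    return C


def respects_blanket(C):
    """Internal/active flows ignore external states; external/sensory flows ignore internal states."""
    ext_leaks_in = C[idx["internal"], idx["external"]] or C[idx["active"], idx["external"]]
    int_leaks_out = C[idx["external"], idx["internal"]] or C[idx["sensory"], idx["internal"]]
    return not (ext_leaks_in or int_leaks_out)


def bidirectional(C, a, b):
    """True when a influences b and b influences a (non-zero entries on both sides of the matrix)."""
    return bool(C[idx[a], idx[b]] and C[idx[b], idx[a]])


# Around-the-clock loop: external -> sensory -> internal -> active -> external.
clock_edges = [
    ("external", "sensory"),
    ("sensory", "internal"),
    ("internal", "active"),
    ("active", "external"),
]
clock = coupling(clock_edges)

# Hypothetical violation: external states drive internal states directly.
leaky = coupling(clock_edges + [("external", "internal")])

print(respects_blanket(clock), respects_blanket(leaky))
print(bidirectional(clock, "internal", "active"))
```

Adding or removing single edges in `clock_edges` is a quick way to see which topologies preserve the blanket condition and which do not.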
So this is not the conversational notion of cause, like, but isn't what I'm thinking eventually resulting in things happening in the outside world, or aren't things in the outside world eventually changing things? Yes. But just like the chessboard through time, the blanket is statistically intermediating. And again, that's why it gets so messy with the application of like, well, the retina must be the sense states. If one can read that example and not fall into the quicksand, it could be didactic. But to tag retina as sense state is like the tip of the iceberg of a partially specified model, often one where the measurements have not actually occurred and so on. So then that can be extremely difficult for those with less familiarity with the formalism to generalize accurately from. Because shouldn't it be like, you know, the retina is the sense states and then the arm is the active state? But it's a partitioning that co-instantiates, and it's model dependent. It's not describing features of the real world. Ali? Yeah, I just wanted to point to a recent, fascinating paper by Ramstead et al. on Bayesian mechanics, which I put the link to in the chat. And in that paper, especially in section three, they have a very illuminating discussion about this whole business of partitioning and Markov blankets. And one sentence that's pretty relevant to our discussion here is that the key point to note is that the flow of internal and active components, i.e. their trajectory through state space, does not depend upon external components. And reciprocally, the flow of external and sensory states or paths does not depend upon internal states or paths. That's from page 15. So, yeah, if you refer to that article, it has been elaborated much more explicitly. Also, in four days, we will have Dalton and Maxwell et al. for a guest stream. The paper just came out. Now we'll be able to have a discussion with them and learn and ask them questions.
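For reference, the reading of the figure 3.1 equations proposed in this discussion can be written out as follows. This is a sketch of the corrected form as suggested by the participants and as matches the sentence quoted from the Bayesian mechanics paper. It is not a confirmed erratum from the authors. The symbols follow the discussion: x external, y sensory, u active, mu internal, f the flow, and omega the random fluctuations.

```latex
\begin{align}
\dot{x}   &= f_x(x, y, u) + \omega_x     && \text{external: no dependence on } \mu \\
\dot{y}   &= f_y(x, y, u) + \omega_y     && \text{sensory: no dependence on } \mu \\
\dot{u}   &= f_u(y, u, \mu) + \omega_u   && \text{active: no dependence on } x \\
\dot{\mu} &= f_\mu(y, u, \mu) + \omega_\mu && \text{internal: no dependence on } x
\end{align}
```

Read this way, the blanket states (y, u) appear in every flow, while x and mu never appear in each other's flows.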
So, like, if people have suggestions for guest streams, that can be accomplished if people want to facilitate or participate in different streams. If they have questions for different streams, being there live, but especially getting involved eagerly and actively and early is the biggest leverage point. Because that allows us to design the material so that it's like permanently useful rather than potentially ad hoc questions that arise during, which can be super important. And hopefully people can see some affordances for participation and development. Let's just get to this last question. And it's, it's like, we have other questions, but these are essential and it's one of the opening formalisms of the chapter. Just to conclude here, though, why does independence between the flows on the two sides of the blanket matter so much? These are good things to think about. Just one short thought would be without independence between the flows, there is no distinguishing one thing from another. Ali? I also think it relates to the, what we read in chapter two about hidden states, because that's what the word hidden probably refers to. I mean, internal state does not have direct access to external states and vice versa. So that Markov blanket pretty much formalizes this hiddenness. Yes. Awesome. And partially observable models of the Bayesian type, which are used in like partially observable Markov decision processes, expectation maximization. Any kind of Bayesian priors and hyper priors and so on, all of these have Bayes graphs that reflect conditionally independent variables. So it's not that they're independent in a conversational way, like they don't have any way of communicating. It's that they are specifically involved in a causal network that has conditional independence, but we'll hopefully develop more answers and notes here. We'll move on. 
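The distinction between being independent in the conversational sense and being conditionally independent given a blanket can be checked numerically on a toy model. The sketch below is an assumption-laden illustration: it uses a hypothetical linear-Gaussian chain (external to blanket to internal) with unit-variance noise, and it uses partial correlation as the test, which coincides with conditional independence only in the linear-Gaussian case.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical chain: external -> blanket -> internal, each step adds noise.
external = rng.normal(size=n)
blanket = external + rng.normal(size=n)
internal = blanket + rng.normal(size=n)


def corr(a, b):
    """Pearson correlation between two samples."""
    return float(np.corrcoef(a, b)[0, 1])


def partial_corr(a, b, given):
    """Correlation between a and b after linearly accounting for `given`."""
    r_ab, r_ag, r_bg = corr(a, b), corr(a, given), corr(b, given)
    return float((r_ab - r_ag * r_bg) / np.sqrt((1 - r_ag**2) * (1 - r_bg**2)))


marginal = corr(external, internal)                     # clearly non-zero
conditional = partial_corr(external, internal, blanket)  # close to zero

print(f"marginal correlation:      {marginal:.3f}")
print(f"given the blanket state:   {conditional:.3f}")
```

Internal and external states are strongly correlated overall, so they are not independent in the everyday sense, yet once the blanket state is known the remaining association is near zero.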
What are the differences between and among the terms Bayesian brain hypothesis, predictive processing, predictive coding and active inference? Might you be able to suggest references or resources for where these distinctions are delineated? Sure. So anyone can add more, but here's I think two key resources. The first is live stream 43. In 43.0, Maria gave a really excellent overview of predictive processing and predictive coding from a historical perspective. And briefly, she proposed that predictive coding can refer to a unidirectional data encoding scheme of the kind that we see in signal processing, video compression, and so on. Whereas predictive processing is referring to a bidirectional architecture where predictive coding is implemented in this ongoing top-down, bottom-up way. The Bayesian brain hypothesis was also touched on and we connected this to active inference. The paper for 43 is a long technical review paper, but we have three discussions on it. Predictive coding, theoretical and experimental review. It's a recent paper and a very good source for predictive processing and coding. And predictive processing and coding initially was more about a sensory framework. However, at the end of that paper, they show, okay, now we're going to do predictive processing on action. It's active inference. So predictive processing about action and active inference are six of one and half a dozen of the other, within a margin of reasonable approximation. Similar, but there's also some key differences, like reliance on different formalisms, but these are all things for us to explore and unpack as we work on the ontology and so on. Specifically, predictive coding, active inference and the Bayesian brain. This is an interview that a departed colleague and I did with Karl Friston in 2018. And we specifically asked: the Bayesian brain hypothesis, predictive coding and the free energy principle are often equated with one another. You yourself have suggested that the three frameworks are variations of the same basic mechanisms.
So 2018 was a very different time. Active inference had not been delineated from FEP in the same way that it is coming to be now. And there were a lot of other differences between then and now, but he gives an extended answer. So for people who want to learn about that, his extended answer and Martin's restatement of the Bayesian brain hypothesis are still some of the best places to find clarity on that issue. Does anyone have any other thoughts or questions on this? Just one other note, while anyone can raise their hand: the Bayesian brain hypothesis sometimes could be used in a very instrumentalist way. We're using Bayesian statistics to do neurobehavioral research. Or it might be a realist, implementational claim, like the brain is doing Bayesian statistics or something like Bayesian statistics. So this hypothesis floats kind of among Marr's levels of analysis, like implementational, algorithmic, computational, and people do use it to mean different things. And in Bayesian graphs, in Bayesian statistics, it can be about perception or it could be about action. So the Bayesian brain hypothesis, depending on how somebody frames it and what specific models they're talking about, might include, for example, all of these, because these might be applied in a Bayesian framework to talk about the brain. Ben? Yeah, I mean, I think you've covered it really, really well, but I wish I'd had access to or knowledge of the talk that you just mentioned, because trying to differentiate predictive processing from active inference has been a kind of source of pain and confusion for me over the last few months. And I just thought of another way of putting it perhaps, because I think I know the difference now: essentially, some presentations of predictive processing are active inference, in the sense that they're highly embodied, enactive, embedded.
And so if you have a predictive processing that draws on these 4E accounts of cognition, that just is active inference. Active inference is kind of necessarily enactive and embodied. And I should say there's a paper, I believe it's called Wilding the predictive brain. It's by Andy Clark, Kate Nave, Mark Miller, and George Deane, I think, which basically takes predictive processing as a theory of neural processing and weaves in these 4E narratives really, really well, and fleshes out a much more active-inference-y picture. The paper brings predictive processing and active inference together really nicely, I think. Thanks. Great suggestion. And just without going into too much detail, if this is the causal graph architecture of a predictive processing system, one can imagine that mu is being a Markov blanket with respect to these two epsilons or so on. So predictive processing doesn't highlight the Markov blanket formalism as much. Predictive processing is not a physics for particular systems like FEP is, but it's not incompatible. So maybe they're just different fingers pointing at similar or different parts of the moon. So there isn't an incompatibility, especially as over the recent years, active inference implemented predictive architectures and approaches, and predictive processing started to undergo that pragmatic, enactive, 4E, 5E, whatever infusion, which led to them modeling action as a variable. And so all they did was they just made it so that instead of the free energy being only about sensory observations and expectations, they just tucked action in. And it doesn't radically restructure the architecture. It just means that action is a variable in these equations. So 43.1, or 0, 1, and 2, are all good places to look. Okay. This was a short question.
Internal state and external state were added to the ontology, but sometimes, like if you have a quotation mark and then an at sign, it won't bring it up. Sometimes you have to type it with an at sign first. Okay. In our last 15 minutes. Okay. These two are good. Maybe we could do a first pass on them and then in the coming week add notes and more questions to discuss. They wrote: if one defines preferred states as expected states, then one can say that living organisms must minimize the surprise of their sensory observations. I'm not clear on the ontology here. Is it that preference is the same as expectation, or that formally we can use them in the same equation, or something else? Does active inference make ontological claims, like preference is actually expectation, just as heat is actually the excitation of molecules? More generally, is the free energy in physics just an analogy or is it an ontological assertion? Who would like to approach one of the aspects of this question? I can. I would like to take a stab. But again, I guess. Yes, please. I guess it's not saying that they're the same, but that if we're minimizing free energy, we're trying to make them the same. But if there's surprise, they won't be the same. I guess the goal is to get them as close as possible. That would be my guess on this. Thank you. Good insight. Also to point to two ways that expectation is conversationally used. The expectation with the fancy E is the expected mean of a distribution. Like, expected returns might be about the future, but it's actually referring to the mean estimate of a distribution. In the Gaussian case, the mean and the mode are the same, so mode seeking and mean seeking are equivalent. There can also be a variance estimator and so on. So the expectation of a Gaussian is tremendously informative. Expectation can also mean things that are predicted about the future. That's the conversational one. What do you expect it to be tomorrow?
Both can be conjoined: what do you expect the temperature to be tomorrow is the expectation of a distribution, the uncertain temperature distribution at the time point tomorrow. So they're not exclusive definitions. It just has to be understood what this means, because otherwise it'd be like, well, if our predictions about the future are our preferences, then there's all kinds of tangles that one might find oneself in. Also, I'm not remembering which number it was, but there's the dual usage of the same term to reflect the organism's preferred states and also the centering of the distribution at that time point and other time points. And then, just like Jessica mentioned, this operation of free energy minimization is in service of reducing the divergence between the preferred state, or the expectation of the preferred state, which can be set or learned, and the observations that are coming in. So in one equation, it leans more towards the English description of a preference, like the organism prefers to be at 72. And in the other sense, that is the expectation of a distribution. And there's some observation distribution, with a minimization of a divergence, such that if the observations were realizing preferences, then those distributions would be aligned. Yes, these terms are more like phenomena or natural language tags. Also, it'll be awesome in the last two weeks of June. In 46, Active inference models do not contradict folk psychology, there will be extended discussions on wanting, beliefs, and intentions: which variables in active inference are appropriate for being described that way? Does it contradict one or the other? Those are things that we'll explore. So the equations are just what they are. And then in some usages, a variable is the organism's preferred states, what it's trying to reduce its divergence from. And also that is where it expects to be.
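As a minimal sketch of this dual reading, assume the preferred state is encoded as a Gaussian centered at 72, as in the example above. The standard deviation of 2 and the function name are illustrative assumptions, not something given in the discussion or the book. Surprise is then the negative log density of an observation under that distribution, so the same mean acts both as the preference and as the expectation.

```python
import math

def gaussian_surprise(observation, preferred_mean=72.0, preferred_std=2.0):
    """Negative log density of an observation under a Gaussian preference.

    preferred_mean is both the "preferred" and the "expected" value.
    preferred_std is a hypothetical spread chosen for illustration.
    """
    z = (observation - preferred_mean) / preferred_std
    return 0.5 * z * z + math.log(preferred_std * math.sqrt(2.0 * math.pi))

# Surprise grows as the observation moves away from the center.
for temperature in (72.0, 75.0, 90.0):
    print(temperature, round(gaussian_surprise(temperature), 3))
```

Under these assumptions, an observation at the center of the distribution has the lowest surprise, and surprise rises quadratically with distance from it.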
And that dual usage is what licenses the use of surprise and minimizing surprise, because an observation right at the center of the Gaussian is minimally surprising. We talked about that in chapter two. An observation that's far from the center of the Gaussian is highly surprising. So if we have some sort of distribution of outcomes, we could ask how closely they are aligned with our preferences, in terms of surprise, due to this dual use of the same variables. Does active inference make ontological claims? Do theories make claims? Do people make claims? E.g., preference is actually expectation, just as heat is actually excitation of molecules. Anyone can raise their hand at any time. We'll talk about that in the folk psychology discussion, but it's an interesting question.
Yeah, it seems to me like these questions are taking us maybe into the map-territory debate that I think we spoke about last time: how realist or concrete should we be about the ontological claims of active inference? That's certainly the last part of that question, I think, about whether we should take it as an analogy. But yeah, I think it's quite a big question, and I don't know.
Okay, I won't find the slide here, but is the free energy in physics an analogy? I'm not sure if this is talking about the Gibbs free energy, or if it's actually talking about the variational free energy or the expected free energy, which are statistical. So I don't know if it's an analogy or an ontological assertion, other than just to connect it to a linear model with a Gaussian error distribution. Is that an ontological assertion that there is a Gaussian error distribution in the world? No. So are statistical quantities ontological assertions about the world? Maybe there's some nuance, but broadly speaking, no. And also, on the math learning group, we've expanded our operation due to many active participants.
So just note that the meetings are at 19 UTC on Monday, Tuesday, Wednesday, and Thursday. And they're in the Discord voice chat, not in this Gather. We've been working on the resources on the notation, basic and background questions, as well as overviews of the math for different chapters, basically just summarizing. And this is an example of a broken link: variational free energy is deleted, so somebody just has to come through one time and add it back in. But these are natural language descriptions that will be very easy to transmute into computer code and translate amongst human languages, and that provides a lot of legibility and comprehensibility. So a lot of the math group has been modifying the equations. And everybody is welcome to join those Discord sessions and contribute with questions or with knowledge and expertise. No matter where somebody is in learning or discussing these equations, they are the essence of active inference. So questions, random notes, related thoughts, and expertise are all essential for us improving our shared understanding.
And then, just in the last five minutes: on page 42, they wrote, in advanced organisms, preferred states can also extend to learned cognitive goals, and go on to say that advanced organisms like humans can achieve preferred states by increasingly abstract sociocultural strategies. They talked about thermoregulation, and then all the way up to air conditioning distribution systems. My question is: if they are cast in terms of active inference, must all of these preferred states and strategies be ultimately linked to survival in some way? Or, in the case of advanced organisms, can free energy be related to something other than survival? Mike, and then anyone else?
Yeah, it feels like this notion of survival and preferred states gets conflated in a sort of odd way. Not everything is about survival, right?
And so you might have some measurement of surprise if you're in a temperature that's 120 degrees, but it's survivable for a period of time, whereas if you're in a temperature of 300 degrees Fahrenheit, that's not survivable, right? But contrast that with other things that could be equally surprising or outside the distribution, as you described it earlier: the notion that all swans are white, but then you find a black swan. You'd never conceived that there might be a black swan before. It lives outside your distribution and creates a surprise, but it's not necessarily survival related. So there's sort of a missing severity in there somewhere.
Okay, thank you. Ali and then Lyle.
Well, I think, first, life here is used in a somehow more correct way, meaning a system that resists dispersion and energy dissipation. So it's not necessarily biological survival.
Yes, great point. Systems that fail to persist, to survive, will not continue to be that thing. And that's just in a physical sense. Repeated measurements are only enabled by systems that are persistent at that time scale, not eternally. And so when we want to have a theory or a framework or a physics for things, then they have to look as if they are minimizing free energy; otherwise they will simply not be that thing. Ben and then Lyle.
So this was my question, I suppose, just to clarify it quickly. One of the things I wonder most about: it seems to me, at least in my life, I'm quite often making decisions and having preferences that seem so abstracted from my attunement with my environment or my continued viability as a system, like my preference for which mug I'm going to drink coffee out of in the morning. It seems in some sense removed from my free energy, the way that free energy is described in the literature. So that's kind of what I meant.
Great.
In yesterday's 45.2, we asked specifically: how can all behavior be optimal when mistakes are made? Or I asked the question: what about when there's sacrificial or altruistic behavior? And there's a great answer in 45.2. But suffice to say that with different priors, all behavior can be cast as optimal behavior. So the normativity isn't referring to what should be done or what is best; it's rather like a process normativity. Lyle?
Right. So yeah, this, for me, tees up the question I continue to have about the approach, which is: how does the concept of novelty, which some organisms would pursue, fit into this? I mean, how would it even be represented here? And novelty is not just a human trait. You would see it across all kinds of creatures that would seek out novel and surprising experience, you know, not expected. So for me, it's kind of an open question how that's captured in this framework.
Yes. At the beginning and ending of 45.2, Karl's focus was on curiosity. And there are various ways to approach it. It's an excellent question about novelty of different types and scales. Within hierarchical modeling, one can understand novelty of different kinds. And this again is just the kernel. This is like the single linear regression, and then there's model selection across hierarchically nested linear models: much development beyond the kernel. And especially when thinking about expected free energy, a path can move to a novel area because it's part of the expected trajectory that does reduce risk and ambiguity and so on. So that's the deflationary approach to novelty pursuit. Jessica, and then we'll end the discussion.
You put it at the end, Danny, you mentioned it. And I guess I see novelty as sort of like epistemic foraging.
And so it's like you are seeking new information, new things, in order to minimize that uncertainty. Not so much that novelty is uncertainty, as if you look at novelty thinking that, oh, it's new and therefore it's going to be surprising, and so we want to minimize those things. Or also, if we look at uncertainty as equal to risk, we can say novelty, because of the unknown, represents risk, and if we minimize that uncertainty, we're trying to minimize risk. But from what I saw a little bit in chapter two, we're not necessarily trying to minimize risk or the unknown. Depending on our preferences, we might tolerate high risk. What we're trying to minimize is the surprise relative to what we expect. Like if I'm in finance and I expect very high risk in an investment, but I already know that, and high risk comes back, then my uncertainty was minimized, even though there was a lot of risk, or a lot of unknown or novelty and things like that. It was already in my expectation of what was going to happen. So I was already minimizing that, even though in reality it may look like there was a lot of uncertainty. In actuality, in my model, there isn't. So that's kind of how I explained that to myself a little bit. And in terms of novelty for adventure and curiosity and things like that, I equated it more to epistemic foraging as a way of learning and updating the model. And that's why we're doing it.
Thanks, Jessica. I'll just add one closing note on that. I would argue that it's the pragmatic absolutism, the reinforcement and reward learning centrism, that has curtailed an effective theory of novelty or curiosity. Because radically different kinds of actions have to be coerced into a value framework, whether it's deep Q-learning or reinforcement learning or anything like that.
There has to be a novelty bonus or all these other ad hoc, unprincipled approaches. They might be super technical, they might be super effective, but they're not derived from first principles. In variational free energy F and expected free energy G, there's this minimum of two: the pragmatic value, which is to bring alignment between the preferences and the observations, and the epistemic value, associated with what is casually called novelty or curiosity at that scale of a potentially hierarchical model. So active inference does provide a partitioning and an imperative that helps us address the phenomena of curiosity and novelty in a way that is quite disparate from trying to coerce it into a value framework, where the value of curiosity is its expected pragmatic value, or the value of basic research is the expected utility that it could bring. But these are all really awesome, open, and ongoing questions. So we will close the recording and then have just a one minute break. Then in this room, we'll continue with the dot tools organizational unit. And if anybody wants to talk about something else, they can go to a different Gather room. So thanks, everybody, and see you next week.
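For readers following up on the risk and ambiguity terms mentioned in the novelty discussion, here is a minimal sketch of one common way to write expected free energy for a single future time step: risk (the divergence between predicted and preferred observations) plus ambiguity (the expected entropy of the likelihood). The function names, the two-state toy model, and all numbers are hypothetical, chosen only for illustration. The sketch also assumes every preferred-observation probability is positive.

```python
import math

def entropy(probabilities):
    """Shannon entropy in nats, skipping zero-probability entries."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)

def risk_and_ambiguity(predicted_states, likelihood, preferences):
    """Return (risk, ambiguity) for one policy at one future time step.

    predicted_states: Q(s | policy), one probability per hidden state
    likelihood: likelihood[s][o] = P(o | s)
    preferences: preferred observation distribution P(o), all entries > 0
    """
    n_states = len(predicted_states)
    n_obs = len(preferences)
    # Predicted observations: Q(o | policy) = sum_s P(o | s) Q(s | policy)
    predicted_obs = [
        sum(predicted_states[s] * likelihood[s][o] for s in range(n_states))
        for o in range(n_obs)
    ]
    # Risk: KL divergence from predicted to preferred observations
    risk = sum(
        q * math.log(q / p) for q, p in zip(predicted_obs, preferences) if q > 0
    )
    # Ambiguity: expected entropy of the observation mapping
    ambiguity = sum(
        predicted_states[s] * entropy(likelihood[s]) for s in range(n_states)
    )
    return risk, ambiguity

# Hypothetical toy model: state 0 gives clear observations, state 1 noisy ones.
likelihood = [[0.9, 0.1], [0.5, 0.5]]
preferences = [0.8, 0.2]
for name, predicted_states in (("stay", [1.0, 0.0]), ("explore", [0.0, 1.0])):
    risk, ambiguity = risk_and_ambiguity(predicted_states, likelihood, preferences)
    print(name, round(risk, 3), round(ambiguity, 3), round(risk + ambiguity, 3))
```

The point of the sketch is only that both terms come out of one quantity, so no separate novelty bonus has to be added by hand. Whether a policy that visits an unfamiliar region scores well depends entirely on the model and preferences assumed.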