All right, hello and welcome everyone. It's ActInf Lab live stream number 34.0 on December 3rd, 2021, and we're discussing the paper "The Free Energy Principle: It's Not About What It Takes, It's About What Took You There." Welcome to the ActInf Lab. We are a participatory online lab that is communicating, learning, and practicing applied active inference. You can find us at the links here on this slide. This is a recorded and archived live stream, so please provide us with feedback so that we can improve on our work. All backgrounds and perspectives are welcome here, and we'll be following cool, dyadic video chat etiquette today. At this Coda site, you can find our sortable, searchable, interactable and growing database of past and future live streams. And if nothing else, isn't this paper gonna be about the past and the future? It's also our last paper discussion that we're kicking off for 2021, in number 34, so it's a pretty cool discussion and I'm really looking forward to it. In ActInf stream number 34, we're gonna be discussing the paper "The Free Energy Principle: It's Not About What It Takes, It's About What Took You There" by Axel Constant, 2021. And just like all the other dot-zeros, this video is an introduction to some of the ideas and the context for the paper. It's not a review or an evaluation or a final word. We'll go over the aims and the claims, the abstract and the roadmap of the paper. Then there's a couple keywords, including some that we haven't seen before. And then we'll walk through the key example, which is the numerical example of a simple signal-detecting bacterium, and we'll talk through that example and see where it takes us. And I'll just note, as with all the other dot-zeros, check out the paper if you want to learn more. Axel's writing is really a pleasure to read and he has deep insight into the FEP. So we'll start with just saying hello and anything else that we wanted to say.
So I'm Daniel, I'm a researcher in California. What excited me about this paper and got me excited to discuss it was Axel's provocative, as well as grounded and respectful, approach to dealing with what has and hasn't been done in the FEP. And the example is so minimal and pared down, but also insightful, that I think it lends itself to a jumping-off point for many, many other discussions. Dean? Yeah, thanks Daniel, I'm Dean, I'm in Calgary. What excited me about this was that I didn't start out really understanding what the title of the paper even meant. But then there was a bit of an inversion, a bit of a taking things inside out, and I don't know if you want to describe that as a reversal of what is typically viewed as the time continuum or not. It just opened up a whole bunch of interesting ideas, which I was contemplating as a program developer, because I believed that in order to go from an inference, which is a very small timeframe, through a prediction set, and finally to a place of modeling or understanding co-variation as a collective, this paper would probably help. And then each section I went through was very helpful in terms of bringing this idea, just as an idea, not in terms of its practical implications but just as an idea, to life. So. Okay, I'm sure he'll rejoin. I'm gonna just read through some of the big questions, and again, these are just the ones that we wrote down. So I'll read the first two and then maybe say something, and then Dean, you can take the second two. Two of the big questions that the paper raises are: what is the free energy principle about, and what is the entailment relationship between life and the minimization of free energy? Another way to read that is: what is the entailment relationship between life and models that emphasize the minimization of free energy?
And I just love how the paper dances between both of them, figuring out what the FEP exactly is and isn't, and its distinction from general Bayesian theories and from active inference. These are really crucial distinctions. Dean. Yeah, so for the second two questions, I'll read the fourth one first, which is from the author: are priors subjective or objective under the FEP? I put the word "and" in with the question mark, yes, because I actually have a question about this. He comes to a conclusion, but I'm not so sure what the timeline is, which leads back to bullet three: what can we say about the relationship when we shift from looking at what something is, the comparison of life and the free energy principle, to how long it lasts? He talks about duration, or the when aspect of that. And that's the thing that kind of adds a twist to this, because you can look at it as very momentary, or you can look at his things as very cyclic. I've used the expressions episodic and diachronic, and you've used kairos, and there's lots and lots of ways to bring that in. But when we do, things change pretty quickly. And it's so funny, you had to go to the fourth one, but was it really the fourth question if you read it third? Right, there we go. So the paper was published on 2/22/21. Could have waited a year and gone for the full 2/22/22, but I understand you've got to get those papers out. And a few of the key aims and claims. First, using the author's words, the entailment problem is the confusion in the entailment relationship between free energy minimization and life. So if you've never heard the word entailment, never heard free energy, we're gonna go into some of these things, but the entailment problem is people being confused about entailment; that's a problem. In this paper, the author dissolves what's called the entailment problem proposed here by providing a numerical example of free energy minimization in a hypothetical organism.
I conclude with some brief epistemological (knowledge-oriented) remarks that may be of interest for those who worry about the explanatory power of the FEP. And then these are the last words of the paper, which we'll return to at the end of this live stream: the claim I set up at the start of the paper is that free energy minimization is not sufficient for life. Mic drop. Finally, true to the unifying grip of hypotheses in historical sciences, the free energy principle is meant to account for all of that: you, me and the organism under study, in a unifying fashion. Any thoughts on those claims? We'll be exploring them and discussing them today. Yeah, the only thing that I immediately went to was a question that I would ask people who were sort of on the other side of whatever it is that they were moving towards, in terms of a sort of professional participation. And so the entailment thing for me was: is it opportunity over method, meaning method entailing opportunity, or is it opportunity over method, or method being entailed by opportunity? And that is the same kind of question that Axel looks at here. So depending on how you see that entailment, things go off in completely different directions. So let's go and get into the paper now, because yeah, that's something that you can't resolve until post-diction. You can expect how it'll be resolved, but then you look back and it might've been different. So I'll read the abstract and then feel free to give a thought. And of course, anyone watching along live is always welcome to give comments and questions in the live chat. So the abstract is, I'll read the first and the third one. Yeah: philosophical writings on the free energy principle in the life sciences often give the impression that minimizing free energy is sufficient for life, but minimizing free energy is not a sufficient condition for life.
In fact, one can perfectly well conceive of a system that actively minimizes its free energy and for this very reason moves inextricably towards death. So where does the assumption of this entailment relation come from? There's indeed an entailment relation, but it goes the other way around. Life entails minimizing free energy. Put another way, if you exist now under the right conditions, it's because you've done something like minimizing your free energy. However, the question of whether you exist tomorrow cannot be settled purely by resorting to the fact that you'll minimize your free energy in order to get there. The simple point I make in this paper is that the free energy principle is not concerned with the sufficient conditions of existence, but rather with what must have been the case, necessarily, given that you exist. It's not about figuring out what it takes to be alive. It's about figuring out what took you there. And I'd recommend people put this in a text-to-speech reader. It reads just so fantastically in a robotic voice. I rarely do that for papers, but I was on my exercise bike listening to the robot talk about this and it was just, it was hilarious. So here's the simple point in the paper: the free energy principle is not concerned with the sufficient conditions of existence, but rather what must have been the case given that you exist. It's worth restating because it's a simple point, but the depth of how far off one can be by missing this point is vast. It's not about figuring out what it takes to be alive. It's about figuring out what took you there. Not what took the living system there. So here's the living system, but we can even finesse that: it's you. And so the red line is the system looking back: how did I get here? Or, putting it kind of in a second-person way, looking back at another lifeline: how did I get here? The question that is often asked by living systems, including us, is: is something alive?
That's Answering Schrödinger's Question, Ramstead et al. 2018. It's an active topic of discussion. And so here's the line of another system, and it's looking at the past, the present, and the future of some other system. And is the FEP about this red arc of the system projecting back and asking what took me there? Or is it looking out to another system and asking whether it's gonna be the ruler, like a meter stick, of what is alive or not? There are probably other ways to represent it. Any thoughts on that, Dean? Yeah, it's just that the what-is-alive thing now can be placed beside the when-is-alive. And I think that's what this paper really brings forward. If we look only at what is alive, we may over-reduce, may, but if we include when is alive, as you said, things get real interesting really, really fast. And a really influential paper was Friston's single-author 2013 Life As We Know It. And so that's often seen to be making a claim about the blue, because it's like, well, life as we know it, this is how we know what's alive. But then even in the title, with so few words: as we know it. Right. So even in there there's a little bit more depth. So this is a nice road trip. There's only a few stops, a few sections and figures. So there's an introduction, and then Minimizing Free Energy for Better or Worse. Basically, just like you could minimize your potential energy right off a cliff, you can minimize free energy right off a cliff. So for better or for worse, assuming life is better, we'd prefer it to last. There's a section on the conceptual distinctions between Bayes and the free energy principle, which is very didactic. There's a numerical example presented in formalism as well as in a figure, which we'll talk about a lot today. And then Free Energy on a Wing and a Prior. The last section is called Future Direction: Free Energy Minimization as a Historical Scientific Principle, which I thought was exciting and appropriate for the paper being in Biology and Philosophy.
So for those who are not steeped deeply in the philosophy game, there's some different terminology, but that's why the whole field is enriched by people bringing these different perspectives to bear. So the keywords that are brought up: prediction and post-diction, before and after, historical sciences, the free energy principle, and then I think we added active inference and strong and weak claims. They weren't given as keywords, but they are keywords for this discussion. So let's start with prediction. Okay, I'll just say what I have on the slide, then Dean, our resident prediction matter expert. So first: never make predictions, especially about the future, Casey Stengel. Or as they say in trading: either predict the price or the date, but never both. And on the Wikipedia page for prediction there's so many different related areas, and that just tells you, like, well, what is it that's related to all these areas? Things that are spooky and forward-thinking like an omen or an oracle, almost metaphysical or supernatural, all the way to things that are very technical: reference class forecasting, regression, trends, expectations, forecasting. And I think in ActInf there's multiple ways that prediction comes into play. So just listing four of a certain set: there's the prediction, or the nowcasting, of sensations, and that's like predictive processing, where the top-down priors are modeled as providing predictions about sensory inputs. There's the prediction of, or inference of, hidden causes in the world from those sensory inputs, which are not directly observed. And importantly, notice that these two predictions are actually almost instantaneous. And then we also talk about prediction through time: at time T, there's the prediction or the anticipation of states at a future time. And those could be internal, external sense, or action states. So: predicted future states.
And then there's also the prediction of the consequences of action, which is anticipatory cybernetics, and that can include counterfactuals. So, what if something were to happen: that's a prediction about a counterfactual. So there's multiple ways that prediction and active inference come into play. What do you think about that, Dean? Yeah, I think it's all true. I think it just points, if we're going to ask how time is included, then we can break out how time has been included, like what the difference is in terms of how time is included in an expectation versus how time is included in a trend estimation. Because I think that's what causes us to be able to see multiple different ways of prediction. So not just types of prediction, but then the actual time given over to that particular type. Great. Those who study prediction might be considered futures-studiers or futurologists. And that's also an interesting wiki page and search area. Just look at how many frameworks were brought up. I mean, they forgot one. It starts with "active" and ends with "inference." But there's a lot of different methods here and I'm sure we could all think of more. But it's kind of like we have prediction as a concept, looking forward in time, or even to the now, which is kind of forward in time from when we made that prediction. And then there's the field around that. Because we're going to also look at post-diction and then the fields that deal with post-diction. So whether you're hearing about active inference for the first time or not, think about what prediction fundamentally is, looking forward in time, and then what are the attributes of fields that look forward in time? For example, they may not have proof, because they're making a prediction about something that hasn't yet played out. So it might be an estimate that you'll bet the house on, but can one say that they have proof when they're talking about something going forward in time? Now.
Yeah, this is where it gets really interesting, because if you're an archeologist slash anthropologist and you're trying to figure out whether Salisbury Plain was the originating place for all those rocks that got tipped up, you're actually looking forward into a future that you think is maybe unresolved, even though you're looking into the deep past as well. So again, I know that time's arrow seems to be going only one direction, but our minds don't seem to necessarily honor that. Exactly. So we can contrast prediction with post-diction, and so, quoting the paper: as a reasoning pattern, the FEP can be used to generate post-dictive statements in the sense of Friston et al. 2017, which we'll talk about. Accordingly, FEP reasoning might be interpreted as a principle akin to those found in the post-dictive sciences, a.k.a. historical sciences, hashtag keyword, like geology, paleontology, archeology, or any science that deals with irreproducible causes, Cleland. Post-dictive scientific statements are concerned with what must have been the case instead of what will be the case. So, Cleland's paper from 2002 is posted here, and it's very clear to see how the paper under discussion draws on this one. Just to read the conclusion of the abstract: I show that these different patterns of reasoning are grounded in an objective and remarkably pervasive time asymmetry of nature. So there is this, like, entropy's arrow, time's arrow, and that makes talking about the future and the now different than talking about the past. That's the historical sciences, and then there's the Friston et al. 2017 paper. Now this is interesting, because post-dictive was only used one time in that paper: figure two illustrates the recurrent nature of the message passing that mediates this predictive and post-dictive inference using little arrows.
One can clearly see that the first outcome can influence expectations about the final hidden state, and expectations about the final hidden state reach back and influence expectations about the initial state. This will become an important aspect of the deep temporal models considered later. So here's that figure two. It's the partially observable Markov decision process that we know and love, with a layer of the message passing added, because Parr and de Vries are authors, and this was, I mean, 2017 wasn't so long ago, but this was an important work that combined a lot of different areas together. And here's that reach back through time in an interesting way. Dean? Yeah, this is where bringing your history and my history into this interaction creates a new history. And I know that, symbolically, that's a gross oversimplification, but if you were trying to share this with somebody with not as much background, maybe that would be a simple tell, I don't know. Cool. And just like we talked about future studies, let's think about the historical sciences. So here we have the past, present, and future, and historical sciences can refer to history proper, the auxiliary sciences of history, or any science that draws from records of past events (aren't experiments also past events?). Now, the auxiliary sciences include but are not limited to, and there's so many fun fields here, things that are kind of like subfields of history, like heraldry, paleography. There's some great vexillology, a lot of awesome terms in here. And then I was thinking, well, you know, you can't prove something in the future; it hasn't happened yet. And then historically you also can't prove it, because it couldn't have been otherwise. Hence you have to only say what must have been the case instead of what will be the case.
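That "reach back" in the message passing, where a later outcome updates beliefs about earlier hidden states, can be sketched with a toy forward-backward (smoothing) pass over a two-state hidden Markov model. All numbers here are hypothetical, and this is a plain HMM rather than the full POMDP of Friston et al. 2017; it's only meant to illustrate how post-dictive inference works.

```python
# Minimal 2-state hidden Markov model (hypothetical numbers, not from the paper).
A = [[0.9, 0.1], [0.1, 0.9]]   # transition p(s_t = j | s_{t-1} = i) = A[i][j]
B = [[0.8, 0.2], [0.2, 0.8]]   # likelihood p(o | s) = B[o][s]
prior = [0.5, 0.5]             # initial belief over the hidden state

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def smooth(obs):
    """Forward-backward: posterior over each hidden state given ALL outcomes."""
    T = len(obs)
    # Forward pass: filtered beliefs p(s_t | o_1..o_t).
    fwd = [normalize([prior[s] * B[obs[0]][s] for s in range(2)])]
    for t in range(1, T):
        pred = [sum(fwd[t - 1][i] * A[i][j] for i in range(2)) for j in range(2)]
        fwd.append(normalize([pred[s] * B[obs[t]][s] for s in range((2))]))
    # Backward pass: messages carrying evidence from FUTURE outcomes.
    bwd = [[1.0, 1.0] for _ in range(T)]
    for t in range(T - 2, -1, -1):
        bwd[t] = [sum(A[i][j] * B[obs[t + 1]][j] * bwd[t + 1][j] for j in range(2))
                  for i in range(2)]
    return [normalize([fwd[t][s] * bwd[t][s] for s in range(2)]) for t in range(T)]

# Belief about the FIRST hidden state, given only the first outcome...
print(smooth([0])[0])        # ≈ [0.80, 0.20]
# ...versus given two later outcomes as well: the future "reaches back".
print(smooth([0, 1, 1])[0])  # ≈ [0.44, 0.56]
```

The point of the demo is the last two lines: once later outcomes arrive, the posterior over the initial hidden state flips, exactly the recurrent, post-dictive influence the figure-two quote describes.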
Like, you can't generalize; you can't say, well, the Galapagos are biodiverse because, and then just give one singular answer, even putting Tinbergen aside and everything. So what's left? The past, the present, and the future. And the now is just this kind of moving indicator that we're experiencing. And so it doesn't really make sense to prove something only in the now either. So that's just very interesting, how there's that, as Cleland calls it, pervasive time asymmetry of nature. And you know, put your hand down if you disagree. But what does that leave for us to actually work with in terms of scientific modeling? And from that tension is actually where FEP and ActInf are gonna be doing some interesting things. Yeah, this is, I love the images that you put up here, because one of the things from a learning standpoint that I found really helpful is, instead of a task being what answers can you pull from your memory bank, it was a learning exhibition where the chronology was kind of the slow hand. You'd lived out X number of months of working in a professional setting, and now, in front of a significant audience, you were laying out a chronological timeline where the focus was on the feelings, now top-down, organized and narrated. But that whole, that entire picture of feelings of working in a professional setting for months actually now came out in terms of a narrative. So the historical sciences piece here is actually an amazing consolidator as well. And that's what I found about this paper. It was actually an incredible consolidator. Agreed. So, necessity and sufficiency. They're different words. They mean different things. And sometimes it's easy to get it right in a simple case, but then when science gonna science, it gets really mixed up. So necessity means that you need it. Like the car needs wheels to roll normally. If you take it away, it's a loss of function.
The wheels are a necessity for the car. But are wheels a sufficiency for the car? Well, no, because just four wheels doesn't equal a car. Or you could have all the car, but if there's no gas, there's something else that's also necessary. So they're very similar terms. They both describe the conditions that are sufficient and/or necessary for something being the case. And that's been discussed on the Math3ma blog, in this figure, in the mathematical case. Of course, necessity and sufficiency is a key topic in the gene regulatory network case that I've studied, where they talk about genes being necessary or sufficient for a given phenotype. And then also in a legal case, from this Yale Law Journal piece, talking about the most restrictive to the least restrictive. Like, if somebody had no causal role in a crime that inarguably occurred, they're blameless. If somebody was necessary and sufficient for something to be caused, it makes sense that they would be guilty of its occurrence. But then there's complex systems in between, and there's the difference that makes a difference. And there's so many ways to talk about necessity and sufficiency. So there's sort of a simple bookmark on both necessity and sufficiency and neither, and then there's a complex space in between. All I would say is, no matter how often you talk about causation, it's amazing how much stuff just gets Roman-candled, information-wise, out of that simple word. That's all I need to say about that. The last keyword proper is the free energy principle, and check out ActInf stream number 14, number 32, number 26. I mean, roll the die. We talk about it a lot, but it's always important to go over it, because it may be someone's first time hearing it, and the simple aspects of it are the parts that we want to return to again and again, kind of like a strange attractor in our model, kind of like the FEP. We've talked about some of the technical part on the bottom.
So check out especially 32 and 26 for the bottom parts. And then the free energy principle is this imperative to be self-evidencing and anti-dissipative. If I could only choose two words, those would potentially be it, and they're both going to come up today, because anti-dissipative will be equated to persistence, duration of survival, and then evidencing is going to be related to Bayesian approaches to modeling evidence. And then where do you go from there? There's so many places to go: active inference, the corollary process theory. And we're gonna talk and unpack FEP in a few different domains. So, good to move to the next slide. All right, so how does the FEP, the free energy principle, relate to life in this paper? So the paper says: minimizing free energy is the process whereby one maintains one's structural integrity in the face of environmental perturbations by revisiting one's most probable organization of physiologic states. It is in that sense that minimizing free energy is considered a condition for life. One can equate survival with life, since one supposes the other under the FEP. And then, kind of two slogans, cool sentence structure: if I survived, it means that I maintained my structural integrity in the face of environmental perturbations in a dissipative universe, according to the arrow of time. And maintaining my structural integrity is what qualifies me as living. From the point of view of the FEP, when considering metamorphic organisms, it may be said that it is the life cycle that corresponds to the thing whose integrity is maintained over time, e.g. over evolutionary time, not the specific form that the system takes at one stage of its development, e.g. the adult form of a frog or ant. This casts the FEP within the realm of process ontology versus substance ontology. And I just have to say that Bucky Fuller talks about structural integrity as being like the house that stays up through time, but pattern integrity in his ontology is a system that recapitulates itself.
And so it retains dynamical integrity. And then that's connected to the idea of process ontology versus substance ontology, which we'll discuss more on the next slide, but it's just awesome. Like, this is an ant. An ant has a life cycle, but then you can almost pull back and say, well, really, the integrity of this egg becoming an adult is related to the nurses feeding that egg. So even if that egg-to-adult transition reminds us of a solitary organism, the integrity that's being regenerated generation after generation is the colony. And so what are we talking about the persistence of, when you never step into the same river twice? Like, in one sense, there is no integrity being maintained, because things are always changing. And so the FEP framing life in that way is very interesting. What do you think? Yeah, this is where, I would often say to people: I want you to go into this situation, I know it's new for you, but I want you to go into this situation with both eyes open. And with one of the eyes, I want you to see the project at hand, potentially. And with the other, I want you to be able to see the system at hand, potentially. And that's very confusing. Cause I mean, the best that most of us can do is just deal with the complexity of the project or the complexity of the system, not the complexity of both co-occurrently. But having said that, the better you can work at knowing both are present, regardless of what you decide is your focal point in that particular moment, the better you'll have an opportunity to sort of get the whole picture going on here. It's hard. In fact, it takes some moving away from the way that most people are trained to look at things. So it's really hard, to be honest. Let's dig into process versus substance ontology. Perfect. So the 2020 Dupré paper, Life as Process, is cited. And it's also just a cool kind of coincidence that it has a bilingual abstract, if not paper, with English and Russian.
And check out the paper and all the citations if you want to read more, but just to read the highlighted parts that kind of get the essence of the abstract: the metaphysics that has dominated Western philosophy, and that currently shapes most understanding of life and the life sciences, sees the world as composed of things and their properties. While these things appear to undergo all kinds of changes, like metabolism, development, context changes, it has often been supposed that this amounts to no more than a change in the spatial relations of their unchanging parts. From antiquity, however, there has been a rival to this view, the process ontology associated in antiquity with the fragmentary surviving writings of Heraclitus. For process ontology, what most fundamentally exists is change, or process. What we are tempted to think of as constant things are in reality merely temporary stabilities in this constant flux of change, eddies in the flux of process. My main claim in this paper will be that a metaphysics of the latter kind is the only kind adequate to making sense of the living world. And also related research is Causally Powerful Processes, from earlier in 2021, by Dupré in Synthese. So this is a fascinating line of research about moving from a substance-based to a process-based view, and where causation comes into play with that, with respect to the historical sciences. There's so many other ways to take this. Another quote that one of my philosophy professors would always say, and I think it was a quote from Kant, was: there will never be a Newton for a blade of grass. Because there isn't a mechanistic explanation that's timeless for a blade of grass in the way that there plausibly could be for, like, Newton's cradle. So I wanted to ask you: did you know about this before reading this paper, or did the paper cause you to go and look and find papers on process ontology? Because two of our colleagues, Blue and Steven, talk about flux a lot.
And so when you brought this forward, I thought, okay, so, Daniel, have you been hiding this from me all along, because you kind of knew about this, or did the paper in effect create a situation where you wanted to know more about process ontology? I followed the stigmergy. Right. In the paper, it says it might be argued that life and survival refer to an enduring process that advocates of process ontology would call organism, in the sense of Dupré 2020. I went down to the citations, and I knew this philosopher, but I hadn't read this paper. Yeah, so exactly: process ontology now generating new process ontologies, variation upon variation, right? And still in the FEP, still at that FEP narrow timeframe. That goes back to, as I said, I'm not sure if it's just subjective, or subjective and objective. So anyway, I'm glad you found this, because now I've got to read Dupré. Oh substance, where is thy sting? So how does the FEP relate to active inference? We're kind of getting ahead in the paper in one sense, so we'll be coming back around to it, but: the FEP on its own is a principle, namely a foundation for reasoning about things, e.g. living things. In this paper we approach the FEP as such. However, the FEP can also be read more broadly as a research program that uses FEP reasoning patterns to generate scientific hypotheses. This involves implementing FEP reasoning into a theory called active inference, which is routinely used to study various cognitive functions. When you go to the footnote, and there's only one: it's important to note that the FEP includes processes other than free energy minimization. It also includes expected free energy minimization and generalized free energy minimization. While minimizing free energy endows the organism with post-dictive inference, which we're gonna go into in the example and talk about more, minimizing expected free energy endows the organism with predictive inference.
So the FEP is post-dictive. It looks at the oncoming stream of data, and it compares that to, basically, what must have been the case. But ActInf, or any other predictive theory that's about expectations, kind of has to be different, because the consequences of action are in the future. So it has to be these two different modes. And then we're gonna come back to these ideas, but: FEP reasoning yields historical hypotheses, and not just like epic or legendary, but about the past, because it operates a historical evidentiary pattern and provides a compelling unifying causal story. It operates a curious evidentiary pattern, though, because it assumes that both the investigator and the thing under investigation conform to that evidentiary pattern. That pattern is free energy minimization per se. All right. So if you're making fewer prediction errors, your timeframe carries on, but you actually do become, I don't know if you want the label, but you do become somebody more versed in the post-dictive. What do we wanna call them? Post-dictive confirmations? I don't even know how to describe it. This was a huge thing for me. Post-dictator? Yeah, I guess. Yeah, we talk a lot about prediction markets, but where's the post-diction market? And then, what if organisms are so good at post-diction that it looks predictive? So let's see. There's gotta be, they've got disclaimers whenever you read forward-looking statements, right? This post-dictive is now a new term that I can pull into those conversations around the disclaimer. It's like, if something is not predictive, it's either about right now or it's post-dictive. Right, exactly. Let's get to the core terminology that Axel is bringing up here. And this is what reading the paper means: to understand why this entailment problem is framed, what responses exist to that problem, implicitly and explicitly, how it's addressed in this paper, and what the implications are going forward.
That's like, you'll have read the paper to have known those things. Great. This was an earlier quote: sometimes arguments in the literature on the FEP give the impression that in order to be alive, e.g. sufficiency, one must minimize free energy. Such a claim does not straightforwardly apply to the FEP, however, and this is what the paper will demonstrate. Minimizing free energy does not entail life. Candles minimize free energy too. Rather, the argument is that if you're alive, it probably means that you have done something like minimizing your free energy, which is the Bayes optimal thing to do when your life depends upon solving complex inference problems. This is a subtle but crucial point to getting the story straight. I shall call this the entailment problem, that is, the confusion in the entailment relation between free energy minimization and life. Here, the notion of entailment refers to the implication, i.e. first order logical property, between free energy minimization and the fact of displaying some life related processes. So let's think about two different islands, minimizing free energy and life. So we could think of it as like a Venn diagram. They could be overlapping, they could be partially overlapping, they could be not overlapping. Arrows can be drawn anywhere as long as you label them. Axel frames there to be two different kinds of threads, like two archetypes of framings in the literature. The first is the strong claim. The strong claim, or what's called the overly generous claim by Kirchhoff and Froese 2017, is that minimizing free energy is a sufficient condition for life. So we're gonna go out there with our free-energy-ometer, and things that are minimizing free energy are alive, because it's sufficient. The other claim is the weak claim. If a system is currently alive, it means that it minimized its free energy. In other words, minimizing free energy was necessary but not necessarily sufficient for being alive. 
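The strong-versus-weak distinction is just a containment question, and the candle is the counterexample from above. A tiny sketch of my own, phrasing the two claims as set inclusions (the two-element sets are illustrative, not from the paper):

```python
# Sufficiency vs necessity, read as set containment.
# Strong claim: minimizers <= alive  (minimizing free energy suffices for life)
# Weak claim:   alive <= minimizers  (minimizing free energy is necessary for life)

minimizers = {"bacterium", "candle"}   # a burning candle minimizes free energy too
alive      = {"bacterium"}

strong_claim = minimizers <= alive     # is every free energy minimizer alive?
weak_claim   = alive <= minimizers     # is every living thing a minimizer?

print(strong_claim)  # False -- the candle is the counterexample
print(weak_claim)    # True
```

The paper's argument in miniature: one non-living minimizer refutes sufficiency, while necessity survives.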
Such a claim does not assume the FEP is designed to set the bar for the sufficient conditions of life, or meant to predict what things may or may not be alive. Rather, it limits the scope of application of the principle to beings that we think are alive now. What made us think that? That's not in the theory. And it enables us to know the necessary conditions under which those beings can be living, i.e. can actively resist the loss of structural integrity. What took them there? Right, what do you think about that? Well, I just think it's interesting that he talks later on, we'll get to this, is that you need the correct or right priors, and if you go with the strong argument, you assume that the right priors going forward will continue to just materialize, versus the weak argument, which says it's not until afterwards that we know whether or not the right priors were the ones picked. Yeah, there's so many fun examples. So let's talk about Bayesian organisms. Bayesian approaches to animal behavior, whether it's Bayesian approaches to foraging or the Bayesian brain, propose that one can model. Red light! One can model, not that organisms are. Right. One can model. So just to forestall any kind of map territory distinction, which we always return to, but can't be returned to enough: one can model organisms as representing their relation to environmental states using priors and a likelihood. Let's call those representations Bayesian beliefs. Not in the organism, it's our representation of that in the model. On the basis of those beliefs, organisms generate adaptive behavior. Bayesian beliefs can represent the probability of environmental states prior to observing a signal, that's called a prior, prior means before, and the relation between environmental states and the observed environmental signals, a.k.a. a likelihood. 
Bayes theorem, from which terms such as the prior and likelihood come, is typically expressed as an evidentiary relationship between some prior hypothesis, P of h, and the observation on hand, or data, that's e. So e is gonna be like evidence or empirical, h is for like hypothesis. In the way that this expression is read, this is Bayes theorem, and it's read as the probability of the hypothesis given the evidence. Vertical bar means given, or conditioned on. So the probability of the hypothesis given the evidence equals, there's the equal sign, both sides are the same, the probability of the evidence given the hypothesis multiplied by the probability of the hypothesis, that's in the brackets, divided by the probability of the evidence. That's Bayes theorem. People use Bayes theorem to model all kinds of things about different systems. The free energy principle is a Bayesian formulation of the manner in which organisms infer the posterior probability of their prior beliefs after having observed an environmental signal with a given likelihood, and in doing so, infer some hidden or unobserved variables. Same equation as the last slide. What stands for the h are unobserved variables whose prior probability forms the hypothesis. And what stands for e are the sensory signals organisms receive, the data. So this is like, I think it's equally likely that it's day or night, like it's 12 and 12. So 50-50 on whether the latent cause of the world, the hypothesis, is daytime or nighttime. But you can't observe daytime or nighttime. All you get are photons on the retina. So then it's like, you have a prior, that's like 50-50, and then evidence comes in, photons. And then if there's a lot of photons, you update your prior to get the posterior. And then you think, okay, it's daytime, because I'm getting a lot of photons. 
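The day/night walkthrough can be written straight out as code. A minimal sketch: P(h | e) = P(e | h) P(h) / P(e), with the 12-and-12 prior from above; the specific likelihood numbers are my own illustration, not from the paper.

```python
# Bayes theorem: P(h | e) = P(e | h) * P(h) / P(e)
# Hypotheses: the hidden cause is "day" or "night", 50-50 a priori.
# Evidence: "lots of photons" on the retina. Likelihoods are assumed for illustration.

prior = {"day": 0.5, "night": 0.5}
likelihood = {"day": 0.9, "night": 0.1}   # P(lots of photons | hypothesis)

# P(e): total probability of the evidence, summed over hypotheses
p_e = sum(likelihood[h] * prior[h] for h in prior)

# Posterior over hypotheses after observing lots of photons
posterior = {h: likelihood[h] * prior[h] / p_e for h in prior}

print(posterior)   # the photons push the belief strongly toward "day"
```

Swap in a low photon count (evidence whose likelihood favors "night") and the same two lines push the posterior the other way.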
Hence it is often said that under the free energy principle, organisms are viewed as embodying a hypothesis, belief, or best guess about the cause of their sensations, or sensory signals that they receive. That's because the evidence for an organism are its onboard sensors. And so the hypotheses are about causes of what is giving rise to sensation, just like in the daylight example, where it was a hypothesis about what was giving rise to the photons. Because beliefs are embodied by the organism, hashtag embedded, enacted, et cetera, and thus are the organism's own states, the uncertainty in the likelihood and the prior can be viewed as representing uncertainty inherent to the biological apparatus, instead of uncertainty in the world, as would be the case under typical Bayesian models. Under the FEP, uncertainty should thus be read as reporting a Bayesian credence or credibility score over the organism's own beliefs, as it reports the probability of a state or hypothesis relative to other possible hypotheses. What do you think, Dean? Well, I'm gonna, what do I think? I'm gonna ask you a question back. So is this another way of saying an account of uncertainty, right? Like, so that sounds like, wait a second, how can we have an account of something that we don't know? But is that what he's, I know I don't wanna interrupt the math because I really like the way he step-by-step goes through this. But when I read this, again, when I read this the first time I went, oh, okay, well, this is an account of uncertainty. Wait a second, how can that be? So. Yes. Uncertainty is about the hypothesis giving rise to... So the when? It takes place over a duration, because the updating of priors is happening through time. So it's like, as my tummy is sending me signals, the body is as if determining whether I'm hungry or not. And then that uncertainty is the probability of a state or hypothesis relative to other hypotheses. 
So the traditional models or methods that we use to try to account for what has been learned doesn't necessarily take this into consideration when in fact, this is the basis or moving forward and for learning. Yes? I think there'd be some interesting dot one and dot two discussions about uncertainty in frequentism versus Bayesian probability and about what it means for a frequentist p-value-driven understanding of certainty or not. I fail to reject hunger at the 0.05 p-value versus the Bayesian credible interval or the Bayes factor of being hungry is this or that relative to this conditioned on the sensory inputs, literally hypothesis hungry or not conditioned on the inputs. Yeah. Yeah. Again, I don't wanna take this off in a tangent. Let's get back to the math because he does a brilliant job of clearing this up. This is such a cool discussion. The simple answer to the question of whether priors are objective or subjective under the FEP is that they're subjective. Where's my gavel? It's over. As we said, they track the confidence over a systems belief. Under the FEP, priors do not conform to a rationality beyond the rationality of the inference per se, but nor are they rationally unconstrained. So this always makes me think of like sort of the new atheist, like if you were rational, you would believe like we believe. And then that kind of absolutist rationality making way for a multi-perspective rationality, like I guess different people with different priors and sensory inputs implement their own local rationality and they come to different conclusions. Fancy that. There's a rationality beyond Bayesian rationality. It comes from the way variational Bayes is realized by the system. And we're gonna talk about that later. It basically embeds when into that process because the approximation to the posterior in the variational Bayesian inference, Q, is the prior used in the next cycle of inference. 
The rational constraint over priors is the fact that the approximate subjective posterior Q will not only be Bayesian, but will also always be the best guess relative to what the true posterior ought to be. In short, under the FEP, even though priors refer to psychological states of the system, or internal states of the system, updates of the system make those priors an approximation of what they should have been had the prior been updated with exact Bayes. Thus, it might be said that priors under the FEP cut across the objective-subjective dichotomy. They are subjective, just like the gavel said above, while satisfying a rational constraint mandated by the existence of the system per se. Yeah, this is the precise approximation argument that comes up often, meaning it's not a question of do we need both? We do, but how do we have both available to us? Well, we got to keep both eyes open, and we got to look at both the system and the project, which is really hard, because it's much easier to just pick one or the other and give our attention over to that thing. That's why post-diction is a little bit easier, maybe. Hindsight isn't 20/20, as we'll prove in this paper, but it's still better than making that bet with absolutely no priors. We talked about substance versus process ontology. And then we have this subjective-objective. So it's almost like substance ontology maps with objective, because it's like the object, it's the substance. The what? Yeah, the what. And then process ontology maps to subjective, because subjective is relational and dynamic, it entails something different. The when, yeah. And then we can kind of lean more on one side or the other, but really it's like they're nested within each other. 
Yeah, we can't let, if we drop one or the other, or if you hear people dropping one or the other, you kind of have to bring them back to, no, they're both here, they're both present and they're both available, regardless of what you choose to make your focal point. That's when they go, well, that's like just your opinion, man. Right, exactly. Yeah. So let's go to the crux of the paper, which is the numerical example. Okay, so there's some other pieces, we're giving a linear ordering and kind of jumping through the paper. We're gonna think about an organism, like a bacterium, that's gonna infer whether A or B occurs externally. For the organism, A and B are part of the class R, and the representations by the organism are A or B. So R can kind of be read like receptor states. A and B are inferred when receiving a chemical signal part of the class S, like S for signal, which can be alpha or beta. So S is the signal molecule, alpha or beta. We assume that before observing any signal, so prior, the probability of A is p and the probability of B is one minus p. So they're mutually exclusive and together they form a probability distribution. Then the numbers are given for A and B, the prior, and then the conditional probabilities. And then this is where life and death enters the picture. You gotta guess correctly. If A is out there and you guess A, you live. If A's out there and you guess B, you die. So you wanna be on the diagonals. You want A and alpha to go together, and B and beta, but you don't want the off-diagonals, you'll die. Okay, let's look at it graphically in figure one. So the right panel is a visual representation of the organism inferring its prior beliefs about the cause of the observations it makes. And then it restates several of these pieces. So we have like the prior, and then we have the states A and B out there. Here's something really important for all you ActInf heads out there. There's no action involved in our example. 
We're not talking about active inference. In more biologically realistic descriptions of behavior, like Tschantz et al. 2020, which is what this example is a simplified version of, we need to discuss active inference. In active inference, behavior is the result of a different inference process, that of an action policy. This involves more priors, namely about the transition between hidden states and often about preferred sensory outcomes. So that would entail a model where it's like, okay, there's a probability of transition between A and B out there. And then there's gonna be a preference for alpha, and then I'm doing action selection on tumble or not, based upon whether things are going well for me or not. So that's several more pieces to the model, but it's a toy, toy example that's going to make the point that minimizing free energy is not sufficient for being alive, but it is necessary. And so we're just distilling to the simplest possible sensory processing example that doesn't even have action. So maybe someone could argue that, well, adding action somehow makes this example null and void or whatever, but I don't think that's the case. And so active inference does involve more priors, and action is then distinguished from inferring hidden states. It's about inferring another hidden variable, which is policy. So the first line is the generative model, the joint distribution, which is multiplying the prior and the likelihood. The second and the third line, these two, represent the prior and the likelihood. So here's the prior, it's 80% a priori that I perceive A, and 20% a priori I perceive B. And then there's this mapping between A and alpha, and there's like more density there, and then B and beta, more density there, and the off-diagonals are like less. The fourth line is a possible Bayesian algorithm that could be used. 
And then the fifth line is a variational free energy minimization, which we'll talk more about, but that's kind of this example is we're just thinking about the sensory processing of a bacteria or a cell that is needing to correctly guess the cause outside when it's only getting a receptor on its membrane, for example. What do you think, Dean? Daniel, do you think people will know what the difference is between the process that you and I and Stephen walked through in 33.2 and what we're looking at right here? Will they know just how different those two representations are? What do you think? It's a good question. We can restate that this is not an active inference model. It's also not even trying to be the most awesome Bayesian bacteria model ever. This is like just using Bayes theorem to show that simply minimizing free energy is not gonna cut it for being alive under this contrived setting, but when we were in 32 or yeah, 33.2 and we walked through the matrices where there was a preference and there was the hidden states, it was quite different. Yeah. So check that out. Do you see co-variation happening at all in this? Co-variation? One example of co-variation is like you need to believe that you have accurate sensors. If you, there needs to be a co-variation between A and alpha. Okay, but if you don't have them and you don't assume that you have them, are you not dead eventually or pretty quickly? That's sort of the gentle invisible hand of selection sweeping those kinds of systems off the table. Yeah. Okay. Well, I'm just asking because when I saw this, the first thing I said to myself is, my goodness, if there was a continuum between 33.2 and figure one of 33 or 34.0, man, these two things are really, really far apart. They are, well, you know, let's go to 33.2. So in 33.2, this is like looking, here's what we had done just last week or whatever earlier this week. Here, the observations are getting mapped to this prediction about hidden states. 
So that's actually, this A matrix is also the mapping matrix. Maybe it could have been a little bit clearer, because also A is like the hidden state that's being inferred. But this bacterium is just reflecting this part right here. So the prior comes in, it's like just this sort of right angle neighborhood where you have the prior beliefs, and then we're gonna have observations coming in, and then a mapping to the hidden states. What active inference brings in is this whole tree where we're going to influence, through policy selection, how hidden states change through time, and that entails the preferences for energy minimization and the affordances. So this is like just a sub-component of active inference. All right, so we're gonna assume that the computation is conformant to Bayes theorem, and that the posterior probability after observing the signal is going to basically update the prior towards what is most likely given what was observed. We're going to infer the posterior probability of A after observing alpha. So the prior on A was 0.8. And then the probability of observing alpha contingent on A was 0.7. So if it's A in reality, then 70% of the time you receive alpha. That's what is actually on the receptor. And then B is just one minus such, and so 0.7, 0.3. Okay, Bayes rule is applied and you carry through the math. And basically, after you've observed alpha, your probability of estimating that the external state was A went from 0.8 to 0.9032. So it didn't take you to 100, but it bumped you up. And then that is Bayes updating. So that's a Bayesian bacterium updating its belief about a hidden state of the world based upon a molecule that actually only has a 70-30 accuracy. And that's really important to know. Like it sounds like 70-30 should almost degrade your accuracy, but actually, even if it were 51-49, it would still upgrade your accuracy a tiny, tiny, tiny bit. 
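The update just walked through fits in a few lines. A minimal sketch using the numbers above, with the 51/49 variant at the end checking the "even a barely informative signal nudges you up" point:

```python
# Exact Bayes update for the toy bacterium:
# prior P(A) = 0.8, P(B) = 0.2; likelihood P(alpha|A) = 0.7, P(alpha|B) = 0.3.
p_A, p_B = 0.8, 0.2
p_alpha_given_A, p_alpha_given_B = 0.7, 0.3

# P(alpha) = P(alpha|A)P(A) + P(alpha|B)P(B) = 0.56 + 0.06 = 0.62
p_alpha = p_alpha_given_A * p_A + p_alpha_given_B * p_B

# Bayes rule: P(A|alpha) = P(alpha|A)P(A) / P(alpha)
p_A_given_alpha = p_alpha_given_A * p_A / p_alpha
print(round(p_A_given_alpha, 4))   # 0.9032 -- belief in A bumped from 0.80 to ~0.90

# Even a barely informative 51/49 receptor still nudges the belief upward a bit
p_noisy = 0.51 * p_A / (0.51 * p_A + 0.49 * p_B)
print(round(p_noisy, 4))           # 0.8063 -- a tiny upgrade over the 0.8 prior
```

The 0.9032 here is the value quoted from the paper's worked example.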
Now, if it were 1, 0, 0, 1, a perfect identity mapping, well, then yeah, observing it, if it's perfectly associated, is going to update your belief a lot. But even something that's relatively noisy like 70-30 can still make you go from 80 to 90% sure. This is just basically taking whether or not you've got a decontextualized set of information versus a contextualized one, and what the benefits of adding contextualization are, I think. Again, maybe I'm oversimplifying, but I think that. Okay, so with exact Bayesian inference, this could be computed, but potentially that's not tractable in biological systems, especially if you start to think about how there could be like 500 metabolites. And that's called the black box problem, or the solipsism or seclusion problem, variously, in the FEP literature. So how does the FEP get around that? In order to bypass the problem, the FEP models that inference as approximate Bayes. Approximate Bayes bypasses the direct evaluation of the likelihood, and that makes it a lot more tractable. The central claim of the FEP, interesting to read, is that changes leading to behavioral and neurophysiological responses in living systems conform to a form of approximate Bayesian inference known as variational Bayes. So check out ActInf livestream number 26 to learn more about variational Bayes, which basically relates to: we're going to constrain the family of equations, and then we're gonna do model fitting on that family, and that'll be easy. Like, we're going to constrain our equation to y equals mx plus b, a linear model. Well, then we can fit it with least squares error. It's gonna be a convex optimization, we can do it. These are different types of problems, but similarly, we're going to have a certain family of equations that we parameterize, and that's variational Bayes. What citations did you add? 
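The "constrain a family, then fit" idea can be sketched on this same toy problem. The grid search below is my own illustration, not the paper's algorithm: candidate beliefs Q(A) = q form a one-parameter family, each is scored by the variational free energy F(q) = Σ_r Q(r)(ln Q(r) − ln P(r, alpha)), and the minimizer recovers the exact posterior, with the minimum equal to −ln P(alpha), the negative log evidence.

```python
import math

# Joint distribution P(r, alpha) for the toy model: prior 0.8/0.2, likelihood 0.7/0.3
p_joint = {"A": 0.8 * 0.7, "B": 0.2 * 0.3}   # P(A, alpha) = 0.56, P(B, alpha) = 0.06

def free_energy(q):
    """Variational free energy of the belief Q(A)=q, Q(B)=1-q, after observing alpha."""
    f = 0.0
    for r, qr in (("A", q), ("B", 1 - q)):
        if qr > 0:
            f += qr * (math.log(qr) - math.log(p_joint[r]))
    return f

# Grid search over the one-parameter family of candidate posteriors
grid = [i / 1000 for i in range(1, 1000)]
best_q = min(grid, key=free_energy)

print(round(best_q, 3))               # ~0.903: the exact posterior P(A|alpha)
print(round(free_energy(best_q), 3))  # ~0.478 = -ln P(alpha), the evidence bound
```

Note the minimum, about 0.478, lines up with the 0.47 free energy figure quoted later from the paper's good-prior case.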
All I put in here was some stuff that I was looking at over a decade ago now, and some conversations that I had with people like Morris Moscovitch and Rayna around the idea that, neurobiologically, is there some sort of an analog between an elephant walking across the span once and not really seeing the desire line, versus if traces are going through multiple times, how does that affect potential gene folding in the neurons and stuff like that? And so I don't think it builds on the idea of what you're talking about in terms of how we might want to be able to set up recognition software, but over a decade ago, there was the idea that something is grooved in eventually, and an approximation becomes something clearer, and the energy that you have to put into that, and that's where you had the top cat being maybe too energy intensive. I just think it's really interesting because back over a decade ago, fMRIs were supposed to be the thing that was gonna give us a clearer picture. And there were people, even back then, saying, wait a second, we wanna be able to talk about when a really blurry picture becomes a more clear picture, as opposed to this is what a bump in energy tells us. They were kind of trying to kill that idea more than a decade ago. Cool. Now, in that example, everything went well for the bacterium, right? It perceived alpha, it updated A, it's all good. So the following numerical example will demonstrate that minimizing free energy is not sufficient for life, as it can lead to the exact opposite: death. This is really key, FEP haters out there. Note that the scope of the following numerical example is deliberately limited. The goal is to demonstrate minimizing free energy can lead to maladaptive inference when performed with the wrong priors, all things being kept fixed. If the priors are allowed to update, the inference should lead to adaptive behavior. 
So it's a contrived example. This isn't how you would do FEP modeling if you wanted your system to work. Adaptivity is guaranteed by the extent to which priors match environmental constraints, more than by the machinery employed to perform the inference. So just saying, we used Bayes, it's like, this isn't a sales pitch. You could have the wrong priors. We used free energy minimization in our software. Okay, well, you're going right off a cliff. So you can have the wrong priors and think, this is fine. So it's not enough to hide behind the methodology. Priors have to be stated and they have to be adaptive. It's not enough just to say, well, free energy minimization occurred, so things worked, right? No, it's not just your one-stop shop. It's like a linear model, but a little bit different and more complex. So the following example of free energy minimization is provided with the sole purpose of supporting our response to the entailment problem, which is to say supporting the weak and refuting the strong claim. The goal is to give a formal intuition as to why minimizing free energy is not sufficient for life, understood as the preservation of structural integrity. By no means should the following numerical example be viewed as an exemplar of the manner in which free energy minimization operates mathematically. Okay, so this is not an FEP model. It's just not. It's an FEP-like model that we're using to make some important claims about the FEP. And so here in equation four is the free energy formulation. This is the variational inference, which we're not gonna go into right now, but what we'll note is just the Q of R, where R is, remember, the receptor state, like A or B, and Q of R corresponds to the proposal, recognition, or approximate posterior density referred to in the FEP literature. 
It is Q of R, the recognition density, remember we're only making a sensory model, that is embodied by the organism, not to be confused with P of R and alpha, which would be the joint distribution or generative model. So way back in livestream number six, we talked about how there's like the recognition model, which is from the data to the hyperparameters. That's from the observation of alpha or beta to A or B. And that's related to perception. The world is handing us alpha or beta, and then we're updating our causal model of the world, but then we're not here discussing action. So it's kind of a limited example. It's more like a unidirectional example, but adding in the second direction doesn't get around the fact that perceptual free energy was minimized, and that doesn't always work. So let's look at it not working. Well, first, a case where it works. So let's assume our organism operates under variational Bayes and still represented A with the same level of confidence. So here's what happens. This is actually a good variational Bayes case. Okay, so here alpha is observed, A is represented, and the free energy is calculated at 0.47. And then in equation six, the question is whether representing B has a higher free energy, contingent on observing alpha. That gives a free energy of about 2.28. So minimizing free energy is better. So under the variational model, we still would have selected the right outcome. So this is actually the variational equation working. We're going to change it in a second. The variational equation works. Friston's like, yeah, you're good. Minimizing free energy leads to survival only under the right conditions, which is if you have the right priors. So now we're going to imagine the same scenario with inverted, maladaptive priors, okay? So you're dumped into the desert, but the priors are for the ocean or something. So here the P of B is 0.8. So now, after there's an update, the free energy of representing B after sensing alpha is 0.96. 
And the free energy when representing A after sensing alpha, which is like the correct thing to do is higher. It's over one, okay? So the minimal free energy would be representing B after sensing alpha because you had such an overwhelming belief in B that even observing A doesn't push you all the way to believing A. I think there's a typo in the paper because it says represented state B with 0.63% confidence, but it's actually 63% or 0.63. But the point is, if you started at 80% believing in B, then you got alpha, you dropped from 0.8 to 0.63. So you're still 63, you're less confident, but still you're two thirds believing in B. So in other words, free energy minimization just happened and you just picked the wrong thing because your priors were wrong. That bacteria is gonna die. So that's the hand sweeping those maladaptive priors off the table. That's like evolution by natural selection. Hence minimizing free energy per se does not entail life, not under the wrong prior. Not dressed like that, you don't, but it's just again, making a simple important point. Minimizing free energy is not sufficient for being alive. It is necessary for being alive. Although free energy minimization is not sufficient for life, the numerical example suggests there may be an entailment that goes the other way around. If you're alive, it might very well be because you did something like minimize free energy. So that's the weak relationship. So we kind of invalidated the strong claim of sufficiency, but we're actually finding support for the weak claim of necessity. This makes the FEP an interesting epistemic principle for researchers interested in development of themselves, of cities, of organisms. In a reverse engineering fashion, if we observe a free energy minimizing organism still living at the time that we observe it, just like all things that we observe are, we can trust it has good enough priors to remain alive. 
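The maladaptive-prior inversion can be checked numerically. One reading that reproduces the figures quoted above (63% confidence in B, a free energy around 0.96 versus something over one) is to compare two candidate beliefs after sensing alpha: the exact posterior, which favors B under the swapped prior, and its label-swapped mirror, which favors A. The choice of those two candidates is my assumption for illustration.

```python
import math

def free_energy(q_A, prior_A, lik_alpha_A=0.7, lik_alpha_B=0.3):
    """Variational free energy F = sum_r Q(r) * (ln Q(r) - ln P(r, alpha))
    for the belief Q(A) = q_A after observing alpha, under prior P(A) = prior_A."""
    joint = {"A": prior_A * lik_alpha_A, "B": (1 - prior_A) * lik_alpha_B}
    return sum(q * (math.log(q) - math.log(joint[r]))
               for r, q in (("A", q_A), ("B", 1 - q_A)))

# Maladaptive prior: P(A) = 0.2, P(B) = 0.8, and then alpha (caused by A) arrives.
prior_A = 0.2
post_A = 0.7 * prior_A / (0.7 * prior_A + 0.3 * (1 - prior_A))  # P(A|alpha) ~ 0.37

f_favor_B = free_energy(post_A, prior_A)      # the posterior belief: ~63% on B
f_favor_A = free_energy(1 - post_A, prior_A)  # label-swapped belief: ~63% on A

print(round(1 - post_A, 2))   # 0.63 -- still about two-thirds confident in B
print(round(f_favor_B, 3))    # 0.968 -- the free energy minimum...
print(round(f_favor_A, 3))    # 1.109 -- ...so representing A loses, and the cell dies
```

Free energy was duly minimized, the wrong representation won, and the bacterium dies: minimization without adaptive priors is not sufficient for life.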
Like I look at the ants, I think whatever they're doing, as the world has changed, they've been doing it for 120 million years. That's what we know about that. How do they know what to do when there's a break in the trail? Well, the ones that didn't aren't here, and the ones that are here did it well enough, and that's like a first pass at an answer. And then also, on a wing and a prior: on a wing means like sort of a few different things, but it reminded me of the wing snowflake, the 2007 paper, and then moth to a flame. And so if you're observing these snowflakes, the ones you observe are going to increasingly be the ones that have wings that can stay solid, because the ones that don't don't exist. So that's FEP on a wing and a prior. One could rightfully say that minimizing free energy, when equipped with the right prior beliefs, is sufficient for life, i.e. all you need to be qualified as living or maintaining your structural integrity under the FEP. The point here is that the choice of prior, whether under a Bayesian or approximate Bayesian regime, is the real concern, since the entailment relation between life and the FEP entirely depends on the adaptivity of those priors. Interestingly, some have argued the adaptivity of priors can also be guaranteed by free energy minimization operating at the population level, as a form of natural selection. The citation there was to Hesp 2019, a multi-scale proposal of the emerging complexity of life, by Hesp, Ramstead, Constant, Badcock, Kirchhoff and Friston. And here's a figure from there. I'll say though that upcoming, there will be a paper with myself, Axel, Blue, Parr and Friston that will go into a lot more detail on this. So it's absolutely a great point and I believe that it's the case. Dean, what did you write here? 
Well, I was just wondering, from the perspective of what a prior is, is it available there in terms of something that we can leverage off of, and then what do we do because we have some sort of previous marker from which to infer. So I was wondering if that's the essence of what adaptive, as a function, as an action, requires. But I think I'll wait until we bring up that paper with five authors, because that'll be... We'll discuss it. Yeah, exactly. We will discuss it. We'll have fun things to discuss. I can think of more examples, but I'll save them. So in the future direction, last section, the dissolution of the entailment problem. You know, again, write a paper if you don't think that the entailment problem is adequately parsed and dissolved in this paper. It's a great paper if somebody wanted to write it. This puts us in a good position to move to another related difficulty in the philosophical literature on the FEP, this time of an exegetic kind. Look at these synonyms. Hermeneutical, there's an awesome Friston paper about the hermeneutics of communication. Right. These are just nice adjectives. If minimizing free energy is not sufficient for life or survival, how should we interpret statements such as: the minimization of free energy may be a necessary, if not sufficient, characteristic of evolutionarily successful systems? Friston and Stephan 2007. Way, way, way back. Hmm. Postdictive scientific statements are concerned with what must have been the case instead of what will be the case. Like saying that the Hox gene duplication allowed for this body plan to change, it doesn't say what will happen, but it is about what was necessary for it to happen. So a statement such as the minimization of free energy may be a necessary if not sufficient characteristic of evolutionarily successful systems, this quote, is probably such a postdictive statement. 
The statement should be interpreted as claiming that free energy minimization must have occurred if a system is evolutionarily successful, not the other way around. We're looking at living creatures and postdicting what got them there, not looking at the class of all systems, hypothetical, failing, et cetera, and then saying what will make them alive. Because you can, again, minimize your free energy right off a cliff, or burn yourself out like a candle; that'll minimize free energy too. You could have maladaptive priors and still be minimizing your free energy. Daniel, do you think this speaks to the difference between deductively realizing when and inductively realizing when, or is that...? Let's talk about deduction, induction, and abduction. The "when" piece of this, again, I don't wanna push too far into the future because we're almost at the end here, but maybe that's a point-one thing, I'm not sure. Yeah, it's a definite point one. Okay. So the final slide; it's also the final lines of the paper. So it's a little bit personal, you know, it's interesting to read a paper with "I"; it's the first one-author paper we've covered. But it's also more in a philosophy style, and Dupré's papers are written this way as well when they're single-author: it's like, "I argue that this." So that's an interesting style that STEM papers usually don't have. So this is Axel bringing it all together. As the person who used the free energy principle in the numerical example above, the puzzling trace for which I was seeking a causal explanation was the survival of the organism. That was my observation. The causal story, or hypothesis, for that observation, under the conditions we imposed on our simulated organism, was the free energy principle, the inference over which led me to write the paper you are reading at the moment, and the live stream you're listening to at this moment. That paper functioned as sensory evidence for my hypothesis, e.g.
when writing down the numbers and seeing that they were adding up. And that paper, and this live stream, is the observation that you are using to evidence your hypothesis concerning the claim I set out at the start of the paper, namely that free energy minimization is not sufficient for life. So if you had wildly out-of-whack priors on that claim, maybe even after listening to us talk, you'd still believe what you believe. That's okay; it's all consistent and good. Faithful to the unifying grip of hypotheses in the historical sciences, the free energy principle is meant to account for all of that: you, me, and the organism under study, in a unifying fashion. So here is Axel looking at his figure one, and then we're reading the paper, and then the ActInf Lab and the broader ecosystem are observing that, and there are other things observing us. So FEP, all I see. Yeah. Yeah, again, back to the question of whether it can be both objective and subjective at the same time, and whether that fits within the relatively small time frames of what we're looking at when we're talking about the FEP. And this is an even smaller version of the FEP than maybe others look at, in terms of a unit of analysis. Anyway, I just think that the way he was able to use math to show that there is a way of arriving at knowing what the numerator is and what the denominator is here was, A, clear; B, pretty concise given the complexity; and C, pretty compelling. So. Agree with all of that. Really looking forward to the dot one and dot two. Of course Axel and anyone else is welcome to join in those discussions, and then we have just so many more papers to discuss in the coming years. We're gonna do what we do in the dot one and dot two: we're gonna be in the bathtub phase, in the thick of it, in the forest for the dot one. And then we're gonna have our springboard and our jumping-off in the dot two. And we looked forward to this dot zero, but soon we're gonna be looking back on it, at what we did. Exactly, perfect.
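[Editor's note: the "numerator and denominator" mentioned above are presumably the two parts of Bayes' rule in the paper's worked example of the signal-detecting bacterium. Here is a minimal sketch of that bookkeeping; the probabilities used (the 0.7 and 0.05 priors, the 0.9 and 0.2 likelihoods) are illustrative assumptions, not values from the paper.]

```python
# Toy Bayes-rule bookkeeping for a signal-detecting organism.
# All probabilities here are illustrative assumptions, not the paper's values.

def posterior(prior, lik_signal_given_nutrient, lik_signal_given_none):
    """P(nutrient | signal): Bayes' rule written as an explicit numerator / denominator."""
    numerator = lik_signal_given_nutrient * prior
    denominator = numerator + lik_signal_given_none * (1.0 - prior)
    return numerator / denominator

# An organism with a well-tuned ("adaptive") prior about nutrient presence:
print(round(posterior(0.70, 0.9, 0.2), 3))  # → 0.913

# The same sensory evidence under a badly mis-tuned prior yields a much weaker inference:
print(round(posterior(0.05, 0.9, 0.2), 3))  # → 0.191
```

The point the sketch tries to echo is the one made throughout the discussion: the inference machinery is identical in both calls, and only the choice of prior separates the adaptive case from the maladaptive one.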
Perfect setup. Okay. Thank you so much. Thanks, Dean. Absolutely. Thanks everybody for watching. Hope to see you around. All right, bye.