So, okay, hello everyone, thanks for joining this second meeting of the second cohort of the Active Inference Textbook Group, where we're going to be discussing chapter one. Today is September 9th, 2022. So I'll minimize this and you guys can click on my screen, and that will show you the full screen of what I am presenting. What we'll do in the live meetings here for the next two and a half months or so is going to be totally driven by the material that's in the chapters and in the appendices. We're going to go to the questions page and look at the questions that have been asked for that week, for that theme, and we're just going to start with the most upvoted questions. So here you can upvote a question like this by clicking thumbs up. We will start with the most upvoted questions, and then we can take notes here in the answers in the discourse, and if you just click on that, it opens this page and anybody is free to write here. After addressing and discussing the questions, which is probably going to be most of the time, we will move on to maybe look at the ideas more broadly over here. So we don't really have too many questions yet. If anybody wants to jump in and write a question in the Coda, now would be a good time; we did send an email reminder because this was pretty empty over the course of the week, but hopefully thinking by writing in this way will be a structure where we'll be able to hear everyone's perspective and leave the Coda in a better place than we found it. And then after looking at the questions, we will pull back and focus a little more broadly. Maybe we'll go over here to touch on the ideas and insights of the chapter, and then ideas for that week. And then in the last minutes, we can just see if there have been any changes to the project ideas and start to see what people are excited about applying active inference to.
And then we will end at the top of the hour. So let's just head over to the questions, the cohort two questions page, and take a couple minutes to put some questions in here. Just so you guys know, Control-plus and Control-minus are good ways to zoom in and out to get the right view of the screen, and then you can hide this little textbook bar over here. So I'm going to hide that and just give everybody maybe five minutes to put in a question or something we'd like to discuss from chapter one of the textbook. So maybe we could just take one more minute and wrap up these questions. I just want to say to everyone that at any point there's always the affordance of just listening, and also of rewatching this recording. There's also the affordance in Gather to raise your hand; let me see if I can show where that is. It's been a long time. Over here, there it is, down at the bottom. So there's this raise hand feature, and lower hand. Yeah. Okay. And if you would like to share on a particular question, you can raise your hand, and we're going to have the questions up in the Coda. You can also pull up the answers in the discourse, and that will pull up the focus view of the question. Everybody is free to collaboratively edit the answers here in the discourse. It's better to add and be noisy, even if it's just a bullet point list or an incomplete thought, or if there's incoherence or it's contradictory; those are all fine. It just adds information and perspective on the question. So yeah, we're going to open up each question and the answers in the discourse, and read what has been written and already contributed in the answers. And if people want to share other thoughts that they're having, that's awesome. And anybody can also just be taking notes, whether on things coming to mind that they don't want to share in a speaking way, or on what other people are saying.
So this is really a participatory group, and it just is what we make it here. OK, cool. I'm going to leave this little window open so I can see any raised hands in Gather. All right, let's see. I need a double screen. Let's see if I can do it in a small way. OK, cool. All right, so let's get into it. This first question, I'll pop it open. It says: what is the difference between a stimulus-response mapping and a state-action policy? So I don't know. Does anybody have any ideas here? I think that they're both kind of like an "if this, then that," but they're specific, I think, to machine learning. So if anybody has more depth or background in this area, that would be awesome to hear. Oh, is that Arun? Did you raise your hand? Arun? Yeah, can you hear me? Yeah, cool. I can give an attempt at an answer to that, from my understanding, and please tell me if I'm wrong. With a stimulus-response mapping, you've got 100% certainty about what the stimulus is. With states, you're then factoring in beliefs; you've got uncertainty. So maybe in a simple form, you could be putting a Gaussian distribution on every stimulus and then a Gaussian distribution on your response, to go from a stimulus-response mapping to a state-action policy. That's my intuitive understanding of what's going on. But again, please tell me if I'm barking up the wrong tree. Yeah, that's awesome. Thanks for sharing. And I kind of think about it in the same way, I don't know. I mean, this is probably totally wrong, but I think about the Roomba, the little vacuum that cleans your house. If you push the power button, it turns on and moves. That's the stimulus: you push the button and then it turns on and moves around because you pushed the power button. It's a very clear thing.
Versus the state-action case: if the vacuum gets full, then it dumps out. But that state of fullness is a distribution. It can be more or less full when it senses how far it is away from the dump-out bucket or whatever. So that's the state-action case. The state can be full, but what is too full for the Roomba? Is it a certain weight that's in there, or a certain volume, or a sensor? I don't know. I have uncertainty about when the Roomba dumps itself out. Cool, awesome. So we can just write this in here, and anyone is free to take notes and type in here. OK, cool. Yeah, anyone's free to just get in here and jump in and do this. All right, so let's look at this one. This question says: is the brain trying to maximize qualia divergence and minimize encoding length? Ali? Sorry, I raised my hand before. I just wanted to add some comments about the previous question, because I think it's a very important one. You see, as we go on, specifically in chapter 4 when we encounter the concept of the Markov blanket, I think we can see more granular differences between those two concepts, between stimulus-response mapping and state-action policy. But briefly, states are just representations of the internal or external world, and actions are specifically goal-oriented responses; they try to change the state of the system or the agent. But when we're talking about just stimulus-response mapping, the response doesn't necessarily try to change the state of the agent. As an example, if you touch a hot object, the pain you feel is just a response. But if you move your hand away from that hot object, it's an action which tries to change the stimulus. And one other important distinction here is that the term policy is used in a different sense in the active inference literature as compared to the reinforcement learning literature.
Namely, in reinforcement learning, this term is often used to refer to singular mappings, or one-to-one mappings, from states to actions. But in the active inference literature, a policy usually refers to a possible sequence of actions, not just singular actions. So that's, I think, one important distinction we need to keep in mind also. Cool. Awesome. Thank you very much. And I looked that up, actually. So here, I think that when they talked about state-action policies, they were using it in the machine learning sense, because the textbook says the goal-directed character of action in active inference is in keeping with early cybernetic formulations but distinct from most current theories that explain behavior. Oh, sorry, someone's talking. Could you please mute? Thank you. So it says: distinct from most current theories that explain behavior in terms of stimulus-response mappings or state-action policies. So I think that here they're not talking about active inference state-action policies; they're using the term in the reinforcement learning way. It also says stimulus-response or habitual behavior then becomes a special case of a broader family of policies in active inference. So I think that the way they were talking about it in the textbook refers to the reinforcement learning usage as opposed to the active inference usage, which, as Ali said, is definitely different. Any other comments or thoughts on this question? And thank you to whoever is taking notes in the Coda; that's always super helpful. So we can move on and look at this question: is the brain trying to maximize qualia divergence and minimize encoding length? Who has some thoughts here? Just the idea of maximizing qualia divergence seems like, at some point, that's beyond schizophrenic, maybe. So there's maybe some threshold you would not want to go beyond. And that seems, to me, to have something to do with epistemic value.
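On the policy distinction above, here's a minimal Python sketch contrasting the three usages. All of the names and the Roomba-style states are invented for illustration, not taken from the textbook:

```python
# Stimulus-response mapping: a fixed lookup from observed stimulus to
# response, with no beliefs or uncertainty involved.
stimulus_response = {
    "button_pressed": "start_motor",
    "bin_full": "dump",
}

def respond(stimulus):
    return stimulus_response[stimulus]

# RL-style state-action policy: still a one-to-one mapping, but from an
# inferred (uncertain) state rather than a raw stimulus.
def rl_policy(belief_over_states, action_by_state):
    # act on the most probable state under the current belief
    likely_state = max(belief_over_states, key=belief_over_states.get)
    return action_by_state[likely_state]

# Active-inference usage of "policy": a candidate *sequence* of actions,
# evaluated as a whole before one sequence is selected.
policies = [
    ("move_to_dock", "dump", "resume_cleaning"),
    ("keep_cleaning",),
]
```

The point of the sketch is just the shapes: a dict of stimulus to response, a mapping over an inferred state, and a list of action sequences.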
If you just try to maximize your epistemic value forever, you would run into a pragmatic problem, which is encoding length. So I guess I'm wondering if maybe a different, or commensurate, framing of the question is: what's the relationship between epistemic value and pragmatic value? Like, are there trade-offs there? Yeah, I skimmed through chapter 1 and really didn't see qualia divergence or encoding length there in the chapter. So I'm not really sure; is this anyone's question? Do you want to unpack this question a little bit more for us, maybe? No volunteers. Awesome. Daniel, thanks for joining us. I can officially pass the baton off to you, and maybe you have some comments about maximizing qualia divergence or minimizing encoding length here. But we got in and wrote a few questions and went through the first one. We could also go back and touch on that if you've got comments on that as well. The question just makes me wonder: what is being minimized? What is being maximized? And when is it some particular value that's being minimized or maximized, and when is it the divergence between two other quantities? In chapter 1 there aren't any of the formalisms yet, but we'll see a lot of double bars and KL divergences that will cue us towards where a divergence is at least being described. If you're ready, I can unshare my screen and you can take over. Please continue. OK. Any additional comments here? I like this. I didn't get to write this down, but Brock had mentioned qualia divergence and schizophrenia; beyond schizophrenia, that was the term. Diversity, you can imagine, is necessary, and some more might be really healthy and useful. But at some point, then you have multiple personalities or something?
That makes me think of the sense of normalcy: what is experienced as normal, how much of a deviation from normal is still perceived as normal, and where is it perceived as creative and maybe even enjoyable novelty, like a guitar solo? And then where is the qualia divergence too high and maybe uncomfortable? And then what's the view from the inside, the actual experience, what the qualia bring in? And where is this being modeled from the outside with behavioral descriptions, where we might not be able to directly address what the qualia are? And Daniel, that reminds me: you said some of it might be normal and some of it might be uncomfortable, or some divergence might be creative and some might be uncomfortable. There's actually a huge degree of mental disorder associated with creativity. Studies have been done, and there's the classic songwriter or singer who commits suicide, and we hear this again and again. There's actually a great degree of mapping between the creative type and some kind of atypical mental state. So maybe both of those are the same. OK, cool. Anybody have any comments? I don't know if I can see everybody in my tiny screen. Go ahead. Sorry. Hello. Hi. It's not clear to me what precisely qualia divergence means, because divergence with respect to what? I mean, usually a KL divergence is a measure of the difference between two distributions. So what exactly does it mean here? Maybe the person asking the question can clarify? I'll also just note, sorry, I don't know if they're here, it's already been asked. But to me, it just seems like an internal kind of population: one qualia state and another qualia state that are differentiable.
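A note on the KL divergence mentioned above: for two discrete distributions it can be computed in a couple of lines. The particular numbers here are arbitrary examples, not anything from the chapter:

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists of
    probabilities over the same outcomes. It is asymmetric, non-negative,
    and zero only when the two distributions are identical."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.7, 0.2, 0.1]  # e.g. one belief or "qualia state" over three outcomes
q = [0.5, 0.3, 0.2]  # another
```

The asymmetry is the reason the question "divergence with respect to what?" matters: KL(p || q) and KL(q || p) are generally different numbers.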
Like, you can certainly have divergence of qualia in your own experience if you just consider the various sensory, experiential states that you occupy. OK, so that would mean expanding the range of your phenomenology, something like that. Yeah, yeah. So you could imagine a sense of space and self that's kind of frequent and normative, and that maybe in a religious or spiritual setting, or other kinds of group or team settings where you get a group flow going, starts to change. But maybe if those two things greatly diverge, that wouldn't be great. So that maybe is related to the complexity of the part of the model that encodes a representation of the self. That would increase the complexity of the model, which is one part of the free energy minimization scheme. So I guess, with each increase in the epistemic import of observation, the complexity of the model also needs to be kept within some limit, inside your blanket. So yeah, it's probably a trade-off between increasing the complexity of the model, for what concerns the range of possible experiential qualia, and their epistemic value, I guess. I'll also just mention, out of fairness, that qualia are not mentioned in the textbook. It's totally awesome to consider what the formalisms entail for philosophy, as many papers and many discussions do, and it's an important area to explore. But it's also not what is addressed. And even though these authors tackle the problem head on in other settings, the textbook is seemingly crafted to enable discussions that might not require considering the view from the inside, or the phenomenology that qualia entail. And that helps the generality of the framework and its applications across different systems.
And so understanding how they map onto our undeniable experience is a great area to explore. Yeah, overall the book doesn't expand too much on the part of the model that represents the agent itself as an entity, like our own self-representation. It mentions it but does not really dwell on it. More about that is probably in Jakob Hohwy's book; there's a little bit more on that there, but in this one, not so much. Well, perhaps this connects to the following question on self-evidencing. So before we jump forward, I just want to stay on this question for one second. I looked up the definition of qualia here, and it's the individual instances of subjective conscious experience. And I don't know, maybe the question, let's read it again, is the brain trying to maximize qualia divergence and minimize encoding length, is meant in an evolutionary sense. Like, is this what the brain is trying to do across time, in deep time, from prehistory to now? So yeah, we can move on, though, because maybe self-evidencing is more addressed in the book than qualia divergence. Yeah. So what do self-evidencing and enactivism really mean? Does anyone want to take a stab here? Yeah, I asked the question, because when I read it, my eyes just glaze over when I hit these words. I can read them in sentences, and it just doesn't stick in my head what is really meant by self-evidencing. It just turns into something tautological. I don't have any real grasp of what is being said in these sentences. I can take a stab at that, because really it's quite a simple concept. If we think about the agent not as having a model of the world but as being itself a model of its own environmental niche, then minimizing free energy basically means working, acting, so as to collect the observations that it expects; that is, minimizing surprise.
So you could say it's finding the evidence for its model of the world, but since the agent is a model of the world, that becomes self-evidencing. Basically, you could also see it as trying to find the sensory stimulation that makes your phenotype, your organism, viable, that sustains you. That's self-evidencing: finding the observations, the sensory evidence, that are most coherent with your makeup as an organism. Thanks. I'll add a few notes. So enactivism is a long-lived thread in the cognitive sciences that existed long before active inference. It says that systems with bodies, real, particular things existing in the world, ought to have their cognition modeled as bodies engaging with the world, rather than taking a view from the outside or necessarily needing everything to be symbolic communication. In digital systems or abstractions of cognitive systems, it's very easy to get into information- and symbol-passing discussions, whereas for real systems with biomechanics and embodiment, enactivism brought attention to the requirement of considering the active engagement of an entity with its niche. So maybe it's not a term that provides a huge cognitive model update, because for many people it's very natural to consider cognitive systems as enactive and embodied, embedded, encultured, et cetera. Sometimes that's called 4E cognition, and those frameworks are all in allegiance together. But that's what enactive points towards. And self-evidencing describes the imperative of systems, as modeled under active inference, to accrue consistent observations and find evidence for the kind of thing that they expect and prefer themselves to be. In the coming weeks, we'll explore this "expect and prefer": how can expectations be preferences, and what about when we expect bad things to happen, and uncertainty about expectations?
But we can contrast this imperative for finding consistent, self-consistent, or self-evidencing observations with pursuing rewarding observations. One can say: I expect and prefer to be at room temperature, and then move around the space, or put the jacket on or off, to reduce surprise about what observations one is receiving, as opposed to needing to construct a reward framework in which room temperature is rewarding and then gets less rewarding, where what is being maximized is the reward of the cognitive system. Again, in contrast, self-evidencing puts the imperative on the system's self-model and reducing surprise about its own observations, rather than seeing those observations as just proxies of reward states that have to be maximized. Arun? I found both of those answers really interesting, and I have a question on each. So on self-evidencing: maybe this is fast-forwarding a little bit in the book, but they talk about both generative models and generative processes, and those are distinct ideas. So when we're talking about an organism as a model in its niche, it's still distinct from the generative process, right? That's something I'm not 100% sure on, and I also find the self-evidencing part a little bit circular, and I don't quite have a good understanding of that. In my mind, the generative model and the generative process are very distinct. And yes, you're an organism in your niche and you want to succeed, and by natural selection or whatever, we only really see organisms that succeed. They're going to look for things that meet their priors, such as a nice homeostasis, food, reproduction, all of that sort of stuff. That's sort of my understanding of that part, but it's still an organism using a model to navigate the world, rather than being a model itself. And then also, on the idea of minimizing free energy not being a reward: to me, it sort of is a reward.
It's just a reward with respect to a prior distribution. So yes, you're minimizing free energy and you have a prior preference for being at room temperature. That's great. But you still have an objective function, or functional, I think, as the book calls it. You're just subtracting the prior distribution of temperatures that your body likes. So I still see it as fundamentally an optimization problem. It's just that the units are a bit different, and it's all statistical distributions and the differences between them, rather than, I don't know, X minus X-bar squared, that sort of thing. So those are the areas that I'm not very clear on, and I don't know if anybody can clear those up. Thank you. Can I follow up on that, then? On enactivism, I think the problem is knowing what its specific meaning is compared with embodied and embedded, because those E's are so often used together. Where would you say enactive rather than embodied? And I think you touched on it by saying it involves having a model, whereas being embodied maybe doesn't. I would suggest that enactive calls the action into focus, whereas the embodied and the embedded refer to the materiality. But the reason they are allied as 4E, et cetera, is that they're just fingers pointing at the moon: fingers pointing at real embodied organisms and some of the complexities of modeling them, framed in contrast to disembodied approaches. So they're pointing at organisms from different theoretical vectors, and some of the synthesis amongst them has already been carried out. Yeah, the notion of enactivism was popularized by Francisco Varela, actually, and I think he was not too fond of the term. But anyway, the thing is that he wanted to stress the fact that cognition and consciousness emerge in the circularity between sensation and action.
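Coming back to Arun's point that minimizing free energy still looks like an objective with the prior preference built in, here is a toy sketch of the room-temperature example from earlier. The Gaussian preference distribution and all the numbers are assumptions for illustration only:

```python
import math

def surprise(observation, preferred_mean=21.0, preferred_sd=2.0):
    """Surprise = negative log probability of an observed temperature
    under a Gaussian prior preference centred on room temperature."""
    z = (observation - preferred_mean) / preferred_sd
    return 0.5 * z * z + math.log(preferred_sd * math.sqrt(2.0 * math.pi))

def choose_action(predicted_obs_by_action):
    """Pick whichever action leads to the least surprising predicted
    observation -- surprise minimization rather than reward maximization."""
    return min(predicted_obs_by_action,
               key=lambda a: surprise(predicted_obs_by_action[a]))

# Two candidate actions and the temperature each is predicted to yield.
actions = {"put_jacket_on": 20.0, "do_nothing": 12.0}
```

Note that defining "reward" as negative surprise would make the two framings numerically identical here, which is exactly the equivalence being debated.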
So that's "enactive": the continuous circularity of action and sensation. I also feel it's really close to the free energy approach and active inference, though some of the early champions of enactivism, like Evan Thompson and some other scholars, argue that there are actually crucial differences. Anyway, for what concerns the difference between the generative process and the generative model: the generative process is basically the actual physical process that turns causes into stimuli that impinge on our sensory surfaces. The generative model is the best guess that an organism can make about that. So they're not identical, of course. The organism tries to learn the generative process as best it can, but it's just a model. Ali? Yes, and I also wanted to mention Jakob Hohwy's well-known paper, "The Self-Evidencing Brain," which I believe is the seminal paper that introduced this concept of self-evidencing into the active inference literature. In that paper, he also contrasts this term with enactivism and the other 4E cognition frameworks and theories. Briefly, he believes that the self-evidencing brain, by positing a boundary, namely the Markov blanket, as one of its essential components in order for it to be able to infer the states of the environment, can retain its cognition within its inner statistical or generative model. That's one of the reasons the self-evidencing concept was deemed necessary to introduce in the first place. And self-evidencing is something that goes beyond the biological brain; it can refer more generally and more broadly to any agent which tries to learn about its environment, as one dynamical system which tries to perform inference about the other dynamical system, which is the outside world, but with the particular aim of minimizing the error, or performing allostasis, so to speak.
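To make the generative process versus generative model distinction above concrete, here's a few-line sketch. Everything in it, the hot-object example and all the probabilities, is invented for illustration, and the model's likelihoods are deliberately not identical to the true process:

```python
# Generative process: the actual physics of the environment. The agent
# never sees this directly -- it only receives the observations it emits.
def generative_process(hidden_cause):
    # true probabilities of each observation given the hidden cause
    if hidden_cause == "hot":
        return {"pain": 0.9, "no_pain": 0.1}
    return {"pain": 0.1, "no_pain": 0.9}

# Generative model: the agent's own best guess about that process.
# Note it is similar to, but not the same as, the true process above.
likelihood = {("hot", "pain"): 0.8, ("hot", "no_pain"): 0.2,
              ("cold", "pain"): 0.2, ("cold", "no_pain"): 0.8}
prior = {"hot": 0.5, "cold": 0.5}

def posterior(observation):
    """Invert the agent's model with Bayes' rule to infer the hidden cause."""
    unnorm = {s: prior[s] * likelihood[(s, observation)] for s in prior}
    total = sum(unnorm.values())
    return {s: v / total for s, v in unnorm.items()}
```

The inference runs entirely over the model's quantities; the process only ever supplies observations, which is the separation being discussed.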
So that's one of the main differences between enactivism and the self-evidencing concept here. It's also very illustrative to look at figure 1.2 and see where these terms are used in the context of the low road and the high road to active inference, which is one of the key concepts in this chapter and describes how the following chapters are going to be structured. We see self-evidencing on the upper branch of the subway system, coming down from the high road, and we see some of the other terms that people have been mentioning, like generative model, coming up from the low road. And then on the same page that we were just discussing, I copied some sections from the text that describe these high and low roads. They describe that the high road starts from the question of how living systems persist and act adaptively in the world. That is very resonant with self-organization, which is the first stop on that subway system from the high road, and also with Varela's work, autopoiesis, and a lot of questions about how cognizant, sentient systems persist. The high road perspective is useful to understand what living organisms must do, and why: why they persist via surprise minimization, not via reward maximization, and then what they have to do. So it's the what and the why, in response to the broader question of how complex systems persist. That's the high road, and we've had many discussions, and it's fun for every person to also think about this high road, low road distinction; the high road starts from a complex system thriving or persisting in a complex niche, we can think of it that way. And then the low road starts from Bayes' theorem and continues up through the Bayesian brain, also motivating active inference from a different direction; the low road perspective is useful to illustrate how active inference agents minimize their free energy.
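Since the low road starts from Bayes' theorem and the idea of updating priors, here is a minimal sketch of that updating loop, where yesterday's posterior becomes today's prior. The rain example and its probabilities are a stand-in of my own, not the textbook's box:

```python
def bayes_update(prior, likelihoods, observation):
    """One step of Bayes' rule: posterior is proportional to
    likelihood times prior. likelihoods[h][o] is p(o | hypothesis h)."""
    unnorm = {h: prior[h] * likelihoods[h][observation] for h in prior}
    total = sum(unnorm.values())
    return {h: v / total for h, v in unnorm.items()}

# Repeated updating: each posterior is carried forward as the new prior.
likelihoods = {"rain": {"wet": 0.9, "dry": 0.1},
               "no_rain": {"wet": 0.2, "dry": 0.8}}
belief = {"rain": 0.3, "no_rain": 0.7}
for obs in ["wet", "wet"]:
    belief = bayes_update(belief, likelihoods, obs)
```

After two "wet" observations, the belief in "rain" has moved from a 0.3 prior to well above 0.85, which is the atomic update the low road builds on.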
Well, they're not necessarily doing frequentist statistics; they're doing something that we can describe as Bayesian. This textbook is going to introduce Bayes' theorem briefly in a box, and it's a big area. Some people will be familiar with thinking about priors and updating priors and the way that's discussed in Bayesian statistics; for some it might be novel. But this is describing the atomic computational or statistical nucleus that is going to describe how different systems can fulfill that imperative we saw traced out by the high road. And Ali, I know you have a lot to add on that. I've also put a PDF of Karl Friston's essay "Beyond the Desert Landscape" on the resources page, and also on the chapter 1 questions and notes page. I believe this essay is the most comprehensively articulated, and in my opinion very beautifully, even poetically, articulated account of the high road and low road distinction and the philosophical differences between them. If you scroll down, I think you can find the paper I attached there, right at the bottom of the page. I've also appended, yeah, that's it, Karl Friston's, no, below that, yeah, "Desert Landscape", and I've also appended Andy Clark's reply to this essay, because this chapter comes from the book called "Andy Clark and His Critics", which I think can help in understanding the real differences between those two concepts and why it's important to distinguish between them when we want to understand any active inference related concept. Thank you. Awesome, thanks. I'm going to just make a note of that in here. Maybe just a comment on the "why", Daniel: talking about the high road, is it really "why"? Because my understanding is that the perspective, or the nature, of the free energy principle, as Karl says, is more tautological than teleological, and "why" seems to point to some sort of teleology. Go ahead. Yes, scientific or formal senses of teleology, or end- or goal-directedness, are very interesting, and another
framework, which is evolution, has also threaded the needle between tautology and teleology. So there's a lot to be said there about how those two categories aren't necessarily disjoint. Do you believe they have to be disjoint, or are you expecting, or preferring, that they are? If the imperative is to persist and survive, and we simply don't observe systems that don't, then that phenomenon can be both tautological, in the sense that it is self-describing, and teleological, from an observer's perspective, not speaking for the system itself, but in the sense that it's acting like it's trying. To go deeper, we don't need to entail any qualia or any sort of broader teleological reason for that to occur; we can describe something that is both teleological and tautological. Yes, I think I agree. From what Karl often says, his is a deflationary account: the free energy principle basically describes a self-organizing system just because it persists, and asks what characteristics it needs in order to persist. So in that sense it's tautological: if it persists, it must behave like that. You don't even need to put in any imperative to survive; if it survives, it must behave like that. But of course, as you said, you can frame this self-organizing drive, in a sense, as some sort of imperative to maintain the form and function of the organism. So there's a lot of interesting things to say about that. And one note: sometimes we speak in a mode of what the imperative is for the system itself, in terms of what it normatively should or imperatively must be doing, but we can also just speak for ourselves, as modelers of other things, about what our modeling imperatives are, and that is a slightly different question. That's sometimes known as instrumentalism; it can be compatible with realism of various flavors, and it's been explored broadly. However, from our perspective as observers, things that are not acting so as to minimize
their surprise relative to some area in space make us say they're not that kind of a thing: if it's not staying in the space which we're defining as that kind of thing, it's not that thing. If the tornado were not staying in some range of phase space that we call a tornado, it wouldn't be a tornado to us. So sometimes that can be a more empirically grounded and less speculative stance, to stand with two feet planted as modeling systems ourselves, which is going to be consistent with the meta-Bayesian approach that's in the book. And it sidesteps, or preempts, a lot of very thorny questions around speaking for what systems actually are or are actually doing. It's the difference between saying "the thing is a model of its niche", you know, big if true, and "we're going to model it as if it's a model of its niche", which is irrefutable; that's just a choice that we're taking. Someone could say, well, I would have taken a different choice in modeling, and you can say, great, what would your choice have been? And there's actually a starting point, instead of just endless speculation around what the system really is. Especially when we think about our Markov blankets and our inability to make direct contact with hidden states, it bolsters the justification of speaking as a modeler. Very cool. Anyone else with comments on this question, self-evidencing and enactivism, which I'm sure we could talk about all day? Sorry, if I can just finish off: I think possibly it's the language that's the problem. Self-evidencing just sounds as though it's self-confirmatory, but all it's doing is looking for support for its existence, which is not a good strategy for survival. Why, you say? Well, if I only look for evidence of my existence rather than things that are a challenge to that... Is it really related to the dark room problem? Exactly. For the sake of long-term persistence, it may be necessary to sometimes transiently incur states that are surprising. So there's no real incoherence there, I think. No, I agree.
I'm just challenging it a bit as language, a bit of a language thing. Yeah, great point, and indeed natural language often heavily shades interpretations of different terms, like surprise, belief, expectation, preference, all these everyday terms which are core in the active inference ontology. It's really important to interrogate our understanding of them and check their alignment with the formalisms. This book is going to focus, to a large extent, on the kernel active inference action-perception loop: single agent, single level, the kernel. In that way it's very analogous to a single linear regression: in practice, linear regression on large data sets sweeps across families of models, and there's all kinds of things that come into play beyond just the linear regression that's the kernel. In a nested model, one could reduce surprise and find confirmatory evidence: I'm the kind of thing that seeks out novel information sources, and therefore seeking out novel information sources is confirmatory evidence of that. So once we step into multi-layer models, many of the absolutely valid limitations of the kernel in and of itself are addressed. Those are all directions that people's questions have pointed at and written down, and they're super important; they're all direct adjacencies relevant for applications, and there are going to be things about real complex systems that the kernel does not describe, totally. So this isn't the end of active inference modeling of complex systems; it's a tutorial book, and it really is just describing the essence of the loop. I want to just give Arun the last comment, because it's the top of the hour now. So, Arun. Yeah, I just wanted to add to Neil's question again. I fully understand where you're coming from; when I first read about the self-evidencing thing, I was like, that makes zero sense, and exactly what you described: you're not going to look for predators, like, as a basic thing; that's not evidence of my existence, that's evidence of my death. But I think
the way I've sort of come around to it is that it's all about the prior preferences and what you expect to see. And yeah, this occupation of phase space, I think, is quite important: you sort of get attracted to particular states where you're comfortable, but you do need to know about the rest of the world anyway, and plan for not being in those states sometimes. So looking for evidence of your existence is all about matching those priors, and those priors can be pretty flexible. In my mind, that comes through a bit later on in the book. So I would say my advice is to stick with it, but it was very counter-intuitive to me as well when I first came across it.