Hello and welcome, everyone, to the Active Inference Livestream, take two, with an awesome and flexible set of participants. This is Active Inference Livestream 6.2 on October 20, 2020. Welcome to TeamCom, everyone. We are an experiment in online team communication, learning, and practice related to active inference. You can find us on Twitter at inferenceactive. You can email us, find us on Keybase, or at our YouTube channel. This is a recorded and archived livestream, so please provide us with feedback so that we can improve our work. All backgrounds and perspectives are welcome here. As far as video etiquette for livestreams goes, please mute if you have noise in your background, raise your hand so that we can hear from everyone, and use respectful speech.

Today, just as a quick point of news, our friend and colleague Jared Tumiel has released a very interesting blog post called Spinning Up in Active Inference and the Free Energy Principle: A Syllabus for the Curious. It is a great place for beginners as well as for those who want to go deeper into one of the various domains touched upon by active inference and the free energy principle.

So today, in Active Inference Livestream 6.2, we are going to have a warm-up. We're going to have take two on our intros as I kick out this other Alex. Then we'll come to a discussion of the paper. Last time, in 6.1, we talked a lot about the goal of the paper and went through the roadmap and the abstract in quite a lot of detail. This time, we're going to highlight some excellent follow-up comments from our participants. We're going to talk about pattern versus structure, metaphors, and more. And then we'll go through the figures and make sure to talk about some of the similarities and differences between Bayesian networks and Forney factor graphs.

Great. On to the intros and check-ins. So for this section, please just introduce yourself and your location.
Say hello, provide a short intro if you'd like, and pass it to somebody who hasn't spoken yet. So I'm Daniel. I'm in California, and I will pass it to our first-time discussant, Kate.

Sorry. I'm Kate. I'm based at the University of Edinburgh. I'm a philosopher of cognitive science. I'm interested in life-mind continuity and in trying to find a satisfactory definition of what it is to be a living system. And I will pass to Mel.

I think Mel hasn't joined yet. So let's go to Shannon. I think Mel's here. I don't know. Yeah, absolutely. Mel started her intro, actually. So Mel is absolutely here. We can hear you. I don't know if anyone can hear you. Yeah, weirdly, I don't hear you. Mel, could you just reload? Sorry. And Shannon, go ahead. Okay. No, I don't think he sees you. This is a new and fun kind of Jitsi error. Just reload the page, Mel. Just hit refresh. Yeah, exactly. Perfect. I see that. Mel, take it away. Thank you. There you go. We're good? Yes. Beautiful. All right. Cool.

I'm Mel Andrews. I'm in Ohio. I'm a doctoral student in philosophy of biology and cognitive science. I'm interested in the same question as Kate. And I will pass to Alex Kiefer.

Hi. I'm Alex Kiefer. I'm in the philosophy department at Monash University, currently in New York City. I guess I'll probably wind up defending structural representationalism about generative models today. I don't know. I'll pass it to Maxwell, I guess.

I'll probably be agreeing with you in the end. Thanks, Alex. I'm Maxwell Ramstead. I'm a postdoctoral fellow at McGill University in Montreal, and I'm speaking to you from Montreal right now. I work on multi-scale active inference, and I'll pass it to, I guess, Ale. You haven't spoken yet?

You're right. We are three Alexes. Well, hello, everybody. I'm Ale, or Alejandra. I'm a professor in the department of psychology, typically in the department of cognitive sciences.
And, well, I'm also interested in the same question that you were talking about, what exactly it is to be alive, and maybe going further with this into cognition and perception. So yeah, that's me. Oh, well, I'm in Mexico. So I'm going to pass it to Sasha.

Hi, I'm Sasha. I'm based in Davis, California, and I'm a neuroscience graduate student studying the molecular and developmental aspects of the brain. And I will pass it to Shannon.

Hi, I'm Shannon. I'm based at UC Merced in California, but I'm in South Dakota right now. I'm interested in how active inference can help us learn about the brain in music cognition, as well as how humans coordinate in music or other sorts of group interactions. And I will pass it to the other Alex.

Hello. I'm Alex Vyatkin. I'm a researcher at the System Management School in Moscow, Russia. I'm trying to find possible ways of integrating active inference with systems engineering and systems management frameworks. So, who else?

Cool. I think that is all of us. And thanks, everyone, for bearing with these funny technical glitches. I'm also recording this locally, so worst case, we'll just be able to have this uploaded. Let's get on to the warm-up questions. Anyone who wants to speak, it would be awesome to hear from you just to get started. What is something that you've been working on or learning about recently, whether it's directly linked or just something that influences your mind state coming into today's discussion? You can popcorn in, and I'll also be watching the stack if people raise their hands. I'll start while people are raising their hands. We just had Complexity Weekend this past weekend, which was a really exciting and interesting time: about 100 people from 37 countries forming cool teams on different topics. I really learned a lot. Mel?

Hello. Yep. Beautiful. Yeah.
So recently, well, I've been puzzling for a couple of years now, God, it's been a long time, over how to conceptualize the free energy principle. And that's pulled me into work in philosophy of science on models and modeling: all the different ways of conceiving of formal models and what they do. And then parallels in biology, behavioral ecology and ethology and evolutionary biology, with models that have some sort of empirical content.

Anyone else want to share something they're working on? Or I can also just raise the second warm-up question in case people want to answer that one, which is: what is something that you'd like to have resolved by the end of today's discussion, whether it's a thought about the paper specifically or something general? Sure. Fellow Jitsi-er. I'm not sure, is that Shannon? Go ahead.

Hey, I wonder whether the structural representation that we might be defending later today, whether that structure could just be a set of attractor dynamics. Like, is that enough to be a structure?

So I think Alex might be raising his hand and have something to say. Yep, Alex.

Oh, yeah. I mean, I was going to speak anyway. I would say yes to that question, probably. It's just my opinion. But I was going to answer both questions, I guess. So apart from still trying to figure out the free energy principle in lots of conversations with people, I've been working up to implementing a variational autoencoder to figure out how that actually works. So I've been doing a lot of more technical work. Something I'd like to have resolved is to figure out exactly where enactivism, or the version of enactivism that's defended in this paper, is inconsistent with representationalism, if it is. So, like, I don't need there to be residual disagreement, but I just want to figure it out.

Cool. Sounds good. Any last thoughts on these warm-ups before we jump into it?
We have a lot of very interesting slides. Kate, go for it.

Yeah, I guess the biggest thing that I want to have resolved by the end of the session is that I'm still not really clear on the status of the generative model. At some points it sounds as if it is something almost causal, that it's controlling the animal's behavior, or the organism's or the system's behavior. And at other points it sounds more like a model that we use to describe the system's dynamics, rather than something that is encoding beliefs or being causal. And that's something that I'm still not entirely clear on. So I think that's a big thing I want to have resolved.

Cool. All really interesting topics. Thanks for sharing. We're going to jump into it, because a lot of these are on the slides ahead. So the paper that we're coming back to today is A Tale of Two Densities: Active Inference Is Enactive Inference, by Ramstead, Kirchhoff, and Friston. And just to rehearse the goal of this paper: they aimed to clarify how best to interpret some of the central constructs that underwrite the free energy principle and its corollary, active inference, in theoretical neuroscience and biology, namely the role that generative models and recognition densities play in this theory, which aims to unify life and mind. Last week we talked about what these two densities are, and we talked about the tale that integrates the densities. We also walked through the roadmap, how they build from A to Z by talking about statistical models and representations, introducing active inference, and then combining that with enactive inference in this sort of enactivism 2.0 framework, before building towards multidisciplinary research heuristics for cognitive science. So that was last week. Let's get into the new and exciting parts. This was one of the feedback comments from a participant last week.
And what they wrote was: "A key learning goal for me in building my understanding of active inference is to develop intuitive examples for the critical concepts that can act as the basis for abstraction later on. It seems to me like there is probably something like a logical sequence in which this should be done, but I don't necessarily always know which concepts and ideas from active inference to prioritize. As an example, coming from a cognitive linguistics background, I tend to think of beliefs as either causal reasoning or the underlying sensorimotor simulations that connect to linguistic concepts. I've been trying to find new intuitive examples that can help me to understand beliefs as probability distributions, as clearly this is the key for making progress in understanding active inference."

So my question for all of you is: what is a system or a metaphor that helps you understand active inference, and why? And while people are raising their hands: on the bottom, we have just a few of the systems that we've been discussing over the last weeks. We have a brain, the technical and the artistic parts of the brain. We have insect colonies. We have dance and embodiment and movement. And then on the right, I guess it reflects something like the economy or changes in dynamical systems. But what is a system that helps you understand, through specifics, something about active inference? Any thoughts on this?

I hate to be that guy, but I started understanding active inference when I went into the math. It's a formal theory, and I think we can actually be misled by metaphor. Some of my reservations about the philosophical work that's been done on active inference are that it operates on the basis of informal, intuitive, conceptual understandings rather than trying to see what the math says. So, I hate to be that guy, though.

No, that's perfect. We'll go to Alex Vyatkin and then to Kate. Go ahead, Alex.

Yeah, for me, this is what most importantly changed my views.
It was taking seriously the approach from, like, processes, and understanding concepts about states and steady states of possible processes. And later on, the concept of the Markov blanket as a border, and underlying structure like generative models, in links with cybernetics. It's bringing me some kind of top-level picture, and now I'm trying to focus on different parts to have a more detailed understanding.

Cool. Kate and then Mel.

Yeah, I was also going to say cybernetics. I think I only really started to understand what was going on once I started seeing the links to Ashbyan accounts of survival as stability. That was when things started to make sense for me. And there's also a nice metaphor I quite liked in, I think, a paper by Jelle Bruineberg, I think with Kiverstein and Rietveld, with the crooked scientist metaphor, where the scientist is trying to obtain the result that it's looking for. So it has to have some sort of latch on the world in order to bring about that result, but it's still trying to obtain the result that it wants in the first place rather than passively finding out what will happen. I thought it was quite a nice metaphor, so I do think they are helpful.

Cool, Mel.

So I'm remembering my first sort of aha moment with the FEP and active inference. I was explaining this stuff to my friend, who works on similar things but in a very different framework in the MIT Media Lab, and I had, like, papers strewn everywhere and was scribbling all over them. My realization in explaining this framework to him was that there is fundamentally, at the lowest level, at the level of the simplest sort of systems or at their lowest scale, no difference between belief updating and action. So a change in the system is a change in the system, right? And that is both a belief update and an action, right?
And once you build in further levels of hierarchical organization and complexity, you get this strong differentiation between them, but at the lowest level they're the same thing.

And Kate, did you raise your hand there? And also, I'm just going to, yeah, Alex Kiefer, go ahead.

Oh yeah, I mean, I like that point, that at least at the lowest level belief updating and action are kind of the same thing. I guess that's one reason that I'm skeptical that we need to explicitly... I think implicitly more traditional Bayesian brain stories are kind of built on that claim as well. But I was going to say, what got me into active inference, also sort of a technical point, was just that you really can't minimize surprise just by adjusting... at least not... okay, the surprise depends in part on the sensory observations, right? So you can't minimize that just by changing your generative model. You have to change the observations. So that's the thing that got me into it.

Cool. And yeah, thanks, everyone, for being flexible and fun on the tech. Maxwell, I really agree that, as always with metaphors, it's a question of compression of semantics and meaning. At the end of the day we're talking about something specific, and so we can think about different systems that help us understand how active inference is deployed in the real world, but often metaphors will take us off the mark because they remove us from the real underlying details. So let's return.

And the details are really where the devil's at in this case, I think. But yep.

I agree, and hopefully at the end, with the figures, we're going to be able to look at those details really specifically. I think the way that Figures 2, 3, and 4 lead into each other helped me understand a lot about the system. So, just to take another look at these two densities that are being linked: on the left we have the Bayesian structural representationalist, and Alex, feel free to add any details here. Bayesian models move between the data, or observations, the parameters, and the hyperparameters, or higher-order parameters. Going from the data to the hyperparameters, we call it a recognition model, and going from hyperparameters to a set of data or observations, that is a generative model. The outcome is a statistical convergence of a multi-level model that represents structures of the world, for example through an expectation-maximization scheme.

We can contrast this with the enactivist school of thought, which, rather than linking data and hyperparameters, has focused on the world and agents and how they're linked through perception and action. So there are a lot more adjacencies to areas like niche construction and ecological psychology. And we remember hearing from Maxwell last week about how it was the desire to make some of enactivism mathematically rigorous that led him to the Bayesian approach.

Let's now think about how these two densities are linked, with a little more feedback from one of our participants. They wrote: "The domino metaphor finally makes sense. The physical dominoes are the physical state of the system, corresponding to the recognition model, and the falling of the dominoes is like the process or the dynamics of the system, reflecting the generative model." So here we've kind of combined these two different ways of thinking. We're looking at the relationship of the world and the agent, or the system and the system's surroundings, and we directly went ahead and combined perception with the recognition model and action with the generative model.

And then, just to combine that with what Kate was talking about with cybernetics, this is a quote from the paper that says: on this view, active inference can be read as a new take on the good regulator theorem proposed by Conant and Ross Ashby in 1970. Active inference tells us about the relation between a control system, the generative model with priors over action policies,
and a system being controlled, the organism and its adaptive behavior, the actual actions undertaken in part of the world. So what exactly does this clarify about active inference? What do people think about this metaphor, or, if it's not necessarily a metaphor, how would we apply active inference to this system? What are we getting at here with these similarities and differences between the two models? Sure, Mel.

Well, so the good regulator theorem: the idea is that any good control structure for some larger system has to be a model of that system. It has to contain in it all of the key variables for that system. And basically what this says to me about active inference is that in order for a system in an environment to be able to adequately react to whatever kinds of perturbations are happening in the environment, it has to contain all of the key, I call them existential variables: the variables in the environment that would lead to the system continuing to exist or not continuing to exist.

Cool. And so how exactly, I'm just curious, does the system come to embody all the key variables of the outside world? Don't the models of the organism reflect a simplified version of the outside world? Otherwise there kind of is a story about the map that represents the territory exactly. So how does a good regulator arise in the context of an environment like that? Shannon and then Maxwell.

Well, I don't think the key is to be an exact, like, imitative model of the world, or even an exact map of the world, but just a model that is good enough to enable you to act and adjust in the world.

Cool. Maxwell and then Mel.

Yeah, I mean, so formally speaking there's a difference between external states and hidden states. When everything is going well, the external states that are modeled by the system coincide with the hidden states that are actually out there. And this is, I mean, I'm not sure if it's guaranteed, but it is strongly implied by the fact that
we're minimizing variational free energy, that, you know, if we're not generating a lot of free energy, then the external beliefs that we have about the environment tend to reflect the causal structure of that environment. But Manuel and Chris Buckley have done some interesting work showing that actually there isn't a necessary kind of entailment relation. You can have the system acting on things that don't actually exist. And this makes sense also just evolutionarily: perceptual systems don't track truth, they enable adaptive behavioral loops. So if you're a prey item, like a bunny, it's adaptive to generate a lot of false positives compared to allowing for false negatives in terms of predator detection. It's more advantageous to make a few less costly mistakes than to make a big one. Jelle Bruineberg and Erik Rietveld make that point in the paper that Kate was mentioning just earlier. So it doesn't actually have to hook up to anything in the real world, though it usually will end up doing so.

Cool. Mel and Sasha.

Yeah, so the key term in what I said is "key", actually. So the key variables, right, that's just what keeps the system alive. We can imagine that we first get a system on the scene that needs just basically one parameter in its environment to be correct in order to continue to exist as a system, and then we build up from there to systems that are progressively more and more complex and need progressively more complex environmental scenarios to survive. And then they themselves, in order to survive in those more complex regimes, need to build in complexity, right, in terms of the models that they enact. And I think that says something interesting about how we conceptualize cognition and the onset of cognitive complexity, and this is Peter Godfrey-Smith's line on the subject, as a response to a complexifying environmental context, right?

Cool. Sasha, then Alex Kiefer.

Yeah, one key phrase
that really helped me better understand active inference is "good enough", and that's what really put it all together: while we're going about reducing uncertainty about our system, it just has to be good enough to make the next action or, in the evolutionary sense, to survive. That metaphor has really helped me go through all these high-level concepts. So I think you were mentioning that.

Alex Kiefer, then Alex.

Thanks. Yeah, I just wanted to note that you don't need to contrast accuracy, or, you know, answerability to the truth, with enabling active behavior, as a dichotomy. I'm not sure if that's what was intended or not, but one reason I've had trouble understanding the anti-representationalist viewpoint is that structural similarity is a matter of degree. Accuracy is a matter of degree in this sense. So, I mean, yes, your hidden state representation doesn't have to map completely accurately onto reality, doesn't have to capture every detail, but if it doesn't do that to any degree, you're screwed. So maybe we need systematically simpler or biased representations in order to do the job, but I think there's still an implied relationship to the truth there.

Cool. Alex.

Yeah, thanks. I want to change the system level under consideration for a possible application of active inference. For me it's interesting on a personal level, in terms of day-to-day phenomenology, and especially professional or working-day phenomenology: how it works, what a generative model is, like an example. And what I want to discuss is whether I get it correctly, or how it could possibly be developed. For example, some doctor knows different disciplines and their different ontologies, and when he meets a patient, he starts his action from his generative model, which activates recognition models to serve as a doctor, depending on the exact situation and the exact case. And if it's so,
then, for example, from another level, the team level, if a person behaves professionally in some discipline, it becomes like a perception model for the generative model of the team. And what does that mean in terms of learning and education of team members? Because if we can in some way link recognition models with ontologies and start to work with that more, again, cybernetically, it could possibly be very interesting, at least for me.

Cool, agreed. The ontology, how we think about the world, definitely influences our perception. And Sasha, do you have something else? Otherwise, okay, Shannon, go ahead.

I was just going back to Alex Kiefer's point that, like, there has to be some truth of the world in your model. If we're going with the dominoes, and the process needs to be that the dominoes fall to make some cool pattern, then if there's an interruption, so now this domino can't reach the next domino, that's akin to a disruption in how truthy the model is of the world. And it disrupts the action that you're able to take, or you die instead of being able to survive, because your model isn't accurate at that stage for the processes to continue happening, to do the next action or to survive.

Cool. Cool discussion on this domino system. Let's talk about counterfactuals and also return to some of the great questions Alejandra raised in previous weeks. This is from another participant's feedback. They wrote: what I'd like to clarify, if possible, is the following. Am I right in understanding the generative model as a control system that is dynamically instantiated between an organism and an affordance in the context, and the recognition density as a relation between the sensory inputs and the possibilities for action policies encoded in the organism? If so, how does this work when action is instantiated by counterfactual forms of cognition or simulation, i.e.
those that do not rely on affordances in the current external context? And so on the bottom right there's a younger person looking into the mirror, or maybe vice versa, and seeing an older person. This reflects how not only is there a deep temporal aspect to our action in the current moment, meaning that there are often affordances that aren't available to us at the second that we're actually making the decision, but also that we think about future possible affordances that don't exist based upon the sensory data that we're getting. Or do they? And then on the right side I have a few bacteria, because this is an example of a system that we might not think of as having a simulation or counterfactual-based mechanism of action. Often bacterial behavior can be summarized quite concisely by just the short-term gradients that it's ascending or descending. So, for anyone to raise their hand and chime in: how do counterfactuals play into this, and how do we make sense of affordances that aren't in the current ecosystem but are still important for behavior? Maxwell?
I just want to comment on the first part of that question. I think it's a bit simpler than that in terms of the relationship between the recognition and the generative models, or densities. So your recognition model is just your posterior, all of your posteriors basically, all of your posterior estimates over states and precisions. It's like my best guess as to what the values of all these states are. And you can think of that as the system's current physical state. So under the active inference formalism, the current physical state of the organism encodes the specific parameters of these beliefs over external states. The actual physical states of the system encode these parameters, so you embody a best guess. The generative model is the point of reference for the free energy gradients. So you can think of one as like the inference model and the other as the control model, because the inference model, or the recognition density, is sort of my best guess right now, and the generative model is essentially a model of what my phenotypic preferences are. And it's the tandem between the two that gets you behavior, as in the generative model provides the point of reference for the definition of the variational free energy gradients, i.e.
how close am I to my desired sensory distribution, and the recognition model tells you where you are, effectively. So it's like where I am and where I want to be, and the free energy is sort of the difference between the two. So, for the first part of that question: yes, you're right in understanding the generative model as a control system. The recognition density is just the best guess; it's the posterior. So yes, it is harnessing these relations that are being described here, but that's precisely because it's a posterior estimate of states and precisions and all that.

And one more thought on that: an agent who knows, for example, how hammers and nails are related through their internal models might look at hammers and nails and pieces of wood and think, later on, let's build a house with this. So it's actually a future affordance, but it's based upon sensory input at the current moment. Cool. I'm going to just move on to the next slide, and just a short little quote by Buckminster Fuller, because sometimes I just have to. He wrote: rope may not be much like water, but knot is like the wave. This is from the synergetics framework, and it's getting at this notion of structural integrity, which is like the rope and the water, which are quite different but are both static and stable, as opposed to pattern integrity: knots and waves can travel through a medium. And so by virtue of traveling through a medium, kind of like that domino wave, there's something similar about these traveling waves that goes beyond the mere structural integrity of their media. So just a thought question for later: is active inference a structural integrity or a pattern integrity? I would say that it's a pattern integrity. And I'm just bringing up this notion that for many far-from-equilibrium systems, though their mechanisms are extremely different, for example the bacteria and the person, there are going to be similarities because of the way that they need to persist, from a cybernetic
perspective. Enough there. Let's get to that question about structural aspects of attractors. So here a participant provided feedback. They wrote: "[...] are sensitive to initial conditions, so any two initial points will lead to different precise physical states and dynamics, but will still travel through a very similar trajectory, which hovers around these attractor basins. The states and the dynamics of a set of points over time can be taken as a whole and be identified as embodying structural representations of the Lorenz system." So this is something from the realm of chaos and complexity theory. The Lorenz attractor is an attractor of a simplified system of equations describing the flow of fluid, a fluid of uniform depth, and there are also parameters for the imposed temperature difference, gravity, the buoyancy, the thermal diffusivity, and the kinematic viscosity. The simple physical analogy here is like heating up a pot of water: there's heat coming from the bottom and cooling coming from the top, and there's this churning happening, and it goes around these two different attractor states. So in this case, maybe Alex or anyone else who raises their hand: what is structure and what is representation? And if we agree on what this system is, then what is this representation debate that's happening? Mel? Mel? Unmuted, Mel?
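[Editorial illustration, not part of the spoken discussion.] The participant's point about sensitivity to initial conditions can be sketched numerically. The following is a minimal Python sketch, assuming the standard three-variable Lorenz equations with the commonly used textbook parameter values (sigma = 10, rho = 28, beta = 8/3) and a simple fixed-step Runge-Kutta integrator. The parameter values, step size, starting points, and function names are all illustrative assumptions and were not given in the livestream. It follows two trajectories that start almost at the same point, and reports how far apart they drift while both stay within the same bounded region.

```python
# Illustrative sketch only. The parameter values below are the usual textbook
# choices for the Lorenz system, assumed here for demonstration purposes.

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Time derivative of the three-variable Lorenz system."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one fixed step of the classical Runge-Kutta scheme."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = lorenz(state)
    k2 = lorenz(shifted(k1, dt / 2))
    k3 = lorenz(shifted(k2, dt / 2))
    k4 = lorenz(shifted(k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(initial, steps=5000, dt=0.01):
    """Return the list of visited states, including the initial one."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], dt))
    return trajectory


def distance(p, q):
    """Euclidean distance between two points in state space."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


# Two starting points that differ by one part in a million along x.
traj_a = simulate((1.0, 1.0, 1.0))
traj_b = simulate((1.0 + 1e-6, 1.0, 1.0))

separations = [distance(p, q) for p, q in zip(traj_a, traj_b)]
origin = (0.0, 0.0, 0.0)
largest_excursion = max(distance(p, origin) for p in traj_a + traj_b)

print("initial separation:", separations[0])
print("largest separation reached:", max(separations))
print("largest distance from the origin:", largest_excursion)
```

The precise states of the two runs end up far apart, yet both remain confined to the same bounded region of state space, which is the sense in which the feedback speaks of different precise states but very similar overall trajectories.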
Or Maxwell, go ahead. Okay, let's go to Alex. Yeah, Mel, if you want to speak, your hand is raised; then we'll go to Alex Kiefer first.

Yeah, I was going to say... Alex? Okay. I was going to say, Mel, talk. I don't have much to say about the Lorenz attractor. It looks like a... I'm not sure what's supposed to be represented in this case. I have an intuition in the case of a biological system, because we're intuitively representing some kind of environment. I'm just not sure how to begin with this example, right? We've got the state space and this nice graphical representation of the state space, but unless we interpret this entire system as representing something, I don't know what to say about its sort of content.

Okay, cool. Mel?

I don't know what a representation is. And so, for that reason, I guess I think that the question of what a representation is, and what in the world is a representation, is maybe too ambitious, only because I haven't solved that. But there's a related question that's less ambitious, which is: what is the difference between a biological function and a representation? And that, I think, is more manageable in some respects. In phil bio, phil mind, and cog sci, we speak of the disjunction problem. And I know Maxwell, like, hates this framing and thinks it's outdated and we don't need it anymore, but the idea is that in order for a representation to be a representation, it needs to have the capacity to misrepresent. And we have a similar idea for biological function: in order for a function to be a function, it needs to have the capacity to malfunction, right? So you need malfunction to have function, and you need misrepresentation to have representation. But how do we differentiate a function from a representation?
Good question. Maxwell, then Alex. Cool. Well, regarding the question of what a representation is, we targeted a very specific account of representation. In the philosophy of cognitive science, a representation is basically an internal thing within an organism that stands for some feature of the environment, or of the organism's own body, in a way that the organism can leverage to do something interesting. We focused more specifically on the claim that generative models are structural representations. This is an account that O'Brien and Opie first worked on in the mid-2000s, and then Gładziejewski and Miłkowski in the context of predictive coding, and then Alex Kiefer and Jakob Hohwy built on that, I think in a very interesting way. From that point of view, a structural representation is a structure internal to an organism that gets its representational content because it stands in a relation of structural similarity to some target domain, in the sense that second-order structural features of that target domain, like the statistical properties of the domain, are recapitulated or mirrored in the actual properties of the representations themselves. So structural similarity is the first property. Second property: it's not just that these structures are in the organism; they have to be exploitable in some sense, so the organism needs to be able to use the content encoded in the representation to guide intelligent, adaptive behavior. And then there are two other points that I think are pretty minor: it has to be detachable, meaning that the system can use it offline, as in not actually engaging with the environment, and it has to be able to afford representational error detection, in a manner similar to maps. So those are the main features of structural representations according to the philosophical account on offer, and in this paper we propose that generative models are not structural
representations, precisely because these properties don't seem to be true of the generative models. I think Alex and I are going to still disagree about this, even after talking about it for dozens of hours at this point. But the main reason why, in this paper, we argue that they're not is that the generative model isn't encoded in anything, and this is one of the main differences between more traditional Bayesian brain architectures and active inference. As Dan was pointing out really early on, in traditional architectures the recognition and generative models are just inverses of one another: the recognition model is a mapping from the data to states, and the generative model is a mapping from states to data. So it's really just the same set of connections, and the question is which direction you're considering, the top-down flow or the bottom-up flow of information. In active inference it doesn't work like that. The generative model is nowhere in the physical organism; it's only in the dynamics. So it's very much like dominoes falling over: the wave of falling dominoes isn't present in any of the single dominoes falling over; the wave is part of the dynamic phenomenon. And it's the same with the variational free energy and the generative model. The generative model just is not there at any time slice; it only exists insofar as we consider a lapse of time, as the point of reference for the free energy gradients. So from that point of view, it can't possibly be a structural representation, because it's not encoded by anything. Cool, there's a lot there. Let's go to Alex, then we'll go to Alejandra. Alex, I figured you'd have something there. Yeah, well, that was a really nice explanation of what structural representationalism is, so I don't have much to argue about there, and in general I don't want to argue
more than is necessary, but I think the reason I want to insist on some of this stuff is just for my own basic sanity, to feel like I have an understanding of what's going on with this stuff, right? I think I understand exactly why, from an active inference perspective, you wouldn't want to say that the generative model is encoded. Let me just say first: I think the question of whether or how it's encoded is maybe a slightly distinct question from whether it's a representation. And the reason I think that it has to be a representation, well, there are many reasons. One argument: the recognition model is an approximation to the generative model, right? So that's one example of a rational or probabilistic relationship between these models, and I don't think that makes any sense if the generative model is not a representation. Also, I'm not sure that's how that works under active inference. I agree that in a more traditional Bayesian brain sense the recognition model is trying to approximate the generative model. I mean, that's essentially how it's working; it's an approximate posterior. Oh yeah, it's an approximate posterior, but it isn't approximating the generative model. Well, it's approximating the posterior under the generative model, right? So that means both the generative and the recognition models are distributions, in part, over, and I have to be careful here, not the actual external states necessarily, but hidden states. I don't know how on earth you get those into the picture if it's not a representation that you're talking about, because the hidden states are environmental states, they're not... right. As you point out in the paper, we have to be careful to distinguish the generative model from the generative process; we can't just identify what actually happens with the generative model. This is one of the fundamental questions that I think enactivism often is
faced with: how do we deal with, yeah, I mean, Mel brought up misrepresentation. The point is that the way the organism or the creature or system sees the world is not necessarily the way it is, but we still have that distribution over external states as essentially part of the generative model. Let me just say one more thing here about encoding. The generative model under active inference, as I understand it, can roughly be identified with the non-equilibrium steady state density, right? Something like that. So I would just say the dynamics sort of instantiate the generative model. I think that's a really cool point, and I actually think that structural representationalism didn't do quite enough to emphasize the fact that the generative model is a control system. But I think if you look under the hood, there will be some features of the system, stable structural features, in virtue of which it has that NESS density. So in fact you will find some stable thing that you could think of as encoding the generative model, maybe. At least that's why I'm still attached to my views on this. Cool, let's do Alejandra, then Mel. Yeah, I think I agree with Alex. I'm kind of confused about how the generative model can be conceived just as this enactive process. I was reading the paper all over again and I actually don't get it. For me, the recognition density is the inversion of the generative model, so if it is not encoded anywhere, I feel kind of lost there. For me, talking about the brain specifically, the top-down connections are the generative model, and the recognition density is related to the bottom-up connections, so you can recognize your best guess, right? So if it is not encoded, what can be said about these top-down connections? I don't know. Maybe I can jump in here and just provide some points of clarification about the generative model, if
that's okay? Yep. Okay, is screen sharing activated? You could share your screen, yeah, you probably should be able to. Okay, let me try this. Okay, so this is from a paper called The Graphical Brain, and this is what the generative models look like. Can everyone see my screen? Also, we're going to be returning to this. So the generative model is defined as a joint probability density over all of the variables of interest in the system. In this case, the variables of interest are the states that we're trying to infer, which are here s; the policies that I'm trying to pursue, which is here pi; and the data that I'm observing at each time step, which is here denoted o. Just so everyone is clear, time flows from left to right, so this is the first state, this is the second state, this is the third state. Essentially, you only have access to this data point; these are counterfactual data points, and the whole thing is updated basically at every time step. So look, the generative model itself, like I said, is a joint probability density: it's the probability of all these variables taken together, effectively. And when we say that the generative model is not present in the system, I mean that this joint probability density is never encoded anywhere in the system, because the generative model itself is factorized. Basically, you take this joint probability density, and then with Bayes' rule, the chain rule, and other manipulations, you can write this joint density as the product of a bunch of likelihoods and priors. And it's these likelihoods and priors that are updated constantly as part of the recognition density. So this likelihood of your data given your states, and this prior about your states given the previous state and the policy, and all this, these are updated dynamically, and their current values, all of them together, collectively comprise the generative model. Yeah, that's right. Sorry, the recognition model. The generative model is how these
quantities are all connected, the ones to the others. If you want to think about it heuristically, all of these specific parameters, the s, the pi, the o, all of those values together, as they're being updated, comprise the recognition density, and the generative density, or the generative model, is really just these relations between the different quantities as they change. It's literally that the inference process itself is what holds all these quantities together, and the generative model is just a description of how those quantities flow together. So again, it's like the dominoes falling over: the wave of dominoes only exists in that motion, and similarly the generative model only exists in that kind of coordinated inference, or update dynamics, of the quantities that are part of the recognition density. And the reason it's called the generative model at all is just because it's terminology borrowed from machine learning. In machine learning, the joint probability density over all of your variables is known as a generative model, so it's called that because that's the technical term. It's also called that because if you're starting from a joint density over all of your variables, you can actually generate fictive data that you would expect under this configuration of different parameters. Thanks for that clarification, Maxwell. We'll have Mel and then we'll carry on. Well, that all sounds pretty good to me. Go for it, Mel. Okay, Kate and then Mel. I just wanted to follow up on what Maxwell was saying about structural representation. Can you all hear me? Can people hear me?
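A minimal sketch of the factorization Maxwell describes, assuming a toy discrete model with two hidden states, two outcomes, and a fixed policy of two actions. The names `A` (likelihood), `B` (transitions), and `D` (initial-state prior) follow common active inference notation; the numerical values, the action labels, and the omission of a prior over policies are illustrative assumptions, not taken from The Graphical Brain. The joint density is never stored as a table here; it is only ever computed as a product of likelihoods and priors, and the same factors can be sampled to generate fictive data.

```python
import itertools
import random

# Hypothetical toy parameters (assumptions for illustration only).
D = [0.5, 0.5]                      # D[s] = P(s_1 = s), prior over the initial state
A = [[0.9, 0.1],                    # A[s][o] = P(o_t = o | s_t = s), likelihood
     [0.2, 0.8]]
B = {                               # B[u][s_prev][s_next] = P(s_t | s_{t-1}, action u)
    "stay":   [[0.9, 0.1], [0.1, 0.9]],
    "switch": [[0.1, 0.9], [0.9, 0.1]],
}


def joint(observations, states, policy):
    """P(o_1..o_T, s_1..s_T | policy) written as a product of priors and likelihoods."""
    p = D[states[0]] * A[states[0]][observations[0]]
    for t in range(1, len(states)):
        action = policy[t - 1]
        p *= B[action][states[t - 1]][states[t]] * A[states[t]][observations[t]]
    return p


def sample(policy, rng):
    """Generate fictive states and observations by sampling each factor in turn."""
    states = [rng.choices([0, 1], weights=D)[0]]
    for action in policy:
        states.append(rng.choices([0, 1], weights=B[action][states[-1]])[0])
    observations = [rng.choices([0, 1], weights=A[s])[0] for s in states]
    return observations, states


policy = ("stay", "switch")          # two actions, hence three time steps
rng = random.Random(0)
observations, states = sample(policy, rng)
print("fictive states:      ", states)
print("fictive observations:", observations)
print("joint probability:   ", joint(observations, states, policy))

# Sanity check: the factorized joint sums to one over all state/outcome sequences.
sequences = list(itertools.product([0, 1], repeat=len(policy) + 1))
total = sum(joint(o, s, policy) for o in sequences for s in sequences)
print("sum over all sequences:", total)
```

Conditioning on a single fixed policy keeps the sketch short; a fuller model would also carry a prior over policies, as in the joint density over s, pi, and o described above.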
Yes, yes, we can hear you. Awesome. Yeah, just to follow up on what Maxwell was saying about structural representation: the detachability of the representation, that's what should get us the distinction between a biological function and at least a structural representation, right? Because my feet and legs and knees are a representation of ground in some very minimal sense, in the same way that fish fins and bird wings are representations of fluid dynamics, right? But this isn't detachable from immediate environmental circumstance for offline use. So that's what should get us the difference between a function and a representation, but I'm curious to see what role that plays in the FEP and active inference, if any. Is there merit in retaining a distinction between a function and a representation under active inference and the FEP, or are they just continuous? Nice, very good question. Kate, did you have something there? Okay. Okay, I'm going to just continue with the slides, because I'm not sure... I think Kate's up next. I was just going to reply. I think the paper was really useful for me in helping me understand what the generative model is in the FEP, almost all the way. When it's described as a statistical description of the generative process, all of that makes sense to me. But then there are also parts in the paper where the generative model is what the organism expects, and it guides what the organism does, which makes it sound like the generative model itself is playing this kind of causal role, and I find that kind of confusing, given the generative model isn't something encoded by the organism or by the system. I get the idea of it as a statistical description of the dynamics; that sounds right. I just don't understand how, if that's all it is, it's something the organism uses. It seems like it's something we use to describe the organism, rather than something the organism itself is in any way using. That's
a good point. I've proposed, in a recent paper that came out in Entropy not too long ago, that both are correct. Basically, the correct way to interpret the FEP is as instrumentalism nested within instrumentalism, meaning that we can have a model-theoretic, philosophy-of-science reading of the FEP, and for that matter of any other theory of cognition, as a model that we as scientists are using to explain the behavior of organisms. What I've been trying to argue is that the FEP itself, at the theory level, is also saying that the organism is exploiting the statistical structure of its body and movement, a.k.a. the generative model, to guide adaptive behavior. So the model really is these harnessed relations between the variables that have to obtain for survival to persist. It's this idea that once the system gets moving, there's an inertial, dynamic thing keeping it moving. So my core body temperature is 36.5 degrees, and the fact that it is keeps a bunch of other processes in play that in turn keep me preserving my temperature by initiating adaptive behavior, like putting on a parka when it gets cold, and so on. So both kinds of generative models are used, separately. In dynamic causal modeling of the kind that the Friston group used to model the spread of COVID, we're not assuming that the process that we're modeling is itself an active inference agent, and you can do that. But you can also make the additional assumption that it is, and then you're in the game of active inference proper. Cool, thanks, Max. Go ahead, Kate. Can I come back to that? Yes, absolutely. I think where I get to, then, is that it seems like you're losing the distinction between the vehicle of the model and the target of the model, such that I worry it sounds a bit like just saying everything is a model of itself, essentially. I mean, effectively that is what we're saying, and that's where it gets a little hazy for me. You know, that's probably
the point of disagreement between Alex and me. I want to say the generative model just is the organism, in the sense that this specific statistical structure that harnesses all of these different existential variables, as Mel was saying, really just is the organism. This statistical structure that keeps reiterating itself and coming into existence, that just is the organism, and that's one of the reasons why I don't want to say that this model is a representation of anything. You could just call it the phenotype. We call it a generative model because, formally speaking, the same construct is borrowed from a field in which it's called a generative model, but you can think of it as the organism's phenotype. And precisely because it's a bit weird to think of the model as modeling itself, I resist that interpretation and say, well, if anything is a model in the more traditional sense, it's this recognition density that's performing posterior state inference. And I'm sure that Alex will tell me why I'm wrong. It's a great response. Let's just move on through a few more of the slides, because there's a lot in the figures that I think gets at this. Mel has her hand up; go ahead, Mel. I just wanted to interject: I think Maxwell is possibly mismodeling himself. I think he's actually a tad bit more realist than he's letting on. Oh, Maxwell, a realist at heart. Exactly: if you can't take care of me at my most real... So in this next slide, there's feedback from a participant, just to give an example, and we won't spend too much time here. I like this idea that the beliefs don't have to be verbal; the physical therapy context shows us how the range of movements that the body does not want to go to, for example due to a traumatic experience or injury, embodies a belief that doesn't involve language. And this was a paper by Limanowski and Friston, 2020, where a person's hand
was opening and closing, and then they were getting a VR visual that was delayed or inaccurate. There's so much here about how our sensory feedback, proprioceptive as well as visual, these multimodal systems, are being used, and it would take us many hours and more to clarify. This is what the representation questions are about. What is the representation of the current state of the person's hand being closed? At that pure phenotype level, maybe somebody with a ruler says, well, it's whether the hand is closed or not. But it's not just that, because if you're getting visual input that's messing with you, it changes your behavior, so there's something else happening, and that's really a lot of what we're talking about. I think going through Figures 2, 3, and 4 is going to be really helpful before we have closing thoughts, and thanks, everyone, for bearing with us; this was sort of a weirdly technically altered stream. Briefly, the Bayesian networks. Bayesian networks are from the Bayesian statistical sciences, and in these diagrams the nodes are variables: random variables, stochastic variables, or parameters. The edges are statistical relationships, probabilistic relationships; in a Bayesian network proper they are directed, whereas undirected edges give a Markov random field. These models are quite common in statistics, and there are some really nice aspects to them, in that you can bring prior knowledge to the table with your priors, and you can also estimate the priors directly from the data with parametric empirical Bayes. To give an example of what this looks like, here on the top right we can see how an accident increases the likelihood that you'll hear sirens; an accident also increases the likelihood of having a traffic jam; bad weather can increase the likelihood of accidents or traffic jams. So these Bayesian networks are pretty commonly used in a lot of statistical areas, and we can contrast that with what are called Forney factor graphs. Working through this de Vries
and Friston paper was definitely quite an experience. This is a complex paper and there's a lot in it. But a Forney factor graph, as they write in the paper, is a type of graphical model that shares qualities with Bayesian networks and Markov random fields. Forney factor graphs afford a visually insightful representation of generative models, especially beneficial for complex models underlying hierarchical active inference. In a Forney factor graph, each factor is represented by a node and each variable by an edge, so it's like a flip of the Bayesian network. An edge attaches to a node if the edge variable is an argument of the node's factor. Here on the right is a table from the paper by de Vries and Friston, and it shows how we can think about functions like addition, subtraction, and multiplication as having variables coming in and out. Just to really quickly summarize the similarities and differences: in Bayesian networks the nodes are variables, whereas in the Forney factor graph the nodes are the factors, or the functions. In Bayesian networks the edges are the probabilistic relationships amongst variables, whereas in Forney factor graphs the edges are the variables themselves. A cool trick with Bayesian networks is model inversion, which is that the model can be gleaned from the data and also the data can be generated from a model. A cool trick of Forney factor graphs is that they specify an order for factorizable computation, making extremely massive models tractable. And then a last cool fact about Bayesian networks is that that's where the Markov blanket formalism comes from: it's the set of nodes that insulates one part of the network from the rest, from a probabilistic perspective, though there's a lot of nuance in how it's used in the free energy principle framework. And the cool fact about Forney graphs is that they're linked to information theory, message passing computation, and other math. So in Figure 2 we see this generative model in
active inference, and this is actually the model that Maxwell had on his screen earlier, and I annotated all the pieces. Just as Maxwell was saying, time flows from left to right, and this is, again, a Bayesian network representation. The boxes and the circles, and there are different types of them, which are described in the caption, are like variables. So the initial state is a vector; the hidden states are variables; the state transition matrix, which is how the hidden states map to each other, is B; A is the map between the states and the observations. And then Figure 3 is the same thing as a Forney factor graph, so this is showing that there's an equivalence between representing something as a Bayesian network and as a Forney factor graph, and the Forney factor graph allows for these extremely large-scale hierarchical models to get factorized and then computed in a tractable way. I should say the advantage of using a Forney-style factor graph is that you can just read the message passing directly off the Forney-style factor graph. There's an algorithmic way of moving from a Bayes net to a Forney-style factor graph. Typically, intuitively, the Bayes net is easier to write, because you can just literally write down the dependencies between all of your factors: I know this state depends on this other state, there are these transitions, et cetera. And from this you can move to the Forney-style factor graph, and that tells you directly how that will be implemented in active inference. I should say that the reason why we discuss these at all is really just to show, in the next figure, Figure 4, how the generative model and the generative process link up. To me, this is where the interesting stuff happens, where literally the generative process, and this is a model of the generative process, right, how we think it's structured, is in this little box here, and
then the Forney-style factor graph represents the message passing that is ongoing outside the box to keep all the quantities within phenotypic bounds. Thanks, Maxwell. And yeah, I know that was a bit of a rapid sprint through these figures, but it's so cool that there's a formal way to relate the Bayesian network and the factor graph models. Just as you said, it's sometimes easy to specify the Bayesian network, like, oh, the traffic jam depends on the weather, which depends on this other feature, and then the factor graph allows us to have an order of computation and a really implementable way to compute it. It also has some similarities with message passing in, for example, neural network systems. So an interesting question that I've been having is: okay, we talk about this as a Bayesian theory, but if it can also be represented in another way, is it really a uniquely Bayesian theory? What is gained by saying that it's a Bayesian theory? It's within the domain of Bayesian inter-transformability or interoperability, but that doesn't uniquely disambiguate it. And that's why I think some of the philosophical questions that we're talking about here, about representation and about cause, are so open: because how exactly we should answer these philosophical questions is not uniquely specified by the free energy principle or by active inference. In the end, as Maxwell said, it's the technical details; this is what we're talking about when we're talking about free energy minimization and how that bears upon policy selection. This is what's happening. And one more point, before anyone raises their hand to provide a comment: notice that the free energy is linking to the policy selection, pi, P for policy. That means that the organism isn't just minimizing the free energy of the observations given a sort of neutral representation. Actually, the policy representation, specifically how it bears upon how states transition into each other,
that's what the actual computation is being done upon. So the observations are being passed upstream, through the hidden states and their transitions, into the policy that ultimately dictates what's going to be done, by being selected by the organism or the system as the most likely. This is really how deeply action is embedded in these models. It's not that there's a latent representation of the external states and then you're doing a best guess on what you should do; it's actually that action is at the top here, the policy is at the top. I think that's a really nice feature of these models, and it becomes graphically clear. And then if somebody wants to have philosophical framework A or philosophical framework B on top of this structure, or if they want to propose a different structure for how observations and policies are linked, that's great; it's something tractable that we can talk about. But this also returns to the metaphor discussion: this is the real thing as it is. Cool. I know we had a sprint through there, but we're coming near the end of our time, so any thoughts on these images, or are people ready to go to our closing thoughts? I was just saying the equations are more interesting. Yep, the equations are interesting. And with the 2017 de Vries and Friston paper and this model, it just reminds me that if we can write down that Bayesian relationship, the probabilistic relationship between variables, then there's going to be a process model that is how we can link it. These captions are very informative, and there's really a lot to say here, but enough for today. Let's close it out. We can hear any last thoughts, but while people are preparing their last thoughts and raising their hands if they like, I would just like to thank everyone for participating. We're learning and developing each time, figuring out the tech each time. We provide follow-up forms to the live participants, so just check the calendar invitation, and it would be really
helpful to have feedback. We also request feedback, suggestions, and questions from the participants, live in the comments or otherwise, so just stay in communication. We really appreciate everyone's engagement here, and it's been such a great learning experience. So if anyone wants to raise their hand, Alejandra first, followed by anyone else who wants to, let's just have some closing thoughts. Well, I was wondering if it's possible to have model updating when you are doing offline cognition, without an explicit policy. You are not doing something online, with this bodily interaction; you are thinking about something, and then you have, like Mel said, this eureka feeling. So this is a kind of model updating without action? I don't know. The short answer is yes. I just linked the paper. We've created a ruminating agent. Basically, this agent combines the affective inference scheme that we have developed over the last two years, where you're effectively inducing a new hierarchical layer of state inference based on state and precision estimates at the lower level, and these are affective states that then guide lower-level precision estimation. We combine that with what's called sophisticated inference, which is the new generation of active inference agents. In the current vanilla formulation, which you could see in the generative models that we just showed, the models include beliefs about counterfactual observations that we would make as we propagate states into the future. Sophisticated inference is about propagating your beliefs about the observations that you would make in counterfactual futures. So by combining these two things, we have an agent that isn't acting but that ends up ruminating and catastrophizing about its own cognition. I just linked the paper in the chat. We submitted it to the International Workshop on Active Inference, so it's a little poster now, but we're slowly expanding that into a full manuscript. But the kind of
summary six-pager that does just what you were suggesting, this is it. Thank you. But this is kind of confusing with the enactive view of cognition. This is why we're not classical enactivists: we reject the equal partners principle. It's just not the case that the body, the brain, and the cultural milieu are going to be involved in the same way for every task. So for rumination you could, I guess, appeal to a history of interaction, which is the reason why agents are able to entertain beliefs about other agents at all, and so on and so forth. But you're right, there isn't much enaction here. So that's fine, I think. Alex Kiefer, and then anyone else who wants to raise their hand. Yeah, so I guess in lieu of a proper closing statement, I just wanted to point out that Maxwell has some really cool new work on how neuronal populations can be seen as enacting inference on their own level, and that this is one way you can see structural representations being sort of realized. That's one thing we didn't have time to get into. But I just wanted to say in general that, regardless of the precise philosophical take that you have on all this stuff, and it's cool to disagree on that level, this literature has produced some really amazing work, and I think it's pushing forward in a really interesting way. One advantage of active inference, certainly, is that you can actually write down the generative model and transparently see how it works, and you don't get that out of just unsupervised learning. Cool. Any other thoughts? Awesome. Well, thanks so much for the discussion. We're going to re-watch, understand technologically what did and didn't work, and for next week provide all of you, plus anyone else who wants to participate, with more of a specific spec sheet for how you can prepare, to make sure that it will be a really seamless experience. And as far as the timing goes, next week we will move on to ActInf 7, which is going to be about variational ecology, which
is a really awesome paper, close to my heart as a biologist as well. We would really appreciate anyone who wants to participate in these conversations, next Tuesday at 7 a.m. Pacific or the following Tuesday. Both of these discussions will be really great, and it's quite a complex paper, so I'm sure we'll have a lot to go into again. So once again, thanks so much, everyone, for your participation. You really brought up some awesome ideas, and I hope we conveyed to those within and without the field that there's a lot of discussion ongoing. By no means are these questions resolved, and it's an opportunity for people to bring their unique perspectives to the table. So thank you all again for your participation; it's really appreciated, and we'll see you later.