 Hello everyone, welcome to Active Inference Live Stream number 6.1, it is October 13th, 2020. Whether you're a first-time listener or not, welcome to TeamCom. TeamCom is an experiment in online team communication and learning related to active inference. You can find us on Twitter at inferenceactive at our Gmail address, our Keybase team that's public, or our YouTube channel. This is a recorded and an archived live stream, so please provide us feedback so that we can improve our work. Also, all backgrounds and perspectives are welcome here, and in service of that, remember your video etiquette, mute if there's sound in the background, and raise your hand so that we'll be able to hear from everyone. Here we are in Active Stream 6.1, and today's stream is going to go like this. We're going to start out with a warm-up section where we introduce some of ourselves, especially our newer participants, and just have a quick check-in. Then we'll turn to the discussion of the paper. Today, we'll try to cover the goal of the paper, the roadmap, how they get from A to Z, go through the abstract, which is how the authors represent their work in the most distilled form, and then look at some of the figures, see where they're going to be going with the figures, what they're going to be showing, and then next week, in 6.2, we're going to have a lot more time for further discussion on this paper, and also to dive into some of the technical details of the figures, so please save and submit your questions. All right, here we are in the intros and check-ins. For the introduction section, please introduce yourself and your location. Just say a quick hello, anything else you'd like to add, and then feel free to pass it to someone else. So, I'm Daniel, I'm in Davis, California, and I'll pass it to one of our first-time participants, Matthias. It's actually Mathis Pink. 
I'm a master's student in Germany at the University of Osnabrück, which is in Lower Saxony, so it's afternoon where I am. I guess it's in the morning where you are. Yeah, I'm excited about participating. Thanks for having me. Great, let's start. I don't know, is there another person who's new here? Lee. Alex, maybe. Let's go to Lee, our other new participant. Hi, I'm Lee, I'm based in London, but I'm actually studying at the University of York, studying embodied cognition, and yeah, really looking forward to this. I will pass on to Alejandro. Hello, everyone. I'm again in Mexico, kind of tired. Last week was very hard, but yeah, nice to be here again with you all. And I'll pass it to Alex. Hello, everybody. I'm Alex. I'm an engineer, basically, and now also a researcher, and I'm situated in Moscow, Russia, and affiliated with System Management School. So I'll pass it to Klipp. Hi, I'm John Klipp. I'm in Cambridge, Mass, with the MIT Media Lab, and actually doing this in Northern New Hampshire at my farm. And I'm very pleased to be a part of this and to learn. Let's go to Alex Kiefer. And I'll pass it off to Maxwell. Okay. Hi, everyone. So I was here a couple of weeks ago. I'm a philosopher, I guess, and I'm a sometime computational modeler at Monash University, and right now I'm in New York City. And I've been involved in this debate that's explored in this paper for a while, so I thought I would join in. And I will pass it to, who's left? Shannon. Hi, I'm Shannon Brooks. I'm part of the Sensory Motor and Neuroscience Lab at the University of California in Merced. But currently I'm in South Dakota. So I'll pass it to Sasha. Hi, everybody. I'm Sasha. I'm based out of Davis, California, and I'm affiliated with University of California Davis. And also just very excited to learn and unpack some of these topics. It's been a really great past six or so streams. So I will pass it on to Stephen. Hello.
Yeah, I've been involved in this for the last few sessions as well, so very, very helpful and very interesting. I'm doing a practice-based PhD into some processes to explore social topographies, at Canterbury Christ Church University at the Salomons School of Applied Psychology, with the help of the Professional Development Institute. And I'll pass that to Maxwell, I suppose, isn't it? I believe I'm the last one. Yeah, so I'm Maxwell Ramstead. I'm based in Montreal, where I'm talking to you from, at McGill University. And I'm also the first author on the paper that we'll be discussing today. So I'm quite excited to discuss it with you. Thanks, Alex Kiefer, by the way, for showing up. Alex and I have had a sustained series of discussions around these papers that have led to follow-up papers, and I think to a really robust and fun friendship through these discussions. So I'm looking forward to discussing all these issues with you. Thanks for the continued following of our work. Awesome. You know, we bring the ideas together sometimes through the people and the friendships. And I guess I'm not going to pass it to anyone since everyone else has spoken, right? Yep, I think that is everyone. So cool. Let's go to our warm-up questions. And anyone who likes to speak can feel free to raise their hand, or jump in if there's no one speaking yet. So the first question is, what drew you to this paper or topic? And while people are raising their hand, I'll start. I think what was exciting about this paper was just combining two different schools of thought and bringing them together in a constructive way that was really super-additive, rather than just choosing sides. What about you, Maxwell? Well, the motivation for writing the paper in part was that, at least at the time, I didn't think that what the generative model business under the free energy principle is all about was really well understood at the time that we wrote this.
So I alluded to this, I think, a few weeks ago. This paper, the Answering Schrödinger's Question paper in Physics of Life Reviews, and the paper that we discussed the last few weeks, Multiscale Integration, were all originally the same paper. It was all one big thing. And we ended up splitting it into different papers to address different things. So the multiscale stuff that we'd been discussing addressed, like, the scale-free and multi-level formulation of the FEP, and the aim here was really to connect the FEP to other pragmatist approaches that center on action, and also to clarify the nature of the generative models that are at play here, which aren't just kind of brain-bound statistical models but turn out to be something like the phenotype of the organism. Cool. Anyone else have thoughts on that? Alex Kiefer? Yeah, thanks. So I guess, well, I was drawn into the debate because some of my work with Jakob Hohwy was among the sort of critical targets of the article. Although Maxwell and company were very nice to us, and they didn't say we were wrong, just that this doesn't exactly generalize to the free energy principle. So that's how I got drawn into this, and I continue to be interested in it because I think it... I mean, I think we've converged on more of an agreement than maybe existed at the time this paper came out, but... Oh yeah, 100%. I think we basically converged to one coherent story at this point, but this is two years in the making. Yeah, but the issues raised here are still interesting to think about, and I continue to learn more by thinking about them. Awesome. All right. The second warm-up question is going to be: what would be something that you'd like to have resolved by the end of today's discussion? That could be a specific question about how to apply something to a system. Yeah, Lee, go ahead. Go ahead, Lee. Sorry, I was just turning my sound off.
So yeah, essentially, I guess I was drawn to this topic because my journey towards active inference has been via cognitive linguistics. So I started off looking into relationships between language and perception, and particularly metaphor. So I think I've got quite a different understanding of what's meant by model, probably something more kind of phenomenological and maybe epistemic, and I'm really starting to understand that that's not really what is meant by active inference. So I've taken a good look through this paper, and I'm hoping to build some kind of bridge from where I'm at to more of an understanding of what's implied by a generative model in active inference. Cool. Well, great, because that's one of the main topics of the paper. And Stephen? Actually, just bouncing off what was just said there: in some ways, when you take metaphors, if you bring metaphors into embodiment, as being embodied in space and around us, then suddenly that whole metaphor work becomes almost very close to what might be a generative model, in a funny way, even though it's different. So there might be some way, so in some ways I'm kind of interested in how this sort of becomes full circle and can come back to these kind of human ways of knowing, even though we've accessed it through maybe kind of abstracted models. Interesting, and I think we'll return to some of these ideas. If anyone else wants to chime in on what they'd like to have resolved, go ahead, but very interesting to hear about metaphor as cognition and how we can humanize our understanding of some of these technical or conceptual issues. All right, well, if there are no hands remaining, I'll move on to the next slide. So today we're going to be talking about A Tale of Two Densities: Active Inference is Enactive Inference, which is an article in Adaptive Behavior in 2020 by Maxwell Ramstead, Kirchhoff, and Friston.
And in this paper, they lay out their goal really clearly, which I always love to see in a paper. They write, at the very beginning: the aim of this article is to clarify how best to interpret some of the central constructs that underwrite the free energy principle, or FEP, and its corollary active inference in theoretical neuroscience and biology, namely the role that generative models and recognition densities play in this theory aiming to unify life and mind. So what are the two densities? They are going to be the generative models and the recognition densities. And we're going to learn more about them, hear about how they're related, and discuss different perspectives on how the densities are linked. And specifically, the question is, what is the tale of these two densities? It's alluded to in the title, and it's awesome that we have Maxwell and Alex and so many other voices here to make that synthesis and that tale that we're all telling together realized. So in ActInf Stream 6.0, I provided a little bit of context, just from my perspective, on some of these issues, if people want to learn more about some of the background ideas. But for now, we're going to just jump into the abstract, and at any point people can just raise their hand and I'll pause right there and we'll take a comment or a thought. In the abstract, they begin by rehearsing what I had just read: that they're looking to clarify how to best interpret some of the constructs underwriting the free energy principle, and those two constructs are the generative models and the variational densities. So those are the two densities, and the tale is going to link them.
We argue that these constructs, generative models and variational densities, have been systematically misrepresented in the literature because of the conflation between the FEP and active inference, on one hand, and distinct, albeit closely related, Bayesian formulations centered on the brain, variously known as predictive processing, predictive coding, or the prediction error minimization framework. More specifically, we examine two contrasting interpretations of these active inference type models: a structural representationalist interpretation and an enactive interpretation. So we're setting up the two sort of sub-stories. These are the tension between these two perspectives that we're going to be looking to resolve under the FEP through active inference. We argue that the structural representationalist interpretation of generative and recognition models does not do justice to the role that these constructs play in active inference under the FEP. We propose an enactive interpretation of active inference, what might be called enactive inference. In active inference under the FEP, the generative and recognition models are best cast as realizing inference and control, the self-organizing, belief-guided selection of action policies, and do not have the properties ascribed by structural representationalists. So for this next slide, I'll really appreciate anyone's perspective or linking it back to things they've seen before, because I just put it up there as a visual and just a starting place. On the left side here, we have the Bayesian structural representationalist perspective, where we're just trying to highlight the features that are going to be most simple to carry forward. And the Bayesian structural representationalist story is about how data and another type of data, which are often called hyperparameters, are linked through a recognition model that takes data, like sensory data, and recognizes it.
And then, going the other direction, you have the hyperparameters that are generating sensory data. And we can also talk about why it's important to have this generative step. And the outcome of this Bayesian computationalist scheme is that there's a statistical convergence of a multi-level model that represents structures of the world through something like expectation maximization, or EM, models. And we can contrast that with the enactive paradigm. And the enactive paradigm is about how agents and the world are related through perception and action. And the outcome of the enactivist perspective is an embodied ecological action sequence, really, from an embedded agent who is enacting behavior. And so this is also the school of thought where we see all the E's: embodied, embedded, enactive, encultured, and all these things. And they're seemingly, at least at first pass, aimed at two quite different explanatory outcomes. They seem to be talking about somewhat different aspects of the world, and they definitely link them in different ways. So I'm just curious, Maxwell or anyone else, what led to these two models being the two cities, the two densities that were chosen? How does one come down to just two? Why is there not one or three cities? And then how did the Bayesian structural representationalist and the enactive viewpoint rise up as, like, the two kind of tier-one theories that we wanted to find a synthesis between? Well, so when I got into this literature, especially from the vantage point of philosophy, what I noticed was that very little of it was technically rigorous, in the sense that a lot of it was telling a story about how the brain roughly performs Bayesian inference and then kind of saying, well, there's like a family of different theories that do this in various ways, and grouping the free energy principle under that. I say there was like a lack of technical rigor; I want to emphasize that Alex's papers with Jakob are probably the exception to that.
When I consulted Alex's papers in 2018, I thought, well, here's some wonderful work that is really taking the time to drill down on the formalisms as they're used to study the brain. I thought that was great. But I spent a lot of time really drilling into the formalism of the free energy principle per se. And one of the things that I was sort of surprised to find out as I was learning the formalism was that although everyone is talking about the generative model, there are really two models at play. So those are the generative and the recognition models, or densities, equivalently. So first of all, when we say model, we just mean a probability density, right? So a probability distribution over a bunch of variables that are of interest to us. And so, yeah, the two models in question under the free energy principle function slightly differently than they do in more traditional brain-based formulations. So to step back a bit: in machine learning and in statistics, a recognition model basically tells you the probability of some state given a bunch of other things. It's not a joint probability distribution. And it's used essentially to recognize what's causing your data. So you're using it basically to invert your mapping. I mean, Alex, also, if I'm saying anything inaccurate here, just please jump in and let me know. But yeah, so basically in traditional kind of, you know, Bayesian brain machine learning architectures, the generative and the recognition models are basically just the inverse of one another. So the bottom-up pass is a recognition pass. It's a recognition model in the sense that it's starting from the data, and then, kind of passing through the network, you're able to infer what must have caused your data. And the generative model is the inverse pass, where you're starting from your beliefs about states and you can generate fictive data, you know, based on this model.
At least in Alex's papers, this is how it was described as applying to the free energy formulation as well. And my point in this paper was that, well, this is a very technically rigorous and accurate description of what's going on in the Bayesian brain, but these constructs have a slightly different meaning under active inference. And to bring it back, I'm almost done, just give me one more minute to bring it back to these schemas: basically, the recognition density is sort of like your best guess right now. You can think about it sort of as your posterior, rather than your prior. So your recognition density is a density defined over all of your states and your parameters. And it basically tells you, what do I believe is the most probable value of these states and parameters now, you know, given my prior beliefs and my evidence? Your generative model, by contrast, is the point of reference, you know, for the generation of free energy gradients. So it's not your posterior. It's your prior. It's sort of like it harnesses all of the priors, especially the priors about your preferred data distributions, relative to which the free energy, and therefore the dynamics, are defined. So, you know, you have inference and control, which I think you discussed in 6.0, Dan. So the recognition model is responsible for inference and the generative model is responsible for control, you might say. So I'll stop there. Cool. Alex Kiefer, and then we'll go to anyone else with a raised hand. Yeah, thanks. So I stopped myself from jumping in. I mean, that was a good summary. The only point at which I wanted to jump in was to say, well, the recognition model as construed in these Bayesian brain theories is an approximate inversion of the generative model. So if I have any complaint about this paper, it's that I think that there's a closer sort of conceptual connection between the generative and recognition densities than maybe the paper suggests.
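To pin down Maxwell's distinction between the two densities, here is a minimal numerical sketch (an editor's illustration, not from the paper; the two-state setup and all numbers are invented). The generative model is the fixed joint density p(o, s) that defines the free energy, and the recognition density q(s) is the adjustable best guess that gets pushed down the free energy gradient toward the posterior:

```python
import math

# Toy discrete setup: 2 hidden states, 2 observations.
# Generative model: joint density p(o, s) = p(o|s) p(s). This is the
# "point of reference" that defines the free energy (priors included).
p_s = [0.7, 0.3]                      # prior over hidden states
p_o_given_s = [[0.9, 0.1],            # likelihood p(o|s), rows indexed by s
               [0.2, 0.8]]

o = 1  # the observation that arrives

# p(o, s) for the observed o, and the exact posterior p(s|o) by Bayes' rule
joint = [p_o_given_s[s][o] * p_s[s] for s in range(2)]
evidence = sum(joint)                 # p(o)
posterior = [j / evidence for j in joint]

def free_energy(q):
    # F = E_q[ log q(s) - log p(o, s) ], an upper bound on surprise -log p(o)
    return sum(q_s * (math.log(q_s) - math.log(j)) for q_s, j in zip(q, joint))

# The recognition density q(s) is the adjustable "best guess":
# free energy is minimized, and equals the surprise exactly,
# when q matches the true posterior.
guess = [0.5, 0.5]
print(free_energy(guess))             # a loose bound on the surprise
print(free_energy(posterior))         # equals -log p(o)
print(-math.log(evidence))            # the surprise itself
```

Running this shows the free energy for an arbitrary guess sitting strictly above the surprise, and collapsing onto it when q equals the exact posterior, which is the sense in which minimizing free energy implements approximate Bayesian inference.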
And that, I don't think you can cleanly separate these things. So, well, anyway, I don't want to launch into this yet. What I wanted to do first was just address the question of how these two visions sort of arose. And my sense is that Karl Friston did a lot of great stuff, but the main thing he did that distinguished his approach from the existing stuff in machine learning, a lot of which was based on free energy minimization, which I think he came to around the same time as people like Geoff Hinton, was that he added action into the picture. He pointed out that you can act so as to reduce the surprise or the free energy of your sensory states, instead of just revising your generative distribution. Anyway, so I think I'll hold off on arguing for the moment. Awesome. I just want to say I agree with you now. These two things do not really come apart. I mean, it's really just an implementation of variational inference from that point of view. And this is why it's a tale of two densities: to do this variational inference thing, you need both. You need a kind of point of reference that's going to give you the free energy gradients, and then you need a sort of what-is-my-best-guess as I am performing gradient descent on my free energy. And yeah, I don't want to say that the two come apart. And indeed, I mean, I think, you know, in a newer paper that I've written on this, I basically just straight up say that you were correct initially with respect to the recognition density. So the recognition density, it's fair to say that it's a representation in the structural representationalist sense that you've been articulating in a series of really awesome papers, which you should all read, by the way. Yeah, whereas the one point of disagreement that I think we've clarified now is the status of these generative models. And yeah, I'll stop there too.
Nice, we'll go to Stephen and then anyone else with a raised hand before the roadmap. Now, one question I've got: when these hyperparameters that they use in machine learning are using a free energy, is it that they're just minimizing the kind of energy expenditure and the entropy internally? They're not using Shannon entropy in the sense of looking at how entropy is inferred or transmitted, sort of in a second-order process from interacting in the world. They're just minimizing it within the kind of data that's being accrued, which in some ways seems to happen easily when you look at, like, vision data, but may not be so easy to parse when you look at the whole body. So that question of entropy: is it that they use entropy in a different way, not really the kind of external entropy of interaction, but just the entropy within the calculations? There are two kinds of entropy at play in general, right? So entropy in the thermodynamic sense is a measure of how many microstates are compatible with a given value of a macrostate, right? So, I don't know, your temperature is 36 degrees Celsius. How many different microstate configurations are compatible with that? So that form of entropy is one special case of the broader kind of entropy, which is more or less a measure of how flat your probability distribution is over your states. So if you have a perfectly flat distribution, entropy is maximal. And so the entropy that we're concerned with is really the second type. It's the information-theoretic entropy. But it's transpired over the last few years that variational free energy is also a thermodynamic free energy; you just have to multiply by Boltzmann's constant. So essentially, yeah, it's always information-theoretic measures, but if this is realized in actual physical systems, it is also a thermodynamic free energy.
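The two senses of entropy Maxwell distinguishes here can be illustrated in a few lines of code (a hedged editorial sketch; the distributions are invented for illustration):

```python
import math

def entropy(p):
    # Information-theoretic (Shannon) entropy, H[p] = -sum_i p_i log p_i, in nats
    return -sum(p_i * math.log(p_i) for p_i in p if p_i > 0)

flat   = [0.25, 0.25, 0.25, 0.25]   # maximally uncertain beliefs
peaked = [0.97, 0.01, 0.01, 0.01]   # nearly deterministic beliefs

# A perfectly flat distribution over N states has maximal entropy, log(N).
print(entropy(flat))     # = log(4), the maximum for 4 states
print(entropy(peaked))   # much smaller

# The bridge to thermodynamics mentioned above: for a physical system,
# the Gibbs entropy is the same functional scaled by Boltzmann's constant,
# S = k_B * H (here k_B is in joules per kelvin).
k_B = 1.380649e-23
S = k_B * entropy(flat)
```

The flatness point is exactly the one made above: the flat distribution attains the maximum log(4), while the peaked one sits far below it, and the thermodynamic quantity differs only by the constant scale factor.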
But the variational free energy, it sounds kind of spooky and esoteric, but it's really simple, really: you have a preferred data distribution and you have an actual data distribution, and the free energy is just a way to quantify the difference between the two. Awesome. We'll go to Alex and then back to Steven. Yeah, I was just going to briefly say that in the earlier machine learning literature, the free energy definitely wasn't supposed to be anything thermodynamic, or maybe that was an open possibility, but it was just a measure, as you were saying, Steven, between two internally determined distributions. It was the top-down generative posterior versus the approximate posterior. And any connections to thermodynamics are really cool, but I think that's an additional, very substantive question slash thesis. Steven, and then anyone else who raised their hand? Yeah, I think this is quite a useful distinction, because I think that's where a lot of the mixing gets sort of caught up at the moment, because a lot of stuff has been referred to in the last 15 years around Bayesian optimization and Bayesian stuff, pretty much as if it's all contained in the data in the brain, and now it's like a reconceptualizing of that. So I think it is a big challenge for people to forget about what they've learned before. Really agreed with that, Steven, and that's really what this conversation and synthesis is about: bringing those often qualitative insights from enactivism and saying, well, wait a minute, perception isn't simply the reverse of action. There's not photons coming out of your eye. So what is different between perception and action, but also recognizing that the world and the agent have this sort of symmetry and that they are linked?
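Maxwell's "preferred versus actual data distribution" picture can be made concrete with a toy divergence calculation (again an editor's sketch; the creature and its numbers are hypothetical, not from the paper):

```python
import math

def kl(p, q):
    # KL divergence D[p || q] = sum_i p_i log(p_i / q_i); zero iff p == q
    return sum(p_i * math.log(p_i / q_i) for p_i, q_i in zip(p, q) if p_i > 0)

# Hypothetical creature whose phenotype "prefers" outcome 0
# (say, body temperature in range) over outcome 1 (out of range).
preferred = [0.9, 0.1]   # the preferred (expected) data distribution
actual    = [0.5, 0.5]   # the data distribution actually being sampled

# The mismatch the creature must keep small: acting on the world pulls
# the actual distribution toward the preferred one.
print(kl(actual, preferred))     # positive: preferences not yet realized
print(kl(preferred, preferred))  # 0.0: preferences fully realized
```

This is the control half of the story: where perception adjusts the recognition density, action changes which data get sampled, shrinking the divergence between what the organism gets and what its generative model says it should get.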
So again, anyone can raise their hand, but we're going to turn to the roadmap and ask how the authors set up the discussion between the Bayesian and the enactivist approach and then ultimately converge towards something of a synthesis. And on the right side is just the title page of the Dickens work, A Tale of Two Cities. So first there's an introduction section, and then they talk about statistical models as representations. Specifically, they talk about generative models and recognition models in Bayesian cognitive science, as well as generative models as structural representations. So that's the Bayesian structural and the representational components. Then they talk about generative models and action policies, which is sort of opening the door to this potential inadequacy of the purely Bayesian approach, which is that it's not action-oriented. In the third section, they discuss the active inference framework. First, they start by talking about how phenotypes are conceptualized under active inference, and about Markov blankets. And there's a figure one, which we've seen before and we will see again, about how Markov blankets and active inference are linked. They then discuss surprise, entropy, and variational free energy, which are all terms, perhaps not surprise but the other two, that we've brought up in this discussion, and in the paper they really go into detail a bit more about how these topics are linked. There's then a discussion about how active inference specifically links the variational free energy with the inferential models. In section four, A Tale of Two Densities, they talk about how the generative model and the recognition density are to be conceptualized under the FEP. And specifically, they talk about the generative model and the generative process in active inference. In figure two, they depict a generative model and represent it as a Bayesian network.
In figure three, they talk about the same generative model in active inference, represented as a Forney factor graph. And the similarities and differences between figures two and three we're probably going to go into next week in 6.2. There's then a section on variational inference and recognition dynamics under the FEP. In the fifth section, they turn to purely enactive inference, and they first stake the claim that generative models are control systems. And so this sort of borrows on and builds on the generative model idea, which is a bit of a computationalist perspective, as well as this control systems perspective, which implicitly means action orientation, because it's control of action, not just control of theoretical parameters. Then they get to the fighting-words section: generative models are not structural representations. I always like it when we can get down to specific claims and negatable statements and just really being specific, because sometimes these abstractions get pretty far out. There's then the final figure four, which depicts the action-perception cycle in active inference as a generative model and a process. And last, they conclude with some remarks. And the remarks also conclude with a section towards multidisciplinary research heuristics for cognitive science. And I know we have a lot of awesome cognitive scientists and other perspectives in this room, and so this is sort of our springboard where we can take the ideas that we're talking about here and ask, how does this impact how we're going to do cognitive science research? What kind of explanations, what kind of predictions, what kind of experiments are we going to do differently? How will our research paradigm be different if we take this approach that is synthesizing the enactive and the Bayesian approach? So from cognitive scientists especially, it would be awesome to hear any thoughts.
And while people are sort of collecting their thoughts about anything on the roadmap or prior, in the last 20 minutes here we'll have good time to go through the figures and ask how they relate to this big topic. How are the two cities linked? What is the highway that connects the two cities, or what is the broader collaborative network that links these two? Any thoughts here on the roadmap? You definitely laid out the reasoning in the paper very well, so thank you. Good, and these papers are straightforward to lay out because they really are almost an additive way of laying out the topics. First, you lay out the strongest version of the other approach. It's the professionalism and the clarity of not misrepresenting another viewpoint just to throw it under the bus and make a straw-man argument. Laying out the strongest form of the prior literature is what allows us to build another level of strength on top of the literature, rather than just cutting out the pillars from underneath us. I have to say, we started off with a much stronger position than we ended up with, and it's by engaging seriously with the arguments that Alex and Jakob Hohwy prepared, and that a few other people, Paweł Gładziejewski and so on, had been arguing for. Thank you for saying that about the professionalism. I really think that there is something to that argument, and the whole point is to keep the baby while getting rid of the bathwater. The baby is this sort of idea that there is something like a representation of the external world that carries semantic content and so on. It's just that it's not the generative models, it's the recognition densities. That was sort of the tweak that we wanted to bring to the table. Cool. Alex, and then anyone else who raises their hand. Yeah, like Maxwell says, we've had a lot of interesting discussions about this.
I guess the point on which I remain unconvinced is that I still think of the generative model, even in the FEP, as a representation, so we can maybe talk about that. I think what this brings to the table, and this has transformed how I see this stuff, is the importance of understanding the generative model as a control system. I just think you can do that in a way that also grants that it's a structural representation, and that's still the framework in which I work. So I think there's another question raised here about whether the generative model is sort of encoded, and I do understand where the authors are coming from on that question, and I think it's not so straightforward in the FEP as it would be in some earlier models like the Helmholtz machine. So I don't know, I'm curious to see if Maxwell and I still disagree. I have the sense that I maybe still want to call the generative model a representation more than Maxwell does, but we can get to that. Can I ask Alex, and again anyone can raise their hand and offer a thought here too: what is the advantage of wanting something to be a representation, or what is the alternative to something being structural? Right, so I don't think in those terms, in terms of the advantage of it. I just think, is it a representation or not? I think it is, based on what representation is and how this thing functions in this model. So I know oftentimes people talk about, like, what's the explanatory value of talking in terms of representations. But to answer more directly, what's the alternative to it being structural? That's another good question. In my view, this structural representationalist paradigm really isn't anything new; it's kind of just the core notion of representation that's been at work in serious cognitive science since, like, Turing.
So if you go back, there's a paper from 1980 by Allen Newell on the foundations of cognitive science, and he described how what gives a physical symbol system its universal computational capacity is its capacity to simulate other systems. I really think this notion of simulation is at the heart of what representation is supposed to be in cognitive science, and that really just is structural representation. And, I won't go on forever, but if you go to Cummins's early work on structural representation, the S in S-representation can also be read as simulation. So it's not that I think you need to call the generative model a structural representation; I just think once you understand how it functions, I would say even under the FEP, that's what it is. But there's still a question about how it's encoded, or whether it's encoded; I think that's a separate question. Awesome, thanks for that response. We'll go to Lee and then anyone else who raises their hand. Okay. So, excuse me, I was just trying to make sure that I was understanding terms in the same way they were being used, but Alex, you just talked about the notion of representation as simulation, and one of the ways I kind of arrived at active inference is via the work of Barsalou and perceptual symbol systems theory, where simulation is central. Is that overlapping with what you mean by simulation, or does it have a different meaning, if you're familiar with this work at all? Probably. I'm not as familiar with that work as with the Newell paper I just mentioned. I am, and it is just a great idea. No, just kidding, I want to read it. Okay, I'll take that.
Yeah, I mean, essentially, especially these deep generative models, they allow you to entertain... well, there's a sense in which the Barsalou idea of metaphor is just what generative models do, if we're talking about a kind of loose associative structure that redeploys inferences about one domain in another. This is the kind of thing you would expect the generative model to be able to do, so these things are definitely, I think, strongly aligned in effect. Didn't Steven suggest something similar? I think, yeah, metaphors as generative models in a strong sense, cool notion. We'll go to Alex Kiefer and then anyone else. Yeah, so that's interesting. I don't know what to say about metaphors, but the way I see this now is that I don't think what Maxwell et al. are saying is at all wrong; I just think that the structural representationalist reading, in my view, doesn't quite say enough, but not that it's wrong in any particular respect as applied to the FEP. So after further conversations with Maxwell and also with Michael and the other authors on this, it seems to me that there's still a need for a sort of neuronally realized generative model, in addition to considering the entire phenotype to be a generative model. So if you have a fancy organism that can do deep temporal modeling and plan for the future and things like that, I think there's still a need for something that looks like a structural representation in the brain; that is, the generative model can be construed that way, as well as there being this larger, more encompassing system. And maybe this could be somewhat hashed out by appealing to the multi-level active inference stuff. But the last piece of this: in discussions with Michael Kirchhoff, I hope I'm saying his name correctly, we've never said it out loud to each other, it seems as though at least he is thinking of top-down
propagation of signals as a form of action, so the stuff that we were saying about top-down generative models might just be a special case of the sort of action that this paper is talking about. Cool. Before we go to Steven and then Lee, it really reminds me of Alejandro's point one or two weeks ago about the similarities and the differences between the systems that can at least appear to do these deep simulations and counterfactuals, like the brain, versus an embodied phenotype; the cell perimeter is not simulating other cell perimeters, and that's a little bit different from the way that the brain might be able to do something strategic. So we'll go to Steven, and then anyone else can raise their hand. One thing I always look at when we talk about representation is to think of re-presentation: is there another presentation in the brain, which often people start thinking of as a projection, and then you start getting into this very visually dominated view of what our knowing is? Whereas it's a bit different to re-enacting or re-imagining. So if I was to re-imagine, as an engineer, a model and then manipulate it in my brain, well, am I manipulating it in my brain, or have I re-imagined it in my peripersonal space, and am I now manipulating it actively in my peripersonal space? And that is what I'm actually thinking of as a representation, but it's not actually my brain at all; it's just a re-enactment in space for me to then start to work with, or if I do it on paper. I think the mechanism for that is not quite clear, but this sort of opens up that question. We'll go to Lee. Stephen, I think I understand what you're talking about, but are you just kind of mixing up the phenomenological with the underlying ontological level? So if you have a representation in the brain and you simulate it, it manifests in experiential space or perceptual space; is that what you mean?
Well, to manifest it in phenomenological space is to enact it, is what I'm saying. So it's not like it's in the brain and then I project it into space. It's not in the brain; it might be encoded, like place cells, grid cells, all that stuff, but it might be encoded in a way that can re-enact it. But once it's re-enacted... so the work with Clean Language and micro-phenomenology, embodying concepts in space around us: my feeling is that a metaphor is kind of like a reading of our affect and our thinking, our beliefs, re-created in a physicalised way. So we actually do this; like, I feel like I'm stuck behind a wall. You can actually ask people where the wall is, and they'll know where the wall is and what it's like to be stuck. So suddenly you find that they've actually got a model, but they never had that model before; they created it. So the question becomes: is it a re-enactment, a re-imagination, and how much of that was already, I suppose it's a question, was that already sitting in the brain ready to go, or did it only come about through the act of re-imagining? Cool. In our last 10 minutes it would be awesome to hear from anyone who hasn't spoken, as well as to flip through the figures so that we can see where we're going to be headed to do some technical unpacking, especially next week. So anyone can feel free to raise their hand if they have a related or unrelated point, but in these last few minutes let's just try to look at the figures and hear some closing thoughts. In figure one we have the Markov blanket and active inference, and we've returned to this figure many times; it's used across papers, and it's really nice to see it again and again because the context is always different. The part that I underlined in red was: in graph-theoretic terms, the Markov blanket per se is defined as the set of nodes that isolates internal nodes from the influence of external ones. And in this capacity it's really good
that we had the discussion about integrating internalism and externalism, because those are also two ideas that have this tension between them, and the beyond internalism and externalism paper was about building on top of those, about taking the strengths of those two ideas and then asking how they're integrated under active inference. And today, what we're doing in this paper is taking interpretations of internalism and externalism, specifically this Bayesian, more computational perspective and this enactive, embodied perspective, and building on those strengths. So it almost reads like a recipe for how active inference can be used for synthesis: identifying areas of tension in the literature, or previously unrecognized harmonies between different ideas, and then asking how we can build strength on strength, rather than attack strength with weakness or just contrast weaknesses. And I'm just going to continue to show the other figures, but I'll pause anytime someone raises their hand. On figure two, oh, Alejandro, go ahead. Actually, yeah, on that figure I can take up again the question you mentioned. I'm still confused in terms of generative models and the recognition density: at which temporal and spatial scale? Of course, in each individual cell of the brain this process is occurring, and then in, I don't know, a layer of the cortex, and then in the hierarchy of the cortical layers, and then in the whole brain, and so on. So belief updating is happening in each cell; I don't know if I'm getting to the point, but the cells have to maintain, actually, this blanket also contains some beliefs that can never change, right, so that the cell continues being the cell it is. So I don't know, this process of the densities is occurring at I don't know how many scales. And I mean, we'll end up hopefully discussing this next week if it's unclear. I have to say, I don't think the figures are super useful in this paper; they're mostly there to illustrate a few things.
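As an aside on the graph-theoretic definition quoted from figure one, the Markov blanket of a node (its parents, its children, and its children's other parents) can be sketched in a few lines of code. The toy network below, with external, sensory, internal, and active states, is a hypothetical illustration for this sketch, not the paper's model:

```python
def markov_blanket(node, parents, children):
    """Markov blanket of a node in a directed graphical model:
    its parents, its children, and its children's other parents.
    Conditioned on the blanket, the node is isolated from the rest."""
    blanket = set(parents.get(node, []))               # parents
    kids = children.get(node, [])
    blanket.update(kids)                               # children
    for child in kids:                                 # co-parents
        blanket.update(p for p in parents.get(child, []) if p != node)
    blanket.discard(node)
    return blanket

# Hypothetical toy chain: external (eta) -> sensory (s) -> internal (mu) -> active (a)
parents = {"s": ["eta"], "mu": ["s"], "a": ["mu"]}
children = {"eta": ["s"], "s": ["mu"], "mu": ["a"]}

# The internal node is shielded from the external one by sensory and active states
print(sorted(markov_blanket("mu", parents, children)))  # -> ['a', 's']
```

Note that the external state never appears in the internal node's blanket, which is the sense in which the blanket "isolates internal nodes from the influence of external ones."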
What's really important is the equations, so for next week, Dan, I'll show you which ones in particular, so that we can put them up. Yeah, so, Alejandro, what you were saying is exactly right: all of these things are performing belief updating, and when we talk about behavior or dynamics or whatever, it is all belief updating, and belief updating just means changing the physical value of the recognition density at a moment. So basically, in this paper, the big conceptual contribution that I tried to make to this literature is to say, well, with active inference we can rethink what we mean by embodiment and enactment. So embodiment means encoding a recognition density: your physical body is a guess, a posterior probability that you're constantly updating through this process of belief updating, aka dynamics, aka adaptive behavior. We literally embody this density; the physical states of our body encode the parameters of probability densities. This is the idea. And then basically the generative model is just the point of reference for the dynamics; it exists in the dynamics. The way I like to think about it is sort of like dominoes falling over, where the recognition density is the dominoes: each of these dominoes is like a posterior estimate, and they kind of accumulate, but the wave is the generative model. You see, the generative model exists in the same sense as the wave making all the dominoes fall, and the recognition model exists in the sense of the physical dominoes that are encoding the process as it flows over. But this will be, I think, hopefully clearer next week. Mathematically it's actually pretty simple: the generative model is the joint probability density over all of your variables, and this density doesn't exist in the brain; it's not anywhere. What exists is the factorization of this density. So, you know, you always see this in all these active inference papers; it's like P
of your variables, so the eta, mu, and all the parameters, etc., equals this long factorized expression, which is basically a product of likelihoods and priors. So this product of likelihoods and priors is the recognition density that's constantly updated, and that's what exists in the brain. But this joint density only exists as a function of the dynamics; it's like all of these partial carvings-up of this density, right, this factorization: they all move together, and in their dynamics together they realize this joint density. Sorry, I do have to end it at the :59 as we discussed, but this is the perfect excitement to build for our follow-up discussion next week. We're definitely going to go into this question; we'll just pick up right here, we'll literally just hit play on the video and go into detail on the technicalities, and let's even have some dominoes on the screen. So again, thanks everybody for understanding about the timing. I've provided a follow-up form to the live participants in the chat, and we welcome any other feedback, suggestions, or questions, and please just stay in touch and stay engaged if this was interesting or exciting. We welcome all participants, and it would be awesome to have you on this stream asking the questions and also learning by doing with us. So thanks everyone for the awesome and energizing discussion, and I'm really looking forward to next week, when we can go another level of detail into all of this. So thanks so much.
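As a minimal sketch of the factorization discussed above, using generic symbols rather than the paper's exact notation: the generative model is a joint density over causes $\vartheta$ and observations $o$ that factorizes into likelihood and prior, while the recognition density $q$ is what belief updating actually adjusts, by minimizing variational free energy:

```latex
% Generative model: joint density, factorized as likelihood times prior
p(o, \vartheta) = p(o \mid \vartheta)\, p(\vartheta)

% Variational free energy: belief updating adjusts q to minimize F,
% driving q(\vartheta) toward the posterior p(\vartheta \mid o)
F[q] = \mathbb{E}_{q(\vartheta)}\!\left[\ln q(\vartheta) - \ln p(o, \vartheta)\right]
     = D_{\mathrm{KL}}\!\left[\,q(\vartheta)\,\middle\|\,p(\vartheta \mid o)\,\right] - \ln p(o)
```

On this reading, the joint density $p(o, \vartheta)$ is the "domino wave" that is nowhere encoded as such, while the factors being updated are the "dominoes" physically realized in the system.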