Hello, everyone. Welcome to the ActInf Lab, to ActInf Livestream #032.2; it's November 16, 2021. We are a participatory online lab that is communicating, learning, and practicing applied active inference. You can find us at some of the links here on this slide. This is a recorded and archived livestream, so please provide us with feedback so we can improve on our work. All backgrounds and perspectives are welcome here, and we'll be following good and fun video etiquette for livestreams. At this short link, you can see some of the upcoming streams. This is our second participatory group discussion on the Stochastic Chaos and Markov Blankets paper. And then in the second two weeks of November, we're going to be discussing Thinking Like a State by our friend and co-participant, Aval. We haven't yet set the papers for December, so if you know an author who you'd like to invite, or if you yourself would like to join for several weeks of discussion of one of your works, then just let us know. Today in ActInf Lab stream #032.2, we're aiming to learn and discuss, to ask what-if and what-is, and to hold all these different questions at once. We're in the second participatory discussion around the paper Stochastic Chaos and Markov Blankets by Friston, Heins, Ueltzhöffer, Da Costa, and Parr. Last week, we had an awesome discussion in the .1, and we got right up to the cliffhanger where the M word actually was deployed. I think today we're going to sprint up to that cliff and then jump off, so it'll be a fun discussion for those who are watching live. Please do write questions and comments in the live chat, and Stephen and Dean and I and anyone else who joins will be more than happy to hear what you have to share and to roll with it as you write things. So let's begin just by giving an introduction or warm-up.
So we can say hello, and then also just maybe what was exciting about the paper overall; or, since we all participated in the .1, what are you looking forward to in the .2? Or maybe how have you changed how you thought about it over the previous week? So, I'm Daniel. I'm a researcher in California, and I'll save what I'm excited for until a little bit later, and I'll pass it to Dean. Good morning. My name is Dean. I'm in Calgary. In the one week's time since we left off, I think this paper, more than many of the other papers we've had conversations around, breathes life into something that is typically perceived as not very alive; it's just sort of pointed to and explained. But I think this is one of those papers where, when people pick it up and start having conversations with one another about the kind of work being done here, it makes it really, really interesting. It doesn't necessarily give us the answers that the authors could project would be given, and I really like that. Thank you, Dean. I'll pass it to Stephen. Good morning as well, from Toronto. I'm really interested in this paper, I think because it does give a unifying perspective; it does help with that unifying perspective, even if at first, when you see things like stochastic chaos, it can be like, oh, we're going down another rabbit hole, etc., etc. But actually, rather than always justifying or talking about applied active inference through applications, there is some benefit to this kind of coming back to the foundation. So I'm excited by that, and also, following a conversation we had earlier, and the fact that we have this Complexity Weekend: some of the ways that complexity can be talked about, outside of a more woo-woo conversational sense, at these foundational levels, which can seem to get into unnecessary detail.
Ultimately, I think it starts to allow some of these crossovers between applications of complexity and this to happen, which, for me personally, I don't think I was necessarily able to articulate a year ago. So I think some of this work does just help fill in some of those gaps. Thanks; I agree. The system that's picked up and turned around and remodeled throughout the course of the paper, in a very narrative-driven way, is the Lorenz attractor, which is the classic complex systems model. So it plays into contemporary work on complexity in so many other areas. And then also, Dean, I really like what you said. The authors, I hope, will be able to join for a future presentation; we're in contact, though they probably won't join today. But what the authors are putting out there, and then how that's received and communicated around and explored, that's what's fun. And I know that we'll have many cool things to explore today. So let's jump to the big questions. Then we'll look at the roadmap and try to remember where we got to last time, and maybe a different big question is striking either of you or anyone in the live chat today. And then we'll try to remember how far on our road trip we got last Tuesday, and then let's see which sections we're going to get to. And we have a bunch of other things written down. So the big questions that we had previously asked were: what is a good model of thingness in a chaotic, dynamic, and dissipative world? How might this model or approach to thingness speak to system sentience? And then, there in little tiny font: it's sentience not in the phenomenological, like experiential, sense (and greeting Cerval in the chat, good to see you), but sentience in the sense of feeling, thinking-and-feeling systems, without worrying about the first-person experience side, at least in this paper.
And then the key technical question of this paper, which touches on a million and a half other areas: what is a Markov blanket? Just what are these blankets, and how is one identified or defined statistically? What are the pros and cons of different approaches? So that's some of the technical detail that we'll get to. So, Stephen, thanks for raising your hand, and then welcome, Cerval; after that, feel free to say hello and introduce yourself if you'd like. Yeah, I'm curious. I noticed here that we start with thingness, which I do agree is this big question. Thingness is something that Carl Friston brings up quite a lot, and in that part I feel he's often a bit of a lone voice in terms of really pushing the thingness side of things. And I think it's really interesting, that second question then of, okay, how does that speak to system sentience? You see now we move into systems, right? And okay, thingness, to me, gives rise to systems and gives rise to sentience. However, I think thingness is still there, right? Thingness is still in the game, in some ways, of being maintained. So that first question: because when we then get into the third question, thingness can get lost, and it's all about system sentience. And I think there's something really interesting there, because there is a question of, I mean, even as we say in those brackets, what it really means to have system sentience; or even just 'system' is probably rather ill-defined. And I suppose to some extent, how much is it even needed? How much can thingness be maintained as an equal partner in the game, and not sort of passed over once we scale? So I think that's a good piece of provocation. Thank you. And to what extent can things and systems be identified in a bottom-up way from measurements of dynamical systems? Cerval, would you like to add anything or say hello? Bonjour. Hello, I'm Cerval. I'm a complexity scientist.
I founded a kind of special lab which is interested in building active inference models of social and political change. And I'm here to see the discussion on the formalism of the paper, which I did not review, and therefore cannot make informed comments on. Thank you, Cerval. Dean. I just want to add one thing to what Stephen was talking about there. I think what the big questions, and what this paper, address pretty directly is: can we accept, at minimum, that there's an abstraction and that there's a material aspect to thingness? And marrying those two things is kind of what we're hoping to be able to do, but it comes in two very discrete forms. So what of that? We call it thingness because it kind of spans both realms, and that's what I take away from the idea that it can be chaotic and not. Absolutely; there's the map and the territory. And one thing that we're going to return to is the first map that we apply to the territory. So already we're separating the territory and the map, but even the first map can be chaotic. And then, can we put on a second map, the Laplace approximation as it turns out in the paper? Can that second map be one that gives us properties that make it more tractable to handle? Like, you know, potholders taking something hot out of the oven: it gives a better grip on the first map, which gives a better grip, in terms of actionable insight, into the territory. Can that second level both be more tractable, especially using computers, but also retain those chaotic properties? Which we're going to see when they estimate the Lyapunov exponent of the Laplace approximation. That's the recursion prior to the repetitiveness; that's what I think this really gets into, recursive modeling. Thanks, Dean. So, Stephen. Yeah, I'm going to put this as maybe a little bit of a question as well, but it's rhetorical if it doesn't want to be answered.
But I think that with the thingness and the Markov blanket, because we're getting to that stage here, there tends to be a sense that, okay, we have a thing, and because it's trying to maintain itself in a dissipative world, then that's effectively a system, because it's got a Markov blanket and we're in that world. But with this it's like, well, look, you've got the Lorenz attractor. Then the Lorenz attractor goes into thingness, right? And then thingness potentially goes, probably with a few phases, into a sentient system. So the question is: which side of thingness is the Markov blanket? Does the Markov blanket emerge from going from Lorenz attractor into thingness and then continue? Or is it Lorenz attractor into thingness, into Markovian-blanketed thingness? I'm curious what your thoughts are on that, and maybe this paper sheds some light on it. Great question. Let's look at the roadmap and just see how that comes into play. So we talked in 32.1 about a few different ideas, and we have some slides ready, just tabula rasa, if we'd like to add notes; we can always modify slides and deal with questions as people raise them. Let's go to the roadmap, though. Okay. So the roadmap of the paper is as follows. In section one, there's an introduction. In section two, there is 'from dynamics to densities.' This is one of the really fundamental linking moves early in the paper, where dynamical systems are linked to flow-based descriptions, to densities. This equivalence between dynamical systems and density-driven systems is going to be finessed extensively in the paper, so that's an introductory step that's very important to note. The Helmholtz decomposition is a tool from vector calculus that applies very well to densities and flows, where it's usually applied; but because of the links made in section two, it becomes possible to take that Helmholtz decomposition approach to dynamical systems.
That is specifically visited, revisited, and then moved beyond in the Lorenz system, one of those classical chaotic systems and a favorite toy of complexity analysis. The Lorenz system is approximated with a Laplace approximation and then moved beyond even there. So that's pretty much where we got to last week. We tried to do a really solid grounding in, first, the equivalence between dynamical systems and density-driven systems, two representations of underlying systems that can be seen as equivalent. And then we talked about the Lorenz system, and about how it's chaotic, where chaotic means sensitivity to small changes, as quantified by the Lyapunov exponent. And then we talked about the Laplace approximation, which is kind of like fitting a quadratic, which is kind of like letting the ball roll to the bottom of the bowl. And this idea of, well, if we can phrase the problem like a bowl, then all we have to do is roll to the bottom; there's only one bottom, and we know which slope we're on. That's also called convex optimization, because no matter where you are, if you know that the problem is shaped like a quadratic curve, like a bowl, you can just instantaneously estimate the slope and then run downhill. Whereas if you're on a rugged landscape, you know, lost in a mountain region and going downhill, you're going to get trapped in a valley potentially, and you might not make it out to the ocean. But if you could reimagine that rugged landscape in a way that was more bowl-like, it would allow for tractable estimation of which way to move. In section four, we get to the Markov blanket concept and the free energy principle. And I think that's going to be the really fascinating discussion: okay, let's get at what that Markov blanket is. What is that partitioning? What are we really talking about? And then, are we talking about the territory? Are we talking about the Lorenz map?
Or are we talking about the Laplace map? Where does this blanket partitioning come into the picture? And then, how does that help us reduce our uncertainty about systems, and what does it say, or not, about that underlying system? That's where we're going to be talking about particular partitions; of course, a favorite and recurrent pun, which is that 'particular' means specific, but it also means like a particle, like an autonomous particle drifting around in space. And then section five really comes to a culmination with a discussion of the free energy principle alone. So I think it'll be interesting to discuss what is subsumed into the free energy principle, and why we need one section on Markov blankets and the free energy principle and then another section on just the free energy principle. So that's kind of the roadmap. Stephen, and then Dean. Yeah, just a quick question; see if this clarifies something, which I think is what you're saying. Dynamics is the dynamics; more particularly, flows and densities. While you could have a density of particles, it's actually more like a probability density, as on wave functions. Would I be right in thinking there's kind of a wave-particle, or actually a particle-then-wave, way into this, and that's partly how they're playing this out? Someone with more experience with the formalism would be totally welcome to contribute on that question. My first take would be: you could make a dynamical systems model, like equations of motion, for a particle floating around. So it's like you're focusing on the first, second, and third derivatives of movements in XYZ space, or tetrahedral space, for some particles. So a dynamical system model could be describing the movement of a particle. But the flow that gives rise to the movement of that particle is kind of like figure and ground: the dynamical equation features the movement of that particle.
I want to know the pendulum's location, and so that's going to be a dynamical model of the pendulum. Whereas the flow-based models, like you pointed out, are much more about the underlying space of the flow of that pendulum. So it's like, if it's to the left of the bottom, it flows this way; and if it's to the right, it flows that way. And so that is about asking what the underlying flow dynamics are that give rise to the movement of, potentially, particles. And that's why this meme is on the slide, and why they are made equivalent early in the paper: because it is like talking about the difference between the movement of a point through a space versus understanding the flow dynamics of that space, which, yes, does give rise to the movement of particles. And can I just ask one more? So it almost goes from just purely the momentum of a particle to this flow, and the flow is kind of a bridge then to density dynamics, which is maybe more of a probability density of where it could be; which can map both to where an object might be, but also to where a wave function might be, in a broad sense. So you're going from literal particles in a classical sense to the flow, and then the flow bridges into density dynamics, which can be more probability-orientated. Is that fair to say? Yes. 'Particle' is going to take on a particular meaning a little bit later, when we talk about the partitioning of states, but I believe that's correct. And also the density, like you suggested, ties much more closely to statistical density distributions. In statistics, we're often interested in density distributions of a special type: the area under the distribution summing to one, the integral over the distribution being one, allows us to treat it like a probability distribution. And so that's the whole debate in quantum.
Is it actually a wave, or does that distribution of positional uncertainty merely describe our statistical inference on the location of something that's at a specific point? So whichever side we want to come at this from, section two says it's going to be okay to use both. We're going to sometimes lean more on our left leg and sometimes more on our right leg, because there are tools that exist for one but not the other. Dean? Yeah, I don't know if this begins to answer Stephen's rhetorical question, but I think where most people get introduced to a Markov blanket, they get introduced to the idea that it's either a way of partitioning and dividing, or a way of being able to envelop, to blanket. And I think what the paper points out is that it's both, all the time. So you can, for example, find a box of cookies and see all the cookies settling at the bottom, whereas the person who's described as being creative uses that box of cookies as a leverage point to get more cookies off the top shelf. Now it's a platform, and now it's a mound. And so we can see it both ways at once, if we think of it as both ways at once. It's not an either/or question. It's that 'minimum of two' piece again that is really fascinating about this, that really kind of blows people's minds. It's really both at once, and then you decide which it's going to be. And that's what I think is really fascinating for people who are just starting to get into this, because they've been so habituated to saying, oh, it's a cookie jar; oh, it's a platform. No, it's both. And that's what we want to do. Thanks, Dean. And Marco in the chat wrote: like a boat in a stormy sea, it moves at the mercy of the seas and winds, flowing forces, with only its rudder, sails, and other limited mechanisms to steward the world's influence on its journey. So, Bucky meme one: unity is plural and, at minimum, two. Bucky meme two, on his gravestone: call me Trimtab, kind of like the rudder.
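The two pictures in play here, individual trajectories versus a settled density, can be sketched numerically. This is an illustrative example rather than anything from the paper: the Ornstein-Uhlenbeck process and every parameter value below are assumptions chosen for simplicity, but they show how a cloud of simulated particles (the dynamics view) settles onto the stationary density predicted by the Fokker-Planck equation (the density view).

```python
import math
import random

# Illustrative sketch (not from the paper): one system, two views.
# Dynamics view: many trajectories of an Ornstein-Uhlenbeck process,
#   dx = -theta * x * dt + sigma * dW   (Euler-Maruyama integration).
# Density view: the Fokker-Planck steady state is a Gaussian with
#   variance sigma**2 / (2 * theta); the particle cloud settles onto it.

random.seed(0)
theta, sigma = 1.0, 1.0          # assumed parameters
dt, steps, n = 0.01, 1000, 1000  # t = 10 is long past the mixing time

xs = [0.0] * n                   # all particles start at the origin
for _ in range(steps):
    xs = [x - theta * x * dt + sigma * math.sqrt(dt) * random.gauss(0.0, 1.0)
          for x in xs]

empirical_var = sum(x * x for x in xs) / n
stationary_var = sigma ** 2 / (2 * theta)
print(empirical_var, stationary_var)  # the two should be close
```

The point is only that the trajectory picture and the density picture describe the same system: run the dynamics long enough and the statistics of the particle positions reproduce the steady-state density.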
And that is exactly getting at the point of this paper, which is: if we were just talking about a dust molecule floating in the air, then we could say, well, we could use the dynamical systems model of the position of the dust particle, or we could do inference on the flow of the room. And that gives us a different and complementary, totally coexistent picture. But that's a dust molecule. What about active systems that have affordances, that are able to do inference on action selection, to choose different policies? How do those active systems do inference? That's the winged snowflake; that's the moth to the flame. And those are the really exciting systems. That also speaks to the difference between what's called mere active inference, like the particle, where we can still have this partitioning and still see it as a dynamics or a flow, versus adaptive active inference, where we're thinking about those systems in which the particle, the particular states, are able to effect different policies and select between them in a way that acts as if it's resisting dissipation. And those are the kind of exciting systems that we want to get to by the end of the paper. Stephen? And pulling on that idea of the cookie jar: the cookie jar is maybe almost a particle, a thing that is leveraged, and there's a thingness which it becomes. So it has a thing about it, but there's a thingness of being an affordance. So this is interesting, it just sort of came to me, but the thingness in itself is slightly emergent, because only by the process of trying to use it to get the higher cookie does another kind of thing emerge, which becomes a step. So there's something interesting there about the ness as an emergent. You know, is it ever a step, really? But it's never not a thing at the same time. So there's something quite interesting there, in that emergent-stroke-literal sense.
And also, I don't know to what extent this is intentional, but thingness: we talk a lot about N-E-S-S, the non-equilibrium steady state, or non-equilibrium stationary state. And so it's like the thing-NESS, that's the thing; the thing we're talking about is the thing-NESS. So we're not saying that the system is at stationarity. We're saying that the thingness, by virtue of how we're modeling it, has stationary characteristics. And that is the thing as we model it, which is that always-existing difference between the map and the territory. We're not saying that the territory is even a thing. We're saying that the way we're relating and interacting with the map, which we totally made, no one's going to deny that, that map's thingness is tractable. I can hold the map in my hands, and that gives me better grip on my road trip. Stephen? Yeah, I think also the thingness can tie into the idea that something ontologically comes into being through relation, through some sort of network dynamic in which it acts. And that actually speaks to this Lorenz attractor. So the Lorenz becomes part of a network, of an ontology of thing, which is really a thing in relation to what some other thing is. Because obviously the scale of the cookie box is quite a lot different to the scale of the atoms in the cookie box, or whatever scale we want. And then there's some sort of affordance level. And maybe we can get out of the whole sentient-system thing, and it's just an affordance for the thingness to maintain thingness. It doesn't need to know about systems as such, unless some creature happens to evolve a brain which starts to sit down working out system diagrams. Two-legged ones, potentially. Otherwise it's just things all the way up, right? Now that we're thirty fun minutes in, let's sprint up to that Markov cliff and then see where we go.
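The "bowl" idea from the roadmap, that a quadratic (Laplace-style) approximation makes the downhill problem convex and therefore tractable, can be sketched with a toy gradient descent. The two objective functions and all constants below are made-up illustrations, not anything from the paper:

```python
import math

# Toy sketch (made-up functions, not the paper's): gradient descent on a
# convex quadratic "bowl" finds the single minimum from any start, while
# the same descent on a rugged landscape gets trapped in a local valley.

def descend(grad, x, lr=0.02, steps=1000):
    # Plain gradient descent: repeatedly step downhill along -grad.
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Bowl: f(x) = (x - 3)**2 has one minimum, at x = 3.
bowl_min = descend(lambda x: 2.0 * (x - 3.0), x=-10.0)

# Rugged: f(x) = x**2 + 4*sin(3*x) has several local minima. Starting
# at x = 4, descent settles into a nearby valley, not the global one
# (which sits near x ~ -0.5).
rugged_min = descend(lambda x: 2.0 * x + 12.0 * math.cos(3.0 * x), x=4.0)

print(bowl_min)    # ~3.0, regardless of the starting point
print(rugged_min)  # stuck in a local valley, away from the global minimum
```

That is the trade being described: if the first map is rugged, replace it with a bowl-shaped second map, and the "just run downhill" strategy becomes reliable.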
So figure one of the paper is making this first fundamental connection for the Lorenz system as defined by dynamical equations. The traditional Lorenz is defined more like the blue version, with equations of motion. It has a positive Lyapunov exponent, which is to say that very closely located points diverge, rather than staying parallel as in a laminar flow, or converging. And that's what makes the Lorenz system chaotic, ergo difficult to predict given measurement error and tiny perturbations due to even thermal noise, which is kind of how hardware random number generators work. And then there's going to be an approximation based upon this stochastic solution. The trajectory in the right panel here is the deterministic solution to the Laplace approximation of Lorenz, based upon the Helmholtz decomposition. So the version on the right is almost like the one usually seen, and the state space model is almost like we're sampling stochastically from points along that racetrack. So that was figure one. Would you just explain that? Just go over that one more time, to clarify that bit. I think I understand what you're saying, but it's kind of an important point you make there, when you say 'deterministic solution'; just to clarify what that means. So the deterministic solution would have no noise, so that the position at the following time is entirely determined by the non-stochastic state that it's in at the preceding moment. Whereas when we talk about a stochastic form, it's like your car's driving and there are thermal vibrations on the road. Now, the Lyapunov exponent on that car at non-zero temperature is small, so the flow term dominates the vibratory term. But there could be some other system where the difference matters. Well, if it's in an absolute-zero world and just on the road, then it's going to go straight.
But once you introduce any wiggling, when the system is chaotic, when it has that positive Lyapunov exponent, then the tiniest wiggle causes the system to diverge. That's really helpful. So basically, normally with the deterministic thing, it's just whatever the initial conditions are; even if they vary slightly, once it runs it will settle into something. But if you've got the initial condition and you've got the perturbations due to noise as it's actually running, that adds, I suppose you could say, a little bit of an extra change in initial conditions to each state that the Lorenz attractor is moving through. So okay, that's helpful. Thanks, Daniel. Yes, and I'm going to just show this image. I don't know if it's the very, very first, but this is the image that Lorenz used to talk about chaos in 1961. I mean, what's the difference between 76.8 and 76.85? It's imperceptible. If your thermometer only had three significant figures, you wouldn't even be able to talk about the difference. How different can it be? And it turns out that those two systems start imperceptibly different, but nearby points in chaotic systems do diverge. And so it's like, okay, but they're still tracking each other, right? Well, no, wrong, later on. And of course, a more chaotic system, with a more positive Lyapunov exponent, diverges faster. But any system with a positive leading Lyapunov exponent is going to have divergence. So if you have a purely deterministic system, or one without this kind of Lyapunov-induced chaos, then you can predict far out. However, prediction experts, as Dean I'm sure will revisit with us too, meet their match with chaotic systems. And that's actually one thing I hope we get to by the end, which is that by putting action in every loop, and inference on action, then whichever of these timelines we end up on, we're going to be doing okay, because we're always readjusting our inference to whichever of these mountains we're on.
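That divergence of imperceptibly different starting points is easy to reproduce. The sketch below uses the standard Lorenz parameters (sigma = 10, rho = 28, beta = 8/3) and a crude Euler integrator; these are illustrative choices, not the paper's code:

```python
import math

# Illustrative sketch: two Lorenz trajectories whose initial conditions
# differ by one part in a million end up macroscopically far apart,
# the signature of a positive Lyapunov exponent.

def lorenz_step(state, dt=0.001):
    # One Euler step of the Lorenz system (sigma=10, rho=28, beta=8/3).
    x, y, z = state
    return (x + dt * 10.0 * (y - x),
            y + dt * (x * (28.0 - z) - y),
            z + dt * (x * y - (8.0 / 3.0) * z))

a = (1.0, 1.0, 1.0)
b = (1.0, 1.0, 1.000001)   # differs by 1e-6 in the third coordinate

max_sep = 0.0
for _ in range(25000):     # integrate out to t = 25
    a, b = lorenz_step(a), lorenz_step(b)
    max_sep = max(max_sep, math.dist(a, b))

print(max_sep)  # grows from 1e-6 to macroscopic size: the runs decohere
```

A single deterministic run is perfectly repeatable; it is the combination of sensitivity with any difference at all, measurement error or noise, that defeats long-range prediction.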
It's not like one of these is better or worse, or more or less challenging; they have the exact same upper and lower bound. The only mistake would be to make a prediction, be super confident about it, and be on the wrong predictive timeline. But as long as we're always updating our predictions, then either of these days is going to be okay. But if we put on the jacket because we thought it was going to be cold here, and that's when it was hottest, then we're injured. But if we just think, okay, we know we have the jacket affordance, so we're going to keep an eye on that as we watch the temperature and continue to update our predictions, then maybe it'll be okay. So that was the initial Lorenz attractor. Section three goes into the Helmholtz decomposition, which we've talked about in several other streams, and we've seen it introduced as the breaking down of a vector field, the total current in this kind of electromagnetic representation, into the irrotational, which is kind of like straight lines, there's no rotation in those straight lines, and the purely rotational, solenoidal current. We've talked about solenoidal and gradient partitioning. And then there's what this paper does; I think the technical details are in appendix B of [16], the Bayesian mechanics paper, and then also there's this housekeeping term, big lambda, in one of the appendices to this paper. So that would be something for someone to help us understand a little bit, because it does matter, and it's actually a key contribution: if the landscape is changing moment to moment, you can't just do the Helmholtz decomposition once and sail on that sea without updating your estimate of the partitioning, of the decomposition.
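The shape of the decomposition can also be sketched numerically. The general form discussed here is a flow f(x) = (Q - Gamma) grad I(x), where I(x) is the self-information (negative log) of the steady-state density; the specific choices below, a standard 2-D Gaussian steady state and scalar strengths for Gamma and Q, are illustrative assumptions, not the paper's construction:

```python
import math

# Illustrative sketch (my parameter choices): a flow built from a
# Helmholtz-style decomposition, f(x) = (Q - Gamma) * grad I(x), where
# I(x) = -ln p*(x) is the self-information of the steady-state density.
# With a standard 2-D Gaussian steady state, grad I(x) = x.
# Gamma (symmetric) gives the dissipative, gradient-descending part;
# Q (antisymmetric) gives the solenoidal, circulating part.

gamma = 0.1   # gradient (dissipative) strength: Gamma = gamma * identity
q = 1.0       # solenoidal strength: Q = [[0, q], [-q, 0]]

def flow(x, y):
    # (Q - Gamma) applied to grad I = (x, y)
    return (q * y - gamma * x, -q * x - gamma * y)

# Integrate the deterministic flow: the state circulates (solenoidal
# part) while spiraling in toward the mode (gradient part).
x, y, dt = 3.0, 0.0, 0.01
radii = []
for step in range(3000):
    fx, fy = flow(x, y)
    x, y = x + dt * fx, y + dt * fy
    if step % 1000 == 999:
        radii.append(math.hypot(x, y))

print(radii)  # shrinking radii: the spiral closes in on the mode
```

Setting gamma to zero would leave pure rotation at constant radius, which is the "still spinning at the bottom of the bowl" picture that comes up in the discussion below: the gradient term vanishes at the mode, but the solenoidal term need not.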
That decomposition: so again, dynamical systems, that's like the position of the car; and then, because of section two, we have an equivalence between that dynamics and the flow modeling. One of the terms that shows up in both of these is the fancy I, the self-information. The first part performs a Riemannian gradient descent on the negative log of the steady-state density, which can be interpreted as self-information: how surprised should one be to be at a given location, relative to expectation? So we talked about how the potential energy of the ball higher up on the bowl is higher, and then the potential function drops; that's the physical analogy to this more informational, statistical-distribution approach to the self-information. And then in 3.1, that self-information is going to be pursued and expanded upon for a relatively simpler case. So we're going to identify, thanks to the Helmholtz decomposition, this key variable, which can be interpreted as a potential function corresponding to the self-surprise, and that self-surprise term may have functional forms; so we can have a function that describes how surprised we should be. Stephen? Yeah, if you could go to that last slide, the previous one there. I like this, it's quite useful. So the Fokker-Planck: if it's at equilibrium, which we're not at, because we're doing non-equilibrium steady state, but if it were, then it would be zero; basically it's at equilibrium, so it's not changing, so there's no divergence, because it's zero. And then he's taking this Helmholtz decomposition, which in a way is like a vector, isn't it?
With a vector, you can split something's movement into X and Y. So the gradient is coming out, and the solenoidal is at right angles to the gradient. It's kind of like that trick that he does with accuracy and complexity: he can say that there's a solenoidal and a gradient part which go together, when it's still, in a kind of particular angular-momentum sense. And then there's that ability, with the housekeeping term, to then move it into the flow, which I'm not entirely sure about; I think that's the bit where we might need some clarification, because 'flow' could probably be more specifically thought about. So now you've got this ability to structure things through the more particular context of the solenoidal and gradient, and then there's this transition to a more general variable, which can be just... we don't need the particulars now, because it's a variational metric that just gets used in an approximation sense, and then it follows into what you were saying. Would I be on the right track with that?
I'm not sure who 'he' is, because it's a multi-person paper; but just looking at these as circles and triangles and squares, we can see that the upside-down triangle fancy I of x, the gradient of the self-information, is in both terms. So there's a Q multiplied by something, and then a gamma multiplied by something, and we see that omega is defined as the Q minus the gamma. So these two terms, which were decomposed out from each other, recombine into the final flow. It's just like if it were A times x minus B times x, and then omega were defined as A minus B, giving omega times x: we're recomposing a flow equation that still features this fancy I. And yes, we're solving that for a specific point that matters; we want to solve that because it falls at the bottom of the bowl. So there's a function that's going to tell us, more or less, in multiple ways, whether we're more or less far from the bottom of the bowl. And again, it's not about the system being at stationarity; it's about this decomposition having a steady state. Just one quick thing: at the bottom of the bowl, the solenoidal could still have a value, because it can still be spinning round at the bottom of the bowl. But then I notice the gradient term isn't in the flow, but the solenoidal term is? The gradient is, because Q minus gamma is in omega; this omega times fancy I is just these two terms combined. I was just going to say, typically when this is brought forward, it's easier to understand the decomposition and the recombining if we view the blanket as a partition; where it gets harder is when the blanket is literally an envelopment. And there's a physicist by the name of Keith Humphries who talks about emergence, and he looks at that piece of it as well, because there's a polar aspect to this. We can think of that in terms of positive and negative fields, magnetization and stuff like that, and how that is also a decomposition and a recombining that's presented in this paper. That's the partition way of looking at it, more or less, but there's
also the positive and negative polar piece to this when we look at the Markov blanket as a blanket as opposed to a separator and again if we're holding up two things at once this doesn't change that it's just making sure that we hold both things up at once and the divider I just want to bring that out there because again we can see it easier as a partition like the skin as a partition we don't typically see the skin as a platform for sunscreen that's my point thanks Dean Steven this is a good point in terms of you can imagine at first I was thinking about it's obvious that you've got a gradient that you can pass through maybe where the gradient zero but also solenoidal flow I suppose if things are moving around this way and it starts to turn and it starts to go the other way at some point the solenoidal flow is also passing through a zero so they can both pass through a zero point it just wasn't as intuitive when I first thought about it yeah and even if you have something that's spinning around like something that's spinning in a circle like a hammer thrower it's actually continual acceleration that doesn't mean it's speeding up it means that the vector is continually changing so now our map is of that acceleration and it is unchanging in how it changes so that is to say that the system can have in a different way of thinking about it that's a solenoidal component we're not saying that balls at the bottom of bowls are dead and unmoving we're saying in our map there's only one point because we set it up to be like a bowl map where the derivative is zero that's the solution where the rate of change is zero so that's not to say that the system isn't moving that's to say that the flow steady state has been achieved in figure four well in figure three we see the three states so this is like the cover of the GEB book like kind of three different projections from a three-dimensional system the second versus the third the first versus the third and the first versus the 
second. We're seeing how sampled trajectories move, and how they keep coming back to something bowl-like for any set of two dimensions. So it's like a three-dimensional bowl now, because for any two of these dimensions it's more like a regular bowl, the kind soup goes in; but the soup bowl is a downward projection of a system that has an attractor in a higher dimension, potentially a much higher dimension. With Lorenz they're only modeling three, because the particle only moves in three dimensions.

When we continue to investigate the bowl-like properties of our Laplace approximation to that Lorenz system, we see there are a few different shapes of bowls, depending on which angle we shine the flashlight from. Some are spherical: a spherical bowl doesn't have any specific correlation between the two dimensions; it's like a cloud of points, and the regression through it is just flat. However, there are some couplings where the bowl is non-random. It's almost as if we were tracking the movement of a ball getting jostled around in these bowls. For the first and third states, if that were the light we were shining, it would just look like a scatter plot: the regression is flat, and not much information is being provided. It's within the bowl, so it already has some nice dynamics, it's within that attractor zone, but it's a scatter with no pattern. Clearly, though, for the first and second states we see that there is a coupling. Now imagine we didn't see the bowl, and of course we don't; it's not there, it's part of our map. We'd just be seeing the movement of that ball, and it would stay on this manifold, spending more time going from the top left to the bottom right in these images.

So we're going to take that idea and connect it to the Jacobian and the Hessian, which is a throwback to our matrix math slide, and an awesome opportunity for anyone who wants to bring a more technical interpretation to the stream. The unnamed, zeroth matrix is just the matrix of position, like the first moment of the distribution, and the higher moments are like the partial derivatives. The Jacobian is the matrix of first-order partial derivatives; that's like the slope of the ruler at a point. The Hessian is the matrix of second partial derivatives; that's like the curvature. If it's a bowl, there's only one point where it's flat, and the bowl could open upward or be upside down, so it's important to also have this second, curvature term, to know whether we're at a bowl like this or a bowl like that. So that's the calculus, and those bowl-like attractor shapes are what we're inferring using the Laplace quadratic assumption. We're not saying that the Lorenz system has these, possesses these, is these; we're saying that's the approximation we're going to make.

It turns out that although there's a somewhat unclear correlation pattern in the first derivatives (movement to the left isn't obviously correlated with movement up), the second partial derivatives, the Hessian matrix, start to look a lot like the covariance matrix: the actual positional covariance of these states with each other. The on-diagonal entries are obvious; of course your position on the x-axis correlates with your position on the x-axis. But there are also correlations between states one and two, and those are recapitulated in the structure of the second-order polynomial approximation to these attractors, the Laplace approximation. And again, up until that point, in section 3.5 and in figure 6, Markov has not been introduced. Nothing has been said; that word hasn't come into play; it didn't come into the system's definition. That's really important, because it shows how much groundwork there is to get to where we make the
Markovian partitioning, and so there's power and challenge in that. Steven?

Would I be correct in saying that, effectively, it's a case of approximating the states of the Lorenz attractor? So these are different ways of showing that, within some manifold state space, approaching it in different ways reveals a different type of approximation. The only thing I would ask, and maybe I'm missing something: I'm a little bit confused about why it squishes. I can see why there'd be a round one; I'm just trying to work out how it squishes. Is it something to do with the solenoidal piece or the gradient piece, or is there something else?

If we were talking about a particle undergoing Brownian diffusion in three dimensions, with no correlations and no structure, just a random walk in three dimensions, then it would have a spherical error profile in all of these images. But we are capturing structure that exists because the Lorenz attractor has a specific specification. The states are whichever ones are being modeled: if you care about pressure and volume, that's your state space, the two-dimensional state space of the map. That's not saying those are the only states of the system, just that those are the ones you're modeling. In the Lorenz system we have three states, its position in three dimensions. But those three dimensions also apply to the Brownian diffusing particle, which has a different Jacobian, covariance, and Hessian. We're talking about these being calculated for the Lorenz system with the parameters the authors selected. So we're pulling out covariances and connecting them to this second-order polynomial approximation in the Hessian, for a system that is not just Brownian diffusion. That's why there's structure in the covariance, and why that structure is interestingly recapitulated in the Hessian. Dean?

Yeah, and all I would supplement that with is this, here we go: things that are in something that's moving, where the thing that's moving is itself in something that's moving, have a tendency to separate. The negative things all congregate in one place, down in the lower right corner, and the positive things all congregate in another place, in the upper left corner. That's the nature of movement, and so the polar part is in play as well, if we consider it as part of these statistical transitions. And again, it's fascinating: if you look at it as a minimum of two things, not just as a partition but also as an envelope, those things suddenly pop out at you, and you see the pattern in a different way.

So I want to think about a physical system that's not chaotic but will help us understand what this covariance matrix is. Let's think about a children's carousel. The horsey is moving around in a circle, like our unit circle with its four quadrants, and the horsey is also going up and down. So if you're at twelve o'clock on the carousel, you have a one for y and a zero for x; hope that makes sense. As it goes around, it's 0.5 and 0.5, just using simple numbers here: the x and the y move together, so one could say there's a relationship between the x and the y, that at any moment each tells you something about the other. But the z-axis is moving up and down independently. In a system like that, you'd see a dependence between the x and y positions of the horsey, while the z-axis, not that it's a different system, is uncorrelated: x and y don't provide mutual information about the z-axis, which is oscillating on its own. Now, if this paper had said "we model a deterministic carousel," it would be like putting the rabbit in the hat, because the covariance matrix looking like this would be no surprise; it would have been defined that way. So that's the key insight, and one of the key contributions: they didn't put
this in the hat. This is the OG complex system, and yet the Laplace approximation pulls out a covariance structure that's extremely tractable. And it turns out the approximation still has a very strongly positive and very similar Lyapunov exponent. So now it's like a chaotic carousel, and yet our approximation, our map of that chaotic carousel, still retains the chaotic nature of the system while also giving these bowl-like attractors for any given two, and even all, dimensions or states considered simultaneously. So we're not conflating the simplicity of a non-equilibrium steady state with the complexity of the underlying density dynamics. Just because the map is simple doesn't mean the territory is. To me, that's a knockout blow against realism and for instrumentalism, because this paper says nothing about the underlying system; active inference doesn't say anything about the underlying system. We're talking about approximations, our maps of territories. In other words, the probability density can evolve in a complicated fashion even with simple attractors. Go ahead, Stephen.

And from what you're saying, it's like if I were to take that Lorenz attractor, which is like a saddle, people often talk about a saddle, and take the most agreeable way of viewing it: you view it in a box space this way and look at how it's moving that way, then flip the box and see how it's moving that way, and flip the box again and see how it's working that way. Those approximations would be the three different distributions. However, if you'd rotated it by one degree, that would be messy. It's almost like the Lorenz attractor, taking the most optimal view of X, Y, and Z, can be mapped out as it moves through the saddle, if that makes sense, to bring it into a physical space. But again, it's more complex than that, because this is taking it where the Fokker-Planck is zero, to try to get the most optimal view; that's why those things follow a nice, stable transition, whereas if you take an odd angle it might all be a complete mess. Would that be a fair way of saying it?

If you were to take a projection down that was a mixture of other axes, I'm not sure what is gained by that, but I see what you're saying. There's some particle moving in three dimensions, and you can have a projection onto any side of that cube, or you could take some off-kilter view that would still be some type of linear combination of all three, so it hasn't really reduced the challenge of estimating. But yes, that's one way to see it. Dean?

So I'm in complete agreement: it has to be instrumental so that you can compare to the real. The minimum of two: you have to have multiple maps, not one, in order to see this. And that reinforces the idea that the Markov blanket has to be, at the same time, both a partition and an envelope. It's both at once, a minimum of two again. So all we keep reinforcing is this idea that if we over-reduce, if we get to a place of under two, we may be deceiving ourselves. And I think we'll come back to that soon.

So now, one hour in, we're able to get to the Markov blankets. Okay. Previously we were talking about the case of a single Markov blanket. If we thought about a Bayesian graph, we'd now be thinking about three nodes, each corresponding to the X, Y, and Z axes. Nodes A and B, the first two axes, would have an edge between them because they have some type of statistical relationship, and the third node would be disconnected because it doesn't have a statistical relationship with them. Now, to actually start moving toward the communicating systems, or the system in the niche, that we're interested in, which is the target we're aiming at, in section four the authors repeat the analysis of the previous section but approximate two Lorenz systems that are coupled to each other through their respective
first states. So we're going to look at this matrix. Notice that there are two blocks; let me make that transparent. Here's one Lorenz system, and here's the other: the red one and the blue one. Then, in these other cells, are the coupling terms. This is saying that the first state of each system is coupled, a bi-directional coupling. Row and column four of the matrix is the first state of the blue system, so the first state of the blue one, the top left of the blue and the top left of the red, are coupled through the off-diagonal, green-boxed terms. If these were all zeros, nine zeros here and nine zeros here, we would have two independently evolving Lorenz systems. So we took the Lorenz system, the three-by-three matrix that specified its dynamics, and made a six-by-six matrix with two Lorenz systems that would evolve statistically independently unless we introduce this targeted coupling. And the authors say this form of coupling was chosen to be as simple and symmetric as possible. This is the next step, but it's a difference that makes a difference, because it changes the total dynamics. If you had just the red and just the blue, you could split that matrix and get exactly the same model, because the systems wouldn't be interacting. But now we've introduced these terms, and even one of them would be enough that you can no longer divide this matrix into sub-matrices that perfectly describe the time evolution of the system.

Figure seven is like figure three: we're taking pinpoint locations on the Laplace approximation of the Lorenz system. Figure seven does the same, but for these two synchronized systems, or maybe it's better to say two coupled systems. Entangled, yes. And due to the structure of their coupling, it results in an entangled movement pattern, such that their trajectories are informative of each other. That's also reflected by this flow diagram, where their movements clearly exist on a manifold, in a restricted space of the possible. Two people undergoing Brownian diffusion in two separate rooms: you'd get that circular scatter plot. Two people who are tied to each other: you'd see a near-perfect correlation. And here, even though the system has complex and even chaotic endogenous dynamics, we're seeing that the introduction of this coupling term lets the systems empirically have entangled behavior. Steven?

We've now got Lorenz systems, as opposed to an attractor, and I'm curious: when does it go from attractor to system? Does the coupling effectively start to create a more systemic nature? Just curious about that.

The Lorenz system is just these three equations; this is the specification of a Lorenz system, and it has a chaotic attractor. So now we're talking about two of them: two coupled Lorenz systems. It doesn't make sense to talk about two coupled attractors, because that's not what's coupled; the systems are coupled, as per the matrix specification. The whole point is that there is now an attractor by virtue of the coupling of the Lorenz systems.

Is it almost like, once you start to relate things, as soon as things relate, there's a system, because there's a relational quality to it, whereas previously it could be a mathematical formalism?

It was still described as a Lorenz system, so I wouldn't get too hung up on that word, but yes: systems have multiple interacting parts. Okay, figure eight. Just as figure seven revisited the format of figure three, figure eight is going to revisit figure four, and now we're going to get to where we do the partitioning. So remember the first partial derivative and the second
partial derivative, the Jacobian and the Hessian respectively, and then the covariance. We're going to do the same thing we did on a three-by-three, but now on a six-by-six. The Jacobian looks a lot like the non-zero cells in the specification of the system, which, for systems we get to specify, it would. But again, for territories, we don't get to specify; the map looking like this does not say something specific about the territory. It's not super surprising that we see similar structure: this top right has no correlation, one and three are not correlated, and we see that little chip missing here and here. So we see a lot of the same dynamics at play within each of the coupled Lorenz systems, but we've also introduced a coupling, and that coupling induces a new and different covariance structure. Look at these gray squares in the off-diagonals: now states one and two of system A are coupled to states one and two of system B, less so, a little grayer, than within each system, but still very marked; directional synchronies are being induced.

The interesting piece comes when we look at the second-order approximation, and that's where several cells can be identified. Here we have the magenta (please, Stephen, thank you) and the red: those are the first states of systems A and B, the ones with the coupling. Then you have the other two states of system A in teal and the other two states of system B in dark blue. So it's almost like we have four kinds of states out of these six. Of course you could say each state is unique. Great, totally true: they're different states, which is why they're different dimensions in our model. But interesting things fall out as a function of separating the two systems, and then separating out the parts of each system that interface internally and externally from the parts of the system that, by virtue of how we specified it, do not directly couple out. Again, the fascinating thing is that state two of system A and state two of system B have a covariance, maybe like a mirror neuron, to use a more realistic mapping. There's some internal part that is only interacting with other internal parts, by how the system is defined, and yet these two parts, which are not directly interacting by virtue of how we set the system up, have covariance.

That is what leads us to the numerical analysis, which provides somewhat stronger and different empirical evidence. It's not just an analytical formulation; it's a numerical simulation, where we can see that when we actually let the system play out, even when states start out very correlated (we start the third and sixth states in very similar positions, so they begin with a high partial correlation coefficient), running the simulation through time, it trends toward zero partial correlation. So we're recapitulating the correlation patterns of the system even when it starts in a more correlated place. And that suggests that some very essential aspects of the system are being isolated and tractably handled by this kind of partitioning of coupled systems. So it's pretty interesting. Now we've seen this top...

Yes, just real quick: what it also says is that the hyperbolic is also existing in this pattern, although it's not necessarily materialized in these diagrams. What this explains is that there can be continuity, there can be alignment, and there can be hyperbolic function, all at the same time. Again, I just want to point to the non-obvious; it's still there.

Great. So we've seen this top... Steven, yes?

I was also going to add that by varying the nature of the coupling, as you showed, you can reach a different non-equilibrium
steady state, in some ways. Would that be true? A different pattern emerges. Maybe I'm overextending this, but changing the nature of how things couple could, in a way, change the sort of states things can fall into; you could almost have multiple states being inferred based on how the coupling is set up.

Oh yes. We would throw out any model that treated two people tied to each other as simply moving the same way as two people in different rooms. It is the coupling of systems that influences their behavior. And again, we're not talking about the causal structure of the world, the territory; we're talking about the models that we got to specify, and how changing the coupling pattern changes their behavior. Dean?

And Steven, when you take the abstraction here off the page: you could set up your programming, for example, to say "I'm willing to sponsor somebody because I want to mentor them in bricklaying." You can start with that, or you can start with "there are two people interested in bricklaying." But the first thing we have to address is whether there's a potential for positive entanglement in real terms. That's how you flip the abstraction into the likelihood of those entanglements actually extending, because the people will actually be relatable to one another. I bring that up because that's how we turn the abstraction into the "so what does that look like." Okay, Steven, something quick on this, otherwise we'll get to figure 10.

Yeah. So that speaks to this: we often think about the generative model on the inside, the external states, and the model trying to understand what the external states are doing. But this speaks to the ability to change the way the blanket is coupling. How much work is going on in terms of variational free energy, information being used in the generative models, and how much is being done by actually changing the coupling dynamics in the blanket? It's even feasible that that in itself will start to yield more and more ways for the generative model to extract and make inferences about what's going on externally. So that also helps, I think, as a heuristic thought pump.

Great. But actually, the blanket is downstream of the coupling pattern. Changing the coupling of a blanket is a little bit of a cart before the horse, because there is the coupling of the system, and that is what induces the partitionings, which we can then map to Markovian assumptions. So when you put on headphones and you're listening: yes, a different blanket is in play, because the partitioning is different, because the coupling is different between the states in the world. Dean, on this, and then let's go to figure 10.

Yeah, real quick. So, like Peirce's, Charles Sanders Peirce's, abduction doesn't guarantee that when you walk in later and see beans in one hand and a bag of beans in the other, that you'll get to a... okay, I just lost Dean, so I'm going to continue. He was talking about abduction and C.S. Peirce, that kind of flipping. And I understand; you're right, it is the cart ahead of the horse, and sometimes that helps. Not saying it's you, and I know you're agreeing with me, but all I'm saying is we shouldn't just assume it only goes one way, horse first and cart second.

Yes, and in fact, imagining different partitionings is exactly how we are going to apply "what is" and "what if" for active inference. "What is" asks: what are the blanket states, given a coupling pattern? Well, there's an answer: sensory states are the ones with incoming statistical dependencies, and active states are the ones with outgoing statistical dependencies; those are the blanket states. "What if" asks: if the coupling were different, then what would the blanket be? So it's not that the first blanket you identified is being changed; what's being changed is the coupling, which induces a different kind of covariance amongst the states.
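To make "changing the coupling changes the covariance amongst the states" concrete, here is a minimal numerical sketch of two Lorenz systems coupled through their first states. The diffusive coupling form, step size, run length, and initial conditions here are illustrative assumptions, not lifted from the paper:

```python
import numpy as np

def coupled_lorenz_step(s, c, dt=0.005, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    # One Euler step for two Lorenz systems whose first states are
    # diffusively coupled with strength c; c = 0 gives independent systems.
    xa, ya, za, xb, yb, zb = s
    return s + dt * np.array([
        sigma * (ya - xa) + c * (xb - xa),  # coupling enters state 1 of A...
        xa * (rho - za) - ya,
        xa * ya - beta * za,
        sigma * (yb - xb) + c * (xa - xb),  # ...and state 1 of B, symmetrically
        xb * (rho - zb) - yb,
        xb * yb - beta * zb,
    ])

def first_state_correlation(c, n_steps=100_000, burn=10_000):
    # Simulate, then correlate state 1 of system A with state 1 of system B.
    s = np.array([1.0, 1.0, 25.0, -8.0, 7.0, 20.0])
    xs = np.empty((n_steps, 2))
    for t in range(n_steps):
        s = coupled_lorenz_step(s, c)
        xs[t] = s[0], s[3]
    return np.corrcoef(xs[burn:].T)[0, 1]

print(first_state_correlation(0.0))  # uncoupled: correlation near zero
print(first_state_correlation(8.0))  # coupled: strongly entangled trajectories
```

Zeroing one of the two `c * (...)` terms would give a one-way coupling instead of the symmetric one: a different coupling pattern, and hence a different induced covariance and a different blanket.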
But it can almost be thought of in a cart-before-the-horse way. So, just to recap this nomenclature, which we've seen before: we have external states, sensory states, active states, and internal states, and B is the set of blanket states. This is one of the innovations of Friston et al., beyond Pearl 1988, beyond Markov even earlier: blanket states, which even outside the FEP and active inference world are the set of states upon which internal and external states are conditionally independent, are here further partitioned into incoming and outgoing statistical dependencies, sense and action. Then we have autonomous states, alpha, the set of active states plus internal states. Autonomous states are the ones that, for systems we design, we get to control: we get to control our internal states and we get to control our actions, but we don't get to directly control the sensory input. You can close your eyes so it's darker, or move your head so things look different, and maybe you'll even be right about what sensory information comes in then, but you can't just say "I want different sensory information coming in." And then there are the particular states: the whole particle, that is, the blanket states (sense and action) plus the internal states. The particular states are like the whole particle floating around in the room, or the whole cell, to give physical interpretations, and the autonomous states are the things the cell, or the system designer, can control. So: autonomous states are internal plus active; blanket states are active plus sensory; particular states are all three on the right side. And all of these are defined as a partitioning, a cleaving apart from external states, as modeled: of the map, not of the territory. Steven?

And as we mentioned, this paper gives a way for this to emerge from the very most basic principles. Then, as you described, the blanket states and the autonomous states: in any sort of system, organism, non-equilibrium steady state, it's nested; there will be whole nested levels of Markov blankets. When we emerge at higher levels, it's not as if, unless we made our blanket, there would be no blanket; there are going to be blankets all the way up and down. So as a blanket starts to form, or be brought into play, the nature of how those blankets come into play starts to draw in the idea of coupling. The fact that we're creating something from first principles doesn't mean that, in the real world, without the first principles, there can't be a blanket; in other cases it's going to be emergent through more coupling between blankets. I notice we're talking about the blanket at this one scale, but it's also good to talk about blankets at multiple scales.

Okay, yeah. So within internal states, again, these are statistical dependencies, so there can be other things happening; it's like a black box within each one. For people who expect to see the mechanics of the system, that's a disappointment, but for people who want to model real systems, including machines, it's a huge advantage, because it lets us keep the map-territory distinction totally apparent, which is the real transparency we need from our machine learning models. There's not going to be a transparent explanation for how 500 variables interact, but we can transparently see how the map and the territory are distinguished. Now, that's where we get to their definition of Markov blankets. I'll let others read through and work through the math, critique it, and build on it, but we're talking about boundaries based upon sub-matrices of the total state space
of the system. Here it's just the two coupled systems we've been discussing: we were studying systems A and B, each a Lorenz system; we added the coupling between states one and four, and that's where we saw a new pattern arising. Why does a particular partition comprise four sets of states? Is it because tetrahedra are the minimal polyhedra in our world? In other words, why does a particular partition consider two Markov boundaries? The reason is that the particular partition (again: particular states, blanket, internal) is the minimal partition that allows for directed coupling with blanket states: sensory states can influence internal states (incoming dependencies) and active states can influence external states (outgoing dependencies), without destroying the conditional independencies of the particular partition, as shown in the upper panel of figure 10. So it's very cool how we went from one system with chaotic dynamics, made an approximation, and showed that it had certain well-behaved dynamics; then took the next step up, coupling systems in a way that we designed, which gave a new manifold within which the systems acted as if entangled. And here's where we get that very fun claim: there is no claim that either the original Lorenz system or the coupled Lorenz system possesses a Markov blanket. We're not describing the territory; I don't think we're even describing the first level of the map. The claim is that there exists a Laplace approximation to these kinds of systems that, in virtue of the zero elements of the Hessian, features Markov blankets. This is one reason why, while acknowledging that there are many perspectives and so much to learn, we want to be precise with our language. It's easy enough to say "the system has a Markov blanket," or "there is a Markov blanket in the system," or all the other ways prepositions and nouns might be combined, and those might be truly, incredibly misleading. So it's a word of warning: as we figure out exactly what the technical details are, and continue to clarify them, it will matter which natural language words we use to describe this. It really does matter. Dean?

So let me add to that, Daniel. On an abstracted level, we can frame the problem as "in or out," and when we translate that into the physical space, we know when something is entangled or when something is different. But what this allows us, on the abstracted level, is to sample the Markov blanket as "in and out." It gives us the potential for a second way of looking at how those two things work together: the Markov blanket as both a divider and an enveloper. Do you agree? On an abstracted level it's very important to be precise, but if we include this, we have to accept that, on an abstracted level, "in and out" must be part of our repertoire, not just "in or out," because one is the more precise "what is" active inference, while the other allows for the "what if" active inference, the "and." Would you agree?
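Dean's "in and out" reading can be made concrete with a small sketch: given a directed graph of statistical couplings and a choice of internal and external states, the sensory states are the blanket states with incoming dependencies from outside, and the active states are those with outgoing dependencies to the outside. The function, node names, and toy graph below are hypothetical, purely for illustration:

```python
def classify_blanket(edges, internal, external):
    # edges: set of (source, target) pairs meaning "source influences target".
    nodes = {n for edge in edges for n in edge}
    blanket = nodes - set(internal) - set(external)
    # "In": blanket states that external states couple into -> sensory.
    sensory = {b for b in blanket if any((e, b) in edges for e in external)}
    # "Out": blanket states that couple out to external states -> active.
    active = {b for b in blanket if any((b, e) in edges for e in external)}
    return sensory, active

# Toy particle: external -> sense -> internal -> action -> external.
edges = {("eta", "s"), ("s", "mu"), ("mu", "a"), ("a", "eta")}
print(classify_blanket(edges, internal={"mu"}, external={"eta"}))  # ({'s'}, {'a'})
```

A blanket state with only an outgoing edge to the outside would come back as active but not sensory, the one-way case, so the same function accommodates both the "and" and the "or" readings of the blanket.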
I think I agree. I'm just imagining that it could be possible to design a system where there was one coupling going out but there was no coupling back in, so one could devise edge case matrices that have various kinds of properties. Would those even be good maps of systems that we care about? Probably not. So for some of the basic attributes, like of communication, of course if one person is not wearing their headphones, like, yeah, the systems are coupled, one person is talking and the other person can hear them but not return any information. So it's like, that's interesting, it's a type of coupling that systems can have, and it has a different blanket because of that. But especially for systems where we want to think about states that are influencing the environment and vice versa, whether it's communication or stigmergy or coupling or collective behavior, that whole class of systems, it's extremely important to have of course in and out and more and less and different. I wouldn't drop, or, sorry Stephen, I've just got to get this in, I wouldn't drop, or, I'm actually suggesting that this tells us we must also contemplate "and". So that's like saying which is more important, the positive ions or the negative ions? They're going to congregate and we can tell the difference, but we shouldn't apply a value all the time. There are times when we can just see the difference and accept that for what it is, as opposed to always jumping to the idea that something has to be more or less, that the positive ions are more important than the negative ones. Maybe they are, but we shouldn't jump to that right away. Done, I'm done. Okay, closing thought on this, Stephen, and then we're going to get to the free energy principle. Yeah, just to mention a little bit to what Dean said there: if something is a charged particle, how much is it the system of the molecule showing the distribution of charge, and how much is it the quality of the thingness of that thing? I don't think there's a definite transition. I'm glad you went back to this diagram, this is the one I just want to mention. The arrow is there on the right, so it shows that the sensory states are related to the same external states as the action states are affecting, so it's almost like there's a regime of attention. So it's not that the sensory states are reading something coming in which is completely unrelated, you know, it's not like I'm acting in front of me but actually my eyes are in the back of my head, right? So for my sensory states there's an alignment. And then when I look to the left, this coupling that's going on, in the way that it's set up, it's saying it's not a Markov blanket; when it's saying it's a coupled system it's sort of saying, in a way, that in this slightly more nascent version there's still a spookiness about it, I suppose we might say, it doesn't drop out as far as a Markov blanket, there's that kind of resignation going on. And I suppose somewhere between left and right you start to meet the conditions of a Markov blanket, and I suppose that's an interesting point. I don't want to make the water murky with too much terminology; we haven't brought attention in, it's not related to what we are discussing here, but I see what you mean: yes, if your actions were moving something up or down but you were observing the left-right, that's a different kind of system than if you're observing and acting on the up-down movement of something. That's not a regime of attention, but I know what you mean. Let's get to the FEP. Okay, finally, remember that edges reflect statistical relationships, which is the other side of the coin of saying that the absence of an edge reflects a conditional independence, so one can provide a definition of the conditional density over external states, so probability density. That's why it was so important to go
from dynamics to densities in section 2. And we're going to think about that external density, probability density, conditional statistical density, as being parameterized by the conditional expectations of internal states given blanket states. Here's the formalism. See how the coupled systems have a flow where, like, given where you were, even if it were a little noisier, as long as it looked like an oval, you'd know that if you were low on the left you were going to be low on the up-and-down, and if you were further to the right you'd be higher up on the y-axis. But imagine if the flow was like in a circle: you'd say, okay, we're at x equals zero, well you really don't know, because you could be low, you could be high. Or imagine if it was just like a scatter cloud, no matter where you were, it was everywhere in the box, you wouldn't really know. But because there's a manifold, this admits the possibility of a diffeomorphic, which means like sort of a function-like and stretchable, map between the sufficient statistics of the respective densities. Not the territory, not the first map, I don't even know if we're at the second map anymore; we're talking about sufficient statistics of densities of approximations. Okay, I think it's a death blow for realism personally, but I would love to hear from somebody who thinks it's not. The existence of this mapping rests upon a continuously differentiable and invertible map, which is linear under the Laplace approximation. There's a few technical pieces there, but it's almost like we set up the system in a way that was defined only by its coupling, but then we also got to choose our approximation approach, and it turns out that with a system with coupling dynamics interesting enough to befuddle almost every other kind of model out there, the approximation has a really nice relationship that recapitulates aspects of the system and lets us map, like if we're low on here we can go to low on here, and low on here to low on here; like, it's a function that can go both ways. And so that is going to be the relationship, the sigma mapping between the internal and the external states, which was explored more in the Bayesian mechanics paper, number 26. This means the autonomous flow can be expressed as a gradient flow on a free energy functional of the variational density. Here are those four states, four kinds of states, external, sense, action, internal, and those are going to be expressed as a tuple; it has to be, because it has to be all four of those states co-evolving. And those have a gradient flow, gamma times the gradient, the upside-down triangle, nabla, plus the solenoidal flow. So remember way back when, when we saw that we had Q for solenoidal, gamma for gradients, and then lambda for housekeeping? So now that is coming back into play. The solenoidal is still a single term on the potential function, that self-surprisal, one solenoidal term, then we have a gradient over these four states, and then the lambda housekeeping. That can be rewritten, and now pi, again, let the ontology develop so that we can resolve some of this, but notice how particular states pi are the blanket and the internal. Okay, but how have we usually used pi? Policy, action selection. So this is a different pi. The free energy functional on particular states, particular states let's call them, equals the self-surprisal of the particular states plus the divergence term. That is the form that then lends the ability to interpret that free energy functional on particular states, to be rewritten in several ways: including energy minus entropy, that's a very classical chemical way to talk about free energy; to be phrased as the divergence between the q distribution on external states and the p distribution on external states conditioned on particular states, plus the self-surprisal of those particular states; or, the one that's closest to the Bayesian information criterion in the Bayesian modeling world, but still being shown as equivalent, is the accuracy minus
model complexity, and the evidence lower bound. So that's where we've seen these formalisms come out many, many times, but we're approaching it from a very different way, a lot more bottom-up, and I think, but again would be open to people correcting, in a way that fully embraces the map-territory distinction and makes no bones about it. Whereas if you just pop in from the top with these formalisms, it might seem like they're describing aspects of the system, but I hope that the way we've approached this over the last like six hours of livestreams, literally, and the way that the authors wrote it so carefully, makes it clear we're talking about these formalisms on probably not even the first map, and maybe the second map or summary statistics thereof. These are relationships that are intrinsic to our approach of map-making, because we did the Laplacian approximation, etc. etc. etc. Stephen and then Dean. Yeah, this is very helpful. I might just ask we go and walk through the very first sentence again in a second, just because I think it'd be helpful to reiterate. But from what you're saying, and I hear this really clearly, is the particular states, and the thingness of maintaining particular states potentially, or of whatever those particular states persist, effectively then is the same, at a higher order level over time and space, as a policy. Let's not bring policy into the picture yet, we just haven't gotten there. But I won't jump the gun on that; I suppose the point is just to say that we can still be in the realm of thingness, we can still be in the realm of thingness because we're talking about particular states of things, so we haven't gone, and I won't say, beyond that. And then maybe just to go through the ontology, not just of that first sentence, just to get it clearer again: of autonomous flow, expected internal state, could you just recap that, just unpack that a little bit more so that it's a little bit clearer? Okay, so the autonomous flow. Autonomous states include actions; we're
not even bringing policy selection in, so it's the right direction we want to go. We want to be talking about internal states that are actually doing planning as inference, that affect their action states, but for now we're just treating the action states as outgoing statistical dependencies, and we're pausing on how the actions are selected, because in the Lorenz system there's no policy selection, but there are external states that are modified by states: those are the active states. So it's almost like we can talk about action without policy inference. So the autonomous states, the autonomous flow at the expected internal state, so these are expectations of autonomous states, action outgoing, and internal states, can be expressed as a gradient flow. So now it's like, rather than have a six-state system, we're going to rewrite it as four. Remember we pulled out four kinds of nodes, and now we're going to express a flow not in a six-dimensional space, where each dimension was a measurement that we were making, but express a flow over the four kinds of states. And that means that even if there was a bunch of internal nodes that were communicating with each other, we could still use this four-fold breakdown, hashtag William Blake, hashtag Bucky, hashtag Friston, no matter how many specifics were inside the internal, or how many blanket states there were, like how many nodes in the graph, or how many external states, because now it's like a flow over kinds of nodes. And because the nodes are defined by their directionality of relationship, like how they influence each other, right, like internal and external we know don't directly influence each other, they're conditionally independent based upon blanket states, etc., that allows us to write formalisms that are like evergreen and can be expressed as flow. Dean?
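To make the gradient-plus-solenoidal split just described concrete, here is a minimal two-dimensional sketch. The quadratic surprisal "bowl" and the particular Gamma and Q operators are hand-picked assumptions, and the housekeeping (lambda) term is dropped; the point is only that the solenoidal part circulates at right angles to the gradient while the gradient part descends the bowl.

```python
import numpy as np

# Flow f(x) = -(Gamma + Q) @ grad I(x), with surprisal I(x) = 0.5*||x||^2.
gamma = 0.5 * np.eye(2)                    # dissipative (gradient) operator
Q = np.array([[0.0, 1.0], [-1.0, 0.0]])    # antisymmetric (solenoidal) operator

def surprisal_grad(x):
    return x                               # gradient of I(x) = 0.5*||x||^2

def flow(x):
    return -(gamma + Q) @ surprisal_grad(x)

x = np.array([2.0, 0.0])
g = surprisal_grad(x)
f_grad = -gamma @ g        # descends the surprisal bowl
f_sol = -Q @ g             # circulates on a level set of the surprisal
print(np.dot(f_sol, g))    # solenoidal part is orthogonal to the gradient: 0.0

# Integrating the full flow, the trajectory spirals in toward the bottom.
for _ in range(2000):
    x = x + 0.01 * flow(x)
print(0.5 * x @ x)         # surprisal has decayed toward zero
```

The orthogonality is exact for any antisymmetric Q, which is why the solenoidal term can stir the dynamics without changing how much surprisal is being dissipated.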
Yeah, I would just say that when I first encountered these kinds of formalisms I always asked, so, and then what happened? And that's when I coined the phrase, when in doubt, zoom in, zoom out. But I didn't want that zooming in and zooming out to be collapsed just to the optical zoom; I assumed that your legs wouldn't become paralyzed when you picked up a telescope or a microscope, that you would still zoom in and zoom out by moving, right? So if the inference is in doubt, how do you overcome that? When you're active, but not just with your eyes, you actually pick yourself up and move yourself around a little bit more. You don't just have to be on the north end of the carousel; sometimes you can pick up a ladder and look down on it as well, and that will give you a whole new set of data, or more complementary data, a different collection. And so that's why I think it's really, really good to see that and understand what it implies: it basically says, if you're not sure, act some more. How can I disagree with acting first? Let's just look at this slide at the end, Stephen. So this functional, the flow over the four kinds of states that is using this blanket partitioning paradigm, can be expressed in several forms. Expected energy, big E is expectation, expected energy minus the entropy of the variational distribution, with the q variational distribution being the one that we get to control, q of mu, the q parameterized by internal states. Which is equivalent to the self-information, fancy I, plus the KL divergence between the variational and conditional density, that's this KL divergence term, which is always greater than zero. So self-surprisal is always zero or greater, you can't have negative surprise at all, and then the divergence is always going to be positive as well. This can be decomposed into accuracy and complexity, so that's this third framing, and in that setting negative free energy becomes the evidence lower bound, or ELBO, which will be familiar to many people with machine learning backgrounds.
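The three framings just listed, energy minus entropy, surprisal plus divergence, and complexity minus accuracy, can be checked numerically on a toy Gaussian model. This is a sketch with an assumed one-dimensional model (prior x ~ N(0,1), likelihood o|x ~ N(x,1), variational density q = N(m, s2)), not anything taken from the paper itself.

```python
import numpy as np

def vfe_terms(o, m, s2):
    """Three equivalent forms of variational free energy for a toy model:
    prior x ~ N(0,1), likelihood o|x ~ N(x,1), q = N(m, s2)."""
    # 1) Expected energy minus entropy
    energy = (0.5 * np.log(2 * np.pi) + 0.5 * ((o - m) ** 2 + s2)   # E_q[-log p(o|x)]
              + 0.5 * np.log(2 * np.pi) + 0.5 * (m ** 2 + s2))      # E_q[-log p(x)]
    entropy = 0.5 * np.log(2 * np.pi * np.e * s2)
    f_energy = energy - entropy
    # 2) Self-information plus KL to the true posterior N(o/2, 1/2)
    log_evidence = -0.5 * np.log(2 * np.pi * 2) - o ** 2 / 4        # o ~ N(0, 2)
    kl_post = 0.5 * (np.log(0.5 / s2) + (s2 + (m - o / 2) ** 2) / 0.5 - 1)
    f_bound = -log_evidence + kl_post
    # 3) Complexity minus accuracy (negative ELBO)
    complexity = 0.5 * (np.log(1 / s2) + s2 + m ** 2 - 1)           # KL(q || prior)
    accuracy = -(0.5 * np.log(2 * np.pi) + 0.5 * ((o - m) ** 2 + s2))
    f_elbo = complexity - accuracy
    return f_energy, f_bound, f_elbo

f1, f2, f3 = vfe_terms(o=1.5, m=0.6, s2=0.4)
print(f1, f2, f3)  # all three framings agree
```

Because the divergence term in the second form is non-negative, f1 is always an upper bound on the negative log evidence, which is exactly the ELBO reading mentioned above.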
Okay, amazing. So those are the three different formalisms and how they fell out. This is the basis of the free energy principle. Wait, what's the free energy principle? Put simply, it means that the expected internal states of a particular partition at non-equilibrium steady state can be cast as encoding conditional or Bayesian beliefs about external states. That's the free energy principle in this paper, not the first, not the last way it'll be defined, and not the only one, but that's how they discuss it here. And basically we moved from early on, like tying together the density side and the dynamics side, and now we have this whole universe of formalisms that connect free-energy-looking formalisms, like chemistry, with information theory formalisms. So thermochemistry, with energy and entropy; thermo-information, with divergence and self-information; and information and inference, with accuracy minus complexity as a modeling heuristic. So that's big, free energy is big. And then they immediately go to talk about physiological perspectives, and how that surprisal function that constitutes the potential is log model evidence. So if you're getting good evidence for your physiological model, you're spending a lot of time in your preferred and expected physiological states, and that can be seen as simply a statement of homeostasis, where outgoing statistical dependencies, i.e. active states, maintain interoceptive signals, i.e.
incoming states, sensory states, within characteristic physiological ranges. You minimize your surprise about that, until you don't, and you dissipate. Stephen? Amazing. This really ties into the active inference paradigm shift, because traditionally, representationalism for instance, like you were saying, it does really throw that out in a way. Because if I had 100 different cameras all taking a photo, normally it's going to make it 100 times more blurry in that sense. However, in this sense, 100 sources in this noisy dynamical setting gives me 100 ways to integrate free energy reduction, and even the noise of me moving, times 100 cameras, gives me more and more informational, thermodynamic ways in. That just shifts very much the way we might think about how to know and how to act in the world. Cool, let's continue on, and we'll probably do a few minutes after the hour to get to these last few points. So we've been talking about different ways that we can talk about the free energy functional of the particular states. 5.1 is where this idea of the internal states as a generative model, and self-evidencing, or like the minimization of self-surprisal, is really leaned into. So the alternative and deflationary perspective rests on noting that the free energy gradients are also the gradients of self-information. It's like the free energy itself is a wind that's difficult to latch on to; the free energy bowl, the one where you'd be doing strictly as well as you could if you were at the bottom of it, we're not on that bowl, but the one that we can be keenly aware of is our self-surprisal. So if the self-surprisal gradients are aligned with the free energy gradients, then reducing our uncertainty about sensory states, incoming, helps us on the way towards minimizing the free energy gradients, reducing that difference. The organism wants to be minimally surprised about expected and preferred observations, and those are the perceptions, or measurements more technically. It
wants to be minimally surprised about expected states, and then preferences are going to play a key role when we talk about active systems that do resist dissipation. This is where the moth-to-the-flame comes in, and I love this plot arc from 2006: Ice Queen Friston with the wings, the snowflake, and then, you know, things are heating up for ActInf, others are being drawn like a chaotic moth to the flame, they come close and it burns, and then they fly away. So one could imagine these kinds of systems to be like stochastic chaos. Previously the Lorenz system was like maybe a particle in a chaotic weather system, so that was like the weather systems that Lorenz was studying, but now we're going to be thinking about particular states, like a particular moth that's able to not just have outgoing statistical dependencies, but to do policy selection. So this is where we actually get adaptive active inference. And again, you know, if there's two kinds of moths, some of them are just particles, and then they get burnt when they get burnt, and they don't when they don't, some will survive and some won't, but the ones that do enact effective policy selection are going to be persistent even a day later. So then, the Lepidoptera have been around for millions of years, so is it really any surprise that the moths have adaptive behavioral systems? But when the niche changes, like with electric nighttime lighting, that doesn't entail that it's going to instantly be the adaptive system for that new niche. We can think about the motion of a moth attracted towards a flame but constantly thwarted by turbulent or solenoidal air currents, because active states influence sensory states and possibly external states. This would look as if the particle, the moth, was trying to attain its most likely state in the face of random fluctuations and solenoidal dynamics. So it's like those arrows, the irrotational component, are converging the moth to the flame, like
overshoot, and the windy room is blowing it around. And so again, there's the moth in the room, and we can look at it in the x, y and the z, and if the candle were in the center of the room it would be like a bowl that the moth were converging towards, for any two dimensions that you looked at, and indeed all three dimensions when considered together. And the movement of that particle, the dynamics of that particle, would be related to something like the flow of the air in that room, with a pullback attractor. This perspective emphasizes the active part of self-evidencing, sometimes referred to as active inference. So we were calculating the potential, the self-surprisal, or the outgoing dependencies for particles, before we talked about policy selection, and that is showing mere active inference, as it's called. Then, when you think about the real world, where only the anti-dissipative systems persist, and the successful ones even more so, that's where policy selection comes into play. And that's actually where this paper ends; they basically just bring it to that point and say all systems are going to be able to be framed within this way, again not the territories, all of the approximations that we make will have these attributes. And then we can imagine that there are some systems where the internal states are not just acting as if they're doing conditional expectations on external states, but then they take those conditional expectations, like about the weather, and then there's another node for policy selection, like the other pi, policy selection, in the models we've seen, and then that pi can be, like, put on the jacket or not; those are the two affordances, or the two states in the affordance variable. And then in a dissipative world, you're going to see adaptive agents as the ones that are able to recurrently adapt to changing and chaotic circumstances, to repeatedly come back to that candle. And so, just to summarize, then we'll just, if there's any implications we want to touch, but
we'll summarize: they constructed a Laplacian system that's quadratic, bowl-like, in its potential function, with state-dependent flow operators, by the way, that includes the housekeeping function, that evinces, so provides evidence for, stochastic chaos, because it has a Lyapunov exponent that was very similar to the ones that were calculated from the non-Laplace assumptions or approximations. The way that the polynomial approximation played out, with the first and second order approximations, gave a sparsity which can be interpreted as conditional independencies, and a partitioning into four kinds of states. Coupled systems have a conditional synchronization map; in this example it had a linear form, they were on that manifold that was like y equals x, and this is important because in these Laplacian coupled systems the conditional synchronization map is necessarily diffeomorphic. Again, it would be awesome if somebody wanted to come on and help us see what the technical implications of the diffeomorphism are, but at the simplest level it's like, it doesn't bend back on itself, so it's a good function. So that means for every sensory state there's a unique expected internal state and conditional density over external states. It'd be like, yeah, when the thermometer says 20, it could be 20 or 50, that's not a very helpful thermometer. Now if you say the thermometer always reads half of the temperature, or it's always double, or it's a little noisy, those are all workable; but if the thermometer could just totally be deceiving you, and it just is a wacky world with what the thermometer says, that's not going to be a very adaptive system. The resulting formulation can be read as Bayesian mechanics, in which internal states on average parameterize beliefs, of a Bayesian sort, about external states. Just like these equations, when applied to particles and statistical distributions, were statistical mechanics, we're taking the full Bayesian insight and reading these equivalencies and
formalisms as a Bayesian mechanics. So that's the summary of 5.2, and then there's a few implications that they get into, but we'll first have any questions, so Dean and then Stephen. Well, I just think it's fascinating that it doesn't matter whether we're talking moths or people, we're kind of blissfully unaware of the statistical manifold floating around prior to us being pulled back to that plane and either becoming the hero moth or the fool moth. We don't discover that until afterwards, but if we could be aware of the kind of statistical distribution in play prior to us being pulled, it might help. I don't think it determines much, but it certainly enlightens, as opposed to being lit up. Yeah, and the moth example, I think, if it really reaches the flame it dies, so maybe that's not even the final perfect example for how attractors work; that'd be like a bowl with a hole at the bottom. Again, it's just a heuristic, it helps us think about the kinds of systems that we're talking about, but realism interpretation aside, even that example doesn't capture the whole thing, because it's a metaphor, because it's like map of territory, so it's totally chill. Stephen? Yeah, this "evinces" stochastic chaos, which wasn't a word I'd heard of before, to be honest, is this idea of quality. So that's really useful, how it gives a sense of how there can be both chaos in the attractor and the noise maintained, at least at some level, throughout, because in most cases the noise is just averaged away in other scenarios. So this piece where we're tapping into that quality is really interesting, and then that tying into the Bayesian mechanics all the way down is very, very useful, especially when, I think, the Bayesian mechanics as I mentioned before is stacked. It's not saying this is where everything is different; it's not, we're going to give better evidence which might make us think we're going to show a more accurate
Lorenz attractor solution, no; it's, we can stack up and add these fuzzy qualities together and we'll still have a fuzzy quality, but maybe we'll have one which is a little bit more practical, and that's the game in town here, practicality. So I like that. Also, it does average out noise: the internal states on average, the expectation, parameterize beliefs, and so that's going to be a major challenge, to go from these expectation and mean-field approximations without going into ergodicity; that is literally the whole challenge. The Laplacian is like a quadratic that's fit on another distribution, which we're not even caring to learn the form of. And so we could say that the expectation, the single-number expectation, the expected height of a child in that class is five feet, but maybe it's all four and six feet; if we fit a Laplacian on that, it's going to be at five, looking like a quadratic. So the map and the territory are not the same, and it is a significant challenge across domains to connect some of the approximations, which are double-edged swords, like let's approximate over a bunch of possible futures, or let's approximate this way; there's no free lunch. So the internal states on average parameterize beliefs, but does any given internal state parameterize beliefs? Evolution kind of helps us out by saying, well, on average it's going to get better and better, that's like the fundamental theorem of natural selection, FTNS, and then there's the ones that don't do as well. So evolution in a way helps us get out of this, but the formalism doesn't have that, as we saw it right now. That's a really good point. And so it also does tie into the, it gives a way to sort of, the word average, to average or approximate these faster perturbations, yet it still maintains perturbations which could be used at the next nested level or in other ways. So there's an interesting thing about, yeah, it's reducing the overwhelming noise, yet it's also not doing what traditional
statistical averaging does, which is basically crash everything down to, you know, a fixed point; it gives this ability for it still to be alive in a way. Absolutely, like if we just took the expectation, the single average point, of the Lorenz attractor itself, it would just be one point in the middle of the saddle. Okay, but we have something that still has a dynamical nature, with a tractable approximation, and the dynamics have a positive Lyapunov exponent. So we didn't squash the system down to a point or a racetrack, we kept some of the dynamical and even chaotic aspects of the system, while still being able to talk about things on average; but we're not talking about the average position of the particle. Yes, Stephen, quickly on that. Yeah, I think that's a good point, and also just one thing that caught my mind with that butterfly, with the moth and flame, is it talks about a perspective that's being taken, and this is probably useful when people are thinking in applied active inference: this is the perspective that we're having, and it's possibly more, use the word perspective, as we see a picture of a moth; if there was no picture there we might say a description of the scenario. So there's this perspective on the moth, yet the moth isn't necessarily in this scenario; we are taking a perspective, and it's doing its mothness. Yep, it's definitely moths gonna moth, although the picture was not in the paper, I added that one, but the wing snowflake was indeed in the 2006 paper. So let's just recap a few of the domains of implication and then we'll close, because, like, what a fun paper and discussion. So one implication, but also sort of a note of caution, is that just because we took the OG complex system, Lorenz, and coupled it, so two chess moves beyond some sort of simple scenario, it doesn't mean that a synchronization map exists, let alone a good one, for any given system. But one of the implications of this paper was that now we know that for a non-zero subset of dynamical systems, a non-zero subset
of chaotic dynamical systems, and a non-zero subset of dynamical chaotic coupled systems, we can get this well-behaved approximation map. So not to say all, but we know that the answer is now not none, so that's a key implication. Another implication was to really start to bring in the variational free energy in terms of generalized synchrony; that's what opens up to the Da Costa discrete state-space synthesis, active inference, Bayesian surprise, optimal Bayesian design, intrinsic motivation, Infomax, risk-sensitive policies, KL control, Occam's principle, decision theory, utility theory, max ent, and constructal theory. Like, this is not all in this paper; a lot more has to be done to get to some of these places, especially in such a reasoned bottom-up way. Another implication of the paper is that they took the Helmholtz decomposition in a new way, and it might provide a generic model for random dynamical systems. Importantly, because we connected it so closely to Bayesian generative modeling, it could be possible to do less of just, like, let's do descriptive analytics on flow, and more like, let's have a map that we can run both ways, where we're updating the parameters of our flow map with what we're observing, but then we can also generate data sets, we can generate counterfactuals. And so that is what allows sparse data sets to play into ActInf and the FEP, and ties us more closely to modern machine learning, like variational autoencoders. Another implication, or next step, would be: we were looking at a three-dimensional system, the x, y and z of the Lorenz attractor, but how could we approach high-dimensional dynamical systems? So, attractors in 50 dimensions, that's a little bit different and hard to solve. So they suggest that a way to approach this, probably, hint hint, cough cough, wink wink, etc., is to learn the state-dependent flow operator and the NESS using deep neural networks. So it's kind of like you have a neural network now that recognizes bowls, and so we're still keeping some of the same things that we
had an intuitive interpretation of in lower dimensions. It's just like, you go, okay, I know about the cube, and then someone says, well, there's like an 11-dimensional cube. Okay, I don't really see exactly what that means, but I kind of see what is being discussed. How are we going to compute on the 11-dimensional cube, though? Well, neural networks. And that is actually very close to neural stochastic differential equations, and I think this is super exciting. One could, ergo it hasn't been done, probably, hint hint, use a separate feed-forward neural network to parameterize different components of the flow operator and self-information. So in other words, we're taking what we talked about here, and we can think about combining it with the types of stochastic differential equations, neural stochastic differential equations, and Bayesian neural networks, all of these advances in machine learning, and now we're actually coming back to the table with new architectures for deep learning. Because this would be a deep learning model, it would have a deep neural network, you would benefit from having a GPU, but it's an architecture that was inspired by the discussions that we had here. And then just the final implications, and then we'll have a little closing. The flows across the boundary: also notice a little bit of the blanket starting to potentially be inched away from, talking more about a Markov boundary, and connecting the FEP to the Constructal Law and the work of Professor Adrian Bejan, who honestly is a lot like Friston in some ways, and a really gracious person and communicator as well, with a long career of working towards this kind of a unifying flow theory, to get towards evolution of design in nature, and as the image shows, from floodplains to lung alveoli, that similar kind of a unifying model. And then a final set of implications is that we take covariances all the time from time series; that's like how we do crypto trading, that's how we do weather prediction,
that's all kinds of systems: we take covariance matrices on time series everywhere. And so, potentially, because the Hessian is recapitulating elements of the covariance matrix, it could be possible to take an empirical data set, empirically compute the covariance, and then learn something about the Hessian, the curvature of the underlying flow states. In other words, to use the Laplace approximation even if the system is stochastic, chaotic, and dynamic. That empirical covariance could identify conditional independencies, where there's a zero, and so in principle you could furnish a description of the expected flow, and use the Lyapunov exponents to establish whether the system was chaotic or not, because we found that the Hessian approximation gave us a Lyapunov estimate that was close to the actual Lyapunov exponent in the Lorenz system. So we could go from empirical covariances, which people compute literally every day (cov, open parenthesis, in R), to estimated Hessians, to estimates of things like the Lyapunov exponents. And that's why people often study chaotic systems in super-toy models like the double pendulum and the Lorenz system: it turns out it's shockingly hard to find chaotic systems in nature outside of those toy examples, because the real noise in the system swamps your signal. And so, in conclusion, having a simple functional form for the flow of random dynamical systems may be useful for modeling and analyzing time series generated by real-world processes that are far from equilibrium and may or may not be chaotic. And so that's the closing question: human flow, physical flow, supply chains, information flow, narrative flow, semantic flow, ants bringing seeds in and taking the midden out. All of these flows, we have data sets for them, and now we have a new kind of model that we can use to model those flows. So we'll just chill and talk for a few more minutes, but what a great discussion. Thanks a lot, Dean and Stephen, for participating in 32. So, Stephen first. Deep breath. There were a couple
of things there that seem to have a big impact. Because as well as Markovian monism, the idea that active inference brings in this partitioning, and the active inference field, which can then also have relationships to deep learning and to how predictive processing is thought about, what's being said here is that there's also evidencing: the idea of evidence, and of stochastic chaos itself being a paradigmatic part of how evidence is arrived at, which in some ways is not dependent purely on Markovian monism being the overarching game in town. That's quite an interesting additional way for deep learning to be impacted, whether we call this a field of practice or a domain; I don't know what we call it. It doesn't mean it's all Markovian monism: in terms of the ability to infer, you've also got this other ability to get information, and from what I'm saying, it almost does as much by inferring whether there is chaotic presence or not. Going and seeing whether chaos is present may give an idea of whether correlations are present; that's jumping the gun, maybe, but there may be other ways of interrogating statistical dependencies in time series analysis that can sit alongside the Markovian monism. How that's connected to Markovian monism is definitely a good question, but yes, these are general tools for complex systems analysis. So I hope people don't read this paper as, like, Friston apologetics or some sort of fringe development in active inference only. It's pretty clear from the nouns that we used that this is actually a fundamental contribution towards analyzing dynamical, stochastic, chaotic systems. However, just because, for example, your approximation has a low Lyapunov exponent, that is not a statement about the system, so we will always be able to halt right there: it's not that the system isn't chaotic; it's that your approximation doesn't have a chaotic leading Lyapunov exponent, so
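The covariance-to-Hessian route discussed above can be sketched in a toy setting. Below, assuming a linear (Ornstein-Uhlenbeck) version of the Helmholtz-style flow with a known Gaussian steady state, the precision matrix (the Hessian of the self-information) contains a zero that encodes a conditional independence; simulating the process and inverting the empirical covariance should approximately recover that zero. The matrices, step sizes, and sample counts are made-up illustrations, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Known Hessian (precision) of the steady-state density. The zero at
# Pi[0, 2] encodes a conditional independence between states 0 and 2.
Pi = np.array([[2.0, 0.6, 0.0],
               [0.6, 2.0, 0.5],
               [0.0, 0.5, 2.0]])
Gamma = 0.5 * np.eye(3)             # dissipative (gradient) part
Q = np.array([[ 0.0, 0.3, 0.0],     # solenoidal (antisymmetric) part
              [-0.3, 0.0, 0.2],
              [ 0.0,-0.2, 0.0]])

def flow(x):
    # Helmholtz-style drift: descend the gradient of the self-information
    # (Pi @ x for a Gaussian) with a dissipative plus solenoidal operator.
    return -(Gamma - Q) @ (Pi @ x)

# Euler-Maruyama simulation; noise covariance 2*Gamma makes inv(Pi)
# the stationary covariance of the continuous-time process.
dt, steps = 1e-2, 200_000
L = np.linalg.cholesky(2 * Gamma)
x = np.zeros(3)
samples = np.empty((steps, 3))
for t in range(steps):
    x = x + flow(x) * dt + (L @ rng.normal(size=3)) * np.sqrt(dt)
    samples[t] = x

emp_cov = np.cov(samples[steps // 10:].T)  # drop burn-in, then covariance
est_Pi = np.linalg.inv(emp_cov)            # estimated Hessian
print(np.round(est_Pi, 2))                 # roughly Pi; [0, 2] entry near 0
```

The same estimated Hessian could then feed further analysis, such as the Lyapunov-exponent estimates mentioned in the discussion, though that step is not shown here.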
say what you want to say about your map. Now, if your map isn't chaotic and you're doing a great job predicting, explaining, controlling, designing, creating: wonderful. I hope the business is successful and that you have friendship and health and all those things, but that's not a claim about the territory. And so there are going to be so many fun ways to talk and explore that. Dean, what are your closing thoughts? Just real quick: I think what the covariation piece (which is basically statistics, an abstraction to sort of explain things) points out is that if we can understand the world as being both within, enveloped, and between, partitioned, at the same time, that's a good fundamental place to start. And I completely agree with your last statement. We're not counting angels on pins and then just leaving it there; we're actually hoping at some point to be able to turn this into something useful and functional. But if you can't start from a place of seeing things at once, within and between, you're going to struggle, you're going to tend to collapse prematurely, and you're not going to get very creative with active inference. You'll be able to define it and describe it, but I'm not sure how much new information you'll be able to derive with it unless you can hold both at once: the within and the between. Great. There's so much more that we all have to learn: the appendices, the formalisms we didn't go into, the citations, other perspectives. It's just the beginning, and that's what .2 is about: like, how far we've come, when we can go back to our roadmap and be like, wait, dynamical systems and densities, and now we're 300 miles past there. But where have we gotten? It's just really exciting, and I'm glad that we're doing it in a participatory way, because there's a lot to it. So, Dean and Stephen, thank you again. Everybody watching live and in replay, hope that you get involved and stay involved with Actinflab. See you later, bye.