Hello and welcome to the Active Inference Lab. This is ActInf Livestream #32.1, Stochastic Chaos and Markov Blankets. Welcome to the Active Inference Lab. We are a participatory online lab that is learning, communicating, and practicing applied active inference. You can find us at some of the links here on this slide. This is a recorded and archived livestream, so please provide us with feedback so that we can improve on our work. All backgrounds and perspectives are welcome here, and we'll be following good video etiquette for livestreams. At this short link, you can see some of the past, present, and upcoming livestreams. We are here in early November 2021 with Livestream #32, discussing Stochastic Chaos and Markov Blankets. In the second half of November, we'll be talking about the paper Thinking Like a State, and we haven't yet set the papers for 34 and 35. So if you have any ideas, or if you're an author or you wanna come on to discuss for these dates, then get in touch with us. Today in ActInf Livestream 32.1, the goal is to learn and discuss this extremely interesting paper, Stochastic Chaos and Markov Blankets, by Friston, Heins, Ueltzhöffer, Da Costa, and Parr from 2021. And there's a lot to cover in this paper: a lot of formalisms and figures and keywords and broader questions. So I'm sure we'll have a lot to think about and discuss. If you're watching live, then please feel free to write a question in the chat, because otherwise Stephen and I are just chilling here. So let's get into it, with just intros and warmups. And if others join, we'll have them introduce themselves when they do join. But for now, let's just say hello and maybe bring up something that excited us about coming to this discussion or something that we wanted to resolve today. So I'm Daniel, I'm a researcher in California. And something that I liked or remembered about the paper was that there was seemingly some changes or updates to the formalism of core terms.
And so how to understand that in the broader trajectory of the literature: which of these changes are like updates? Where is it a software update, and where is it a breaking change? And then what are the implications of those changes for some of the broader conclusions that people do or want to draw from active inference and the FEP, like the nature of thingness, or biological systems, or levels of nested organization? So, Stephen. Yes. Good morning from Toronto. I'm Stephen, I'm doing a practice-based PhD, which is kind of moving to a methods-based PhD. I work a lot with group processes and with creative spatial approaches to help do sense-making. So I'm really interested in active inference within that, as a kind of unifying way to help close a lot of the loops and the fragmented questions that come up when trying to do sense-making with different types of people in different types of spaces. So yeah, as with Daniel, I'm really interested in this paper after the initial view of it, because you see stochastic chaos, and I'm consciously trying not to go too far down sort of abstract academic aspects of things. But I realized that actually these underlying processes have got quite a lot of applications in this ability to understand why and how things are unifying. And I think that's really exciting, because there's a lot of sort of semi-Markovian looks at larger cognitive processes, and there's certain practices and enactivist processes. But what we tend to see when it gets down into the nitty-gritty is: how are things bubbling up through the kind of stochastic chaos of the world and life, or should I just say stochastic separated chaotic processes. So the idea of having stochastic chaos, and what that means, and this paper does that, is quite useful to start to look at what it means to be able to reconcile those approaches. So it actually does start to help think about more applied approaches.
So that's something I'm interested in and excited about with this paper. Thank you, Stephen. In the 32.0, the lead-in was this topic of flow, and that is in a way what might start to bridge this distinction that you raised, and that we will definitely return to, between kind of abstractions and applications. Flow is a term that has applications across domains, from the flow of water, flow of people, flow of information and money, to psychological flow. It's kind of a nomadic term, a transdisciplinary term. And that's an abstraction. We're abstracting to the patterns of flow that help us connect the dots across these different systems, but we also wanna apply it. We don't just wanna say, well, water is flow and information is flow, and that's the end of the story. So how abstractions and applications work like the left and the right foot together to build our knowledge and our application, almost like our epistemic and our pragmatic value, I think that'll be a thread that's important to pull out. What do you think about that? Yeah, that's a really helpful connection. And I think this also speaks to the usefulness of this work beyond modeling directly: when this idea of flow is talked about in these ways, it's kind of a bit vague, right? It still has this idea of things flowing in a certain direction, and are you in the flow? And so I think when we start to also have the potentials for stochastic processes, and in a way the sense that everything's flow, then what do low and high requirements mean? What do low and high competencies or preferences mean? These are quite bucket terms, right? And essentially they are what they are because they are workable, truth be told. But we can start to look for something which can hold those in a more unified way.
Because I think that the way we might feel in the flow, when we're in a zone where we can both do things and we're immersed in it, may well be what we're in all the time anyway, but it's just a different form of it. So that also changes the kind of reifications out there. Great point, and we can actually connect that to this sort of view from the outside and view from the inside, which we talk about a lot in ActInf. So it's almost like there's the skier going down the hill, and the view from the outside, the behavioral view, is like: given the angle of the hill, they're optimally flowing. That river is physically optimally flowing down that hill, as you'd expect from the friction and the potential energy and just the physics of the setup. And then there's this internal experience of flow. So it's kind of like maybe we are in flow all the time, even when we are getting, for example, distracted or bored or frustrated. In what ways is that still flow? Because if we think about flow, well, flow is good. It's a state where we're productive and happy and things like that, and then we're out of flow. That's what you'd hear in the psychological literature. But let's just say that the river is flowing and flooding. We don't say, well, we're done using flow models because now it doesn't do what I thought it was going to do. Flow is the whole space. So how do we think about this whole psychological space as being particular regimes of flow? And flow is the big topic that connects stress, fear of failing, overwhelm, concern, excitation, etc. Those are all flow states. Maybe it's not the flow state that we expect, maybe not the one that we prefer, but a river can have flow states that we don't expect or prefer either. And so it's like when we do abstract to the physical systems, we get a whole quantitative toolkit.
And we also get the ability to maybe look beyond phrasings that see flow as just something to be obtained, and rather see flow as the nature of perhaps interior experience. And then we can talk about designing or perturbing regimes of flow. Yeah, go ahead. Yeah, that's a good point, because from the outside, there's the outside system, like, say, seeing someone skiing down a hill, and you could say there's different ways of doing that. And then from the inside, you've got essentially a nonlinear dynamical organism that's kind of trying to thrive and survive. It's different; there's no getting away from that. And the nature of that difference, that's another question beyond this. But the interesting thing on that graph as well is that often the flow is only shown on the top right and not on the bottom left, even though it could be seen to flow down. So it's something like a quadrant, and it's just the top right which is seen as the flow. But what's also interesting is it kind of depends again on these multi-scale dynamics. Okay, so within certain scales and temporal scales, things start to fit our organism's preferred states, some of which is kind of built into our morphology. So if something becomes so easy, for instance: it's a lot easier if I'm trying to protect a building just to stand in front of the front door as a security guard and nothing happens all day. That may end up in some sort of flow, but there's a point at which there's a desire for some stimulation, a desire for some change. And there are also maybe changes that happen, and this sort of ties in with Adam Safron's work about windows of experience, both in real time and over time: when is it in the same sort of time scales that we consciously function in, and when is something beyond that.
So that could be where stress and boredom, or stress and burnout, can be both because the nature of the activity is too rapid, it's too disjointed for the morphology, or it may be that we perceive it that way because we have some depression or anxiety that's also not enabling us to engage in those ways. What are your thoughts? Great, let's take that point to one of the formalisms of the paper. So you just mentioned how there are multiple time scales, and there might be slower time scales as well as faster ones, which might even be faster than our rate of perception or modeling, which are one and the same if perception is indeed a generative model. And this is very early in the paper, in section two, from dynamics to densities. So they write: the aim of this section is to get from the specification of a random dynamical system in terms of its equations of motion, so, for example, the differential equations that define the time evolution of a dynamical system, to the probability density over its states in the long-term future from any initial conditions. So going from dynamical systems, which takes us into the temporal realm, to the density realm. And so there we think about ocean water of different physical densities. There's a motion to densities, because denser things go down. And that idea of denser things going down, or flow from high to low density regions, just like in the atmosphere, just like in the ocean, is going to be connected to the behavior of random dynamical systems and to information geometry. So that's a little bit of the next piece, but the two time scales are here. Here's a classic flow model. There's a flow and then there's the fast fluctuations. So it's like the river is moving ahead at, you know, one kilometer per hour. But each molecule in that river is not always going at exactly that speed. There's a fast fluctuation, which might average out to zero.
So in the bulk aggregate, or in the time aggregate, the flow dominates the behavior of the river. But the fast fluctuations are also occurring, and they can often be equivalent to a well-behaved, i.e. Gaussian, error term. So this is where we see the separation of time scales into sort of the flow of the river and then the vibration of the molecules. And those have two different partitionings of the system. The flow is well-behaved because it is just like a single number for the whole river in this simple example. And the fast fluctuations are well-behaved because they're smaller. If they were larger than the flow, they would become the flow itself. So they're smaller than the flow, and they might be uncorrelated or have a zero mean, which allows us to do Gaussian statistics on them. And then this idea of separation into slow and fast time scales is connected to density dynamics through the Fokker-Planck equation. Yeah, Stephen. Yeah, just a little question on that. So this time dynamic: flow is both at a time scale of interest, I suppose you could say, and potentially one which has a net kind of change in some sort of directional kind of gradient descent, I suppose. And then the fast fluctuations, it doesn't say random, but it's probably more stochastic, but could also have flows going on. So is this... How much difference in rate do they need to be, to be classed as slow and fast? Good question. So the x with a dot is kind of like the derivative of x. It's the rate of change in x. The flow in x is going to be decomposed into these two terms. Now, we might be interested in a very special solution to this equation. So again, we have x dot, kind of the change in x, which is being broken down into these two dynamical variables, and then that's being made equivalent to the Fokker-Planck, which is more of a density framing.
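To make this concrete, here is a minimal sketch, not from the paper, of integrating a flow plus fast fluctuations with Euler-Maruyama. The drift (an Ornstein-Uhlenbeck pull toward zero) and all parameter values are illustrative choices: the point is just that a slow deterministic flow term plus zero-mean Gaussian noise settles into a stationary density.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ou(x0=5.0, theta=1.0, sigma=0.5, dt=1e-3, steps=200_000):
    """Euler-Maruyama for dx = -theta * x dt + sigma dW: a slow
    deterministic flow plus fast, zero-mean Gaussian fluctuations."""
    noise = sigma * np.sqrt(dt) * rng.standard_normal(steps)
    x = np.empty(steps)
    x[0] = x0
    for t in range(1, steps):
        flow = -theta * x[t - 1]          # the slow flow term
        x[t] = x[t - 1] + flow * dt + noise[t]
    return x

x = simulate_ou()
tail = x[len(x) // 2:]                    # discard the initial transient
# In the long run the samples settle into the stationary density: for this
# process, a Gaussian with mean 0 and variance sigma^2 / (2 * theta) = 0.125.
print(tail.mean(), tail.var())
```

Viewing the same trajectory through its long-run histogram is exactly the density framing: the dynamical specification and the stationary density carry the same information, which is the equivalence the Fokker-Planck equation formalizes.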
So that's the key introductory step in the paper: to build this really strong equivalence, AKA they're the same picture, between the dynamical specification and the density specification. So if we could solve when the dynamics were at a steady point, that would be equivalent to when the density would be at a steady point. There might be some methods to help solve it in one phrasing or the other phrasing. It turns out that there's a way to solve with a little bit more feasibility when p-dot of x, the time derivative of the density, equals zero. And so that's what they're discussing here: basically asking when does that equal zero, and that is a stationary point. It doesn't mean the reservoir is simply flat; you know, there's a pump bringing water up to the reservoir at 10 gallons per minute, and then the reservoir is dumping at 10 gallons per minute. That is going to be a point where there's a stationarity in the overall densities of those two bodies of water, just like if it were two bodies of heat. And so we're solving for a special place, which is where the total flow rate is zero. Could I just ask a question on that, without going too deep? So in a way, having this relationship to the flow gives you a sense, I suppose you could say, it's like when the gradient of the graph is zero, you know, the top of a graph between two types of descents, and it's got that flat gradient. It gives you a chance to choose moments which could have a particular set of properties without having to calculate everything about it. And when that happens, what happens then? Is it that you then measure the density, and just keep measuring lots of density points when you find the flow points to be at a state of equilibrium, or is there something else? So, very good question. Taking samples and using that to say something about the landscape is something we are going to get to.
Welcome, Dean. So we will talk about how samples can be empirically taken to say things and do inference on the flow density. But you made a good point which is that we want to figure out something about a special point. Let's just say that we knew that we were solving for a quadratic equation, which is going to come into play later. So it is going to look like a bowl. Now, it'd be awesome to know every single point on the whole bowl. But it turns out we might only be interested or mostly interested in one point. The bottom of the bowl. And for a quadratic function, the bottom of the bowl is the only time that the derivative is zero. So we can have this big, nasty quadratic equation and then take the derivative of it, the first derivative, and then just ask when does the first derivative, the slope of the curve, equal zero. And we know that for a quadratic equation, there's only one point where that happens. And either it's like a bowl or it's like an upside down bowl, but because we constrained the discussion to a specific family of equations, which is going to be important for variational inference, we can then solve some very reduced equations like the derivative, which is of a lower order of computational complexity, and then just get the one point that we actually wanted, which is like the location of the bottom of the bowl. And then from the bottom of the bowl and some of the coefficient estimates that we have, we actually do get the whole bowl. So this is kind of like we're taking some big family of equations and then we're going to be looking for special points, like especially where the flow is at equilibrium or stationarity, and then fitting a family of equations with a form that we know we can compute with to approximate a flow that has a different underlying dynamic, and that is going to come into play where they show very early, you can take a Lorenz attractor, which has some equations that govern its motion. 
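That bowl logic can be shown in a few lines. This is a purely illustrative sketch (the coefficients are made up, not from the paper): for a quadratic potential, setting the first derivative to zero locates the bottom of the bowl, and the curvature there recovers the whole bowl, which is exactly why quadratic (Gaussian) approximation families are so convenient.

```python
# For a quadratic potential U(x) = a*x^2 + b*x + c with a > 0, the density
# p(x) proportional to exp(-U(x)) is Gaussian. Solving U'(x) = 2*a*x + b = 0
# gives the single stationary point, the bottom of the bowl.
a, b, c = 2.0, -8.0, 3.0

x_star = -b / (2 * a)          # bottom of the bowl: U'(x*) = 0
precision = 2 * a              # curvature U''(x*) fixes the Gaussian width
mean, var = x_star, 1.0 / precision

# One special point plus the curvature recovers the whole bowl:
print(mean, var)               # 2.0 0.25
```

Solving the reduced equation U'(x) = 0 is much cheaper than mapping the full landscape, yet together with the coefficients it determines the entire density.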
So again, those can be moved from the equations of motion to the density dynamics. And then you can reconstruct an approximation of its dynamics, and your approximation to the dynamics doesn't even have to be of the same form as the generative process itself. So it's like you could fit a linear model between height and weight with some error, but that doesn't mean that there is a linear model relating height and weight. It just means it's an approximation heuristic with a form that's very easy to solve. Stephen? So in some ways, this is a bit like that particle-wave challenge, whereby you have to use a particle to find where something is in time and space. You have to use a very simplified understanding of the wave to say, okay, at this point I can't have all that other information, to have my location information, even if it's an informational location. So it's like the flow, finding a point in the flow, gives you a way to identify the location, maybe in the information space, and then the Fokker-Planck gives you a way to then embellish that with another type of information, you know, put information back in, in a way, back into the cake mix, put a bit more salt and a few more chocolate buttons in there, and so you can see what's going on. Okay, that's quite useful to think about. And would I be right in the sense of what those two equations are doing? One's giving you some chance to choose a moment in information space or physical space, so at least you've got something to work with, and then the other one's giving you a way to then put some information in, which you can then start to do some variational processing on. I'll give a short thought, and then Dean and Blue, feel free to say hello and share any overall thoughts and take it wherever you want to go. Most dynamical systems have been framed classically using these equations of motion. So you get like a matrix that describes the differential equations that describe the unfolding of the Lorenz attractor.
So it's framed as a series of dynamical equations. This early equivalence allows us to bring some of the tools of fluid mechanics and density dynamics and flow to bear on these dynamical systems. So there are some ingredients, or some kitchen tools, that make approximations to flow very useful for equations of motion that might be intractable if you tried to solve them in just that domain. Like figuring out a chaotic system's underlying generative process is basically not possible, but potentially we can approximate it with a very specific type of approximation: families of functions that we know have certain behaviors, which we're going to talk more about, approximation families that are well-behaved. And we can use some of the things that have been developed for flow decomposition, like the Helmholtz decomposition, which doesn't exist for dynamical systems per se. So by making that equivalence between dynamical systems and flowing systems, we can then use flow tools to study dynamical systems, while making it clear that we're describing the flow of our approximation. But it turns out that does pretty well, at least in some of these examples that they're going to present in the paper. So Dean, and then Blue. Hi, I'm Dean. I wasn't going to come on today, but the conversation was stimulating enough, I figured I had to jump in here and maybe add my two cents at some point. Thanks, and Blue. Hi, I'm Blue. I'm a research assistant in New Mexico, and I was into this paper because it, I don't know, kind of piqued my interest in information geometry. And Daniel, I saw in the dot zero that you mentioned that course that's online; I'm going to look into that for sure. And, you know, I am curious about a lot of the leaps that were maybe made in this paper, like from particles to cells and back to particles again. Those kinds of transitions get a little fuzzy for me, so I'm curious to see kind of where this flows to.
Yep, I think that was the information geometry course of John Baez. Again, I haven't taken that, but I know that it's recommended, and very cool. So we talked a little bit about flow, about how that's kind of one of the dreams: that we could have a unified flow model for heat and people, information, behavior, strategy, etc. And one of the earliest moves that they do in the paper is just to really lock down this connection between dynamical systems, systems through time, and systems with density. So it's worth just repeating that, because that's the second section of the paper and that's going to be a lot of the build-up. Another term that comes into play is the Lyapunov exponent, and chaotic systems. So if anyone has a thought, of course, just raise your hand. One way that chaotic is quantified and formalized, and this doesn't mean it's the only sense of the word chaotic, because sometimes people do use it non-technically, but the technical definition of chaotic, is referring to the extent of the sensitivity of the system to small changes, for example in initial conditions or in perturbations. And so there's what's known as a Lyapunov exponent, which is something that can be calculated from the model of a dynamical system. And there are some dynamical systems where adjacent points stay adjacent, so it's like two pieces of wood that are just put in a river that's moving straight forward. There are other systems where adjacent points converge, or where adjacent points diverge. And it depends where you are in the system. Like if you're at the bottom of the bowl, then either way you go, it's like a stable equilibrium, you're going to get pushed back in. But if the bowl were upside down, then either point to the side of the top of the bowl would get pushed away and diverge. So that's the Lyapunov exponent, and there's a lot of technical detail in exploring chaotic systems from this perspective. So that's one metric that comes into play: using the Lyapunov exponents, especially the leading Lyapunov
exponents, to characterize some of the attributes of a system. Stephen? I think it's good to mention that idea of a system, you know, like two things coming together, and when they meet in the system they can converge or they diverge. But when they bump into each other, there's also the dynamics of the system. So as well as the system's dynamics merging, there's the dynamics of the change: it will stop changing once they meet, right? So there's also the extra level of dynamical change which you get with these nonlinear dynamical systems, which are actually working with the dynamics of the change. So it can sort of reset itself; once they've come together in a physical system, that's kind of it. When you deal with them as variations on change, it becomes much more generative. Okay, yes. So sometimes these exquisite sensitivities to tiny, tiny differences in initial conditions get swamped by random fluctuations. So that's what's interesting about their approach here: again, they're partitioning out the flow from the fast fluctuations, and then they're describing how, when random fluctuations are added in, the strange attractor is going to be replaced by a pullback attractor, and the flow is the expected motion, so that's the movement of the river, of the flow term. If this flow shows exponential divergence of trajectories, i.e. positive Lyapunov exponents, we can impute stochastic chaos. They then use this approach to investigate one of the classic chaotic systems, which has also been one of the earliest chaotic systems, of Lorenz. Maybe there was some other livestream where we did a little bit more on the Lorenz system, but it was a meteorology question. Lorenz and colleagues were studying the weather and how it changes, and some of the sensitivity of the dynamical models to tiny changes. Like, it seems like if it's 21.1°C, that should be very similar in its behavior in the long run to it being just 0.1 degrees higher, and then
as you got further away, it's like, well, then it'd be way more different in the future if it's more different today. But then it could actually turn out that a tiny difference ends up sending your equations haywire, whereas a very different initial condition might actually feed back and come back to a similar attractor through time. So that's the kind of system they were investigating in the 1960s, and there's been a lot of computer and technical work since then, but this is still a total classic, and a beautiful strange attractor for them to be working on as the first example in their paper. Stephen? So from what you're saying, the strange attractor is the kind of complexity science idea of these mathematical models where things get caught, as you just described, and initial conditions can change the nature of what that becomes within a certain range, and obviously outside that range it doesn't form. So it's still kind of off equilibrium. This pullback attractor is almost a set of states which can be moved towards, looking at the flows within other types of chaotic attractors, so it's making a difference. So in some ways, the pullback attractor would need or involve some sort of inference, or some sort of driver to make it happen; it's not inherent to the nature of the mathematics of that particular thing once you let it roll. Would I be right in that?
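As a concrete aside on the Lyapunov exponent discussion above, here is a sketch (in Python, not from the paper) of estimating the leading exponent of the deterministic Lorenz system with the standard two-trajectory (Benettin) method, using the conventional parameters sigma = 10, rho = 28, beta = 8/3. The integration scheme and step counts are illustrative choices; the exponent for these parameters is known to be roughly 0.9, so a positive value here is the quantitative signature of chaos.

```python
import numpy as np

def lorenz(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations of motion."""
    x, y, z = s
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(s, dt):
    """One fourth-order Runge-Kutta step."""
    k1 = lorenz(s)
    k2 = lorenz(s + 0.5 * dt * k1)
    k3 = lorenz(s + 0.5 * dt * k2)
    k4 = lorenz(s + dt * k3)
    return s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def leading_lyapunov(steps=20_000, dt=0.01, d0=1e-8):
    """Benettin's method: follow a nearby trajectory, log how fast the
    separation grows, and rescale it back to d0 after every step."""
    s = np.array([1.0, 1.0, 1.0])
    for _ in range(1000):                  # discard the transient
        s = rk4_step(s, dt)
    p = s + np.array([d0, 0.0, 0.0])
    total = 0.0
    for _ in range(steps):
        s, p = rk4_step(s, dt), rk4_step(p, dt)
        d = np.linalg.norm(p - s)
        total += np.log(d / d0)
        p = s + (p - s) * (d0 / d)         # rescale the separation
    return total / (steps * dt)

lam = leading_lyapunov()
print(lam)   # about 0.9 for these standard Lorenz parameters
```

Nearby trajectories separate exponentially at this rate, which is why a tiny change in initial temperature can end up mattering more than a large one.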
So the pullback attractor, it's kind of like, let's just imagine the attractor is like a post in the ground and there's a bungee cord. So if we were looking at that quadratic bowl, it'd be like there's an attractor at the bottom of the bowl. And then if all you saw was the lateral movement of the particle, so we don't know anything about the height necessarily, or the shape of the graph, we just observe that when we push the ball to the left, it gets pulled back as if it's on a bungee cord, maybe it even overshoots and it kind of dampens. So that's like a pullback attractor. What's being shown here is an attractor that is in higher dimensions. So if you had a two-dimensional pullback attractor, it'd be like the line y equals x, and if you get perturbed off the line, you come back down into the valley. So that's where you start to see vector fields and the flow of water into a river. And now we're in a three-dimensional attractor, at least three spatial dimensions. So it's kind of like there's a race car going around this butterfly track, and then it has the bungee cord, so you're getting pulled to this car that's moving. And then if you sample your location at different times, you're being drawn to a pullback attractor, which is the compromise between the flow, as defined by the equations of motion of the Lorenz attractor, and the fast stochastic fluctuations. Dean, and then Stephen. Just quickly, I think this is why sometimes using the metaphor of building a bridge is difficult, because it's not always as convenient as the lowest energy state in a bowl. Sometimes what we have to look at is the fact that at one end, call it point A, we're trying to figure out what point B is, and it isn't always the lowest energy state.
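That race-car-on-a-bungee picture can be sketched by adding Gaussian fluctuations to the Lorenz flow. The following Euler-Maruyama simulation is illustrative (the noise amplitude and step size are arbitrary choices, not the paper's method): the noisy trajectory never traces the deterministic butterfly exactly, but it is continually pulled back into its neighborhood, so it stays bounded and its long-run statistics stay close to the deterministic attractor's.

```python
import numpy as np

rng = np.random.default_rng(1)

def lorenz(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def noisy_lorenz(steps=50_000, dt=0.002, amp=2.0):
    """Euler-Maruyama for the Lorenz flow plus random fluctuations."""
    s = np.array([1.0, 1.0, 1.0])
    max_norm, z_sum = 0.0, 0.0
    for t in range(steps):
        dw = amp * np.sqrt(dt) * rng.standard_normal(3)
        s = s + lorenz(s) * dt + dw            # flow + fast fluctuation
        max_norm = max(max_norm, float(np.linalg.norm(s)))
        if t >= steps // 2:                    # average z after the transient
            z_sum += s[2]
    return s, max_norm, z_sum / (steps - steps // 2)

s, max_norm, mean_z = noisy_lorenz()
# Despite the noise, the state never escapes the attractor's neighborhood.
print(max_norm, mean_z)
```

Sampling this perturbed trajectory at different times is exactly the situation described above: you never see the infinitely fine filigree of the deterministic strange attractor, only the fuzzier random attractor the samples are drawn toward.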
I think here's an example. So if you've been to Hudson Yards in New York, it's a fascinating translation of a space over time. And if you were to ask the people who actually constructed those high trestles, at the point that they were built, whether or not that was going to be the lowest possible energy state or attractor state X number of years later, I don't think they could have predicted that. And so that's why I think it's difficult at times when, for the sake of convenience, to expedite our sense of what's going to happen next, we use a connector metaphor; in fact, that's not exactly what stochastic possibility is about. So what we tend to do is collapse down too quickly to the particulars of what purpose a product can serve. And if we're going to build a bridge from a product to a process, which is what I think looking at stochastic separation is about, we should maybe drop the idea of a static connector and look at the back and forth and the actual dynamics of those dynamic situations, which now the tool of active inference affords. Interesting. It just made me think about throwing a frisbee and it catches on a roof. That's its lowest local energy state, but if you just thought about gravity, maybe you'd say, well, it's going to fall to the ground. So one particular trajectory may not realize the ultimate lowest energy state, yet in a way it's in its own energy state. So, Stephen. I think, again about applications, what Dean mentioned is a good point: there's a natural tendency to want to have that kind of bridge between two stable things. The thing with two stable things is they're stable, as we just mentioned. So therefore, if you're trying to get information about variations within them, particularly chaotic variations, you're not going to get much, right? So there's a need to be out of equilibrium, to give the conditions where things can move in potentially a strange attractor, or potentially other types of, you know,
solenoidal flows or something. There needs to be some way for flows to happen. And then within that, I think that might explain that last graph. So if I'm right in thinking, those four graphs that you showed there are kind of showing that you've got this strange attractor that you can sort of look at in state space, using points where the flow calculation has become momentarily zero. So you can then take a point, and the bottom right is basically saying you can recapitulate that as a global potential. So by taking the changes that are happening at the second order, you can sort of recapitulate some kind of understanding of the structure of the information. Is that right? There's a lot there, so I cannot reduce it to a binary, but yes, it is showing different aspects of how strange attractors, which are found in these exotic deterministic chaos systems like a perfectly clean double pendulum, relate to the approximation: when we take the approximation, we find a pullback attractor empirically in the approximation. So now we're describing a nicer, still stochastic but much nicer and more approachable pullback attractor, like it's being pulled back to some butterfly shape. Now, we don't get these rings-of-Jupiter situations from our sampling, because we're taking particular samples and we're then fitting a flow: what flow generated those samples? So we never see the rings-of-Jupiter strange attractor; we're dealing with approximations of random attractors. And then we have at least two tools, slash helpful techniques, to apply that are going to help us come to terms with that approximation. One of them is: we're dealing with flow, so we can talk about the Helmholtz decomposition. This was discussed a lot in ActInf 26, and as mentioned in the dot zero, Stochastic Chaos and Markov Blankets was submitted just a few weeks after the paper from 26, Bayesian mechanics, was submitted. So the Helmholtz decomposition describes this partitioning into the irrotational kind
of the gradient: putting the ruler on the hill, what is the angle of the ruler? If you're on the top of the hill it's flat; if you're on the side of the hill it's going to be at an angle. So separate that ruler on the hill from the hiking map, the isocontour: let's make the ruler flat wherever it is, whatever the slope is, and then let's just take that ruler and move it side to side as we go around this hill. Now if you're on the very top of the hill then the ruler is flat and there's no isocontour, but if you're anywhere else on that whole landscape, other than a local maximum, there's going to be some angle to the ruler and some way to get around the hill. So that's the Helmholtz decomposition; that's one piece that we get to bring to the puzzle. And then one piece that would be interesting to ask the authors, or someone with more technical experience: we've seen the Helmholtz decomposition before, into the solenoidal and the gradient components, but then there's this housekeeping term, you know, capital lambda, and that's described in an appendix. It gets quite technical, but it has to do with the density landscape changing as a function of flow. So it's not just like we're going to have one mountain topography and then we're going to have rain falling on that one mountain; there are going to be some changes as we flow around. It's more like you have a jet of hot water into cold water: you can't just take the mountain of temperature and then ask, well, what happens if we pour hot water on top of that mountain, because the mountain is actually changing, the landscape is changing as a function of the flow, and some of that gets incorporated into this housekeeping term. So one tool that we have from the flow toolkit is the Helmholtz decomposition into these different components, which helps us reconstruct a flow that is a much better approximation. Stephen?
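Since we keep leaning on this gradient-versus-isocontour picture, here is a minimal numerical sketch of the split for the simplest case: a linear (Ornstein-Uhlenbeck) flow with a Gaussian steady state. Every matrix value below is a made-up toy number for illustration, not anything taken from the paper:

```python
import numpy as np

# Helmholtz-style split of a linear (Ornstein-Uhlenbeck) flow:
#   f(x) = -(Gamma + Q) @ grad_phi(x)
# where phi is the potential (self-information up to a constant),
# Gamma is symmetric (dissipative / gradient part) and Q is
# antisymmetric (solenoidal part). All numbers are toy values.
Gamma = np.array([[1.0, 0.0],
                  [0.0, 0.5]])   # dissipative operator (assumed)
Q = np.array([[0.0, 2.0],
              [-2.0, 0.0]])      # solenoidal operator (assumed)
Pi = np.array([[2.0, 0.3],
               [0.3, 1.0]])      # precision of the steady-state Gaussian

def grad_phi(x):
    # Gradient of the quadratic potential phi(x) = 0.5 * x @ Pi @ x
    return Pi @ x

x = np.array([1.0, -2.0])
f_grad = -Gamma @ grad_phi(x)    # flows downhill on phi
f_sol = -Q @ grad_phi(x)         # circulates along the isocontours of phi

# Antisymmetry of Q means the solenoidal part does no work on phi,
# while the gradient part strictly dissipates:
print(grad_phi(x) @ f_sol)       # essentially zero
print(grad_phi(x) @ f_grad)      # strictly negative
```

The antisymmetry of Q is exactly the "ruler moving along the isocontour" picture: that component never changes the potential, while the symmetric Gamma part always rolls downhill.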
You mentioned there that we've seen solenoidal and gradient before with the Helmholtz decomposition; when you say we've seen it before, have we seen it more recently with the solenoidal work, or has the solenoidal been in there for a long time? This was drawn and discussed in paper number 26 and other papers; this is not a novel partitioning by the authors, and it's very linked to other discussions that we've had about the Helmholtz decomposition in the context of distributional approximation. But previously Helmholtz was more of a focus on gradient descent; the focus was in your regime of attention, potentially, going way back to the early days. So in that paper that was looking at solenoidal flow, was it giving more weight to that, was it using it in a different explanatory way? Because there has been this shift around solenoidal flow, to give another way, beyond ergodicity, to explain how these dynamics can work. They're kind of co-instantiated, because they're partitionings of the same underlying model, so which one is emphasized more is totally a question of situation and flourish. But it also may be the case that over the previous three years there's been more of an emphasis on the importance of the isocontours in complex problem solving, as opposed to just run-and-gun gradient descent. I agree with that. So that's one tool we have: we do the Helmholtz decomposition because, again, we're making this deep equivalence between dynamical systems and flow systems, so one tool we have from the flow world is the Helmholtz decomposition for vector fields. Now they write in section 3.1 that our objective is to identify the functional form (it could be used in the sense of what function does it play, or a functional as a function of a function) of the self-information or potential function that describes the non-equilibrium steady-state density. So we go back to equation 3, the fancy I, the fancy J: they write that the first dissipative part performs a Riemannian gradient
descent on a manifold, on the negative logarithm of the steady-state density, which can be interpreted as the self-information of any given state, or as some potential function. So this is: how surprised should I be to be in this state? Now let's go back to our bowl: you shouldn't be surprised if you're at the bottom; you should be surprised if you're at the top, where there's high potential. Now if it were gravity that were pulling us down, that would be potential energy, gravitational potential energy; you have higher gravitational potential energy that can get dissipated. Free energy can be dissipated if you have greater potential energy. So we want something just like the bowl: you could have the bowl and know where the ball is expected to be, where it's going to roll down from, where the potential is high and where the potential is low. And then here in figure one they're going to plot self-information on the bottom left, and that's fluctuating through time as the system is cranked. It's kind of like, if the self-information is staying at a constant value, if you're in that race car going around the track, then you're always where you expect to be, because again the bowl is like a point attractor, and then the Lorenz approximation is going to be a dynamical approximation. So we want to find the function of the non-equilibrium steady state that's going to describe how surprised you should be, the potential function for that system. So these are big questions: how is the self-information measured or calculated? What does it mean for the self-information to be high or low? And how does the self-information differ for different states that we might care about, like homeostasis; is the self-information higher or lower when, homeostatically, you're in a good attractor? So Dean, then Stephen. I don't know this, but I would guess that on some basic level the measure is, on its most
absolute terms greater or less than to begin with because I think that that relative difference is something that anybody doesn't even have to be aware of consciously calculating they can just you can kind of tell but I'm not really sure but I would think that as we as we go up in terms of the the determination and the accuracy of the measure we could find ourselves getting more precise but on the most basic level I would think it would from a functional standpoint that's where we would have to begin yes and having being less surprised about the state that one is in entails having a richer and a more accurate generative self model so if you have a model that just totally off base you're going to be recurrently surprised if you have a very nuanced self model your location and your circadian rhythm then different observations will be less surprising I was just feeding back what you were saying because you did a really good explanation but the in your explanation the key thing was I should be more surprised or I should be less surprised like you yourself in doing the explanation come back to that basic difference and so again as we get more precise I think we can have more sophisticated ways of being able to show how we arrived there but at the beginning even a two year old can somehow tell the difference between more and less that's why they dig in their heels right so it seems like on that basic level before we get into some of the more sophisticated stuff we all can it where it says addresses these questions for one simple case I think on its simplest terms that's where we can probably start we can look there first if that's not available I don't know that other math matters I know it does but to the person who can't see that first I'm not sure where they go this past weekend we were hanging out with some larva with some children and we were really struck by how there would be some times where it seemed like you just would tell them something totally different like 
switch topic in the middle of a sentence, and they just went along with it, and then other times where they needed something really specific: they wanted that piece of food, and they wanted it in this part of the plate, and they want to touch it this way. So it's like they have a generative model, and there are certain parts that they're surprised by or not. We're just like, we want that piece of food, and I'm not too surprised at the route it takes to my plate; but if their generative model entails a specific way that it is going to get to their plate, they're going to be surprised by anything other than that. And if they don't have a generative model of topics in the world, then switching topics, talking about this event and then that event, it's as if any other set of topics were addressed. So great point about our intuitive connection to surprising information, and how that's related to our generative modeling. So Blue, and then Stephen. Just to go to your child model: having had lots of kids around for a lot of time, food is something particularly that they fixate on, right? My younger stepdaughter is one of the easiest kids ever, she's so chill and so laid back, but I remember when she was about two years old she would freak out. One time I cut an apple for her, sliced it up just to make it easy to eat, and she got so mad, because she wanted the whole apple, the whole thing, and she thought I was only giving her part of it. I think maybe food is, I mean my point is, I'm rambling about the story, but maybe food is one of these things that is so baked in to our generative model, because otherwise children have the attention span of a gnat. You know, you're in the car and they start to argue and you just say, oh look at that pretty yellow car right there, and then they're done, they're over it. So generally you can surprise a child and completely shift them, but when it comes to food it's like this weird, and many kids
like that just you know aspen with the apple but then there's also like this food can't touch that food and like they have a bigger piece of cake than me and all this stuff so this food like this fundamental like baked in thing that's like central to survival and therefore it's like very deeply ingrained in our generative model maybe nice agree Stephen yeah so the idea of children I'll keep that analogy in there if so we've got this these flows happening and we we have state spaces that get readings off so the very first state space reading I mean unless you've got a prior somehow to go on you don't really know what that means and you get another one and another one and so you get a series of state space insights so to speak and then you'll start to get a sense of whether something's more or less than I mean the simplest term the average right but you know we can then use gradient descent and that to sort of get an understanding of that and with children that can be interesting is so yeah sometimes I just want to go with everything and I notice also like I get that with my daughter Layla sometimes it's like you know she'll say like I want something very specific or that school there someone so went to that school like now she knows that well I know because that's what happened when I was little meaning when she was a small baby or and it's all kind of made up stories in some extent but it's kind of like she's converging on she's got enough state space information to say okay and probably you've got that with food right it's like there's there's enough state space information to say and they can be proud and I know that and you didn't know that to say to me you know I knew that you didn't know that I knew that this is quite cute but you've got that okay so she's got enough state space awareness that they're going to converge down and then there's other times like when you want them to just stay on the football pitch and not run off the swings they just go crazy so 
there's something about you know sampling the space and saying you've got enough information and synchronization on to joint action manifolds is going to be one of the prices of the paper so blue as a response and then Dean so you know super interesting about like asking your daughter Steven like you know the thing she says that she remembers from when she was a baby something that's really fun to do with kids and it's also leads to like the synchronization where I feel like maybe we're going but ask a kid when they can first talk like when they can barely talk maybe like you know you're two years old ish ask a kid where they were before they were born the answers are awesome or like do you remember being in your mommy's tummy like and like you know I've heard ranges and ranges of answers from kids and it's really super interesting so where is this like is it some inherited prior is this like past life regression or is it just like they're falling into synchronization they're making something up because children have these like great imaginations at that age but where does this come from like are they just responding back to what they think you want to hear this is like an absurd question to ask a kid right like I mean you can't really expect a reasonable answer but ask because it's fun to find out what what they think that they were doing before they got here it's fun and Dean first yeah I think it's interesting because if you're going to go back to that geometry aspect of you know can can two-year-olds think in the abstract state yes they can number one and the other part I think we're we're sort of we're all sort of synchronizing around is that if it's lesser more that can be represented by the X and the Y on a graph we've already shown that but if you want to think whether it's in or out you have to add that Z vector and that is the part that most learning doesn't necessarily explicate it's there two-year-olds do it they know whether they're in or out they know 
whether they're getting the whole apple or a slice. So again I ask: are we attempting, as people who are trying to help the two-year-old, to build a bridge from point A to point B, or should we actually be understanding this as building something that is a product, that is material, to something that is possible? That's what the stochastic field is, until we collapse it to that point B, right? And the only way that that happens is if we include in and out, if we include the Z dimension in this. So again, if we can't get that correct in our own minds, then we're actually going to get in the way of developing pedagogy. I'm sorry to say that, but it's actually true. So we can think of these little people as these wonderful little right brains; the right brain hasn't been brought down to these equations at the bottom of this page, and yet they're still able to function. Their functional form is self-information based on product to process, to be determined. So again, I'm not saying this because I've got it all figured out, but because I don't want to have this active inference tool, which is a fantastic potentiator, collapsed to: now here are the potential outcomes that we can engineer, A through Z. That doesn't allow for a Hudson Yards, and yet there is a Hudson Yards. So if we're going to help people, let's not, as you say Daniel, over-reduce prematurely in the hopes of actually helping people, when in fact what that does is it says: as you fall out of the sky, raindrop, you shall depart this flow in Surrey, British Columbia, or the Gulf of California, or at the mouth of the St.
Lawrence River. We can't do that, or we shouldn't, but we do, and I can understand why the greatest thinkers do that. Well, I can, because we like to talk. But the bottom line is we've got to rethink this; that's what active inference supports. Totally agreed, and we can engineer the landscape of flow: you can add a dam so that the water droplets do go one place or another. Very interesting. So Stephen on this, and then we're going to go to the next tool that gets introduced. Yeah, I think, well, I think, I feel, I know: there's an interesting question as well, when you say you've got a two-year-old, of where they get their knowing from. We have abstraction in terms of thinking, and the sort of ontological structures we can create around our knowledge, but you know, they have access to two years of state-space experiential sensations, from which they can then make a story whichever way they please. So there's an interesting question about what it means to know something abstract, i.e. what's outside of the sort of lived-experience level. And one thing I always find interesting: when a child is around the age of two or three, you really see a change in the fingers; the whole morphology of the body shifts. And I think one of the reasons that maybe memories aren't accessible from before the age of three is literally that they were encoded in a different body, one that's too far removed to revisit the priors or the likelihoods from the state spaces that had been laid down. So it may be partly a natural way that we sort of forget, but we don't necessarily forget thoughts that we had cognitively; we just forget the embodied stuff, because we're going to lay down some new stuff, unless you have a trauma, in which case that can lock in certain states. Blue, any comments on that? Of course, it's interesting, the encoding in a different body:
we're doing that every seven years like we have full cell turnover but there's this process of synaptic pruning that happens so as we're babies we grow bazillions of neurons and then synaptic pruning happens around the age of two or three where a lot of those neurons die back and then the ones that stay continue to build stronger and build better but everyone is so completely different in terms of their ability to remember things from when they were baby and not always related to trauma I have no conscious memory until I was 10 but it's super late maybe I might do some hypnosis and reveal some kind of trauma at some point in my life but for me to not have any I remember some flashes of stuff I remember that I probably remember the picture of it so there's this whole other level of thing but the idea of being encoded in a different body is super interesting because of course my body is different now also from when I was 10 but I mean also when you're 90 your body is different from when you were 50 so what stays what goes how is it chosen is it this embodied aspect of it or is it just like what we think are important for our model and if it's not relevant for our model maybe we just let it go Dean? 
I just want to say this real quick because I know you want to move move forward Daniel but everything that we describe here in order to get to a concept of between whether it's between us today and us seven years plus a day before that between or any other between that we want to describe at some fundamental level we have to be able to discriminate between more and less and what's in and what's out if we can do those things we in any situation can describe between now once we can describe between we can see a difference between what a product is the bottom of this slide and a process something that that equation can do if we can get that clear then stochastic chaos is a between thing that we can then manipulate to a certain outcome or we can leave open to possibility if we can get that just that we will have conquered more than most people have conquered in their lifetime because we can use the word between but I don't think they necessarily understand what makes that up and I think that's what we that's what I would like to be able to share with others and co-learn with others as opposed to excuse me building another set of specific outcomes that we can measure and say this was you one month ago now seven years plus a day and now we're going to decide whether or not you're closer to some thing that I've arbitrarily selected as the thing that makes you have learned something if we can get there could you imagine how amazing that would be not just that one thing sorry I get really passionate about this but holy crap I'm actually talking to people who can actually hear that and go that's interesting instead of just dismissing it out of hand and going we don't need to learn that because we already know what between means between is geometric so for sure information geometry will come into play and then the idea of taking vast situations and through time with uncertainty and then calculating more or less cough cough expected free energy that could be an extremely 
helpful approach. So Blue, anything on this, or shall we continue? Yeah, so what's the next tool, are we going right into the Markov blanket? Because, man, when we talk about between, I just want to get to the Markov blanket. When we talk about between, I think that having that idea of the Markov blanket... anyway, I'll add it at that time, so go on. Absolutely, the Markov blanket is understood even by the non-technical as a betweenness, but is it the betweenness that is your epithelium, mechanically, or what kind of betweenness are we actually talking about? We need the tools; let's go to the next tool. So here is the quote: in general we cannot discount the correction term that arises when flow operators are a function of states. So if it were just Mount Everest, we could just do the solenoidal and gradient decomposition and then figure out how to combine isocontours with gradient climbing, and we would get to the top. However, swept under the rug by the housekeeping term is this change in the underlying mountain landscape; indeed, it is this state dependency that underwrites stochastic chaos. A way to think about that is the double pendulum: the chaoticness of that system arises from the exact angle that the double pendulum is in. You can't just say there's a Mount Everest and here's the bottom; that's not very helpful for predicting these wacky waving shapes it's making. This presents us with a more difficult problem. The problem can be finessed by using polynomial expansions of the flow operator and the potential, as follows, for n states (so it could be one, two, three, four, n dimensions) up to polynomial order m. So a linear approximation has a polynomial order of one, a quadratic approximation has a polynomial order of two, and so on; it's like fitting higher and higher order terms. So the big idea is that we're going to use polynomial expansions to approximate complex underlying flow functions. Polynomial expansions do many things, but most importantly they restrict or scaffold our
model selection approach to a very specific form and function and size, a tractable computation. So individuals might be familiar with a Taylor series, where you take a given point and you say, well, what is this function at that point (like the sine of zero is zero), and then you say, what's the first derivative, that's this red line, and then what's the second derivative, starting to curve a little down, and what's the third derivative, and so on. And you can use the Taylor series to approximate that function further and further out from the entry point at which you're calculating the first, second, third, fourth, fifth derivative, etc. And this is a static approximation. So this is something that might be learned in high-school or college-level maths. Very related, but not always discussed, is the Volterra series. The Volterra series is a model for nonlinear behavior similar to the Taylor series; it differs from the Taylor series in its ability to capture memory effects. The Taylor series can be used for approximating the response of a nonlinear system to a given input if the output of the system strictly depends on the input at the given time. And the Volterra series is developed quite extensively in the SPM textbook by Friston et al. So again, many of the seeds and roots of what we're talking about today: check the SPM textbook and documentation, because that's where a huge amount of these authors' competency arose, and how they represented it almost 15 years ago. So what happens when we do this polynomial expansion is that we get an extension of the functional form. Well, first off, we moved the problem that we were solving from static mountains only, to trampolines that are moving as we're walking; that's what brings state-dependent changes into the mix, and action as well. And it moves us from, again, a static landscape that we can just calculate where the rulers are flat on, and then just figure out the best way to get there. Well, you can't just lay a flat ruler if
there are state-dependent changes, because where the ruler is flat at T0 is not where the ruler is flat literally the next moment. Dean? Yeah, I think this is excellent, Daniel, because what this tells us is that now we can explain why there are impressions, why there's some residue left over; the in-and-out aspect of this, not just the bump but the in and out going through that bump, also has to be accommodated, or explored, at least acknowledged, as part of everything that we're looking at here, the whole thing. So yes, that's a great explanation, thank you. And the learning landscape isn't just the mountains; the interaction landscape (we'll get to coupled systems soon) is more like, even one person's learning landscape is like being in trampoline world sometimes. So that brings us into this space with many important topics that are discussed in an earlier and different way in the SPM textbook. Now, how good is, or could be, this polynomial expansion? They write: inspection of the Lorenz system suggests that a second-order polynomial approximation is sufficient, given the flow is second order in the states. So we're fitting a second-order approximation; it's like a quadratic, and that's why we talked so much about the bowl, because the approximation that we're fitting is bowl-like, which allows optimization techniques that are simple, like convex optimization. Those are easy: roll the ball to the bottom of the hill by minimizing the potential function. Which potential function? The free energy potential function, the free energy functional. Whereas if it were a rugged optimization landscape, you'd get the whole headache of how do you know when to stop, and what if there's a whole neighborhood you didn't even go to? So it seems like potentially a second-order approximation is sufficient. Then they use this word ansatz, which is nice: it's an assumption about the form of an unknown function, made in order to facilitate a solution of an equation or other problem. So now we're going to go
with this idea: if we can have a second-order approximation to the Lorenz system, we're going to be doing well enough; let's work with that and see if that turns out to be the case. So it turns out that if this thing that's so hard to model, the self-information, the surprise, the potential function, is modeled by a second-order polynomial, then the non-equilibrium steady state is approximately Gaussian. This is known as the Laplace approximation in statistics, which is also in SPM. Very helpfully: here we generalize the notion of the Laplace approximation to cover not just the quadratic form of the log density, but also the solenoidal flow that underwrites non-equilibrium dynamics. So the Laplace approximation in classical statistics is like: we're going to take the density distribution of what we're trying to estimate, we're going to find the peak, the maximum point, and then we're going to fit a Gaussian over that peak. And so if there's a big shoulder and then a tiny shoulder, you're just not going to get that tiny shoulder; but that's a very rich distribution, if it really has that structure in the real world. What you're going to get for that distribution is a Gaussian centered around the big peak, so for some distributions you're going to be way off. There are things that are hard to estimate, and there are things that don't work well with the Laplace approximation; it's not a solution, it's an approximation. However, if we're interested in the mean and the variance around the bulk of a distribution, then solving the parameters for the Laplace approximation is as easy as fitting a quadratic, which is to say the ball sliding to the bottom of the bowl. Once you go beyond a second-order approximation, once you have a third-order polynomial, there's a whole other world, because then the ball could get caught in that first little elbow, or it could fall off the other side. So higher-order approximations take more computational resources, and while they might have increased accuracy, they can often
be less tractable to actually solve. So this is, at least in my reading, a pretty fundamental contribution to the statistics of complex systems, because they've taken the notion of the Laplace approximation not just towards estimating the mean and the variance of some distribution out there, but to the solenoidal component of flow, post-Helmholtz decomposition, related to non-equilibrium dynamics. Stephen? So up to this point we actually don't need Markov blankets yet; the word has not come into play yet at all. Exactly. So you've told the story of what needs to be in place for there to be these kinds of state spaces, and then the nature of these state spaces has been so changeable and nonlinear, and then on top of that there's a way to tractably work within that, which is the Laplace approximation in statistics. Does it always give one Gaussian on any landscape, or is there multiple, or is it just one per landscape? Yes: per dimension, per state, one and only one mean and variance. It's like fitting a Gaussian distribution to the log density, so it does only give one, which is why I gave the example of the two humps: it's gonna pick one and go there. Let's look at figure two. How interesting, though, that the paper is Stochastic Chaos and Markov Blankets and we're not even at the Markov blanket yet; but the way it gets led into is very much improved and different from what we've seen before. Instead of "assume a partition, dot dot dot," we're starting way back. So now here on the left is the flow of the Lorenz system using the Helmholtz decomposition; here are the three components of the flow as they've described them: the blue is the gradient, orange is the solenoidal flow, and then the gold is the correction, the housekeeping term. Right, this uses the same format but for a Laplacian system based upon the Lorenz in the upper panel. So we're gonna be fitting a Laplace-like approximation to each of the states, each of the dimensions, of this Lorenz strange attractor. The key
difference is that the dissipative part of the flow operator and the Hessian are positive definite. So just really quick matrix vocabulary: the Jacobian matrix is the matrix of first partial derivatives. In the x, y, and z dimensions, or however many there are, what is the angle of the ruler? That's the first partial derivative. The Hessian is the matrix of second-order partial derivatives, so it's describing the local curvature: if your ruler's flat and it's curving down, you're on the top of a hill; if your ruler's flat and it's curving up, you're at the bottom of a hill; and so on. So that's how the zeroth (the value of the function), the first (the derivative), and the second (the curvature) are used, like in calculus. So the Hessian is positive definite, which means the gradient flows converge to the maximum of the non-equilibrium steady-state density. This is reflected by the blue arrows that point to the center of the state space. So in the Lorenz world you're getting raced around, the flow is taking you all over the place, yet by making the Laplacian approximation, all of a sudden the flow, the attractive part that's dominating the movement within that space, is actually convergent. That's not to say the Lorenz system converges to a stable point; it's to say our approximation parameters converge towards potentially a more tractable and simple attractor in the abstract approximation space, and we'll see how good an approximation that indeed is. But that's the key difference: here the blue arrows are taking you on a wild ride as the gradient takes you around the Lorenz flow; here we're doing a flow decomposition on our approximation, which we constructed to be of a certain type, and it turns out to have properties in its attractor that are very different from the Lorenz. Dean and then Stephen; unmute, Dean. Yes, here's a question for everybody here: if we're talking about a process, could we be looking at those over time, seeing those arrows in both the Lorenz and the Laplace decomposition thickening or
expanding, like having water added to them? Has anybody contemplated whether that is in fact what is happening over time, so that the space is getting filled up by these arrows growing in circumference? Very interesting question. The length of the arrow usually describes the amplitude, the speed of the flow at that moment. I don't know that I've seen a visualization where, like tree rings, the arrows represent thicker or thinner vectors, but great question, and if anybody with more experience in vector math has a thought, that could be very interesting to know. Okay, Stephen and then Blue. Just to clarify that I've got this right: at each of those points where you've got the blue arrow and the solenoidal flows, each one would be solved, so to speak, for that point as a Gaussian approximation, and then this is like an assemblage of multiple results of a Gaussian approximation, sort of mapping them out; so this idea of multiple pieces of information and nestedness comes into play. Yeah: there's some landscape, and we go out there with our ruler, our Jacobian, and our curvature ruler, our Hessian, and we're going to make transects on the hill. We're going to sample a bunch of points, you know, one comma one, one comma two, and at each point we use the ruler and the curvature and look at where the field is pointing. And if you've sampled at an inadequate resolution, you're not going to see a coherent vector field, so you need to sample at a resolution that allows the vector field to be kind of obvious. So, Blue. I just want to point out, for people who might not know, that we need a positive definite Hessian to calculate a covariance matrix. In statistical analysis that I've done, this is one critical point: if you don't have an invertible Hessian, then you have to recalculate your model, because all of your factors
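To make this positive-definiteness point concrete, here is a small sketch with a made-up Hessian (the numbers are invented for illustration, not taken from the paper):

```python
import numpy as np

# Illustrative Hessian: curvature of the potential at its mode.
H = np.array([[2.0, 0.8, 0.0],
              [0.8, 2.0, 0.0],
              [0.0, 0.0, 1.0]])

# Positive definite <=> all eigenvalues > 0. Only then can the Hessian be
# inverted in the way a covariance calculation requires.
eigvals = np.linalg.eigvalsh(H)
assert np.all(eigvals > 0), "not positive definite: respecify the model"

# Under a Laplace (Gaussian) approximation, covariance = inverse Hessian.
Sigma = np.linalg.inv(H)
print(np.round(Sigma, 3))
# The zero blocks in H carry over: state 3 stays uncorrelated with states
# 1 and 2, which is exactly the kind of conditional-independence structure
# a Markov blanket is built from.
```

If the eigenvalue check fails, there is no well-defined covariance, and, as Blue says below, no way to establish the conditional independences.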
are too entangled to tell how they covary, together or separately. So I just wanted, for anybody who doesn't have the background in matrix math, to point that out. Yes, thanks. Dean. Blue, help me: does that mean literally that you can't thin one of those arrows, that you can't let certain things be purged from your state space? I really don't know; I'm not questioning, I'm asking, from a statistical standpoint, and then being able to translate that to the physical reality. I don't translate to physical reality well, my brain just doesn't work that way, but what that really means is that you can't establish conditional independence without having some idea of the covariance. And if you can't calculate the covariance, you can't discover the conditional independence, which is what we're building up to in specifying the Markov blanket. That's why I wanted to point that out: you have to have that ability to find the covariance. Okay, let's go to where the covariance comes into play, and then Stephen. So we're going to address how well this approximation works, because we're fitting a few quadratics to, like, the OG complex system; is that really going to work? What's being shown in Figure 3, along the very top, is, from the left, the second versus third, first versus third, and first versus second states, so these are like three projections of a cube: we're looking, on a flat screen or a piece of paper, at a three-dimensional state space. The red dots are the sampled trajectory in that cube; it's like there's a firefly moving around in the cube and we're looking at it this way, projecting it onto that screen. And in the background we see this very nice unimodal Gaussian attractor: the lighter part is the attracting part, and the darker part is higher potential, more surprising, less likely. So that is the Lorenz trajectory being played out in our
Laplace-like approximation cube. And it can be shown that the flow of the Laplace approximation has some of that butterfly, race-car dynamics. It's probably doing a lot better than just sticking a linear equation through it, but it's not perfectly recapitulating the underlying Lorenz system. So how good is the approximation? Well, one thing this approximation is going to live or die by is whether it can characterize the Lorenz system as chaotic or not, because if it turns out that the way we're going to deal with chaotic systems, systems with a positive Lyapunov exponent, is to fit a non-chaotic model: oh, I made an oscillating pendulum to describe homeostasis; okay, so you made a simple model. What's really interesting is if our approximation could also reflect chaos yet handle it tractably. So in Formalism 12 they show that the estimated Lyapunov dimension for the approximation is 2.48, and they use some numerical approximation techniques here, like sampling on a grid, kind of that transect-sampling idea, but evaluating the Lyapunov exponent at those points rather than just the gradient. And that compares very favorably with the actual analytical Lyapunov dimension: 2.48 versus 2.43. Importantly, they're both similar and way over one, so the approximation is actually highlighting the chaotic nature, and almost quantitatively the extent of the chaos, in more and less chaotic systems. And for our purposes the Laplace approximation is easier to handle than the Lorenz system, because the functional forms of the flow and potential are immediately at hand. That's the whole point: we use an approximating family of equations that's going to be like having a ball rolling down to the bottom of a bowl rather than being swept up in turbulent flow. Figure 4 is where we get to the covariance that Blue mentioned. Okay, so we're going to have three states, the x, y, and z states or dimensions of the Lorenz system, and now we can look at the log. So the log is just
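The chaos check described here can be sketched with a standard Benettin-style estimate of the largest Lyapunov exponent. This is a generic textbook recipe, not the paper's grid-based method, and the step sizes and iteration counts are illustrative choices:

```python
import numpy as np

def lorenz(x, s=10.0, r=28.0, b=8.0 / 3.0):
    return np.array([s * (x[1] - x[0]),
                     x[0] * (r - x[2]) - x[1],
                     x[0] * x[1] - b * x[2]])

def rk4_step(x, dt):
    k1 = lorenz(x)
    k2 = lorenz(x + 0.5 * dt * k1)
    k3 = lorenz(x + 0.5 * dt * k2)
    k4 = lorenz(x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def max_lyapunov(x0, dt=0.01, steps=30000, d0=1e-8):
    x = x0.copy()
    for _ in range(1000):              # let the transient die out
        x = rk4_step(x, dt)
    y = x + d0 * np.array([1.0, 0.0, 0.0])
    total = 0.0
    for _ in range(steps):
        x, y = rk4_step(x, dt), rk4_step(y, dt)
        d = np.linalg.norm(y - x)
        total += np.log(d / d0)
        y = x + (d0 / d) * (y - x)     # renormalize the separation
    return total / (steps * dt)

lam = max_lyapunov(np.array([1.0, 1.0, 1.0]))
print(lam)   # positive => sensitive dependence on initial conditions
```

A positive estimate is the numerical signature of chaos that any decent approximation of the Lorenz system should preserve.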
sort of a transformation to help highlight some of the differences, especially between 0 and 1. We can see the log of the Jacobian, which holds the first partial derivatives; the log of the Hessian, the second partial derivatives; and, in the middle, the log of the covariance, which is what Blue was referencing. Here the first and second states, or dimensions, have very high correlation, and neither of them is correlated with the third; the on-diagonals are states correlated with themselves. So the first and second dimensions have high correlation: what does that look like when we use that cube and shine light through it? On the bottom left here we can see that the first and second have an oblong relationship: that scatterplot has a significant regression through it; there's a correlation coefficient, there's mutual information, there's non-sphericity. Whereas the other, uncorrelated relationships look more spherical: if you sample variables a and b and it's just a circle, there's no mutual information. And interestingly, the covariance matrix looks a lot like the Hessian matrix, which provides evidence that, if what we care about is the covariance amongst the states, we get a very similar structure when we use the second-order polynomial approximation to those states. And it turns out it does pretty well if you want, for example, to estimate the first state given the second. What if we could only measure two dimensions? We wouldn't want to pick one and two, because they're basically correlated in their measurements; we would want to pick two and three, and then we can do a really good job of predicting one given two. Which is to say that the informational dimension of this system is between two and three; it's two and a half, basically. So those are some of the connections between covariance amongst different dimensions and polynomial expansions of dynamical systems based upon flow
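The "predict state one from state two" idea can be sketched with a toy Gaussian whose precision (Hessian-like) matrix has the same kind of zero pattern; all of the numbers below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy precision matrix: states 1 and 2 coupled, state 3 on its own.
H = np.array([[2.0, -1.5, 0.0],
              [-1.5, 2.0, 0.0],
              [0.0, 0.0, 1.0]])
Sigma = np.linalg.inv(H)        # covariance mirrors the Hessian's structure

X = rng.multivariate_normal(np.zeros(3), Sigma, size=50_000)
C = np.corrcoef(X.T)
print(np.round(C, 2))           # strong 1-2 correlation, near zero with state 3

# Conditional (regression) prediction of state 1 from state 2:
beta = Sigma[0, 1] / Sigma[1, 1]
residual_var = np.var(X[:, 0] - beta * X[:, 1])
print(residual_var)             # ~ 1 / H[0, 0], the conditional variance
```

Measuring states one and two together is partly redundant, which is the sampled-data version of the "informational dimension between two and three" point.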
approximation. And then, just to summarize here, we can take any questions, and in the last 20 minutes we'll look towards the Markov blanket; we still haven't even gotten to the Markov blanket yet. This is just the foothills of Mount Markov. They write that it's important not to conflate the simplicity of the non-equilibrium steady-state density, that's this one, with the complexity of the underlying density dynamics. The map is not the territory; you just cannot say it enough. Simple approximations are not claims about system simplicity. In other words, when prepared or observed in an initial state, the probability density can evolve in a complex and itinerant fashion on various sub-manifolds of the pullback attractor, and they're going to use information length to talk about that. But the important thing to know is just that our approximation has attributes that resonate with the original underlying system's dynamics, as well as attributes that we specifically know are different, or that might be different or not; but no matter what, they're not claims about the underlying system. So: this is not a pipe. Stephen. Yeah, I think this also speaks to this idea of triangulation, or multimodality. Like I say, this isn't directly speaking to what's out there, or what the system is, or whatever the things are that are being inferred. So the more approximations that can be triangulated, as I say, which is kind of what happens in qualitative research, qualitative research is all about the triangulation of research, not about exact measures and accuracy within research, then you can get a better grip, potentially, on what might be out there, if that makes sense. So with that pipe, you know, I've only got one aspect of that reality, so I can only resolve it to a certain degree, and it is what it is; I don't know how big it is as a painting, you know, until I see it in a gallery, etc.,
etc., right. So yeah, I think that the need to find an approximate way in means that you need to have many approximate ways in, to start to build a higher-level confidence in what you're thinking. Thanks Stephen. Dean. Yeah, Daniel, can you go back to slide 48 for a second? Yes. So this is the fascinating thing about contrast and proportion, because you've got nine squares up in the upper right there, with the Jacobian and the covariance and the Hessian. Trying to translate that, and I understand this is the big one, this is sort of the grail of translation: to translate the contrast and the proportion that you see there to fresh eyes, to new eyes. They might look at that middle one with a completely different sense of what that pattern is describing for us than the two outside slices of bread; and to us, because we're a little bit more familiar with the contrast and the proportion, you can actually see where the Hessian matters from a statistical standpoint. But drop that into a context now where you've got mountains and rivers and forests and streams: how do we take that information there, which is really quite clear to people who are familiar with it, and translate it to the context, the situation, in which we find ourselves, and process towards a converged sense of context? What does this envelope tell us, right? We're inside of this cube now; how do we get outside of it and use this tool to re-enter the context and have an agreed-upon sense of what covariation means? This is the grail. If we can set up our learning experience as a situated one, where we take this tool in but don't confuse the tool, as you say, don't confuse the math with the field, again, it's fantastic if you can get a whole bunch of people seeing: what's the big picture here, what's the tool, what's the contrast and proportion within the tool, and then what is the context, what's all of this fractalized stuff out there in different colors and different shapes
and different sizes? How do we put those two things together into some sort of coherent narrative point? So in the last 10 to 13 minutes we're going to finally converge on the M word, Markov blanket, and then next week it would be awesome to speak with any of the authors, or anyone else who would like to share their perspective, or we'll continue, because we're basically halfway through and we haven't gotten to the core piece, the Markov blanket. So we're going to just look at where we're going in the last little bit of dot one, and then in dot two we're going to land on that Markov trampoline and see where we go. Okay: in Section 4 they repeat the analyses of the previous section, approximating two Lorenz systems coupled to each other through their respective first states. This induces a richer conditional independence structure, from which one can identify internal and external states that are independent when conditioned upon blanket states. This is the more classical Markov blanket definition: blanket states are those that, when conditioned upon, make internal and external states conditionally independent. That's a definition we've seen before, and the internal, external, and blanket states are all co-instantiated; it doesn't make sense to talk about one without the others, and we've mentioned that before. So Figure 7 is a lot like Figure 3, where we were looking at those three projections of the cube of one Lorenz system; now we're looking at two, and it's pretty clear, from looking at their trajectories as well as just from their movement in the state space, that these are pretty synchronized. Some coupling has been instantiated, no different than if you had two engines and connected them with a heat-transfer device: they're going to have coupled temperature fluctuations, except now it's more statistical. Figure 8 is a lot like Figure 4. In Figure 4 we had this three-by-three matrix
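The two-system setup for Section 4 can be sketched as follows. The linear coupling through the first states, its strength `k`, and the initial conditions are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

def lorenz(x, s=10.0, r=28.0, b=8.0 / 3.0):
    return np.array([s * (x[1] - x[0]),
                     x[0] * (r - x[2]) - x[1],
                     x[0] * x[1] - b * x[2]])

def coupled(z, k=1.0):
    # Two Lorenz systems talking only through their first states.
    x, y = z[:3], z[3:]
    fx, fy = lorenz(x), lorenz(y)
    fx[0] += k * (y[0] - x[0])
    fy[0] += k * (x[0] - y[0])
    return np.concatenate([fx, fy])

def rk4_step(z, dt, k=1.0):
    k1 = coupled(z, k)
    k2 = coupled(z + 0.5 * dt * k1, k)
    k3 = coupled(z + 0.5 * dt * k2, k)
    k4 = coupled(z + dt * k3, k)
    return z + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

dt, steps = 0.005, 20_000
z = np.array([1.0, 1.0, 1.0, -5.0, 3.0, 20.0])
traj = np.empty((steps, 6))
for i in range(steps):
    z = rk4_step(z, dt)
    traj[i] = z

# Sample correlations over the six states: the single 1-4 coupling leaves
# its signature in the off-diagonal blocks, analogous to Figure 8.
print(np.round(np.corrcoef(traj.T), 2))
```

A single sparse coupling term is enough to induce statistical structure across all six states, which is the point being made about Figure 8.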
describing the first partial derivatives, the second partial derivatives, and the covariances of states one, two, and three, where the on-diagonals were states with themselves and the off-diagonals were states with other states. Now we're going to have three plus three states, two sets of three, two coupled Lorenz systems: one, two, and three are the first Lorenz system, and four, five, and six are the three states of the second. You see in the Jacobian that there's high correlation within each Lorenz system, and then there's that one-and-four cell with a strong value: that's the coupling, the cell that represents the coupling between the upper-left quadrant and the lower-right quadrant. And that induces some extremely interesting covariance structures, which is what we want to investigate. It turns out that by looking at the Hessian of this system, some partitions fall out, and those partitions can be numerically analyzed to show that, basically, there are partial correlations that approximate the Hessian of the coupled Lorenz systems. So we can use the Hessian, which, via the second-order polynomial expansion, we already established is basically adequate; we can use the Hessian of our approximation, not of the system, it's not that the system has this Hessian, it's our model's, and that Hessian approximation aligns very well with the partial correlations. That's demonstrated by showing that two pairs of states that shouldn't have correlation, the third and sixth and the fourth and fifth, so here, three and six are not correlated: we can show that even if they start out very correlated, like they just happen to start at similar values, the dynamics of the system actually move them towards a zero or even slightly negative partial correlation. So our Hessian approximation is giving us deep insight into the partial correlations. And then I think a great place to pause is with this connection to the graphical Bayesian network that we've looked at before, with that same partitioning into four states: internal,
external, and blanket states. Now, Friston and others then partition the Markov, slash Pearl, blanket states, which are of one kind, into two kinds of blanket states, incoming sensory states and outgoing active states, connecting with perception, cognition, action, control theory, and planning as inference. This is an image we've seen before, and here are those six states: one, two, and three are Lorenz system one; four, five, and six are Lorenz system two; and one-and-four is the only cross-coupling that we're putting into the model, this cell right here. So we're cross-coupling one and four: you have two neurons, and there's a little tin-can telephone between the two, and that's inducing a global structure across all six nodes. So you don't need every neuron wired to every neuron; there's sparsity in this connection structure. And technical details aside, that is going to relate to the partitioning, which we're going to derive on the Hessian, the second-order polynomial expansion of our model. So I think that's a great place to stop for today, and then, just to close, we can have last thoughts or discuss what we want to talk about next week. Just to reiterate: there's no claim that either the original Lorenz system or the coupled Lorenz system possesses a Markov blanket. The territory doesn't have a Markov blanket; our coupling doesn't even have a Markov blanket. The claim here is that there exists a Laplace approximation to these kinds of systems that, in virtue of the zero elements of its Hessian, features Markov blankets. So "has a Markov blanket," "is one," "wants to be one," "did have one": those are speculative interpretations that maybe future work will discover to be the case or not, but what this paper actually claims is very far from that. Blue, and then Stephen. Okay, so here we're going to stop just as it's getting really good; it's the cliffhanger for next week. I just wanted to leave off with where this figure is in the paper. Right under
that figure, it says the names "active" and "sensory," or "blanket states," inherit from the literature, where these states are often associated with biotic systems that act on and sense their external milieu. So I'm like, why are they imputing sensory states onto particles? This is where the great leap in the paper kind of happens for me, and I'm like, okay, now my particle is sensing. And to me, okay, you guys know, you've seen me, I'm pretty comfortable with panpsychism overall, and it doesn't really make a big difference for me, I'm still very interested, but it's something that a lot of people, I think, are going to get maybe tripped up by. So I would like to hash that out, and to say, well, there's absolutely no claim that real systems have a Markov blanket, you know, like the skin. Are we going to be talking about real systems, and how this maps onto physical components of the real world and the mechanisms of sensory transduction, or is this about the Hessian of the Laplace approximation? Sometimes the two are very close to each other in the text, and that can make it a little bit ambiguous. Stephen. Yeah, very, very cool; Blue makes a good point. I suppose what I'm sensing here, and correct me if I'm wrong, is that this is showing a way to demonstrate that patterns can exist within the covariance of attractor states, or attractors just in themselves; it could just be a blob of stuff, right, in amongst blobs of things and stuff and particles. And then it also shows that there can be the ability for those dynamics to partition out. So it kind of gives two things: it gives the partitioning-out piece, and it also gives a way to have a blob, this is a very technical term, a blob of very useful stochastic chaotic states which don't need to be themselves blanketed, that could become just sensory states, for instance, or could be a reservoir within a bigger reservoir. Would that be a correct assumption? If we truly
had no difference between our approximation and our incoming observations, like we had a perfect model, then maybe there'd be a valid claim that the map was the territory, and for systems that we define in software it actually can be that way. But for any real system, there's always going to be a delta between the incoming observations and our approximation, and so the claim here is that it's the Hessian, the second-order polynomial expansion of our approximation, that has some very nice mathematical properties, not that the system has those or any specific properties. I would just speak to the idea that the skin is a great example, because you can look at it only as a product, as a limit, or you can look at skin as a process; I mean, we know all the processing that's going on there. So again, do we collapse it down to one type of function, as a product, as a limit, or do we see it as a process as well, as something that's dynamical, something that's constantly changing? And then that opens up the idea of what "between" means. Now, I don't know if it takes it down to the particle level, it seems like it does, at those Planck scales, but I'm not really worried about that so much as I'm interested in the idea of what betweenness is, because I don't know that skin is automatically a partition; sometimes it's a process as well. So if we keep that open, that'd be something interesting to take up next week. Great. Yeah, and if anyone else has any comments: what a fun paper. In dot one we hit a lot of the warm-up stretching and preliminary topics and contextualization that brought us right to the vista of Mount Markov, with further clouds behind even that, but now we know that we can step into dot two with a backpack that's prepared. And these are really interesting, and clearly either unresolved and/or uncommunicated and/or unapplied, questions in the literature and the state of the art: what is the relationship between these formalisms and
specific nouns and verbs in the world, the kinds of things that we're semantically engaging with? And if it turns out that the semantics of a Markov blanket are not like the semantics of any other thing, or they're importantly different from the semantics of what we currently feel familiar with, great; or it would be interesting if they do overlap with, or encompass, or nest within some other semantics that already exist. But that's the question. So we can just look at our final closing questions, and if anyone has a thought, otherwise: 32.2 is next week, on November 16th, and we hope to hear from you. If you would like to join the discussion live, just get in touch with us, or contact us at activeinference at gmail.com, or just show up and ask questions or make comments in the live chat. Interesting stuff. We were a little trepidatious with such a complex paper, but I think bringing out the simplicity, the simple rules underlying complex behavior, and how our approximations can be tractable even when the world is just infinitely surprising: those are some of the fundamental questions that this paper and active inference address. So Blue, Stephen, and Dean, great work, and we will see everybody next week. Bye.