Hello and welcome, everyone, to the ActInf Lab guest stream number 7.1. 007.1, actually, a very nice number. It's May 28th, 2021. And we have some very special guests on today, who I will let introduce themselves, and then we'll go into a presentation followed by a discussion. So any and all questions and comments, just put them into the live chat during the presentation, and I'll be compiling those so that we can make sure to ask them. So please, Miguel, go for it.

Okay, hi. Yeah, I'm Miguel Aguilera. I've been working as a postdoc at the University of Sussex for a year now.

Yeah, so my name is Christopher Buckley. I'm a lecturer at the University of Sussex in both the Sussex Neuroscience and Sussex AI groups, interested in sensorimotor control in general.

Cool. So Miguel, share your screen. I'll resize it, and then we'll have a presentation; take as long as you'd like.

Okay, thanks a lot for the invitation. I'm really happy to present our work here. It's been really fast, because we just released the preprint a few days ago, but it's cool to have the opportunity to do this discussion online so soon. Okay, let's see. Okay, can you see my screen?

Yep. So I'll resize it now, so go for it.

Okay, okay, so this is a preprint that we have just published. I, Miguel Aguilera, wrote it with Beren Millidge, Alexander Tschantz and Christopher Buckley, all from Sussex except for Beren now. The work focuses on some of the recent publications about the free energy principle. I guess most people watching will know that the free energy principle was conceived originally as a unified brain theory, integrating experimental data and trying to describe the relationship between action, perception, learning and so on. More recently, the theory has evolved and tried to become more general, stating that any non-equilibrium dynamical system under some special conditions can be interpreted as performing Bayesian inference.
So this would be a broader theory of living systems, or a way to describe certain living systems using the tools of Bayesian inference. And okay, so in principle, this kind of schema of how a brain predicts an environment could be used for different systems, like a cell or different living beings. I imagine many people familiar with the free energy principle have seen this kind of picture a lot. We are trying to draw different ones, and I think it's maybe a good thing to try to visualize things in new ways, because these ones have been used so many times.

Okay, so the idea is that we are interested in exploring the different assumptions behind the principle, because in theory the free energy principle applies to very diverse systems, but in practice there have been very few attempts to explore all the steps of the theory in specific systems. What we want to do here is just that. We want to understand the free energy principle and its mathematical assumptions in the simplest non-equilibrium system that we can find. This is a linear Langevin dynamics that I will explain later, with the assumption that the couplings between elements are weak. In this very special case, we can compute all the relevant variables analytically, which makes it a very interesting case for studying all the mathematical details in depth.

Okay, so the work I'm presenting today is related to the latest publications around the ideas of the FEP: the monograph "A free energy principle for a particular physics" and the most recent paper by Thomas, Lance and Karl about Markov blankets, information geometry and stochastic thermodynamics. Specifically, we want to focus on two questions. The first question is: how general is the free energy principle?
That is, of all the systems we can imagine that capture aspects of biological systems, how many conform to the assumptions that the principle requires? And the second question is: how informative is the principle about the behavior of a system? So if the first point holds, if a system conforms to all the requirements of the principle, then how much do we know about the system? How much is the principle able to inform us about the behavior of this organism?

Okay, so we find two issues that I will explain later in detail. One is that some of the requirements of the principle make it challenging to find systems, within the class of systems we explore, that match the required assumptions. The second issue is that one important step in connecting the behavior of the system to the gradient of the free energy involves decoupling from the history of interactions of the system, and this creates problems for interpreting a system as behaving as if minimizing a free energy. I will explain that in detail.

So before starting, just one quick note on the notation we are using, which is a bit different from the papers and some of the literature. We describe the full state of the system as composed of external states y, sensory states s, active states a, and internal states x. Also differently from the papers, bold symbols here represent vectors and matrices. So expectations, which were bold in some of the papers, are represented here by these m variables, like m_y, m_s, and so on. Sometimes the theory uses conditional expectations given a blanket state, and in this case the conditioning state goes in parentheses: this is the mean of y, the mean of external states, given b. We can also compute time derivatives as usual, and we will represent marginal flows as these f variables that I will describe here.
Okay, so first we try to summarize the different conditions and assumptions that one needs to derive the free energy principle. I'm going to give a brief summary, which will be useful for later. So first, the free energy principle assumes we have a system that we can describe with a Langevin stochastic differential equation. This just has a noise term, this omega, which is random, uncorrelated white noise, and some flow function f, which could be any potentially nonlinear function and which describes the deterministic part of the evolution, the time derivative, of the state of the system. The noise omega is zero-centered, and its covariance is just a diagonal matrix gamma, because the noise is independent between elements.

Okay, so the first condition that the free energy principle requires is that the flow function decouples what are generally called autonomous and non-autonomous states. Meaning that internal states x are not going to influence external and sensory states, and external states are not going to influence active and internal states. This is what we generally expect from a sensorimotor loop in an agent. So if we represent the evolution of the system as an Euler integration, the external states do not influence active and internal states, and internal states do not influence external and sensory states. Okay, so we are going to restrict our analysis to systems that have this sensorimotor structure.

The second condition of the principle is that the system has a global attractor. That is, the probability distribution of the state of the system converges to some specific probability distribution, which is called p, and this p can be described by the surprise, which is the negative logarithm of the probability: how surprising or how unlikely a state is.
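The Langevin dynamics and the Euler integration mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `simulate_langevin`, the four-variable coupling matrix `J`, the linear flow, and the convention that the noise has covariance `2 * gamma` per unit time are all assumptions made for the example. The zeros in `J` encode the sensorimotor structure of the first condition.

```python
import numpy as np

def simulate_langevin(flow, z0, gamma, dt=0.01, steps=2000, seed=0):
    """Euler-Maruyama integration of dz/dt = flow(z) + omega.

    Assumed convention: omega is white noise with covariance 2 * gamma
    per unit time, independent across elements.
    """
    rng = np.random.default_rng(seed)
    z = np.array(z0, dtype=float)
    path = np.empty((steps + 1, z.size))
    path[0] = z
    for t in range(steps):
        noise = np.sqrt(2.0 * gamma * dt) * rng.normal(size=z.size)
        z = z + flow(z) * dt + noise
        path[t + 1] = z
    return path

# Hypothetical couplings, ordered (y, s, a, x); entry [i, j] is the
# influence of variable j on variable i.
J = np.array([
    [-1.0, 0.0, 0.2, 0.0],   # y: driven by y and a, never by x
    [0.3, -1.0, 0.0, 0.0],   # s: driven by y and s, never by x
    [0.0, 0.0, -1.0, 0.2],   # a: driven by a and x, never by y
    [0.0, 0.3, 0.0, -1.0],   # x: driven by s and x, never by y
])

path = simulate_langevin(lambda z: J @ z, z0=np.zeros(4), gamma=0.5)
```

The flow passed in here is linear only to keep the example short; any function of the state could be used in its place.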
And this condition proposes that this global attractor, this steady-state distribution, can be described by a Helmholtz decomposition, or a variant of that decomposition, which relates the flow function to the surprise via two matrices, Q and gamma. Gamma we saw before; it was just the covariance of the noise. And Q is a matrix capturing what are generally called the solenoidal flows in the system. Q is antisymmetric, and it captures the non-equilibrium influences in the system. If a system is at thermodynamic equilibrium, then Q is going to be zero for all elements. But if the system has an exchange of matter or energy, or has some asymmetry in the interaction between its parts, Q is going to be non-zero. Okay, so we are going to have these diffusion terms, gamma, which are these dotted lines, and then we are going to have couplings between elements with these solenoidal influences, which are captured by these blue arrows here, these wiggly arrows. Okay, so that's the second condition.

And the last condition that the free energy principle requires is the Markov blanket condition. This is maybe very well known. It requires that the sensorimotor states, s and a, which are generally called blanket states and which we are going to call b, comprising the two, play a very special role, decoupling internal and external states in the following way. If we compute the conditional distribution of external and internal states for a fixed b, these states are going to be independent. So they are not independent in general, but they are independent for a specific blanket state. Okay, so if these arrows are statistical dependencies, given b we still have y and x, but they are not statistically correlated for a given b.
Okay, and so in this sense, these blanket states are going to be like a boundary or a barrier between external and internal states. Okay, so these are the conditions required by the free energy principle, in our view, meaning that these are the prerequisites that we require for a system to display the properties proposed by the free energy principle. But aside from that, we interpret that the theory has three extra assumptions. These are not conditions, not prerequisites, but things that the theory expects to happen, or expects to find at least in some cases.

To explain this, we should first explain how the theory proposes that variational free energy is involved in the behavior of a system. The free energy principle proposes that any system under the conditions I described can be seen as behaving as if minimizing a variational free energy. To define this free energy, we start with the surprise of the blanket state. The blanket state is what an agent can observe of the environment. The agent cannot access the environment directly, because of the sensorimotor structure that we have; it can only access the blanket state, the sensors and the actuators. If a system is behaving as if minimizing the free energy, or doing Bayesian inference, it is going to behave as if minimizing the surprise of this state. But the system cannot access this probability distribution directly, because it cannot access the environment. So instead, Bayesian inference prescribes using an upper bound on the surprise, which is called a variational free energy. This variational free energy consists of the surprise term plus a KL divergence between the actual probability distribution of the environment given the blanket and a model q, a simplified model of the environment. And Bayesian inference consists of finding the parameters theta of this model that minimize this distance.
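The definition above, free energy as surprise plus a KL divergence, can be made concrete with a one-dimensional Gaussian sketch. All numbers and names here (`free_energy`, `posterior_mean`, the learning rate) are hypothetical choices for illustration; the model q is assumed to be a Gaussian whose mean is the parameter theta and whose variance matches the true conditional.

```python
import numpy as np

def gaussian_surprise(value, mean, var):
    """Negative log density of a 1-D Gaussian."""
    return 0.5 * np.log(2 * np.pi * var) + 0.5 * (value - mean) ** 2 / var

def gaussian_kl(mean_q, var_q, mean_p, var_p):
    """KL divergence KL(q || p) between two 1-D Gaussians."""
    return 0.5 * (np.log(var_p / var_q)
                  + (var_q + (mean_q - mean_p) ** 2) / var_p - 1.0)

# Hypothetical numbers: blanket observation b, marginal p(b),
# and true conditional p(y | b).
b, blanket_mean, blanket_var = 1.0, 0.0, 2.0
posterior_mean, posterior_var = 0.8, 0.5

def free_energy(theta):
    """Surprise of b plus KL between the model q(y; theta) and p(y | b)."""
    surprise = gaussian_surprise(b, blanket_mean, blanket_var)
    return surprise + gaussian_kl(theta, posterior_var,
                                  posterior_mean, posterior_var)

# Gradient descent on theta; the gradient of the KL term with matched
# variances is (theta - posterior_mean) / posterior_var.
theta0 = -2.0
theta = theta0
for _ in range(200):
    theta -= 0.1 * (theta - posterior_mean) / posterior_var

surprise_b = gaussian_surprise(b, blanket_mean, blanket_var)
```

Because the KL term is never negative, `free_energy(theta)` stays above `surprise_b` for every theta and touches it only when the model matches the true conditional.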
So when this distance is minimized, the agent has a model of the world and is predicting the states of the environment y. The simplest way of solving the Bayesian inference problem is just doing a gradient descent minimization of this free energy, which ends up being a gradient descent on the surprise of the blanket state. And then we have the environment y at its most likely state, theta, given the blanket. Okay, so here theta just parameterizes the most likely state of y.

Okay, so then the free energy principle defines the most likely states of the system and environment, fixing a blanket, as the mode, or the argmax. But sometimes it has been proposed that the mean is enough for doing this. And since we are going to use linear systems, both are going to be the same. So you can interpret this m as the mode or as the mean of y and x given b.

And then, if you remember, we have these flow functions. You can compute the average flow of the system, of y for a given b, which we saw in this assumption can be described by these matrices Q and gamma and the gradient of the surprise. So if we want to find something like the gradient of the surprise that we have here, the free energy principle proposes that we have to compute the average of the flow given b, and we obtain something like this. We can see here that the first term is just what we want, the term with the gradient of the surprise, but we have some extra terms. If we had only something like the first term in this equation, we could interpret it as a way to perform a minimization of this free energy, which, interpreted strictly, would mean having some variable theta that follows this gradient. Oh, sorry, one second. Okay.
So the free energy principle suggests that this parameter theta is going to be the most likely external states, or then the most likely internal states. So we have this equation here, this equation 12, and we just need the first term. In general, the free energy principle proposes that the couplings of this matrix Q, the non-block-diagonal ones, are going to be zero. Okay, so each element in this diagonal is a block and can have any number of dimensions. And one interpretation is that the free energy principle assumes that by removing these solenoidal couplings, these non-block-diagonal couplings, we remove the elements that we don't want, and we have a flow of the system equal to this gradient of the surprise, which is the gradient that minimizes the free energy.

Okay. So this is the first assumption. If these Q terms are equal to zero, then we have an average flow that goes in the direction of minimizing the free energy. And this is interesting because we will have a system that points in the direction of minimizing the free energy, and, well, we will explain that later. However, in some works it is proposed that there is a more general form of this matrix that is also block diagonal, but not with separate blocks for y, s, a and x; instead it has blocks for non-autonomous and autonomous states, so y and s, and a and x. This has been referred to as the general case, but for us at least it's not that clear how you can derive this gradient of the free energy in this case. What has been suggested is that the solenoidal couplings can cancel out, but yeah. So we are going to consider the two cases, but focus in principle on the case of a complete block-diagonal matrix. So this first assumption means that the solenoidal couplings between blocks are zero or can be approximated as zero. Okay.
And the second assumption is, okay, so we have a flow of external states that's pointing in the direction of minimizing the free energy, but we want to interpret internal states as minimizing this free energy. So what does the free energy principle propose to solve this? The proposal is that there is a smooth mapping sigma that connects the most likely internal states with the most likely external states. So m_y is equal to sigma of m_x. And the assumption is that the derivative of this mapping exists, so the mapping is differentiable, and that it is invertible. And the proposal is that a sufficient requisite for this assumption to hold is that the mapping from b to m_x is injective. Okay. So if this mapping exists, then we can connect the external states with the internal states, and we know that external states have a flow that points in the direction of minimizing the free energy. And so the hope is that internal states, if this mapping exists, are also going to have a flow that points in the direction of minimizing the free energy.

And the last assumption, and this is maybe more controversial, is that if one takes a strict interpretation of the gradient descent on the free energy, the dynamics of the most likely states are approximately equal to the average flow of the external states. So the left side is the derivative of the average state, and the right side is the average of the derivative. So this is saying something like: the derivative of the average is going to be approximately equal to the average of the derivative, because the flow is just the derivative of the system. So if this assumption is true, then not only does the flow of external states point in the direction of minimizing the free energy, but the dynamics of the system are also minimizing the free energy.
And this is interesting because, if this is true, then we have an equation like this: the derivative of m_y is equal to the gradient, with respect to m_y, of this surprise. So the dynamics of m_y behave as if minimizing the free energy. With this equation 22, we know that the dynamics of the system are going to be actively performing a gradient descent on the free energy. And if this is true and we have the mapping that assumption 2 proposes, then we can use the chain rule to find the connection between the derivative of m_y and the derivative of m_x, so between the dynamics of the external states and the dynamics of the internal states, just with the gradient of this sigma function. This is just the chain rule of derivatives. With this step, we can describe the dynamics of internal states in terms of the dynamics of external states and use this gradient of the mapping to reach this equation 23, which tells you that the dynamics of the most likely internal states are doing a gradient descent on the free energy. So this is the end of the argument of the free energy principle. If this is true, then the internal states will be behaving as if minimizing the free energy about external states, and then the internal states will be behaving as if effectively performing Bayesian inference.

We derived that from some of the steps in the monograph "A free energy principle for a particular physics" and the most recent paper by Thomas and colleagues about Markov blankets, information geometry and so on. At least it appeared to us that they were using the same symbol for the derivative of external states and for the average flow, and in some equations there is a connection between them, and for the chain rule they use this symbol. So this is why we interpret it in this way. However, also from conversations with the authors, it's proposed that the free energy principle is not exactly proposing that about the behavior.
So an interpretation of this would be that the system is not doing a strict gradient descent as in this assumption 3, which we label with a star because it is in part our interpretation. The alternative to this assumption 3 star would be assumption 3 star star, which is that it is not the derivative of the most likely state that follows the gradient of the free energy, but just the average flow. The average flow of y is just the average of the derivative of y under the conditional distribution, and the claim would be that if the average derivative points in the direction of minimizing the free energy, then you can interpret the system as behaving as if, on average, it were performing a gradient descent. So this assumption 3 star star would be not that the system is exactly doing a gradient descent on the free energy, but that it is doing so on average. If we had many observations of the system, then we could see that on average they point to a minimum of free energy, but maybe not if we just observe one system individually. Okay, so this assumption 3 star star would be exactly that: if the conditional average flows follow the direction of gradient descent on the free energy, then the behavior can be described, on average, as if minimizing the free energy.

Okay, so this is the summary of the different conditions and assumptions of the principle, in which we have competing interpretations at some points, like this assumption 3. This is a lot of steps, and it is hard to have an intuitive idea of what we can expect from any of them. So what we did is we tried to explore these different conditions and assumptions in a system that is very easy to solve, and see what we can expect in this simplest case.
So our idea is that the best way to really understand this is to have a toy model, the simplest we can have, and see how easy or hard it is for the different assumptions to hold in the system. Okay, so we substituted the Langevin dynamics that we had before with just a linear Langevin dynamics. Everything is the same; well, we have a factor for the noise, but the flow function f is now just a linear multiplication. So we have a matrix J, which is invertible and real, multiplying the state of the system minus some constant rho, which is just an offset capturing the average state in the steady state. Okay, and this system is going to be a non-equilibrium system if the matrix J is not symmetric. In that case, we are going to have, for example, spiral flows and so on.

And the interesting thing is that this system has a steady state. If we let the system run for some time, it's going to reach a steady state that is a Gaussian distribution, because everything is linear and the noise is Gaussian. In the limit of time going to infinity, the average of this Gaussian is going to be just this rho parameter, and the covariance matrix of this multivariate Gaussian is going to be this Sigma star matrix. This Sigma star matrix cannot in general be solved analytically, but it's the solution of this equation 29. It's an equation that involves the covariance of the noise, gamma, and the couplings J. In equilibrium, this equation can be solved analytically, but we are interested more generally in non-equilibrium systems. And this equation is called a continuous Lyapunov equation.
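A numerical solution of the continuous Lyapunov equation mentioned above can be sketched with a Kronecker-product linear solve. The equation is written here as J Σ + Σ Jᵀ = −2Γ, which assumes the noise has covariance 2Γ per unit time; the talk's own factor for the noise may differ. The function name `steady_state_covariance` and both example matrices are hypothetical.

```python
import numpy as np

def steady_state_covariance(J, Gamma):
    """Solve J @ Sigma + Sigma @ J.T = -2 * Gamma for Sigma.

    Vectorising the equation gives (J kron I + I kron J) vec(Sigma) =
    vec(-2 * Gamma), which is an ordinary linear system.
    """
    n = J.shape[0]
    I = np.eye(n)
    A = np.kron(J, I) + np.kron(I, J)
    sigma = np.linalg.solve(A, (-2.0 * Gamma).reshape(-1))
    return sigma.reshape(n, n)

gamma = 0.5
Gamma = gamma * np.eye(2)

# Non-symmetric J: a non-equilibrium system with spiral flows.
J_spiral = np.array([[-1.0, 0.5],
                     [-0.5, -1.0]])
Sigma_spiral = steady_state_covariance(J_spiral, Gamma)

# Symmetric J: the equilibrium case, where Sigma = -gamma * inv(J).
J_sym = np.array([[-1.0, 0.3],
                  [0.3, -1.0]])
Sigma_sym = steady_state_covariance(J_sym, Gamma)
```

The symmetric case reproduces the closed-form equilibrium solution, which gives a quick sanity check on the solver before applying it to non-symmetric couplings.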
And in principle, it can just be solved numerically, which is not ideal for us, because we want to have access to the connections between the parameters and the matrix, and in general we cannot. Okay, so what did we do? In order to have a general solution of this equation, we assume the following. We assume that the couplings J can be defined as some diagonal term, which is negative just to ensure the stability of the system; if the diagonal were positive and large, you would have a diverging system. And then we have non-diagonal couplings C, and we assume that these C terms are small. So if we have C squared or C to the power of three, these are going to be negligible terms. In that sense, we can solve this equation in terms of a power series expansion, which is equivalent to the Taylor expansion that you see in calculus. So we can have an expression for Sigma, the covariance matrix of the solution, in terms of increasing powers of C. We have here the linear term, the quadratic terms, and some cubic terms that we are going to ignore. Okay, and to make things easier, we are going to assume that the noise is homogeneous, so the noise covariance is just the identity matrix times some scalar constant.

Okay, so we choose values of the diagonal term so that conditions one and two are almost automatically true. If the diagonal is negative and larger in magnitude than C, we have a global attractor. And we can also define J to have the sensorimotor structure that we want. So we go directly to condition three. Condition three is the Markov blanket condition, proposing that the blanket states, the sensorimotor states, play a very special role as a boundary between external states and internal states, so that in the conditional distribution given the blanket, y and x are independent.
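The weak-coupling power series described above can be checked numerically. The sketch assumes J = −a·I + C with a hypothetical scalar `a` for the diagonal term, homogeneous noise Γ = γ·I, and the Lyapunov equation in the form J Σ + Σ Jᵀ = −2Γ. Matching powers of C in that equation gives the first- and second-order terms used below; the coupling pattern in `C` is an illustrative choice.

```python
import numpy as np

def steady_state_covariance(J, Gamma):
    """Numerical solution of J @ Sigma + Sigma @ J.T = -2 * Gamma."""
    n = J.shape[0]
    I = np.eye(n)
    A = np.kron(J, I) + np.kron(I, J)
    return np.linalg.solve(A, (-2.0 * Gamma).reshape(-1)).reshape(n, n)

def covariance_series(C, a, gamma, order):
    """Power series of Sigma in C, truncated at the given order (0, 1 or 2)."""
    n = C.shape[0]
    sigma = np.eye(n)
    if order >= 1:
        sigma = sigma + (C + C.T) / (2 * a)
    if order >= 2:
        sigma = sigma + (C @ C + 2 * C @ C.T + C.T @ C.T) / (4 * a ** 2)
    return (gamma / a) * sigma

a, gamma, c = 1.0, 0.5, 0.05
# Hypothetical weak couplings, ordered (y, s, a, x).
C = c * np.array([
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
])
J = -a * np.eye(4) + C

exact = steady_state_covariance(J, gamma * np.eye(4))
err1 = np.abs(exact - covariance_series(C, a, gamma, order=1)).max()
err2 = np.abs(exact - covariance_series(C, a, gamma, order=2)).max()
```

Here `err1` is of order c squared and `err2` of order c cubed, so the truncation improves as the couplings get weaker.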
Okay, so in a Gaussian distribution, this condition is true if and only if the Hessian matrix, which is the inverse of the covariance, satisfies that the Hessian between y and x, between external and internal states, is equal to zero. Remember that this can be a matrix if the dimensions of y and x are more than one. Okay, so we can very easily compute the form of this Hessian with these approximations. So we have the question: how common are Markov blankets among all the systems in the world, and how often can we expect to find a Markov blanket if we just have random parameters?

Okay, so we want this matrix H, which is the inverse of the covariance of the steady-state distribution. We can express this inverse again as a power series, which is called a Neumann series for computing the inverse of a matrix, and it has this form. We have again a linear term, a quadratic term, and some cubic terms that we're going to ignore. Okay, so we just need to know C to know what H is and to know if this corner of the matrix is equal to zero. C is going to be like this, because we impose this specific sensorimotor structure in which external states don't influence internal ones, internal states don't influence external ones, and so on.

And so for the first-order approximation, we have this equation 36. For a first-order approximation, we just keep this term of the equation, which is C plus its transpose, just switching rows and columns. And we can see that if we sum C plus its transpose, the term in the corner is going to be equal to zero, which is what we want. So for a first-order approximation, we have an exact Markov blanket, and the Markov blanket condition is always going to be true. So this is good news in principle. However, if we want more precision and we compute the second order, so we get these quadratic terms here, we find that the corner of the Hessian, the one that has to be zero, has this form.
So in principle, this is not going to be zero, or is going to be zero just for very special cases. If we look at the connectivity structure of the system, the kind of sensorimotor loop that we require in general is this canonical loop in figure A. For this structure, very few systems are going to have a Markov blanket, because this term is almost never going to be exactly zero. However, there are some other configurations that are going to have a Markov blanket. If we have a circular loop like this case B, or we have something like this structure in C, then of the terms appearing in this equation, at least one in each product is zero in both cases, and this is going to be zero. So B and C are going to have a Markov blanket in general, but not A, under a second-order approximation. So under these canonical flow constraints, case A, only a few systems are going to display an exact Markov blanket, with the exceptions being the circular loops or the symmetric loops.

However, we have to say that in the case of weak couplings, the Markov blanket is still going to be a reasonable approximation. Because even if we have something like this, this is a second-order term, and all the other elements in the matrix are going to have first-order components that are going to be much larger. So even if we don't have an exact Markov blanket, it's going to be a reasonable approximation in most cases. Okay, so this is condition one, condition three, sorry. So let's go to the next one.

The next one is assumption one, which states that the solenoidal couplings between blocks of states have to be zero in order to have a flow that points in the direction of minimizing the free energy. Or we could relax this assumption and have these larger blocks, but also requiring some zeros in here.
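The Markov blanket check described above, whether the corner of the Hessian between y and x vanishes, can be sketched numerically. The two coupling patterns below are hypothetical stand-ins (a densely connected sensorimotor loop and a circular loop), not the talk's figures. The second-order expression (CᵀC − CCᵀ)/(4aγ) is what matching powers of C gives under the assumed convention J = −a·I + C, Γ = γ·I, J Σ + Σ Jᵀ = −2Γ.

```python
import numpy as np

def steady_state_covariance(J, Gamma):
    """Numerical solution of J @ Sigma + Sigma @ J.T = -2 * Gamma."""
    n = J.shape[0]
    I = np.eye(n)
    A = np.kron(J, I) + np.kron(I, J)
    return np.linalg.solve(A, (-2.0 * Gamma).reshape(-1)).reshape(n, n)

a, gamma, c = 1.0, 0.5, 0.001
iy, is_, ia, ix = 0, 1, 2, 3          # one dimension per state type

# Entry [i, j] is the influence of j on i. Both patterns respect the flow
# constraints: x never drives y or s, and y never drives a or x.
dense = np.zeros((4, 4))
dense[iy, is_] = dense[iy, ia] = c
dense[is_, iy] = dense[is_, ia] = c
dense[ia, is_] = dense[ia, ix] = c
dense[ix, is_] = dense[ix, ia] = c

circular = np.zeros((4, 4))            # y -> s -> x -> a -> y
circular[is_, iy] = circular[ix, is_] = c
circular[ia, ix] = circular[iy, ia] = c

def hessian_yx_exact(C):
    """Corner H[y, x] of the inverse steady-state covariance."""
    Sigma = steady_state_covariance(-a * np.eye(4) + C, gamma * np.eye(4))
    return np.linalg.inv(Sigma)[iy, ix]

def hessian_yx_second_order(C):
    """Second-order weak-coupling term of H[y, x]."""
    return ((C.T @ C - C @ C.T) / (4 * a * gamma))[iy, ix]

for name, C in [("dense", dense), ("circular", circular)]:
    print(name, hessian_yx_exact(C), hessian_yx_second_order(C))
```

In the dense loop the corner is small, of order c squared, but not zero, while the circular loop has no second-order term at all; any remaining value there comes from higher orders.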
Okay, so we can rewrite the previous solution of the steady-state equation to find a connection between Q, gamma, and J. Doing that, rewriting the previous equation, we can reach this equation 40. And again, this equation cannot be solved directly in general, because it's another continuous Lyapunov equation. But assuming that J is equal to C minus the diagonal term, and that the couplings are small as before, we can do the same trick and have a new power series that connects Q with the values of C. It has this form, which is similar to, but a little bit different from, the case before. This case looks a little bit more complicated, so first we just computed the first-order expansion of Q, and it looks like this. So this is Q for the first-order case, ignoring the quadratic terms of C.

Okay, and again, we see that there are, in principle, few cases where the assumption holds exactly, because we want to make all these terms here zero, and if we don't constrain the possible parameters, this is going to be true only in very, very special cases. We can see that one way to make many elements of this matrix zero is assuming that the couplings are symmetric. If C_ys is equal to the transpose of C_sy, so if the connections from sensor to environment are equal to the connections from environment to sensor, then this is going to be zero. In that case, we obtain this matrix, which has a lot of zeros, but there are still these terms that we cannot get rid of. So if these couplings are not zero, then the assumption of the free energy principle does not hold. And this is important, because this would be the connection from actuators to the environment, which in principle is something we expect in many cases. And these are the connections from sensors to internal states, which is also something we intuitively expect from a cognitive system with a sensorimotor loop.
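The connection between Q, gamma and J described above can be sketched numerically. This sketch assumes the decomposition takes the form f = (Q − Γ)∇(surprise), with a Gaussian steady state of covariance Σ, so that J = (Q − Γ)Σ⁻¹ and Q = JΣ + Γ; the sign convention, the scalar `a` for the diagonal term and the coupling pattern are assumptions. To first order in C this gives Q ≈ γ(C − Cᵀ)/(2a), which is the expression checked below.

```python
import numpy as np

def steady_state_covariance(J, Gamma):
    """Numerical solution of J @ Sigma + Sigma @ J.T = -2 * Gamma."""
    n = J.shape[0]
    I = np.eye(n)
    A = np.kron(J, I) + np.kron(I, J)
    return np.linalg.solve(A, (-2.0 * Gamma).reshape(-1)).reshape(n, n)

a, gamma, c = 1.0, 0.5, 0.01
iy, is_, ia, ix = 0, 1, 2, 3

# Hypothetical couplings: y and s are coupled symmetrically, but the
# actuator drives the environment (a -> y) with no connection back (y -> a).
C = np.zeros((4, 4))
C[iy, is_] = C[is_, iy] = c            # symmetric y <-> s
C[iy, ia] = c                          # a -> y only
C[is_, ia] = C[ia, is_] = c            # symmetric s <-> a
C[ix, is_] = c                         # s -> x only
C[ia, ix] = C[ix, ia] = c              # symmetric a <-> x

J = -a * np.eye(4) + C
Gamma = gamma * np.eye(4)
Sigma = steady_state_covariance(J, Gamma)

Q = J @ Sigma + Gamma                        # solenoidal matrix
Q_first_order = gamma * (C - C.T) / (2 * a)  # weak-coupling approximation
```

With these couplings the symmetric y and s pair contributes nothing to `Q_first_order`, while the one-way connections from a to y and from s to x leave non-zero off-block entries, which is the pattern the talk describes.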
And if we want to make these elements of the matrix equal to zero, then we need to have something like this loop C in which these elements are symmetric. And also we remove the connections from sensor to internal states and from actuators to external states, which is intuitively a bit strange, but I think in some of the literature of the free energy principle this kind of loop has been used. And it has been proposed that this could represent, for example, the membrane of a cell that separates the internal and external states in this kind of symmetric manner. Although I think there is some research on cell membranes indicating that asymmetries in the membranes are still important. So this could still be a potential issue for finding systems that have this block diagonal structure. And if we consider the second order approximations, things become spookier. We have many, many terms. And we see that these elements are going to be zero only in a very narrow set of cases, which is potentially problematic for asserting that these assumptions will be general for many systems. And this also gives you an idea: we were studying this very special case with very weak couplings, but having stronger couplings will in principle make things more complicated and not easier. And, okay, so this is assumption one. We are only going to get this block diagonal structure in very specific circumstances, the only exception being this symmetric loop with this very, very special structure. And so, in conjunction, the Markov blanket exploration and the exploration of these solenoidal couplings indicate that, in the class of systems that we studied, the free energy principle will only emerge exactly for a very narrow set of systems.
Okay, so in principle this is an interesting line to think about: how general the principle is, or how we can expect it to be found in different classes of systems, although we restricted our analysis to these linear ones. Okay, so the second assumption is connected to the question of, not how general, but how informative the principle is. The second assumption is that we have this mapping between internal and external states, and it is computed from the most likely internal and external states given a blanket. For a Gaussian distribution, we can do it very easily by computing a conditional average. So these are equations you can find in any textbook about multivariate Gaussians. And we find that this requires computing the generalized inverse of some matrix, and in linear systems, in some cases, if the kernels of these matrices are correct, then we can derive the mapping, and these mappings, if we have the correct dimensionality, will exist very broadly. So this mapping is just going to be a linear matrix multiplication relating m_y and m_x. So this is linear. So in principle, this mapping exists and it can be expected to be very broad if the dimensions of these variables are correct. Although maybe we should note that some of the verbal descriptions in the literature say that this mapping is a consequence of the Markov blanket, but what we found here is that they are, in principle, independent. The mapping could emerge even if there is not a Markov blanket, at least in linear systems. And also, we discovered that this result was also found independently in work by Lance Da Costa that is going to be published soon. And I think that they got to the exact same result as we did. Okay, so going to the last assumption, which is maybe the most complicated.
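A minimal sketch of the conditional-average construction described above, taking y for external states, b for the blanket and x for internal states, as the talk appears to. The covariance here is an arbitrary positive definite matrix drawn at random, so it is a hypothetical example and deliberately has no Markov blanket structure. The most likely states given b are the textbook Gaussian conditional means, and the linear map between them uses a pseudoinverse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical zero-mean Gaussian over (y, b, x), two dimensions each.
A = rng.normal(size=(6, 6))
Sigma = A @ A.T + 6.0 * np.eye(6)          # symmetric positive definite
iy, ib, ix = slice(0, 2), slice(2, 4), slice(4, 6)

# Conditional means given the blanket: m_y = M_y b and m_x = M_x b.
S_bb_inv = np.linalg.inv(Sigma[ib, ib])
M_y = Sigma[iy, ib] @ S_bb_inv
M_x = Sigma[ix, ib] @ S_bb_inv

# Linear map from m_y to m_x. It is exact when b can be recovered from m_y,
# which holds here because M_y is a generic (invertible) 2x2 matrix.
sigma_map = M_x @ np.linalg.pinv(M_y)

# Check on one blanket state.
b = rng.normal(size=2)
m_y = M_y @ b
m_x = M_x @ b

# The precision block coupling y and x is not zero, so x and y are not
# conditionally independent given b: there is no Markov blanket here.
precision = np.linalg.inv(Sigma)
blanket_violation = np.max(np.abs(precision[iy, ix]))
```

In this sketch the map from m_y to m_x exists although the precision matrix couples y and x directly, which is consistent with the remark that the mapping and the Markov blanket are independent in linear systems.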
So the last step of the free energy principle was to say that we know that the flow of the external states, the average flow of y, is pointing in the direction of the gradient of the free energy. And this means that the system is behaving as if minimizing the free energy. And I described that there are two possible interpretations for this, a strict one and a relaxed one. So let's first explore the strict gradient descent interpretation, which would mean that the derivative of the average of the external states is equal to the average flow. And we can compute that very easily in our linear system. This equation is the derivative of the average state. And this equation is just the average flow. And what we did to compare both is just generate a dummy variable, this tilde m, which behaves exactly as the average flow dictates. So this variable tilde m will be a variable that strictly minimizes the free energy. Right, so this is the real dynamics of this most likely state. And this is what these most likely states would do if they were exactly minimizing the free energy. And under weak coupling approximations, we can get to a simplified version of these equations. And we can see that both of them are a bit different. First, the real dynamics has a noise term, because the observations of the blanket are driven by noise and those fluctuations are going to affect the dynamics. So this is going to be a noisy dynamics. And this is not. And the second difference is that this term changes sign. So here it is positive and here it is negative. And this seems strange, but the reason is that this is the real dynamics, which depends on the previous state, and this average flow is history independent in some sense. It just depends on the previous state of b, but it doesn't depend on the previous state of y, so to speak. So in that sense, the structure of the dynamics of this variable is going to be different.
And in this case, this difference is captured by this constant that has a different sign between one and the other. And if we run a simulation of both equations, we find that the behavior is very different. So this is the behavior of the real average variables. So this is a system with two dimensions. And you have these symmetric forms because we restricted the couplings to plus, minus 0.1. And if we represent this tilde m, so this is the system that strictly minimizes the free energy, we have this very, very different behavior, which has this kind of a random walk. And this is because this variable is integrating the fluctuations of the other one. So the error keeps accumulating and the behavior is very different. So in principle, it's unlikely that... And these results were similar for many combinations of parameters. But in principle, it's unlikely that the dynamics of the most likely states are going to be... So the average flow is going to be informative about the actual dynamics of the most likely states. Okay, so this is the strict version of the assumption. The relaxed version will be that only the average flow follows the gradient of the free energy. And if we interpret this, we find that there is two problems. At least, we thought that there is two problems. The first problem is that a key step of the free energy principle uses the synchronization manifold, so the sigma function connected external and internal states, and then derives a chain rule connecting the dynamics of MY with the dynamics of internal states. And this leads to saying that the flow of the internal states is following the gradient of the free energy. Okay, and so if we say that the free energy does not really apply to the time derivatives of the most likely states, but just applies to the flow, then we cannot use the sigma mapping to derive this equivalence between what the external states are doing and what the internal states are doing. So this step becomes problematic. 
And if we want to connect the flows of internal states and the flows of external states as the free energy principle proposes, then one will need a new mapping because this mapping and this mapping are not the same. And we computed the two mappings in a linear system and we find that the new mapping has a different expression from the gradient of the mapping that's used in the literature. So in principle this relaxed interpretation will require a mapping that's different from sigma, because we are not interested in the flows of the most likely states, but just in these average flows. And more importantly, another problem arises. If we have a strict interpretation like this, okay, internal states are doing a gradient descent on the free energy, so the temporal derivative is equal to the negative gradient of the free energy. This is the definition of a textbook gradient descent. Okay, and this is fine. If we have this equation, we know that the system is doing a gradient descent. So the question is, if we just have this equation here, so the average flow is pointing in the direction of the gradient, and we interpret that the system is behaving as if minimizing the free energy, how informative is this average flow about the real behavior? So we tried to explore this. First, in these simulations, we see that it will be little informative, even if we had a different mapping. Maybe this average flow has little information about what the actual dynamics is doing. And we explored this further, and you can see it in an appendix of our preprint, the second appendix.
We tried to explore what the connection is between the average flow and the actual dynamics in a very simple system. So we restricted our system to a two-dimensional case with this non-equilibrium behavior. So the flows of the system have this spiral behavior, and under noise the states are going to be rotating in this field. So we have two variables, y and b. So this is the actual flow of the system. We ask, okay, in this system, what is the dynamics of the most likely states? So this is the derivative of m_y given b. And we find that we have this negative slope. You have this pink area because this is a random variable, which is going to be noisy and so on. But on average, we have this negative slope, meaning that the system is going to have an attractor at zero. If you study dynamical systems and the derivative has a negative slope, at least on average, the state is going to shrink. So if the state is negative, you are going to increase it, and if the state is positive, you are going to reduce it. So you are going to have an attractor at zero, which is what we see in the actual system. But at least for some combinations of parameters, which were about half of the combinations of parameters that we tried, if you compute the average flow, it gives you the opposite behavior. Here, the slope is positive, meaning that if you have a system that follows this slope, it's going to diverge to plus infinity or minus infinity. So this is intended as an example that the average flow can be a very poor description of the actual behavior of a system if what you have is a stochastic dynamical system. Okay, so in general, we cannot assume that under all circumstances the marginal flow is going to be informative about the behavior of a system.
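The comparison above can be sketched for a two-variable linear system. This is an illustration under stated assumptions, not the model from the preprint's appendix: the dynamics are taken as dx = J x dt plus noise with covariance 2 D dt, the ordering is (y, b), and the coupling values a and c are hypothetical. With m_y = g b and g = Sigma_yb / Sigma_bb, the expected derivative of m_y given b is (J_bb + J_by g) m_y, while the average flow of y given b is (J_yy + J_yb / g) m_y.

```python
import numpy as np

def stationary_covariance(J, D):
    """Solve J S + S J^T + 2 D = 0 for the stationary covariance S."""
    n = J.shape[0]
    I = np.eye(n)
    # Row-major vectorisation: vec(J S) = kron(J, I) vec(S), vec(S J^T) = kron(I, J) vec(S)
    A = np.kron(J, I) + np.kron(I, J)
    return np.linalg.solve(A, -2.0 * D.reshape(-1)).reshape(n, n)

# Hypothetical couplings: a is the effect of b on y, c is the effect of y on b.
# Opposite signs give complex eigenvalues, i.e. a spiralling flow.
a, c = 0.2, -0.1
J = np.array([[-1.0, a],
              [c, -1.0]])
D = np.eye(2)

S = stationary_covariance(J, D)
g = S[0, 1] / S[1, 1]                    # m_y = E[y | b] = g * b

# Slope of the expected derivative of the most likely state, as a function of m_y.
slope_most_likely = J[1, 1] + J[1, 0] * g

# Slope of the average flow of y given b, written as a function of m_y.
slope_average_flow = J[0, 0] + J[0, 1] / g

lyapunov_residual = J @ S + S @ J.T + 2.0 * D
```

For these values the most likely state has a negative slope (an attractor at zero) while the average flow has a positive slope. In this particular parametrisation, with unit decay and unit noise, the sign of the average-flow slope depends on whether a squared exceeds c squared, so roughly half of the coupling choices show the mismatch.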
And even if you can derive all the steps of the free energy principle, it might be the case that in some systems the average flow is pointing in the direction of the free energy gradient, but the system is not behaving as if minimizing the free energy. We can see it in this example. The system is minimizing the value of y squared, which is going to zero. The value of these variables is going to zero, but the average flow points in the opposite direction. It tells you that if you interpret the average flow as what the system is doing, then you will believe that the system is maximizing the squared value of y, for example, because it's diverging. And what it's doing is just minimizing it, because it's going to zero. And so, if the dynamics of the most likely states and the average flow are in general different, then the average flow is not going to be informative about the behavior of a system, except in very specific cases such as mean field approximations and this kind of thing. And this difference arises because these marginal flows decouple from the previous history of the system and do not consider the effects of fluctuations. Okay, so just to finish and maybe give way to the discussion. In this work, we explored the conditions and the assumptions that are needed to derive the free energy principle in linear non-equilibrium stochastic systems. And some of this has been explored for specific counterexamples, as in recent work by Martin Biehl and others. But I think it's interesting that we did this for a general class of systems. So it's not one example that we show numerically, but we solve a family of weakly coupled systems and we see how general or not the different assumptions are. And so first we explore how general the conditions and assumptions of the principle are, and then we study under what conditions we can expect the principle to be informative about the behavior of a system.
And first we found that the Markov blanket condition emerges at least as a good approximation in the case of weak couplings, but if the couplings are not very weak, then maybe we cannot expect it to be exact in general. However, solenoidal couplings are generally present in all blocks of the system, so they are going to be zero just in very, very special cases such as this symmetric loop. So maybe a direction for the theory would be to explain exactly what happens when these solenoidal couplings are not zero. And so in conclusion we would say that only very specific structures are going to satisfy the requirements to derive that the average flow is pointing in the direction of the free energy gradient. And on the second point, about how informative the principle is, we saw that the connection between the free energy and the marginal flow of the system on one side, and the actual behavior of a system on the other, is in most cases very weak or even contradictory, as one can point in one direction and the other in the opposite one. So we found that these marginal flows are not capturing the actual dynamics of a system, and this is because the marginal flow ignores the history of interactions. And we can think about this intuitively: we can imagine a system that has many trajectories, and if we fix the blanket at some point, that just means having different trajectories that are going to pass through this point, and the average flow is just going to compute the average derivative of the system at that point. But any individual system can have a very different behavior, which is represented by the black line here. An intuitive example for this could be to distinguish between the behavior of an individual organism and the behavior of a population.
So you could think that on average a population of organisms is behaving in a way that's maximizing evolutionary fitness, but this can be largely uninformative for describing the behavior of individual organisms. That's sometimes the kind of criticism that we hear of some proposals in, say, evolutionary psychology and this kind of thing, right? So even if at an average macro level you can detect some tendencies, in many cases these tendencies can be uninformative about what an individual is doing. So we think that it's interesting to think about if and when and in which cases the conclusions derived from the free energy principle are going to be informative about the behavior of an individual or groups of individuals and so on. Still, we have to say that even if we did this kind of critical evaluation, we find the motivation of the free energy principle very suggestive, and we think that the aim of connecting ideas from Bayesian inference with the dynamics of complex self-organizing systems is tremendously important, as it could allow us to apply the machinery of Bayesian inference and information theory to describe systems that are intractable in practice. But we also think that it's somehow difficult to just derive a theory without a connection to specific models that allow you to inspect in more detail some assumptions, or some consequences of the assumptions, that are difficult to grasp intuitively in many cases. So we find that exemplifying the steps of the theory in a specific model is very useful to understand the connections between steps and how likely or not the different steps are, and that this kind of exercise can be really useful to understand the kind of difficulties that we face when proposing a theory. So this is the preprint that we just published, How particular is the physics of the free energy principle. Thanks a lot. Thank you. Thanks for this awesome presentation. There was a lot there and it was incredibly well presented.
So for those who are watching live I would invite them to write questions in the chat but first Chris I'll just pass it to you for in initial comments if you would like. I'll unmute and then please continue. Chris you're muted and then continue. Yeah so I told Miguel that even though we found the particular kind of technical presentation we're seeing we still find that the ambition of the free energy principle is laudable and it's something we really want to embrace but yet we feel it needs some future development, right? So I mean I summarise there's two issues that Miguel has presented here. The first one is the problem about trying to characterise the interaction of an agent with its environment. So you can do this in two ways and one typical way is to draw kind of connections, directional connections between an agent and environment, right? So the sensors and the motors and so on and so on. There is a different way of understanding those interactions in terms of the Hessian or the covariance matrices which kind of divide correlations across the system rather than direct connections. And I think what Carl has done is kind of really work from the Hessian upwards rather than considering the functional connections that underlie those do, underlie those, right? So he wants to assume there is a Markov blanket and wants to start there and then he's not so concerned about trying to work out the class of systems that meet those conditions and I think that's a valid thing to do. It might be a very special set of systems but I think Carl will claim that's the kind of systems I'm interested in. I'm not really interested in those other systems, right? And what we've pointed out is that we can't find those very generally in the class of systems. And the second major issue that I think Miguel's work points out is this decoupling between the description of the deterministic dynamics of the system versus the average dynamics of an ensemble of systems, right? 
So in prior accounts of the free energy principle you actually write the deterministic dynamics into your description so when you're using free energy principle to do cognitive systems you are describing the deterministic dynamics directly in terms of your expectations and flows on your expectations. What the particular physics approach wants to do is try to find that deterministic dynamics within the physics, right? It wants to see where it would lie within the kind of general stochastic dynamical system series of physics and that's where the problems lie, right? So it's very hard to go from the physics to a good description of the system in terms of the kind of dynamics on expectations and so on in that system. So we find this problem of a decoupling between the description and the dynamics of the moments of a system versus the average over an ensemble of systems, this is where the problem lies, right? So over an ensemble of systems the free energy might be a good description but it's not connected to the causal explanation of the dynamics of a specific system so you can't attribute the right kind of mechanistic description to that particular system that you would like to. So that's kind of my summary, maybe. Thank you for the succinct summary, that's very helpful for many, I'm sure. If you'd like to unmute, sorry, I heard a feedback when Chris was speaking. So if you can unmute or, yep, okay, introduce yourself and then give any comments you'd like and then I'll go to the questions. Yes, so hi, I'm Ben, I'm obviously an author on this and I'm currently now a postdoc at Oxford but I worked very closely with Chris and Miguel at Sussex for a long time. So, yeah, also sorry I'm late, I had something else scheduled in conflict so I just jumped in for the questions. 
But yeah, I think Chris did an excellent job of explaining the issue with the average dynamics versus the dynamics of the average, but I would just like to go in and reiterate that, to me, the important thing about the Markov blankets is really this idea that there is a huge distinction between the statistical independence relationships and the actual causal relationships you get if you write out the dynamics. And so it's quite easy to confuse these two, and indeed I think we naturally conflated these two a lot until we started writing this paper. And a lot of the literature also conflates these two, and you know, they have these intuitive diagrams of the bacterium with its cell wall and everything, and that's the blanket and so forth, but that's at the causal level of description, as in things causally have to pass through the blanket. But that's not the same at all as the statistical description, which is the statistical independence between the internal and the external states given the blanket. And I think it needs quite a lot of thinking about to what extent these causal mappings actually hold up and are sustained by the statistical mappings or vice versa, and we've found in linear systems there's a lot of cases where they don't, and they don't seem to match at all, and so I think a lot of different intuitions really need to be built about what these Markov blankets mean at an actual causal level. Thanks for this comment, and again people can write questions. If I could just add something about the difference between the causal structure and the correlative structure: you could imagine on a market there are companies that are causally related to each other but whose stock prices are not correlated.
And similarly in the cell we often make networks that are based upon gene co-expression, so genes with correlated expression, and then there are gene regulatory networks or protein-protein interaction networks, which are like proteins that touch, but those proteins usually don't have correlated expression, because you might up-regulate one that then represses another, so they're not expressed in a correlated fashion but they have causal relationships. So people slip between gene regulatory network, co-expression and interactions, and certainly conflating the causal structure of a system with the physical structure or the correlative structure is a broad category error, and I think your work does an incredible job of highlighting the difference and where people slip between these easily confusable modes of thinking about networks, and then also conflate that with reality or all of reality, so just very well done on these points. Yeah thanks, another example would be the difference between structural connectivity, functional connectivity and effective connectivity in the brain, right? You can have some structure and then you can have statistical correlations, but even then the dynamics can behave differently because you have these asymmetries in how one variable is connected to another and the opposite, and this is also related in our work to the solenoidal flows, which capture something different. So yeah, it's generally very complicated to disentangle all these different levels. But just to be fair to the authors, I mean I think they're aware of this difference, obviously Karl's group is very aware of this difference, but they feel that the theory sits above that, right at the correlative structures, and they're not considering those causal structures. I think maybe some confusion might have emerged in previous papers, but in recent papers I think it's very clear that they want to sit above the causal directions.
Cool, so I hope that by bringing the math to examples of like the market evolution and fMRI we can hopefully have a lot of people giving their perspective. So I'm going to just go with a question from the chat and then I have some questions that I wrote down. First question, thanks a lot. I was wondering whether you would assume that going to nonlinear systems the limitations that you find would be even stronger or would there maybe be an argument that there could be nonlinear systems that fulfill these assumptions. So what will relaxing the linear assumption do for the math that you described today? Okay, well that's difficult to answer question in general I think. So first I think it's difficult to solve because in nonlinear systems in general we cannot find the kind of analytical solutions that we have here and even in the case of linear systems we found that we can only solve the case of weak couplings and in the case of weak couplings we could do the trick of considering different orders of expansions and we saw that including stronger couplings make things more difficult rather than easier. 
Okay and intuitively at least my intuition about nonlinear systems is that the case may be similar in many systems having more intricate couplings more intricate dynamical couplings could add extra terms to some of the matrices that we are expecting to be zero like in the Markov Lancet but still this is not something we can say because you will have unexpected effects from chaos for example and I think that's something that Carl proposes sometimes but I think maybe one interesting direction of research would be doing a similar expansion to the one that we did to say okay we have an almost linear system but we consider very small effects of nonlinearity and see what kind of terms this adds to the matrices that we are expecting so this would involve a very different analytical solution but I think there is some up for coming work and from some people in cars working in that sense so I think it's an open question but I intuitively think that in many cases this is going to be challenging to continue to see that nonlinearity is going to be a solution. 
The intuition is that you might be able to solve it in specific systems but the idea that you will solve it in a class of nonlinear systems seems helpful but you might be able to find examples of which you augment the nonlinearities that meet the criteria maybe if you constrain yourself to a particular evolved living system that might be okay but if you want to say something more general about the emerging self-organization in a class of physical systems then you perhaps run into trouble Two thoughts on that I will go ahead first I was going to say that I agree with Chris that obviously with nonlinear systems you have a much bigger space of systems and so you can probably find edge cases where all the conditions match perfectly but I mean the general intuition in the dynamical systems is that the nonlinear systems are a lot harder to deal with in general and so I'm not going to probably if it's not satisfied in linear systems there's no obvious reason why it should be satisfied in nonlinear systems and especially the issue with the dynamics of the average versus the average of the dynamics like the case we did where you have a linear Gaussian system is probably the ideal possible case for this because obviously the average is a linear operator and for the Gaussian case the mode and the actual average are the same and so you have all these nice properties none of which will hold in the nonlinear general case and so I think for that nonlinear system it's almost certainly going to be much worse for that specific issue Yeah, also to add on that I think so one question is whether there is linear systems some of the problems are going to go away but maybe the right question is what happens in the kind of nonlinear systems that we expect in a linear system and what we know from linear systems often involves bifurcations or critical points that lead to long-range correlations and correlations that propagate through the whole system and this kind of thing that can be very 
difficult so the kind of isolation between internal and external states can be challenging when this is the case, right? Thanks for all these excellent points in complexity we often talk about nonlinear as being like non-elephant animals it's like all of them so it's even harder certainly to make a general stance and Baron that was a really interesting point that as the state space of what's possible increases there may be more and more edge cases and that kind of brings it back to the question about evolved systems not every parameter combination of cells is going to be functional and so maybe we're only dealing with that vanishingly small amount or some other, I'm not going to go to technical details on the fly, but that could be one aspect and also it was such an interesting point during the discussion when you explicitly brought it back to evolution like fitness is the law of the land in a way at a level that doesn't explain motor behavior in the moment to moment they're nested time scales and so of course there's motor behavior that's maladaptive and long run you wouldn't see that under a stationary ecology you start adding some assumptions in and it's just I think a very fruitful mapping that will be explored for time to come with the relationship between population level claims under stationary environments population level claims under dynamical environments and then individual level claims under stationary and changing environments so it's just a lot of topics that come into play and to have them integrate these kinds of points that you're raising which are often implicit errors in other areas they're being drawn together and brought to the forefront in a way that is quite novel okay again since the technical details I believe stand best in the presentation as you provided as well as in the paper we can take any questions but I think that will be the place where the discussion continues I just kind of wanted to ask a few other questions to provide a little 
context so for maybe whoever wants to answer how did you come to be studying the question this way were you working with the FEP and exploring new kinds of math or did you have more of experience with the mathematics and then see the FEP coming up onto the radar like how did each of you come to be working on it this way okay I'll go because I think Miguel can follow so I mean I wrote a review a few years ago now of the original free energy principle at that time it was quite an enigmatic principle I mean the reason I originally I didn't like it I didn't like the idea of the notion of representation seems to be pregnant in the idea of the free energy principle kind of representing generative models but I actually came to like it as a cognitive framework and I still do like it as a cognitive framework and I still think within the regime of describing essentially motor loops abstract sensing cognitive systems it's very useful but then obviously there's been a rapid development of the free energy principle into a grander claim to find this principle in physics in a cluster of physical systems and that's where I started Miguel's got a real strong background in statistical physics and that's basically what's led us to this to kind of really go back to some of those assumptions so I do separate the two things that I think there's still a very fine the cognitive account very appealing even though we find these problems in these physics examples yeah yeah for some I think I started to be involved with the principle more recently I for some time it's something that I had on the radar but I actually understood what it meant like it's always same kind of scary and challenging like the equations and I guess that part of the motivation for doing this work is it was to as a way of understanding what the theory consisted of because you could go through the mathematical descriptions and it's often really hard to know what they imply or what they mean exactly in the sense that they 
are complicated maths right and I guess that during my research career the way I have had for understanding complicated maths is by building models and this understanding by building philosophy that I think that the initial attempts to build this system it was an attempt to really understand okay this step I can for us mathematically more or less what it means but let's have a model and change parameters and play with it and so on right and so that was more or less the journey and once having the model then say okay can I have something more general than just some specific parameters so let's see how we can have an analytical solution and so on so yeah so I think this exercise is very fruitful both on a personal level and also on a scientific level because you can on your way to learning you can develop things that can be useful for the theory more in general awesome thank you for those answers and also I was just getting a few back right now yeah Miguel so just while baron gives a okay go ahead baron yeah I just want to follow up on Miguel's point really about just the importance of actually trying to build these models because I think my perspective a lot of the sort of developments and problems we discovered basically came from actually looking at specific models and thinking hey this doesn't quite work out as it says it should you know and hey it's basically things that just become an awful lot more clear if you actually try and build them rather than just staring at these sort of symbols on the page and so I think that's really the key to a lot of it it's simply kind of construct a system where it's working and so hey you know it's things aren't going quite as you know the particular physics monograph says they should be cool and it's so interesting and maybe even appropriate that action became relevant for this inference problem so it's not just conceptual because we must be engaging in the literature and conversation in the work and going through it 
ourselves in order to infer deeper. We can't just wait at square one and look all the way to square one thousand; we have to be working along the way, individually and maybe as a group too. So here's a question from the chat: "Brilliant talk. Wonder if Karl and others will respond to challenges presented in this paper. Also, because the paper is written in such coherent style, wonder what would you recommend for beginners to study in the FEP?" Nice question. What do you wish you knew at the outset? Even if it's a new resource, what courses or topics would you recommend having a familiarity with, or what are the on-ramps for somebody who's not in this area of research and practice? How can they best be on-ramping themselves?
I recommend Beren's GitHub. He's got a nice repository of papers which go from very introductory papers to very sophisticated papers in a nicely organized way, if you want to provide that, Beren. Is it publicly accessible?
Publicly accessible. I could post that in the chat or something.
If you want to say hello in the YouTube chat, then I'll make you a moderator and you can post links. Okay, so then continue on that.
Yeah. Regarding the actual question about what to start with, I think that depends a lot on what you want to do. There's kind of three big separate topics in my mind. There is the FEP, the hardcore maths of the theory as it is presented in particular physics, and for that I'm not sure there is necessarily a good beginner's-level resource at this point. There is essentially our paper, which tries to go through it quite straightforwardly, and then you have the Thomas Parr Markov blankets and stochastic thermodynamics paper, which covers the main points, but it's not like a tutorial; it kind of just states, like, this is XYZ. And then really it's just the particular physics monograph where most of that information is. There isn't yet a really nice laying out of it, and obviously we kind of try and do that in our
paper, but obviously we're also focused on the actual points we want to make; it's not specifically a tutorial. Then there are the other two things. Obviously there's active inference, the discrete state-space active inference that people talk a lot about. I think the best resource for that is probably Lancelot Da Costa's review on that, and then there are several of Karl's papers, which I think I talk about in my GitHub repository, which go into more detail, but Lancelot's review for that is really good. And then there's the more continuous state-space predictive coding side, and I think Chris's 2017 review is the best for that, to be honest. So yeah, I think Chris's is the one if you want to understand the whole theory and all the math as a whole, because it works very nicely, whereas if you want to build a predictive coding network which works and you're more focused on the neuroscience, I'd say Rafal's is the way to go.
If I can ask one note before Miguel's answer: when you said neuroscience, do you see them as just historically juxtaposed, or do you see some of these approaches as encompassing one another or being prerequisite to one another?
So I think the idea is that the FEP in the particular physics monograph encompasses all of them, in that these are just specific models: you have a specific variational density, a specific generative model, and then you work through the math and you get these process theories, as Karl calls them, out. In practice, though, they can stand alone, away from whatever the particular physics monograph says, because they make empirical predictions about the brain or whatever, and so you can actually empirically test these. So even if all of particular physics is wrong, these things could be right, and vice versa: even if all of particular physics is perfectly right, these things
could just not be what the brain is doing; it could be doing something else. I think they stand alone, although in theory the particular physics monograph is kind of meant to unify them all.
Thanks for that response. And Miguel, to that original question: what would you recommend a beginner to be learning and looking into?
Yeah, I'm not so sure. For me, I started with Thomas Parr's paper about Markov blankets and information geometry, and then went back to the monograph on particular physics. But it was challenging, because the maths are complex. And I found useful Martin Biehl's paper, a technical critique of the free energy principle, in the sense that it was very technical and formal, but it was useful that they tried to encapsulate the different conditions in a way similar to what we do here. I think that was helpful for differentiating the different steps, which are not always so clear in the literature. So, at least for the part regarding the free energy principle as described in particular physics, we maybe need better tutorials and more materials that are focused on introductions.
Yeah, thanks for these responses. I know in our ActInf Lab and in many other associated efforts the focus on accessibility and communicating these ideas in a way that's going to be helpful to people from any background is a top priority. So, way back in the introduction, you know, 120 time steps ago, you talked about new visualizations, and it's definitely the case that paper after paper will use some classical figures, let's just say. I'm excited by visualizations, and so was curious, for whoever wants to go in for it: what would the new representations do, or be, or communicate? What style? What made you want to explore visually, and how is that in feedback with research?
Okay. Yeah, I think that's interesting. At some point I keep looking through the literature and it's
paper after paper the same graph about brain, sensory states and so on. And at some point it stops being useful, right? Because a visualization can give unique intuitions only to some extent. And at some point I wanted to know more; it's not always straightforward. Okay, there's an arrow, but what does it mean? Why is this bidirectional? So the different figures that we have, we developed them with the idea of: how can we intuitively express these very complex assumptions? Ask yourself, how do you draw a figure of a solenoidal flow? This is really challenging, and I'm not sure if we did a good job there. I think many intuitions about the free energy principle are really hard to capture because they are quite complex, and I think it's interesting to try to push in the direction of having new images that at least try to capture those intuitions and what they mean. Because often you have to read the theory based on your intuition, or covering with intuitions the parts that you don't understand yet, especially if you're learning. And this kind of thing I think is incredibly useful in general in science.
Chris or Beren, would you like to add on that point?
No, I mean, we found it hard, right? There are so many different types of interactions we want to represent. Typically, monolithic arrows between agent and environment seem to dominate everything, right? But when you start drilling down into this, you have the difference between causal and functional. Obviously people like Pearl have done this in the philosophy, right? They've been very clear about different types of interaction. But I think, especially in modelling work, it's become a bit clouded in lots of locations, especially in the area of the free energy principle. I think Miguel did a really good job, but it does look complicated; the final product is complicated looking,
right? But it does do justice to these different types of interaction.
Beren, anything on that, or no? Yeah, go for it.
I would just say that I agree. I'm not actually a visual person who really looks at the figures, but I think what Miguel did was actually really good, and it's the sort of thing that, if you actually look at them and try and understand them, will yield a lot of dividends in terms of quickly getting to grips with precisely what's being said and what's not being said, because it's very easy to have this sort of hazy interaction without knowing what that really means.
It's the same again as a geneticist: it makes me think of, oh, you know, nodes are genes and edges are interactions, but there are multiple classes of those, and one paper uses it one way, and you go to the methods and it's one way in this paper and another way in another, but it's being cited. So it takes a lot of subtlety to pull back to that question about what kind of interactions are being discussed. And Chris, what you just said: it's high praise for the work that it does justice, because that's what we hope the bleeding edge does for the rest of the knife and the handle, is represent visually on a manifold that doesn't deceive. It's not drawing hard lines where none exist; it's not making false positives or false negatives or false claims visually. Because visual rhetoric is how many people will learn and think about these things, it makes all the difference whether there's a bidirectional arrow graphically or not, because people are going to unpack it, especially if they're not going to go into the tenth level of math. They're going to unpack it literally, and with the knowledge that that's what the expert represented in their aim.
So I really do think it's particularly important for this field, because there is a separation between this technical morass that sits at the bottom and the ambition and what it promises. So many people are attracted to this particular theory, but not everyone has got time to go in and decipher that morass, so they work at the level of these figures and try to make discourse at that level. So it's vital that those figures become very clear and very pointed in what they try to communicate, and any lack of clarity is just going to send the whole community off in the wrong direction, particularly if you're not the kind of person who wants to go in and do the statistical physics and want to do more of a philosophy or cognitive science of these systems. So it's vital that we get them right in this particular field, I think.
Yes. I think there's an XKCD cartoon, you know, about where the fields are arranged. But especially when working with the theory itself, it should be lean and as low-dimensional as possible and not any more, because otherwise it will be unpacked in ways that are just going to be completely discordant. But if it can be represented as sort of a build-in or an opt-in at this level, instead of trying to explain backwards from what we could do in the future in the math, we need to start with the axioms and clarify what they are and then unpack those. And then it's just like with any other area of theory: did you disagree with the axioms, or did you disagree with how they were applied? Because those are the areas where there's sort of a controlled space for discussion. But then something like "I don't think it would work that way in a human-computer interface" is neither here nor there for the level of rigor that we approach the questions with today. So I can ask any other questions, or we can see if anyone else posts in the chat, but that was actually one of my questions: are those all the axioms? How do we represent
axioms, or how do we even know what those axioms are, without getting into too much of a whole, you know, incompleteness thing? What are the axioms? Were they what you put in the paper and nothing else? Is that sufficient, or is there some other stuff happening?
Just to say before I let Miguel answer that: we have an ambition to write a more accessible version of this paper, which hopefully we'll have done in the next few months. So we want to do exactly that: we want to draw away from the physics and have this paper underneath it, and then be even more clear about the overall narrative of what we're suggesting here, and we can answer those questions that you asked directly, hopefully. But for the specific case I'll let Miguel answer.
Miguel, and then continue. Thank you.
Yeah, I don't have much to add. I think that the free energy principle has been an evolving theory, and there are different versions of it through the different papers in the ten years that it has been proposed. At least for me it was a difficult job to try to summarize the theory in that way, or to have the exact connections of steps, because sometimes the descriptions are verbal. But I think an important avenue for continuing the research in this field is to try to do more accessible descriptions, or try to encapsulate them in this way, or lay out this sequence more clearly, at least for an introductory purpose.
Yeah, it's been very much a living theory, the free energy principle. I found this when I wrote the review back in the day: there is this development that's very clear in the series of publications that comes out, and things have changed, obviously, as they develop. So trying to see the consistency across those things, which is the fundamental axiom, which is introduced later and so on, is very difficult actually. And again, I don't criticize the authors of this
because I think it is a living theory; it's something that's really evolved in the literature. It's a very complex one, so it had to be like that. But hopefully as we go forward in the next few years we can become more clear about these things, and there will be more and more people doing reviews and more evaluations like Miguel's done, which will further ground this theory in a place where we can all agree, and agree on the axioms.
Beren, and then Miguel.
Yeah, I just want to say that we should definitely not consider all the axioms of the free energy principle at the moment to be completely fixed in stone. I feel that it's still living and still evolving in many ways, and there are some cases where there are questions about whether this axiom can be more general than this, or less general than this. So I think there's going to be a lot more foundational discussion around the actual axioms underlying it. And now that people are just starting to tackle particular physics in a really deep way, I think that there will be lots of discussion about this over the next few years, and hopefully in a couple of years we'll have a much more precise statement of what the FEP says, what is needed, what isn't needed, and how everything fits together. But at the moment it's still in a fair bit of flux, because really deep critique of particular physics outside of Karl's group is kind of just starting in the last year or so.
Thanks for that. Miguel?
Yes, I also want to add that we focus on the free energy principle and this claim that dynamical systems or self-organized systems behave as if minimizing free energy, but I think that's only one part of the universe of the free energy principle. You have active inference, you have the design of systems that explicitly minimize free energy, and that's a completely different field. Often this is mixed and not clear, and it's sometimes mixed which are the axioms that are even
in specific papers; where you are situated in your work is not always clear to everyone, because of this evolution of the theory and these interconnections and so on. But I think that as the field becomes more mature this should become clear, because sometimes this leads to confusion about what the theory means. The free energy principle says that every living system is minimizing free energy, so it's different when you think of agents that do it by design or when you try to extend the claim to other systems. So there are these different points of view and these different sides of the theory, and I imagine that they are going to evolve into something more compartmentalized, even if they are interconnected, with clear borders between the different claims and so on.
There's going to be an FEP metaverse that is appropriate for the online age, honestly. Because when I think about how the needle was pushed here: GitHub and interactive computational notebooks, preprints like yours, live streams like the one that we're on, a Discord server, a reading group that was arranged by Maxwell and others. Ten years ago, twenty years ago, it would have been like, what is going to be accessible globally? So to see how the material and the informational culture and also social culture is changing, in a time when participatory science and open science and all these kinds of transdisciplinary ideas are coming to the fore, it's just an awesome contribution. So I'll offer any final thoughts from the chat, and each of you a final word. Just to say from the lab that we really appreciate your willingness to jump on at short notice and work on it and share it in this way, and also with these exciting directions that you've planned out.
I'd just say I completely agree with you, and you have to shout out to people like Maxwell Ramstead, who organized these Discord servers, which are really promulgating this debate in wide communities and are really valuable
infrastructure for people like us to come in and do this type of research. So yeah, I completely agree: this new infrastructure is really important for this type of stuff.
Beren, any last comments? And then Miguel, of course, final word.
I just want to echo what Chris said, really, about the sheer importance of the reading group that Maxwell organized, because I think that's where I learned an awful lot about how particular physics works and how everything goes through in that monograph, and I think a lot of people did there as well. I think we've really built up enough expertise to make papers like this, and obviously similar to Martin's stuff, and there's probably lots of stuff coming out in the next year or so; the reading group actually made all of that possible, really. And also, finally, if you're interested in learning more about the FEP, email Maxwell, because he has recordings of all the reading group meetings, and so that's probably going to be a really good, a good little lockdown project, I think. Yeah, a lot of recordings; there'll be lots of material there if you want to really drill down into the technicalities, and those recordings will probably be really good as well.
Great. And Miguel, if you'd like to give any last comments.
Yeah, just as Chris and Beren said. I came late to the group and just attended the last sessions, but it was really helpful to dig through some of the discussions that took place there. And I think in general what could make the field advance is to make things accessible in different ways, and this can be through resources, through communities, and also through more transparent models or easier-to-follow guides through the maths. I think part of the difficulty in the field is that there is a disconnection between people who are doing mathematical work and people who are doing philosophical work, and not only here but I think in
cognitive science in general, or even in deeper views of systems biology and so on, right? And I think this multi-level effort in trying to make things more accessible and easier to follow for people from multiple disciplines is very valuable, and it's something we should try to do collectively, because that's the only way. And thanks a lot for hosting this session.
Thank you. We always talk about tools, ideas and people; they're a triple helix. So thanks for making it happen with all those today.
Thank you very much for that.
Peace.