Hello everyone, welcome to the Active Inference Lab and to the Active Inference Livestream. This is Active Inference Livestream 19.1 on April 6, 2021. We are a participatory online lab that is communicating, learning, and practicing applied active inference. You can find us at our links and contact information here. This is a recorded and archived livestream, so please provide us feedback so that we can improve our work. All backgrounds and perspectives are welcome here, and we'll be following the usual video etiquette for livestreams, such as muting when there's noise in the background and raising our hands so we can hear from everyone who wants to speak. You can go to this shortened link to see the upcoming streams of different kinds. Today we're in 19.1 on April 6th, and we'll be having a follow-up group discussion next week on April 13th. Hello, Steven, we're already live.

So today in Active Inference 19.1, the goal is really just to learn about and discuss this really awesome paper, Deeply Felt Affect: The Emergence of Valence in Deep Active Inference, by Casper Hesp, Ryan Smith, Thomas Parr, Micah Allen, Karl Friston, and Maxwell Ramstead. We're really appreciative that Casper has joined us today, and any of the other authors are free to join us next week or the following. In 19.1, we're going to go over the various parts of the paper in any order that people want to raise them. We can start with some short introductions and warm-ups, culminating in Casper giving a brief introduction or background. So I'm Daniel, I'm a postdoc in California, and I will pass it to Sarah.

I'm Sarah. I'm a person stuck in Berlin because I can't update my visa. Next.

Hello, I'm Steven. I'm in Toronto, and I'm very much looking forward to learning more about affect and its relationship to active inference, and I'm going to pass it to Dave.

Hey, Dave Douglas. I'm in the mountains of the Philippines, retired from information technology, including natural language processing and machine translation, with a background in process philosophy, Whitehead especially, and cybernetics. I will pass to Dean if he hasn't spoken yet.

I'm Dean. I'm from Calgary, retired, but really interested in this stuff. I don't know who I can pass it to other than maybe Casper.

Sure, yeah. My name is Casper Hesp. I'm a PhD student at the University of Amsterdam, currently being co-supervised by two professors at the University of Amsterdam and one at University College London. That's the one you probably all know, Karl Friston. He's my external supervisor, and this paper was developed in his lab, basically, with a bunch of really wonderful people. My co-first author, Ryan Smith, is not here today, but I'm aware that he has been with you on many of these sessions.

So I think I will just go into the intro of the paper. You can read the formal motivation in the paper itself, so I'll tell you a little bit more about the story behind how this came about. When I first entered the group, most of the modeling that was done was with single layers, and there was already some preliminary work on deep temporal modeling. That's essentially working out the implications of these hierarchical models that start to look more like the ones you might have come across in deep learning research, but now with a Bayesian flavor.
And one of the things that I was missing, personally, was a metacognitive aspect: something that moves beyond just estimating confidence to reflecting on that confidence estimate, something that can regulate other parts of the system. So we ended up going back and forth on this topic with Karl, and Thomas Parr was a huge help there as well; he's also on the paper.

Essentially, we were thinking about these action models in active inference. Without presupposing a certain degree of familiarity, I suppose I can quickly recap. The way I present it in the paper is really meant so that you can start from zero knowledge about active inference, and I build it up in steps for people who are not super familiar with it. The point is that these diagrams have a very specific mathematical meaning, and when you just present people the whole influence diagram, what we call directed acyclic graphs, it usually is pretty overwhelming. So for this paper, I decided to go back and give the reader a very stepwise, incremental primer, as we call it. We start with the first two types of states being linked: hidden states and sensory states. That's really the core.

And what's interesting is that it's very modular. So if you go to the next slide — yeah, if you just ignore the math for now, the nice idea here is that these graphs are entirely modular; it's kind of like a Lego box, essentially. Once you understand the first piece of the puzzle in M1, where hidden states are inferred from sensory states, you can basically understand all the other graphs as well, to a certain degree. At some point we introduce links between continuous and discrete state spaces, and that's where it gets a little more complicated mathematically.

One thing to keep in mind when reading these figures is that downstream on the arrows is always about prediction. Here, for example, the prior over hidden states at the top predicts the hidden states below it, S. So downstream on the arrows we're talking about prediction; upstream we're talking about inputs. In this case, you start from the prior, make a prediction about what the hidden states are a priori, and then move from these hidden states to make a prediction about what the sensory states are. That's where the likelihood mapping comes in. So that's one crucial aspect: downstream, prediction; upstream, inputs.

Then there's another crucial part you need to know in order to interpret these figures: the circles are variables, and the squares are parameters. In this case, the hidden states and sensory states listed in the circles, the S and the O, can vary over the course of the simulation. If you set it up in this very simplistic way, then the squares are fixed: the values in the prior and the likelihood mapping don't change in this very simplistic setup. Once you understand this, you can start hooking these things up to each other. So once we link another circle to a square, that mapping becomes variable as well; you will see that coming back later. We can now generalize this reading of the figure to the second model, M2, where exactly the same mapping applies to the way the states unfold over time.
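To make this concrete, here is a minimal numerical sketch of the M1 and M2 updates just described, in Python. The matrices are made-up toy values, not the paper's, and the update rules are the standard discrete-state-space ones rather than anything specific to this model:

```python
import numpy as np

def softmax(x):
    """Normalized exponential: turns log-beliefs back into a probability distribution."""
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy 2-state, 2-observation model (placeholder values).
D = np.array([0.5, 0.5])       # prior over hidden states (a square node: fixed here)
A = np.array([[0.9, 0.1],      # likelihood mapping P(o | s); columns index states
              [0.1, 0.9]])
B = np.array([[0.8, 0.3],      # transition mapping P(s2 | s1)
              [0.2, 0.7]])

# M1, perception: infer the hidden state s1 from an observation o1.
# Downstream = prediction (D predicts s, s predicts o); upstream = input (o informs s).
o1 = 0
q_s1 = softmax(np.log(D) + np.log(A[o1, :]))

# M2, anticipation: the same trick applied over time.
q_s2_pred = B @ q_s1                                  # predict s2 from beliefs about s1
o2 = 1
q_s2 = softmax(np.log(q_s2_pred) + np.log(A[o2, :]))  # update s2 once o2 arrives
```

Knowledge about later states also sharpens beliefs about earlier ones (the smoothing direction described next); that backward pass is omitted here for brevity.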
So a generative model of anticipation uses current states to make predictions about future states. That's where we come to the third part of these kinds of graphs that's pretty important to know: reading from left to right is the temporal dimension. So S1 is used to make predictions about S2, and if you have knowledge about S2, you can also make inferences about what the most likely past state S1 was. So it works both ways. That's perception and anticipation, obviously reduced to the most simplistic form you could come up with. These models are often criticized for being too simplistic, but I think that's kind of the beauty of it: we start with these components, the Lego blocks, so to speak. The individual blocks are pretty simple, but once you start hooking them up to each other, you get emergent dynamics that pretty quickly become hard to wrap your mind around. That's why we present them step by step. If you're really interested in the math, you would probably want to go into some of the tutorials, for example the one Ryan Smith recently made; he works through the actual equations in a step-by-step fashion. I can actually send a link to that right now.

You can put any links in the YouTube live chat so that everybody can see them; otherwise, only you will be able to see them. Cool. Thanks for all these clarifying points on Figure 1. And I agree, it'd be awesome to have Ryan, he's been on a few times. So, question related to Figure 1, Steven?

Yeah, just one question. When you mentioned the predictions and inferences, do you tend to think about the downward predictions across all the components, and then the inferences — although they're happening at all components — more in terms of the big block jumps between what M1 and M2 are inferring in relation to each other? So you look at each piece as a smaller idea predicting down and linking, and then the inference is thought of in the broader sense of something that you could get your head around as a human, so to speak?

It depends on which level of description you're currently focusing on. But in a sense, the answer is yes: once you stack another layer on top, that layer is basically, in a weird way, trying to perceive the perceptions. And that allows the system to report on these perceptions. So that's a kind of meta-level representation that gives the system the capacity to reflect on its own inferences at the lower level. We have also worked on a paper where we apply this to the precision on the likelihood mapping. I probably won't get into that too much for now, but it's there if anybody's interested. So let me just — I've gotten access to the live stream now. Let me just post.

Say hello, and then I'll make you a mod so that you can post links. But yeah, thank you for adding links to the live stream. Anyone else have a question on Figure 1? Otherwise it'd be nice, I think, to continue this incremental unrolling of the model, and then we'll definitely get to a lot of different topics in the broader discussion too, and in the live stream. So here's the paper. And now you're a moderator, nice. OK, Steven: question on Figure 1, or a broader question?
Yeah, just building on what was mentioned a second ago with Figure 1 — thanks for that clarification — this idea of a meta level of inference is useful. And this came up in the previous livestream with Ryan Smith: there seems to be something about the updating in between. So as well as going from one chunk to the next chunk, so to speak, from M1 to M2, and treating that as a way to get a handle on an inference, there's something that can be done with whatever gets message-passed up. From my understanding, those look like the two ways into the inferential piece.

Great. Yeah, so that's essentially one of the core innovations we present in this paper, something we like to call deep parametric active inference, where you're making inferences about the parameters of lower-level models, and that then guides your expectations. We're going to get into that more once we go further down these steps. Let me see — so I've managed to get access, I think. What does being a moderator let me do?

You can post links. I think people who aren't moderators cannot post links, but we got your links. So the tutorial paper and the meta-awareness paper are in the live chat. Thank you.

Yeah, good. Okay, cool. So just to illustrate the point, I think it makes sense to go further. Essentially — I think it was Dean who mentioned it, or was it Dave? I couldn't see it on my side — what was mentioned is that you have these Lego blocks and you start building your hierarchy, so to speak. One of the things that had already happened by the time I came around was the step presented here, where again, the circle with the π inside is a variable. The simple act of connecting this variable to the state transitions on the lower level means that the variable π, which stands for policy, starts to dynamically control your beliefs about how the states of the world evolve over time.

This is then controlled by the prior, and this is really a bit of a funky aspect of active inference in this formulation, because the prior on these policies is informed by the expected free energy. I call the expected free energy funky because it recapitulates the entire generative model down below; it includes the expectations of the organism about the world. There's this thing that Karl and Thomas Parr have called the generalized free energy, where they reformulate this and unpack what's inside the policy variable in terms of a whole generative model that mirrors the perception aspect. Don't let me get too deep into that, but essentially they unpack what the expected free energy here does in the model.

It's very important, because the expected free energy is what introduces what I call the phenotype-congruent action model. This is also going to answer one of the questions that came up — I watched the previous livestream that you had on this paper — about why the affective charge only arises when there's a difference between the prior and posterior over policies. We'll get to that. So the expected free energy, you can think of it as recapitulating the entire action model, with one big difference: it's biased by phenotypic outcome preferences.
The phenotypic outcome preferences essentially bring into active inference what the reward function does in reinforcement learning. They do this by biasing the way the agent behaves towards the world, trying to make the world match the preferences that are specified, here purely in probabilistic terms. The phenotypic preferences can be ingrained or they can be learned; the framework is pretty agnostic about where they come from, because it has this hierarchical nature. So I have models in which these preferences themselves are state-dependent. Let's say you're hungry: when you're hungry, your preference for food is upregulated, essentially, and becomes a stronger driver of your actions. When you're tired, you have a stronger preference against exerting effort. These kinds of preferences can themselves be made state-dependent. If you just add another Lego block, you start having inferences about how hungry or tired you are, and you can have an organism that's managing its energy resources.

Then there's this last component that I haven't really discussed: the E here. The E matrix, you can think of it as a kind of habitual prior, related to what reinforcement learning calls model-free. It's just counting the number of times a particular action occurred; it's not evaluating it by any rewards or anything like that. I actually don't like the term model-free, because in essence it is still a model: it has a separate parameter for every possible action. It's kind of like when I took my driving lessons: I remember repeating mistakes, not because I liked to make those mistakes, but just because I had made them before, and my body and brain stored that information, so I ended up repeating them just because that pathway had been explored before.

So that's a little bit of unpacking here. The way you can read this diagram is again from left to right. You start with the action model and the expected free energy guiding the a priori expectations. And this, a crooked group of scientists would say, is biased not towards what you believe the world to be exactly, but more towards what you would like the world to be like — or, if you are currently doing well, what you would expect the world to be like. So it's an intentional bias, introduced on that side. Then, on the other side, you get the perceptual evidence, the variational free energy. You can think of that as a kind of reality check. I can have lots of expectations about what I want the world to be like and where I want to end up, or what I want my shoes to be like when I buy them — I remember this example from last time — but then I need perceptual evidence to actually see if the shoes fit after I bought them. This perceptual feedback is a crucial component; without it, your agent would be living in a fantasy world. So unless there are any questions, we can move to the next slide.

Anyone can raise their hand. Otherwise, Steven: a question on this figure, or a broader question?

Okay, I'll go for the question on this figure. With your action model: with expected free energy, you often have risk, ambiguity, and pragmatic gain as terms on one side. So is it that the pragmatic gain is built into the action model implicitly?
So you've kind of rearranged it slightly with that in mind, so that all action has some sort of pragmatic gain built into the phenotypical morphology of the animal, or something like that?

Yeah. So here the pragmatic part, the C, the outcome preferences — your preferred observations — is built in as part of the phenotypic risk. If you look at the equation on the left, at the term for phenotypic risk, you see that you take the difference between the logarithm of your expected outcomes and the logarithm of your preferred outcomes, in units of nats, so information-theoretic units. The risk here is essentially just quantified as the difference between your expected outcomes given a particular policy and your preferred outcomes.

That makes sense. Cool. And it's also interesting that it's the natural log of O sub π — it's conditioned on a policy. So it's not some all-by-all chessboard of all strategies and all situations; it's very constrained with respect to what actually matters, which in the end are the affordances and the niche of the actual agent in its scenario. It's really cool to see how that gets baked into the model, not as a secondary pruning or a heuristic.

Yeah, so this just falls out of calculating the expected free energy under a certain policy. This G is conditioned on policies in that sense. And one interesting aspect here is that if you make the generative model more sophisticated, this G becomes more sophisticated as well. If you add learning for parameters, let's say the likelihood mapping, then the expected free energy also expands accordingly, and it ends up having a term for active learning. Your agent can then anticipate that exploring the environment can allow it to reduce uncertainty about these perceptual mappings, so you get an agent that's intrinsically curious about the world. You can do the same with the B matrices, the transition matrices. Then you get an agent like a child, who just experiments, exploring different states of the world, and starts picking up on the different kinds of control it has. It can even, at some point, learn to anticipate that if it experiments in a certain way, it might be able to learn new things about these mappings. This is currently not in this particular model, because, remember, to do that you need to add variables for the particular mappings. The nice thing about this Lego box of active inference, so to speak, is that you're completely free to make anything variable: as long as you specify the probability distributions, you can do the same trick again and again, on both discrete and continuous state spaces.

So I think this is a nice segue into — oh, here you first introduce some of the equations in Table 3. We have Table 3 up on the screen, or do you want Figure 3? No, I think we can go on — unless somebody wants to go deeper into this, but I think we've already discussed this part. Yep, it was really helpful to see the tables laid out.

This is actually not the way I delivered the table to the journal. It's crazy: you pay them all this money, and then they make it worse. Basically they made a new web layout, and that's what happened to most of the figures.
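To make the decomposition just discussed concrete, here is a toy Python sketch of the expected free energy for one policy, using the common risk-plus-ambiguity form, with the pragmatic part, C, inside the risk term as described above. All numbers are placeholders, not the paper's:

```python
import numpy as np

# Toy quantities for a single policy pi (placeholder values, not from the paper).
C = np.array([0.8, 0.2])            # preferred outcomes P(o): the pragmatic part
A = np.array([[0.9, 0.1],           # likelihood mapping P(o | s)
              [0.1, 0.9]])
q_s_pi = np.array([0.6, 0.4])       # predicted hidden states under this policy

q_o_pi = A @ q_s_pi                 # predicted outcomes under this policy

# Phenotypic risk: divergence between expected and preferred outcomes, in nats.
risk = np.sum(q_o_pi * (np.log(q_o_pi) - np.log(C)))

# Ambiguity: expected entropy of the likelihood mapping under predicted states.
H_A = -np.sum(A * np.log(A), axis=0)    # outcome entropy for each hidden state
ambiguity = H_A @ q_s_pi

G_pi = risk + ambiguity             # expected free energy for this policy (lower is better)
```

In a full model, G is a vector with one such entry per policy, and the prior over policies is a softmax of minus gamma times G.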
So, coming back to the layout: when you want to read the paper, I would really recommend downloading the PDF and not using their web layout.

Oh yes, this is their web layout. It definitely creates a sort of false equivalence, or makes it look like there's something missing on one side or the other.

Yeah, so just read the PDF. Unfortunate, because it's supposed to be their job. Well, here we are on Figure 3, and then we also have that part with AC, if you want to highlight that, because I know that's something we really want to understand.

Yes. So we're going to continue building; we keep adding these different blocks together. What we add on top now is this: we have G, the recapitulation of the entire action model, and we add a variable on top of it that modulates the extent to which I rely on this action model. The precision, gamma, can take any value between zero and infinity, and it tells me, when I predict and anticipate what's going to happen in the world, how much I can rely on my action model, which is biased by my preferences. That's where it comes into the story why the affective charge we're talking about ends up depending on the difference between the prior and the posterior over policies. Essentially, what this precision is tracking is the match between the value predicted by G and the actual policies that I end up selecting, the actual state transitions of the world — how well they are predicted by my action model. And this is my crooked-scientist model, right? It's the action model that's trying to realize preferences.

And affective charge — we call it that because later on it gets a special role when we go to the hierarchical setup, where we add one more Lego block on top of this. Essentially, if you unpack this affective charge equation, you see that it's calculating the difference between my prior policy vector, which is just based on my preferences, my action model, and the posterior, which is the one I have after I integrate perceptual evidence. That difference — this is where it's maybe a little hard to imagine — is kind of like the rate of change of my beliefs over time. And then you take the dot product of that with the expected free energy. What you essentially want to know is: after integrating perceptual evidence, did I get closer to the distribution anticipated by the expected free energy, or did I get further away? That's why I call this phenotypic progress. If the expected free energy is a bad predictor, this term ends up being negative; if it's a good predictor, it ends up being positive.

And there's a little bit of a play with signs there: we just define it in a way that a positive value is a good thing. That's always a little confusing, because there are even fields where the free energy is defined in the opposite way, so the whole story flips on its head. That's why you always have to be careful how it's defined locally in each paper, and when you come from a different field, you have to make sure you've got your signs in a row, so to speak.
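Here is a toy sketch of that computation, following the verbal description above (prior minus posterior, dotted with G). The numbers, and the softmax form of the policy prior and posterior, are illustrative assumptions rather than the paper's exact update equations:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Three policies with toy expected free energies (lower G = more preferred a priori).
G = np.array([2.0, 1.0, 3.0])
gamma = 1.0                          # current precision over the action model

pi_prior = softmax(-gamma * G)       # prior over policies, from the action model alone

# Posterior over policies after integrating perceptual evidence
# (F stands in for the variational free energy of each policy).
F = np.array([1.5, 0.2, 2.5])
pi_post = softmax(-gamma * G - F)

# Affective charge: prior minus posterior, dotted with G. The sign is flipped
# (prior minus posterior, not posterior minus prior) so that AC is positive when
# the posterior shifts toward low-G policies, i.e. "better than expected".
AC = (pi_prior - pi_post) @ G
print(f"affective charge: {AC:+.3f}")
```

With these numbers the perceptual evidence favors the policy that already had the lowest G, so AC comes out positive: phenotypic progress.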
Yeah, if I can add to that: one major field difference is that physicists often go downhill — gradient descent, optimization, minimizing a loss function — while biologists often talk or think about hill climbing, fitness peaks, maximization, congruence, the best possible world, that kind of Panglossian world. So, totally agreed. And especially when there are a lot of natural logs and negative signs, and differences and divergences that can only be positive, and things that are strictly negative or bounded, it really needs to be walked through slowly, because it's so easy to get tripped up by how these things flip back and forth. So, really important to go slow and make sure we're connecting the right qualitative ideas with these variables, because it's not as if you can just throw up the equation and it becomes easy to define what each variable means for real systems.

Yeah. And on that note: in the paper, we interpreted this gamma as a type of subjective fitness. That's because it's literally tracking the degree to which your preference-biased action model fits the actual perceptual evidence that you're getting back. And that's why the affective charge can only be nonzero if there's some mismatch between your prior and your posterior beliefs about policies: only when there's some mismatch between what you expected and what you got does it make sense to talk about updating your beliefs about precision, and that can go in a positive or a negative direction. My fit can actually be better than expected, or worse than expected. That's why we describe in the abstract that this lends a sign to these predictions. If you think just in terms of predictions, all predictions are either wrong or very wrong. But if you go to this meta level where you're estimating how wrong you are — is my prediction getting better or worse? — that's where the sign comes into the story, and where you can talk about improvement or worsening in the state of the world, so to speak, or in the state of yourself.

This really applies to any kind of state, by the way. It can be about tracking internal states, as I said, hunger and fatigue, but it can also be about the affective states of others, of conspecifics. This kind of subjective fitness can be linked to any arbitrary state of interest. And if you go to the next level, the state inferred at one level can be an observation for the next level. That hierarchical trick is, I think, the power of it. We have freedom to do what we want with our Lego blocks, so to speak, depending on the system of interest. We try to keep it simple while we're just demonstrating possibilities, but the implications get really big once you release the simplicity constraints and go wild constructing hierarchical models: you can have expected free energies at different levels, you can have these precision terms at different levels, so you can have competing affective charges. You can have the affective charge from your lower level — let's say you're trying to fight your addiction or something like that.
The lower-level parts of your system are generating these cravings, and the higher-level parts are trying to generate this kind of self-actualization, or something like that, where you basically end up having conflicting drives. So that's where we can go, but to get there, we need to add states on top of this. And if there are questions, we can open them up.

I'm going to ask a question from the live chat, and then anyone who hasn't spoken yet can ask a question, and then we'll get to Steven with a question about anything else we've talked about. So, in the chat, someone asks: can we relate affective charge with motivation, as motivation can also affect policy? How do we think about motivation in this kind of model? Is it affective charge?

Yeah, so motivation in that sense is a very fuzzy concept, right? But if you think about the gamma term here, it's modulating the extent to which you're relying on your action model. And C, the preference factor, is modulating the extent to which you're driven by your preferences. Combining them together essentially means that if you have, let's say, high confidence in your action model, you actually allow it to influence your expectations. So in that sense there is definitely a link with motivation. But then again, you also need preferences that motivate you to move towards them, right? What's kind of interesting is that if you take away the preferences, you can have an agent that's motivated entirely by epistemic value. To some extent that's interesting because the precision starts to modulate curiosity, and that's a different kind of motivation. Anyway, I could say much more, because it's such a fuzzy concept.

Very true. I think reward motivation, curiosity motivation — there's expressivity in the model to actually talk about motivation with respect to specific framings of it.

And the way I've recently thought about it is also that if you add a higher level and connect it to this preference matrix, you can basically modulate the extent to which you're motivated by different kinds of outcomes. That means you can monitor, on a higher level, the extent to which you're currently satisfied. For example: you're hungry, so your preference for eating food is upregulated. Then you get your food, and you get interoceptive feedback that you're satiated, and then you're open to more epistemic drives, let's say. So there's a sense in which the various types of motivation have to be balanced by a higher level. The current figure is not able to do that — our full Lego box can definitely do it, but the current figure doesn't have dynamic preferences. I think that's what you need to get more satisfying accounts of motivation in general.

Thanks for that, awesome response. All right, Steven, and then anyone else who raises their hand.

Just one question. You mentioned the product with expected free energy. Could you just clarify that a little, because I think it's got something to do with the precision, and you've got the larger free energy piece. If you could clarify, that would be helpful.

Yeah, so this affective charge term is not something we came up with, right? It's something that comes out of the math when you start with this gamma distribution.
The gamma distribution has a particular mathematical shape, an exponential based on the natural exponent. When you use it as a probability distribution, it has this tail that runs towards infinity, and the expectation value is regulated by what's called here the temperature parameter, also called the rate parameter in other contexts. Essentially, the temperature parameter regulates how strongly the distribution is shifted towards zero. What happens is that you try to optimize this beta parameter to best fit what's happening — you can do free energy minimization to do that. And what you get out of it, kind of for free, once you write it down in probabilistic terms, is this affective charge term. So what I was describing is essentially an interpretation: this is what we got when we took the generative model and let it minimize variational free energy on this beta parameter — it just spits out this affective charge term — and then it's a matter of interpreting what this term means in a mathematical sense.

Essentially, it's taking the dot product between two factors. One factor is the difference between the prior and the posterior over policies; the other factor is the expected free energy for each policy. And essentially, if the two factors are pointing in the same direction, that's bad news for the organism, because if my posterior results in larger expected free energy, that's bad — I'm trying to minimize it. Wait, let me turn on the light. There, darkroom problem solved.

So the idea is basically that you can think of the expected free energy as a kind of landscape across the domain of potential actions. What I want to know is: after integrating my perceptual evidence, did that vector on the left match the original vector I had in terms of the expected free energy? That's why, if you look closely, it's the prior minus the posterior — that's the negative rate of change. The actual rate of change would be the posterior minus the prior, but we flipped the sign so that you can interpret the affective charge as positive when the precision goes up and negative when the precision goes down. Does that answer your question, Steven?

Yeah, I think that's very helpful. Thanks.

One example that came to mind: you're driving, and a maps application tells you it's going to take one hour. As the estimate gets worse, that's bad — if your preference is to get there on time. If your preference was to get there later, maybe it's more neutral. And as the update changes relative to your policy, it could be, oh, it's going to be 18 more minutes because of this crash, but then it speeds up again. So we're always in a really specific situation, conditioning on a policy and on the information we're getting. We're not modeling the whole city's traffic flow; this is about the person getting the info from the screen and then making decisions — should I get gas here or not? — not every possible decision they could be making. So it's a very interesting way to frame this variable.
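To pin down the direction of the update just described, here is a toy sketch of how the affective charge moves the rate parameter beta, and hence the precision gamma = 1/beta. The one-step subtraction and the step size are illustrative assumptions; the paper's actual update is derived from free energy minimization on beta:

```python
# Continuing from the affective-charge sketch above.
beta_prior = 1.0          # rate (temperature) parameter of the gamma distribution
AC = 0.311                # affective charge from the previous sketch (positive = good news)

# Positive AC (fit better than expected) pushes beta down, which pushes the
# precision gamma = 1/beta up: the agent relies more on its action model.
# Negative AC does the opposite.
step = 1.0                                       # illustrative step size
beta_post = max(beta_prior - step * AC, 1e-6)    # beta must stay positive
gamma_prior, gamma_post = 1.0 / beta_prior, 1.0 / beta_post
print(f"gamma: {gamma_prior:.2f} -> {gamma_post:.2f}")   # precision goes up
```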
And this dot product — I don't know how many people here are familiar with how this kind of vector calculus tends to work, but essentially you can just think of the dot product as calculating the degree of overlap between the vectors. If they point in exactly opposite directions, the dot product is maximally negative; if they point in exactly the same direction, it's maximally positive.

And what are the dimensionalities? You're talking about them as vectors, but are they scalars with just one entry, or are there potentially multiple? How do we think about it — one number, or a long list of numbers?

So in this older formulation of the expected free energy, it was one value per policy, essentially. Every policy, which is an action sequence, has a total sum of expected free energy associated with it. So it's a vector in the sense that every policy has its own element. More recently we've actually extended this in the sophisticated inference paper, because anybody who's worked with policy spaces knows about the problem of combinatorial explosion: once you start considering courses of action extended in time, every point in time you expand further makes the problem more untenable and intractable. So in a more recent iteration, we subdivided further in terms of action components: instead of one single policy vector that has to regulate all the machinery, every time step has its own action variable, and you're doing inference on every time step. This prevents you from having to integrate everything right away — the end result is, again, integration, but it allows for parallel computing, let's say. Anyway, that's a different paper. I can actually post that as well.

I think we actually read it — it all blurs together, but we did read sophisticated inference, I think, at one point.

Oh yeah? Nice. You read lots of papers.

We've got to catch up. You're releasing them as fast as we can get to them.

More recently there's a paper called Sophisticated Affective Inference — so you have another thing to catch up on — which essentially combines this affective inference story that we're talking about today with that tree search, and that allows you to simulate an agent that has an affective response to imagined futures. It would be cool to come back and discuss that one at some point.

Cool. So, Dean with a question, and then anyone else who raises their hand.

Casper, when you go from a 2D M1, M2 to a 3D action, have you thought a little about — less about the integration, more about the deflation and inflation that happens when you go from fewer dimensions to more?

Yeah, so that's one of the motivations for going to new types of generative model, because this policy variable here can explode pretty quickly once you get to multiple steps into the future. You somehow need to reduce the dimensionality of the state space that you're considering.
One thought that kind of soothes me when I build these types of models is that even though, as we say, all models are wrong but some are useful, we know that the organisms we're trying to model have exactly the same problem that we have: they're trying to make sense of something that's generally intractable. So when we're trying to build a model of their model of the world, you can actually justify the simplifications, in the sense that the organisms have to make oversimplifications as well to make any sense at all of their environment. I don't know if this exactly answers your question, but this was the thought that came up in response.

On Dean's point about the dimensionality increasing: one thing that happens is that you basically have to simulate potential futures internally. Once you start conditioning the states on your actions, every possible action has a parallel process of inference happening to predict what's going to happen in the future. Every dimension you add on top can potentially make the whole thing intractable. So we're always facing that challenge.

Yeah. Dean, a quick follow-up?

So sometimes you hear the expression, oh, well, that problem is going to require some outside-the-box thinking. And of course that's so ambiguous nobody really knows what it means. But what you've done here, I think, is given people an opportunity to get out of the spatial envelope, out of the I'm-an-agent-enveloped-by-the-world-around-me view, and given them a bit of an outside perspective. They can actually get outside and see how, as an agent, they're perceiving the world, and how they can then potentially act on the world. But, as you said, they're constantly updating; a lot of this stuff isn't front of mind as they're doing it. I think that's one of the big things about going from 2D to 3D: you're not just zooming in and zooming out, not just looking down on something. It's like when you hit the 3D button on Google Maps: it doesn't stay at one position, it actually circles around the thing you're looking at. And I think that's what this provides people: the sense of not just being inside of something, but actually being able to step outside of it as well.

I mean, that's something that actually does happen in this type of model: when you add this π, let's say in M3, you end up expanding the whole thing, and it is, like you said, like stepping from 2D to 3D in a sense, because you're expanding the whole space of possibilities in the model. And in the same way, it's kind of hard to wrap your mind around. I think that's part of what makes these figures hard to read for people, especially when you present the top level right away. It's like asking somebody to step from one dimension to three dimensions in one go; you have to go through the increases in dimensionality step by step. That's also part of what motivates this incremental presentation.

Thanks. Awesome question, thank you, Dean. So, Blue.

So we read this sophisticated affective inference paper a long time ago, and I always think about it. I think I brought it up in the dot-zero livestream, actually. I thought it was the big five paper — I think we read them right at the same time, and they're very different.
But this sophisticated affective inference paper was really relevant to anxiety: with future time steps, the further out you go into the future, the less you can predict what's going to happen, and that's like the underlying basis for anxiety. And I thought about this, and I also thought about the question about motivation. It made me think of how Tony Robbins has this theory that there are six driving factors that influence or motivate people through life, depending on personality type or whatever. Certainty is one of them; uncertainty is also one of them, like people who like variety; then there's growth and altruism, and I can't remember what the rest are. But it's interesting to think about what an underlying motivation is for people and how that relates to this type of model, as well as those driving forces through life — how you might start to model someone who's driven by uncertainty versus someone who's driven by certainty. The uncertainty-driven people might place more weight on epistemic value than on actual reward value. So it's just interesting to think about.

Yeah, I mean, there's a paper in the works that got stalled, unfortunately, where we're trying to do something like that: talking about the way the big five have emerged as an apparently pretty strong factorization of people's behavioral tendencies over long periods of time — their personalities — and why they tend to factorize like that, and to what extent you can capture that with something that looks like this. With at least one more layer, I would say, but to some extent, yes, you can capture these tendencies towards exploration — the exploration drive versus risk aversion versus risk-seeking behaviors. I'll mention it when it comes out.

Cool. Blue, that was really interesting, and it reminded me of how people prefer, let's say, to read different amounts and different topics — historical fiction versus science fiction, all these different genres and sub-genres. It's related maybe to what their regime of attention will latch onto, and that's something that differs as you age, and it's encultured and embedded as well. So it's an interesting example we could go into. And, live fact check: in ActInf 11, way back in 2020, we did do sophisticated affective inference, the simulating-anticipatory-responses paper, and that was way more anxiety-driven — no, it wasn't just because it was 2020. And the first one of this year, with Adam Safron and Colin DeYoung, was the cybernetic big five, which is kind of free energy inspired or based as well. So yep. It's not that we're going through them fast, but there are so many to read, and just to get even a little bit of a grasp, reading 25 in a year, one every two weeks, is the kind of pace we have to hold. But I'm sure you read more than one every two weeks.

I guess I did read it somewhere and I was like, whoa — I wasn't sure, I didn't remember whether it was sophisticated inference or sophisticated affective inference. It was just the conference proceedings, though.
So it was very short. Yeah, it's a little too minimal, I think, to really get it; it was more like a technical note. It would be hard to follow if you're not familiar with the modeling and the actual nitty-gritty. In that sense, conference proceedings are just a way to communicate to people who are in exactly that technical domain — not best for communication to broader scientific audiences.

Yep. Well, kind of on that point — and anyone can raise their hand — I moved the slide to the additional information, where there's the code. Maybe just: what would you say about the code? What are the inputs and the outputs? What would somebody need to run it, and what could they do with it? It's really cool that you provided all this information, but to bridge the gap you just mentioned: what do these different scripts do?

Yeah, so essentially, you first need SPM12; that contains the core functionality it's based on. These scripts are adaptations of scripts in SPM12, and the only thing you need to do is run the main script, the adaptation of SPM's MDP routine with the 'emo' suffix — that stands for factorized emotions, which is something I worked on at some point. That is, of course, only enough to rerun the model that we already have — the one on the slide before, this one, and the one above it. It's been so long by now that I'd have to look at it again.

But what I would recommend for the future — to be honest, I don't like MATLAB at all; the only reason I worked with it was that the existing active inference modeling was part of the existing SPM package. But lots of important work has been done by Alec Tschantz and Conor Heins and a few others to transform everything to Python, and to do computational optimization while they're at it. So there's this GitHub organization called infer-actively.

Yeah, here we are on it. pymdp, under infer-actively, is the GitHub repo?

Yes. So instead of trying to get the MATLAB code running on your PC, I would recommend just working with Python, and in the future we'll have a module in there that can do the same as what we did in this paper.

Awesome.

That's something, I think, that is much more sustainable for the future. We're working on integration with GPUs, et cetera, to make things that can be simulated in a scalable way, that can integrate with TensorFlow, so you can even do hybrid modeling with what's called amortized active inference — deep learning models that are connected to components of these Lego boxes I was describing. But if you want, I can run you through this MDP in MATLAB. I'm not a fan of MATLAB myself, so I think it would be a little bit of fun.

Yeah, let's definitely have a ModelStream for the Python version, because I know that's something a lot of people will be interested in. The MATLAB has a certain charm to it, but I can see how you might want to develop with other approaches. Even so, when Christopher and Ryan walked us through the code in the ModelStream, just seeing how the matrices multiply — there's something nice about how it was done in MATLAB, but it's not interfacing with all these modern tools like TensorFlow that you mentioned. Anyway, let me see.
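For orientation, here is a minimal pymdp usage sketch, written against the library's API as documented on the infer-actively GitHub. The matrices are random placeholders, and the hierarchical affective machinery from this paper is not included; check the repo's README for the current interface:

```python
import numpy as np
from pymdp import utils
from pymdp.agent import Agent

# Placeholder generative model: one observation modality, one hidden-state factor.
num_obs, num_states, num_controls = [3], [3], [3]
A = utils.random_A_matrix(num_obs, num_states)        # likelihood mapping P(o | s)
B = utils.random_B_matrix(num_states, num_controls)   # transitions P(s' | s, u)

agent = Agent(A=A, B=B)          # C (preferences) and D (prior) default if omitted

obs = [0]                        # one observation index per modality
qs = agent.infer_states(obs)     # posterior over hidden states
q_pi, G = agent.infer_policies() # posterior over policies and expected free energy per policy
action = agent.sample_action()   # sample an action from the policy posterior
```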
I think we can move on, but based on your feedback, what I should do is write a little explanation of what to do with that code — if you do want to run it, let's say in MATLAB, what should you do. That's actually useful feedback. The reason I added the code was mostly for reproducibility, in case other technical folks want to look at what I did, but it's not out-of-the-box usable by anybody not familiar with it.

Yep. My advisor always used to say: make it so that the anthropologist from Mars will know what to do with that spreadsheet, because even for people in the field, or even for yourself months later, it's like, wait, what? And that was my own research.

I mean, the proper practice is to add a README page on the GitHub. In my mind MATLAB is kind of a transitional thing on its way out, but I do agree that documentation should be part of it.

Oh well, yeah. Blue or Steven? Either of you raised your hand.

Yeah, just one point about the affective charge piece that you mentioned before. Can I just check: is that like an in-between level, between one inference layer and the next? Is that correct?

That's entirely correct. Once we go to figure — I think it's Figure 6. Let me see — here, Figure 7, sorry. So, essentially, now we connect what's happening at a higher level to inferences about, or predictions about, what's happening on the lower level. And that's when the affective charge basically becomes an ascending message that informs those inferences. There is an equation in the manuscript that I don't see here, because it was in a separate table. Oh, this part? We split Figure 7. These are the priors, so this doesn't describe the posterior beliefs; the affective charge factors into those. There are many figures in the paper — we didn't include that one, but it's in multiple places. Essentially, what we call affective evidence: in the journal's numbering it's on page 420, but the actual page of the PDF is 23. Here we go.

Yeah. So, just to illustrate what's happening: in this paper, we use bars to indicate posterior beliefs. Here, the bar-S on the left is the posterior belief at time capital T — these are the cross-trial time steps, the slow time scale, essentially. The superscript indicates the affective state. So the posterior affective state on the higher level is a softmax function that consists of the prior belief — the logarithm of the posterior from the previous time step multiplied by the transition matrix; that's where the prior comes from — and then this term after it, which comes from what we call Bayesian model reduction. That's what you need when you want to connect discrete and continuous state-space models. Again, this term drops out of the derivations you can do with this formalism, Bayesian model reduction. In this case it's a sort of free energy minimization, assuming that changes are locally small enough to make this connection. And if you ask what's small enough, well, that's something you would have to test in the world, to see if the approximation holds.
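Here is a toy sketch of the higher-level update just described: an empirical prior from the affective transition matrix, plus an ascending term driven by the affective charge, combined in a softmax. The two-state valence coding and the exact form of the ascending term are illustrative stand-ins; the paper derives the actual affective-evidence term via Bayesian model reduction:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Two higher-level affective states: [positive valence, negative valence].
s_prev = np.array([0.5, 0.5])        # posterior at the previous slow time step
B_aff = np.array([[0.9, 0.1],        # affective transitions: moods are sticky
                  [0.1, 0.9]])

AC = 0.311                           # affective charge ascending from the level below

# Stand-in for the affective evidence: positive AC favors the positive state,
# negative AC favors the negative state.
affective_evidence = np.array([AC, -AC])

prior = B_aff @ s_prev               # empirical prior from the transition matrix
s_post = softmax(np.log(prior) + affective_evidence)
print(s_post)                        # belief shifts toward positive valence
```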
Essentially, what you see here is the affective charge coming back to haunt us, so to speak, but now as a message that is passed upwards. The higher-level state spans two extremes — the extreme positive side, beta plus, and the extreme negative side, beta minus — and at any point in time the organism is somewhere in between. Its beliefs are never at the exact extremes, just as you can never have 100% certainty about anything. Downwards in the hierarchy we make predictions; that's where we do Bayesian model averaging. Upwards, you have to gather these messages — that's what's called affective evidence here — using Bayesian model reduction. So that's a very long answer to your question.

No, thanks. Just one other thing with that: this then gives a way for somatic processes, in everyday life or even in trauma, to be integrated, because it could be that non-cognitive processes are at play in adjusting this affective charge, or keeping a score of it. And that could give a way for those processes to influence perception or action model selection.

Yes, and this high-level state is not constrained in any way in terms of what you can add: other action models, other components of your system, can add their own affective charge as evidence. At some point in the beginning we mentioned this affective workspace idea. We call this affective charge because it's pretty domain-general: it can be gathered from any kind of action model. It can also be used to model what happens when you're listening to music: your attentional action model is being played with — congruence and incongruence, matches and mismatches, fluency and disfluency — and that creates a kind of affective roller coaster, so to speak. These ups and downs can be gathered from the action model that guides your attentional states: the question of what to attend to when you're listening to music can be something that generates affective charge and then informs your valence state.

It can also be purely associative, so it doesn't exclude purely associative types of valence. There can be certain things that come back in the contextual evidence, if you go a little bit down — I think it's up here. So you can also gather evidence from contextual states. In this case we just talked about contextual states in terms of whether the food is on the left or on the right, but you can also have contextual evidence being passed back up, and that can generate a kind of Pavlovian learning, where you just learn to associate particular contexts with particular positive or negative states. So there's also space to include these types of feelings that I think are computationally a little less interesting, because they're just kind of Hebbian learning in a sense — but that doesn't make them less important for experience. These purely associative relationships also come in; we actually discuss that in the discussion section. Let's see how far we get today.

Nice. I wanted to ask, actually, about Figure 10. What are you showing in Figure 10, and how should we read it? Does it apply to your model only, or to future models, other categories of models?

Everything that's happening here from the orange level downwards, let's say, is what we implemented.
But everything that's in the gray is something we presupposed. We presupposed, for example, that this rat has already learned how the maze works. We presupposed that it has some perceptual capacity, and some action control over, let's say, where its body is. So all the things we needed to presuppose in order to actually run the simulation demonstration are in the gray part of this box. The nice thing is that if you follow the arrows, this gray part — evolution, development, and learning — is also ordered in terms of time scales. And it's always a circular story: you get these different nested time scales that are all influencing each other recursively. Once you get down to our actual computational experiments, you are at the point in time where you enter, on the left, this gray arrow that points into the affect box. You can think of that as the one that initializes the simulation; this gray arrow is what initializes everything that allowed us to even simulate it. Then we explore the dynamics of what happens within that, and all the other arrows illustrate that.

So the orange arrows provide priors over precision, for the action model and for the perceptual states. Then from the minimal metacognition level there is again an arrow pointing down to inform action, and jointly the affective states end up informing perception. It then passes through, in the way you described in the last session, and that's where it connects to the world. Then it passes back up again; that's where the perceptual integration happens, and it trickles upwards into all these layers.

We could have simulated learning as well — that's where this orange arrow points back up on the right, to the posterior phenotype. We didn't simulate learning because it wasn't the focus of the paper, but we have the machinery; it's in our Lego box. And that, I think, will make the story even more interesting, because then you can think about active learning and how that influences affective states — how people can enjoy learning just for the sake of learning, or how they can learn to associate affective states with contextual states, and how that recursively influences their system over time.

There's one simulation I've been preparing where we have a setup like this and we actually simulate the development and learning as well. The idea there is that you simulate a child, and it has hyperparameters that influence different parts of its lower-level model, but in the beginning these hyperparameters are very unstable. Then there is a simulated parent that labels the states — basically reflecting back and giving the system labels to work with. And the idea is that, in the end, the labels end up stabilizing the inferences. This is to capture the idea that, as we know, social interaction is crucial for developing any kind of emotional self-control. There are these famous stories about children growing up in the wild, or something like that, who never really reach the level at which they can exert the kind of self-control, reflective capacity, and communicative capacity that we have. Anyway, that's another tangent — a very interesting one.

Yes, very interesting.
Steven, and then anyone else with a question. In this diagram, you mentioned minimal metacognition, and then you've got affect and context. So would that minimal metacognition be a sort of phenomenological consciousness, the kind of awareness that you can't necessarily take a perspective on, and it goes up into affect and context, and that's where it's consolidated enough to be able to take a perspective on it? Would that be correct? That's how I do tend to view it, to some extent. I mean, the actual representation of these precisions is often assumed, or hypothesized, to occur in a localized fashion in the brain, to some extent in the striatum. But that's just a hypothesis that can be tested. And it's actually an interesting thought that other biological systems can have entirely different kinds of representations for this precision term. Daniel works with ant colonies; they might have their own way of encoding this type of reliance on action models, and maybe that has to do with weather conditions or something like that. There can be some kind of shared, very minimal way of encoding that reliance. The model is, in that sense, pretty agnostic on how it's represented, and you can just test hypotheses. And also, it's really interesting how we can draw out that link to qualitative concepts or to phenomenological experience, but that's not part of this model. This is a claim about a modeling architecture that lends itself to certain kinds of calculations that maybe previously would have fallen within the domain of information theory, or control theory, or cybernetics, or Bayesian inference, or multi-scale systems modeling. And I know that those are some of the areas that you've drawn on in your work, Casper, which is why you could kind of see that there was, like, one, two, three, but not four, five, six, for where the active inference model was going. Because you even discussed how, before some of your recent papers, the active inference model had less temporal depth and less metacognitive depth. So it's just really cool to hear about how it remaps where the questions and the modeling approaches fit in with respect to previous disciplinary approaches. So, Dean, and then anyone else with a question? Oh, you're muted, Dean. Yep. So, you've got time going from left to right, and that's pretty much consistent, and you've got the parabola introduced from top down and then back up. Can you see, Casper, the idea that around perception, around action, around minimal metacognition, and around affect and context, there could be a counterclockwise spin? Because I actually saw that in the way that people were working, not even aware of this model: that that's kind of the direction, where you've got the flow across the bottom and then the introduction of the parabola from top down and then back up. And that's where you could actually get people looking through both directions. That's why you had the affective valence; that's part of where the charge came from. Or am I confusing matters? Because when I looked at that figure, I actually saw, for people in that flow state, what direction the spin was going around the middle of your figure. Yeah, I mean, this figure is looser; it can move away from the very precise mathematical meaning of the other graphs, right?
Those are directed acyclic graphs, but it's a little bit of the same idea here, like you said. In that sense, it doesn't really matter where they're moving. The reason that I put the upward arrows on the right is that, in the temporal sense, they represent the posteriors, right? So it's moving from the prior to the posterior. But then, because it's nested, every next level means that you've already cycled: basically, yesterday's posterior is today's prior, in some way, right? And the capacity to integrate into the next iteration can also itself be a problem. You see that a lot with people with trauma, who actually integrate an experience too strongly into the way they work. And this comes to very interesting questions about functional forgetting, something that maybe a lot of these meditative practices are actually able to do: helping us forget the irrelevant bits, the things that we shouldn't take to the next iteration, right?
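Here is a tiny sketch of that "yesterday's posterior is today's prior" cycle, with a forgetting factor standing in for functional forgetting. Again, this is our own toy illustration under stated assumptions, not anything implemented in the paper; the forgetting value and observation stream are made up.

```python
import numpy as np

# p(o | s): rows index observations, columns index two hidden states.
likelihood = np.array([[0.8, 0.2],
                       [0.2, 0.8]])
flat = np.array([0.5, 0.5])     # uninformative reference prior
forgetting = 0.2                # 0 = carry the posterior over exactly;
                                # 1 = start fresh every iteration

prior = flat.copy()
for o in [0, 0, 1, 0]:          # a stream of observations
    posterior = likelihood[o] * prior      # Bayes: prior times likelihood
    posterior /= posterior.sum()
    # Next iteration's prior: the posterior, relaxed toward the flat
    # prior, so no single experience is integrated irreversibly.
    prior = (1 - forgetting) * posterior + forgetting * flat
print(prior)
```

With forgetting set to 0, an extreme posterior would be inherited wholesale by every later cycle, which is one crude way to picture the over-integrated, trauma-like imprint described above.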
So that counterclockwise spin is essentially expressing reflection. Sometimes you choose to reflect on something, sometimes you over-reflect on it, which is what you're describing. But all I'm saying is that when I saw this, it was just more confirmation. Yeah, I think it's correct, that's all. Yeah, it's a nice idea, in the reflection part, becoming aware of the things you want, of how you want to learn, basically. It's something that we did try to capture. And it gets very interesting: you start to be able to simulate processes like meditation, and what's happening in the mind when you develop control over these attentional processes at a higher level. Anyway, I think Steven wants to say something. Yeah, Steven, go for it. Yeah, just following on from that. What's interesting is how this brings in this idea of affective representation, and how we think of representation normally as a thing in the head, as opposed to maybe objects in the niche, which we act on or create or interact with. And now you've got an idea, in s2 I suppose, of what could be stored in an accessible form, where the prior phenotype may not be entirely accessible except through science. Is this affective representation? I just wondered, because we've had quite a lot of conversations about this: what do you see representation being in all of this, or the possibility for active inference and representation? Yeah, so what I like about this approach is that we basically don't have to pin down exactly what the affective state means, as long as we're trying to figure out how it can be inferred by an organism. The organism can use very limited cues to infer that, and then you have a whole set of organismic states of the system that will correlate with that inference to some extent. The organism is doing that internally, and usually we think about this in terms of animals, but it's interesting to think about more abstract types of representation, and to treat it, in the end, as a statistical concept. So I work with models where you add a layer on top of this affective-contextual layer, like here, to link it back to verbal expressions. And if you ask me what the representation means in this case, it just has a very specific computational effect within the generative model. Each affective state will correspond to a certain mode of cognition and action, all the way down to the lowest level. The model doesn't have to specify all of those details; as long as you know how the connections at each level happen, the rest becomes an emergent whole. And I try to be sensitive to the way in which the internal representations are not necessarily the same as the actual affective state, if you look at all the different layers combined. When you think about that, you start to get into really interesting domains, like this disconnect between your beliefs about your affective state and the way your system is currently behaving, and the different kinds of affective disorders; you can start thinking about hallucinations, or self-reinforcing anxiety, as in one of those papers. I guess I don't really have a very concrete answer, because these representations are meant to be abstract until you start to model them in a specific context. Oh yeah, you wanted to say something. Yeah, I think that makes a lot of sense. And also, what I see as being a representation, or the models that people think of as being in people's heads, are actually always something in the niche that we work with, so to speak. What we know is the action model for how to act and perceive, to recapitulate; it just feels like we have that also in the brain. But actually, to some extent, the most concrete model would be the emotional imprint that helps us undertake the interaction with the models that we work with. Those models don't need to be in the brain in that representational form, but you might need some affective code to know how to engage. That ties in a lot with some of the work in micro-phenomenology and some of the work in mental space psychology, so it's quite interesting. Thanks for this point. Yeah, I mean, what actually motivated me to move in this direction is something that's not really included in the figure much here, but that's super important: the social dimension. Once you think of this affective state as a kind of internal representation that is inferred from any number of cues, from an arbitrary number of sources of data, so to speak, you can also think of the facial expressions of my conspecifics as being informative about my affective state. You start having a very natural way of modeling things like emotional contagion and, in general, these kinds of empathic responses, where you're also tracking the group affect, and that places priors on your individual experience. You basically have a very natural way of moving from the bodily interaction with the world in the niche, as you described, to these abstract social relations that we're tracking all the time.
So I'm involved in a project right now on primate foraging, and it's really interesting to think about the group dynamic versus seeking out your own epistemic knowledge. It's just foraging, an agent-based modeling project, but I wonder to what extent, instead of knowing which trees have fruit, for example, I could just follow a group member. From what you said about the social dynamic, is following the group kind of interchangeable with epistemic knowledge, really? So there are different things you can do. One of the ways you could factor that in here is something that has been worked out in the active inference community in terms of deontic cues: basically, cues that other conspecifics give you to indicate what context you are in, with your action model then specifying, for this particular context, what the appropriate action is. A very simple example would be whether or not you are in a context where you're following another conspecific; bees, I think, have their dance that they do to indicate whether other bees should follow them. So there's a way in which these levels can talk to each other, through signaling what the appropriate context is, and then from that inference about the context you can have the whole action model rolling out on the lower level. And I think the following behavior that you described does have a pretty direct analogy to our innate tendency to synchronize affective states with our conspecifics. Even with other animals this is pretty interesting, something that other animals also seem to be able to do: to pick up on nervousness across species, right? Markers of stress; we seem to be able to pick up on those across species, in a generalized fashion. It's an endless number of rabbit holes we can go down.
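A minimal sketch of that deontic-cue idea: a conspecific's signal is evidence about the current context, and the inferred context then sets the prior over policies via a context-specific E matrix. This is our own illustration, not code from the paper; the cue probabilities and E values are invented for the example.

```python
import numpy as np

# p(cue | context): a conspecific's signal is evidence about context.
# Contexts: 0 = "follow the group", 1 = "forage on your own knowledge".
cue_likelihood = np.array([[0.9, 0.1],   # cue 0 mostly signals "follow"
                           [0.1, 0.9]])  # cue 1 mostly signals "solo"
context_prior = np.array([0.5, 0.5])

cue = 0                                   # observed deontic cue
q_context = cue_likelihood[cue] * context_prior
q_context /= q_context.sum()              # posterior over contexts

# Context-specific priors over policies (one E row per context).
# Policies: 0 = "follow a group member", 1 = "seek fruit trees yourself".
E = np.array([[0.8, 0.2],
              [0.3, 0.7]])
policy_prior = q_context @ E              # Bayesian model average
print(policy_prior)                       # the cue tilts policy selection
```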
Here's one more little rabbit hole. Looking at this figure, it reminded me of concurrent programming languages like Go, and the idea of having nested processes that are defined by interfaces. We've been talking a lot about interfaces, as Markov blankets, or as holograms with Chris Fields, but interfaces are also programming patterns that make nested processes effective, or tractable, or able to run on distributed computation without halting or spiraling out. So in addition to the Python, potentially there could be something like a concurrent implementation. And then it relates to this question about representation: is the representation a static object, is it a dynamic process? I don't know what the formal answer is, but it reminds me of what we talked about with Chris Fields, with types as processes and the two spaces. So maybe by defining these interfaces with the right dimensionality or bandwidth or structure, however the screens are defined, there will potentially be modelability: not philosophical clarity on whether the group is conscious, or the person, or the part of the brain; like the ant question, those are going to be perennial debates. But the modelability of the ant colony evacuation is going to be more akin to the modelability of these other higher-level processes, because of how abstractly but also explicitly defined the interfaces are.
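As a playful sketch of that analogy, here is a tiny concurrent toy in Python (standing in for Go-style channels): each level only sees what crosses its interface, echoing the Markov-blanket locality discussed below. This is purely illustrative and appears nowhere in the paper; all names are ours.

```python
import queue
import threading

# Each "level" is a process that only communicates across its interface:
# an inbox for ascending input and an outbox for descending prediction.
def level(name, inbox, outbox):
    msg = inbox.get()                        # evidence crossing the interface
    outbox.put(f"{name}-prediction({msg})")  # prediction crossing back

up, down = queue.Queue(), queue.Queue()
t = threading.Thread(target=level, args=("L2", up, down))
t.start()
up.put("sensory-evidence")               # the lower level posts upward
print(down.get())                        # and receives the prediction
t.join()
```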
Yeah, I think, in a similar way, the way I see these representations, I'm kind of fine with thinking of them entirely as implicit: as descriptions of something that implicitly occurs in the system, a kind of tool, essentially. If you make it explicit, it ends up being amenable to modeling; that doesn't mean there's ever this kind of explicit representation in the system. All we're doing is making these things explicit so we can actually compute things. I'm pretty happy with being agnostic about that, and in the fight of representationalism versus something like enactivism, I think they both have points, and I'm pretty happy with the models being agnostic on this question, because models speak for themselves. Cool, it's almost like you're putting up your code and your model as evidence, and if they want to prosecute the case, to debate whether your evidence supports their notion or this notion or some future notion, that's a second-level question. The first level is actually what you're laying out here: how the variables are linked to each other. And then there are alternate architectures that are possible. Yeah, every model is in that sense just a very elaborate hypothesis, and you can test it by doing empirical work. And then architectures can be compared to each other once you have that. To do that you need to go explicit, and that's the only part where I think we're being reductive: to get a computational grip, a scientific grip, you could say. I have been working on getting more observational constraints and these types of things, but there is such an amazing richness of things to explore by just capturing the phenomenology that we already have accessible in the first person. Often you can already get very far by just making sure that whatever your model is going to be, it has to be able to recapitulate the lived experience that we have directly accessible. Actually, one of the more recent papers that I could share is From Generative Models to Generative Passages; I don't know if any of you are familiar with that kind of work. If I can ask one question on this figure, Casper: you mentioned that models are hypotheses, which might be really interesting to people, because they might think about the model as generating hypotheses, which it also can do. But actually you're talking about how the architecture, and the way that the variables are connected, is itself not some authoritative claim or final description of the system, but a hypothesis. So it makes me wonder: if somebody says, well, I think that there's, you know, insert your favorite letter here, and I want to hook it up to E, I want E to be influenced by a new letter R or something like that, is that going to keep this nice mathematical tractability? Are there certain kinds of wires that, if you cross them, the script is just going to die? Are there certain pieces where we know that we can build the Legos really well? How do we know which pieces can be tinkered with, and what structural changes could even be done, or is it all-by-all? Well, one of the constraints is that, basically, the way it works is that you can have only local interactions, and that's the Markov blanket. In the end, the way it's kept at least close to biological possibility is by asserting that you only specify local interactions between variables, and that's what these arrows do. So one forbidden thing to do would be to connect this policy variable directly to the observations, because then you're breaking the Markov blanket: the policies are directing, in this case, the relations between hidden states, and based on the hidden states you're generating predictions about observations. So there are certain forbidden things in these directed graphs. That being said, as I said, whenever there's an arrow it actually implies bi-directional interaction, and you're completely free to add a higher-level state that modulates your E matrix; the sky is the limit, so to speak. In the end, as long as you keep this Markov blanket structure intact, so you ensure that what you're doing is biologically possible, you can make any number of states connect to each other. And if there's redundancy in your model, that should show up, because the free energy can be decomposed as complexity minus accuracy. Any increase in the number of states in your model will result in an increase in uncertainty about the parameters, and that's the complexity; Occam's razor is built into the way these models work. So, in sum, yes, you're free to do almost anything here, but the model evidence in the end will punish you if you make your model unnecessarily complex.
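That decomposition is compact enough to show in a few lines. The sketch below (ours, with made-up numbers) scores a categorical belief by variational free energy, F = complexity minus accuracy, where complexity is the KL divergence of the belief from the prior and accuracy is the expected log-likelihood of the observation. Redundant extra states inflate the complexity term, which is the built-in Occam's razor just mentioned.

```python
import numpy as np

def free_energy(q, prior, likelihood, o):
    """Variational free energy of belief q given observation o."""
    complexity = np.sum(q * (np.log(q) - np.log(prior)))  # KL[q || prior]
    accuracy = np.sum(q * np.log(likelihood[o]))          # E_q[ln p(o|s)]
    return complexity - accuracy                          # lower is better

prior = np.array([0.5, 0.5])
likelihood = np.array([[0.8, 0.2],    # p(o | s)
                       [0.2, 0.8]])
q = np.array([0.7, 0.3])              # a candidate posterior belief
print(free_energy(q, prior, likelihood, o=0))
```

Since model evidence is bounded by the negative free energy, a model whose beliefs must move far from their priors (high complexity) without buying extra accuracy is penalized automatically.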
Thanks for this awesome response. We'll have a question from Dave, then any closing thoughts, and then that will be it for 19.1. So, Dave, go for it. Yeah, two points about local interactions only, and they're kind of at odds with one another. First, the definition of locality: whatever interacts immediately is by definition local. Second, in one of the computer simulations put together at the Friston lab several years ago, with the dots on the screen that, I assume, were a simple model of how Markov blankets work, they found that the distant portions of the simulation were more accurately modeled than the nearby ones. And in my mathematical illiteracy I cry out: oh, so it's all working holographically, the intermediate entities are focusing the more distant ones. Yeah, it's really interesting that you picked up on this, because we start with this Markov blanket criterion, but when you go into the way these dynamics work, it actually gets more interesting once you're able to predict things that are outside of your system, and the further away, the harder it gets. But like you described, I think you're talking about the Markov blanket in that kind of primordial soup, the emergent-life paper. In that case it also has to do with the way the state space of the system is set up, but the way he identifies the blanket, basically, is just in terms of cross-correlations between the states of these chemical particles and the states of the particles outside, and it was an emergent effect. I think it has to do with that particular simulation. But if we abstract a little bit away from that particular simulation: when you go to deep temporal models, what you're trying to do, you could call it a hologram, I guess, in some way, is to create an image of what the future looks like outside of the current blanket, and that's why they call these deep temporal models, in a sense, semi-Markovian. Lots of these interesting cognitive phenomena actually emerge when you start to hack your way out of your blanket, so to speak, trying to get a grip on what's happening outside. That can be spatial or temporal: temporal would be the future, spatial would be places you can't reach. So that's a very interesting question. The semi-Markovian aspect is something that has actually been discussed recently, and I think it holds an important key to the way our cognition works: by absorbing information over time and combining it with the right priors, nature and nurture have to coalesce in a way that gives you a grip on things that are not immediately accessible. Anyway, it's a very interesting question that I can't directly answer; we just know that for any physical system, any biological system, to work, we can only assume local interactions, and then the rest of the story is trying to figure out how far you can get with just local interactions, to create those holograms, if that's the particular literature you're drawing on. I think Chris Fields actually talked about holographs specifically. Dave, what were you thinking of, a holograph? Okay. I got that from the chapter of Mark Solms's book that just came out in the last few weeks, where he starts by talking about actually visiting Carl Friston, though he doesn't use the term holographic. The only place I've seen that in this kind of context is some discussion of using interferometry to see around corners. This is something that presumably people can't do but instruments can: there's a holographic, or holograph-like, effect that allows you, either with sound or with light, to image objects that can't be seen directly, but you have to have a corner; the corner, evidently, induces the kind of self-interference that allows distant perception.
But, you know, more relevant, probably, is the way that conspecifics in flocks and so forth convey mediated information. Sorry, I don't have anything more specific than that. Awesome points, Dave, thanks for sharing. It's time, and that was an awesome part-one discussion. Really appreciate having you on here, Casper, and everyone else who joined live. So we'll just close: pause the video, or pause time, whatever your affordances allow, and think about these questions. See you, and thanks for organizing. Yep, for sure, and just fill out the feedback form in the calendar invite. We'll see you next week, so keep thinking about the paper and other topics; we'll be back for a follow-up discussion on the same paper next week. So bye everyone. Yes, bye.