Hello everyone. It's May 26th, 2022, the fourth week of the first cohort of the Active Inference Textbook group, and our first discussion of Chapter 2. Today we're going to go over the questions that have been posted for Chapter 2, then see where that takes us and which ideas of the book have been explored. That'll be the main focus. Also, a note on the math learning group: those meetings have been occurring at 19:00 UTC on Wednesdays. In the math learning group you'll find information about how to participate, and you'll find and be able to edit resources on learning different topics. This is a really helpful activity if people want to share the resources that helped them learn about different ideas; that'll help people search for, say, a video on a given topic. We have a notation table that will help us as we start to look at the notation more. The math learning group takes basic questions around math; anything related to the textbook content can go in the main questions table. The math learning group is part of the Active Inference Textbook group. Its questions aren't answered like the discussion questions we're about to go to; they're just about math. There are some math-oriented overviews, and I wrote a summary of the first two-thirds of Chapter 2. If people want to continue distilling and representing what they're learning and what they're asking, that's all good. Any other notes anybody wants to add before we go into the questions? You can raise your hand or type it in the chat. We're in the first week of discussing Chapter 2, the Low Road to Active Inference. Just to warm us up, what is the Low Road? The Low Road is the Bayesian brain approach; it builds up from the mathematics. The way I conceptualize it is literally the low road: very primitive, very mathematical, very hyper-analytical, foundational. Whereas the High Road is more the upper side of that.
We're going to be learning about the High Road and the Low Road; those are the two paths offered in Chapters 2 and 3 of the book. Hopefully, again, it would be awesome to have as many people as possible speak and feel welcome to share their views during this session. Everyone is especially encouraged to put questions or comments in the chat or raise their hand to speak, or to find other times to have synchronous communications, and especially to make contributions to the Coda. I really appreciate that everyone has come out to go through these discussion questions. We're going to now go to the questions that have been placed into the questions table. The more questions the better; they can be super short. It can be anything that you're having uncertainty about. If you come across a table or a figure or a sentence in the text and you have a question, please write the question down. It will be extremely, extremely helpful. It doesn't have to come with extensive references or citations. It can just be, what did this mean, if you're uncertain about something. And if you were certain about it, you could still write a question like, what would you ask somebody to test or probe what they understood about it? Okay, so we're going to start with the most upvoted question. There's still time in this session, and also until next week in our next session on Chapter 2, to add more questions if you wrote them down somewhere else, and also to upvote questions. But we're just going to start with the most upvoted questions. What is the relevance of the support and surprise for the different probability distributions in Table 2.1? So what does anyone see in Table 2.1? And what are support and surprise? Mike, yes. So Table 2.1 looks to me like a guide for selecting distributions, if you're building out one of these models, with support saying: what set of numbers would fit within this distribution selection?
And then if you chose this distribution, how would you look at the support equation for it? Can you unpack what you mentioned with the surprise part? Yeah. So if you chose this distribution, how would you look at the surprise component of it? And we're going to get to surprise and Bayesian surprise in the coming questions. So support, using the definition pulled here from Wikipedia: the support of a function is the subset of the x-axis, the domain, where there's a y value that's not zero. So it's where the function is non-zero. The Gaussian, the normal bell-shaped distribution, has non-zero support over all real numbers, fancy R. And then other distributions have different values at which they're non-zero. Like the gamma distribution is defined over the interval from zero to infinity. So if the question was, how many of this thing do you have, that might be a reason to use a distribution that covers only positive numbers. Or if you wanted something that could take positive or negative values of any size, the Gaussian is such a distribution. And another relevant support difference would be the Dirichlet being defined only within the zero-to-one interval, which helps connect it to probabilities. So the support is where that function, that distribution family, is defined. And then what is the surprise? Does anyone know the name of the little squiggle they are using to represent surprise? I think it's called fraktur I, aka fancy I. We're going to come to the definition of surprise in a coming question. And without going into every variable and trying to over-learn something: the surprise of a given observation, like a data point — let's just think about the Gaussian. X is the observed data point, and mu is the mean of that normal distribution, which is also the mode, so it's the center of that distribution. And then this capital Pi is the precision, the inverse of the variance.
So how surprised one should be by a new data point coming in is related to the difference between that new data point and the mean. If the new data point were exactly at the center of the distribution, this whole term would go to zero; one should not be surprised at all by a data point exactly on the mean of the distribution. But any data point that is not exactly on the mean is going to have some nonzero value, so surprise is a function of how far the data point is from the mean, with some scaling by the variance. So if we were expecting 100 with a variance of 10, versus 100 with a variance of one: in one case getting 101 is like a one-sigma event, it has a z-score of one, just speaking loosely here; whereas in the case of a very small variance, that same reading might be more surprising. Any other comments that people have on Table 2.1? I have a question. Yeah. So while I was reading through the chapter — okay, going back to section 2.2, perception as inference, where they say that the brain is a predictive machine or a statistical organ that infers external states of the world. Here's my question, because I'm primarily from a telecom background. Most of our work is getting messages across between two distant transmitters and receivers, right? So we build noise models. Without a noise model, you can't pick out signal; you need to know what the noise looks like. So, for example, if the power spectrum of what's coming in is what you'd expect from, let's say, a Gaussian distribution of noise, you do nothing: that's just noise, and you don't classify the data point that's coming in. If it is more than that, then okay, there might be something interesting here, there might be some signal. So we need to find that signal-to-noise ratio.
A lot of this chapter seems to be dealing with the assumption that we by default know what the signal looks like. There's no attempt to build a noise model here. There could be: okay, maybe the sensor, in this case our eyes, for example, has some parameters, right? It can detect light between certain frequency ranges, but maybe it performs very well in a certain frequency range and not in others — there's more error in other frequency ranges. So the noise model should incorporate such things: imperfections in the sensor, imperfections in the connections between the sensor and the processing equipment. But there's no attempt to do that in this entire chapter. So I'm just curious: what exactly is the predictive machine here? What is being predicted? Great question. And in general, every question that people have is going to be a hundred times more impactful if it can be written down. This is just the second chapter. It's introducing us to the essence, starting with just Bayes' equation — the first equation here — as the foundation of increasingly advanced noise-filtering approaches. So check out livestream 43 to see how the Kalman filter and advanced multi-level Bayesian filtering schemes are used and learned and fit with free energy minimization and so on. But you're right that this is not approached in this chapter, because it's starting with the essence of Bayes' equation and then building towards another kernel, which is going to be the kernel of the variational free energy. And then that single-layer case is going to be able to be elaborated and parameterized and fit with nested models to accommodate learning on hyperparameters, learning of noise models, and so on. But it is not in this chapter. Okay. Yeah, I'm sorry if I threw us off track. No, it's perfect. They're good questions.
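To tie the two threads so far together, here is a minimal sketch, with invented numbers, of the Table 2.1 Gaussian surprise and of the questioner's signal-detection framing: score each sample's surprise under a noise-only model and flag readings whose surprise exceeds that of a three-sigma event. The threshold and all parameters are illustrative assumptions, not from the book.

```python
import math
import random

def gaussian_surprise(x, mu, sigma2):
    """Surprise (negative log density) of x under N(mu, sigma2), in nats."""
    return 0.5 * (math.log(2 * math.pi * sigma2) + (x - mu) ** 2 / sigma2)

# The same deviation from the mean is more surprising under a tighter belief:
wide = gaussian_surprise(105.0, 100.0, 10.0)   # variance 10
narrow = gaussian_surprise(105.0, 100.0, 1.0)  # variance 1
assert narrow > wide

# The telecom framing: under a zero-mean, unit-variance noise-only model,
# flag any sample more surprising than a 3-sigma event as candidate signal.
random.seed(0)
threshold = gaussian_surprise(3.0, 0.0, 1.0)
samples = [random.gauss(0.0, 1.0) for _ in range(1000)] + [5.0]  # one injected "signal"
flagged = [y for y in samples if gaussian_surprise(y, 0.0, 1.0) > threshold]
assert 5.0 in flagged  # the injected outlier is caught (plus a few noise excursions)
```

The surprise at the mean is not zero — it is the log-normalizer term — which is why the transcript's "goes to zero" refers only to the squared-distance part of the expression.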
And then Jacob asked: why do the surprise expressions for these distributions involve a sum over the probability distribution, except for the gamma — actually, I guess the Gaussian as well? Yeah, maybe I'm thinking about this wrong, but I thought that the surprise was the information gain, or equivalent. And for the Gaussian and the gamma distributions, it's always for one single element of the distribution, for one x. But for the multinomial and the Dirichlet there's a sum over all of the possible probabilities, which makes it, I guess, an entropy function. But is there a difference? For example, one could have a multi-dimensional Gaussian, a two-dimensional Gaussian, and then the surprise for the two-dimensional Gaussian would be represented as a sum over the different dimensions. So they've shown the Gaussian and the gamma in the uni-dimensional case and the multinomial and the Dirichlet in a multi-dimensional case. So in that case — I thought that x was just one element of the domain, so that's not necessary. So x is just — if we think of the Gaussian as this bell-shaped curve on the x and y axes, we really want the f(x) value to get the Gaussian value, but x is just the input value. Is that correct? I believe that is accurate, because yes, x is the domain of the function and then the sum is over i. So if we had a multi-dimensional Gaussian in i dimensions, then it would have a sum over i. I guess I still have uncertainty about why we don't do that for the single-dimensional Gaussian as well, because if we take just one point on the Gaussian, I understand why that would have a certain surprise. And in the same way, if we take one point in the Dirichlet distribution, shouldn't that also have an equivalent equation for surprise? Why in the Dirichlet distribution are we summing over all the i's?
But in the Gaussian we are considering just one point on the probability distribution. Or maybe the Dirichlet is multi-dimensional and that's the reason. I think the unidimensional case would just be this expression without the sum over i. But if anyone has any other thoughts, especially if they want to contribute them in writing to improve our understanding, that'd be awesome. Let's continue to the next question. In section 2.4 they write: we now introduce the simple but fundamental advance offered by active inference. This starts from the same inferential perspective discussed above but extends it to consider action as inference. In your own words, what is the simple but fundamental advance offered by active inference? So what would anyone write or like to share — what is the advance of active inference? Sorry, is it just bringing action into the equation? So predictive coding and the Bayesian brain are specifically about perception, visual perception. I think the key thing with active inference is that — well, it's not just that it brings action into the equation, but that it treats action and perception as essentially the same thing, united as part of the same process, which I think is a new development. Thanks for the awesome answer. Ali? Yeah, I think Alain Berthoz, in The Brain's Sense of Movement, has a particularly relevant description of action and perception, and he writes that perception is simulated action. And I think that really speaks to the advance made by active inference. I mean, the integration of action and perception — or in other words, as Berthoz claims, the simulation of action in perception — is, I guess, a really novel development. I think I read something similar to that, but I can't remember where; it was roughly that on these existing Bayesian brain theories an agent perceives in order to act, whereas under active inference an agent acts — action kind of precedes perception.
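On Jacob's question, one way to see why the multi-dimensional entries carry a sum over i: the surprise of a joint observation with independent dimensions is the sum of the per-dimension surprises, because the negative log of a product is a sum of negative logs. A small sketch with arbitrary values:

```python
import math

def gaussian_surprise(x, mu, sigma2):
    """Surprise of x under a 1-D Gaussian N(mu, sigma2), in nats."""
    return 0.5 * (math.log(2 * math.pi * sigma2) + (x - mu) ** 2 / sigma2)

xs = [1.0, -0.5, 2.0]        # a 3-dimensional observation
mus = [0.0, 0.0, 1.0]        # per-dimension means
sigma2s = [1.0, 1.0, 4.0]    # per-dimension variances

# The joint density under independent dimensions is a product of densities...
joint = math.prod(math.exp(-gaussian_surprise(x, m, s))
                  for x, m, s in zip(xs, mus, sigma2s))

# ...so the joint surprise -log(joint) decomposes into a sum over i,
# which is why the multi-dimensional rows of Table 2.1 show a sum.
per_dim_sum = sum(gaussian_surprise(x, m, s)
                  for x, m, s in zip(xs, mus, sigma2s))
assert abs(-math.log(joint) - per_dim_sum) < 1e-9
```

So the one-dimensional Gaussian row is just the i = 1 case of the same pattern; the sum only appears once there is more than one component to the observation.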
So active inference essentially flips this traditional schema in terms of which one feeds into the other. I was just going to say that adding on to bringing action into the equation, it flips it. Where do your priors come from? Where do you have the hyperparameter to get specified? Where does the model start? Where is the agency kind of in all of these other models? There's really no affordance for that, or it's hard to, it's like very emergent, whereas here it's like very explicitly specified some affordance for that. Thanks, Brock. Mike, and then anyone else who raises their hand? Yeah, so I'm thinking if you have priors, you have some prior assumptions, then do you necessarily need to take action in order to sort of envision what your perception might be? So it's almost like getting into imagination or something like that. You have a set of priors that can drive what you imagine your perception to be before any action is taken. Great question, Blue. So I think that this was like elaborated on a lot in section 2.7 when they talked about pursuing a policy. Because there's consequences, like when we're planning and making predictions, that has consequences for both our perception and our action. I mean, even just planning, even in a conversation, when someone says, oh, have a nice day is expected at the end of that, but have a nice rabbit would be like what? You're not expecting that to come at the end of that sentence. So when you plan, it interferes with your perception because you didn't plan for them to say have a nice rabbit, right? But it also interferes with your action. And so I think like this, the novelty or enhanced aspect of active inference loops into the fact that perception and action are continually impacting one another. Yes. Thanks, Blue. These are some of the core memes and themes. And it really is important to understand how active inference is similar and different than other work in this area. 
Like variational Bayesian inference was not introduced by active inference. Bayesian models of perception were not introduced by active inference. Bayesian models of action were not introduced by active inference. So it's about finding what has been done in order to understand what is being offered. And then whether one chooses to take this history-of-science, development-of-science view, or we just want to state plainly what it is without wondering what the advance relative to other frameworks is. These are all really critical ideas. The signal-processing framework was brought up earlier, and predictive processing, predictive coding, anticipatory systems are often purely about sensing. And just to give one thought on that: it kind of makes sense for a video-encoding algorithm, for example, because every pixel of the camera is in focus. But vision requires action. And so that's where we're going to be bringing in all these other important concepts like attention, sensory attenuation, and the active decision-making components of vision to resolve uncertainty — which is what gives rise to a generative visual field that seems like there's color everywhere and high resolution everywhere, though that is not the incoming sensory information. So by fully taking this generative modeling perspective on perception — which in livestream number 43.0 Maria did an awesome job of connecting even before Helmholtz and Kant, back to Plato's Cave, and it's part of this long discussion about perception — action could be a variable. So there's perceiving, and then if action is a variable, then active inference is providing not just a unified framework, like a Bayesian graph framework, for some variables that are interpreted as perception and some variables that can be interpreted as action selection or policy planning.
But there's also a tractable approximation: by shifting from an exact solution to the mathematical problem, Bayesian inference, to an approximate solution, variational free energy, which is going to provide a bound on a quantity that might be intractable to compute. So it's going to give a heuristic for the action-perception cycle. But this is a really great question, so people can continue to return to it, because people will probably ask us for a long time to come: how is this different than X, or what is active inference? These are some of the things that we would want to have in mind. Ali? Yeah, again, going back to Berthoz and his definition of perception as simulated action. I think it also relates nicely to Merleau-Ponty's phenomenology. Merleau-Ponty has a famous insight that vision is the brain's way of touching. And so, for instance, as we move around space, as we construct our sense of spatiality — or our sense of temporality or everything else — we don't just begin with a representation of space or time. Our sense of space is constructed because we move, and not merely as a consequence of our movement. Thanks. And we'll continue with the questions, but that's an awesome area to go into, with phenomenology, with 4E cognition — embodied, embedded, enacted, extended — culture, etc. And many, many of the livestreams and papers have been on those areas. So it's cool to bridge from formal models of perception, cognition, and action into qualitative and philosophical areas. Okay. And then — oh, thanks for adding that: the idea of interrelated action and perception has been around in cognitive science for a long time; dynamical systems theory and enactivism emphasize that. But active inference brings a coherent formalism to the table, which is a nice advance. Well said. Next question. Figure 2.2.
Figure 2.2 illustrates y as a result of an internal generative model x and an external generative process x*. In order to measure surprise, wouldn't we need another value of y, i.e. a separate y that encodes prior beliefs? If y can be objectively measured from external signals, is there a third y that's considered the observation? First, I'd like to actually go to this question about two notions of surprise, because they might come to bear on this action-perception loop. So we move from just Bayes' equation and calculating surprise on parametric distributions, to thinking about prior updating and cognitive entities with a generative model that constitutes a prior, with incoming information coming in. We're going to start to see how this notion of surprise and Bayesian surprise are similar and different, using the apple-frog example. So the unobserved hidden state, the latent state of the world, is whether there's an apple or a frog in a bag, for example, or just in this person's area. And then what can be observed, speaking coarsely, is whether it jumps or not. We could totally go down the rabbit hole of what is truly observed and things like that; this is just what they are in the context of this model. The unobserved state is the actual identity of the object, and the observation is whether it jumps or not. We take the opportunity to unpack two different notions of surprise, both of which are important. The first we refer to simply as surprise. It's the negative log evidence, where evidence is the marginal probability of observations. We saw that first sense of surprise with the fraktur I in the table we looked at earlier. The second notion of surprise is referred to as Bayesian surprise. This is a measure of how much we have to update our beliefs following an observation. In other words, Bayesian surprise quantifies the difference between a prior and a posterior probability.
What are the similarities and differences between the two notions of surprise? Okay, we'll see what has been said and then hear what other people are thinking. So on page 20 they wrote that Bayesian surprise scores the amount of belief updating, as opposed to surprise, which is simply how unlikely or likely the observation was. So in that Gaussian case, the surprise is about one data point coming in: given how the distribution is parameterized right now, how surprising was that one data point? And then Bayesian surprise is going to be about how much that distribution is updated after processing that data point. Similarities: they both depend on how well the agent's generative model matches the external world, and they both measure surprise or Bayesian surprise in the same units, which are information-theoretic quantities, i.e. nats or bits. Differences: plain surprise marginalizes over all the model's degrees of freedom under the model's prior distribution over its adjustable parameters — that would be great for someone to unpack; what does it mean to marginalize over the model's degrees of freedom? — and then Bayesian surprise lets the model choose the best set of parameters it can to fit the observed data, and then measures how much of an update that was. Mike, and then Blue, and then Ali. Yeah, under the first similarity, I'm wondering if it's a dependency on how well the agent's generative model matches their perception of the external world, as opposed to matching the external world itself. I agree with that. Great addition. Blue, and then Ali. I think my hand was just left over, left up. Ali? Well, I believe that the plain surprise is a kind of raw statistical surprise, but Bayesian surprise is a kind of processed surprise. Or — I'm not sure if I'm right in saying that plain surprise can probably be described as an objective surprise, as opposed to Bayesian surprise as a subjective surprise.
But as I said, I'm not sure about the plausibility of these descriptions. Great comments. Even the surprise alone still depends on the parameters of the generative model, so it is still within a processing or filtering frame, albeit a fixed one. So let's look at the figure where there's a graphical overview of the apple and frog example, giving some graphics and words to fill it in. Initially, the person has this likelihood model in the back of their head, where they have beliefs about how likely apples are to jump — they do it 1% of the time — and how likely frogs are to jump — they do it 81% of the time. That's the likelihood; that's about how observations depend on hidden states of the world. Their prior beliefs are that there's a 10% chance that the entity is a frog. And these are mutually exclusive options; there's no third option here — that would be another category, structure learning on the model. So we're staying within this model for now, and the beliefs sum to one because they are probabilities. Then there's an observation, which is jumping. And then the posterior reflects the updated beliefs about what the entity is. It's so overwhelmingly more likely that frogs jump, that seeing something jump updates the prior from here to here. So in this case, one can calculate the surprise of the observation without doing any updating at all. One could just stay fixed in their prior belief and then describe how surprised they are, in nats, by a given observation. In the full Bayesian cycle, there is an updating of the prior to the posterior, and that updating can be described in terms of how much the prior was updated. And so that's like here: the model is updated to the observed data; this is now the best-fitting model; and then we're calculating the difference between these two distributions. So regular surprise being zero means the data point was exactly as you expected.
Bayesian surprise being zero means the distribution was not updated. High surprise means that the data point was extremely unpredicted — it was extremely unlikely, whether or not you update your model at all. High Bayesian surprise means that the distribution was changed a lot as a function of seeing that happen. Ali? Well, perhaps I didn't understand it correctly, but isn't it the case that in the plain surprise, or more generally in the Bayesian formulation of the probability of events, both the likelihood and the priors are somehow inherent to the phenomena — inherent to the events, independent of the observers? Yeah, great question. I hope I'm not going off on a branch here, but this is related to the difference between frequentist and Bayesian approaches to statistics. Viewed from the Bayesian perspective, frequentism does have priors: they're uniform priors, which are not uninformative priors, they just are uniform. And so in frequentism we come across the maximum likelihood solution or the maximum likelihood parameter, which is just: well, if all outcomes were equally likely a priori — a uniform prior — then we would just need to evaluate the likelihood and find the model with the maximum likelihood solution. Bayesian statistics offers another degree of freedom and says: sure, you could pick a uniform prior, which would give you the maximum likelihood solution, or you might want a prior distribution over that space, so that if something is twice as likely a priori and then you observe less than 2x evidence for it, you still might want the Bayes factor, or your posterior, to reflect that one, rather than jumping instantly to a different maximum likelihood solution. So yes, priors and likelihoods are implicit, but frequentism is using a different ontology than Bayesian statistics. We're in the Bayes, or post-Bayes, era now, but there are so many connections to classical statistics.
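The apple-frog numbers discussed above can be checked end to end, and the same toy model also illustrates the uniform-prior point just made: with a flat prior, the posterior tracks the normalized likelihood, so the most probable state is the maximum-likelihood state. A sketch in Python — only the 1%, 81%, and 10% figures come from the discussion; the rest is scaffolding:

```python
import math

likelihood = {"frog": 0.81, "apple": 0.01}   # p(jump | state)
prior = {"frog": 0.10, "apple": 0.90}        # p(state)

def update(prior, likelihood):
    """Return (evidence, posterior) after observing a jump."""
    evidence = sum(likelihood[s] * prior[s] for s in prior)
    posterior = {s: likelihood[s] * prior[s] / evidence for s in prior}
    return evidence, posterior

evidence, posterior = update(prior, likelihood)

# Plain surprise: negative log of the marginal probability of the jump.
surprise = -math.log(evidence)               # about 2.41 nats

# Bayesian surprise: KL divergence from prior to posterior.
bayesian_surprise = sum(posterior[s] * math.log(posterior[s] / prior[s])
                        for s in prior)      # about 1.76 nats

assert abs(posterior["frog"] - 0.9) < 1e-9   # prior 0.1 -> posterior 0.9, as in the figure

# With a uniform prior, the posterior is just the normalized likelihood,
# so the most probable state coincides with the maximum-likelihood state.
_, flat_posterior = update({"frog": 0.5, "apple": 0.5}, likelihood)
assert max(flat_posterior, key=flat_posterior.get) == "frog"
```

Note the two quantities come apart exactly as described: one could freeze the prior and still report the 2.41 nats of plain surprise, while the 1.76 nats of Bayesian surprise only exist once the update is performed.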
And in SPM, the textbook, there are parametric classical statistics, nonparametric classical statistics, and Bayesian statistics. So they're more similar than not; it's about seeing where one of them is a special case of another, or a generalization of another. Okay, let's see if we can do one or two more questions during this session. So let's return to the previous question. We're looking at figure 2.2, where we're going to see different representations of this: the entity as generative model, and the world or niche as generative process. And here, the hidden states are those that are unobserved as data; y is referring to data points that are observed as data. There's a cognitive hidden state, which is like a prior in this generative model. And then there's some hidden state x* — though it could have been any letter or any shape — for the hidden state in the world. Everything that happens in between is cognition — like the sandwich model, the sense-think-act type of model — which is just referring to that little boomerang: data come in, cognitive processing, action selection. So we're in that figure. The question was: in order to measure surprise, wouldn't we need another value of y, i.e. a separate y that encodes prior beliefs? Great question. Let's just assume that there is a Gaussian generative model. So the entity has two parameters in its cognitive model, the mean and the variance. Then y comes in, and given the parameterization of the generative model, all that's needed is the data point for the surprise to be calculated. So another value is needed, but whether we call it a, b, c or x, y, z is just a mathematical abstraction. So yes, prior beliefs are important; the prior beliefs, which can be interpreted as the parameterization of the cognitive model, are required. If y can be objectively measured from external signals, is there a third y that is considered the observation?
So one could imagine a lot of like real-world scenarios where a more complex model would be required. Two people looking at the thermometer or all these different sorts of situations. But unless somebody can unpack this a little more or explain what they were asking about, then I don't think a third Y is required, Mike? Yeah, that was my question, so I can try to unpack it. And so sticking with the thermometer example at the heart of that last question is, if we have a thermometer that's registering the temperature, can we consider that as sort of the true observation that is the actual Y as measured by this instrument? And then therefore we can compare our sort of internal Y with the actual Y. So I have a really similar and related question that's like two questions down. But it's in a different section. But let me just like place some things on here and it goes back to like what we were saying about like what exactly is the hidden state. So is the hidden state the temperature and the data is the reading from the thermometer? Like that's like the data Y, right? Like this Y in the middle. But then like what is how I feel hot or cold? Like I mean I'm perfectly positioned to be at 75 all the time. So like I know if it's like one degree too cold or one degree too hot, like I've got to turn up the heater, turn it down or whatever. But like my registration, my sensory input is not at 75 degrees. Like my sensory input is I'm hotter, I'm cold. And so this is like where I mean we were talking about this yesterday in the math group. Like there's this really fuzzy line for me. Like I would love for someone to clarify that. So I do get what you mean about like this in like shouldn't there be another Y? Like there's the Y out there in the world 75 degrees. And then there's how I feel about that Y. Okay, thanks. So variables are not like innately tagged with being observables or hidden states. It's a model specific framing of what is going to be modeled as generated data. 
And what is going to be modeled as a Bayesian prior. In multi-level Bayesian modeling, the priors themselves are generated from a higher or deeper level of the model. So one variable can in fact serve multiple roles. This is the minimal prior-generated-data, expectation-maximization-type, single-layer Bayesian kernel. So we're going to think about this temperature example. There's a hidden state of the generative process, which is going to be the temperature of the world — latent, unmodeled. This isn't even claiming that there is such a thing as temperature; it is just the latent, unobserved temperature that is giving rise to thermometer readings, which might have different sorts of noise. Then there's another hidden state, which is the evaluation of temperature. And so this is just a schematic. Also, whether something is labeled x or y or l or triangle is a norm and a convenience; the letter doesn't matter in itself. It will be used mostly consistently within the textbook, but there are only so many letters and there's not a lot of coherence in notation use. What else could be explored here? Yeah, Mike and then Ali. So, as I keep ruminating on this, which is probably more than I should: it seems like there can be infinite ys, right? To the extent that you are taking action, which will change your perception, then you are as a result potentially triggering new ys in the system, right? So you just have this infinite potential for what y can be. This speaks to the composability and the flexibility of active inference. So y could be one pixel of visual input. It could be a 4K video. y could be smell and a 4K video. y could be lidar and this and that. So y is just the generalized input of sense. And then action might be related to this in a very direct way: y could be visual input and action could be your eye movement. So that case is going to be explored a lot in the book.
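The temperature example can be sketched to show the partition being described: a generative process out in the world producing noisy readings, and a separate generative model inside the agent scoring how surprising each reading is — with a single y passing between them, no second or third y required. All numbers here are invented for illustration:

```python
import math
import random

random.seed(1)

# Generative process (the world): a latent temperature x* gives rise to
# thermometer readings y with sensor noise. The agent never sees x* directly.
true_temperature = 75.0
def thermometer_reading():
    return random.gauss(true_temperature, 2.0)

# Generative model (the agent): a two-parameter Gaussian belief about what
# readings to expect. This parameterization plays the role of the prior.
belief_mu, belief_sigma2 = 72.0, 4.0

def surprise(y):
    """Surprise of reading y under the agent's model, in nats."""
    return 0.5 * (math.log(2 * math.pi * belief_sigma2)
                  + (y - belief_mu) ** 2 / belief_sigma2)

y = thermometer_reading()   # the single observed y
s = surprise(y)             # once the model is parameterized, this one value suffices
```

The "how I feel hot or cold" variable from the discussion would be a further hidden state inside the agent's model, not another y: the same reading can feed models with different internal evaluations, which is the model-specific framing being emphasized.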
However, it could also be that you're getting something in and the actions are just totally unrelated. Maybe they don't affect the hidden state, the causal process in the world, at all. Or maybe actions enable different Ys to enter the picture. This is just the total essence kernel. And then even trivial cases require some more apparatus. Ali? Yeah, I have a question. Is it correct to say that the minimization of variational free energy, a.k.a. the surprise, is exactly equivalent to the maximization of expectation? I mean, is it a wholly linear relationship between these two? Or possibly do we have some kind of plateaued areas between these two extremes? Great question. So this is something that we'll come to next week in our discussions on Chapter 2; we have not even discussed variational free energy yet. Today we talked about starting with the low road on some of the atomic calculations that are going to come into play, like surprise and Bayesian surprise in a Bayesian framework, and beginning to partition active entities and their action-perception loops in terms of a Bayesian graph that's going to be amenable to flexible modeling of perceptive, cognitive, active, and out-in-the-world variables, variables with those interpretations. And then variational free energy is going to come into play as a way to bound surprise. So let's have some questions and discourse and we'll come to it next week, but it's an awesome question. Jessica? Hi, yes, I have a question, I guess related to this video, 2.2, in regards to how we interpret what we observe. So I understand you have a prior: you're anticipating something, like some other people's behavior or how things should be, that might be different from what you actually observe.
But a lot of times, at least when we're thinking about human beings, say an action in the world, how you interpret it has a lot to do with your own views or your own experiences. And maybe that connects to the priors: the interpretation might be very different from person to person, it will vary a lot because everybody has different experiences. So those could be encoded, I guess, in the priors, but it's not connected per se to the prediction part, or maybe it is. So I'm trying to connect that idea of how our personal interpretation based on our experiences, which could be priors, makes us see what's happening in the world differently than other people. And that's why things look very different for you than for me, just because we have lived different experiences. So it's not like an actual Y. You cannot really say "this is a fact," because it's an interpretation at the end of the day. Great question and points. Understanding how the individual setup of a given entity is related to its past experiences, and how that could be modeled by priors, is an important area. And some of those rich dynamics are not in this minimal nucleus. But it'll be awesome to start to think about what does have to be in the box here to give rise to those kinds of dynamics. So that's going to end this discussion. Next week, we will also be staying in Chapter 2 and taking it more towards variational free energy and expected free energy. So that should be fun. In this room, we're going to transfer immediately to the Dot Tools meeting. So if you want to join the actinflab.tools meeting, everyone is welcome and there are no prerequisites or anything like that. If you want to hang around and join Dot Tools, stay in this exact Gather space.
If you want to continue talking with other people about the book or anything else, just head up into one of the rooms above, and anyone who wants to can continue any discussion in this space. We're going to continue now with tools. So thanks, everyone.
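As a rough illustration of the thermometer example and the atomic quantities mentioned in the discussion (surprise and Bayesian surprise), here is a minimal sketch, not taken from the book: it assumes a discretized hidden temperature with a Gaussian-shaped prior and a noisy thermometer reading, and all names and numbers are illustrative choices.

```python
import numpy as np

# Hidden state x: the "true" temperature of the world, discretized here.
temps = np.arange(60, 91)  # candidate temperatures (deg F)

# Prior belief p(x): roughly Gaussian, centered on 75.
prior = np.exp(-0.5 * ((temps - 75) / 5.0) ** 2)
prior /= prior.sum()

def likelihood(y, x, noise_sd=2.0):
    """p(y | x): the thermometer reads the true temperature plus Gaussian noise."""
    return np.exp(-0.5 * ((y - x) / noise_sd) ** 2) / (noise_sd * np.sqrt(2 * np.pi))

# One observed thermometer reading (the data Y).
y = 79.3

# Model evidence p(y) = sum_x p(y | x) p(x).
evidence = np.sum(likelihood(y, temps) * prior)

# Surprise (surprisal): -log p(y). Lower when the reading matches expectations.
surprise = -np.log(evidence)

# Posterior by Bayes' rule: p(x | y) proportional to p(y | x) p(x).
posterior = likelihood(y, temps) * prior
posterior /= posterior.sum()

# Bayesian surprise: KL divergence from prior to posterior,
# i.e., how much the belief moved because of the observation.
bayesian_surprise = np.sum(posterior * np.log(posterior / prior))

print(f"surprise = {surprise:.3f} nats")
print(f"Bayesian surprise = {bayesian_surprise:.3f} nats")
print(f"posterior mean temperature = {np.sum(posterior * temps):.1f}")
```

The posterior mean lands between the prior mean (75) and the reading (79.3), pulled toward the reading because the thermometer noise is assumed smaller than the prior spread; this is the single-layer Bayesian kernel the discussion describes, before any action or multi-level structure is added.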