Hello and welcome. This is Active Inference Guest Stream 71.1 on February 22nd, 2024. We're here with Ryan Smith, Rowan Hodson, and Marishka Mehta. There will be a quick overview and then discussion of their recent work, "The Empirical Status of Predictive Coding and Active Inference." So thank you all, and let's hear about it. Okay, so you just want me to jump in and go? Yeah. Okay, cool. Well, thanks a ton, Daniel, for inviting us on. It's fun to get to present some of this stuff and hopefully get it out to the broader community a little bit. I'm going to quickly walk through sections of the paper to orient people to what the point is and what we're trying to do, and then the goal is just to launch a discussion and see if we can extract some of the points that might be most relevant or interesting to the Active Inference community. So, to start out: as Daniel said, the paper is called "The Empirical Status of Predictive Coding and Active Inference." The main point is that, as a lot of people in this community know, from its beginning and for a long time, the active inference literature has been very focused on theoretical, conceptual work and simulation-based work. You'll see a lot of papers out there that are an "active inference model of X," where X is some interesting psychological phenomenon or condition, and usually that involves showing some fun, interesting simulations that are potential computational explanations for whatever the specific phenomenon of interest is. That work is great, and I think there have been a lot of developments there.
But at a certain point, we can come up with as many theories as we want; without actually being able to test them scientifically, it's really hard to say with any confidence that these are accurate stories of what the brain is doing, or which of all these different models and simulations people are proposing actually correspond to what the brain is doing and can actually explain human behavior. So what we were interested in doing was taking a step back and looking at the empirical studies that have been done using these modeling approaches: what they actually say, and how supportive the evidence actually is for the hypothesis that the brain is doing things like predictive coding and active inference. That's why it's called the empirical status. We're trying to ask: what is the current evidence, and where is evidence missing? What should future studies focus on to fill in the gaps and provide additional support, or not, for these theories as hypotheses about what the brain is doing? To walk you through the sections: people use the umbrella term "predictive processing," but that's a fairly vague, overarching term. So we picked the primary, well-defined algorithms that are most prominent under the predictive processing umbrella, predictive coding and active inference, which are mathematically well-defined algorithms that are sufficiently precise to test empirically.
The first section is more or less an introduction that says the sort of thing I just said: that predictive coding and active inference are the most prominent, precisely formulated algorithms the brain might be implementing, and that it's important to test them empirically. And I should say that most of the credit for this paper should really go to the first author, Rowan Hodson, who I'm hoping will say much more after I give this brief overview. What Rowan did in each of these sections was start by defining the algorithm and laying out the associated mathematics in a way that I thought was very clear. Of course, I'm bragging a little about a paper I'm on, so take that with a grain of salt, but I thought he did a very good job: being explicit, showing the mathematics, but describing it in a way that should be accessible to people who don't have a ton of mathematical background. The idea was to introduce the mathematics not as the focus, but so that it would be clear what the studies we review are actually testing. So we walk through a lot of that; I'll come back to this figure in a second, but here, for instance, is a definition of the negative free energy as applied within predictive coding, and how you get from it to the predictions and prediction errors that the algorithm actually uses.
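Since the free energy definition comes up here, a minimal sketch may help make it concrete. This is a single-level, linear toy model, with all names and values illustrative (nothing like the hierarchical, temporally deep scheme discussed in the paper): perceptual inference is just gradient descent on free energy, driven by a balance of precision-weighted prediction errors.

```python
def predictive_coding_step(mu, y, mu_prior, pi_obs, pi_prior, lr=0.05):
    """One gradient step on free energy for a single-level, linear
    predictive coding model with identity mapping g(x) = x."""
    eps_obs = y - mu            # sensory prediction error
    eps_prior = mu - mu_prior   # prior prediction error
    # Negative gradient of free energy w.r.t. mu:
    # a precision-weighted balance of the two errors
    dmu = pi_obs * eps_obs - pi_prior * eps_prior
    return mu + lr * dmu

# Inference: iterate until the estimate balances prior and data
mu, y, mu_prior = 0.0, 2.0, 0.0
pi_obs, pi_prior = 1.0, 1.0
for _ in range(2000):
    mu = predictive_coding_step(mu, y, mu_prior, pi_obs, pi_prior)

# With equal precisions, the posterior mean settles halfway
# between the prior (0.0) and the observation (2.0)
print(round(mu, 2))  # → 1.0
```

Changing `pi_obs` relative to `pi_prior` pulls the fixed point toward the data or the prior, which is the precision-weighting intuition in one line.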
This figure took a long time to make and to try to get clear. We thought it ended up being pretty colorful and engaging, and it looks complicated, but we tried pretty hard to walk people through it step by step: this is what is being hypothesized. It's a representation of one way that cortical columns in the brain might implement hierarchical predictive coding. And it's a particular predictive coding algorithm that allows for a certain type of temporal depth, where a higher-level hidden cause is represented that predicts the dynamics of the representations at the level below, as opposed to just predicting, and minimizing prediction error with respect to, some static representation at an equivalent temporal scale at the level below. We try to lay that out so people understand the general basis and what the hypothesis is. This matters because predictive coding at an algorithmic level is a different thing to test than a theory about the specific neural mechanisms that implement predictive coding. I think that's a really important point: you can test a hypothesis about one specific way the brain could be set up to do predictive coding by looking at the brain, but the brain could be set up in more than one way that would implement predictive coding. So to test something about the neural process, you need to specify which implementation you want to look for evidence of. More generally, though, you could also just test for evidence of predictive coding at the algorithmic level.
And you could do that behaviorally, just by looking at what people detect perceptually and how that evolves over time, and whether that is consistent with the predictive coding algorithm itself, just the mathematics, rather than with some more specific hypothesis about how the brain implements it through particular patterns of connectivity. So then, as I mentioned, we lay out the proposed neural implementation, or, as I said, one commonly proposed implementation; there are actually a couple that we discuss. And then in this section we review the empirical studies of predictive coding: what the evidence is for the different predictions that predictive coding makes about what you would see, either behaviorally or in the brain. I won't go into detail here; we can obviously talk about it when we get to the discussion. But the take-home is really that there is a lot of indirect evidence, but also a lot of studies that remain to be done to test more specific hypotheses about the neural implementation: whether or not, for example, there actually are separate neurons in the brain that respond only to prediction errors and others that represent only predictions, and things like that. There is also more explicit model fitting that needs to be done to test the predictions associated with quantitative simulations based on the predictive coding algorithm. The place where probably the most closely related evidence exists is not predictive coding proper but the hierarchical Gaussian filter, which is a different model, but one that predicts dynamics related to those of predictive coding.
So for instance, it does have prediction errors in it, and it does have a precision weighting on those prediction errors, and belief updates can be modeled in relation to the prediction errors. One difference, though, is that instead of being hierarchical in the sense of each level representing different causes, in the hierarchical Gaussian filter higher levels represent predictions about the stability of the contingencies at the level below. Basically, the highest level represents something like how quickly the predictive relationships between hidden states and outcomes are expected to change over time. That lets it do a kind of dynamic precision weighting, depending on how much you should trust the predictions you have in a given moment, based on how you think those predictive relationships will change over time. So it's not the same thing as predictive coding, but it's related. And people have done neuroimaging studies, for example, and looked at the explicit relationship between the prediction and prediction-error dynamics this model predicts and whether you can find neural responses that match those simulated time courses. There have been a few studies that do support specific brain regions encoding prediction errors, predictions, and different precisions. So it's not the same as testing predictive coding directly, but it is testing, and finding evidence for, a neural basis for encoding precision-weighted prediction errors, which overlaps in interesting ways: again, indirect but important evidence. After we go through that, we switch to talking about active inference, with a similar structure. First, Rowan introduces the mathematics associated with the active inference framework.
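To make the dynamic precision weighting just discussed for the HGF more concrete, here is a minimal sketch in the spirit of a Kalman filter. These are not the actual HGF equations, which are more involved; every name and number here is illustrative. The key idea it captures is that the learning rate on each trial is itself a precision ratio, and assumed volatility inflates uncertainty and hence updating.

```python
def volatility_weighted_update(mu, sigma2, y, obs_var, volatility):
    """One belief update where the learning rate is a precision ratio.
    Higher assumed volatility keeps uncertainty (sigma2) high, so the
    agent keeps updating quickly; low volatility lets updates slow down."""
    sigma2 = sigma2 + volatility          # uncertainty grows between trials
    lr = sigma2 / (sigma2 + obs_var)      # precision-weighted learning rate
    pe = y - mu                           # prediction error
    mu = mu + lr * pe                     # precision-weighted belief update
    sigma2 = (1 - lr) * sigma2            # posterior uncertainty shrinks
    return mu, sigma2, lr

mu, s2 = 0.0, 1.0
for y in [1.0, 1.0, 1.0]:
    mu, s2, lr = volatility_weighted_update(mu, s2, y, obs_var=1.0, volatility=0.5)

# After three consistent observations the belief has moved most of
# the way from 0.0 toward 1.0, and the learning rate stays sizable
print(round(mu, 2), round(lr, 2))
```

In the full HGF, the volatility term is itself inferred at a higher level rather than fixed, which is what makes the precision weighting "dynamic."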
People who watch the Active Inference Institute's programs regularly should be aware that "active inference" nowadays has also evolved into a broader umbrella term for multiple specific mathematical formalisms. To be clear, the one we go through here is the standard, initial partially observable Markov decision process (POMDP) framework. This is a specific model of decision making, based on a particular model-based explore-exploit tradeoff, where instead of just using reward as the cost or objective function, it uses the expected free energy, which is a combination of expected reward and expected information gain. So it's a specific algorithm, the original one under active inference, though there are now lots of variants; we just go through the basic one. Again, we talk about how it's related to predictive coding, and what's different about it. And in this case there is actually quite a bit more empirical work, at least for this version: there are a number of studies that have directly fit active inference models to decision-making behavior. Several of these come out of our lab, but there are others that didn't. We reviewed each of these in detail and laid out how the models were set up to model specific decision-making or perceptual decision-making tasks. So, for instance, here we introduce the model for a specific task we used to study interoception, cardiac interoception specifically. And here you can see a fairly standard depiction of the active inference model.
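The expected free energy tradeoff just described can be illustrated with a toy bandit example. This is a deliberately simplified sketch, not the full POMDP scheme: beliefs about each arm's win probability are Beta distributions, pragmatic value is the expected reward, and the exact expected-information-gain term is replaced by a crude 1/(a+b) novelty bonus (an illustrative stand-in that shrinks as beliefs about an arm become precise).

```python
def arm_value(a, b, reward_value=1.0, epistemic_weight=1.0):
    """Score one bandit arm with Beta(a, b) beliefs about its win
    probability. Pragmatic value: expected reward. Epistemic value:
    approximated here by 1/(a+b), which is large for unexplored arms."""
    pragmatic = (a / (a + b)) * reward_value
    epistemic = epistemic_weight / (a + b)
    return pragmatic + epistemic  # the agent picks the arm maximizing this

# Three arms: a well-sampled good arm, a well-sampled bad arm,
# and a completely unexplored arm
arms = {"A": (8, 2), "B": (2, 8), "C": (1, 1)}
scores = {name: arm_value(a, b) for name, (a, b) in arms.items()}

# The unexplored arm C wins on epistemic value despite A's higher
# expected reward: information gain can outweigh reward
print(max(scores, key=scores.get))  # → C
```

Lowering `epistemic_weight` shifts the policy back toward pure reward seeking, which is the explore-exploit dial this objective builds in.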
And then, again, these are the sorts of things you see in lots of active inference papers: the mathematics of the general update equations and the overall algorithm, and then a fairly coarse-grained, generic representation of one way the brain might implement those algorithms, inspired by cortical column structure. Anyway, here you can see an active inference model we set up for a three-armed bandit task that we used in substance use disorders. What we were able to show, for example, was that substance users showed slower learning rates from negative outcomes than healthy individuals. So you can see it's much more oriented toward answering psychiatric questions about clinical groups, because our lab is focused on computational psychiatry in particular. We also walk through a task we modeled using an approach-avoidance conflict paradigm. That model was also able to show differences in decision uncertainty and in what we call emotion conflict, effectively a type of preference precision, which differed between affective disorders, substance use disorders, and healthy controls in an interesting way. There were also a couple of others. One really cool study, not by us, with Gibson as first author, took a set of publicly available datasets for what's called a two-step task, which is a common task used in reinforcement learning research. In that task, there's a standard reinforcement learning model that's essentially a mixture of what a model-free reinforcement learning model and a model-based reinforcement learning model would predict.
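The mixture model mentioned for the two-step task can be sketched as a weighted blend of model-free and model-based action values passed through a softmax. This is the standard hybrid-model idea from that literature in schematic form only; the weight `w`, inverse temperature `beta`, and all values below are made up for illustration.

```python
import math

def hybrid_choice_probs(q_mf, q_mb, w, beta):
    """Choice probabilities from a weighted blend of model-free and
    model-based action values. w=0 is purely model-free, w=1 purely
    model-based; beta controls choice determinism."""
    q_net = [w * mb + (1 - w) * mf for mf, mb in zip(q_mf, q_mb)]
    logits = [beta * q for q in q_net]
    m = max(logits)                      # numerically stable softmax
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Model-free values favor action 0, model-based values favor action 1;
# the fitted weight w determines which tendency dominates behavior
p_mf_agent = hybrid_choice_probs(q_mf=[1.0, 0.0], q_mb=[0.0, 1.0], w=0.1, beta=3.0)
p_mb_agent = hybrid_choice_probs(q_mf=[1.0, 0.0], q_mb=[0.0, 1.0], w=0.9, beta=3.0)
print(p_mf_agent[0] > 0.5, p_mb_agent[1] > 0.5)  # → True True
```

Fitting `w` per participant is what lets the task index individual differences in model-based versus model-free control.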
What that task was initially designed to do was use this mixture reinforcement learning model to test individual differences in how model-based versus model-free people are. We can talk about the model-based versus model-free distinction more if listeners aren't familiar with it. But what they did, which was really cool, was take that data, fit an active inference model to it, and then compare the ability of active inference to explain behavior against the standard reinforcement learning model I just mentioned. What they were able to show is that the active inference model was actually a bit better at explaining behavioral patterns in two of the four datasets; in the other two, the models were about equivalent. That's probably the most direct evidence that has been shown so far. And this was one of the major points we wanted to make: even though we and other people have started to do actual empirical studies fitting active inference models to data, the main purpose so far hasn't been to provide unique evidence for active inference. It's been to use active inference as a way to identify individual differences or group differences in a psychiatric research context.
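The kind of model comparison described here is typically done with fit metrics that penalize complexity, so that a better-fitting but more flexible model doesn't win by default. A minimal sketch with made-up numbers (not values from the study being discussed):

```python
import math

def bic(log_likelihood, n_params, n_trials):
    """Bayesian Information Criterion: lower is better.
    Trades off fit (log-likelihood) against complexity (free parameters)."""
    return -2 * log_likelihood + n_params * math.log(n_trials)

# Hypothetical fits of two models to the same 200-trial choice sequence:
# the active inference model fits better but has one extra parameter
ll_active_inference, k_ai = -110.0, 4
ll_reinforcement_learning, k_rl = -118.0, 3

bic_ai = bic(ll_active_inference, k_ai, n_trials=200)
bic_rl = bic(ll_reinforcement_learning, k_rl, n_trials=200)
print(bic_ai < bic_rl)  # → True: the fit advantage survives the penalty
```

This is why fitting competing models to the same data matters: a good absolute fit for one model says little until it beats alternatives on a complexity-penalized criterion.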
So one thing that is still really needed, aside from that one Gibson study I mentioned, is more studies where people take or collect behavioral data on decision-making tasks, fit a bunch of different active inference models to it, and also fit a bunch of competing models, like reinforcement learning models, to really show that active inference can do a better job of predicting that behavior than the competing models can. It's not enough to show that active inference models fit well; you need to show that they fit better than competing models, because that's how you'd really show that active inference is more likely to be what the brain is doing when people make decisions. The overarching conclusion is that the evidence is promising: nothing suggests that the brain isn't doing active inference, and the evidence is certainly consistent with people using active inference, but a lot more still needs to be done to show that active inference is a better explanation for what people do than simpler competing models. So that's a brief overview of the general structure of the paper and the message we were trying to get across. We can certainly talk about it more interactively now. There's lots I didn't cover, but hopefully that serves as an initial launching point for discussion. I'll stop there. Thank you.

Awesome. Perhaps the other authors could give their first takes and introduce themselves. Sure, I'll start. I'm Rowan. I'm a PhD student; Ryan's my supervisor. I'm at the Laureate Institute for Brain Research.
I did my master's at the University of Cape Town under Jonathan Shock, Mark Solms, and Ryan as well. To talk a bit more generally: this paper actually started off as a book chapter, which was only going to be focused on active inference, particularly on empirical studies of active inference. In the process of writing it, I found myself needing, almost from the start, to write about predictive coding. I think whenever we get the opportunity, we should take it to present this material in a very methodical and complete way; I remember when I was first learning about active inference that it's very hard to dive straight into the middle. So the paper morphed from being focused just on active inference to covering both active inference and predictive coding, with predictive coding acting as a foundation for a large part of active inference. And then of course it morphed a bit further, from just discussing empirical studies to also covering the background mathematics and theoretical foundations. This is the sort of field where it's a noble cause, whenever we get the chance, to try to present a thorough review of all aspects of it, because it's a difficult field to learn and can be very confusing. So I really hope that this review, on top of presenting the empirical side of things, also acts as a reasonably intuitive way into the foundational theory of predictive coding and active inference.
Yeah, and one thing I'll add to that: for people who have more of a broad conceptual or philosophical interest in active inference, as opposed to a detailed mathematical-modeling understanding, what I've often seen in the past is that predictive coding and active inference get a bit conflated. People think that active inference is somehow just predictive coding plus motor control, or plus decision making, or something, and making clear that that's actually not the case is important. Predictive coding is actually an entirely different algorithm: it has a different generative model associated with it, and it uses continuous state spaces, whereas this form of active inference uses discrete state spaces and discrete time. So to really lay out, as I think Rowan did very well, with Marishka's help, what the connection is between predictive coding and active inference, but also that they are not the same thing and are not even directly connected to each other, and to make that a bit more clear and intuitive, was another thing I thought they did well and is another useful take-home point of the paper.
Hi, I'm Marishka. Like Rowan, I'm a second-year PhD student at the Laureate Institute, and Ryan is my supervisor. My background is more in neuroscience and psychology, so this paper was a really cool and challenging first step in my PhD. Coming from a more empirical background, working on this paper was not so much taking baby steps as a deep dive into the active inference and predictive coding literature. I'm one of the people Ryan was talking about who had more interest on the empirical side, and going through this journey really helped me walk step by step through the different aspects of the predictive coding and active inference literature that you should take into consideration: the algorithmic side, the neural side, and connecting those levels. Different people tend to think about it from just one perspective, but what's really needed is connecting all three. I felt like I was playing a supporting role in this paper, but it really was a great learning journey for me.

Okay, why not go to Mr. Empirical, Dean? So, my interest in this: any time I see a title where there's a minimum of two ideas being brought together as the unit of analysis, I'm always curious. Is it okay if I ask one unpacking question around predictive coding, and one question related to active inference at the conclusion? Would it be okay if I read a little section of the paper? Oh, sorry, there's a little microphone thing; can you just repeat that again?
Okay, so I'm on page eight of the paper, and the section that I'd like a little unpacking around is this: "Thus, fitting predictive coding models to responses on perception tasks, and testing whether quantitative predictions from simulations hold, remains an important direction for work going forward. This is crucial because simulated dynamics in predictive coding models can make predictions that are not always straightforward a priori, and that depend on the specific hypotheses built into a formal model," and Ryan touched on that in his introduction, "for example, the specific mathematical form assumed for the mapping between levels in a hierarchy, or the direction in which neural activity is assumed to represent a particular posterior estimate." And then, this is the part I was hoping somebody could unpack for me: "Often, sequential dependencies between task trials and patterns in dynamics can be missed in summary statistics that average over trials, so models are necessary to predict and test for the presence of those precise dynamics." I was hoping maybe somebody could put some color on that process and what it means.

Sure, I can take a stab at that, and Rowan and/or Marishka can as well, but I'm pretty sure that's something I wrote, so maybe it's for me to at least start on it. In the field of computational psychiatry more broadly, not just active inference, the goal is really to take a set of task behavior. We can ask people to do some kind of sequential decision-making task, usually involving exploratory dynamics of some form. A common one would be a three-armed bandit task: I have people play a little game where there are three different options and they don't know what the reward probabilities are, so they learn by trial and error. In the beginning they might choose option three and see whether they win or lose. If they win, let's say they probably stick with option three again; but if they lose, maybe they try option two, and if they lose again, they could go back to option three or try option one, and so on. An active inference model in this case, and I'm just using this as a simple starting example, would, under different parameter settings in the model, different learning rates or different directed exploration drives, however you want to parameterize it, make different predictions about what that sequence of choices would be. And there's a sequential dependence because the choices on each trial are not independent: what a person chooses on trial two depends very much on the outcome they saw on trial one, and what they choose on trial three will depend on what they saw after both choice one and choice two.

Now, the standard non-model-based analytic thing you would do for a task like that is compute summary statistics. A common one would be to count the number of times people switched options after a loss, or how many times they stuck with the same choice after a win or a loss. We call those win-stay, win-shift, lose-stay, and lose-shift, and you could just count their proportions. That can tell you something: how often people lose-shift, for example, might tell you something like their learning rate for losses. They might learn more quickly, update their beliefs more after a loss, if they're the kind of person who switches away; they'd be quicker to assume that if an option led to a loss this time, it's probably not a good option anymore, so they switch to something else. Whereas if a person has a slow learning rate for losses, maybe they'll stick with the same option a couple of times, needing to see a few losses in a row, before deciding it's definitely a bad option. But the point is that if all you do is average over trials and get a summary statistic about how many times people win-shifted or lose-stayed, you're not going to get anything about the pattern in the dynamics. If someone lose-shifted in the first few choices, that tells you something pretty different than if they lose-stay or lose-shift on later choices: the early choices might be driven much more by exploratory drives, by information seeking, whereas later trials might be much better explained by differences in learning rates.

So, fitting models to behavior, which just means you have the actual behavior and you see what the model predicts under a bunch of different parameter values, trying to find the parameter values that best reproduce each person's behavior, is something we've done several times now for active inference, using the patterns of decisions people make in these games. And you can do the same thing for predictive coding: you can give people perceptual decision-making tasks, where they report what they perceive on each trial, and those will also have sequential dependencies, because people build up prior expectations about what they're going to see next. Predictive coding models, under certain parameter values, might say: if they saw the same thing the last five times, they're going to be a lot more biased; they'll have a much more precise prior that they'll see it again, so the probability that they report seeing it again, even if you showed something different, would be higher. The point is that it's not always the case that a predictive coding model predicts a trajectory of responses that's obvious without actually fitting the model. If, for instance, a person has no tendency whatsoever to be more likely to report seeing something just because they saw it a bunch of times in the past, that wouldn't be very consistent with the idea that they're using predictive coding. And it gets a little trickier than that, because predictive coding is based on continuous state spaces, so it's not really even a matter of a person discretely choosing "I saw this" or "I saw that"; it's much more like them continuously turning a dial as they see something get brighter or dimmer, or motion in one direction or another. And because the prediction error equations are set up in this continuous way, they also have oscillatory dynamics: it's not that you get an error and it simply drops as the thing resolves; it oscillates up and down a little. So that would also predict sometimes funny patterns, definitely not something you could predict a priori, like exactly how someone is going to turn a dial. So that's the kind of thing I mean.

Okay. In very simple terms, Daniel and I have had some conversations around if
you're in a and if you're in a situation where your focus or your concentration is on the next move or if you can look and see the entire space and it's taken all moves perspective so my asking you the question was to get out of the sort of model piece of it and that so what what do people actually do and I think you're the question around the oscillate they're the answer around the oscillation sort of speaks to that it's it's difficult when you're focusing on that next part of the sequence to be able to take in the entirety of the and vice versa if you're focusing on the averages maybe you're not able to pick up on the nuances of the next step can I ask one more question sure or would the other authors like to continue to fill in on that first answer I think Ryan explained that very well I think this is a general um I'll talk about it I know we that is referring to predictive coding but in something like in something I'm very interested in active inference is along the lines of differentiating for example between a model that uses reinforcement learning and a model that uses active inference and um yeah using summary statistics sometimes it can be hard to actually see differential behavior right this is what we care about when we are looking at comparing different models what do they actually predict differential behavior and that differential behavior can be difficult to capture in summary statistics sometimes um so in general I think in the world of sort of um yeah model fitting this is just a common common theme and as Ryan said while we're doing the sector for instance they are limited limited applications of that methodology in predictive coding so okay mariska so I'll ask this that they think this was in the in the conclusion of the paper um again I'll just read it in contrast to predictive coding research in contrast predictive coding research can be traced back nearly four decades and make specific predictions that can be investigated across a variety of 
fields. So there's a deeper well of priors there — I assume that's what that means. It continues: "It will be an important direction for future research to further develop the neural process theory underlying active inference and allow for precise implementation-level, as opposed to simply algorithmic-level, predictions about brain and behavior. Until that time, confidence in active inference as a neural model of decision-making should remain tentative." I completely agree. But this is the part I thought was really interesting: "Another important limitation associated with the current active inference scheme is that of scalability, which constrains the phenomena that can be examined in empirical studies. That is, while these models work well in the context of simple tasks, they become less tractable if applied to many real-world problems with high-dimensional sets of states, observations, and policies." My question was about the idea of strategy, because active inference is known as a kind of basis for strategizing behavior. Is the question around scalability one that ties into the idea that, even when humans are trying to strategize, how far out they can generalize their strategy is a difficult measure to get precise, given that most contexts are dynamic and changing? So there are several related things here. The tractability issue comes in two different flavors. One flavor is tractability with respect to our ability to use these models to even simulate behavior, in contexts where the space of options, and how far into the future people are planning, gets big. The other question is more about psychological plausibility: even if I can get an active inference model to simulate ten steps ahead, where there end up being thirty or forty different combinations of ten moves or something like that, it's not really very plausible
that humans are really doing that — that the brain is really doing that. And on a standard computer it's going to take... Rowan has these fun planning models — he has a paper that I think we're pretty close to submitting; a preprint of an earlier version is out — where the agent does plan a bunch of different possible paths it can take to find different sorts of rewards. And it takes hours to run those simulations on a computer, for something that, however humans are doing it, they can do in a minute or so. So it's just not very plausible that the way active inference is solving problems like that, in any kind of exact form, is the same way humans are doing it. So there's scalability with respect to what humans can do — because it doesn't seem like humans can do this explicitly, in a fully model-based way, past a certain level of complexity in the planning problem — and then the other part is tractability with actually doing the modeling itself, because of how prohibitively long it can take to run these simulations. Both of those things come into play, and it's not super clear exactly how to address this. There are certainly ideas out there, but most of them involve additional machine-learning tricks — adding deep neural networks to active inference models, for example, or various heuristic, shortcut-like things that make stuff tractable: not actually searching all the way down every possible branch of a decision tree, but using some heuristic to say, two steps in: nope, this branch seems
bad, I'm just going to cut this off and not consider it anymore — things like that, or sampling-based approaches for exploring just little bits of the decision tree at a time. It's no longer pure active inference anymore; it's a combination of active inference and a bunch of other machine-learning things. But to a certain extent that's probably just necessary. The bottom line is that the brain probably has to be doing something other than fully model-based active inference to be tractable, and figuring out exactly which parts of what the brain is doing might be pure active inference, and which parts are these other things that make stuff tractable, is part of the question. Yeah — look, the scalability issue with active inference is not an active inference issue; it's just a Bayesian learning and decision-making issue. There's nothing inherent to active inference that slows things down. Any technique that can be used in any other sort of Bayesian decision-making method can be used in active inference. So this is the grand question of how the brain is so efficiently able to do this. As Ryan said, you run these very basic tasks, and because of the exploding state space of Bayesian decision-making — this exploding space of probabilities — you get a massively expanding search tree. That has nothing to do with active inference specifically; of course, active inference is a Bayesian framework, so it has to live with this. But if we could solve how to perform Bayesian inference — and especially Bayesian inference in the service of decision-making — very efficiently, that would solve a lot of things.
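To make the "massively expanding search tree" concrete: a fully model-based planner that scores every action sequence faces a policy space of size |actions|^horizon. A quick sketch, with a branching factor and horizons chosen arbitrarily for illustration:

```python
# Exhaustive policy evaluation: with A actions per step and a planning
# horizon of H steps, there are A**H distinct policies to score.
A = 4
for H in (2, 5, 10, 15):
    print(H, A ** H)
# Even at H = 15 with only 4 actions, that is over a billion policies —
# which is why exact planning becomes intractable so quickly.
```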
But as of yet, if we want to speed active inference up, we have to speed the general field of Bayesian decision-making up. Yeah. Oh, it's okay — I was just going to say: is part of it, perhaps from a strategy standpoint, not trying to generate the world's biggest plan, but going back to that rules business and starting there instead? Instead of a plan that, just because of its size, makes things intractable, could we swap something else in for the thing we know won't work — the world's biggest plan? I mean, there are lots of different things you might consider. One thing, for these exploding-decision-tree sorts of problems, is finding some way to chunk things together. So, for instance: I decide I'm going to walk to the store. In a certain sense I've got a ton of different options about where to put my feet at each step, what door to go out of, and so on. But it's not really clear that, when I'm planning to go to the store, I'm explicitly considering all of those details. I can chunk it: I'm going to walk out to the street, I'm going to walk a mile to the store, and then I'm going to walk into the store. If I've chunked it into just three steps, my decision tree is already much more tractable — and then you just need some story about how, when we get to certain points in that abstract chunked plan, when we're in those chunked states, we make the more local decisions about what to do at the smaller scale. So there are lots of hierarchical, chunking-ish sorts of things you might think the brain could be doing to make these things tractable.
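A back-of-the-envelope sketch of why chunking helps (the numbers are mine, purely illustrative): planning twelve primitive steps in one flat tree, compared against choosing three high-level chunks and then planning locally inside each chunk.

```python
A = 4                      # options at each decision point
STEPS = 12                 # primitive steps in the full plan
CHUNKS = 3                 # high-level chunks ("reach the street", ...)
LOCAL = STEPS // CHUNKS    # primitive steps planned within each chunk

flat = A ** STEPS                            # one monolithic policy space
chunked = A ** CHUNKS + CHUNKS * A ** LOCAL  # chunk-level plan, then each
                                             # chunk planned locally
print(flat, chunked)  # → 16777216 832
```

The exact bookkeeping depends on how sub-plans are scored, but the exponent drop — from one A^12 problem to an A^3 problem plus a few A^4 problems — is the point of the hierarchical story.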
But there are lots of different options on the table and things people might try. Certainly, all of these things have not yet been tested against behavior to see which are most plausible, but there are different possibilities for what might be going on. Another example — something we test empirically in my lab right now — is the way people might do something called aversive pruning. What you're doing there, and this is similar to what I said before, is that when you start planning down some tree, if there's some early outcome you imagine will lead to something really negative — say, on step one or step two of a possible plan I think, oh, there's going to be some big negative thing — then I just no longer simulate down the rest of that branch. It's a way of reducing the number of branches in the tree I have to search, and it's often called aversive decision-tree pruning. Obviously you need to do something like that to keep things tractable, but at the same time it can cause problems, because sometimes the best plan goes through something negative in the short term but leads to the best thing in the long term. So that's the kind of thing we test in the lab: whether different psychiatric conditions involve pruning too much, or too little, and how that could lead to suboptimal behavior in the long term.
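A minimal sketch of aversive pruning on a toy reward tree (my own construction, not the lab's task): branches whose immediate outcome falls below a threshold are abandoned, which cuts the search dramatically but can also discard plans whose short-term cost pays off later.

```python
import random

rng = random.Random(0)
B, D = 3, 6                        # branching factor and planning depth

# Pre-sample a reward for every node of the full tree.
rewards = {}
def build(path=()):
    for a in range(B):
        node = path + (a,)
        rewards[node] = rng.uniform(-1, 1)
        if len(node) < D:
            build(node)
build()

visited = 0
def best_value(path=(), prune_below=None):
    """Best cumulative reward below `path`; counts evaluated nodes."""
    global visited
    if len(path) == D:
        return 0.0
    values = []
    for a in range(B):
        node = path + (a,)
        visited += 1
        r = rewards[node]
        # Aversive pruning: abandon the branch if its immediate
        # outcome looks worse than the threshold.
        if prune_below is not None and r < prune_below:
            continue
        values.append(r + best_value(node, prune_below))
    return max(values) if values else float("-inf")

visited = 0; full_value = best_value(); full_n = visited
visited = 0; pruned_value = best_value(prune_below=-0.5); pruned_n = visited
print(full_n, pruned_n)            # pruning evaluates far fewer nodes
print(full_value >= pruned_value)  # ...but can only do as well or worse
```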
The other aspect of your previous question had to do with the part you read about the neural basis of active inference and why confidence there should remain tentative — let me clarify that a little. The main point is just that, again, this involves being clear about the separation between algorithm and implementation. The algorithm is basically just the mathematics we lay out; the implementation question is which of the different possible ways the brain could be set up to carry out those equations is the actual one. How the brain is doing it is a separate question, even if you think there's good reason to believe the brain is doing something well characterized by the mathematics. And unlike predictive coding, which has been around a long time and where there's been much more opportunity to test at least qualitative predictions — things like whether there's evidence for omission responses in the brain, where the lack of a stimulus when one was expected leads to a neural response, which is one of the stronger pieces of evidence that the brain is doing something predictive, like predictive coding — for active inference there has really been one imaging study, done around 2015, and it was based on an older version of active inference, not the current formalism. It just hasn't been done. And even the little column-structure diagrams — like what's in the review in this paper — are super promissory, heuristic sorts of things: here's a bunch of little balls we'll pretend are neurons, and here's how you could connect them together, roughly, to do some message passing algorithm you're hypothesizing is the one the brain uses for this kind of approximate inference process. And you
know, even when it comes to the current mathematical proposal for how the brain might be doing something like this, there are different hypotheses. Initially, most people doing simulations with active inference models were assuming something called variational message passing, which is a particular way of repeatedly doing local approximate Bayesian inference on different nodes in a graph. You do this over and over, and it converges to a good guess about what the posterior over states should be at each time point. Then, a bit more recently, Thomas Parr and Karl Friston — and maybe other people were on the paper, I can't remember — proposed an updated version called marginal message passing, which is slightly better: it can get better approximate posteriors on average. And there are others — belief propagation is another message passing algorithm that's been considered. So there are a bunch of these, and each of them, even if they're all doing active inference, will predict different neural dynamics. So even just "active inference" separates, under different message passing algorithms, into a bunch of different competing hypotheses about what you would measure in the brain.
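As a small illustration of why the choice of scheme matters (a toy two-timestep, two-state example I made up; nothing here is from the paper): exact inference on the chain — what belief propagation would return, since belief propagation is exact on trees — is compared with a mean-field fixed-point iteration in the style of variational message passing. The two schemes settle on different posteriors, and their iterates trace out different dynamics along the way.

```python
import numpy as np

p0 = np.array([0.5, 0.5])             # prior over the first hidden state
B = np.array([[0.9, 0.1],             # B[i, j] = P(s2 = j | s1 = i)
              [0.2, 0.8]])
A = np.array([[0.8, 0.2],             # A[i, o] = P(o | s = i)
              [0.3, 0.7]])
o1, o2 = 0, 1                         # observations at t = 1 and t = 2

# Exact posterior marginals by brute-force enumeration (what belief
# propagation recovers on a chain like this).
joint = np.einsum('i,i,ij,j->ij', p0, A[:, o1], B, A[:, o2])
joint /= joint.sum()
exact_s1, exact_s2 = joint.sum(1), joint.sum(0)

# Mean-field fixed point in the style of variational message passing:
# q(s1) and q(s2) are updated in turn using expected log-messages.
q1 = np.full(2, 0.5)
q2 = np.full(2, 0.5)
for _ in range(100):
    log_q1 = np.log(p0) + np.log(A[:, o1]) + np.log(B) @ q2
    q1 = np.exp(log_q1 - log_q1.max()); q1 /= q1.sum()
    log_q2 = np.log(A[:, o2]) + q1 @ np.log(B)
    q2 = np.exp(log_q2 - log_q2.max()); q2 /= q2.sum()

print(np.round(exact_s2, 3), np.round(q2, 3))  # the two schemes disagree
```

With these numbers, the mean-field marginal at the second time step comes out noticeably more extreme than the exact one — the kind of quantitative divergence that, in principle, makes the schemes distinguishable as hypotheses about neural dynamics.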
Smith — where might he place his previous work, like "Simulating the computational mechanisms of cognitive and behavioral psychotherapeutic interventions: insights from active inference," in relation here? Would this kind of work be more on the side of theoretical exploration, like the cognitive-affective-behavioral interactions construct, or is there a direction for empirical testing? That was absolutely a theoretical, simulation-work sort of paper — we weren't fitting that model to any empirical data. And that model in particular is a little too general: the patterns of behavior it predicts are really not specific enough to test in some sort of task. You could modify it in a way that might make it specific to some task involving particular explore-exploit decision-making choices under expected negative outcomes, things like that. But it does make a kind of qualitative prediction that could be tested. One parameter you might fit, based on that model — and I think it would be super interesting if you could turn it into a task — reflects the degree to which your cognitive beliefs influence your automatic affective responses, or expected affective responses. It would be really interesting to use that to measure individual differences in what often gets called cognitive penetrability: I might explicitly believe I'm safe in a certain situation, but I might see something that still produces this automatic fear response in my body, even though I explicitly believe it's safe. My affective responses don't always have
to match my explicit cognitive thoughts — that's often a thing you see in people with affective disorders. So there's this individual difference you could estimate: the degree to which these two different hidden state factors in the model effectively interact. The other prediction the paper makes — again, a little more qualitative, but certainly something that could be tested or made more precise — is that, in that model, one of the interesting things that came out of the simulations is that it's probably not a good idea to make people explicitly believe they're in a safe context before you do something like exposure therapy. Because if you do that, then the beliefs you're updating — in your likelihood, about which actions lead to which outcomes — are updated under the belief of a safe context. That means all it takes is switching back to believing you're in the dangerous context, and you're right back to the problematic avoidance behavior. And there's recent empirical work that's very consistent with this idea — using a latent cause inference framework, not active inference in particular, but latent cause inference more generally — showing that to really get exposure therapy to work long-term, you need to prevent people from inferring that there's a new latent cause in operation while they're doing the exposure. Otherwise they're just learning safety under a context where they can easily get their maladaptive avoidance to come back spontaneously, just by being uncertain about the context they're
in later. They're not actually unlearning the problematic belief; they're just inferring there's a new cause, and under that new cause it's safe. So the prediction of that paper — which could be tested, and again, there is now some evidence consistent with this — is that it's better to keep people uncertain, or to keep people believing they're still in the same context, so that when they go through exposure therapy and get unexpected outcomes, they're actually overriding their previous beliefs rather than inferring they're in a new context. Awesome. All right, I'll ask a kind of general question that I'd love to hear everyone's perspective on. We have active inference, or predictive coding, or some other formal framework, that does interface with empirical data, as we've been discussing. On one hand we have the empirical system, or the measurements coming from it — existing data sets, data sets we create, and so on. On the other side of this kind of statistical interface is the math: the free energy principle, category theory, stuff whose validity or truth or essence isn't really described by any particular system. So how do you approach this usefully as a graduate student, a researcher, or a broader clinical research program? How do you think about which system you're choosing, which statistical interface, how detailed you go there, and what's on the other side of the statistical interface? Because a lot of the time, discussion around the validity of active inference and the free energy principle is treated as a super philosophical question — as if a philosophical discussion would resolve uncertainty about the empirical status of active inference. Which is actually the direction you all took it: from the interface back to the system, rather than the interface as an outcome from
something theoretical to debate. I feel like I have been hogging the show here a little, so I'll let other people answer that first. Okay — sorry, that was quite a long question, so forgive me if I miss some of it, and just correct me if I'm not answering it right. I'll speak to this idea of having, on one side, mathematical formulations representing something like brain processes, and on the other, brain processes and behavior themselves, and how these are seen as unified. As Ryan said about neural implementation, there are many ways we could set up a neural implementation to achieve something like predictive coding or active inference. The same goes for the algorithm, where many different forms of algorithm — or statistical windows, as you said, if you want to go broader — could be achieved by the same neural implementation, and then ultimately behavior. In terms of choosing what to focus on as a graduate student — and please stop me if I'm not answering correctly — I think it's an evolving process, where you constantly read and come up with new ideas about how these things can connect and how one can represent the other. For me it's been particularly interesting to actually understand what active inference is in its mathematical essence. We've got this first-principles account of it, derived from concepts like homeostasis — and it can sometimes feel a bit strange to look at quite simple mathematical implementations in an algorithm and see that, well, this isn't so different from something like Bayesian reinforcement learning with specific directed exploration. So there's this sense of the first-principles account, coming all the way from Markov
blankets — where is that captured in these algorithms? That's something I'm still trying to figure out, to be honest. It's a general problem, and I think that's why, when Ryan mentioned looking at more complex tasks — I think more complex tasks would allow us to explore more complex representations of active inference that can perhaps capture some of the foundations Friston laid out. Does that answer the question, sort of? Yeah, that's awesome. Yeah — the other thing I would say: I think I understood at least part of the question a little differently — and by that I don't mean you misunderstood, Rowan, you just answered a different part, which is all good. Part of what I took your question to be was about the degree to which answers about the, quote-unquote, validity of active inference or the free energy principle can be established through philosophical argument, versus what needs to be answered through direct empirical testing. And I think it depends. What philosophy is really good at is actually finding ways that something couldn't be true. I have a philosophy background as well, and I think philosophy is really good at conceptual analysis: finding where some particular conceptual framework has tensions — where things that at first appearance don't seem contradictory turn out, when you really drill down, to harbor contradictions that rule something out or make it really implausible — either through direct logical argumentation, or through
proposing thought experiments that pump intuitions in a way that makes a difference. So I think you can rule out, with philosophical methods, things that are conceptually problematic. You can get down to the set of frameworks — the set of interlinked concepts that make up a theory — that are at least internally consistent and have sufficient general coherence. But once you've weeded things down to the subset of possible theories that are internally consistent and don't have philosophical problems associated with them, then things really need to get empirical. Because to the extent that different competing conceptual and theoretical frameworks make different predictions, and to the extent that the goal of this whole project is to come up with an accurate theory of what brains are doing — if that is the goal, then the only way ultimately to validate it is by actually looking at brains and seeing what they're doing. And I think that's where the rubber meets the road a bit: you need to know what's plausible, and that's where philosophy can help a lot; but after you know what's plausible, testing what's actual is the next necessary step. And this comes back to what Rowan was saying: testing the unique predictions of some given active inference model versus some competing model from another framework, like reinforcement learning, can be tricky. Usually you have to do it incrementally, by identifying some prediction that differs between the two models and then designing
a task that would really put people in a position where they have to make choices consistent with the prediction of one model or the other. It really requires some creativity — coming up with the right task, with the right decision-making demands and constraints — because there are lots of tasks where, say, the simplest, most straightforward active inference model will make almost exactly the same predictions as a simpler reinforcement learning model. In those cases you really can't differentiate them, because they pretty much predict the same thing. And as Rowan mentioned, even though active inference is motivated from first principles — which is super nice — the decision-making algorithm you ultimately converge on in active inference starts to look pretty similar to the sorts of things people in machine learning and reinforcement learning have ended up arriving at anyway. They just haven't come to it from a first-principles perspective; they've come to it by tinkering around and figuring out what works. So that's also an interesting question: where can we find differences between active inference and the other stuff out there that has no connection to active inference but has converged on a similar type of solution? And then you get to a point where, okay, if we've arrived at two algorithms with different origins that are basically doing the same thing, there is no competition anymore; we've just gotten to the same solution two different ways. So that goes a little beyond your question, but I think hopefully it's all
sufficiently relevant. That's awesome. Maybe Marishka, and then, kind of in closing here: how do people pick up and run with this work, and how do they continue it — but first, how do you approach it at a bigger level? Yeah — I come at this from a different side; I come from a more empirical background, like Ryan was saying. So to me it's about taking all of this information, engaging the broad literature out there — not just in active inference or predictive coding, but also from the psychology and neuroscience side — and seeing where these things fit in an empirical setting: where these algorithmic accounts can be tested in a plausible manner. For example, as Ryan mentioned, the last fMRI study that actually tested active inference models was done around 2015. To me, that's the first thing we'd want to do: test what is actually happening in the brain, whether through fMRI or EEG — testing at different levels, through different computational models, what the brain is doing. To me that's the most interesting next step: taking the computational work out there and relating it back to brain and behavior. Rowan, and then Ryan or anyone else, can give any last thoughts on what they're taking forward. Go for it, Rowan, if there's anything you want to say. Sorry, I was muted. Yeah — I think active inference is in an interesting spot, and I think the world of AI is in a very interesting spot — and by AI I mean artificial intelligence, not active inference. If we look at active inference as an element of artificial intelligence, it represents a neuroscience-inspired side of the field, and as we all know there's a bit of a war going on between that and the large-language-model kind of thing. And I
think, while it's tempting to go ahead with amassing more data and training larger models, it's really important not to forget about this aspect of trying not just to come up with more architecture and more data, but to come up with more interesting frameworks. I also think that, unlike traditional artificial intelligence or machine learning, it's important to remember the tension we mentioned before between scalability and efficiency on one hand and biological plausibility on the other. We don't always want a model to just be better and quicker; that isn't necessarily what we're going for. We know that in many cases artificial intelligence can do things better and quicker than the human brain, but we shouldn't only care about that. Sometimes it's important to get a solution that doesn't act more effectively, but acts more like the empirical data we've gotten from actual human behavior — one that, even if it isn't the most efficient architectural or algorithmic structure, matches the structure the brain might implement. So I think it's important to remember these factors in a world that's fast developing toward big data and massive models. Sometimes I find myself drawn to just trying to make active inference faster and more efficient, and I have to check myself and remember what I'm actually interested in. Yeah. And I guess, concluding on my end — Daniel, you asked specifically about future directions — the kind of thing we're doing in the lab right now is very much inspired by what we said needs to be done in the paper. It's always, in a certain sense, a little bit of an advertisement: here's what we've done, here's what needs to be done — and we're trying to
follow that ourselves — and again, very much in the context of trying to use these frameworks to be of some practical benefit to the world. So we use this, in tandem, as an opportunity to test theories about how brains and minds work, and as a simultaneous opportunity to look for where these mechanisms may be going wrong in people with different disorders, and whether that can give clinicians ideas for novel treatment targets — target mechanism X versus Y, as suggested by the differences in models when we fit them to behavior in, say, healthy versus clinical populations. On the more basic-science side: I have another grad student, Ko-Ping Chou, who's not on the call here, who has done some very extensive work fitting something like fifty different models to decision-making behavior we have in both a Taiwanese sample — where she's from — and an American sample from around here, trying to find which model fits best. These are a range of simpler reinforcement learning models and a range of different active inference models parameterized in different ways — ones that assume static versus dynamic decision noise, or static versus dynamic learning; there are tons of little variants on these things. And she's found, in both samples, that the model that best fits the data and wins in model comparison is a specific active inference model — with six parameters, I think — that involves dynamic learning or dynamic decision noise and includes a particular
sort of forgetting rate function that's not included in the simplest version of active inference. So that shows much more conclusively, in two different data sets, that active inference at least wins in model comparison against some of these other competing models. That being said, I think it's important that even though it wins in model comparison, when you actually look at the predictive accuracies, they're very similar. We're talking about active inference being able to explain something like 81.5 percent of the data versus the best reinforcement learning model being able to predict something like 80.6 percent of the behavior. So these differences are minor; it comes down to being better at predicting just a couple of choices. But sometimes, if those choices follow from different sorts of expected dynamics along the way, then that can still be meaningful. So that's the kind of thing we've been doing on the more basic science end that follows up on what we were saying needs to be done. On the clinical end, still not super clinical, we have a study that's ongoing right now, with both an online and an in-person component, where we came up with five different tasks where different models would predict different patterns of behavior in distinct ways. A couple of them were designed specifically for active inference; others were designed more in relation to the HGF, the hierarchical Gaussian filter. So we're in the process of collecting a lot of data, where again we'll be able to fit different models. In that one we're also focusing a lot on whether we can find models, and parameter values in those models, that predict individual differences in subjective well-being
as well as negative affective stuff. So we're trying to go for not just predicting symptoms and negative emotional stuff, but also positive emotional stuff, and what makes a difference to decision patterns that are, say, more consistent with greater satisfaction with life and subjective well-being and things like that. We're also doing some computational Bayesian modeling of interoception data, so how well people can detect differences in their heartbeats or differences in their respiratory difficulty, things like that, and how that relates to anxiety disorders. So it's very much a combination of this basic science stuff and trying to move the usefulness of this thing clinically at the same time, so it can be practically helpful to people. That's awesome. Dean, and then anyone else, and then that will be awesome. But where do you go from here, Dean? Ah, sure. I don't know, I don't have anything really planned, but I appreciate the explanations, and like I said, when you start from a place where you take two things and hold them up together at the same time, that's the part that I appreciate here. Reading through all of this, I didn't understand all of the math, but based on what you were able to show, I wasn't questioning the process you went through to draw your conclusions and state what some of the limitations are until those tests have been carried out. Yeah, thank you for the very clear, very relevant paper. Good luck, and you're always welcome back to share more. We can follow up, and it'll be an ongoing rolling literature meta-analysis across different systems; there will be thousands of fractal sub-reviews, empirical status in the amygdala, that kind of thing. So it's great to hear how you did it this time. Thanks for having us. Yeah,
thanks so much. Bye-bye.
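The model-comparison workflow described above — fit each candidate model to choice data by maximum likelihood, compare a penalized score such as BIC to pick a winner, and separately check raw predictive accuracy, which can be nearly identical even when one model clearly wins — can be sketched minimally as below. Everything here is a hypothetical stand-in: a toy two-parameter reinforcement learning model versus a one-parameter static-bias model on simulated data, not the lab's actual ~50 models, six-parameter active inference model, or reported numbers.

```python
# Hedged sketch: maximum-likelihood fitting plus BIC and accuracy
# comparison for two toy models of binary choice data (stdlib only).
import math
import random

random.seed(0)

# Simulate 200 choices from a simple learner (illustrative data only).
true_lr = 0.2
q = [0.0, 0.0]
choices, rewards = [], []
for t in range(200):
    p1 = 1.0 / (1.0 + math.exp(-(q[1] - q[0]) * 3.0))
    c = 1 if random.random() < p1 else 0
    r = 1.0 if random.random() < (0.8 if c == 1 else 0.2) else 0.0
    q[c] += true_lr * (r - q[c])
    choices.append(c)
    rewards.append(r)

def loglik_rl(lr, beta):
    """Log-likelihood under a 2-parameter RL model (learning rate, inverse temp)."""
    qv, ll = [0.0, 0.0], 0.0
    for c, r in zip(choices, rewards):
        p1 = 1.0 / (1.0 + math.exp(-(qv[1] - qv[0]) * beta))
        ll += math.log(p1 if c == 1 else 1.0 - p1)
        qv[c] += lr * (r - qv[c])
    return ll

def loglik_bias(p):
    """Log-likelihood under a 1-parameter static choice-bias model."""
    n1 = sum(choices)
    return n1 * math.log(p) + (len(choices) - n1) * math.log(1.0 - p)

# Crude grid-search maximum likelihood for each model.
best_ll_rl, best_lr, best_beta = max(
    (loglik_rl(lr, b), lr, b)
    for lr in [i / 20 for i in range(1, 20)]
    for b in [0.5, 1, 2, 3, 5, 8])
best_ll_bias, best_p = max(
    (loglik_bias(p), p) for p in [i / 100 for i in range(1, 100)])

n = len(choices)
bic_rl = 2 * math.log(n) - 2 * best_ll_rl      # 2 free parameters
bic_bias = 1 * math.log(n) - 2 * best_ll_bias  # 1 free parameter

def accuracy_rl(lr, beta):
    """Fraction of trials where the RL model's preferred option matches the choice."""
    qv, correct = [0.0, 0.0], 0
    for c, r in zip(choices, rewards):
        p1 = 1.0 / (1.0 + math.exp(-(qv[1] - qv[0]) * beta))
        correct += int((p1 > 0.5) == (c == 1))
        qv[c] += lr * (r - qv[c])
    return correct / len(choices)

acc_rl = accuracy_rl(best_lr, best_beta)
acc_bias = max(sum(choices), n - sum(choices)) / n  # bias model predicts majority

print(f"RL model:   logL={best_ll_rl:.1f}, BIC={bic_rl:.1f}, accuracy={acc_rl:.3f}")
print(f"Bias model: logL={best_ll_bias:.1f}, BIC={bic_bias:.1f}, accuracy={acc_bias:.3f}")
print("BIC winner:", "RL" if bic_rl < bic_bias else "bias")
```

The point the sketch makes is the one from the discussion: model comparison operates on (penalized) likelihoods over whole choice sequences, so a model can win decisively there while its trial-by-trial predictive accuracy differs from a rival's by only a percentage point or so.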