Hello and welcome to the Active Inference Lab. This is the Active Inference Lab and the Active Inference Livestream. Today we are in Active Inference Livestream 18.1 on March 23rd, 2021. Welcome, everyone. We're a participatory online lab that is communicating, learning, and practicing applied Active Inference. You can find us at our links here. This is a recorded and archived livestream, so please provide us with feedback so we can improve our work. All backgrounds and perspectives are welcome here, and we'll be using good video etiquette for livestreams. Today we're in Livestream 18.1, the first of two participatory group discussions on the predictive global neuronal workspace paper, and we're here with the authors, Ryan Smith and Christopher Whyte. So thanks to both of you for joining today. In 18.1 the goal is just to discuss, learn, and ask some questions about this awesome paper. We're going to go through some introductions and warmups, then I believe Christopher has some slides to share to set some context, and then we'll proceed with the questions we all have about the paper, as well as some questions from the live chat. To get started, let's go around and introduce ourselves; we can each give a short introduction or check-in and then pass it to somebody who hasn't spoken. So I'm Daniel, I'm in California, and I will pass it first to Dean. Dean: I'm up here in Calgary in Canada, and I'll pass it to Steven. Hello, I'm Steven, I'm up in Toronto in Canada, and I'll pass it over to Christopher. Hi, thanks. I'm Christopher Whyte, I'm a PhD student at the University of Cambridge. Just to give some brief context, this paper really came out of my master's thesis, which I did with Ryan, but we've continued working on it.
So hopefully we can chat about both this paper and, at some point, some of the things we've been continuing to work on. And so I'll pass to Blue. Hi, I'm Blue Knight. I am an independent research consultant based out of New Mexico, and I will pass it to Dave. Dave Douglas, I'm in the mountains of the Northern Philippines where it's almost the end of spring. I'm retired from IT, especially from machine translation, and working on the semi-technical aspects of our education project. And then I believe Ryan. Yeah, so I'm Ryan Smith. I'm an investigator at the Laureate Institute for Brain Research in Tulsa, Oklahoma. Basically I run a lab focused on computational neuroscience and computational psychiatry, and I'm sure we'll talk a lot more about that. Let's maybe just go right to Christopher's slides, and then we can pick up with conversation or with the slide deck as needed. Christopher, go ahead and share your slides and I'll make them full screen. Yeah, sure. Okay, so I'm assuming you can see my screen. Are you seeing it? We're seeing our Jitsi. Okay, how about now? Yep, looks good. Okay, well, it seems like even with the dual monitor setup it wants to share the same screen that Jitsi is on, and that's okay, I just can't see anyone, so just give me a shout. These are slides from a talk I gave at UCL about midway through last year, so I may need some reminding of bits of this paper, because we wrote it quite a while ago, although it only just got published in December last year. Okay, so the idea with this paper was really just to formalize some things that had been around in the literature for a while, namely putting global workspace theory into a predictive processing framework. Carl Friston has a really nice 2012 paper that did this somewhat formally,
but it didn't really connect that deeply with the global workspace literature. Jakob Hohwy, in chapter 10 of his 2013 book, has a really wonderful chapter where he discusses the relationship between phenomenal unity and global workspace theory, and how the predictive processing framework in general, and active inference broadly conceived, fit in with global workspace theory; how these ideas can help make sense of some puzzles within global workspace theory and also give predictive processing a theory of consciousness. And then I published a brief paper building on Jakob's work, and that's where the term "predictive global neuronal workspace" comes from; it's a term I wanted to introduce to denote predictive processing approaches to a global workspace style architecture. And so when Ryan and I started chatting because of that paper, our goal was really to take some fairly vague conceptual ideas and put them into a more computationally precise framework, and that's what we did here. The idea was just to use a fairly straightforward deep temporal partially observable Markov decision process, solved via active inference, and show how, by tuning some of the parameters, you can get a very large proportion of the global workspace literature, and all of those empirical results, just falling straight out of the model. And what's really important to me, given my background in cognitive science and philosophy, is testability. I did my undergrad in cognitive science and my master's in a computational cognitive neuroscience lab, and what I've worked on for a while is really visual awareness and visual consciousness.
I was around a lot of experimentalists, and it's really important to me that all of these ideas are testable and that we actually specify testable predictions. And so really what we were doing in this paper was trying, as best we could, to first reproduce a bunch of existing results, but also to extend the model and show where it actually gives testable predictions. Some of those have actually been confirmed recently, which is nice. So I'll just flip through some things; there are a couple of things I want to highlight. The first is just to define terms. We called it an active inference model of visual consciousness. The way I think about it, at least, and maybe this is somewhat idiosyncratic, is that consciousness is just a catchall term. It isn't one thing; there are lots of sub-components within consciousness that I think can come apart. Block, very famously in 1995, proposed the term "access consciousness", which is just content that can be reported from the first-person perspective and is available to cognitive control processes and working memory. That's not a quote from him, it's a paraphrase, but that's roughly the idea. And so that allows room for there to be other processes, unconscious processes, say in the visual system, of which you aren't aware but which can nonetheless influence behavior. His chief aim in that paper is to distinguish between access consciousness and phenomenal consciousness, where phenomenal consciousness is the experiential aspect of this. And he thinks that they can come apart, and that's what's controversial. I don't personally agree that they can come apart, but I think it's a useful distinction nonetheless.
In terms of thought experiments, I don't remember where it is, it might be in that paper, it might be elsewhere, but the example he's talked about that was really intuitive to me is the jackhammer: if you're working at a desk and you can hear a jackhammer in the background, at some point you will become aware of the jackhammer, and on this view it was a part of your experience all along; you just weren't aware of it. To me that sounds rather odd, but that's fine, we can set that aside; we're just defining terms here. Yeah, I mean, obviously I agree. I think there are lots of cases, like inattentional blindness and change blindness, where you might think you were aware of something the whole time when actually you weren't. Yeah, exactly. I've forgotten who coined the term, but it's like the refrigerator light illusion: you open the refrigerator and the light's on; do you think it was always on? Yeah. It's a nice analogy. Okay, so just briefly about the global workspace. This idea started in the late 80s with Bernard Baars, and then Stanislas Dehaene's group really took it and ran with it. The idea is, roughly speaking, that there's this prefrontal-parietal network that connects motor processes, attentional processing, perceptual systems, and memory, and all of these systems are in some sense competing for access. They are otherwise relatively modular and disconnected, but the content that wins the competition gains access to the global workspace and is broadcast throughout the system. So it enables otherwise isolated subsystems to share information in a common format. And Dehaene has identified this with a prefrontal network.
And prima facie there's quite a bit of evidence for this. Across meta-analyses of report paradigms, where people actually report their experience, in binocular rivalry and visual masking, there is a network of frontal-parietal regions that is active for conscious but not unconscious stimuli. Hey, are you cut off? We can't hear you. Oh, really? I can actually still hear him. Oh, maybe it was just me then. Oh, good. No worries. Thanks for pointing it out, Ryan; continue along. Yeah, so the long and short of it is that across meta-analyses there actually does seem to be quite a bit of evidence for the view that there's this frontal-parietal network. The story gets more complicated when we get into no-report paradigms; maybe we can chat about that at the end or next session, because that's the work Ryan and I are really concentrating on at the moment. Okay. So I'll just talk briefly about these two and then move on to the model. Basically, across event-related potential studies, so this is a metacontrast masking study on the left, and on the right there's an attentional blink study. What you can see is this red line on the left: there is this late, seemingly nonlinear, all-or-nothing event-related potential, labeled P3B here, over frontal-central electrodes, that seems to characterize reportable versus non-reportable content. And I should just flag, for anyone watching who's actually in the neuroscience-of-consciousness literature: this is no longer considered a neural correlate of consciousness, and I agree with that. It seems to be driven by task demands.
But nonetheless, it is still a question why it is that when something is conscious and task-relevant you get a P3B, and that's what we tackle in the paper. And really, all of the evidence for global workspace theory can be summed up quite nicely with this 2006 taxonomy. The idea is that you can orthogonally manipulate stimulus strength and attention. For stimulus strength, you might imagine something like time of presentation. In a classic masking experiment you'll have, say, a Gabor patch, and just after it there'll be some static noise; the static noise is the mask to the Gabor patch. Depending on the distance in time between the mask and the stimulus, you can render something unreportable, and nonetheless it can still, in some circumstances, influence behavior. So stimulus strength might be contrast, it might be the distance to the mask, it could be both; it doesn't really matter for our very coarse purposes here. You can also vary attention: either attention is on the task, or attention is distracted away. The idea is that when there's a weak bottom-up signal and attention is absent, you basically get very little activity; whatever activity there is, it's confined to very early striate cortex. When attention is present but the signal is still weak, you get some priming, but the stimulus is still unavailable for report; so at that level it starts to influence behavior. Then at what they call the preconscious level, borrowing a term from Freud, this is effectively inattentional blindness: you have a very strong stimulus; the gorilla can be walking across the room and wave at you, it can be in the dead middle of the fovea.
The activity is still very strong within the visual system, but it's gated out of awareness via a lack of attention, at least most of the time. And then in this last quadrant, when you're attending to something and there's a strong signal, you get this widespread bifurcation in neural activity that characterizes aware versus unaware states. Now, I personally had a couple of reasons to be a bit suspicious about this, and I actually do think global workspace theory has been updated to accommodate them, but nonetheless we'll go through them. First, unconscious information can still be decoded from frontal regions, so PFC activity is not synonymous with consciousness. You might have the idea that there is just a global workspace, loosely associated with the PFC, and entering it is sufficient for consciousness; this can't be right. They have a solution to this: they introduced a Bayesian model of conscious report where consciousness also requires a stimulus representation to be separable from a noise distribution. I really like this model; at the beginning of my master's thesis I spent a lot of time playing around with the code. The limitation, though, is that it's an ideal Bayesian observer: it doesn't really give any neural predictions. And that's a fine thing for a model to be, but it would still be nice if we had something a little more neurally detailed. Christopher, could you actually just unpack slide 13? What are the disks representing, and how does this relate to Bayesian statistics? Oh, yeah, sure. So basically each of these axes is a source of evidence.
So there's evidence for stimulus Y, evidence for stimulus X, and then these two coloured lines, the pink line and the green line, correspond to different decision criteria that you impose on the space. The idea is that two-alternative forced-choice tasks, where you're forced to choose between two things, versus reporting something as present or absent, actually employ different decision criteria. You can imagine each of these circles as a Gaussian distribution: if you are in the blue, that's a stimulus to the left; if you're all the way to the right on the X axis, that's a stimulus on the right; and then this green thing close around the origin is a noise distribution, which is just saying there's nothing there. What awareness is, on this account, is basically that these distributions, either left or right, are far enough away from the origin that they don't overlap substantially with the noise distribution; so when you impose the decision criteria on the space, you are confident that you have seen something. Does that make sense? Nice, thank you. Cool, okay. And then the forced-choice criterion ignores the noise distribution; it just separates left versus right, whereas seen versus unseen compares against the noise distribution. So it's a really cool model, I really like it, but unfortunately it isn't as neurally detailed as I would like. The second problem is that the P3B does not equal awareness. These are two paradigms that have now been replicated in no-report conditions. There are really sophisticated experimental techniques that allow you to make inferences about what a person saw without them having to make explicit reports. And when you do that, you see that the P3B, which was at one point taken to be a signature of awareness, just vanishes. And that's a bit of a problem; I would like an explanation of that.
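The decision-criteria picture Christopher describes can be sketched in a few lines of Python. This is a toy illustration, not the model from the paper or from Dehaene's actual code: the Gaussian means, the unit covariance, and the function names are all assumptions chosen to make the point. It shows how a weak stimulus can be discriminated (left vs. right, ignoring the noise distribution) while still being reported as unseen (best stimulus hypothesis vs. noise):

```python
import numpy as np

# Means of the three hypothesized evidence distributions (assumed values):
# a noise distribution near the origin, and one stimulus class along each axis.
mu_noise = np.array([0.0, 0.0])
mu_left = np.array([3.0, 0.0])
mu_right = np.array([0.0, 3.0])

def log_likelihood(e, mu):
    """Log density of evidence e under an isotropic unit-variance Gaussian at mu
    (additive constants dropped, since only comparisons matter here)."""
    return -0.5 * np.sum((e - mu) ** 2)

def discriminate(e):
    """2AFC criterion: ignore the noise distribution, just separate left vs right."""
    return "left" if log_likelihood(e, mu_left) > log_likelihood(e, mu_right) else "right"

def detect(e):
    """Detection criterion: compare the best stimulus hypothesis against noise."""
    best = max(log_likelihood(e, mu_left), log_likelihood(e, mu_right))
    return "seen" if best > log_likelihood(e, mu_noise) else "unseen"

# A weak stimulus can be discriminated above chance yet reported as unseen:
e = np.array([1.2, 0.2])  # weak evidence for "left"
print(discriminate(e), detect(e))  # left unseen
```

With stronger evidence, say `e = [4.0, 0.2]`, the stimulus distribution dominates the noise distribution and `detect` returns `"seen"`, which is the sense in which awareness here is separation from the noise distribution.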
And then the last one isn't really a problem so much as a point of absence. There's a really large body of evidence showing that expectation seems to be a pretty crucial factor in determining the content of visual consciousness. Across paradigms, continuous flash suppression, binocular rivalry, motion-induced blindness, and so on, there are all of these results showing that expectations, probably along lines similar to attention, should be a part of any taxonomy that claims to be a complete account of the factors underlying conscious access. And the global neuronal workspace as it stands doesn't really describe how expectations are implemented. So what we saw ourselves as doing in this paper was, one, to show how we could account for all of this existing work, namely the taxonomy; two, to explain how the P3B doesn't equal awareness, or rather to show how that dissociation arises very naturally; and three, to show how expectations could plausibly be implemented, and to give a model that generates predictions about expectations and has a story about how they're implemented. Okay. And I think the real point is that it's okay to revise a model; revising models is just good science. But the part of global workspace theory that made me a little bit suspicious lately is that the revisions in response to problems one and two didn't generate surplus predictions. They abandoned the P3B as a signature of conscious access and they endorsed an ideal Bayesian observer model of subjective reports, but both of those were responses to problems that didn't themselves generate new predictions going forward. And that's a characteristic of what Lakatos called degenerating research programs.
And so, I mean, we can argue about whether this is a fair characterization of global workspace theory, but I think it's at least a nice way of setting up where we were coming from. Okay. Would you like to give a primer on active inference and partially observable Markov decision processes? I'm not sure how much more you want me to talk; this is an hour's worth of slides, so I could talk forever. Maybe do you want me to skip to the results, or just summarize the model, or what? Let's actually hear your take here; this is just a really awesome presentation. Everybody here, maybe write down a question or two, so we can have however many minutes you want of the slides, about an hour or a little more for questions, and then we'll have next week for questions too. But it's really great to get this information out there, because the way you're sharing it is awesome. Yeah, cool. Okay, well, I'll try to be a little bit more detailed as I go through the slides, because I've just been clicking through things so far. Okay. So, just briefly, and this is a bit of a point of confusion, and Ryan may want to weigh in on this as well: what is active inference? Well, it's both a noun and a verb, and as a noun you can use it to describe two things. It's first a mathematical framework that allows one to formulate models of sentient behavior using equations that are justified a priori from a very general set of assumptions about the nature of self-organizing systems. You can also attach a process theory to that mathematical framework that allows you to make predictions about various neurophysiological variables, like the firing rates of neuronal populations and ERPs, and about how neuromodulators like dopamine or noradrenaline affect things like firing rates, ERPs, and behavior.
It's also used as a verb, and especially in the pre-2016 literature that's basically what it was: an extension of predictive coding to motor control. So people often use "active inference" just as a catchall for the predictive coding version of motor control. And there's been all sorts of confusion in the literature, because active inference is also a theory of decision making built on Markov decision processes, which are a model of Bayes-optimal decision making, and if that's conflated with a theory of motor control, or you use a motor-control account of decision making, you're going to get bizarre claims that don't actually track what's going on. Yeah, I mean, that's the main point, if anything, that I would want to fully emphasize. People still treat all active inference as a theory of motor control. But active inference in that sense just means holding a prior constant and letting a reflex arc move the body to minimize the prediction error with respect to that prior, say about proprioception. That's just a theory about how, once you've decided what to do, you get the body to move accordingly, using something like predictive coding; it doesn't have anything to do with deciding what to do in the first place. The motor control part is just: how do I get the body to move how I want it to? Whereas what we're talking about here, current active inference, is a theory of decision making, which is totally different. I really think there should be two different terms for these things now, because that conflation is a real problem as far as I'm concerned. Yeah, exactly. Thanks. And Steven, do you have a quick question?
Yeah, I'd just be curious: you're doing work in this paper on the neuronal context of how perceptual choices are made, and it taps into phenomenological questions as well. Do you think these problems happen a lot because people are trying to move between fields of practice which are usually thought about separately? Phenomenology is often talked about in the enactivist field, and you've got the neuroscience, and they're often kept clean and separate; but when you need to bring them together and try to talk about the bigger picture, the distinctions you're making become even more important, if that makes sense, because there seems to be a lot of conflation that happens there. So I'd just be interested in your thoughts. So, one thing to say is that I like enactivism, especially what's been called computational enactivism; there was a really awesome paper on it in Synthese, whose author's name I've forgotten, sorry about that. But to me that stuff doesn't really have much to do with what I would call consciousness, or what I think of as consciousness as a cognitive scientist. When I think of awareness, my explanandum, the thing that I'm trying to explain, is very specific: there are states of which you are unaware, some of which you can become aware, as in the inattentional blindness case or in visual masking. In some cases there is a visual stimulus of which you're unaware, and then there's a very small physical change to the stimulus, but this radical nonlinear change in whether you can report it or not. I'm interested in that contrast between reportable stimuli and non-reportable stimuli.
Now, along with that, when stimuli are reportable, we might also be interested in characterizing the content of experience as it were, not just the availability for report. We're going to get to that towards the end of the presentation; we actually do endorse a version of the phenomenal/access consciousness distinction. But really, to repeat myself somewhat and answer the question: all that I'm trying to explain with this model is that distinction between content that's accessible and content that's not accessible. Everything else, I don't really mind; people can call it whatever they like, I'm just not interested in modeling it. Yeah, so in general, this speaks to having precise theoretical constructs and a good taxonomy of these things within a particular area. Also, if you're going to try to do any kind of interdisciplinary work, where you have constructs in one field and you want to find some way to map them onto, or translate them into, what you think they ought to be in some other theoretical language, then specifying that in a precise way, and how you recover one from the other in terms of how they're operationalized empirically, is also something you have to do to do this work well. As Chris already started to describe, people are really loose; not in all cases, obviously, some people are very precise, but in some cases the word "consciousness" gets thrown around to mean a ton of different things, which makes it confusing, and makes the term imprecise and therefore not really all that useful. For instance, there's the distinction we're talking about, whether a particular represented content in the brain is something people report experiencing versus report not experiencing; but then some people also talk about levels of
consciousness, which is a totally different thing: being in a coma, versus being in a vegetative or minimally conscious state, versus being awake but really sleepy, versus being really alert, that kind of thing. That's a totally separate thing, though you can think of one as a precondition in some sense for the other: your brain has to be in a certain general state, a level of consciousness corresponding to being awake or dreaming, to be able to have the experience of one particular content over another. So being awake is a precondition for studying the thing that we're interested in, which is why you become conscious of one thing versus another given very similar stimulus conditions. Exactly. So there are different distinctions. Like we said, there's this distinction, or potential distinction, between phenomenal consciousness, just what it's like to be experiencing something, versus this access consciousness thing, which is roughly being aware of the thing you're experiencing, and, like I said, several people think these can come apart. I also am very skeptical about that, e.g.
the fridge light example that Chris mentioned earlier. Like I said, at the end of this paper we in some sense describe how our model can identify something like two different types of access, where one type of access is what's minimally necessary for empirically verifiable phenomenology, whereas the other is more what determines the actual reporting. So there is something like a distinction in our model, but it's not the standard phenomenal versus access consciousness distinction. But anyway, my point is that there are all these different sorts of things. You mentioned the phenomenological literature, where people just describe the way phenomenology is, or the dynamics of phenomenology; that's a kind of completely different, descriptive field, and it doesn't have to do with this distinction about what becomes experienced versus not. So, and this is already a long-winded answer, I just think very precise terms, and very precise mappings between terms in different fields, are necessary to do this kind of thing well at all, and that's not always the case. Yeah, just to briefly follow up on that: really, what I'm interested in is what Bernard Baars called, I believe, contrastive analysis, basically a minimal contrast approach, where you have some content of which you're aware and some content of which you're unaware, and up to some limit the physical properties of the stimulus are almost identical; there might just be 10 milliseconds extra between the mask and the stimulus, and that's the only thing that renders it unreportable. The idea is that we can then contrast the neuronal consequences of being aware of a stimulus versus being unaware, independent, hopefully, of the physical properties of the stimulus. And so, just to give an example, and just to make sure that I'm being
charitable: there was recently a special issue, I'm fairly sure of the Journal of Consciousness Studies, on consciousness and plants. One thing I'm not entirely convinced of is that we're really talking about the same thing. I think that work is great, and smart people should study all of these things, but I'm not convinced, or it will take a lot of convincing, that something like phenomenal consciousness, whatever you want to call it, in a tree or a plant really has anything at all to do with this minimal contrast approach in the neuroscience of consciousness. Yeah, one other thing, again just things popping up in my head here; I think we have a fair amount of time, so I don't think it's problematic if I ramble a bit... and I've totally blanked on what I was going to say. Well, we'll come back to that; we can move forward. Okay, thanks Ryan. Also, maybe you could talk a little closer into your microphone; it gets a little quiet sometimes. Okay, thank you. All right, continue; I'm really curious to see how active inference is going to play into this. Sorry, sorry, I just remembered what I was going to say. I think another thing that sometimes gets conflated is representing things versus being aware of them. For example, in the active inference literature there are papers that describe things like Markovian monism, which is something that I like, as in some of Karl's recent papers, where more or less it's a story about how, via the free energy principle, you can come to have systems where the internal states of the system come to parameterize or track in some way the states of the world outside the system. So in a really broad sense you have some way in which the internal
states are representing or keeping track of the external states, a kind of representation, or the seeds of internal representations of what's going on out in the world. But the idea with this sort of literature is that the brain can represent a bunch of stuff at the same time, and only some of what's being represented becomes conscious at any given moment. So, in some studies for instance, I could flash the word "guilt" for you really fast and you won't report ever having seen the word guilt, but you'll actually behave in ways that are more consistent with feeling guilty. Or, in simpler perceptual or semantic priming examples, you might flash something really quickly; again the person doesn't say they're aware of having seen anything, but they'll complete word stems in one particular way versus another. If I flash an elephant for you really fast, you don't report seeing it, and then I give you a word stem like "EL" and ask you to complete the word, people are more likely to say "elephant" than they would be if you didn't flash it. So behavior is affected by something that was clearly caught by the brain; but despite its being represented, you didn't become aware of it. I just want to make that distinction as well: being conscious of something is distinct from whether the brain actually, in some sense, knew it was there. Yeah. Okay, so, the idea: we used a partially observable Markov decision process as our model, a hierarchical one I should say, but this is just a single-level Markov decision process. Very briefly on POMDPs, and I'm just going to call them MDPs for brevity: they describe transitions among hidden, unobservable variables and the sensory data they generate. So you can have a hidden state of the world, and the observations generated by that hidden state,
and the goal of active inference is to infer the states, and the action sequences or policies pi, from the set of observations. And so we'll just step through very basically how this would work. You could have a very simple case right here — this is just a graphical representation of Bayesian inference — where we have a hidden state, we have a prior over that state, and we have some likelihood mapping between that hidden state and an observation. (I think I'm getting feedback from someone's computer. Okay, try again. I think it's yours, Ryan. Let me see if I can continue. Christopher, thanks. Okay, thanks, that's better. Yeah, it's funny how that messes with you — we've got predictive models of when our speech timing should be. At least we know that you're awake.) Okay, so basically, if you then invert this model, this is just exact Bayesian inference. Now, if we start moving through time, we have to add in transition probabilities. So imagine this first level of the model is just a Markov chain: you have some initial state, you transition into your first state, and then you have state transitions that govern how those states change over time. But you don't ever get to see this Markov chain; what you get to see is a set of observations, and the idea is that these observations are generated by those hidden states. So the task of inverting these models is to infer the most likely hidden state conditioned on some stimulus. This little sigma here denotes a softmax function, and basically, using message passing algorithms, you can derive very simple update rules that allow you to come up with posterior probabilities over states. And then actions, under this framework, are just state transitions that the agent has control over. So you can imagine, in, say, a psychophysics
experiment, the actions — this is a super limited action space for the agent — basically they can hit left or right, and they might be able to move their eyes. Those would be the policies that they can control; everything else — the state of the screen — those are hidden states, but transitions between those hidden states aren't things the agent can control. Okay, so then — how do we select actions? Well, we select actions according to the expected free energy functional. This has two components, and you select the action — or the policy, rather — that best minimizes expected free energy. The first term is the expected cost: this is the KL divergence between our predicted outcomes under some policy — this q here, the outcomes I expect given that I take a certain action — and our preferred observations. These are things like preferences for winning versus losing, where you can encode that the agent wants to win, or not to lose, or something like that. The second term is the exploratory component, the expected ambiguity, which is basically the expected entropy of the likelihood mapping. The idea here is that you seek out actions — or seek out states of the world — that have a precise mapping to sensory outcomes. So, for example, if you're in a hotel room that you've never been in before and it's really dark, the best action to take to minimize your uncertainty is just to turn the light on, because that will give you a really precise mapping between the hidden states of the world out there and the observations they generate. So the idea is that expected free energy is minimized when both of these terms are minimized. It's actually a little bit more complicated than this, I should say: policy selection isn't just the minimization of expected free energy; it also
involves minimization of variational free energy. But that would take us too far into the modeling weeds; for our purposes in this conversation we can just say that policies are selected that best minimize expected free energy and variational free energy — where variational free energy is, basically, what you minimize to infer the optimal posterior. Okay, so — I haven't really talked about the brain so far. One of the beautiful things about active inference is that it comes with a process theory. So belief updates — changes to your posterior over states — correspond to changes in activation levels in response to input, and the matrix entries here — D, B, H, C, and so on (H actually isn't really a matrix; what is H? Entropy) — correspond to synaptic connection strengths. This is a very standard assumption across the neural network literature, and in neuroscience in general, where changes in the weights of synaptic connections correspond to some kind of function approximation. Okay, and so then you get these somewhat intimidating-looking equations; we can just select the ones that are really relevant. This first one here is the state prediction error, and minimizing state prediction error in perception corresponds essentially to the inference of the posterior probability q of s. You can cast this as a gradient descent on variational free energy, and that's actually where we get our ERPs from. Christopher, could you go into a little detail on what "variational" means, independently or as related to the free energy here?
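The machinery just described — expected free energy scoring policies, and a gradient descent on variational free energy doing perception, with the depolarization variable generating simulated ERPs and firing rates — can be sketched in a few lines. This is a toy illustration in the same two-state world as before; the matrices, preferences `C`, policies, and step size are all assumed for the example, not taken from the paper's implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

eps = 1e-16
A = np.array([[0.9, 0.2],                 # likelihood p(o|s); columns sum to 1
              [0.1, 0.8]])
D = np.array([0.5, 0.5])                  # prior over hidden states
C = softmax(np.array([0.0, 3.0]))         # preferred outcomes (prefers outcome 1)

# --- Expected free energy: risk (KL to preferences) plus expected ambiguity ---
def G(q_s):
    q_o = A @ q_s                                        # predicted outcomes under the policy
    risk = np.sum(q_o * (np.log(q_o + eps) - np.log(C + eps)))
    H = -np.sum(A * np.log(A + eps), axis=0)             # entropy of p(o|s) for each state
    return risk + H @ q_s

policies = [np.array([0.9, 0.1]),         # policy 0: expected to end in "lines"
            np.array([0.1, 0.9])]         # policy 1: expected to end in "square"
p_pi = softmax(-np.array([G(q) for q in policies]))      # policy posterior

# --- Perception: gradient descent on variational free energy ---
o = 1                                     # observe a "square-like" outcome
v = np.log(D + eps)                       # depolarization = unnormalized log posterior
erp = []
for _ in range(16):
    v_dot = np.log(D + eps) + np.log(A[o, :] + eps) - v  # state prediction error
    erp.append(v_dot)                     # simulated ERP = rate of change of beliefs
    v = v + 0.25 * v_dot                  # nudge beliefs down the free energy gradient
s = softmax(v)                            # firing rate = softmax of depolarization
```

Here the policy expected to realize the preferred outcome gets most of the probability mass, and the firing rates converge to the exact posterior (about [0.11, 0.89]), with the large early prediction-error transients in the loop playing the role of the evoked response.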
Yeah, sure. So "variational" free energy is just to distinguish it from something like thermodynamic free energy — there are formal relationships between these things; I would just say read Carl Friston's work if you want to know about that. But basically, variational free energy scores the KL divergence between your approximate posterior distribution and your generative model, and what you're trying to do is change the approximate posterior distribution slowly but surely — change it bit by bit, nudging it in the direction of steepest descent — until this error term, which is basically the log difference between your generative model and your approximate posterior, is at a minimum. And the rate at which that changes — this little v-dot here — is your ERP. The idea — and I'm going to go into this a little more in a moment — is that you have the log probability of the posterior, which is essentially a non-normalized log probability and can be positive or negative, like a voltage; then you have your ERPs, which are just the change in membrane potential over neuronal populations, and that corresponds, roughly speaking, to these big potentials we see at the scalp; and then you have a normalized firing rate, where we normalize the depolarization variable by passing it through a softmax function. The softmax function is just a generalization of the logistic function to more than two classes, and what's nice about using a generalization of a sigmoid function is that anyone who's looked at rate models of neural populations — what's sometimes called mean field models — has seen this firing-rate-versus-input curve, where on the x-axis you have the amount of input you're driving into the population and on the y-axis you have the firing rate of the population. It's a logistic function, so it's going to be
bounded, roughly speaking, between zero and one. So what you end up with is a sigmoid-style curve that we treat as a probability distribution, but which can also be interpreted in this lovely, biophysically — well, I don't know about plausible, but biophysically sensible, maybe — way as a normalized firing rate. And that lets us translate between these normative models, thinking in terms of probability distributions, and what the brain is actually doing. Is everyone happy with that? Yes — Stephen, do you have a quick question? Just one quick question: is the expected free energy on policies more important at level two, while level one can kind of softmax itself? That's a bit simplifying, but does expected free energy go all the way down, or is it a bit more higher up? Depends how you set up the model. You can generally have models with policy selection at both levels, at just the lower level, or at just the higher level — it depends what you're trying to model. Generally speaking, in the model we're going to be using for the moment, we don't have policy selection at the first level; we have policy selection at the second level. But you might — for example, there's work by Thomas Parr, and Ryan and I have been working on something similar as part of my thesis — where basically the idea is that at the higher, working-memory-ish, prefrontal level you have goal-directed policy selection, and at the first level you just have epistemically guided selection. So the policies being selected at the first level might be something like where in the visual field to point your attention, or where to move your eyes, and that's determined by the salience of various items — which in this framework corresponds essentially to the information gain afforded by the precision of the first-level A matrix. So hopefully that answers your question. Yeah, thanks. So, as a bit of a
generalization of that — I mean, really, and Chris will go into this, but the way this is modeled is effectively that each lower-level trial is just a full trial, as if you were modeling it separately; the higher level is just putting a prior on whatever the starting prior is for each lower-level trial. So in a sense you could think about the trials at the lower level and the higher level as being modeled independently: each one can have a set of policies, each can have anything that a single-level model could have. It's just that the higher level is putting priors of some kind on the lower level, and then posteriors at the lower level feed back up as the observations for the higher level. So just think about them as separate trials on different time scales, connected by their priors and posteriors. Yeah, exactly. And so what we did here was — we wanted to implement something global-neuronal-workspace-like in a deep active inference architecture. I have mixed feelings about the term "the predictive global neuronal workspace", because the few people who have been kind enough to cite this work have lumped us in as just another version of the global neuronal workspace, and I don't think that's accurate. We took the global neuronal workspace — and all of the phenomena it encompasses — as something that we wanted to explain. Both Ryan and I really like the global neuronal workspace; Stanislas Dehaene and Bernard Baars are huge heroes of mine. And aside from my personal love of the theory, in 2018 a plurality of the neuroscientists and philosophers who work on consciousness — about 35% or something like that — said that they regarded it as the most promising neuroscientific theory of consciousness, and so
we thought, okay, if we're going to go into this literature, this is what we should be modeling — the phenomena encompassed by this theory should be the starting point. But with that said, our model is an active inference model through and through, and it really does stand on its own, independently of the global neuronal workspace. I just wanted to flag that; I don't use those words in the draft of the paper I'm writing at the moment following up on this one, and that's very deliberate. But anyway, sorry — that was a very esoteric aside. Anyway, the idea here is that we implement something global-neuronal-workspace-like in a deep temporal architecture, and this second level we identify with the workspace. So, as Ryan was just talking about, the second-level states are beliefs about how trials at the first level evolve over time, and at each time step of the lower level they provide a prior over the hidden state; then, after posterior inference at the first level has happened, it shoots back up to the second level and acts as an observation for the second level. The idea here is that the functions we typically associate with conscious processing take place at the level of the hierarchy that is deep — temporally deep enough — to abstract away from processes entrained by the moment-to-moment sensory flux, and this allows the agent to construct a report of its experience, or to do other quite abstract cognitive tasks. And so the idea is that ignition occurs when a first-level state is inferred by the second level — ignition being basically what Dehaene has called this non-linear bifurcation in the neuronal activity of the fronto-parietal network. So the idea is that, for our model, ignition occurs when the first-level state is inferred by the second
level with a high enough posterior probability to influence policy selection at the second level. So, okay, on to the task. This is a very general task structure, from a really awesome empirical study done by Michael Pitts and colleagues in 2014. The idea was: at the first time step there's a forward mask; at the second time step there's a target stimulus — these little lines in the center can rearrange themselves into a square; then there's a backwards mask at the third time step; and the agent is then asked to construct a report of its experience — whether it saw the square or not. And, roughly speaking, these little colored circles on the outside are how we — and how Michael Pitts — manipulated attention: they either had people attend to the inside and perform a task — basically, click a button whenever the square changed — or click a button whenever the circles changed. So they manipulated what was task-relevant, and I'll get into how they manipulated awareness in a moment. Roughly speaking, the way we implemented the model — this is a Bayes net representation of it — and if we just click through it: at time one there is a forward mask presented, and this corresponds to a trial with black circles and a square. So these are the full beliefs at the second level — beliefs about the full sequence over the full trial. At step one there is also a report hidden state factor here; it's in white because we haven't asked it to construct a report yet. Then, at the first level, each of these second-level states implies a first-level state, and each of these first-level states implies an observation. And attention, in our model, was basically hard-coded in — we didn't have to hard-code it; you can get these things to emerge, it's just a little bit easier in our case to do
it this way. We basically hard-coded it so that attention corresponded to modulations of the precision of the mapping between the first-level hidden states — whether there was a square or lines out there in the world — and the observations. So, okay: first time point, forward mask; we see the sequence type is going to be a square, there's the trial phase, one, report. Okay, next one: the trial has moved on, and we now see the target has rearranged itself into a square. We presented the target stimulus for two time steps, and the reason we did that is to allow recurrent feedback processes — from the first level to the second level and back onto the first level — to recognize what's going on; that was our way of modeling this ignition-style process. Then, fourth time step, we take away the target and replace the little square — it's just a mask again. And then from five to eight we allow the model to construct a report of its experience. So basically this "seen" hidden state here implies a sequence of language-processing states at the first level: "I see a square" — it just says "I see a square". And what would it say if it was "unseen"? "I didn't see anything." I mean, to be fair, just to be clear: we could have — and the model would be exactly the same in all aspects — we could just chop off this language-processing part of the model, the model would behave exactly the same way, and we could still generate reports and all of that. What we wanted to do was really just use this as an intuition pump for what having this deep temporal representation allows models to do. Holding something at that level allows sequences — a cascade of hidden states — to evolve at lower levels. That might be something like constructing a sentence with a goal in mind, or it might be something like, I don't know, making a cup of coffee, where you have to hold multiple tasks in mind and
break things down into subcomponents, and so on. As opposed to mapping to a button that says yes or no? If it's just a two-button option, then it's a one-time-step paradigm. Yeah, exactly — I mean, we could do that; it would be completely identical. This is just a nice way of showing what these models can do, really. Yeah — I mean, this was partially to make a theoretical point. We're showing a two-level model here, but the brain probably has — I don't know how many levels, but a very large number of levels, not just two. So you could level the concern, or question, at us: why is it that this second level is the one that corresponds to conscious access, as opposed to any other level in the brain, since there's a big hierarchy in the brain? So the idea here is that we're appealing to a certain level of deep temporal structure. You can flash a stimulus at somebody and they can see it over a very short time scale — like half a second, something really fast. But the sorts of cognitive processes that can integrate that together and generate a set of thoughts — that correspond to being aware of it, to constructing a sentence, or any of the things that Chris mentioned — those are much deeper, slower, integrative cognitive processes. So the idea is that there is a level in the brain that is integrative enough, and that represents things over a deep enough temporal scale, that it has the minimal resources to integrate things together, hold them in mind, and generate the sequences of words necessary to communicate that you saw something or not. So it's that abstract, deep temporal structure that is the necessary condition for what you need to get empirically confirmable awareness. Yeah. Some
brief modeling notes about what we did. To model attention and stimulus strength, we altered the precision of the first-level A matrix mapping by passing it through a softmax function twice, with a different temperature for each iteration of the softmax — when you pass something through a softmax function it has a temperature, and the temperature decides how precise the result is. So the idea was that we then had an ordering of precision, where the most precise was attention present, stimulus strength high; the second was attention absent, stimulus strength high; and you can imagine how that goes on. This sounds pretty abstract and a bit cooked up, and that's true to an extent, but I think it does actually correspond to something like divisive normalization in vision, where you have some neuronal tuning curve, and how precise that tuning curve is gets modulated both by the contrast of the stimulus and by an attentional process the stimulus gets fed into, and together those determine the tuning curve. That was a very rough-and-ready explanation of tuning curves, but whatever. And then, for the threshold for report, we basically just set up the model to have a preference for receiving correct versus incorrect feedback at the final time step, and forced-choice behavior was modeled by simply decreasing the preference against being incorrect, to encourage guessing. I think this was just our first pass at doing report — I don't think this is how the brain does report; in our follow-up model we've actually got a much nicer way of doing this — but it worked, in the sense that it was able to quite accurately capture a number of empirical findings, so that's kind of
good enough for me at this first model. Okay, so that was a lot of preamble, but we can get into the results. Foundational simulation: a very high-precision stimulus was presented to the model on either a square-present or a square-absent trial, just to test it. And great — when there was a square present, the model reported seeing it a hundred percent of the time; you see this nice bump at the second level, and at the first level the firing rates for "square" increase when we present it, then go down. And Chris, let's just make sure people understand how to interpret these plots. So, just so people understand: time in these plots goes from left to right, and in the grayscale, black corresponds to a hundred percent probability, white corresponds to zero probability, and gray is something like 50/50. These are cast in terms of firing rates here, so darker implies a higher firing rate. The top-left plot is the second-level firing rates. The model first sees the black circles on the outside, so it's confident it's going to be either the sequence that has black circles and a black square, or — the second row — the other sequence, which would be black circles and just lines with no square. So basically, once it sees the black circles, it knows: okay, this could be a sequence where there's going to be a black square, or one where there's not. And you can tell what the stimulus was by looking at the first-level firing rates — the next plot down — where the bottom row corresponds to seeing the lines and the top one corresponds to seeing the square. So basically what happens here is it starts out just seeing the lines, and then at the second time step you can see it bumps up and the agent is really
confident that it's seeing a square, which at the second level corresponds to becoming fully confident that it's going to be the black-circle-and-square sequence. Then the stimulus goes away at time point four — which you can see at the first level — and it stays just lines for the rest of the trial. And then the bottom two plots show the actual outcomes, where again black is the agent's confidence in what it did, and the cyan dots correspond to the actual ground truth. So you can see that the agent waited for the first three time steps and then reported "seen", and kept that "seen" state while it generated the sequence of words at the following four time steps. And the right plot shows it stayed silent for the first four time steps and then the words it observed itself report — "I see a square" — which is just those dots moving down the diagonal to the right. So that's just so people know how to interpret the rest of the plots. Yeah, exactly. Another clarifying point would just be that the same simulation is leading to all of these behavioral outcomes and neural-correlate predictions and a bunch of other things, so it's like one integrated model that everything streams out of. Yeah, exactly. Okay, so we kind of talked through all of that. Okay, so, in terms of the taxonomy here, what we wanted to do was recreate the taxonomy we talked about earlier. This was modulating signal strength from high to low, and attention from present to absent, and then looking at the corresponding firing rates, the percent seen — report behavior — and the percent correct, which is forced-choice behavior. Okay, so a couple of things to highlight. Firing rate at the first level is enhanced most strongly in the contrast quadrant — that's nice; that's again what we see in
empirical findings. High firing rate for the square sequence at the second level even when reports are low — this is nice, because, as I talked about earlier, you can have activity in prefrontal regions that is, generally speaking, below the threshold for report. When I say below the threshold for report, I mean the frequency of reports is relatively low, but you can still have a higher firing rate for the representation, showing that there's a representation of the stimulus. And the last thing to highlight is that the highest firing rate for the square sequence at the second level is in the conscious quadrant, which we might think about in terms of ignition. Okay, so then ERPs. The real thing to highlight here is that we see a large P3-like event-related potential in the conscious quadrant — again, like a lot of the empirical findings in the report literature — whereas first-level ERPs seem to be fairly unmodulated by conscious access, which is again what we see empirically. So now — that was a fairly abstract taxonomy where we were just manipulating factors; what we wanted to do here was take a concrete empirical result and show how we could use literally exactly the same model to reproduce it. This is the really clever paradigm by Michael Pitts that I talked about before. The idea is that there are three phases to the experiment. In phase one, the participant is just told to attend to the dots on the outside of the screen and hit a button when they dim, and what the participant isn't told is that every now and again these lines in the middle rearrange themselves into a square pattern. They see something like 300 trials, and at the end of the first phase they are then given
a debrief, and they're asked: did you see a square pattern in the middle? And roughly speaking, about 50% did not. This is now a really well-replicated result across different varieties of inattentional blindness: roughly 50% of people don't report seeing it when it's not task-relevant. So that's nice. So then they take the 50% of people who didn't report seeing it — who are in fact now aware of it — and in phase two those people do exactly the same task: they're attending to the disks on the outside, so the square is still task-irrelevant, but they're now all aware of it. And at the debrief at the end of the second phase, everyone reports seeing it. So now they've got this contrast between aware versus unaware — phase one versus phase two — where the only thing that's different between them is whether you're aware of the stimulus; in neither phase is the stimulus in the middle task-relevant. And what you see is that there's no P3b — no late centro-parietal positive component — in the ERPs. Then the final phase is the aware, task-relevant phase: all participants are aware, and the square is now made task-relevant by having participants attend to the squares in the middle — and what we see is this big, very noticeable P3b. Okay, so this slide just summarizes what I said. Great. So basically what we did was use a different inverse temperature parameter to represent each phase. Phase one: a very low attention inverse temperature parameter — you can see the precision of the A matrix here. For phase two we bump up the inverse temperature parameter just very slightly; this corresponds, roughly speaking, to diffuse attention — you're not particularly attending to one particular feature of the stimulus, you're
attending just diffusely to the whole screen. And in the last phase we model task relevance by making the inverse temperature parameter very high — very precise — so that you basically get a likelihood that looks almost like an identity mapping. Okay, and so these are the ERPs and the firing rates that you get out of it, and what you see — this is my favorite slide in the whole paper — is basically results that look shockingly similar to the empirical results, especially for a model that is so incredibly simple, like ours is, relatively speaking, at least in terms of biophysical detail. So what you see is, in the first phase, basically no ERPs. We give the model feedback on every trial, so what you should see is that it's a little bit darker for the square until we give it feedback and tell it, no, you were wrong, we did actually present squares — and the model changes its mind; it was fairly sure it had seen lines. Then, in the second one, we bump up the attention parameter and now the model sees things — it reports seeing the stimulus. I should say: in phase one the model reported seeing the stimulus 49% of the time, basically identical to the empirical result of 50%; in phase two, 99% of the time, basically the same as the empirical report rate of 100%; and in phase three, 100% of the time. And, by the way, I should say how I generated these results: I basically ran a for loop over the code for each condition, where each iteration of the for loop generated 300 trials, and I then averaged over maybe 10 iterations of that for loop, each with a different random seed. And so what you should see with
the — sorry, that was a quick aside. The crucial thing is that at this last point, when there's a very precise stimulus mapping at the first level, there is an abrupt change in firing rates at the second level, and that abrupt change corresponds to the emergence of the P3b — remember, ERPs here are generated as the rate of change of posterior beliefs. So the model suddenly becomes very confident that it saw a square, and you see this big boost in the P3b, which corresponds to the very precise input mapping that comes along with a stimulus being task-relevant. And so there we've got this really nice explanation of this somewhat counterintuitive result about the dissociation of the P3b and awareness. Okay, and then the last thing we're going to touch on: we then decided to extend the taxonomy. So this is exactly the same, but we also now manipulated whether a prior was consistent with the stimulus, i.e.
a valid expectation. So we made the model, roughly speaking, two times more likely to expect a square than lines, or two times more likely to expect lines than a square, and then we presented it with squares; that's kind of the consistent and inconsistent prior. And then the key result is that, across all of this, expectations boost the effect of feedback on firing rates, which is nice. And in terms of ERPs, what we see is this very specific prediction: these are fairly small differences, but they are differences. Only when attention is present do you really see a P3, again in line with the empirical literature, but crucially, when there's a consistent prior, that is to say a valid expectation, you should see a reduced P3b in contrast to when there are flat priors and when there is an inconsistent prior. And this is actually something that has been confirmed recently; Michael Pitts was on that paper, and if anyone's interested I can post a link to it. That's really nice, we didn't plan for that to happen. It is slightly annoying, I would have liked to cite it, but that paper actually came out after ours was accepted for publication. Okay, and just a very quick summary of the results: with what is really a surprisingly simple model, we can reproduce a very wide range of canonical results from minimal contrast paradigms, including fMRI and ERP findings. This includes: one, we can account for unconscious PFC activation; two, we can account for the dissociation of the P3 and conscious access; three, the P3 roughly speaking reflects the velocity of conscious working memory updates, I kind of like that phrase; and four, we have an explicit and formally defined role for prior expectations and the
implementation in the process theory. And then we've got a number of key predictions that emerge from this. The P3 is attention-dependent, and its amplitude should be inversely proportional to expectation. Feedback from frontoparietal regions should be enhanced on conscious trials and should disinhibit granular layers in the relevant populations in sensory cortex; that's a very specific prediction that falls out of the process theory and can be tested with DCM. And then consistent expectations should disinhibit granular layers in sensory cortex, while inconsistent priors should inhibit them. Okay, so any questions at this point? Christopher, thanks for this awesome presentation. Do any of the panelists maybe want to queue up a question? How about you do this last slide, and we'll leave on that sort of phenomenological note if you want to; otherwise we can return to the panel. Yep, sounds good. So all the panel, prepare a question or a thought and raise your hand when you're ready, and Christopher will give a last thought here. Okay, so just a brief aside on phenomenal consciousness. I'll back up a little bit: there might be a puzzle, in that we associate global broadcast and the availability of information for report with the second temporal level, although our experience, while there is a depth to it in the sense that experience is kind of smeared over time, right, there is an integration window that's relevant for experience, certainly doesn't occur at the level of whole sequences. And so there's a puzzle here: where does experience live, as it were, where is the content of experience? Ryan and I have been chatting about this, and this is a view that's still evolving, but broadly speaking I think what we should say is that the content that we are conscious of is the first-level states that have a high
enough precision, either through bottom-up precision or because there's a really precise prior that comes down from the second-level A matrix, such that the message being passed up to the second level has a high enough precision to influence policy selection, or to pass the posterior belief threshold for conscious access and report. And this has a nice consequence: it keeps experience at the relevant level, so it's updated at each time step when we get new stimuli in, and yet it has a distinct link to report and behavior. Yeah, so just to expand on that a little bit, and again Chris partially said this, but just to reiterate: the idea is that what you consciously experience is a consequence of both the posterior over states at the first level and whatever the second-level likelihood is, right? So it's a function of both the posterior over S1 and the structure of A2. And a way to think about that is, if you were to rearrange A2 so the mapping was different and S1 had the same posterior, then that would update the beliefs at level 2 in a different way, right? So if I arrange A2 one way, even though S1 is still sensitive to the same stimulus, then the posterior at S1 is going to lead to awareness of one thing, whereas if A2 is set up a different way, then the second level will gain awareness of a different thing. So in some way the phenomenal content that you become aware of depends on the nature of this link, on the way that the messages are decoded at the second level, in a sense, right, on what the second level takes to be the meaning of the messages that get passed up from the first level. So this is kind of a nice thing, like Chris mentioned, where it's the updates that the first level gives to the second level at each time point, so phenomenology kind of
stays at the temporal scale of experience, but phenomenal content still depends on a certain type of access, right? It depends on the first level having a specific type of influence on the second level. Thanks a lot, it's really interesting. So let's return to the panel. Christopher, awesome presentation. The first question is going to be from Dave, and then anyone else can raise their hand, so Dave, go ahead. My question is about possible involvement of the ARAS, the ascending reticular activating system, either in your data or in the neurological data and other relevant studies you draw on. Is there any representation of possible contributions by the ARAS, by the midbrain, and is there any accommodation in your model of something, maybe not explicitly the ARAS, but data that might originate there, say in endogenous attention? So, subcortical structures, really interesting. And as for the reticular activating system, I'm curious, did you get that from this book by Bernard Baars, A Cognitive Theory of Consciousness, where he first introduced global workspace theory? He has a whole chapter devoted to the reticular activating system. Oh, that's great, I will definitely have to get that. No, actually I got this from the work that Jaak Panksepp and Mark Solms did, especially the 2013 paper on the conscious id. He is of course a very close collaborator with Karl Friston; he just brought out a full-length book on the conscious id, especially as active inference interacts with that. So my actual focus of research is on emotions, applying computational models to emotions, and I'll just say that I have very strong disagreements with the kind of theory proposed by Jaak and Mark. We actually published a kind of joint on-paper debate, where me and a colleague, Richard Lane, debated back and forth with Jaak and Mark about the right way to think about how consciousness works, and I'll just say that their view
places this really strong role on affect, where they more or less say that these midbrain structures, like the PAG, right, the periaqueductal gray, send these afferent signals up to kind of activate the rest of the brain, and that somehow permeates it with this kind of necessary emotion, and that's primary. Whereas the kind of structure we've defended is just that, no, the global-broadcasting-ish kind of thing applies to all types of information, right? So if you have a representation of your beliefs about your emotional state, that has to be attended to and sent into the workspace just like any other stimulus for you to become aware of it. And these sorts of midbrain structures, they do a lot of really interesting interoceptive and visceral modulation processes, and they do have this sort of general upward modulation role, where, my view is, they kind of keep the cortex in general in a state where it's capable of representing things, being sensitive to stimuli, holding different states active and other states not, so it essentially allows things to be represented in general and then to become consciously accessible. But with respect to the way that emotions feed in, and whether the midbrain structures actually contribute content in and of themselves, we just have theoretical disagreements about that.
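A quick illustration of the point Ryan made a few minutes earlier: the same first-level posterior can lead to awareness of different content depending on how the second-level likelihood (A2) decodes the ascending message. This is a toy numerical sketch, not code from the paper; the matrices and numbers are invented for illustration:

```python
import numpy as np

# First-level posterior over states, e.g. "square" vs "lines".
q_s1 = np.array([0.9, 0.1])

# Two hypothetical second-level likelihoods P(s1 | s2).
# Rows index first-level states, columns index second-level states.
A2_one_way = np.array([[0.95, 0.05],
                       [0.05, 0.95]])
A2_rearranged = A2_one_way[::-1]   # the same mapping with rows swapped

prior_s2 = np.array([0.5, 0.5])    # flat prior over second-level states

def level2_posterior(A2, q_s1, prior):
    # Expected evidence for each second-level state under the
    # first-level posterior, followed by a Bayesian update.
    evidence = q_s1 @ A2
    post = evidence * prior
    return post / post.sum()

# Identical first-level posterior, different content at level 2:
print(level2_posterior(A2_one_way, q_s1, prior_s2))     # favors state 0
print(level2_posterior(A2_rearranged, q_s1, prior_s2))  # favors state 1
```

The point the sketch makes is only the qualitative one from the discussion: what the second level "becomes aware of" is jointly fixed by the first-level posterior and by how A2 maps it upward.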
Can I also briefly jump in on that? The contrast, the reason I think that frontoparietal regions in cortex are so important, is because there is a huge amount of empirical literature now in neuroimaging, and also very recently data from invasive neurophysiology in monkeys, and most recently in mice, showing that when there is this contrast between conscious versus unconscious, these are the things that are differentially modulated. There are some very impressive findings very recently coming from Matthew Larkum's group showing basically that you can wake creatures up from general anesthesia by stimulating the mediodorsal nucleus of the thalamus, and there are these crucial reciprocal connections between layer 5 of the cortex and the mediodorsal thalamus, which basically go out under general anesthesia, and when you are awake, when whiskers are stimulated in barrel cortex and the mouse kind of reports, as in it signals its whiskers were stimulated, this circuit goes off, essentially. So I think the thalamus is crucial, but, kind of a la Dehaene and everyone else, I think the reticular activating system and all those subcortical structures are background conditions. Similar to what Ryan said, it puts the cortex in the kind of state where it could be aware of something, but the actual contrast between something being broadcast versus not broadcast, or being reported versus not reported, does not rely on those structures. And I would just want to see those authors engage with something like the visual awareness literature and show that those structures are differentially involved. Thanks for that. Steven, go for it, and then Dean, Blue, or actually then Scott, then Dean or Blue if you want, but Steven with a question, then Scott. Yes, so thanks, just a question, just to be clear: so effectively you tracked the degree of prediction confidence, and that
mapped against the empirical data, if I'm right in understanding. So as the model showed a difference in confidence, that was showing a correlation to how firing happens. And I'd just be curious to know if you think there's some key level where this firing can happen, where there's a shift between a more Markovian, ergodic process and something more semi-Markovian and less ergodic, so that an organism can make non-ergodic choices in conscious awareness. So, a couple of things. I'm not sure about the ergodicity comment; I'm not sure how semi-Markovian systems relate to ergodicity. I assume that you can have a semi-Markovian system that is ergodic, that wouldn't surprise me, but I just don't know, actually, just to flag that; I'm not particularly up on the physics that relates to how these models work. But when you move from a one-level model to a two-level model, they are semi-Markovian. Can you maybe define that a little more, what does that mean to you, what does that enable? Basically, the model is no longer purely dependent upon the previous state: at the first level of the model there is more information, determined at the second level, than just what goes on at each time step of the first level, that's basically it. Yeah, I think we have a sentence about this somewhere. Yes, what the temporally deep scale buys you is essentially what we were saying before: all of those cognitive actions that allow you to do things like construct reports, make plans, et cetera, and abstract away from moment-by-moment sensory flux, all of these things depend upon there being a second level, and that might be reflected in the brain. Yeah, so granular cortex, generally speaking: the cortex has
six layers, roughly speaking, and as you move up, or move towards the center of some centrifugal hierarchy, depending on who you talk to and what kind of anatomical maps you believe, layer four, the granular layer, which is essentially the input layer from the thalamus and the feedforward layer, gets smaller and smaller, until when you're at the center of this hierarchy, or this centrifuge, you essentially have no granular layer at all. This kind of ties in with Lisa Feldman Barrett's work, she's done a lot of stuff with this, and I have problems with that view, to be honest. They kind of say the top of the hierarchy is where they think consciousness lives, and I have a rather vehement disagreement, in that I just think there's absolutely no evidence that that's true. In terms of neuroimaging evidence, the areas that seem relevant are these frontoparietal regions, and they do have a somewhat well-defined layer four; it's not completely agranular, it's not granular, somewhere in between. Yeah, I don't know. One thing to say, I know Ryan has very specific views about this, and I should say that I really love Lisa Feldman Barrett's work, I just think she's wrong about this. I want to go with the empirical contrasts, right, and not speculate too much about very specific implementation details. Look, they go from a very loose model of predictive coding and then map it to these cellular details about the structure of neurons. I am more interested in going from these computational properties to basically what you would measure at the level of neuroimaging, because I think that's what is most tractable experimentally. And there's still a lot of debate about that neuroimaging level, right; if we hammer it out and decide what's going on at the neuroimaging level of analysis, then good, then we can start having more serious
discussions about particular involvement of particular types of cortex, but I just don't think we're there yet empirically. Yeah, I would just kind of echo that a little bit. It's important to realize that it's not as though the particular sets of neurons and synaptic connections in the kind of hypothetical columns that Chris showed earlier, in terms of the neural process theory, are the only option; that's just one example of a way you could set it up. There's a very, very large number of ways that you could connect up neurons in different structures that would implement basically the same algorithm. And in addition to that, there are different message passing algorithms that you can use to solve the same graphs, and active inference isn't defined by some particular message passing algorithm, right? Thomas Parr has shown this specifically in a paper where you can have different neural implementations that require more or less neural resources, where it's kind of a trade-off between efficiency and accuracy: really accurate message passing algorithms, like belief propagation, require more neurons, but they're a little more accurate, and the brain could use that, or the brain could use variational message passing, which is not quite as good an approximation but requires fewer resources. So the point is that there's not really that strong a commitment to that level of circuit detail in active inference; it's more just a theory where you can have some pattern of synaptic connections that implements the matrices, and we represent posterior beliefs over states with firing rate functions, and then you take the rate of change in those to compute the ERPs. So this is the issue: it doesn't really make sense to me, anyway, at this point, given
what's been said, to start with "hey, let's assume that this one very specific way of setting up neurons to do this is the right one", just assume that, and then figure out what must be true about different areas of cortex based on their cortical column layer structure. So it's definitely super interesting, and there probably is a really meaningful mapping between the differences between granular and agranular cortices with respect to computational function, but again, until we've really nailed down that there is a particular circuit implementation, it doesn't really make sense, in my opinion anyway, to infer the function from that direction. And I should say, just to echo that: if you work in neuroimaging, it doesn't make sense; if you work in a mouse model and you were literally testing hypotheses at the circuit level, then it would make sense as something to do. Right, but in that case you're literally testing the predictions of specific message passing algorithms. Yeah, exactly, so that's testing between message passing algorithms, that's not assuming a particular one and inferring from it. So anyway, for people who are interested, in a couple of previous papers I've built multi-level active inference models similar to this, but specifically about emotional awareness, which show how you can get from simple lack of awareness of single emotional states all the way up to being aware of feeling multiple emotions at the same time and being able to report them, and things like that, and which show how a structure not too different from this one can generalize to emotions, or really to any other sort of thing that
you could experience and be aware of, whether that's coming from inferring things about your own bodily state and its emotional meaning from interoception, or whether it's something like vision. So we haven't covered that stuff in any of these sessions before, but that stuff is also out there, and it might help show what the view on our end is about how active inference can account for emotional phenomena and be consistent with the current literature on emotional awareness as well. Thanks for these awesome answers. Scott with the question, and then I'll ask one from the chat, and then Dean or Blue or someone else. There we go. Thanks, great and fascinating presentation. I was wondering about this book, The Intelligent Movement Machine, by Graziano. Graziano, yeah. And it talks generally about the cortical mapping being predicted by the movement repertoire in the motor cortex, and kind of that embodiment kind of element.
One of the things I wanted to ask about, and the reason I think it follows from that most immediate prior part of the conversation, is when we have an active inference setting and that model is being employed for multiple perceptions, I won't use the phrasing right because my active inference terms aren't all up to snuff, but when the inputs into the individual who's performing the active inference, that system, are from various sensory inputs and at various temporal distances, so there are several inputs and then they're being synthesized into an action, an externality, something's going to reach out, and again my apologies for not internalizing all the definitions: what are some of the cortical mappings, or have you found cortical mappings, that embody that process of synthesizing multiple inputs into a single action? What are the structural correlates of the process of synthesis of multiple different strands of active inference into a single action? I may be using the wrong words, but this is essentially a question about multimodal integration. Yes, so it's an interesting question, and I think there are two parts to it. The first part, I'll just answer the active inference end of it: the way I think about active inference, at least, is just as a really useful modeling and mathematical framework, and I think it's possible to build multiple active inference models of any phenomenon, so I would want to hammer out an empirical paradigm and then figure out, okay, what models could I build of this. The second part is about multimodal integration, whether there are places that house multimodal integration. I haven't looked at multimodal integration since I was an undergrad, I'm just trying to remember my third-year cognitive neuroscience courses, but I think there are places in, like, posterior parietal cortex where there is integration
between multiple modalities, like, say, vision and one other thing, or whatever it is, and then that gets shot off to motor cortex and implemented in actions. So I would just say look at the motor planning literature; there's lots of stuff about Bayesian visuomotor integration. And one follow-up: one of the things that leads to is, I was wondering about the situated cognition opportunities for those syntheses. So for instance you have hysteresis, or, as my son used to say as a joke, stereotypes are a real time-saver, right? So the idea is you have externalities that facilitate the synthesis, so they're actually not cortical structures, they're social and linguistic and rhetorical structures, right? What I'm exploring is, could those be described with some of the same models as the cortical structures, so you're actually able to have a scale-independent description of the processes for the synthesis of multiple multimodal inputs, in both a social and a cortical, cognitive context? Hmm, so I'm not sure about the scale-independent part. Yeah, I hear what you're at, no, Scott, it's a good question, I see where you're coming from, because we've been talking about the scale-free formulations with Chris Fields and others, and then this whole nested Markov blankets idea: you could have the neuron and then go within a model, and, just like you said, your model doesn't presuppose that awareness is actually at a higher level. You're just saying there is a deep temporal structure, and that's the level, that's the timeline that you are at, but there could be another level, and then mechanistically or functionally how that level plays out is going to come back to a task-specific measurement, which is a plurality of models. So still good points. I mean, to go back to it, there are lots of nice opportunities to build active inference models, and it would be very simple
for multiple multimodal or cross-modal integration tasks, for instance, and I haven't seen this done, but maybe I have to look back, things like the double flash illusion or the rubber hand illusion or any of these kinds of multimodal things. The double flash illusion is really nice: for people who aren't familiar, you play either one or two tones, and they are coincident with a single flash of light, and it just turns out that when you play two tones really quickly, people perceive two flashes of light when there was really only one. And the reason for that is there's this kind of prior in the mix, presumably, that things that are coincident temporally have the same hidden cause, but also, at the same time, the temporal resolution of audition is just better, it is more trustworthy than the temporal resolution of vision. So you can just say, look, the system has beliefs that the auditory system has higher precision and a higher probability of common causes of coincident input, therefore there must have been two flashes, given that there were two tones. That's just one example of a really simple, well-known empirical task of cross-modal integration that you could probably model without a ton of tweaks to the kind of model structure that we're using here. All you'd really have to do, probably, is assign the right precisions to the auditory and visual input, where those are jointly generated by the same hidden cause or hidden state in the model, and so basically you're just inferring one cause or two causes, and that generates perceiving either one flash or two flashes. Anyway, it would be simple to do, and I'm not aware of that having been done in the active inference and predictive processing literature, but there are known neural correlates of this stuff, and they're in part where you'd expect them to be: at the borders of auditory and visual processing and
association cortex, like posterior association cortex, and it'd be a really cool project to try to build a model of something like that and then see whether the predicted neural time courses actually single out the association areas at the kind of borders of vision and audition. It seems like it's almost like a situated synesthesia, where you have the Gabor square, where you have the frequency and time variance, but one of the variances is being hijacked by another perception in the visual field, so it's crossing over to that other sense. Very interesting, thank you. I'm going to ask a question from the chat, and then if anyone wants to give a short final note, we'll end around the hour so that we can have a whole second discussion next week. The question from the chat is, to the authors: what would you say the main limitations of the model are, or what are some next things you're building on? So, there are lots, right, so I'll just list them; I think we list these in the paper somewhere too. For starters, it's a discrete model, and we treat the whole visual system as basically one discrete hidden state to be inferred by the high level. That's obviously not true, but it's a good enough approximation that you can reproduce a bunch of findings. It would be nice to have a multi-level model, and with that multi-level model I think there are some empirical cases, there's this thing called the partial awareness hypothesis by Sid Kouider, where they talk about the global workspace having partial access to different parts of the hierarchy, and that giving rise to kind of odd experiences. So you might have really good access to the colour of Ryan's shirt but not to the fact that it's a shirt, or vice versa. So that would be nice; I'm not sure quite how to do that in a
hidden Markov model situation, but that would be nice. The second thing is we really just model report paradigms here, so where we're going with this is we're trying to extend it explicitly to model no-report paradigms, and to actually think about that in a nice way. We've actually built another model and we're writing it up at the moment; it's kind of a side project, because it's not part of my PhD thesis, but the idea is, roughly speaking, to have a model that is able to explain a lot of the weird findings in the no-report literature, and we also now have a much more principled account of report that we're quite excited about. Then, I guess, one last limitation, and this isn't really a limitation because Ryan has got a lot of work in emotion and interoception on this, and I know it's something his lab is really pushing, but it would just be nice if we could use these very similar architectures and apply them to something like interoceptive awareness. I would find it personally very compelling if you had, I think it's too demanding to have one model, but one very general model structure that generalizes across visual consciousness to interoceptive awareness to auditory awareness to somatosensory awareness; that would be really nice. Awesome answer. Ryan, do you want to add anything, and then we'll have any final comments? You know, I think Chris covered most of the limitations. I would just say, yes, it would definitely be nice to have, so there are what are called mixed models, where you have a discrete model like ours on top, but then the posteriors send predictions downward that essentially set set-points for a continuous level below it, so then you can have a continuous state space, where, for instance, it's a lot more plausible
right, to think that these sorts of inferences that are being made in kind of early visual processing are on a continuous state space: it doesn't have to be, like, motion speed one, motion speed two, right, it can be continuous values for things like that. Another thing, for instance interoception and emotion: we have empirical paradigms that we've been setting up to test just this kind of thing in interoception, and we have a paper under review right now where we applied a computational model to interoceptive perception in a gastrointestinal interoception task. That's not a multi-level model at the moment, and that paradigm probably doesn't have enough trials to do a contrast between conscious versus unconscious stimulation, but we're working on that, so that would be nice. Another thing: Chris mentioned the example of being aware that the shirt is red but not that it's a shirt, and that kind of thing is also really important for modeling emotional awareness, because a lot of times people might feel some kind of bodily sensations, you know, like a pit in their stomach or a fast heart rate or something like that; they might be conscious of that but not infer one step above and be conscious of the fact that that's associated, say, with feeling fear. Like, people with panic disorder, for example, might feel a strong heartbeat, but they might instead infer, oh, that means I'm having a panic attack, or they might infer, oh, I'm having a heart attack, right? So there are different kinds of conceptualizations that you can infer from patterns at lower levels of experience, some of which are emotional and some of which are not. And this is where it becomes interesting: you might also prime an emotion category, right, like I mentioned with the guilt
example, right: I could prime guilt so that the higher-level concept representation is activated but does not enter into awareness, right? So you can have multiple levels in a hierarchy where each level can independently compete for broadcasting. So I've been thinking about ways to do this model-wise, and the best thing we've come up with right now, just in terms of the practical implementations we have, would be, for instance, to have an interoceptive level, and then have the emotional level above that, but have the interoceptive representations also get kind of duplicated and passed up to the second level, so then they can also equally compete, right, for access to a third level that would be the workspace level. But anyway, these are all things that are in progress; just by way of stating this additional limitation: in our model you can only be conscious or unconscious of one type of processing, one level of abstraction, which is not the true case. Thanks a lot for these awesome answers, always a great time with the two of you on the stream. So I just want to thank all the participants who joined us live; in the calendar event there's a form, so it would be helpful if you want to add feedback to that form, and then we're going to be having 18.2 next week at the same time. It's going to be the same paper, so hopefully we'll all get another chance to reread the paper and listen to this stream, because Christopher, what you said was really deep. So again, much appreciated to all, and we'll see everyone next week. Thanks guys, that was really
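As a closing technical aside: the double flash illusion model that Christopher sketched in words earlier can be written out as precision-weighted cue combination. Everything below is a toy sketch under invented assumptions (the precision values, the softmax sharpening, and the two-state space are not from the paper):

```python
import numpy as np

# Toy model of the sound-induced double flash illusion: infer the number
# of events (one vs two) from an auditory cue (two tones) and a visual
# cue (one flash), with audition assigned higher temporal precision.
# All numbers are invented for illustration.

def sharpen(A, precision):
    # Column-wise softmax: higher precision pushes the likelihood
    # mapping towards a trustworthy identity mapping.
    Z = np.exp(precision * A)
    return Z / Z.sum(axis=0)

identity = np.eye(2)                  # rows: observation, columns: hidden cause
A_auditory = sharpen(identity, 4.0)   # high precision: audition is trusted
A_visual = sharpen(identity, 0.5)     # low temporal precision for vision

prior = np.array([0.5, 0.5])          # flat prior over one vs two events
heard_two_tones = 1                   # observation indices
saw_one_flash = 0

likelihood = A_auditory[heard_two_tones, :] * A_visual[saw_one_flash, :]
posterior = likelihood * prior
posterior /= posterior.sum()

# The precise auditory cue dominates: the model infers two events,
# i.e. it "perceives" an illusory second flash.
print(posterior)
```

This is the shape of the suggestion in the discussion: assign the right precisions to the auditory and visual inputs, let them share a hidden cause, and the illusory second flash falls out of the posterior.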