Okay, hello everyone, we are live. Welcome to ActInf Lab, livestream number 22.2. Nice repeating digits there. We're here with Alex Kiefer and others. We're gonna be having our second follow-up discussion on the paper we've been discussing for the last little bit, Psychophysical Identity and Free Energy. It's May 25th, 2021. And this will be a great discussion. So thanks again to everybody who's joining live as well as watching on replay. If you're watching live, especially feel welcome to add a comment in the live chat that we can address. Welcome to ActInf Lab, everyone. We're a participatory online lab that is communicating, learning, and practicing applied active inference. You can find us at the links that are listed here. This is a recorded and archived livestream, so please provide us feedback so that we all can improve our work. All backgrounds and perspectives are welcome here, and we'll be following good video etiquette for livestreams. We are here in the second of the two discussions in late May 2021, and happy to have Alex joining us for both of these discussions, which has been great. Today in 22.2, quick correction, we will be just trying to learn and discuss, enjoying these conversations, seeing where they go, enjoying what questions they bring up, and also carrying on with a few of the topics that we knew we wanted to cover from last week. And with that being said, we can jump right in. So we can go to the introductions and we'll just maybe go around, introduce ourselves, say hello, and then end with Alex. And Alex, we would be happy to hear your thoughts or reflections on this past week. So I'm Daniel. I'm a postdoctoral researcher in California, and I'll pass to Dean. Hi folks, I'm Dean. I'm from Calgary, and yeah, I'll pass it to Adam. I'm Adam. I'm a postdoctoral researcher at the Johns Hopkins Center for Psychedelic and Consciousness Research. And Steven. Oh, hello. I'm Steven. I'm in Toronto.
I'm a community development practitioner and applied theater artist, and I'm doing a practice-based PhD at the moment. And I will pass it over to Alex. Me, Alex, or other Alex? I think you're the only Alex right now. Okay, okay. I wasn't sure if there was a fellow jitter. Yeah, what? Jitter. Oh, so I'm Alex Kiefer. Thanks for having me back again to talk about this paper. I'm a philosopher. I'm currently affiliated with Monash University, and I'm also doing more applied work for Nested Minds Solutions, a kind of active inference AI startup. Cool. There's a few ways we can start, but feel free to give any opening comments and then we can ask questions. We have some other slides. Right. So you asked about my reflections over the past week. I haven't reflected a whole lot on deep themes, but I did go through one thing. In the 22.0 discussion, there was a question about how we get from Equation 2.1 to Equation 2.2 with the flipping of the P's and Q's. That strikes me as not the most central theme we could spend time on, but I did just spend some time writing out the derivation, if that's something we can get to at some point. But in general, I guess my feeling is that last week was really fun and very far-ranging. I think it would be fun to drill down a bit more on some of the specifics this week, but it's also whatever people want to talk about. So that's what I'd be interested in doing. Thank you, Alex. Welcome, Blue. If you'd like to just say hi, we're having intros and then we'll sort of walk through some jokes and questions. Good morning. I'm Blue Knight. I'm an independent research consultant based out of New Mexico. Cool. So there's a bunch of places that we wrote down to jump in. Maybe actually we could go to Dean's "three functions walk into a bar." It's slide 11 and I'll have it up large on the stream.
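For readers following along, the "flipping of the P's and Q's" that Alex refers to is, in outline, the standard variational free energy identity (sketched here in generic notation, which may differ from the symbols used in the paper's Equations 2.1 and 2.2):

```latex
\begin{aligned}
F &= \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o, s)\right] \\
  &= \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(s \mid o) - \ln p(o)\right] \\
  &= \underbrace{D_{\mathrm{KL}}\!\left[q(s)\,\|\,p(s \mid o)\right]}_{\ge\, 0} \; - \; \ln p(o).
\end{aligned}
```

So minimizing $F$ with respect to $q$ drives the approximate posterior $q(s)$ toward the true posterior $p(s \mid o)$, and $F$ upper-bounds the surprisal $-\ln p(o)$.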
So, Dean, why don't you unpack what you were thinking here, and let's get some initial reads to get the conversational ball rolling? Well, essentially what the part-two aspect of this process does is it takes what we've introduced as the topic and tries to gain some interpretation around it. And then, so what does that mean? What does that turn into? And so one of the products of that for me was that I went back and reviewed 22.1. And one of the things you mentioned, Alex, was the explication of a simple idea. And I thought that was really important. So what can that turn into? That simple idea, that was the "three functions walk into a bar" piece, and it essentially spoke to what Scott was talking about last time in terms of paradoxes and what those mean, and when do we model and when do we follow. But I thought the bottom part was an important piece, in terms of you mentioning that you sought continuities. And I wondered if you were referring to synechism and that idea of how we take discontinuities and turn them into continuities. Man, that's really interesting. So I had never heard of synechism. I keep coming to Peirce through strange avenues that were not planned. I never consciously thought of it this way, but this nicely summarizes how I see things. And in the way I've just been sort of forced by experience to approach philosophy, I've found it difficult or impossible to draw boundaries that would create discontinuities between categories quite generally. So that gets into things like the Sorites paradox. There's plenty of examples of this, but I think for me, it's much harder to find a discontinuity than, yeah, than the opposite. Okay. Can you maybe just walk us through these "three functions walk into a bar"? What is the narrative here and where does it sit in relation to the paper? You're asking me? Yeah.
Okay, so essentially, if we're gonna talk about contrastive divergence, which I think is a critical path to understanding how we move down a gradient and arrive at some sort of generative model, I think that there are three functions that essentially exist in that. There's an inversion process, which again, I'm not gonna go into the weeds on. There's an identity piece, which is spoken to explicitly in Alex's paper. And then there's the contrastive divergence piece, which enables a person to remain balanced despite the fact that they don't have a particular answer, that there are hidden states. And I think a lot of what was being spoken to last time, Scott mentioned some things, Blue and Steven mentioned things like scale, where we don't necessarily come at it with the same perception as to what we're looking at. Some of us look at things a little bit on a grander model, and some of us look at it on a granular model. And I think that part of being able to fit that into a form like the joke that three guys walk into a bar, or whatever, kind of speaks to the fact that if you think this is gonna be about stability, you're gonna get interrupted by the change aspect of it. If you think it's gonna be all about the change piece, you're not gonna be able to realize any sort of conceptualizations, because conceptualizations require stability. So you have to be able to play in both fields at the same time, and that's hard. And I think that's why Alex's paper probably knocked me down about five times when I first read it, because Alex actually did try to somehow pull together the idea of stabilizing something, even though things were pretty choppy and rather dynamic. So I don't know if this is an example of where part two of us discussing your paper goes, but I think it's interesting, especially in the context of synechism and your idea that we're constantly trying to move from the unknown to the known, as opposed to the traditional:
This is what I know, and I'm moving into these unknown spaces. That reminds me, then, anyone, Alex or anyone else, of two complementary ways of thinking about education: the tabula rasa, the blank slate that inscription is put onto. It's like filling the cup with knowledge, or inscribing into something that's template-less. And then the alternative is to burn away ignorance. And so it's almost like, whether we're moving from the known into the unknown or from the unknown towards the known, they're the left and right hands of education. Right. Yeah, yeah, this is really cool. I mean, it seems like, Dean, you have a knack for reading things in a very deep, almost allegorical way. Where I'm talking about a very- A Peircean way, yeah. I read Charles Sanders Peirce two decades ago, before it became popular. Right. So, yeah, I think it's interesting, and these connections are really there. But just to comment on this for a second: I do think that there's something about just how ubiquitous dynamism is in all this, and how, in a sense, there's no stable process that we can latch on to here. But when I was talking about contrastive divergence, my main point was just, well, look, I'm thinking of that very narrowly as a machine learning thing, an unsupervised learning algorithm, right? Which I know you have in mind too, but you're also taking a broader picture. So, my point in contrasting contrastive divergence with simulated annealing was: we can suppose that the brain is running something like contrastive divergence, where it's trying to reach a lower energy state. But, you know, the way that algorithm works, of course, you don't need to actually run... right, the basic idea is that you don't need to... so, okay, sorry, let me take a step back for a second.
Right, so one way of doing this sort of unsupervised learning would be to take the distribution over hidden states in a neural net that's induced by an input and try to lower the energy of those states, while raising the energy of the fantasies that the network would produce top-down, right? And the ideal way to do that would be to run the network until it reaches its equilibrium distribution, and sample from that when you're doing the negative phase, the phase of increasing the energy of the fantasies and therefore lowering their probability. Contrastive divergence just takes a shortcut and runs this for, like, one step or two steps or whatever. So I just want to make that clear; I don't know if it's clear for everyone in the audience who hasn't looked at this deeply. That's the specific contrast that I wanted to draw there, between that and a process like simulated annealing, where, if you run it to the end of the process, you end up reaching the lowest possible energy state. So, I guess my point is, the brain doesn't necessarily need to be settling to, and this does, I think, connect directly to your point, doesn't need to be settling to a completely stable state or even any particular, like, metastable state. It just has to be... there has to be this gradient, right? That is available. But I do think that there's a deeper point here, right? So, the NESS, or non-equilibrium steady state, that organisms are approaching as they do this variational inference thing is not any kind of final, static state, like some kind of ultimate equilibrium. But going down a set of stairs, the one steady thing is that I remain balanced and don't go tumbling to my death, right? That's the one steady thing, the one balanced thing, and then there's the gradient itself. That's why there's a next slide that kind of speaks to that piece of it.
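Alex's description of the positive and negative phases, with contrastive divergence's one- or two-step shortcut, can be sketched in code. Here is a minimal, illustrative CD-1 update for a tiny restricted Boltzmann machine (biases are omitted for brevity; the network sizes, data vector, and learning rate are arbitrary assumptions for illustration, not anything from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, lr=0.1):
    """One CD-1 update for a toy binary RBM.

    Positive phase: lower the energy of the hidden states induced by the data.
    Negative phase: raise the energy of the network's one-step 'fantasy',
    instead of sampling from the full equilibrium distribution.
    """
    # Positive phase: hidden activations driven by the data.
    h0 = sigmoid(v0 @ W)
    # One step of Gibbs sampling to produce the 'fantasy' (the CD shortcut).
    h_sample = (rng.random(h0.shape) < h0).astype(float)
    v1 = sigmoid(h_sample @ W.T)
    h1 = sigmoid(v1 @ W)
    # Contrastive update: <v h>_data minus <v h>_model after one step.
    return W + lr * (np.outer(v0, h0) - np.outer(v1, h1))

W = rng.normal(scale=0.1, size=(4, 3))   # 4 visible units, 3 hidden units
v = np.array([1.0, 0.0, 1.0, 0.0])       # a single toy data vector
for _ in range(100):
    W = cd1_step(v, W)
```

After a few dozen updates, the network's one-step reconstruction of `v` favors the units that were active in the data, even though the chain is never run to equilibrium, which is exactly the shortcut being contrasted with annealing-to-the-bottom.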
And that's all I basically wanted to bring up, because I think the synechism piece embraces or entails all of that. There's a dynamic piece and there's a balancing piece, and you want to be able to address both. That's... That's cool. Yep, thank you, Dean. Steven, and then Adam. There's interesting questions here for pragmatism and social constructivism, you know, in terms of how these ideas roll out into how we might talk to other people in the world around projects and stuff, because of this implication. So I wonder how you would bring in this kind of embodiment of that. So if the brain is an organ that is doing more of an expected free energy thing with the nervous system, helping to manage the movement of the organism, and the organism overall has got that variational free energy to maintain homeostasis, you've got this energy balance piece, but you've also got that need for entropy. So I wonder how this can inform us. I don't know if you know about the Alexander Technique, but when you think about the body and the skeleton, we think about it as just the bones, but actually everything's really held together in a muscular sort of watery soup, with the tendons and everything. So there needs to be some noise in there; that entropy piece needs to be there when you're going down the stairs. You can't be frozen, or you'd fall. Yet you need to have some of that energy equilibrium, because again, you'd fall. So I wonder, if we were to talk to someone out there about their life and their world, walking down the stairs, how would we translate this into their language? Alex, if you wanna give a thought on that? I'm thinking about it.
I'm not sure, beyond saying that I do think there's always a balance of two sorts of forces or quantities here. But I don't know, this reminds me, actually, and I'm not gonna try to put words in your mouth, but this reminds me, Adam, of your remarks about free will in relation to degrees of freedom and entropy and stuff. So that's a segue, but it doesn't have to be taken up. Cool, yet it reminds me of the stream of consciousness: walking downstairs, you can't just stop statically anywhere along the staircase, and you can't halt the stream of thought. So what is happening that's allowing time t plus one to be generated from time t? The state only has to move by one moment at a time; it doesn't have to reach the ultimate steady state, the lowest possible energy. So Adam, and then anyone else raising their hand. I reread your paper last night and it's even better the second time. I'd like to actually talk to you later about getting into the weeds on different mappings, at the neural level, between the thermodynamic and the informational. But before we move on, I was wondering how you'd feel about a multi-phase or multi-part description of neural architecture, where some aspects might be more annealing-like and some aspects might be more contrastive. So you could, for instance, tell a story of some sort of predictive workspace where you can think of the model selection as iterative annealing, potentially, in terms of some of its aspects. You could think of a Dehaene-style ignition event, where you're generating this big high-temperature complex that then settles down and selects the model. But this would just be one of the things that's happening; something contrastive could be, maybe, orchestrating different cycles and comparing them, or like the Campbell system.
And that could be doing your contrasts. Would that be copacetic, or what do you think of that? Right. Yeah, no, that sounds awesome. I mean, in general, it seems consistent with the way I've learned to see these things, you know, just paying attention to all these different sources of evidence. It sounds right to me. I guess my only uncertainty is, the reason I focused on this contrast was in part because I wanted to argue for an identity between free energy in the thermodynamic sense and in the sense relevant to variational inference. And so I needed to make explicit the argument that, whatever's going on, even if there are multiple time scales on which this is happening, or if there are different parts of the system that are in effect being annealed to different degrees, it's not possible that the whole thing settles to, like, the lowest possible energy state while it's alive. So anyway, I don't mean to deflect. I think what you just brought up is really interesting, and you probably know more about the relevant neuronal mechanisms than I do. So I can't critique that idea, but it sounds, as you said, copacetic. Yeah, Adam, you wanna add something there? Oh, I mean, yeah. And I wouldn't want to distract from the point that you actually have to not be at equilibrium. You have to have this open-endedness and dynamics to keep the process going. But I don't know how productive it is; sometimes I'll squint at different machine learning algorithms and, you know, push them a little bit further. So you could think of this iterative Bayesian model selection as a kind of annealing, but that could potentially not be productive, even though aspects of it could be described that way. Yeah. Well, that's a good, I mean, it's kind of a good way of framing it.
Like, I think this goes back to this synechism thing. I think you can probably always re-describe something in terms of something else, but it's a question of how useful it is. And I never exactly know where the boundary is, right? What becomes just kind of a fanciful trick versus what's useful. But the thing that you're suggesting sounds like it would be a useful way of thinking about things. And when I contrast contrastive divergence and annealing: when I think of annealing, I think of DNA in the test tube as you turn the temperature down. You come to a final frozen state that's like a fixed crystal, essentially. Whereas the idea of contrastive divergence is more like cybernetics. It's like guiding and navigating on a flow that's already underway. And so even if you didn't have any divergence at all between your North Star and your heading, where you're going and where you wanna go, you still would be moving. So in that sense, even if you're trying to minimize the divergence, there still is a sense of movement, whereas in simulated annealing, the idea that there's no difference is more associated with a crystalline or static state. So they have very different sorts of teloi, almost. And for dynamic systems, that person going down the stairs, stream of consciousness, action selection in an uncertain world, that feels more like the cybernetic navigation rather than the DNA in a test tube. Though those are just some aspects. So Steven and then Adam. That idea of how things keep going, and the dynamics, I'd be interested in your thoughts about how that might play out in behavior or in the world. So I could imagine annealing being a little bit like if things are trying to hibernate: there may be a state where they can go closer to, like, glass. Glass gets annealed, basically; you plunge it into water and it holds that liquid state even though it's colder than a liquid.
But the question I have is how do you reconcile some of these questions around where we can get caught in loops? For instance, you know, Kasper Hess talks about depression, where you're seeing a lack of affordance in the environment. And that could be confirming your predictions; you can get kind of caught in a loop. And some of those challenges in computational psychiatry are that there are sort of low-energy... I actually haven't got an answer to this, maybe it's a thing, but they're sort of low-energy states, but they're also looping, confirmatory states, you know, where you get a reduction in free energy relative to some sort of stick in the sand that has been established by the priors and the dynamics. And then it kind of gets stuck in that, be it, you know, hyperactivity, something that's outside of where the organism wants to be, but it's still going there. And I don't know if that's quite the same as going into a sort of low-energy equilibrium. It's not necessarily going down in energy; it's sort of getting caught in a loop of variational free energy. Yeah, that's really interesting. Actually, sorry, it was someone else who wanted to... Alex, and then after your thoughts, we'll go to Adam. Yeah, so I mean, my intuitive reaction is that, yeah, something like that happens in depression. There's a sense in which it's a low-energy state. In my paper, which is kind of a sophisticated argument for a really simple-minded idea, I want to identify those two things and say, yeah, it's physically a low-energy state as well, but it's a local one. So again, this is my intuition at the moment. I don't know if it'll be borne out or if it'll work, but it seems like sometimes in depression, maybe we reach, yeah, like a local energy minimum in a certain part of the network or something, but it's not part of an overall optimal state for the organism when it comes to behavior and things like that.
It's really interesting to think about how that happens. Yep, thanks. Adam. Actually, to speak to what Steven just mentioned, within the psychedelic research communities there are interesting models of therapeutic change through annealing. Mike Johnson of the Qualia Research Institute has something called neural annealing, where the psychedelic state increases the temperature, helps you to break out, and then you can crystallize a new regime, and Robin Carhart-Harris has described it in very similar terms. One thing about generalizations of annealing that came to mind when Dan was talking was Doug Hofstadter and Melanie Mitchell with their Copycat architecture. They said it's different from annealing, but it had something that was kind of like ant colony optimization annealing, where there would be a temperature parameter for when your analogies weren't fitting. And so you can think of some sort of contrastive bit asking, how good are your matches, how good are these analogical-matching discrete comparisons? And then this parameterizes the temperature of a continuously adjusted annealing process that doesn't settle. And more recently, really interestingly, while from the beginning the language was completely workspace-like, Melanie Mitchell has explicitly drawn a connection to the global neural workspace theory. And so there could be a generalized annealing that has a contrastive bit and a generalized annealing bit, which does give you, maybe, two kinds of psychophysical identity mappings. And on a totally speculative note: sometimes we have accuracy minus complexity, or pragmatic versus epistemic gain. Maybe some decision-making approaches are able to hold both together and maybe even rebalance between those different modes.
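As a toy illustration of the idea of a contrastive bit parameterizing the temperature of a never-settling annealing process: this is a loose sketch inspired by Copycat's temperature mechanism, not the actual Copycat implementation, and every name here is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def contrastive_temperature(match_scores):
    """Toy Copycat-style feedback: temperature is high when the current
    analogical matches are poor, low when they fit well.
    `match_scores` are assumed to lie in [0, 1]."""
    return 1.0 - float(np.mean(match_scores))

def maybe_restructure(delta_fit, match_scores):
    """Accept a candidate restructuring with a probability set by the
    temperature that the contrastive 'how good are my matches?' bit chose.

    delta_fit > 0 means the restructuring improves the overall fit."""
    t = max(contrastive_temperature(match_scores), 1e-6)
    if delta_fit > 0:
        return True                      # improvements are always taken
    return bool(rng.random() < np.exp(delta_fit / t))
```

When matches are poor (high temperature), even fit-worsening restructurings are sometimes accepted, so the system keeps churning; as matches improve, the temperature falls and the current structure is increasingly protected, without ever freezing outright.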
Dean, I'd like to actually go to slide 12 and hear a little bit more on what you thought about contrastive divergence before we head to the next topic. So what did you see or find interesting on slide 12? What brought you to that slide? Well, I'd actually looked at stuff from Oliver Woodford before. And so that was kind of why, again, I was connecting dots in my own head. And it's interesting, because of the parts that Adam and Steven just brought up: in part of our conversations offline, one of the questions I asked Daniel and Blue was around this idea. So you have the physical math and you have the statistical math, and somehow they come together and make babies. And that's what we kind of have here. And so, what is this project? And what I liked, not to discount or put aside the simulated annealing piece, was the account it gives of what happens when, for example, a person goes down the stairs under control, as opposed to falling to their death, the lowest energy state. I thought this slide essentially spoke to it by giving a nice metaphor, in terms of shining the torch beam in our chosen direction of travel, which allows us to see the lowest point in the field in that direction. Which I think, Alex, you were addressing; I don't think you meant to make this such a central point of your paper, but in terms of trying to explicate the simple idea, the fact that you didn't just focus on the fact that there's a gradient, but on how a person moves down that gradient and doesn't lose control, was what I wanted to bring forward here. And in the notes at the bottom, which I can't see on my screen, there's a really nice explanation of what contrastive divergence does in terms of being an assistant. That's why I wanted to bring that one forward. Cool. Alex, if you want to give a thought on that, and then Steven. I don't have much of a thought. I mean, I think it's a really cool angle that you have on this.
And I think it's, again, something that's there, even if it wasn't my intended focus. And it reinforces, I think, some of the themes that I was focusing on. Thanks. Steven. So, what I think is quite useful, and I keep bringing up this idea of, if I was trying to explain this to an arts practitioner or someone in psychotherapy or something, how to explain the implications. And one thing that I think is important here, when you talk about that energy minimum: if I jump to the bottom of a hill, metaphysically I might know that I'm at a lower potential energy overall. But actually, I'll probably be at a higher energy when I hit the floor for a while, right? As I start to get stressed and perspire, assuming I'm not just completely squished to oblivion. I suppose we do that on rollercoasters a little bit. But what we are inferring about our states isn't what it is, because we don't directly experience being at the lower energy state. We experience what our sensory states enable us to infer about the results of our actions. And then from that, we compare that to our expectations based on the stick in the sand, or the big sticks in the sand, which is our phenotype, which is what we can then use as something to anchor against. So I'm wondering, that has an implication for how people think about the world, because of these types of ideas. You know, there's a sense in which they are connected and interconnected. At the same time, they're always separate, in terms of being able to touch something: they're not touching something, they're getting sensory data, which allows their model to predict what they've just done, but they didn't actually touch in the way that people might think they've touched. So I'm just putting that out there as a question. Yep, Alex, go for it.
Yeah, so there's this thing you keep doing, Steven: you keep trying to put all this in touch with the body in the world and stuff. And I'm like, no, let me just stay in my internal model and not worry about that. So I don't know, actually, if I understood the thrust of just the last couple of sentences, but I think the point is that, yeah, we don't just want to leap to the lowest energy state possible come what may, right? It's much more about the process. And I think what you said about this thing being anchored in the phenotype and, you know, the priors in general is really important. Another paper that I wanted to write, or I don't know if it's a paper, maybe it's just a blog post or a tweet or something, is just on the fact that you always have to bear in mind, when you're talking about this energy minimization process, that this is all relative to a phenotype, which itself embodies some amount of energy, right? And so, in fact, you might even say that the stable phenotypes are some kind of optimal trade-off between accuracy and complexity, right? Just in themselves. So there's this force that's sort of pushing up from below, which is just having a phenotype complex enough to predict anything. And you have to work within those boundaries. So I don't know if that really addresses your point, because again, I totally missed the embodiment part, it just flew by me, but... That's nice. Just one point, and then Adam. So I really like that example of a person or a body in motion going down the stairs, and the difference between the controlled walking down the stairs, or maybe even using an assisted mobility device, and falling down the stairs. And both of them can't stop at one moment. Whether you're tumbling or whether you're walking, you still can't stop there.
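Alex's remark that stable phenotypes trade off accuracy against complexity corresponds to the other standard way of splitting variational free energy (again in generic notation, not taken from the paper's own equations):

```latex
F \;=\; \underbrace{D_{\mathrm{KL}}\!\left[q(s)\,\|\,p(s)\right]}_{\text{complexity}} \;-\; \underbrace{\mathbb{E}_{q(s)}\!\left[\ln p(o \mid s)\right]}_{\text{accuracy}}
```

On this reading, the phenotype's priors $p(s)$ set the baseline against which complexity is measured, which is one way to gloss the "force pushing up from below": a phenotype must be complex enough to predict anything at all.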
So it's always dynamical, and you are going downhill and converting your potential energy, going to a lower energy state. So those two are good ways to think about where bodily control and motor coordination come in, which we'll talk about in 23 soon, with embodied skillful performance. But let's think about a mountain climber. That's somebody who's actually contrastively diverging uphill. And so that shows us that it's not just like the die is cast and then you either fall or stay in control on the way down; there's a way in which going to the top of the mountain is like going downhill, or annealing, or reducing the contrastive divergence relative to your expectations of where you wanna be. And now you can imagine, if the water level's rising, going uphill is the survival strategy. So we wanna have a way of describing where control, bodily control, comes into the picture going downhill, and then even generalizing beyond just bodies with mass falling in a controlled or uncontrolled fashion, towards the kind of cybernetic planning that could actually result in a mountain climber, for example, going a little downhill and a little to the side and then making their way up the mountain. So Alex, then Adam, then Steven. Yeah, yeah, that's a really good point to bring up, especially in this context of the thrust of trying to simplify things by identifying physical and psychological things. It's like, no, sometimes you wanna climb the mountain. So I mean, my immediate way of trying to summarize how that would work is just, well, you've got a strong prediction that you wanna be at the top of the mountain, right? That defines your desired distribution over observations. And so that's gonna drive you to climb the mountain, but I think that the way it does that, proximally, is that there's going to be an actual physical disequilibrium in the brain, right?
That's caused by the discrepancy between your predicted state and the state that you're inferring from your observations, and lowering that potential is what drives the action of actually climbing the mountain. So, yeah. Cool, awesome. Adam, then Steven. Head spinning a little bit, but hopefully this works, because, like, up, down. But I'm wondering, in terms of something like depression, could we think of it as a local minimum that sits quite high up? And if we thought of what we'd expect... so I guess, okay, what is it that's being predicted, at which level of the generative model? What's being predicted actively between the organism and its niche, what's being predicted over the whole skin-encapsulated organism, the brain, and then subsystems? Ultimately they have to come to some kind of harmony, but they could diverge. The idea would be thinking of a landscape such that, when the energy is high, it might be more jagged, and you might be more likely to get stuck at a local minimum when you want to get to the really low one. And so you could think of something like niche construction as looking for a catalyst to smooth out the landscape and make it more navigable, or you could think of the psychedelic as turning up the heat. So I'm thinking of the difference between thermodynamically versus kinetically favorable chemical reactions. In theory, if you got from here to there, you'd come out ahead on enthalpy, but you can't get there; it's not kinetically favorable. And so the different things we're doing are trying to find ways of getting kinetic favorability for this thermodynamic favorability, and bringing those into alignment. Very nice, Adam. You can build that bridge; you don't have to ford the river or swim across, or you can just pray for the quantum tunnel to appear. So. I do.
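Adam's picture of temperature letting a system hop out of a high local minimum before it "crystallizes" is textbook simulated annealing, which can be sketched on a toy 1-D energy landscape. The landscape, cooling schedule, and step sizes below are arbitrary choices for illustration; whether the walk actually escapes the shallow basin depends on the schedule, which is the kinetic-favorability point.

```python
import numpy as np

rng = np.random.default_rng(1)

def energy(x):
    """A 1-D landscape with a shallow local minimum near x = 2.2 and a
    deeper global minimum near x = -2.3: a toy stand-in for
    'thermodynamically favorable but kinetically blocked'."""
    return 0.1 * x**4 - x**2 + 0.3 * x

def anneal(x, t0=2.0, cooling=0.995, steps=3000):
    t = t0
    for _ in range(steps):
        x_new = x + rng.normal(scale=0.3)
        dE = energy(x_new) - energy(x)
        # High temperature: uphill moves are often accepted (barrier crossing).
        # As t -> 0 the walk 'crystallizes' into whichever basin it occupies.
        if dE < 0 or rng.random() < np.exp(-dE / t):
            x = x_new
        t *= cooling
    return x

x_final = anneal(x=1.5)   # start on the slope of the shallow basin
```

Run to the end of its schedule, the walk settles into a basin bottom, which is exactly the frozen-crystal endpoint Alex contrasts with the never-settling, contrastive-divergence-style dynamics; keeping the temperature from ever reaching zero would be the "doesn't settle" variant.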
Steven and then Dean. Wow. Okay. There's something quite important here, I think, because, okay, we're minimizing this energy gradient. Okay, that's fair enough, particularly for robots which are using reinforcement approaches and basically working on energy gradients. And then you've got the idea that, okay, you can work with variational approaches. I like this Gibbs free energy idea, where entropy enters into the reaction. But the big key thing as well, if you're going to take your agent at the level of the chemistry, is that the chemistry is all non-equilibrium chemistry. The whole of Gibbs free energy is built on equilibrium chemistry, but here we've got non-equilibrium chemistry when you get to the level of the organisms and the cells and everything. So I think what's an interesting point here is: you're going down the stairs, you're trying not to fall over, you're being a rock climber. So overall there's this energy piece around how the agent moves. But for the agent to move, unlike a robot where you might have little pistons, the actual cells in that are operating at this non-equilibrium state. And quite how big this gap is, between these attractor basins versus the energy basins which we normally think of, I actually don't know. I don't know. I'm interested to ask Karl Friston that. Like, how much does it shift things to have basins to do with attractors as opposed to just basins to do with energy? Like equilibrium energy gradients. And that's the thing that's different in terms of what's really going on when I'm climbing a mountain or whatever: all my muscles, at the actual level of making things happen, are swarming and trying to enable this kind of biological process to happen. And okay, from a system perspective, I can see the energy being less for me falling on the ground.
But for my muscles to move and everything, they're operating at this non-equilibrium process, which will have energy impacts, but also has some other types of dynamics which you don't have otherwise. You have this non-equilibrium steady state dynamic which you don't get in traditional chemistry. So there's something there I just thought of. I'm not saying I expect an answer on that, but I think it's an important point that seems to be a gap, and it really shone out for me there. Nice, thanks, Steven. It's kind of like, if it's a ball at the bottom of a bowl, then that's the annealing. It's the crystal state with the ball in the bowl. But then if you have a neuron that's holding a resting voltage, that is being actively maintained with an enormous expenditure of energy. But it's its own kind of bowl that it's converging to; whether it's above or below its set point, it's still doing a type of convergence, but it's a dynamically held state. And you're absolutely right that sometimes the lower level entities, the entities which the larger one is composed of, have their own active dynamics even when the top level system is at rest or appears to be unmoving. So Dean, and then we have a few other fun topics that we'll get to. Well, I'm gonna see if it's okay to ask Alex to jump to his third rail at this point, because I'm really, really curious about the modified Ramsey sentence aspect of his paper. And what I was hoping he might be able to speak to a little bit is not just what the Ramsey sentence is talking about, but the implications in terms of different time intervals, and what role that plays in us being able to understand how the modified Ramsey sentence works in your paper. Perfect, I switched to that slide, 44. So yeah, I was definitely curious about that as well, Alex. The Ramsey sentence, or this entire area of formal logic: how did that come into play?
Yeah, so, right, I kind of trumped up the Ramsey sentence thing in the last episode, but it's kind of silly of me because there's not a whole lot to it, in a way. But then maybe that's just how it seems to me from the perspective of where I'm coming from. The reason this is in here is that the whole identity piece of this, the philosophical mind-brain identity piece, is based on, I was mostly inspired by, David Lewis's approach to that topic. And so there are a couple of influential approaches: there's Lewis's and there's Smart's, I guess, are the two big papers. But basically what this piece of formal logic is doing is just trying to capture what a theory says about the entities it sort of implicitly defines, or how you could get from, this isn't all explicitly spelled out graphically, but how you could get from just a collection of statements that you might assent to about whatever the topic happens to be, in this case mental states, from that to a theory that's expressed in general terms and that again implicitly defines the entities that figure as the values of these variables here. So I mean, we could briefly talk about this, I guess. Is it useful to go over this? You guys? Yeah. So read out the sentence, or translate it, or provide an example. Sure. So I guess I'll start with just the Ramsey sentence; we can talk about modified versions in a second. So the first symbol here, as you guys discussed in point zero, is just an existential quantifier. It says there's some X such that whatever follows. The second symbol is a universal quantifier over the variable Y. So it says there's some X such that for all Y, blah, blah. And here T, the predicate, is just literally, so say that you had a collection of platitudes, or, I don't want to complicate things, just a collection of statements. Like, when I see a cat I'm happy; when I think about happiness, I tend to think about sunshine; things like that, right? Just some examples.
And so you just conjoin all those, just put an 'and' between all of the sentences of that type that you could think of in this domain, in this case folk psychology. So then you'd have a long sentence, it's a giant conjunction now, that has some terms that refer to mental states, like 'believe', arguably 'see', although maybe some people interpret that as a physical state, there's definitely blurriness here at the edges. But in any case, things like 'believe', 'desire', 'think', these are clearly mental state terms, right? So you'd have this long sentence that has those terms and also other terms. And so essentially you replace each of the terms in question, the mental state terms, with a variable, and what's left over is the predicate that defines what the sentence is saying about the values of those variables, right? And then you just append these quantifiers at the beginning to sort of spell out what kind of claim you're making about the states in question, namely: there is something such that it relates to all other things in this way. And so that's the basic Ramsey sentence. It's just an abstraction that allows you to spell out, in formal logic, in first-order predicate logic, what a collection of sentences says about something. And the last piece here is the 'if and only if', the three bars. That's the Russellian uniqueness condition. So this goes back to Russell's theory of definite descriptions. My start, by the way, was in sort of philosophy of language and stuff, so that's why this is in here; I came to all this statistics and physics based stuff much later. So Russell had this theory of definite descriptions where he was trying to explain how you could have a meaningful expression like 'the present king of France', even though there is no such thing. So how do we philosophically handle failures of reference and things like that?
And so his suggestion was that you can think of definite descriptions like 'the present king of France' as just saying, well, there is something such that it is a present king of France, and there's only one of those things, right? So it's a uniquely satisfied description. And so that's what this end part does here. It says this predicate T, which encompasses all of the implicit folk knowledge about mental states in this case, holds of something only if that thing is equal to X. And one last piece that's important here is that I took the bold letters here, X and Y, to be like vectors, essentially, right? So we're talking about many different mental states in parallel. We do this, but if you want to, you can just think of it simply as being about one mental state that you generalize to all of them. So I don't know, how clear is that? Let me try a little bit of a physical example. It's like there's the stream of consciousness, the stream of statements, what you said, joined with 'and's: I saw a cat, or photons hit my eye and I perceived a cat. So one of those is more of a physicalist claim and one of those is more on the mental side. And it's almost like we're going to draw a line around the mental claims just so that they can be definitely interfaced into a formal logic, even if the contents of that mental experience might not have a reference in the real world, or might not be something that exists in the same way that those photons hitting the eyes exist. But it allows us to interface with those kinds of statements by agents in a formal logical framework. Well, I mean, I guess the one part I'd push back on a bit is, I mean, this sentence here clearly states that this X exists, whatever you take that to mean, right?
So it could be that there are different criteria for existence in different domains or something, but it definitely says that there exists something such that it's related to all the other things of this kind in the ways that are specified in the predicate. So, yeah, I guess the main point here, I focused on the Ramsey sentence because people were interested in it, but the main point is just that you can take all the things that people say about mental states as implicitly defining them in a sort of theory, right? And Lewis used this explicitly in the context of trying to identify mental states with brain states. So I'll just say a little bit more about this piece and then I'll shut up for a bit. If you have this network of relations that's described by a sentence like this, you can take that as implicitly defining the mental states while being neutral about their sort of ontological character, right? So it could be, then, that if you do some neuroscientific investigation and you find, hey, there are these states in the brain that seem to bear the same relations to one another that these mental states we were talking about bear to one another, Lewis says: well, then, because the sentence says that the mental states really are just the things that uniquely realize this structure of relations, you've just empirically discovered the mental states. So it's a really cool way of theoretically bridging something like intuitive folk psychology and neuroscience. And Lewis's point was, once you have this structure, if you could extract this structure that's implicit in folk psychology, and if you made the empirical discovery, there'd be no further bells or whistles needed to say that there's an identity between them, because that's just what the sentence says: the mental states are whatever things satisfy this description.
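For readers following along, the structure Alex walks through can be written out. This is a reconstruction from the discussion, not a quote of the paper's exact formula:

```latex
% Start with folk psychology as one big conjunction, with mental-state
% terms m_1, ..., m_n ("believes", "desires", ...):  T[m_1, ..., m_n].
% Replace each mental-state term with a variable to get the open
% sentence T[x], where bold x abbreviates (x_1, ..., x_n).
%
% Ramsey sentence: the theory is realized by *something*:
\exists \mathbf{x} \; T[\mathbf{x}]
%
% Modified Ramsey sentence (Lewis): the biconditional adds the
% Russellian uniqueness condition, i.e. anything satisfying T is
% identical to that realizer:
\exists \mathbf{x} \, \forall \mathbf{y} \,
  \bigl( T[\mathbf{y}] \leftrightarrow \mathbf{y} = \mathbf{x} \bigr)
```

The identity argument then goes: if neuroscience finds brain states that uniquely satisfy T, the sentence itself says those brain states are the mental states.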
Very interesting. So Dean, then Steven, then Adam. Yeah, so Alex, I mean, as it's written out here it's really stable, and I'm really curious, because I really don't know this. If I've got a day to go down the stairs, versus I've got one second to decide whether the 96 inch drop is less harmful to me than the lion that's chasing me, do time intervals, really really short ones or longer ones, affect the stability of this? Right, yeah, so sorry, I neglected that part of your comment earlier. I think no is my short answer. I think this is meant to be a static sort of description of an implicit theory, which will include statements about short time scales and long time scales. So I would say that anything about time scales should in theory be captured by the folk psychology that you embody or that you have in mind at the moment, and that might change from one moment to the next also; this doesn't need to be a temporally static structure. So maybe that speaks more to your point, but I think the possibility of spelling this structure out even for one instant is the lever that Lewis uses to draw his philosophical conclusions. Thanks. Thanks. Steven, and then Adam.
This question about prediction and time: I think, like you were just saying, there may be a sort of time scale at which things are just flowing, like, what sort of speed is it at which things are just doing their thing, i.e., cells are doing their processes. But what Dean was talking about is longer time scales than that, and I think, like you say, there's something different there. Everything beyond a certain point has to be imagined, I suppose; anything that's not possible to do variational free energy on, you've got to do expected free energy for, and to do expected free energy you basically have to imagine, in a different way than just our limited sense of imagination: our body has to imagine at some level the expected free energy; that's basically what expected free energy is. So I think that then does bring in some interesting ideas around folk psychology. Would you say that those mental states can ultimately be action and sensory states that have been predicted? So the mental states are basically the recapitulation or expectation on sensory states and resulting beliefs on action states. So they could be seen as mental states, but actually there is a more distributed realization of that. I'm interested in whether that's something that could hold. Yeah, I mean, I think the spirit of Lewis's proposal is also functionalist, right? So the basic idea of functionalism in the philosophy of mind, of course, is that mental states can be defined in terms of their relations to inputs and outputs and also to each other. And I guess it's the 'to each other' bit that distinguishes it from behaviorism. But yeah, I think that these implicit definitions that Lewis talks about would depend heavily on things like: when I believe that there's a cat present, I expect certain visual appearances to occur, and I'm disposed to act in certain ways.
And maybe you wanna cash out... I tend to wanna cash the actions and the sensory stuff out in terms of proximal stimuli and, like, motor effector states, rather than distal and extended things. I don't know if it's a bias or what; that's just the way I tend to think of these things. But definitely, I guess I would say that the relations between the mental states are just as important as the relations to the active states and the sensory states when it comes to this kind of formalism. I feel like I missed an opportunity; it was a really interesting comment, actually. I don't know if I did it justice, but I'm still thinking about it. Can I just add one little thing, just before... Yeah, yeah. Okay, very quickly. One thing that sort of comes to mind that I think might connect with what you're saying. You were talking about this 'and, and, and'. So there seems to be something there about the combination of 'and's and what's, you know... So for instance: and there was a dog, and there was a woman with the dog, and we're in the park, and the dog was wagging its tail, and then there was blood on its teeth when it smiled. And I'm like, suddenly that's changed it, right? So the triangulation of the 'and's flips once, suddenly, the dog bares its teeth and you see blood on them, right? So that links into what they do in qualitative research, where they often triangulate to get something which has got a trustworthiness rather than accuracy. And I think there's something interesting there about what you're saying. My knowledge of that field of linguistics is limited, but if I can bring it back into embodied work I find it easier, which is what I'm trying to do. But this 'and, and, and' has got some practical applications, I think, in a number of areas. Cool, definitely. Yep, thanks, Adam. If you can return, yep. Okay, go for it, Adam. Hey, making coffee.
So eventually I'd like to get to free will, and actually loop around a little bit to depression as a case with respect to that. But before we move on from Ramsey sentences, not to belabor it, but: so the idea is that you strip away the specific entities being signified, you look at the syntax, and then you're asking whether the same syntax can apply in another system, and then you have the mapping, this would be the map. Is that the basic idea? Essentially, yeah, if you take syntax in a slightly broad sense, then totally, yeah. Okay, so I guess I'd be wondering, in terms of active inferential modeling, would it be something like, with the graphical brain, you have a Forney factor graph where you have the continuous regime down below but then this discrete regime up top, and maybe in the factor graph portion you could look for some sort of homomorphism or isomorphism there. Or, if it's the brain, you would look at ensembles and describe them as attractors moving along some sort of manifold, and then when you coarse-grain those, or throw a blanket around them, or reify them in some way, if the syntax of their interrelations matches people's subjective reports of their experience, then you would be able to potentially identify physical and computational substrates of consciousness. Is that a bridge too far? Well, that's where we're going, I think, except I never made the bridge all the way to consciousness, because that's one thing that I reserve the right to not decide about. I think your work on this is really cool, actually.
I kind of suspect that most of the main theories of consciousness kind of overlap a bit, and there's a core that's sort of getting at the way things are, which I think you've written about. But I wanna just bracket the consciousness thing for a second, although maybe that totally ruins it for you, I don't know. But yeah, that's exactly the idea. I think you just expressed it very well: if you look, and yeah, you have to squint, you have to do some coarse graining, but if you look at the dynamics of the physical system, you'll be able to discover, in the ideal case, some of the same sort of syntactic relations that you would get from the psychological perspective. And I think that's also very consonant with the way connectionist modelers describe their work sometimes. So, like, a connectionist model, a simple neural network made of these simple processing units: you can derive that architecture just by looking at the brain and very much simplifying things, or you could derive the same graphs in some cases just by looking at cognitive processes. And so it's the idea that you have two perspectives on one thing, exactly. If you have a graph isomorphism, then that is what was being hinted at with it being sufficient to find that there are relationships amongst mental states. Like, I always feel this way before this other feeling. And then if you found that there was some brain state that always preceded... is that the end point, or is there still more to explain in that context? Let's actually get to another topic that we highlighted last week, which was the sleep-wake example. So I'm here on slide 40, and we're considering a system whose analysis is tractable, at least according to Alex. And we have an agent whose behavior, as far as sleep-wake is concerned, is defined by equation 3.1. So could we walk through what is happening in this simulation? Like, what is the agent doing? What does the agent consist of?
And then what does the equation show? And then how does that relate to what we're showing in this paper? Sure. So I think this is another aspect of the paper that was maybe seemed to be making a deeper point than it was. But I'll just talk about this briefly. So this was actually just meant to be sort of a slight modification of just like a Helmholtz machine running the wake-sleep algorithm. So I don't know if I need to set that up. Blue talked about that a little bit in the point zero. But the idea was if we just have a generic... We have a neural network that has a generative model that's sort of encoded in its top-down connections and an approximate recognition model in the bottom-up connections. And so one way of doing this is to run the wake-sleep algorithm where you basically generate states of the network top-down, like fantasies, functionally speaking, and then you adjust the bottom-up weights so that they're likely to produce those fantasies. And then you alternate this with a... So that's the sleep cycle. You alternate this with a wake cycle where you use an input of the kind of sensory data that you want the model to learn to generate. And you do a bottom-up pass and then adjust the top-down generative connections so that they're more likely to produce those states that are induced by input. So all I was doing here was saying, look, if we suppose, just make the assumption that this is a real system we're talking about. It's ridiculously unrealistic, right? So that's why I said it's something that's tractable to analyze, but it's also quite a far cry from anything biologically possible. But it's also not completely off the mark, right? So I think there's a reason that the Helmholtz machine was kind of like a seminal moment in the evolution towards what we have now. Because it got a lot of the coarse-grained sort of conceptual structure really down that had been pulled together before in this form. 
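The wake-sleep procedure Alex describes here can be sketched in a few lines. This is a toy reconstruction, assuming binary stochastic units and the simple delta-rule updates of the original Helmholtz machine work; the layer sizes, data patterns, and learning rate are invented for illustration, and wake and sleep phases alternate with equal probability, as in the paper's setup.

```python
import math
import random

rng = random.Random(0)

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def sample(p):
    return 1 if rng.random() < p else 0

N_VIS, N_HID, LR = 4, 2, 0.1

# Recognition (bottom-up) and generative (top-down) weights.
R = [[0.0] * N_HID for _ in range(N_VIS)]   # visible -> hidden
G = [[0.0] * N_VIS for _ in range(N_HID)]   # hidden -> visible
g_bias = [0.0] * N_HID                      # generative bias on hidden units

def wake_step(v):
    # Drive the net bottom-up with data v, then nudge the generative
    # weights so the top-down model is more likely to produce these states.
    h = [sample(sigmoid(sum(v[i] * R[i][j] for i in range(N_VIS))))
         for j in range(N_HID)]
    for j in range(N_HID):
        g_bias[j] += LR * (h[j] - sigmoid(g_bias[j]))
    for i in range(N_VIS):
        p = sigmoid(sum(h[j] * G[j][i] for j in range(N_HID)))
        for j in range(N_HID):
            G[j][i] += LR * h[j] * (v[i] - p)

def sleep_step():
    # Generate a top-down "fantasy", then nudge the recognition weights
    # so the bottom-up pass is more likely to recover its hidden causes.
    h = [sample(sigmoid(b)) for b in g_bias]
    v = [sample(sigmoid(sum(h[j] * G[j][i] for j in range(N_HID))))
         for i in range(N_VIS)]
    for j in range(N_HID):
        q = sigmoid(sum(v[i] * R[i][j] for i in range(N_VIS)))
        for i in range(N_VIS):
            R[i][j] += LR * v[i] * (h[j] - q)

# Wake and sleep cycles occur with equal probability.
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
for _ in range(2000):
    if rng.random() < 0.5:
        wake_step(rng.choice(data))
    else:
        sleep_step()
```

Because every unit is stochastic, the network as a whole has a well-defined distribution over joint states, which is the hook for the thermodynamic reading discussed below.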
Anyway, so let's suppose that the wake and sleep cycles of that machine occur with equal probability. Then we can talk about probabilities of certain states occurring in the network just, period, right? Whether it's during a sleep or a wake cycle. And so if you just grant... if you ignore the fact that this really ends up being an approximation to an approximation, right? Wake-sleep ends up being sort of an approximation to the EM algorithm, which is itself not quite full variational Bayes. So anyway, lots of approximations, but if you ignore that for the moment: I just wanted to make the point that you could thoroughly, completely characterize the probabilities of various states of the system. And so you could use that to define this R here as just the probability of the system being in a certain state. Here the s_i's refer to the states of the individual neurons, but that's related in a simple way to the overall state of the network. So you can talk about the probability of the system being in one of these states regardless of whether it's running a wake or sleep cycle. If you have a probabilistic description of the states of the whole system, then you could use that to define the relevant terms in the thermodynamic free energy, so the entropy and the energy and such. So that's the basic idea. But the part that you were wondering about, like, what is this system simulating? What does it perceive and such? This example is completely agnostic as to any of those details. So once again, this is highly abstract, and I apologize. No, it's cool. And what do you think this model would allow us to continue developing towards, or what does it show for the argument that you put forth as far as the psychophysical identity thesis? Well, I don't know that it shows anything in itself.
I think what it was trying to do was write down a system in which the target would be clear. So let me put it this way. The R distribution, because it's essentially made up of, or factored into, these wake and sleep cycles, right? Those have direct interpretations in terms of variational free energy and inference. And yet we have a well-defined distribution over the states of the system that we could use to describe its physics. So I just wanted to show that there's one simple case where we can write down a distribution that bridges these two descriptions. And then in the subsequent paragraphs I try to argue that we can generalize this beyond equilibrium. And that's the very tricky bit, and I don't claim to have shown it by any means. Again, I was trying to remove obvious obstacles to this thesis being true. Awesome. So Adam with a raised hand, and then Blue, or if anybody else wants to go for it. In terms of wake-sleep, there could be some potentially biologically realistic implementations of something like that. O'Reilly has a paper, Deep Predictive Learning, where he describes a model of the thalamocortical system doing predictive processing that's very wake-sleep-esque. I'd be curious to know if you thought that was a good psychophysical identity mapping there. The other thing would be, if we think of experience as being the ramification of predictions, so the mapping would happen at the level of predictions, and if the predictions are the sleep part, is there a sense in which experience is, kind of paradoxically, always the dream? Like, the awake part is the stuff that's just updating it, but the actual thing generating the experience is the dream part? Yeah, so I have an unpublished paper on this, too, on the representational division of labor.
So people in the predictive processing literature say things like, yeah, we experience our expectations of the world, or whatever, but I'm not so sure it's that way. I don't know. I guess if you think that the predictions are the carriers of the experience, or the neural correlates of the experience or something, then that would be the case, and the wake-sleep cycle would do nothing. But I don't see the reason to embrace that interpretation of what's going on here. No, I mean, I'm all about the idea that we experience sort of virtual models, in a way, right? That's cool. I just don't interpret that as meaning it's just the top-down driving signal that leads to experience, although maybe I'm not really against that either, and I can see why you'd want to say that. Anyway, I just don't think there's any direct inference to that. And I would love to look at this paper you mentioned, by the way, on biologically plausible versions of this, because I think there's a lot that's right about it. The reason I say it's so impossible is just that it doesn't have any role for top-down modulation of bottom-up signals during perception, which was by design, so that it would work quickly. But that's the main limitation that concerns me. Cool. Thank you, Alex. Steven, with a raised hand, and then anyone else; and anybody watching live, you're more than welcome to write a question in the live chat. So, Steven, go for it. Just to help me clarify: this being a tractable analysis, so the key point is that by going stochastic, it shows in principle how recognition and generative densities can be used together. Is that kind of the point this shows? I'm just... I'm struggling...
I think I'm not so good with the formalism to exactly get it, but I think the general point you're making is that you're going to stochastic algorithms, and therefore you've got this potential for the recognition and generative densities, which are basically sensory and action, potentially to be reconciled, and shown even in a very simple case. Would that be roughly the right reading of it? I think so. I think that's in large part right. So the idea that we're dealing with a stochastic system is kind of important, because yes, I want to claim, or I want to advance the possibility, that probabilities are encoded by probabilities, so then you could have an actual identity. Whereas if you're just encoding, like, if you're using a scheme that's more like what Karl Friston talks about often, where you have sort of neuronal states that are encoding the sufficient statistics of a distribution, then I don't think you could get an actual identity, because you'd have a clear representation that has a certain convention almost built in of how things are represented. Whereas I'm arguing for the possibility of a much more direct encoding, and that does depend on this being stochastic. But the other part of this that I forgot to mention, that I think maybe crystallizes the point this is trying to make, is that if we're at equilibrium with respect to the network, if we're at the Helmholtz machine's equilibrium distribution, then in that case, if you do take R as describing the physical probabilities of the system that you would use to specify the thermodynamic properties, you could show that at least the variational free energy and the thermodynamic free energy would be the same. And then generalizing to non-equilibrium, of course, is the hard part, but that's more concretely what I was trying to show there. Yep. Thanks. So I'm going to go to a question from the chat and then we'll go to Adam.
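The equilibrium claim at the end can be written out in a few lines. In units where kT = 1, and with the system at a Boltzmann equilibrium distribution p(s) = e^{-E(s)}/Z over network states, the standard bookkeeping (a sketch, not the paper's exact derivation) goes:

```latex
% Variational free energy of an approximating distribution q:
F[q] \;=\; \mathbb{E}_{q(s)}\!\left[ E(s) \right] \;-\; H[q]
      \;=\; D_{\mathrm{KL}}\!\left( q(s) \,\|\, p(s) \right) \;-\; \ln Z
% The KL divergence is nonnegative and vanishes iff q = p, so at
% equilibrium (q = p) the variational free energy coincides with the
% thermodynamic (Helmholtz) free energy:
F[p] \;=\; -\ln Z \;=\; F_{\mathrm{thermo}}
```

Away from equilibrium q differs from p, the KL term is positive, and the two free energies come apart; that gap is exactly the hard part Alex flags.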
So Cambridge Breathz asks: can this model describe or map the transition between focused attention and mind wandering? Grateful for any elaboration. Thanks, everybody. Anyone want to take that? I don't know. Maybe the dreaming versus the sleeping versus the waking; but then even within the waking state we have a wandering phase versus a very tightly focused phase. So what is happening there, if that reminds you of anything? I mean, I'd say if we're talking about this model, the Helmholtz machine or this slight elaboration on the Helmholtz machine, then I don't think it would do anything that sophisticated, just because, as Adam says, there are biologically possible ways of implementing this kind of thing, but anything as fine grained as a shift from attention to mind wandering is probably going to be left out of a model at this level of abstraction and approximation. That's my thought. Thanks. So Adam, then Blue, and then anyone else with a raised hand or a question in the chat. Adam? I'm not sure I can address the mind wandering exactly in terms of a physical identity, but in terms of the mountain climbing analogy, I'm wondering if mind wandering is kind of like exploratory hill climbing with a little bit of quantum tunneling. But, I mean, what do you think? Actually, it might loop around to the brain in an interesting sense, in that if we think of lower levels of a cortical hierarchy as doing something more like variational inference via continuous message passing, maybe with some discrete updating there too, that seems to be one kind of implicit representational regime.
There might be a sense in which you're getting probabilities in terms of the attractor dynamics along this hierarchy, corresponding to probabilities of a hierarchy of nested events in the world, a deep temporal hierarchy of the latent causes of the world. But there seems to be, potentially, as we move inwards, another level where these get re-represented. When you go into deep association cortex it might be a different game, and then when you're getting coupling with the hippocampal system, now it seems like you're getting something almost like explicit syntax with discrete semantics. Because you can map, there are these pointers going from the relationships among these, what you'd call bump attractors, in the hippocampal system, like these place cells, and they can be used to contextualize the whole overall system within some sort of structured syntax. I don't know: do you need something like that to have the probabilities, is that what you're looking for? Or could there be more like this enactive hierarchy, like a hierarchical Boltzmann machine that has a nested hierarchy of causes there, and those are probabilistic? Do both do it, or only the re-representation? I don't know. I think a lot about this issue of re-representation, how much of it happens and what it's doing and stuff, but I guess I need to understand more, and this is totally just my ignorance about the kind of models you're describing with the hippocampus and such. I tend to think of semantics in the brain just in terms of vector space models, and so I'm just curious, I don't know if we could talk about it now for a bit, maybe, it's up to Daniel and others, but what is the explicit syntax in this case? It sounds like something that would be friendly to what I'm trying to do.
So this explicit syntax is still heavily researched. There seems to be a kind of gold rush among different AI companies and cognitive-science labs trying to figure out the hippocampal system. DeepMind, this is kind of what started them; Hassabis was making discoveries about this. The idea would be that you have this graph structure that originally would have evolved just for locations in meatspace, physical space, but then got repurposed for any kind of space. So this gives you structured state spaces, whether you're navigating through the world or through a conceptual topology. It would be at the top of the cortical heterarchy, and it would have pointers that could be unpacked in terms of these more enactive modes of engagement. There might be another, intermediate level that could be associated with consciousness, where you're getting more concentrated attractor dynamics with potentially a quasi-topographic relationship to events in the world. The principles of graph neural networks might be relevant there: basically you get orders-of-magnitude greater representational efficiency by making the geometry of the system over which you're doing the deep learning resemble the thing that's being modeled. So there are graph grid neural networks that could be used for spatial modeling, or graph mesh neural networks that could be used for modeling the pose of a thing or an object. In theory, something like this might form at the intermediate level, and these might be the ones most heavily coupling with this hippocampal system, which is giving you this syntax of state transitions with structured composition. That's really cool. Can I comment on that for a minute?
Yeah, just go for it, and then Blue. Okay, so there's a lot there that's exactly along the lines of what I've been thinking about. I've written a bit about these conceptual grid-cell-type things, and it seems like there's a lot of evidence for them, which is really awesome. I know Numenta is doing research that's related to this stuff. So if that's what you mean, then I'm totally on board; I wasn't sure if you meant that or something more language-like, with discrete tokens, but that might be encompassed by this as well. It seems like people are also finding encodings of the hierarchical structure of the grammar of language, a correspondence with the structure of the relationships among these bump attractors. That's amazing. I was also going to mention, since you gave all these examples of graph neural networks that are isomorphic with the domain they represent: recursive neural networks used for language processing are similar, where the network topology matches the syntax tree or something like that. So that's very much along the lines of where this is going. Thanks, Alex. So Blue, and then Dean afterwards. I want to loop back around to what Adam said about dreaming, because he totally took the words out of my mouth, but maybe also touch on the question that was asked by the person in chat about mind wandering. So, Alex, your objection to the dreaming and the physicality was that there's no evidence for top-down control during the wake cycle, is that correct? Is that what I heard you say? Sorry, no, I actually meant to say kind of the opposite: the problem with the Helmholtz machine as a model is that it doesn't model top-down effects during waking perception. Right, that's what I interpreted. So okay, if it
doesn't model top-down effects during the wake cycle, is it possible that maybe there are overlapping Helmholtz functions happening? Like, your mind wanders off, or you're not paying attention, and maybe you're consolidating: you've got one set sleeping, one set waking, and maybe there's an overlapping construct of this wake-sleep learning cycle happening while you're out mind wandering. I'm consolidating the information from what I learned at breakfast, or something like that. Yeah, I think this model is adjacent to many really interesting, plausible variations and slight elaborations, and what you're talking about sounds to me like it's in that vein. I saw an amazing GitHub repository of just Helmholtz machine variations, where they tweak certain parameters: let's see what happens if we propagate a few steps down during the wake cycle as well, things like that. So I don't know if I fully grasp exactly what you had in mind, but it sounds pretty good. And thanks for, I mean, you said, go ahead. Oh no, I was just going to say thanks for also trying to address the comment, because I felt like I wasn't able to answer it very well. Nice. And there's definitely wandering in dreams; how is that different from waking wandering? Could that get at what the difference is between sleep and wake, and could this model maybe give us a bit of a wedge to enter into that discussion? So Dean, and then anyone else with a raised hand. So Alex, back to analogizing. I'm reading your paper, and I read something and go, oh my goodness, look, there's the world's biggest nickel. And then I continue reading, and oh look, there's the world's largest corn cob. And then I have a chance to have a conversation with you, and you're saying to me, well, let's bring this back into
some sort of proportionality. So explain to me: what is the two-carat diamond that you were trying to allow readers of your paper to get to? I know I blew it out of proportion, because when I first read it, it was a big deal, but maybe you can help me bring it back into some sort of scale. What was your deep hope here? It's interesting. I don't mean to suggest that you got it wrong with the proportion; you're finding interpretations that are consistent with what I said, so it's not like it's not there. But it's funny: if it seems like I'm trying to put things in proportion or ground them in any way, I feel like others are trying to do that, relating this to embodiment and enactivism and things much more concrete, and I keep trying to keep it abstract. Anyway, I don't know what the hope was here. I think I just really wanted to point to the possibility. It was really about this identity thesis, which I'm not even sure that, a priori, I agree with, by the way. I'm not sure that identity theory is the way to go in understanding the mind's relation to the brain; I'm not really a reductionist. But it seemed like there was a possibility of articulating a quantitative version of that thesis, using the tools that we've developed over the past couple of decades, which Lewis didn't have access to. And I just love David Lewis as a philosopher; I think he had some amazing ideas. It's not so much that I wanted to show anything in particular; it just seemed to me that there was a natural convergence here that could be spelled out. So that's what I was trying to do. Thank you. So Steven with a raised hand, and then anyone else who raises their hand, and also, in the last half hour or so, anyone who wants
to ask a question in the live chat. So Steven, then Adam. I was just tying in a bit with what Adam said about the hippocampus. You've got the place cells in the hippocampus, which gives some way to bring the environment in, potentially, in the semantic sense. I like that; I had not thought about it as semantics, but a semantics of place, if that makes sense. So now we've got our environment, external states, potentially being brought in in a tractable way. Action states could be the grid cells, with the whatness of what's out there, and the sensory states are this lower-level flux coming into the bottom of the brain. And then the generative model would be the higher level; well, it's interesting what the generative model is in all of that, if that makes sense. If we were to play in that sandpit, it would give a way to bridge. I'm a big fan of the hippocampus, and I'd love to see how it all fits in with some of these pieces. I'm just curious whether that kind of role for these grid and place cells could be quite useful for this tractability between higher- and lower-level states. Sure, Alex, if you have any thoughts there; otherwise I'll go to a question from the chat and then Adam. Yeah, really briefly: it seems to me that if we're using these place cells in this way, we've leveraged this thing that was originally meant for physical spatial navigation and transposed it to conceptual spatial navigation, essentially, which is really cool. I could see it being the sort of thing that conditions expectations for the sensorimotor stuff; it would sort of define the generative model. I guess the dynamics at that highest level would be the thing that sets the set points. But I think Adam might have more to say. So, a question from the chat, which is also for Adam: thanks. Adam did mention that the transition process probably involves quantum tunneling. May I ask if he thinks the same for the
transition from mind wandering back to focused attention? So Adam, feel free to take up that question as well as anything else you'd like to add, and then anyone can raise their hand. I guess one of the quantum-tunneling events I would have in mind is that there will be these regimes of sense-making, where the hippocampal system, in collaboration with the entorhinal cortex and the grid cells, will lay down a tiling of some domain within which you're doing this modeling. There, the relationships among these bump attractors can give you structure, the syntax, and through the pointers a kind of semantics, with the relations themselves maybe having a semantics of space. But then, if you get enough prediction error from the overall system, this seems to trigger these resetting events where you break up the frame and do a new tiling. It's like a re-grip on the active inference: a new kind of engagement, a new set of policies that could fit within a different conceptualization of the state space you're working with. These events, I guess, you can maybe think of as the quantum tunneling. I don't know if this works, but it's like what are called Lévy flights: an animal foraging in a patch, and then it goes, eh, and just jumps to another patch once it's not getting enough. So you're imagining, simulating, this place, and then you're like, okay, a little bit bored, not quite the thing, and now you're imagining something else. I guess you can think of those as quantum tunneling: you're not just working within one regime of simulation, fixing that thing; you're moving from here to here to here. But the question I raised my hand for was: so there's a
sense in which you might think of predictive-coding-style models of cortex. I know it's more complicated than predictive coding, but there's a sense in which something like that is probably true, because if you have a primarily suppressive regime, you end up inducing sparsity, and if the events in the world are clocking slower than your internal updates, you come out ahead. I mean, that's why they did it for video coding: it's just more efficient. So I'm wondering, is that enough to give a psychophysical identity mapping? There's a sense in which we'll minimize with predictive coding mechanisms to the extent we can get by with just those; they should minimize activity. Yeah, that's actually kind of where I'm coming from with this paper, to a large extent. I was thinking about predictive coding as a way of understanding neural networks. Although this perspective can encompass active inference and such, I think the core of it is that the brain is literally trying to, as you said, induce sparsity, minimize activity: only signals that need to be passed forward get passed forward, and you're quelling the activity that doesn't. I don't know if that's strong enough to get you an identity, but I think it's consistent with the identity idea, and it's a large part of the inspiration for me. Awesome. Steven with a raised hand, and then anyone else. Yeah, could you just speak to how you see predictive coding, predictive processing, being more or less useful at certain times as the discourse, and when maybe active inference is not as needed, maybe thinking in applied contexts? I'll just be curious. Yeah, it seems to me that the main practical advantage of active inference is modeling
explicitly how policies, actions, are chosen: decision making, based on how you expect your actions to change the way states evolve. I think that happens implicitly in a well-trained predictive-coding architecture, and I don't see any fundamental theoretical advantage to active inference there in terms of explaining how things work at an in-principle level. But that said, it's really hard to just train a recurrent neural network or whatever to do what you want, and then interpret its states and so on. So if you have an idea of what the agent's generative model should look like, it makes much more sense to write it down a priori and try to build an active inference model than to just learn it from data. That's as close to an applied setting as I can get, but I hope that speaks to it. Yeah, let me try to get another view on that from you, Alex. Where would you put active inference in relationship to supervised and unsupervised learning, for those who might be less familiar with those topics, when you're talking about learning from the data versus a generative model? Yeah, that's a great question. It's a bit tricky, because there's this issue of preference learning that's kind of not settled, as far as how much of a feature of real biological agents' learning we think it is. To me, the most natural way of thinking of unsupervised learning happening in active inference is that you've got a phenotype, some aspects of which are going to be immutable over relevant time scales, but some aspects of which will change. So if we're talking about synaptic weights, you could think of those as an extended part of your phenotype, and they will change over the time scales of perceptual learning, during the organism's lifetime and such. So
that, I think, you can neatly categorize as a form of unsupervised learning. But again, there are people who are active inference practitioners who don't really think that preference learning is a thing that happens, or it's not clear what role it plays. Overall, though, there's also got to be a role for something like supervised learning, and the closest thing we have to that in real systems, it seems to me, is reinforcement, in a way; that's certainly got to play a role complementary to unsupervised learning. I don't know, this answer is kind of shit, I'm sorry; I'm happy to take it up more. I'm just not sure how to tease apart the parts of active inference that should be understood in terms of unsupervised learning versus learning in general, and I suspect a lot of learning in general has got to be unsupervised. I think active inference is naturally interpreted that way; predictive coding is certainly an unsupervised learning paradigm, and to the extent that active inference reduces to it, it is too. Great, thank you, and these are definitely all topics we're always trying to think and communicate more clearly about. So Steven and then Adam. I think this is a useful conversation. I know it's a bit off-topic and puts you on the spot a little, but it's quite useful, this question of when a generative model comes into play. With reinforcement learning, I suppose there's a generative model in relation to inductive reasoning in terms of the goal: you're trying to narrow the gap between the target you're reinforcing, the reward, and what you have. Whereas there's this abductive potential in active inference. And I wonder whether, in the populating of the brain's
attention, place cells and the more granular parts of the lower levels of the neocortex, if that's the right term, are basically that kind of more unsupervised, very abductive way of engaging, and then the grid cells, as they get more populated with what things are, become more top-down, more like reinforcement learning or supervised learning. And then, so to speak, you just go straight to the point, which probably I should do now and let someone else take over. Thanks, Steven. Alex, if you want to add something; otherwise we'll go to Adam. Yeah, just to say that kind of makes sense to me. I think you need unsupervised learning for representation learning of whatever isn't innate, and then once you have that framework, you can start to think about reward, meaningful interpretations of sensory stimuli that you associate with rewards. So that's all safe for now. Adam. A quick comment on what Steven said, then a question. Interestingly, the hippocampal system does seem to have this very contrastive aspect to it, comparing between current states and imagined states; in the different duty cycles of theta, there seems to be this comparison operation going on. So it could very much be this highest-level supervisor in these matryoshka dolls of quasi-homuncular things we're thinking of. But the question I was wondering about: there are people who don't believe preference learning is a thing? Oh, I shouldn't put anyone on the spot. It's not necessarily that they don't believe it's a thing; it's that it's not clear we need to invoke it to explain behavioral things, at least in many cases. So I
feel like I'd be speaking for colleagues of mine whose views I don't want to misrepresent, so I'll shut up about it, but there's definitely controversy about the extent to which it happens. I suppose I'm wondering: there are things like meta-learning principles, where you're figuring out new ways of approaching policies, realizing goals and building up these policies in an exploratory fashion as part of your phenotypic plasticity. But I guess there is a sense in which, if we're subscribing to the free energy principle, you can only prefer one thing and that thing never changes, and everything else is just instrumental. It could just prefer to be, basically, Agent Smith for the entire world: my pattern, me, me, me. Though ultimately this should be in terms of the evolutionarily favorable equilibria, it seems like there is a sense in which that never changes. Yeah, let me say one thing to try to defend the people I'm talking about. I said there's a controversy about preference learning, but there's clearly something that looks like preference learning. It's just a question of whether you're really learning preferences, or you're just learning new associations with the things that you prefer; those two processes are conceptually distinct, even if they might lead to similar results. But the point you're raising is interesting: is your preference really, literally, your phenotype? So I have a question for Adam; that's exactly where I was going to go. Could there not be a cultural scaffolding, developed over ecological and evolutionary time, that values diversity of perspective? So yes, it's true that at different scales some type of active inference could be happening, but I don't think that Agent Smith
would be the only attractor at the bottom of that bowl. It'd be more like a plate with a lot of different wells, a lot of different kinds of food that could be converged upon. Why would it have to converge on one specific, especially Hollywood-represented, version of what efficiency is? Just a thought, but I don't think Agent Smith would win; villains usually lose, you know. Steven? Well, this could be, though, when you get into that non-equilibrium versus equilibrium dynamic: a dictator who establishes an equilibrium, an energy equilibrium, is going to be more powerful than the more subtle, organic non-equilibrium states; it just dominates at times. And if an equivalence is made, in some near or distant future, between actual energy usage and governance, for example some sort of crypto system where energy and voting and information and finances are all mixed together, it'll be an enormous nexus of power, and maybe we'll be able to use these kinds of frameworks to describe it. Steven? And going the other way, then: this is my belief, and I strongly believe it: to create community engagement in meaningful ways, you have to create the container for those more subtle non-equilibrium states to have a chance to breathe. Ariane Mnouchkine, a theatre director in France, talked about this. Don't crush the butterfly, she would tell the actors. The butterfly is the idea, so don't grab hold of it, because you'll crush it. It's quite a nice metaphor: be light and allow it. Because equilibrium dynamics, or someone who comes in and basically starts shouting at people, will always dominate over someone who's trying to say, let's all come together and find each other's plurality. So that dynamic ties in with what you're saying. Cool. So I flipped to slide 49, where we have our usual set of closing discussions. In these last ten minutes we'll just be remarking on what
we thought was interesting or important from the conversation, and what we'd like to pursue next in terms of our own work, or, Alex, in terms of your research. I'm sure we're all curious as well about how you apply these ideas, whatever non-proprietary applications you think this bears on; if none, that's its own interesting claim, or maybe there is something this helps us do in terms of practice. What are the next steps here? Maybe Alex first, and then anyone else can give a thought. Sure, I could address these questions on the slide. Yeah, let's hear your answers to those. So, this also speaks to what you just asked, but I think this is really just a way of thinking about things, and the main thing it enables is allowing you to infer from what you know about cognition, or the cognitive states of some system including yourself, to physical states. If it's true, then, for example, you can say: well, I'm depressed, there's not a lot going on, I'm ruminating a lot; you can infer that there's something physically happening that mirrors that process, and you can take either a physical or a psychological approach to solving any problem in your body, in theory. Now, there's a lot of working out to be done of how that would go in specific cases, but basically smoothing the path for inference between physical and psychological states is, I think, the main thing that's practically important about this. I'm still curious about how to spell that out for specific types of states, like depression, desires, things like that. I'm also curious about whether any of it is true at all, or whether it's just complete nonsense. Big questions. Adam, and then anyone else who wants to give a thought or address one of the questions on the slide. If it's complete nonsense, I'm not sure what would make sense to me. But later I would
really want to talk to you about the degrees-of-freedom and volition issue, and the different ways that could play out. And I want to drag you into consciousness, because you are precise, my friend, and you're helping me think better. Cool, thanks Adam, happy to. Cool, if anyone else has a thought; otherwise we can look back. Steven, go for it. Yeah, I'll just say thank you for bringing this foundational build-up into the thermodynamics. I did physical chemistry, so I've got some background, and I'm curious to see if there are ways to bring some elements of Gibbs free energy into this, so I'll be glad to chat about that sometime. Yeah. Blue? So when you guys are done talking about degrees of freedom and agency, you have to come back to the chat, because I want to hear all about this; I have some thoughts that have been cooking about agency related to active inference and other ideas. Also cool. And maybe Dean, I'd like to ask: what do we do at the end of the dot-two? We took that step to the dot-zero with yourself and Blue and I, and then we take the dot-one and dot-two steps, ideally with the author, especially when it's an awesome author like Alex who engaged with the material before the dot-one and between the two. What's our next step on that staircase? What do we do after the dot-two? How do we make the best of this? Well, this is a sample of one, but for me it's: how do you remain stable and dynamic at the same time? How do you move through these gradients, whether ascending or descending, and also keep in the front of your mind how you're going to keep your grip if you're ascending, or maintain your balance when you're descending? For me, that's what Alex's paper essentially pointed out: there is a physical piece to that and there's a statistical piece to that, and I don't want to become a statistic, another old person who falls down the stairs. So how do I pull all of that
together and have it make sense, especially if I don't want to give somebody a training exercise, if I want them to be able to model how to get down that descent or up that ascent and I can't give them a map? What do I give them? What do I replace that with? That's what Alex's paper gave me: a little bit of confidence. Awesome, one more staircase thought, coming back as well to the uphill and downhill. When you're going up the staircase, you grab onto a certain spot on the rail, and that helps you ratchet up. When you're going down the staircase, you're unlikely to grab on tight to the rail; you want to loosely have your hand sliding down it, so that you can ratchet if you need to and stabilize, but if you ratchet while you're going down, it's going to prevent you from continuing this natural descent. So it makes me think about different modes in understanding, especially because, as we talked about with the mountain climber, if the top is your goal, they're both going downhill in their own way. When in our process do we want to grip the rail lightly, just like Steven said, not crush the butterfly, just slide along and go with the flow, knowing that we can stabilize when we need to? And when do we really want to grip and ratchet, because we're working against one energy gradient but toward our goals when we do that? So I hope all of these mixed metaphors come together for the listeners in an enjoyable way. And Alex, we really appreciate your engagement with the Active Inference Lab here, and everybody else, awesome conversations. Yeah, thanks, thanks Blue, Dean, Daniel, Steven, Adam, and people who were here last week also. I'm happy to be put on the spot; I'm glad this was of service to people. I think it's been a really interesting discussion, and thanks for having me, thanks for all that you do. Great, well, everyone's always welcome, and we'll see you all in another stream. Bye.