Hello and welcome everyone to the Active Inference Lab and to the Active Inference Livestream. We're here in Active Inference Livestream 17.0 and it's March 3rd, 2021. I'm Daniel and I'm here with two of my colleagues who can introduce themselves. Blue. Hi. My name is Sarah Davis. I'm a person who's interested in things. I'm a student at Leibniz University in a master's program in philosophy of science, and I've been an engineer and a bunch of other things, so I'm just generally interested in how it's all connected. I'm Blue Knight. I'm an independent research consultant based out of New Mexico. Cool. Well, we're here for a fun discussion. So thanks both for joining. At the Active Inference Lab, we're an experiment in online team communication, learning and practice related to Active Inference. You can find us at our links and socials. This is a recorded, archived, and hastily produced livestream, so please provide us with feedback so that we can keep improving on our work. All perspectives and backgrounds are welcome here, and we'll follow good livestream etiquette. So here we are in paper number 17, and we've just completed our first quarterly Active Inference Lab round table, so check that out on our channels if you haven't. And today we're going to be setting the context for 17.1 and 17.2. The paper that we're discussing in 17 is Information Flow in Context-Dependent Hierarchical Bayesian Inference by Chris Fields and James F. Glazebrook. It was published in October 2020. And what's funny is that the .0 videos have always been about context, setting context. We even wrote in the previous videos that the .0 video is an introduction and context; it's not a review or the final word. And then it got really meta with this paper, because it's a paper about context.
The punchline, or a punchline, of this paper is that we can integrate various approaches, mostly from mathematics, to achieve a scale-free formalism that leads to an interpretation of nested systems where communicating systems are engaged in active inference and distinguished by informational or statistical patterns. Or they're engaged in something that's related to active inference, let's say, because it's pretty out there with what it is. And it's a challenging but also a fun paper. So we're going to start with the goals and the claims and the abstract. And at that point, it might be like, wait, what? With the roadmap headers or with the abstract claims? It's like driving through a country where you don't speak the language, perhaps. And that's okay, because the bulk of this video, in the keywords, is going to be going through and kind of unpacking a lot of the key terms that matter for even just understanding the lay of the land. And then we've pulled out a few other key topics and some quotations. We didn't go as much into the formalisms of the paper itself, but we're looking forward to doing that with the authors in 17.1. In 17.1 and .2, we'll be discussing the same paper, so read it, save and submit your questions, and get in touch if you want to participate. And also we're looking forward to your chats during this presentation, giving us a little support along the way. So here we go with the goals and claims of the paper under discussion. Maybe, Blue, do you want to read the authors' goals and claims? Sure. So the goals for the paper are twofold. First, to model intrinsic or true contextuality using the general category theoretic methods of Chu spaces and channel theory. And second, to employ this formulation to reconstruct hierarchical Bayesian inference in a context-dependent way.
It goes on to say: here we have formulated an approach to intrinsic contextuality in general category theoretic terms. A set of observations exhibits intrinsic contextuality if no cocone can be constructed over the observables that produce them. Cone, cocone, who knows how to say it. So they're doing a couple of things here. They have two main goals, so it's great that they're laying it out clearly. The first one is about context. They want a more advanced way to talk about context. It turns out that they're going to be doing that using category theory, Chu spaces, and channel theory, all of which we're going to go into at least a little bit. And then secondly, they're going to be linking all of this stuff that they just did up here, with context, Chu spaces, channel theory, category theory, and they're combining that with context-dependent Bayesian inference. We've talked a lot about active inference and Bayesian inference, so this is sort of the hinge where we connect some of these ideas to active inference. And then this is a quote from their conclusion, which is that they have formulated a new way of modeling contextuality in very general category theory terms, specifically relating to this cocone construction. And the category theoretic framing allows them to move between a few different areas that we're going to be walking through today. So here we go with the abstract. I'll read the first of the three slides of the abstract, and then anyone can give any thoughts or just start reading the second part of the abstract. Recent theories developing broad notions of context and its effects on inference are becoming increasingly important in fields as diverse as cognitive psychology, information science, and quantum information theory and computing. Here we introduce a novel and general approach to the characterization of contextuality using the techniques of Chu spaces and channel theory viewed as general theories of information flow.
This involves introducing three essential components into the formalism: events, conditions, and measurement systems. Any thoughts, either of you two, or someone can just start reading the second part of the abstract. Let's go to the second part of the abstract. I think we'll have a lot of time for off-the-cuff remarks soon. So Sarah, do you want to go for the second part of the abstract? Sure. Incorporating these factors in relationship to conditional probabilities leads to information flows, both in the setting of Chu spaces and channel theory. The latter provides a representation of semantic content using local logics, from which conditionals can be derived. We employ these features to construct cone-cocone diagrams, commutativity of which enforces inferential coherence. With these we build a scale-free architecture incorporating a Bayesian-like hierarchical structure, in which there is an interpretation of active inference and Markov blankets. Nice. And then, Blue, how about finishing out the abstract? We compare this architecture with other theories of contextuality, which we briefly review. We also show that this development of ideas conveniently accommodates negative probabilities, leading to the notion of signed information flow, and address how quantum contextuality can be interpreted within this model. Finally, we relate contextuality to the frame problem, another way of characterizing a fundamental limitation on the observational and inferential capabilities of finite agents. Perfect. So that is the abstract. That's how the authors wanted to present and communicate their work to a broad audience of specialists and non-specialists. And again, we're going to be going into basically all the words that might be non-obvious in this abstract, and the keywords that come into play. That's what we wanted to be unpacking today. So we're going to look over their roadmap, because just like the abstract is how they wanted to communicate what they did.
The roadmap is how they've chosen to organize what they've done and get us from A to Z. And then we're going to be really going into this road trip and unpacking what some of these key terms are. It starts with an introduction, which I guess is helpful context. And then one of the focal ideas of the paper is this Chu space, or the Chu construction, which we're going to be talking about: the history, some examples, some related topics. And just right off the bat, we saw that the examples of Chu spaces and Chu morphisms included topological spaces. So that's all of network theory and graphs and things like that. So that's one big area, but we're going to be talking about Chu spaces as something that connects all of network theory and topological spaces to types as processes, probability spaces, Bayesian belief networks, event space structure, Chu flows, Kullback-Leibler divergence and the free energy. So we're going to be popping up a level from things that are already pretty abstract, but it's going to be done in a way that's quite useful, especially for formulating context. And it's actually really interesting, because a lot of times people will say that these topics in the examples are general, but they actually don't have enough context. For example, you can make a social network, but that wouldn't be the total context of the world if you only have some of the users on the social network. So maybe there's a way where we're going to be able to step up from some of our previous conceptions and introduce context, potentially by default. This leads the authors to a discussion of probability and how to use probabilistic models for contextuality, for modeling context. And this comes to a recently derived framework called Contextuality-by-Default, which will be a topic we dive into.
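To make the Chu space idea slightly more concrete, here's a tiny sketch. The names and data here (points, states, the matrices) are made up for illustration, not the paper's notation: a Chu space is a set of points, a set of states, and a matrix relating them, and a Chu morphism is a pair of maps going in opposite directions that have to agree through the two matrices.

```python
# Minimal sketch of a Chu space over {0, 1}: a set of points, a set of
# "states" (or attributes), and a matrix r(point, state) -> {0, 1}.
# All names and values here are illustrative, not from the paper.

# Chu space A: two points, two states
A_points = ["a1", "a2"]
A_states = ["x1", "x2"]
r = {("a1", "x1"): 1, ("a1", "x2"): 0,
     ("a2", "x1"): 0, ("a2", "x2"): 1}

# Chu space B: same shape, for simplicity
B_points = ["b1", "b2"]
B_states = ["y1", "y2"]
s = {("b1", "y1"): 1, ("b1", "y2"): 0,
     ("b2", "y1"): 0, ("b2", "y2"): 1}

# A Chu morphism (f, g): f maps points forward, g maps states backward,
# and they must satisfy the adjointness condition r(a, g(y)) == s(f(a), y).
f = {"a1": "b1", "a2": "b2"}   # forward on points
g = {"y1": "x1", "y2": "x2"}   # backward on states

def is_chu_morphism(f, g, r, s, A_points, B_states):
    """Check the adjointness condition for every point/state pair."""
    return all(r[(a, g[y])] == s[(f[a], y)]
               for a in A_points for y in B_states)

print(is_chu_morphism(f, g, r, s, A_points, B_states))  # True
```

The point of the two-directional structure is exactly the "context" flavor: points and states constrain each other, and a morphism has to respect both directions at once.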
Section four goes into some issues related to measurements, and kind of the before, during, and after of measurements: you've got to set things up and calibrate, then you've got to make the measurement at a certain moment, and then you have to analyze it in the context of a certain system. So it's like before, during, and after the measurements. In section five, they head into channel theory and information channels. And this was really fascinating, because it's quite distinct from Shannon information theory in some ways that people might not expect. And I had never heard this discussed alongside Shannon information theory. It's like, here's information theory and it's good at ABC, but it's not good at these three things; that's where the discussion ended. I didn't know that there was another area that kind of picked up where Shannon left off, in a sense. And that's related to this idea of an infomorphism, as well as classifiers, conditional probabilities, and this cocone diagram. Thinking about information channels and flows of information leads to this formulation of context-compliant, that is, context-modeling or context-respecting, hierarchical, multi-level, multi-scale Bayesian networks. This section has a lot of technical detail, but it involves making a connection between Bayes network approaches to probabilistic modeling, statistics, and inference, and context. This leads to a sort of conclusion or climax section, where the hierarchical Bayesian networks that were just introduced in six are brought into the context of two specific topics, which are, number one, the idea of active inference. I highlighted it because that's kind of our lab; that's what we're here to be thinking about. And then two was the frame problem, which is something that's also really interesting to think about. It's going to be the key motivator.
So although they put it last, with active inference and the frame problem, we're actually going to have those topics closer to the front of the presentation. So people who are listening to this from the active inference perspective, or more generally from the frame problem perspective, are going to understand what motivated us to be studying the paper the way that we're doing it, and hopefully what motivated Chris to select this paper when he suggested it to be discussed during his visit. That's the sort of naked roadmap: turn left at this part and turn right at this part. But there was a narrative roadmap that the authors actually put out. Their own words are on the right, so you can see that in the paper or here. But on the left, I thought of this as like the semantic roadmap, as far as the flow of the paper. I just extracted this semantic flow from the syntax that they wrote, not super knowing what all these topics are, but just rephrasing what they had put. Previously, these authors had been researching quantum theory and the thermodynamics of separable systems. In this paper, they started with a recap of Chu spaces and Chu morphisms, which suggests that it's a foundational topic for the paper. They then reviewed contextuality paradigms and the diversity of approaches that model contextuality. They then took a formal excursion into various ideas of probability theory, above my detailed understanding, but they extended in that direction. They then turned to the semantically richer methods of channel theory. And they're using channel theory as an even more general approach to inference than syntactic information theory, like Shannon information theory, which we'll get to. They wanted something a little bigger. Thinking about these kinds of local logics of information channels leads to some new ways that they combined ideas.
And then they visualized, slash represented, this as this cone-cocone diagram, as they say, as a natural scale-free representation of Bayesian inference. Scale-free just means that it doesn't have an a priori scale. It's not in inches or in kilometers; it's just at the level of abstractness. And then they have the claim, which is proven, that the failure of the CCCD, the cone-cocone diagram, to commute diagrammatically is a condition of the measurement operators' non-co-deployability. So it's just sort of like, if you have two shapes and they fit together, what they represent fits together. And if you have two shapes and the puzzle pieces don't fit together, something mathematical that the shapes represent isn't fitting together in the way that it would if the two shapes fit together. I don't know exactly what non-co-deployability is, but it's like the puzzle pieces fitting if the operators were co-deployable. And then the part that I think will be of special interest to a lot of our lab and colleagues is that they're going to recover this new theoretical and inferential role for the Markov blanket. Part one: it's the locus at which free energy is minimized. And part two: it's the epistemic barrier that keeps contextual information hidden from the observer. So when I saw that I was like, whoa, it sounds fascinating, because it's two distinct and critical roles for a blanket construction. And then I thought, so contextual information is hidden from the observer, because a lot of times we might think about context as being like what's around us. But we'll see if that's actually the way that context is being used. Can I jump in for a second, Daniel? Yep. So the way that I understood co-deployability, like measurement operator co-deployability, is that you can't at the same time measure position and velocity. That was one example that they gave in the paper. Because if you're moving, you're not at a stable position, right?
So those two measurements are not co-deployable. You can't measure those two things at the same time. And that's kind of what they meant about measurement co-deployability, I think. Cool. Thanks for that. Here's why it matters, before we even jump into the keywords, which will be the bulk of our journey. And these are quotes from the authors. While the results presented here are primarily technical. So just take a breath if it's worrying you, because they wrote the paper so we don't have to, and it's a technical paper. And that's what we're learning by doing and exploring. The paper is technical and abstract, but now we're looking for where we can go. It's like a car is technical, but it lets you get somewhere. So where are we going and why will it matter? These results open a path to application in a number of areas where contextuality has largely been neglected but is important in practice. That sounds really important, because it sounds like there are systems that are unknown unknowns; we don't even know what to model. But there might be a large category of systems where we think we've modeled it well, but then there's this problem where context keeps on entering the picture, or something that we didn't model throws off our local model that we spent time and energy on. So if we could find a way to respectfully interface with contextuality across different areas, that would be really important. Well, what are those areas? One of the domains that they're interested in, just like Friston and a lot of active inference researchers, is neuroscience. And so they suggest that the current approach they're taking takes a previous model, which is the global neuronal workspace model. And that's actually related to what Ryan Smith and Christopher Whyte will be talking to us about in number 18.
They're going to make that model contextuality compliant and provide the building blocks needed to incorporate context switching and attention switching into global neuronal workspace models. So it's kind of these two cool parts of the brain: it can spiral in and really hone in with attention, but then it can also move attention. So it's like side to side, and in and out. And that sort of context and attention switching is not dealt with well by our current models, which might be about attention or inattention to a stimulus, but aren't actually modeling what the regime of attention is. And then there's a formal result, which might matter to some people more than others, that contextuality can always be associated with this sort of non-commutativity, kind of related to the puzzle pieces and the context we just mentioned. And then this is interesting. From a more probability theoretic point of view, our results further support the authors of Contextuality-by-Default in showing that the distinction between quantum and classical probabilities lies not in any ontological difference, but rather in what has been explicitly labeled. So this was really mind-bending, because a lot of times it's like, quantum, you know, the theory of small things where there's an observer that has an effect on the outcomes of how the system is modeled or set up. And then the sort of perennial debate is like, who's an observer, or does it matter if it's a conscious observer; you're going to go down a rabbit hole, because there aren't simple answers there. But the behavior of the particle, it turns out, depends on how you structure the context of the experiment, just as is the case for macroscopic entities. So, in other words, is there a continuum of tiny things and bigger things, with context being what distinguishes the behavior that can seem quite different across scales?
So it's not that electrons are magical and can do wave-particle things while nothing else exists in a similar regime. It's not that we're going to get the same diffraction from people walking through a doorway. But then again, maybe if it's set up the right way, there's more similarity than not. So a few opening questions before we jump into the keywords. First, how will we connect across formal and informal, or just plain different, senses of the word context? And then, what are some of these cases, whether they said it in the paper or not, where contextuality has been largely neglected but is important in practice? So even within an informal context definition, what's an area where potentially context is not taken into account? I'll start with one and then either of you could go. Maybe cultural context for understanding geopolitics, or individual context for an individual relationship. So it does matter, but we often approach those sorts of individual or collective level problems as if we can ignore context. Again, just using the word informally, we're not really operating in the space that the system is in; we're operating in a reduced or a distorted model that we're projecting. Hey Daniel, I don't know if this is going to mess up your video setup, but if you're able to go into present mode on your slides, it would make the whole thing a lot more readable for a small screen. In the live stream, which you can look at and mute in the background, it looks different. But yeah, we've got a lot of small text on these slides. Yeah, it's so worth unpacking that last bullet point; there's just a lot there. So how are you reading it? When I read this, you know, he's saying ontological difference versus what's explicitly labeled, I immediately just think, well, of course, because you're either looking at small scale or you're looking at macro scale. But you think he's saying something a little more subtle.
I would love to have a quantum expert, but just speaking with our friend Jason Larkin, he has often pointed to these sorts of almost classical-meets-quantum crossover situations, where you get the classical, quote, behavior when you define the full system with the quantum observer; then it behaves. Let's just use the twin slit, for example. The system behaves a certain way when you define it a certain way, with the observer in the picture. But when you don't define the observer, you're just talking about a different system. And so that might not be something that only electrons exhibit. It might be a feature of certainty and uncertainty in measurements, and that would be various kinds of measurements; it wouldn't have to be just things that are smaller than a molecule. Okay. That's how you're reading it. Yeah, that makes sense. But that's my non-technical, physics-for-biologists take. Any thoughts on this, Blue, or we can go to the keywords. Okay, so I do have a little bit of thoughts. It's tied into the quantum, but it's also tied back into Theorem 7.1. So you had skipped over this word that I think is kind of critical and important in the previous bullet point, which says our main formal result, Theorem 7.1, shows that intrinsic contextuality can always be associated with non-commutativity. So non-commutative means you can't construct the big cocone diagram when it doesn't commute. So I think it's important to highlight that that kind of contextuality is like where they were describing the context for something that's beautiful. You know, a beautiful what, right? Like a beautiful leaf is not the same as a beautiful person. If that leaf looked like a person, it'd be a really ugly leaf. Or if a person looked like a leaf, it'd be a really ugly person, right? So beautiful is always context dependent like that, a beautiful what; beautiful does not always have the same referential meaning.
And so I think that this is really important and ties right into the quantum aspect of things, right? Like how they were talking about Alice and Bob with the machine, right? So when you can measure, it's always: what is the explicit context, right? So the quantum is tied into explicit context. When we have a set of variables that we measure for every circumstance at precisely the same exact time and under precisely the same conditions, that is extrinsic context, right? And it's different and separate from intrinsic contextuality. Sorry, I just wanted to point that out. You might be saying something that I'm already about to say, or maybe you just triggered my thought. But yeah, it's great, thank you. This intrinsic versus extrinsic: what it made me think of was that one thing is quantifiable and the other thing isn't. It's like a gradient; that's essentially what you're saying. Yeah. Okay. It's great. Thank you. Fun. Well, let's get to the keywords. This was definitely one of our, you know, even though you look at the keywords and think, oh, it's the ABCs, how hard is it going to be? It was really fun and a good learning experience for all of us to go through these topics. And it includes active inference and then Bayesian inference, which are things that we've definitely talked about before, as well as Markov blankets a little bit. However, the other words were basically all new to us, not just within active inference. So that's where we kind of started this one. And that's why we want to include people in the conversation and just see where everyone is at in learning these topics. So: contextuality, Chu space, channel theory, information flow, local logic, cone-cocone diagram, frame problem. And then it wasn't a keyword, but kind of related, is concurrency and concurrent processes.
And I always wonder, why is the second C capitalized? Let's motivate the paper not with what Chu spaces are, which is how the paper begins if you were to read it, but with the problems and the contexts that we're looking to apply it in, which are the frame problem and active inference, and thinking about something that most of us are more familiar with, which is Bayesian computation. So the frame problem is, using their words, the problem of circumscribing what does not change when an action is performed. Viewed broadly, it's the problem of circumscribing what is relevant in a situation. It's kind of like if you want to say, well, what are all the connections in the social network? It's the shadow of: what are all the non-connections? Because those two answers are arrived at at the same time. And if you have no false positives and no false negatives, it's the same to know what has changed versus what hasn't changed. So when we're thinking about scenarios, those two are equivalent. Instead of just reading this for a first pass, how would either of you summarize the frame problem, or what is something that makes it valuable or important for study? Okay, think about it and then raise your hand while I go through these. The issue with the frame problem, and the reason why people are working on better solutions, is that, let's say we want to understand what does or doesn't change when an action is performed. So we're making a self-driving car and we want to understand how something changes or not when the scene changes. In order to look for all the changes, aka what hasn't changed, it's really hard to know when to stop or what has changed. There's just an exponentially large number of things to check, and you can't brute force it. So there are heuristic solutions: solutions that aren't exact, but get you on the right track.
However, the heuristic solutions, which often work extremely well, like frame differencing, like if something doesn't move between two slides, you just cancel it out. And if it does move, you're going to do some other algorithm on it. It's a good heuristic, but it assumes a coherent context. It assumes that what is observed could in principle be expanded to include all that there is to be observed. So if it's an image and it's a 64 by 64 image and I do frame differencing and we're only studying the image, then I have captured everything that's changed. But when there are systems with intrinsic contextuality, this coherence assumption is violated. If there is intrinsic contextuality, i.e., some of the observables are non-co-deployable, the current context cannot even in principle be expanded to include everything that's observable. So, Blue, what do you think about that or the frame problem? So one thing that I thought was cool about the frame problem is that they had this corollary 8.1. I don't know if you have this up there, but when they're discussing the frame problems, it says a distributed information flow system can be informationally un-encapsulated only in absence of intrinsic contextuality. So what that really meant to me, they reiterated practically that this means that proving the absence of contextuality in a domain requires discovering all of the information that is relevant to solving problems in that domain. So the frame problem can only be solved if it has already been solved. It's like having ground-truth data. If you're trying to predict something, you can only predict if you've been given a ground-truth to predict from or by. Do you need me to repeat that? Yes. Super interesting. Has it been solved in the past, present, or the future and we're just figuring it out again or something? 
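As an aside, the frame-differencing heuristic mentioned above can be sketched in a few lines. The frames and threshold here are toy values, just to show how the heuristic implicitly assumes the frame is the whole context: anything outside the image simply cannot show up as "changed".

```python
# Toy frame differencing: flag pixels that changed between two "frames"
# (here just 2D lists of grayscale values). The heuristic assumes the
# frame is the whole context; anything outside it is invisible to it.

def frame_difference(prev, curr, threshold=10):
    """Return a binary mask: 1 where a pixel changed by more than threshold."""
    return [[1 if abs(c - p) > threshold else 0
             for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]

prev = [[0, 0, 0],
        [0, 50, 0],
        [0, 0, 0]]
curr = [[0, 0, 0],
        [0, 0, 50],   # the bright pixel "moved" one step to the right
        [0, 0, 0]]

mask = frame_difference(prev, curr)
print(mask)  # [[0, 0, 0], [0, 1, 1], [0, 0, 0]]
```

Everything the mask doesn't flag is treated as "unchanged", which is exactly the coherence assumption: the heuristic only works if the 3-by-3 frame really is everything there is to observe.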
So it's essentially like you can't go forward solving it and this goes into, and I know we're going to get into the computing analogy later on in the paper, the reservoir computing, but it goes into that like without ground-truth data, your neural network is useless. You can't predict the future based on something that if you don't have the context for it, and not the intrinsic contextuality, but you have to have like a real contextuality, like know all of the variables that are important for the problem. Anyway, sorry. That's my thoughts on the frame problem. Cool. And at the bottom here, how would domain-specific or general solutions or simply better heuristics lead to real-world implications? Well, we can probably think of a lot of software and hardware and hybrid systems that need to ask themselves what is relevant in a situation. So that's going to be a recommendation engine, a self-driving car, whatever it is. We want to be modeling relevant aspects of a situation. So that is the big, and it also has implications that each person is going to have their own take. So, you know, put it in a live chat or the comment, but it's a cool topic. Of course there's going to be new places of bias that'll be kind of interesting too. Yep. Any other thoughts on frame problem, Sarah? No, that's it. So, Bayesian inference, another big topic, but we can start it with a fun little meme. So on the top right it says, okay, gang, let's see what deep learning really is. And this is like Scooby-Doo-Bee-Doo, where are you? Meme genre. And so we're going to find out what deep learning is at the end of the episode. We unmask the villain. Oh, it's Rev Bayes. So this is sort of hinting at the idea that Bayesian inference or Bayesian thinking, Bayesian statistics is underlying what deep learning is. What is Bayesian inference? So there's two ways we're going to show it here. 
There's the equation in its sort of most simple form on the top left, and then a more colorful, unpacked version that's made a little bit closer to natural language. So there are the equations, and then we can also look at it in a graphical way. And what the equation, as well as the graphical representation, are getting at is the idea that there's a prior, which means before, and so there's a prior expectation. And it is some type of distribution. It could be tight or wide; it could be different types of things. And there's no uninformative prior. Even if it's flat, where you're saying every outcome is equally likely, that's not the same thing as unbiased. So this really gives a better lens to talk about bias, and implicit or explicit bias, because we can actually talk about priors being updated, rather than about who is or isn't coming to the table with no prior expectations. That person doesn't exist. Priors are transformed through measurement into posterior estimates. So it's like reality is providing stimuli, and then the prior gets dragged along, either a little bit if the learning rate is low, or a lot if it's an overwhelmingly powerful stimulus; it gets pulled a little bit or a lot toward the data. And what else would you say about that, Blue? Nothing else. I think that's the deal; it was pretty well described. What about you, Sarah? I mean, the only thing I thought about was a little vague. I was just looking at some YouTubes on Koopman embedding and model discovery. And when I see these Bayes diagrams, I'm often like, oh, okay, well, they have some data where they presuppose what is connected to what, prior. And with this other method that I was looking at, it seems like that's not the case; they just feed it into this gigantic structured AI. So I'm just thinking about Bayesian inference in a lot of different ways. Cool.
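That prior-gets-dragged-toward-the-data picture can be sketched with a conjugate beta-binomial update. The numbers here are illustrative, not from the paper: a small batch of data pulls the estimate a little, a big batch pulls it a lot, overwhelming the prior.

```python
# Beta-binomial update: the prior Beta(alpha, beta) gets "dragged toward
# the data" by adding observed successes and failures. More data pulls
# harder. All numbers below are made up for illustration.

def beta_update(alpha, beta, successes, failures):
    """Posterior hyperparameters after observing the data."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# A weak prior centered at 0.5
alpha, beta = 2, 2
print(beta_mean(alpha, beta))        # 0.5

# A little data pulls the estimate a little...
a1, b1 = beta_update(alpha, beta, successes=3, failures=1)
print(beta_mean(a1, b1))             # 0.625

# ...a lot of data pulls it a lot, overwhelming the prior
a2, b2 = beta_update(alpha, beta, successes=300, failures=100)
print(round(beta_mean(a2, b2), 3))   # 0.748
```

With 4 observations the posterior mean sits between the prior (0.5) and the data rate (0.75); with 400 observations it sits essentially on top of the data. That's the "learning rate" intuition in its simplest conjugate form.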
And we can also ground Bayesian inference in a history and a developmental trajectory of statistics, which is to say, more memes. So first, here's just one quick meme. Here's Bayes saying, I thought the volume of a high dimensional hypersphere would be evenly distributed throughout. Just funny stuff. Truly funny. And then, yeah, exactly. And on the right side is an SMBC comic just joking about how when data comes in, your model is updated. But that doesn't mean it's going to be an adaptive model. So there's a lot of ways to go about integrating new data into priors. You could have fallacious priors. You could fallaciously update. You could make the wrong measurement. You could interpret the measurement wrong. So saying we did it Bayesian is like saying, you know, I built the car out of metal. It's like, what else do you want to know? Does it work? Is it safe? So we can't let the buzzwords confuse us. We just want to follow up with what the buzzwords are. And here, in the stereotypical human evolution meme with a little extra labeling, there's a progression. Theta here is the model parameters, or just the model, and X is the data or the observable. The a priori one, homo apriorius, is just walking around with a model. Here's how the world is. What data? It's just the model. This is what's likely. Homo pragmaticus, and again, it's not an actual evolutionary or creationist parallel, it's just a meme, only considers the data. What's the data? Show me the facts. Facts don't lie. Facts don't care about your feelings. It's just the facts. It's not even about the model. And then we get to homo frequentistus. The frequentist is asking the statistical question: what is the likelihood of the data given the model? So a p-value from, say, a t-test is asking, what is the likelihood of this height data, given the model that the two classrooms don't have different heights?
And so you can say that's a 0.01 p-value. There's such-and-such likelihood for this data under that model, so we can reject that model. And that's the whole point with the rejection of null models, H-naught versus H-one, and rejecting things, and the whole idea of falsification at the statistical level but also at the philosophy of science level. Oh, one beautiful piece of data could ruin the theory and you can just disprove it. It's this whole idea of disproving. You have the model in the background, explicitly or implicitly, and then new data comes in. You go, is this likely or not? And if it's unlikely, you think that it casts doubt on your model. Homo sapiens here is thinking about the likelihood of, or modeling, both the data and the model. But then the most advanced, or at least farthest right, homo bayesianis, is thinking about the likelihood of the model given the data, which actually is what we want to know. We don't need to know the likelihood of all board positions and all possible chess strategies. We want to know the likelihood of certain things happening, given what we're observing, which can often be quite limited or confusing. So not to overanalyze the meme, but yeah. Okay. I want to interject really quick. So we were talking earlier about frequentist and Bayesian statistics prior to pushing the broadcast button on the livestream. And you know, I was thinking about the movie 50 First Dates, right? Like, I don't know if you guys have seen the movie, but she wakes up every day and can't remember anything after her car accident. So the model's there. It's existing, right? Like she has the model, but the model just never updates. So every day, you just go based on the model, but then there's no update. And so, you know, the Bayesian version would be if every day you updated the model while you're sleeping at night.
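The contrast in the meme, the likelihood of the data given the model versus the likelihood of the model given the data, can be put in a few lines of toy code. The coin models and numbers here are our own made-up example:

```python
from math import comb

def binom_lik(k, n, p):
    """Likelihood of k heads in n flips under a coin with bias p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

k, n = 8, 10  # observed: 8 heads in 10 flips

# Frequentist question: how likely is this data under the null model?
p_data_given_fair = binom_lik(k, n, 0.5)

# Bayesian question: how likely is each model given the data?
models = {"fair": 0.5, "biased": 0.8}
prior = {"fair": 0.5, "biased": 0.5}  # equal prior weight on each model
evidence = sum(prior[m] * binom_lik(k, n, p) for m, p in models.items())
posterior = {m: prior[m] * binom_lik(k, n, p) / evidence
             for m, p in models.items()}

print(p_data_given_fair)    # about 0.044: P(data | fair model)
print(posterior["biased"])  # about 0.87: P(model | data), the flipped question
```

Same data, two different questions: the frequentist number conditions on the model, while the posterior conditions on the data, which is the homo bayesianis question.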
And I think, you know, that's a huge lead-in to the Bayesian brain framework, right? Is that what we're doing while we're sleeping, really? Are we updating our model? Is this why we need less sleep as we get older and we get stuck in a decade? You know, you're stuck in your favorite decade because you're not updating your model as much or as well. I don't remember the movie too well, but didn't the other character have memory across the dates? So it's almost like an alternative title could have been When a Frequentist and a Bayesian Date 50 Times. That one definitely got left on the cutting room floor. So Bayesian networks are taking this Bayesian inference idea and putting it in a graphical or a topological framework. So it's representing variables in the nodes. There are other graphical models, but for the ones we're showing here, the nodes are random variables, and then conditional dependencies are shown with arrows or with edges. So this is a very general one on the left. A is just having a conditional dependency with B, and it's directed in this case. This one right here is actually a Bayesian network called a neural network. This is a single layer neural network where one input layer kind of unfolds into hidden layers, just one here, and then it goes to an output layer. But this is the structure of neural networks. So everything that we're talking about with neural networks, all kinds of deep learning, is encapsulated. And then on the top right, from Parr et al. 2018, is the example of active inference. And so here is the progression that we walk through in our ModelStream. The simplest Bayesian network that we can imagine between a hidden state and an observable: you have s, the hidden state, and then there's some P, the probability of an observation given the hidden state, of o, the observable. So that's the hidden state, and then you're getting observables.
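That hidden-state-to-observable setup, run through time as in panel B, can be sketched as a forward filter. This is a toy of our own, with made-up weather numbers, not the model from the paper:

```python
# A hidden state s_t evolves and emits observables o_t; we infer
# P(s_t | o_1..o_t) by alternating predict and update steps.

states = ["rainy", "sunny"]
T = {"rainy": {"rainy": 0.7, "sunny": 0.3},    # P(s_t | s_{t-1})
     "sunny": {"rainy": 0.2, "sunny": 0.8}}
E = {"rainy": {"umbrella": 0.9, "none": 0.1},  # P(o_t | s_t)
     "sunny": {"umbrella": 0.2, "none": 0.8}}

def forward(observations, prior):
    belief = dict(prior)
    for o in observations:
        # predict: push the belief through the transition model
        pred = {s: sum(belief[r] * T[r][s] for r in states) for s in states}
        # update: weight by the likelihood of the observation, renormalize
        unnorm = {s: pred[s] * E[s][o] for s in states}
        z = sum(unnorm.values())
        belief = {s: unnorm[s] / z for s in states}
    return belief

# Two umbrella sightings pull the belief strongly toward "rainy".
print(forward(["umbrella", "umbrella"], {"rainy": 0.5, "sunny": 0.5}))
```

Each loop iteration is one time step of panel B: the hidden state is never seen directly, only inferred from what it emits.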
And then you're trying to model the hidden state given the observables. In B, that same process happens through time. A changing hidden state is emitting observables at different time steps. Still a Bayesian graphical model. C introduces control theory with pi. That's why our lab logo is a little pi, because it's action playing into the way that states are modeled to evolve over time. D takes another level of complexity and actually allows policies. So still we see this motif with pi influencing how s one, two, and three change. But at each time point, let's take this one down into s one. At each time point, it's being evaluated how a policy is going to play out at s one, two, and three. And notice that at one, two, and three, there's an inference on pi one, two, and three. The trajectory of action is what is being estimated. So it's not just doing state estimation at each of these time points. It's estimating a trajectory of action. And this idea of action trajectories is going to come back into play in a kind of interesting way. Let's just take one more second, more generally, to think about active inference and what it is, why we're here, and where the Markov blanket is. In active inference, we have internal and external states. And it's a scale-free framework, because these internal states could be small or large in scale. It's outside the level of specifying a specific scale. And these internal and external states are separated by a blanket. And as we're going to show in just a second, this Markov blanket concept is just the total layer of insulating nodes, such that the internal and the external states have a certain type of independence relationship from each other. And then the innovation, we'll get to where Friston came into play.
So that's active inference, just as a way to integrate action and perception, with sensations influencing internal states, internal states including generative model estimates of the world, and policy estimation implementing action. And action can influence external states, which are also undergoing their own dynamics, which we might think of as, actually, contextuality. Any other thoughts on active inference? So here's where we go from Bayes networks to Markov blankets. This was part of our discussion in Livestream 14. And the idea of the Markov blanket was introduced by Pearl 1988. And where Friston's innovation occurred, and this is, I think, from Mel Andrews' paper in 14, is that Pearl's original construction of the Markov blanket was subdivided into two kinds of nodes: sensory states, nodes whose influence is directed towards the blanketed node of interest X, and active states, which are nodes influenced by X. So we can have an organismal metaphor with sense and action, but a little more generally, sense is the incoming of the system and action is the outgoing. And the Markov blanket is the total set of nodes that gives you all the inputs and all the outputs of the system. So stating the conditional dependencies is the same thing as stating all the conditional independencies, just like with a social network, where who's friends with whom is the same question as who's not friends with whom. Who is there is the same question as who is not there. And then to put Markov, Pearl, and Friston on a continuum: you have Markovian properties by Andrei Markov, and other Markovs actually, who were working in a more mathematical frame. They didn't have access to computers. Pearl brings in Bayesian statistics, as well as computational approaches, in the second half of the 1900s.
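Pearl's construction, the blanket as a node's parents, its children, and its children's other parents, can be sketched in a few lines. The graph here is a toy of our own, not from the paper:

```python
def markov_blanket(node, edges):
    """Pearl-style Markov blanket in a DAG given as (parent, child) pairs:
    the node's parents, its children, and its children's other parents.
    Conditioned on this set, the node is independent of everything else."""
    parents   = {p for p, c in edges if c == node}
    children  = {c for p, c in edges if p == node}
    coparents = {p for p, c in edges if c in children and p != node}
    return parents | children | coparents

# Toy graph: a sense node feeds "internal", which drives "action";
# "context" is a co-parent of "action" and so also lands in the blanket.
edges = [("external", "sense"), ("sense", "internal"),
         ("internal", "action"), ("action", "external2"),
         ("context", "action")]

print(sorted(markov_blanket("internal", edges)))  # ['action', 'context', 'sense']
```

In this toy, "external" and "external2" sit outside the blanket: given {sense, action, context}, the internal node is conditionally independent of them, which is the insulation idea above. Friston's move is then to split that blanket set into incoming (sensory) and outgoing (active) nodes.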
And then the innovations of Friston were to divide those Markov blanket nodes into sense and action nodes, which enables this cybernetic approach of incoming sense states, a generative model over internal states, and then outgoing action states, in service of maintaining a non-equilibrium steady state. It's an ongoing and open and developing area, how we formalize these things, but the idea is that with this kind of partition of the blanket, we'll be able to have modeling of non-local dependencies, like latent causes which are not observed in the world, and therefore have cybernetic agents in niches that are enactive and embodied and encultured in a really rich way. And then just to show one more figure. This is from Wanja Wiese's paper with Karl Friston, which we talked about just a few weeks ago. It shows that you have internal states that influence active states, which influence external states. And here in the pink dashed line are the blanket states, and then there are the internal states. Okay, any thoughts on active inference, or can we go to context? So again, putting the sort of end parts of the paper up closer to the beginning, context in practice: the first context introduced in the paper is the idea of active inference. And the motivation for active inference here, and we can definitely talk to the authors more, is that observers are constantly probing their environments through action, influenced by hidden states and time-sensitive policies. So these are some of the challenges that beset context-agnostic, or context-unspecified, modeling: when the environment is changing, how does the action policy update? Does it just re-evaluate at every single moment, as if it was looking fresh at a new environment? But how do you have these sequences of action that might involve sequential modifications of the environment and uncertainty in the action policy?
And the big question, just like, you know, kindergarten: what actions are appropriate? What actions will reveal aspects of the environment important for defining and reasoning within a context? That's the frame problem. What actions will reveal that a context has changed? So we want to know which of the three cups the ball is under. And so there's this complex thing that's happening as we guide our action, changing the context to resolve our uncertainty about actions, which we can then get more information on. And active inference tells us that there's two ways to minimize expected free energy: we can modify internal state parameters of our generative model through learning and development, or external states can be modified through action or their own endogenous changes. So with active inference, we're going even more abstract than some of the other papers, just thinking about this question: if you're an agent who's operating under uncertainty, modifying the niche, but you don't know whether you should modify or explore or wait, and all these things are coming together, we want to set a frame that's big enough to encompass that. Any thoughts on that? Okay, let's turn to context more generally. So that was active inference in context. They write: contextuality in the behavior of complex systems has traditionally been regarded as a practical problem of limited experimental control, that could in principle be eliminated by obtaining more uniform experimental subjects and achieving better control of experimental conditions. This traditional view of contextuality as merely a practical limitation is, however, increasingly under strain. This kind of resonated with me as an experimentalist, because in biology you'd get the sense that if it was a clonal animal, and we just had the light-dark cycle the same, and the same animal handler picked it up, and there was no pipetting error, and everything was the same, you'd get the same result.
And it was really exciting to think about to what extent that is a faith-based position. There are actually tiny amplifications in the system: you have two clonal animals, but if there's a bifurcation point, then unless you truly have every single part of the system, you're always going to be subject to these intrinsic contextualities. So that's kind of cool to think about. And then here in the conclusion: context is often used to designate what is neglected when an observation is made or an action undertaken; environment is used in a similar way. So that's how people maybe informally think about context. Like right now, one of my contexts is this cup of water. It's in my environment, it's in my niche, it's in my local area. Context is used like that, like what is around something. And the formalism of a theory like contextuality by default challenges us to acknowledge that context is always there, whether attended to or not. So we're thinking about context in a more active way, in terms of what influences processes, or where conditional dependencies are. And so it might be the case that you can ignore a context because you've actually isolated the system, so it's like a closed system, or it's a system that you can fully describe. In those situations, context is there, but you've made a special sub-problem where you have all the context. But when you're in a situation where you don't have all the context, we need to have a framework for reasoning that has contextuality by default, so that we can work better in those spaces. And let's talk a little bit more about this contextuality by default, CBD. Not that CBD. This is the authors talking about contextuality by default, and just totally, you know, raise your hand or jump in if you have a thought or a question. And they say that determining whether a particular instance of observed behavior of a complex system, particularly one with memory, exhibits intrinsic contextuality is not straightforward.
So that was interesting, and makes sense: complex systems, hard to know what to do sometimes, whether you want to include memory and historicity or not. And in some ways, if you have like 10 different animals and you think, okay, well, they're the same background so we can just consider them even, they've actually had histories that differ. So they're not the same, but you want to imagine that they're replaceable or shuffleable in a statistical sense, and that might not be so. And then here's the quantum sense of contextuality. In quantum systems, a set of measurements exhibits contextuality if it cannot be characterized by a mathematically consistent, context-free joint probability distribution. Asking whether a set of measurement outcomes exhibits intrinsic contextuality is, in this case, asking whether a consistent, globally-defined, globally-connectable joint probability distribution exists or not. And that's where this formalism of contextuality by default addresses questions, by labeling contexts and including them explicitly in conditional probabilities, rendering probability distributions context-dependent. So let's look at the papers that were cited and referenced and think a little bit about where's this contextuality theory coming from, where's it taking us. What do you think? I have a terrible comment. That brought to my mind emotion related to context, because I've been thinking a lot about how, like, if I'm attached to some idea or something like that, and I just remember that I was attached to it, but then I go to sleep or whatever and I forget what it was, I have a tendency to really be like, no, that was really important. And so regardless of whether it was true or not, I want to go back to it. And so I'm thinking about the way in which emotion is a kind of context, but it takes you over a longer time domain. Because it doesn't even just sit in a decay rate that would normally be related to the other information.
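Going back to the quantum sense of contextuality quoted above, the "no consistent context-free joint distribution" idea can be made concrete with a standard toy check. This is our own illustration, not the paper's formalism: any single joint distribution over four plus-or-minus-one measurements keeps the CHSH combination of pairwise correlations at or below 2, while the so-called PR-box correlations reach 4, so no such joint distribution can exist for them.

```python
from itertools import product

# If one context-free joint distribution over A, A', B, B' (each +/-1)
# existed, the CHSH quantity <AB> + <AB'> + <A'B> - <A'B'> could be at
# most 2. Check by enumerating every deterministic assignment, since
# any joint distribution is a mixture of these.
best = max(a * b + a * b2 + a2 * b - a2 * b2
           for a, a2, b, b2 in product([-1, 1], repeat=4))
print(best)  # 2

# PR-box pairwise correlations: <AB> = <AB'> = <A'B> = 1, <A'B'> = -1.
# Each pair is a perfectly fine local distribution on its own, but:
chsh_pr = 1 + 1 + 1 - (-1)
print(chsh_pr)  # 4, above the bound: no context-free joint glues them
```

That's the "locally consistent, globally inconsistent" situation in miniature: each pairwise (per-context) distribution is valid, but no single global distribution reproduces all of them.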
It pulls you way back into another space. So it's kind of like a long-range driver. I don't know. That's just context. And it brings us to this next question, which is the title of a paper that's cited, by Dzhafarov and colleagues, 2016: Is There Contextuality in Behavioral and Social Systems? So if we were dealing with the informal definition of context as just, like, the environment, the answer is just yes. Behavioral and social systems? Yes, there's social context, like, oh, you missed the context on this dinner party. Okay, so off the bat we'll say yes. But we're thinking about it formally. And this is the most hyphenated sentence I've ever seen in science, just the number of names. It was quite something to read. But we're thinking about contextuality in a more formal way. So let's say someone says you're missing the dinner party context, and then it's like, well, if you would have told me all the info, then I would have had all the context. It's a similar notion, but we're going for measurement systems. And they're proposing that their CBD allows one to define and measure contextuality in all systems of all these hyphenated types, even if there are context-dependent errors in measurement, or if something in the context directly interacts with a measurement. These are the two cases that really matter a lot for scientists. So, like, what if at higher temperatures your thermometer is biased towards overestimating, or being more error-prone? Then you're going to get this distribution that looks like it has some sort of super specific relationship, but it's totally not the case. It's all an observation error. And the second one is that the context could directly interact with the measurement, aka every single time you want to know something new.
So the context, which you don't have, because you don't have the full context, because you're going out into the unknown to make a measurement, is going to interact with your measurement, maybe even modify it itself. What do you think, Blue? So I think, you know, this goes back to that intrinsic versus extrinsic context. Like, in a social and behavioral system, there's no extrinsic context, right? Everything that you're going to try to measure. If you ask, like, what was the temperature at that dinner party? Or how was the food? You can ask, like, 12 different people and they're all going to have different things to say. Or like, did you have fun, or whatever. I mean, everyone you ask, because it's only defined by the intrinsic contextuality, which is the experience of each agent in the room, I think. Cool. And interestingly, they look at several datasets. None of the data provide evidence for contextuality. So again, don't just say yes, there's a niche, there's a context. We know what social context is, but we're actually thinking about it in this very interesting way, where those datasets don't have contextuality. So a good little gut check is, if you're thinking about the applications of this paper and you're thinking about it like social context, it's just not that one, because it's not. And their result is that behavioral and social systems are non-contextual. Again, if that makes you cringe, it's because you're thinking about it informally, and we're actually thinking about it in a way where that claim makes sense. That's where we want to understand, to be like, oh right, informally, we get what they're saying, but then we're interpreting their claims rigorously. I.e., all contextual effects in these settings result from the ubiquitous dependence of response distributions on elements of context other than the ones to which the response is presumably or normatively directed.
So let's say you have a driving simulation, and then there are going to be some green lights and some big red stop signs. And you do the behavioral measurement, and you find out that, conditioned on whether it was a green light or a stop sign, the people accelerate or brake. So that's your experimental data. What you controlled and what you observed was the stoplight, and then whether they stayed still or accelerated. Now that is context. It is context for the person who's driving. That's the social sense. But all the contextual effects in this experiment result from the ubiquitous dependence, so across the whole population, of response distributions on elements of context other than the ones to which the response is presumably or normatively directed. So it's like, you thought you were measuring the response to a green light or a stop sign. Actually, you're measuring a social context and a group of people who are in a relation with those symbols, such that they operated in a certain way. So we need contextuality by default so that we don't fall into, like, correlation-causation, super fallacious scenarios. Here's another case of where context matters. And what's going to be fun is that this decision making is actually going to matter for contextuality. Here's a paper by Vlaev: Local Choices: Rationality and the Contextuality of Decision-Making. So every decision has a context. So we know about that. However, rational explanation appears to be challenged by apparently systematic irrationality observed in psychological experiments, especially in the field of judgment and decision making. So how many times have we heard, oh, people are so irrational, they'll take the $1 today rather than the $1.20 tomorrow, or they'll pick up their children from the daycare at the right time unless there's a monetary penalty, and then that will make them later. They're so irrational within this one variable that's being estimated.
Here, it's proposed that the experimental results require not that rational explanation should be rejected, but that rational explanation is local, i.e., within a context. Thus, rational models need to be supplemented with a theory of contextual shifts. Makes a lot of sense. It sounds like something that's relevant to learn about, and this paper probably has other relevant things, but it's like, instead of just saying, well, who's rational and who's irrational, couldn't we just understand that everyone is in a local logic and a local context? And then they're operating a certain way, conditioned on that. So it's not a rational versus irrational divide. It's: what's the context? Who's the agent? What's the niche? What's the context shift? That's a much more open, but also powerful way, I think, to talk about decision making. Okay, raise your hand if you want to pause it. Let's take this idea of context. So we started with context in the folk sense and took it to this idea of defining all the relevant variables. Now let's think about defining relevant variables in a model in terms of topology. So here's some cool slides from the Simons Institute. And in this top right one, it says topology is about distinguishing the continuous from the non-continuous and also about moving around. So I was like, that's a cool definition of topology. Not nodes and edges, not just which things are connected, but a way of saying how things are connected. And this slide was really informative: when we're asking several questions at once, the answers could obey constraints. So, like, if you're going to measure the pressure and the volume and the temperature of an ideal gas, there's an equation, PV = nRT, that's going to relate those things. And if one of them is out of whack, probably something is wrong.
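That ideal gas example can be sketched as a quick consistency check on a bundle of answers. The readings here are our own made-up numbers:

```python
R = 8.314  # J/(mol*K), ideal gas constant

def bundle_consistent(p, v, n, t, tol=0.01):
    """Check whether a bundle of answers (pressure in Pa, volume in m^3,
    moles, temperature in K) satisfies the physical constraint PV = nRT,
    to a relative tolerance."""
    return abs(p * v - n * R * t) <= tol * abs(n * R * t)

# One mole at 300 K in about 0.0249 m^3 at 100 kPa fits the constraint;
# the same bundle with the pressure gauge reading 20% high does not.
print(bundle_consistent(100_000, 0.02494, 1.0, 300))  # True
print(bundle_consistent(120_000, 0.02494, 1.0, 300))  # False
```

This is the "well-behaved bundle" idea in its simplest form: the four answers are only acceptable jointly, not one at a time.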
So those bundles of answers are related to each other, could be by the laws of physics, like here's velocity and time or whatever it happens to be, like physical constraints, or logical constraints. So, like, if things are true, then they're not false, and if it's double not that, it's back to being true. So you can have these bundles that are logical, or a bundle that is based upon some of the constraints of physics. And then also the constraints can be rows of a table in a relational database. That's the trip, because that's where, like, SQL and a lot of programming comes into play. And models distinguish good and bad ways of connecting the dots, and bundles are just like continuous sections. So this is kind of like thinking about SQL queries and database queries in terms of the context that's being queried. So in the context of this database, I want to make a query within it. So that's an interesting way to think about how things are connected and context. But this is also stuff that we're all just trying to put out there and hear people's reflections and thoughts on. What do you think, Sarah? I just wanted to restate it to make sure I understand it. I mean, basically they're saying that the way you move through context sets creates a topology space. Is that an accurate way to phrase it? Just another aspect of it, yes. Like in the space of making multiple measurements, those measurements can have different topologies. Like, if you make three measurements, they could be in a line, they could be in a triangle. That's not at the level of formality that's on here. But when you're making different measurements, they can either be in a well-behaved bundle or a not-well-behaved bundle. And then whether it's well behaved or not is related to... Blue, what do you think? It's a hard question. I'll jump in for you. So going back to the intrinsic and extrinsic contextuality. Like, so, it's contextual.
It's contextuality when you have a set of measurements that you know are corresponding measurements, right? I mean, I don't know, do you have a bio background, Sarah? I'll talk about it just in terms of, you know, RNA sequencing. You can sequence your sample in a different lane on the same flow cell, or in two different flow cells, and there's variability, right? So you always have to try to design controls into your experiment, from flow cell to flow cell, from lane to lane. And so that's why you replicate the sample sequencing, just to control the variation from running it at a different time. The temperature in the room might be different, or, you know, the machine might not be feeling as well that day, or whatever. So it's these kinds of questions: are the measurements, do they all link up together? Is there a correlation between all of the measurements or not? So the well-behaved bundle is when they do all link up. They have that shared information, right? So I think as we get into the paper, maybe it might start to make more sense. Yeah, okay. Let's go to channel theory. Okay, so this part was super fascinating. I'm drawing a little bit on one of the background papers, Mosaics of Chu Spaces and Channel Theory I, by Fields and Glazebrook, and we're going to talk more about the two Mosaics papers soon. But let's contextualize this in terms of information theory. So-called info theory, but I personally call it disinfo theory. And I've called it that for a long time, and now I know why, actually. So here's the second paragraph of Claude Shannon's 1948 work. Shannon writes: the fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Wait, is that the problem of communication?
Am I trying to reconstruct something bit for bit via communication, or is the goal of communication to impel joint action, to update other agents, not to replicate a message? Shannon's working with telegraphs and copper wire, so it makes sense why you'd want high-fidelity data transmission, a signal transportation of information, but let's think about how broad people have gone with info theory according to Shannon. And then he writes: frequently the messages have meaning; that is, they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. Oh, semantics are irrelevant to communicating information? Semantically, for people, or for what? The significant aspect is that the actual message is one selected from a set of possible messages. The system must be designed to operate for each possible selection, not just the one which will actually be chosen, since this is unknown at the time of design. So fair enough, A through Z, you want to have all of those options because you don't know what next letter needs to be poked. But I think if people read through Shannon's work and considered a few of the aspects of the Shannon entropy, like how it disorders messages, like it just asks what's the symbol frequency of the whole message or page or genome, if you think about how it removes the ordering of symbols and is actually asking about replicating messages bit for bit, not communicating in any sense, people would really reconsider what information theory is. Thankfully, Fields and Glazebrook wrote about comparing, and this was cited in the 2020 paper, comparing and combining the Shannon theory of information with channel theory. Okay. So: though very general as a quantitative theory of communication flow, the original Shannon theory had largely overlooked the question of semantic content.
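The earlier point, that Shannon entropy only sees symbol frequencies and never their ordering, is easy to demonstrate. This is our own illustration:

```python
from collections import Counter
from math import log2

def shannon_entropy(msg):
    """Bits per symbol, computed from symbol frequencies alone."""
    counts = Counter(msg)
    n = len(msg)
    # Sorting the counts makes the float summation order deterministic.
    return -sum(c / n * log2(c / n) for c in sorted(counts.values()))

# A sentence and its scrambled anagram have identical entropy:
# ordering, and hence syntax and meaning, never enters the formula.
a = "speak now or forever hold your peace"
b = "".join(sorted(a))  # same symbols, meaning destroyed
print(shannon_entropy(a) == shannon_entropy(b))  # True
```

Both strings carry the same "information" in the Shannon sense, which is exactly the gap between syntactic and semantic information being discussed here.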
In any Dretske-type theory, the basis of semantic content is in the world, i.e., the events or situations that signals or states carry information about. The car, you know, it's out there, it's in the world. By showing how local logics are connected by information networks, channel theory provides a general qualitative theory of information flow in this context. So that's pretty interesting. It's more about information flow that's semantic rather than syntactic. And then this last part here, there's more to say, but a 2004 paper created a synthesis of Shannon's quantitative information theory with the Barwise-Seligman qualitative theory, to address the question of how specific objects, situations, and events carry information about each other. That's what people are always talking about when they want to be talking about sharing information: not sharing surprising symbols, but updating each other semantically, informationally, in terms of Bayesian priors about the world, things that we're referencing in the world. The temperature not just as a variable, but as being about something in the world. So, some thought questions. Is Shannon 1948 correct about the stance taken on the fundamental issues of communication, or what communication is or isn't? How is channel theory distinct from information theory, so-called disinfo theory, as we can start calling it? And then, how is real communication, the kind that we care about and want to model in many cases, both syntactic and semantic, all these things at once? Grammatical, logical, as well as lexical, word-based. It's also narrative-driven. A narrative or the narrative context can make just one word semantically powerful. Speak now or forever hold your peace. It's a narrative context where one word cannot be reduced to just the bits or something like that. No is a common word, or yes is a common word.
It doesn't matter that it's a common token, because it's semantic information. And then, how is it contextual? Blue? So, I mean — I like Shannon information theory. I like how it's very logical, and it appeals to my brain structure. The semantic information that's contained in this channel-theory description is very squishy, and maybe not as appealing to me, but I do think that it's more correct, right? Like when you reproduce the information: if I say something like "it's raining outside" — I live in the desert, so "it's raining outside" is a really awesome event, let's go play in the rain, right? So when I say that, and someone else says, "oh, it's raining outside — my plans are destroyed, because I wanted to go to the park and I can't." So it's the same information, but what it meant was very different in those different contexts, right? So, cool. Let's return the discussion from information flows and communicating information — we're kind of thinking Shannon syntactic information, reproducing bits, bits on this computer to bits here — and take it up a level to semantic information. So like, I'm thinking of a purple cap; now you're thinking of a purple cap, but that could have happened in a different language, and it's not really related to how surprising each word is. Let's return to this question of local logic and contextuality. So here's a paper, not sure if it was cited or not, but it's going to bring together local logics, contextuality, and a topological approach — we're just going to give multiple coats of paint on everything. Local logic is like the local rules or the context, and in this Kishida 2016 paper, they wrote: the topological approach characterizes contextuality as global inconsistency coupled with local consistency. So that was kind of cool to me.
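That slogan — local consistency without global consistency — can be illustrated with a tiny sketch of our own devising (not the paper's formalism): three pairwise constraints over boolean variables, each satisfiable in its own context, with no single global assignment satisfying all three.

```python
from itertools import product

# Three contexts; each says its pair of variables must differ.
# An odd cycle of "must differ" constraints over {0, 1} is
# satisfiable in each context alone, but not all at once.
contexts = [
    ("a", "b", lambda x, y: x != y),
    ("b", "c", lambda x, y: x != y),
    ("c", "a", lambda x, y: x != y),
]

locally_consistent = all(
    any(check(x, y) for x, y in product([0, 1], repeat=2))
    for _, _, check in contexts
)
globally_consistent = any(
    all(check(g[u], g[v]) for u, v, check in contexts)
    for g in ({"a": x, "b": y, "c": z} for x, y, z in product([0, 1], repeat=3))
)
print(locally_consistent, globally_consistent)  # True False
```

Each context is perfectly transparent on its own; the inconsistency only appears when you try to glue them into one global picture.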
It's like, if you have this field and it's globally consistent, then it's transparent — there's nothing inconsistent to speak of, no fracture in the glass, so it's purely transparent. It's all part of the same context. But local consistency coupled to global inconsistency is like a bubble: it's an inconsistency in the tank of water, but locally it has another transparency that makes sense in that local context. So you could have a computer program, and that could be a local logic, but it's embedded in another logic of being a Turing-complete (or whatever kind of) computer. And within that program there could be a simulation of another program — an N64 running another program — and even if, all the way back up at the top, it's running on Linux, that local program is running on the N64's local logic. And then another kind of metaphor: in the paper they write, our goal is to capture this local consistency part, not just the global inconsistency. So the global inconsistency is like: our receptor is not getting any photons — boom, it got a photon, inconsistency. Not just that part, but the local consistency part: how do we know it was consistent before that photon hit? Which requires a novel approach to logic, sensitive to the topology of contexts. To achieve this, we formulate a logic of local inference by using context-sensitive theories and models in regular categories. And that's going to be powerful. Here's how I was thinking about this in terms of chess. The rules of the game of chess are like which piece can move which way, plus some imperatives — you have to alleviate check, otherwise you lose the game. That's the context, the local rules of the game. Then there are local or situational patterns like a pin, where one piece is in front of the king and can't move out of the way, because then the king would go into check.
So that's not a rule of chess, but an emergent pattern that arises from constraints in the space, given the local logic of chess. But now we're in an even sub-local logic. I'm not going hyper-formal — maybe chess players have another way of doing this — but it's like there's a local logic in the corner of the board with a king, and not every piece or rule of chess matters there. Okay, well, the pawn can move two or one — but that rule might not matter for this pin with these pieces. So not every single rule or piece is going to matter for these local logics, and maybe there's a way to pursue this topologically, in terms of the connectedness of different relationships in the system. Let's take that to Chu spaces. Okay, so this was the second section of the paper, so you've got to understand and learn this one, and we were kind of excited to do it. So they start by defining one of the simplest categories in category theory: the category of Chu spaces, which we're going to talk more about. Basically there's a set K of values, and then two kinds of things, objects (points) and attributes, and the Chu space is like a matrix over K recording the relationship of each object a to each attribute x. So it's actually probably a data form that we've seen a ton of times. Like, here are 10 people — have they RSVP'd for my event or not? Yes, yes, yes, no: 1, 1, 1, 0. That's a Chu space for RSVPs. It could be a lot more general, and we're going to keep generalizing on that, but the basic Chu space is a matrix — though it can probably be higher-dimensional than a regular matrix. A relational structure like a graph can also be given as a matrix: you can have the adjacency matrix, right? All the nodes by all the nodes, one or zero, connected or not, or a weighted graph — how strong is the connection? Or it could be symmetric and undirected, or there could be a from and a to.
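A minimal sketch of that RSVP example as a Chu space (the names are invented for illustration): points are people, attributes are events, and the whole structure is just the evaluation function r from points × attributes into K = {0, 1}.

```python
# A toy Chu space over K = {0, 1}. Points = people, attributes = events.
people = ["ana", "bo", "cy"]   # hypothetical names
events = ["party", "talk"]

r = {  # the Chu space's evaluation function r(point, attribute) -> K
    ("ana", "party"): 1, ("ana", "talk"): 0,
    ("bo", "party"): 1,  ("bo", "talk"): 1,
    ("cy", "party"): 0,  ("cy", "talk"): 1,
}

# The same data as the plain matrix described above (rows = points):
matrix = [[r[(p, e)] for e in events] for p in people]
for row in matrix:
    print(row)
```

Swap "people × events" for "nodes × nodes" and the same shape gives you an adjacency matrix; swap in a richer K and you get weighted graphs or probability tables.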
So a lot of these relational structures are matrix forms, and it turns out they're Chu spaces. But also we can think about programming languages: we can have a type as an attribute. You know, does this object have the type "table"? That's a Chu space. So there are a lot of programming relationships that the Chu space covers — probability spaces, a coin flip, is it going to happen or not? So it's like, if it's a matrix — I don't know if you can go as far as to say it is a Chu space, but there's certainly a broad family of objects, or categories, I don't even know what to say, that fall within this Chu space paradigm. There are a few more definitions, because we looked up a bunch. Here's one from Wikipedia, and basically the big idea is that Chu morphisms are transformations between Chu spaces, and if things are set up properly, the morphisms are continuous and the pair of functions has some special relationships. One example this makes me think of is the Typographical Number Theory in Gödel, Escher, Bach, where it's like one plus one equals two, and then you make a set of text rules — you can add an S to the string under these contexts, and so on — so it's basically doing addition and subtraction by typographical, letter-based rules. So if you can prove something in one system, because you've shown super strongly that there's a morphism, you can show that that same relationship should exist in the other system. It also turns out that's actually at the heart of the whole Gödel question. But that's the static interpretation; dynamically, Chu spaces transform in the manner of topological spaces.
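The "special relationship" on the pair of functions can be sketched in a few lines (a toy of our own, with made-up two-point spaces): a Chu transform from (A, r, X) to (B, s, Y) is a pair (f, g), with f forward on points and g backward on attributes, satisfying the adjointness condition s(f(a), y) = r(a, g(y)).

```python
# Toy Chu spaces over K = {0, 1}; all names here are invented for illustration.
A, X = ["a0", "a1"], ["x0", "x1"]
r = {("a0", "x0"): 1, ("a0", "x1"): 0, ("a1", "x0"): 0, ("a1", "x1"): 1}

B, Y = ["a0", "a1"], ["y0"]
s = {("a0", "y0"): 1, ("a1", "y0"): 0}

f = {"a0": "a0", "a1": "a1"}   # forward map on points
g = {"y0": "x0"}               # backward map on attributes

def is_chu_transform(f, g, r, s, points, attrs):
    # Adjointness condition: s(f(a), y) == r(a, g(y)) for all a, y.
    return all(s[(f[a], y)] == r[(a, g[y])] for a in points for y in attrs)

print(is_chu_transform(f, g, r, s, A, Y))  # True
```

The bi-directionality is built in: points travel one way, attributes the other, and the condition ties the two directions together.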
Yeah, don't want to go too deep down here, but there are special relationships between two Chu spaces, which are Chu morphisms, and if that morphism is bi-directional it allows a really nice thing where you can make a finding on one side and then arbitrage it over to some relationship that might have been difficult to show on the other side. So it's like a perfect language mapping. You'd say, "the boy is younger than the car" — if you had a perfect language mapping, semantically, you could say it in another language and the semantic information would be perfectly preserved. So maybe one person has a piece of semantic information in English, and they could convey it through the perfect morphism, and it would be correctly semantically understood on the other side. Okay, here are just a couple more — we wanted to actually understand it, and also, shout out to master's students everywhere. So here's the history, from this paper: the Chu construction takes a symmetric monoidal closed category V with pullbacks, and an object k of V, and completes V to a self-dual category Chu(V, k). The details of this construction appear in Po-Hsiang Chu's master's thesis, published as an appendix to his advisor Michael Barr's book introducing the notion of a *-autonomous category. So — a thesis that became an appendix, and now here we are, many years later. It's all good to research whatever you're researching, if it appears to you to be the big question. Here's Barr's perspective — another perspective on Chu spaces. Barr had this idea during his sabbatical to study duality in categories in some depth. Just what most of us take our sabbaticals wondering. He wasn't wondering about the simple dualities, but about self-dual categories, like complete semi-lattices or finite abelian groups.
He was interested in the possibility of having a category that was not only self-dual but one that had an internal hom, and for which the duality was implemented as the internal hom into a dualizing object. It's technical. So duality is going to mean something different to a mathematician than to a non-mathematician, and I didn't know what a hom is. Searching around on the dual, I just thought: okay, it's like forwards and backwards, or it goes both ways. That would be nice — reversible. Those are always good things. And then the internal hom: internal homs, when chained together, form a language called the internal language of the category. And I thought, okay, so now we're going from categories and attributes to something else. The most famous of these internal languages are the simply typed lambda calculus, which is the internal language of Cartesian closed categories, and the linear type system, the internal language of closed symmetric monoidal categories. Lambda calculus is something that's quite broad and central to computation. So this is something about how logic — local logics and nested logics — is related to categories in an analytical way, such that the kind of logic that we see on computers is only one type of it, because in a way we're dealing with syntax-based logic. You know, you have two files: are they the same? It lines up the bits, and if they're the same, it's yes or no. But what would it mean for them to be semantically the same, and how would we make that non-squishy? Because to ask whether semantics were the same, you need all the context.
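One concrete face of the internal hom in a Cartesian closed category is currying, which any functional language exhibits; a quick Python sketch (our illustration, not from the paper):

```python
# In a Cartesian closed category, maps (A x B) -> C correspond exactly
# to maps A -> (B -> C): the internal hom. In code, that's currying.
def curry(f):
    return lambda a: lambda b: f(a, b)

def uncurry(g):
    return lambda a, b: g(a)(b)

add = lambda a, b: a + b
print(curry(add)(2)(3))           # 5
print(uncurry(curry(add))(2, 3))  # 5
```

The round trip curry/uncurry losing nothing is the programming shadow of the categorical correspondence.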
So that's kind of this fun twist, and it makes sense why, for simplicity, you'd use frequentist statistics instead of Bayesian inference: it is hard to do sometimes, and you do have to grapple with unknown unknowns, and it's easier to just do a t-test or a multi-linear regression than some other type of generative model, because that takes bravery — to specify your uncertainty and to give support for why you did a certain thing, rather than just say, well, the countries differ with a p-value of .01, and move on without doing some sort of rich modeling underneath. Okay. Chu spaces in channel theory. So we're kind of trying to connect these two ideas, and then you can start nucleating some ideas from the ones that are making sense. In channel theory — which again we're thinking of as the more general cousin of Shannon "disinfo" theory — Chu transforms, the transformations between Chu spaces, become infomorphisms, which are natural maps between classifiers. Okay. So, Blue, what would you say about the red underlined part, or what you wrote at the bottom? So, I'll transition you into the next slide in a minute, but the red underlined part is the transform from A to B. So, if you have one Chu space and another Chu space — you can think of a territory and the map of that territory — whatever transform takes you from the territory to the map, that is the channel, right? It's informally thought of as a channel. That is the channel we're talking about here in terms of channel theory. And so the classifiers here — a classifier is the set of... where did I have that here somewhere? Classifiers link tokens to types. Oh yeah — so this is the definition of classifiers: classifiers link tokens to types that encompass them. Oh yeah, because you've already read this.
So basically the Chu transforms become infomorphisms, which map between classifiers. So if you have different classifiers, right — classifiers link tokens to types that encompass them. So for the type "stoplight," the set of tokens could be a red stoplight, a yellow stoplight, or a green stoplight. And a classifier is also known as a local logic, which we talked about just a few slides back. So, mapping between classifiers: here, if you have the classifier for the stoplight, and then the classifier for what the motorist should do at the stoplight — that's the second classifier — you could have tokens like stop at the red light, go forward at the green light, proceed slowly at the yellow light, or get out of the intersection, or whatever it is you think you should actually do at the stoplight. But then you can have other classifiers too. It doesn't just have to be the stoplight and what the motorist should do: it's also what pedestrians should do, and, in the game red-light-green-light that you play as a kid, whether you should run forward, run slowly, walk, or stop. So there are different classifiers you can have that all map to the same central information core. And that's what you see here: the classifiers are these things, A1, A2, and you can have many of them, up to Ak, all mapping into a common core. So this is the infomorphism structure that allows semantic information, not just Shannon information, to be transmitted between classifiers. And this comes down to having a shared memory. We all learned, from the time we're six, that red means stop, green means go, yellow means slow down. But to a six-year-old playing the game, it means running; and when you're a pedestrian, you're looking out for other cars, and there are other behaviors you do that are policing the motorists.
There are all kinds of people who have different reactions to what the stoplight is doing, but it's all based on this shared memory: everybody has been taught what you're supposed to do at a light. And so that's the core information channel through which information flows. It can also be thought of as a shared memory, or a shared context, if you will. Cool. One thing this makes me think of is the sentence that said the existence of a Chu transform from A to B is equivalent to every flow formula valid in A being valid in B. So it's kind of like: you have the chess board, and the board position says it's game over, and then you have the data file, and those two sit within a bigger C that establishes a connection between these very different types. So, as in the example you gave, there is a total logic that it does all come back to, even though there are disparate types and sub-logics. Let's hear what everyone else, and Chris, can bring to this discussion, because it's super interesting to think about. Yep. Okay. So what about sequents, Blue? So the sequent — I just wanted to put this in because I thought it was important for information flow. When a sequent encodes a semantic, like a causal, constraint, it effectively functions as a logic gate. And this is really what enables semantic information, as opposed to just Shannon information: you have the message, but there's a logic gate there that says whether the message is applicable or not applicable, to be transmitted or not. And I think it also enables putting this forward into the scale-free or scalable infrastructure here. And then, all these math symbols — I've never seen this stuff before — what is this, parallel lines, a lollipop stick? I don't know. But I put a box around this because it's like: oh, that goes back to Bayes probability.
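The stoplight story can be sketched as an infomorphism check (a simplified toy with invented tokens, not the paper's full formalism): types map one way and tokens map the other, and the defining condition is that a destination token satisfies a mapped type exactly when its image token satisfies the original type.

```python
# A channel-theory toy (simplified): a classifier maps each token to the
# set of types it satisfies. All token names here are invented.
light = {
    "light_1": {"red"}, "light_2": {"green"}, "light_3": {"yellow"},
}
action = {
    "driver_1": {"stop"}, "driver_2": {"go"}, "driver_3": {"slow"},
}

type_map = {"red": "stop", "yellow": "slow", "green": "go"}   # types forward
token_map = {"driver_1": "light_1", "driver_2": "light_2",
             "driver_3": "light_3"}                            # tokens backward

def is_infomorphism(src, dst, type_map, token_map):
    # Defining condition: b satisfies type_map(t) in dst
    # exactly when token_map(b) satisfies t in src.
    return all(
        (type_map[t] in dst[b]) == (t in src[token_map[b]])
        for b in dst
        for t in type_map
    )

print(is_infomorphism(light, action, type_map, token_map))  # True
```

Add a second type map for pedestrians, or for the kids' game, and you have several classifiers hanging off the same stoplight core — the shared memory the discussion describes.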
All right. I know this. So, coming home again. Cool. Reading this one — x, then the satisfaction symbol, then N — it assigns a probability that x satisfies N, given that it satisfies M. So maybe that's the sign of satisfaction; it's a symbol we didn't recognize, but the content is a conditional: the probability that x satisfies N given that it satisfies M. Like: what's the probability that we're in checkmate on the board, given that the file says checkmate? Well, if you have the morphism, it's perfect; but if you don't have that morphism, then it's probabilistic, it's different. So it's a conditional probability, conditioned on the way you can map across different kinds of things, and it was the perfect connection to Bayesian inference, because at the bottom here we have, basically, M satisfying the calligraphic A — the scarlet letter, the fancy A, is the Chu space. So this is almost saying M satisfies some Chu flow relationship with N that is itself about how tightly the spaces are connected. And so then we can take this P of M given N, with the single-line satisfaction notation, and it is belief updating. And it provides the necessary semantic consistency for such updating. Infomorphisms thus capture a significant representation of Bayesian inference with the necessary coherence, since the target classifier context admits the same semantics as the source within the information flow.
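The checkmate example above is ordinary Bayesian belief updating; here's a sketch with made-up numbers for the prior and the reliability of the board-to-file mapping:

```python
# A made-up numerical sketch of the checkmate example: update the belief
# that the board is in checkmate after seeing the file say "checkmate".
p_mate = 0.10                 # prior: the position is checkmate
p_file_given_mate = 0.99      # file says "checkmate" when it is (assumed)
p_file_given_not = 0.02       # file mislabeled otherwise (assumed)

p_file = p_file_given_mate * p_mate + p_file_given_not * (1 - p_mate)
p_mate_given_file = p_file_given_mate * p_mate / p_file

print(round(p_mate_given_file, 3))  # 0.846
```

With a perfect morphism the likelihoods go to 1 and 0 and the posterior is certainty; anything less, and the update stays probabilistic, exactly as described above.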
Infomorphisms are information flows that capture the updating of something semantic — semantic in that Dretske sense of meaning being in the world — like the prior and the posterior are about the world. And then new data or events come in, which I didn't think we captured as much here, but there's this event-theory connection: Bayesian statistics is kind of like the steady state, and then new data comes in, like an event in a network, and that event propagates through a network of semantic information flow and updates the system. And maybe the system's already vibrating, so the new event comes in and nothing changes with the vibration — that might be a situation where the system, semantically, is already at its non-equilibrium steady state, so it's not surprising — but then there are other kinds of events that enter a system that might be more surprising. It's all about the semantic nature of the system. Okay — so wait, can I just add onto this? In our contextualizing — and I didn't know if you were going there or not — I just want to explicitly state that Bayesian inference is the event by which we are updating, with active inference and category theory and Chu spaces. I just wanted to say that explicitly, because in this contextuality talk, we are updating our own events with our own knowledge. Yep. And I think what was sort of fun about this semantic info-flow idea, the whole channel-theory thing, which I really hadn't been exposed to in a lot of contexts, was this: instead of the propagation of bits and bytes on wires and the signal fidelity — because then where does the semantics get unpacked? Does it get unpacked at every little post office? And they just say: no, no, it's syntax all the way down. So it's inherently reductionist, because you go: no, we've explained away meaning, because we're studying the connection from an
entropy perspective. And then it's a strong, internally logical argument, which is why there are people who believe it. But could we actually think of semantics as the heart of the question? Because when we're talking about communication of action, or modification of the niche, or solving the frame problem, we don't need the syntactic differences, we want the semantic differences: what has changed — my car is broken — I don't need the syntax; from an action-oriented semantic view, that's what we're communicating on. And then we see these specialized message-passing communication systems, like on a silicon chip: it's a special case, a controlled system where we've defined the context semantically so that it isomorphically maps onto bits and voltages. Okay, here are the two Fields and Glazebrook papers from 2018, "A Mosaic of Chu Spaces and Channel Theory," parts I and II. Part I, category-theoretic concepts and tools, is a lot of formalism and diagrams, but part II has some nice images, and that's applications to object identification and mereological complexity — mereological meaning part-and-whole relationships. And this is to go from the Chu spaces to this cone-cocone construction: a system capable of both object history construction and its dual, category learning. So — history reconstruction and its dual, category learning. It's like, I'm learning about cars being broken or not, and then if you tell me it's a broken car, I can generate a vision in my mind of a broken car. So you have a bi-directional relationship that's open to generating specifics from the abstract, or updating the abstract from the specifics, which is really related to this hierarchical Bayesian learning and active inference. We have reached, after many years of research, an understanding that objects that have this sort of analytical relationship between object history construction and category learning — with singular category maintenance as a special case — I think that has to be one object that
does both — a self-dual, not just two objects with one doing one part and one the other — is characterized by a cone-cocone diagram. A cone-cocone diagram captures the simultaneous upward and downward flow of constraints that characterize human visual object identification and, it is reasonable to suppose, other sense modalities. And they then take it immediately into something fascinating, which is the functional duality between high-precision expectations and high-precision inputs in an active inference system, and hence the duality between dorsal (active) and ventral (passive) attention systems. So this is about precision modulation and the way you get high-precision expectations and high-precision inputs. How do you have a clear generative model of the visual field, but at the same time be able to recognize new objects at high fidelity as they come into your field? How are you going to have high-precision generating and high-precision learning? There's going to be this rich, multi-level, ongoing integration of priors and new events and posteriors that's semantic. So it's like: I'm looking at — oh, it's a face — and then my eyes engage in an action-selection pattern appropriate for a face, which reveals some sort of anomalous information pattern, and all of a sudden the meaning has changed. So it's almost like what is crossing the interface is perhaps the data, the observables, the measurements, but what is really crossing is the reconstruction of semantics. Like, if a car drives across your view, but it's going 5,000 miles an hour and you can't see it, then for you as the measurer it doesn't matter, because you didn't see it — it was perhaps there, but it's just a blur. So this is getting at something very interesting: systems that are exquisitely generative, yet also hyper-able to recognize and to update the generator based upon the specifics of what has been generated. All cats
have stripes? Alright, you just saw one that had polka dots, and now you can generate that, and maybe you might generalize and even generate another type of pattern of cats, and then see whether that was accurate or not. So here is the diagram Blue talked us through, with the core information channel C at the top. This is the cone diagram on top — it's like a pyramid diagram, but maybe if it rotates it's a cone, or there's some other way of thinking of it as a cone — and you have the classifiers, each a whole Chu space (the calligraphic A's are Chu spaces). The classifiers are linked through infomorphisms — semantic bridges — to the core information channel. So here's the chess board, here's another Chu space, and here's another Chu space for chess, and they're all linking to this game-of-chess local logic, and then there's a map between the board and the computer program. The g maps are the commutativity: you've got a link between the board and the computer, and the computer and the drawing, and they're all infomorphisms to C, the game of chess. So semantically, if there's a checkmate on one board, it's a checkmate across paradigms. So that's a cone, and it represents a huge variety of things — and then I'll get you to the second one, Blue. The cocone is when we — let me interject before you do the cocone, that's why I was trying to flag it. I want to interject here for Sarah, because remember we were talking about the observables being like a tight little cluster of measurements versus a not-tight cluster of measurements. And here they define it, before we go into the cocone. So you can take this back to the stoplight, and the kids playing, and what the driver does, and what the pedestrian does, and whatever. But here, instead of the categories — sorry, the classifiers — it says: when we define a finite information
channel C as a finite indexed family of infomorphisms — those are all the f's — having a common core, the component classifiers will be taken to characterize a system of observables, or some subcomponents thereof. So here they are specifically stating that all of the classifiers are all of the things that can be measured or observed in a context. That's extrinsic as opposed to intrinsic, I think — I could be wrong, and I hope someone corrects me, but I feel like this is where it's going. Cool. Cocone diagram: then we introduce the cocone, the shadow — as above, so below, wouldn't have it any other way. A commuting finite cone-cocone diagram — which one's the cone, which one's the cocone, right? — comprises both a cone and a cocone on a finite set of classifiers. So they're mapping to the same set of classifiers, the same Chu spaces are being bridged, and you have sort of two systems looking at the same bridge. And if there's commutativity across the elements of the Chu spaces — we have that from above — and there are infomorphisms from all the A's to C prime and from all the A's to D prime, then it's a special kind of object. What is special about it? It captures, in addition to probably a ton of other stuff, a more subtle duality. Remember the whole thing about studying objects that were dual objects, from Barr's 1975 work and Chris Fields's other research — what is a dual object? Shadow: a more subtle duality between processes. So not an object duality but a process duality — maybe forward and backward. It enables object files, object tokens, and object histories to be viewed not as tokens but as types that organize, respectively, trajectory components (that's files), features (that's tokens), and feature-based singular categories (histories) into mutually consistent collections. So this is some next-level token algebra, and I wondered if it related to crypto, because instead of the history — like the object history, my web history — well, that's actually a singular
category, it's a singular trajectory: you take this list of a million things and it's one of a unique kind of thing. And then even objects and tokens and descriptors are all being understood in a highly relational, highly semantically mappable way. So that's pretty cool, and I guess we'll look forward to seeing what the authors can help us with — like, what system could we apply this to first, what would be an empirical system where this is being measured, what would we gain by measuring something this way? Yeah — what are various examples of context? And then here are just a few figures from the other paper, Mosaic II. So here's a cat — well, it's an image, it's a pattern on a Google Slide, who's to say it's a cat, right? That's semantic. There are all kinds of things we might want to have here. It says: an object token is classified at construction into multiple types by distinct but cross-modulating processes; this includes animacy and agency detection, emotion-mediated threat detection, and entry-level followed by superordinate and subordinate categorization into types of objects. So depending on the context and the agent observing the stimulus — whether it's a predator or prey, whether it's a cat, or Felis domesticus, or a sub-breed of cat, whether it's something you approach or withdraw from, or something you associate culturally with good or bad luck — that is the agent's ontology, semantically, as they interface with a cue in the environment. And then this is a funny image, mapping Chris Fields's brain. It says CF — that's him — so here's CF, is a human; humans have brains; and then here's another ontology of brains: a primate brain is a mammal brain is a vertebrate brain is a brain. So you can go to superordinate and subordinate categories in a relational ontology, as well as have these totally disparate classification schemes. Just like in programming: that person
has a height and a weight, and that's in their public health record; and then there's this other thing — are they a friend or foe — and that's my private information; and all of those are being linked semantically in a special kind of system. Alright, that's one little section. Here we're bringing in this element of concurrency. Some people will have heard about this from programming languages like Go. But this is pretty interesting, because it's from 1995: a paper by Pratt — I think cited in Glazebrook and Fields — called "Chu spaces and their interpretation as concurrent objects." So it's yet another way we can think about Chu spaces: not just as mapping attributes to types or doing all these other transformations, but also as concurrent programs. So they write here that the two pressing questions for computer science in 1995 — maybe forever — are: what is concurrency, and what is an object? The first question is of interest to today's sibling growth industries of parallel computing and networks, both of which stretch our extant models of computation well beyond their sequential origins. So you have the Turing tape, zooming backwards and forwards — it's a standalone machine, not connected to the internet. When you have machines in networks, then there are semantic logics happening: the local logics are Boolean, but at a higher level they might behave more like a complex adaptive system — a DDoS attack on a network, or a certain type of governance game-theory situation in a cryptoeconomic network using a cadCAD model. Those kinds of complex adaptive dynamics at the network level may or may not be recapitulable from the syntactic, lower level. And so the first question is how we think about concurrency, and about designing good concurrent systems, when our parallel and distributed — and even unconventional — computational
paradigms are rapidly expanding. The second question is relevant to programming languages, where the definition of object seems to be based more on whatever software engineering methodologies happen to be in vogue than on the intuitive sense of object. In other words, people use "object" and "process" in ways that may have a namespace meaning in computer science but are actually less connected to the formal math than we might want. And here's one last quote from that paper. They write that, as the separate chapters of a theory of concurrency, all these different topics, Petri nets, event structures and domains, Stone duality, and process algebras, make concurrency something of a jigsaw puzzle; one is then naturally led to ask whether the pieces of this puzzle can be arranged in some recognizable order. And then they give a strong endorsement of the Chu space framework. So it's interesting that these Chu spaces can be objects, relations, and processes, but can also include context, which is what this paper is about, and algorithmic and game-like elements. If you can do it in Go, if you can do it in a process algebra, if you can use a higher-order reflexive process algebra, maybe that's a Chu space. All right, let's talk a little about ergodicity; this will take a few slides and definitions. At the top left it says a process is ergodic if the ensemble average and the time average are equivalent. And here's a more formal definition, from the Neurobytes site, quoting Ole Peters, who has been at the Santa Fe Institute and studies economics: "the ergodic hypothesis is a key analytical device of equilibrium statistical mechanics. It underlies the assumption that the time average and the expectation value of an observable are the same. Where it is valid, dynamical descriptions can often be replaced with much simpler probabilistic ones; time is essentially eliminated from the model." So this is on the
Neurobytes website, which is live and updating, but here's a static capture. It describes two ways you can make purple from blue and red. One is to have the spatial resolution increase until, semantically, it's interpreted as purple; if you doubt that, zoom in on an old comic book and see how they made color, by making the scale of resolution really small so that it looks like a new color is there. Or you can flash things really fast, and it will appear as if there's purple. Those equivalencies, those ways of reaching ergodicity, and also the deviations from ergodicity, spatially and temporally, are what we're talking about. Now, why does the ergodic assumption matter, and what happens if it's empirically violated? Here's an example from modern portfolio theory, which is what Ole Peters and others like Murray Gell-Mann study. This is a quote from Neurobytes: modern portfolio theory stipulates there exists some optimal risk-to-return profile, but the theory rests on the ergodic hypothesis by conflating an ensemble aggregate return with the individual's path-dependent return through time. So it's really interesting, and it's related to the St.
Petersburg paradox and a couple of other economic ideas: basically, the total population's performance, or a hundred different portfolios' performance at one snapshot, is not the same thing as any one portfolio being carried forward through time. Why does that matter, and how far off on the wrong track would we be? Here are some funny examples of ergodicity violations, plus a hypothetical example of ergodicity. Number two, from Neurobytes: if it takes one pregnant woman nine months to give birth, nine pregnant women should only take one month. And then this one: the recipe says bake for 60 minutes at 200 degrees, but I'll just bake for 30 minutes at 400 degrees. If the degree-minute were all that mattered, all things being equal, why couldn't we just vaporize it at 10,000 degrees Celsius for a split second? It turns out there's a process that has to play out. So what do you think about this, Blue? How would you link it to anything else we've talked about today? I don't know; I think my brain is fried. I agree absolutely about the nine pregnant women, but I'll say something: I find those examples actually worse than giving no example at all, because ergodicity implies you have to have a system-level demarcation, and to compare it to nine pregnant women, well, the system is the woman, so the analogy doesn't really hold; it actually makes it harder to understand. Good point, yes. I agree it's not a perfect mapping; syntactically, it's the kind of error you'd be making. Here's the ergodic average over a lifespan, but if you go too far down the rabbit hole trying to add variables, that's why we always pull back to the actual formalisms. So here is a definition from the math encyclopedia, which I thought was quite interesting. It's about trajectories of points, saying, as you'll hear in many other definitions, that ergodicity is about visiting a phase space through space and
through time. That was for the first example; here's what was interesting, though. When the Birkhoff ergodic theorem had been proved, it became clear that ergodicity is equivalent to metric transitivity. In the more general situation, it was no longer suitable to talk of the equality of time and space averages. So what kinds of systems are beyond spatial and temporal ergodicity? Systems with an infinite invariant or quasi-invariant measure; not only flows and cascades but also more general transformation groups and semigroups. Transformations, groups, categories, semigroups: we saw Abelian semigroups earlier, in the Barr paper. So it's almost like we're taking ergodicity even beyond the trajectory-visitation idea. I don't know where it's going in the way the authors use it, but there's the totally inadequate metaphor of just making a category error with the recipe, and then there are the more empirically observable violations: you're going to have a bad strategy if you use a model with an ergodic assumption, because you'll fit what seems to be the best or most likely strategy and just be way off base empirically. But this takes it even further. Okay, let's use a couple of these questions to bring us toward our close. I think you may have raised this one, Sarah. It was about how contextuality-by-default generalizes the intrinsic contextuality of quantum theory to the case in which other properties of a context, for example the order in which questions are asked, also affect the distribution of a variable of interest. This kind of direct influence is generally avoided by experiment design, particularly by imposing no-signaling conditions in physics, but it's virtually inescapable when studying complex macroscopic systems. So these are some pretty open questions. How does context relate to
ergodicity, especially when systems are changing their context, or vice versa, or both? How does this contextuality-by-default specifically help us here? And how do Fields and Glazebrook build upon contextuality-by-default? Maybe add another plank to that question. Okay, here's another set of questions. The existence of complementary observables has perplexed physicists for decades; Feynman characterized it as the only mystery of quantum theory. How are such paradoxes resolved at the implementation level, if they are, or worked around, if they're not? What happens when workaround mechanisms fail? What is required to detect failure post hoc? The ability to probe this experimentally would begin to answer the broader question of how prior probabilities are represented in memory and how different sets of priors can be deployed in different contexts. So how do we know what we don't know for a given context, and how can a context be made to entail unknown unknowns? How do we bound our not-knowing? I think part of the answer is going to be defining everything via the Markov blanket: there's us, the internal states, plus the blanket states, which are what we can observe and control, and everything else is going to be conditioned on a generative model of external states. We're never even going to peep outside the blanket states; we're going to be serious about modeling ourselves and the blanket states as interacting with and modeling external world states. So instead of asking how tall the tree is, you're going to have you, the measuring device, and the measurement together in a system. For some easy questions that will be total overkill, but for other questions it's really going to open the door, because once people frame things this way, what was an unknown unknown becomes something we can actually encapsulate. That brings us to this question. Here's a quote: we apply this development of ideas to recover, among other things, a new theoretical and inferential
role for the Markov blanket concept. We read this in the introduction because it's a key point: the blanket as the locus at which free energy is minimized, and as the epistemic knowledge barrier that keeps contextual information hidden from the observer. So where is the Markov blanket coming into play? What does this have to do with our active inference modeling and learning, and with all these issues we've been talking about, representation, internal versus external perspectives? Where does this paper come into play? Okay, Sarah, this is one of your suggested slides from a couple of weeks ago, but it's relevant here again. So maybe say anything you find interesting about context or frame or reservoir computing; you're our philosopher of science, so it's always good to hear what you think. My brain is pretty fried, and my eyeballs are fried, but I think I can. Anyway, the thing that strikes me as a kind of contrast with what I'm able to read from this paper: I get the sense, when they're writing, that they're asking how we can have the eye of God across all contexts and shifts, or at least they're talking about that in the context of modeling how our world actually is. But one of the things that makes reservoir computing so powerful is its non-ergodicity, its kind of natural partitioning, some heterogeneity in some way. So that's really interesting to me: how can a body compute, and what are the attributes of a body, what are the kinds of imperfections of a body, that make it intelligent in that sense? In what way do channels form, cutting at natural joints, making these kinds of boundaries, and why? Those are the interesting parallels for me. Yep, that idea of local consistency, organismality, the end-in-itself: that's like the Kantian
organism concept, but it's an aberration; it's far from equilibrium relative to the global context, which is just hydrogen molecules dispersed. So when you have global disorder, or some sort of perturbation, and then a locally organized bubble, it's our unique differences and situation that allow that agent to be adept in its niche; it almost couldn't be another way, or it's hard to see how. The thing that's interesting to me is that you don't even have to think about, well, I'm sure there is more to think about in terms of adaptation and response and things like that, but just: how does a rock compute? There was one paper I was reading where they pointed out that if you stick a rock in the oven, it doesn't necessarily heat up in all the same places. So you do have these non-homogeneous attributes of the rock, and it's a very loose analogy, but that is essentially the kind of type distribution that makes these bodies able to compute in some way that is still a little bit mysterious. Yep, and it's like the little worm: its body is set up with the muscle and the connective tissue and the nerve so that it moves away from a prick on its side. It doesn't need to be computing where everything is; if its body is able to respond, then it can do what it needs to do in the niche. But you don't even need to be alive; that's the thing that kind of blows my mind about how this computing works. Yes. So, implications and next steps, and it's good that they went there. The first question, which is expanded upon more in this Basieva paper, is exploring contextuality for human measurements. Cool.
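The question of contextuality for human measurements is often posed in contextuality-by-default terms: before claiming true contextuality, you first check for direct influences, i.e., whether a variable's marginal distribution shifts when the context (such as question order) changes. Here is a minimal, hypothetical sketch of that marginal-consistency check; the probabilities are invented for illustration and do not come from the paper or from any experiment.

```python
def marginal(joint, axis):
    """Marginal distribution of one variable from a joint {(a, b): p} dict."""
    out = {}
    for (a, b), p in joint.items():
        key = a if axis == 0 else b
        out[key] = out.get(key, 0.0) + p
    return out

def signaling(ctx_a, ctx_b, axis, tol=1e-9):
    """True if the marginal of one variable differs across the two contexts."""
    ma, mb = marginal(ctx_a, axis), marginal(ctx_b, axis)
    return any(abs(ma.get(k, 0.0) - mb.get(k, 0.0)) > tol for k in set(ma) | set(mb))

# Context 1: question A asked before question B (made-up response probabilities).
ctx1 = {(0, 0): 0.40, (0, 1): 0.10, (1, 0): 0.10, (1, 1): 0.40}
# Context 2: question B asked first; note that A's marginal shifts.
ctx2 = {(0, 0): 0.30, (0, 1): 0.10, (1, 0): 0.20, (1, 1): 0.40}

print(signaling(ctx1, ctx2, axis=0))  # True: A's marginal is 0.5/0.5 vs 0.4/0.6
print(signaling(ctx1, ctx2, axis=1))  # False: B's marginal is 0.5/0.5 in both
```

If the marginals match across contexts but the joint correlations still cannot be reproduced by a single context-independent random variable, that residue is what the contextuality-by-default literature calls true contextuality.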
So: understanding how context switching is implemented at the neurocognitive level. That connects to the Feynman quote we just looked at, but basically: somebody's your friend, you're in a narrative of friendship, from the cultural level down to the day, down to the second, down to the micro-expression, and then a stimulus happens and the context switches. That trajectory is still unitary, your relationship with them is still a thing, but it has changed because the context has changed. How does that get implemented, this combination of high-precision priors and high-precision inputs? How do we get the best of both worlds, with this rapid attentional modulation as brains move into and away from objects, plus context switching? Those features are just not within the scope of Shannon information theory; they're much more semantic questions. A third and broader question is whether intrinsic contextuality can be detected and characterized at multiple scales in complex systems generally. By formulating a scale-free model of contextuality in category-theoretic terms, and in particular by employing the concepts of classifiers and information channels, "we hope to have opened a new pathway towards addressing these questions" in multiple systems. So maybe, if there is something special to be said about contextuality, or the violations thereof, two sides of the same coin, we could identify in a less biased way what scales of organization are relevant, or just detect new dynamics of complex systems. Because there's a lot of recent work showing that complex systems can emit structured outputs that have the signature of white noise, the signature of nothing. An encrypted file is going to have a distribution that, syntactically, looks like nothing, no different from the frequentist's expectations, but it's a semantically encoded local logic
that is allowing communication through that channel. So if all you have is a white-noise detector, you're going to think the signal contains no information, when semantically it does. There could be systems in and around us doing things we don't even see, and that would be no weirder than wavelengths we can't see; we know that's true, and we know there are things too big or too small for us to see. So could there be things, informationally or from a narrative perspective, that we're just not perceiving? It doesn't seem out of the picture, so I think it's actually a major area. And for those who have participated or listened this deep in: Chris Fields was the one who recommended this paper to us, saying, I think this is a topic that is not really thought about much in the active inference community, but it will become important later. I don't even know exactly which topic he was referring to, but this was a paper that raised some huge questions. So, all kinds of questions, but I think it's fair just to say thanks to our two of three brave and awesome participants, thanks Sarah and Blue, because that was a fun conversation, and we definitely laughed and memed, because this was the hardest paper we've read yet. Definitely. But you know, the paper was so well written. I am not a big fan of reading math papers, lemma this and theorem that, and I usually don't enjoy it at all, but I really did find this paper very well written and easy to follow. I agree. When there was a formalism, a theorem, a definition, a figure, it was like: okay, I'm not going to understand that, but I know it's just the theorem, so I can say, got it, something I don't understand, put it away, and encapsulate it. So I agree. Any last thoughts, Sarah or Blue? I'll just say I was almost wishing, as I was reading the paper, for some concrete examples, but we would have ended up with a
65-page paper. So I guess that's our job, to be the concrete example, you know. Yep, and that will be fun to talk to Chris about. So, until we actually get to talk about this paper, which is going to be on March 9th, let's just chill, you know, do our other jobs instead of just reading this paper, and think about what would be really powerful and meaningful to ask them, semantically.
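As a closing illustration of the ergodicity discussion earlier, the ensemble-average versus time-average distinction that Ole Peters emphasizes can be computed directly for the standard multiplicative coin-flip gamble from ergodicity economics; this is a stock illustrative example, not something drawn from the Fields and Glazebrook paper.

```python
# Illustrative (made-up) numbers: each round a fair coin multiplies wealth
# by 1.5 on heads and by 0.6 on tails.
ensemble_growth = 0.5 * 1.5 + 0.5 * 0.6   # expected one-round growth factor
time_growth = (1.5 * 0.6) ** 0.5          # per-round growth along one long path

T = 100
mean_wealth = ensemble_growth ** T        # ensemble-average wealth after T rounds
typical_wealth = time_growth ** T         # wealth along a typical single path

print(round(ensemble_growth, 3))  # 1.05: the ensemble average grows every round
print(round(time_growth, 3))      # 0.949: a typical trajectory shrinks every round
```

The ensemble average grows 5% per round while a typical individual trajectory decays toward zero, which is exactly the conflation the modern-portfolio-theory quote warned about: the snapshot across a hundred portfolios is not any one portfolio carried forward through time.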