Hello and welcome. This is Active Inference GuestStream 79.1 on April 3, 2024. We're here with Ryota Kanai talking about meta-representations as representations of processes. We'll have a presentation and then a discussion. So thank you very much for joining, and looking forward to hearing the presentation and talking. So go for it. Thank you very much for this opportunity to present our recent work, and thank you for spotting our very recent paper; we just uploaded the preprint. So in this talk, I want to talk about what meta-representations are. But to explain my motivation, I want to start from my frustration with theories of consciousness. A lot of theories of consciousness tend to be described at a semantic level. In today's talk, I mainly focus on higher-order theories of consciousness. They make sense at the semantic level, but when we think about how we can implement such high-level theories with neural networks, we realize there are a lot of uncertainties about how we might be able to implement them. I think this is something we can call a constructivist approach: by thinking about how to implement theories, we can actually reveal vagueness and make the concepts more precise. So that's the main topic. As I said, I want to talk about higher-order theories of consciousness. Many of you might have heard of them. Broadly speaking, higher-order theories of consciousness claim that a mental state is not conscious simply by virtue of that processing occurring; it becomes conscious when it is represented by a higher-order mental state. By higher-order mental state, or higher-order representation, we generally mean a representation about another representation. Intuitively, I think this makes sense. For example, when part of your brain is processing something red, people ask whether that's sufficient for a conscious experience of seeing red. But when you have an additional representation that represents that you are processing something red, that kind of mental representation seems to be functionally related to awareness. So that makes intuitive sense. But once we start thinking about how we can implement this sort of mental representation, it's very unclear what it really means. So we tried to find a way in which we can represent mental representations. But let me explain why it's difficult. Okay, so this is a very naive way to think about constructing a mental representation. Let's say you have an image as an input, there's some processing in the brain or in a deep neural network, and then you can represent the contents of the input, the image. If you transform this first-order representation with another neural network, that's a really naive way to implement a mental representation. But with this, you can easily have many "mental representations." If we think of this as visual information processing in the brain, this could be the retina, this could be the LGN, this could be V1, V2, and so on. But then somehow everything is just a representation of the first input, so this doesn't work. Or, if you think this is enough for a mental representation, then every stage would count as a mental representation of something else. That seemed a little bit strange. Another way to conceive of mental representation is to think about confidence. In the context of metacognition and cognitive neuroscience, we often take confidence reports from participants in an experiment.
For example, we could present some visual stimulus and then people report what they saw. If it's a color discrimination task, they could say blue versus red or something like that. You use this first-order information to make a perceptual decision, but you can also report your confidence in seeing that stimulus. So this is probably a bit better than the very simplistic transformation as a mental representation. On the other hand, if we think about implementing this, it's a very simple operation. People working with artificial neural networks routinely use something called softmax, which converts this kind of representation into something with a probability interpretation. Basically, if you convert the activation pattern of this layer so that it's normalized to sum to one, you can read a confidence off it. So in a way, this is just one very simple transformation of the first-order representation. Maybe this is still meaningful when we think about how the brain processes this kind of uncertainty, but at least for me it was not really satisfying as a way to characterize mental representations. So the main solution we came up with is this idea. We always tended to think about transforming the first-order representation into something else. But what we are proposing here is to construct a representation of processes rather than of these kinds of states. The idea is that any neural network can be conceived of as some sort of function from input to output, and there is a way to represent these functions or processes. So instead of transforming the first-order representation, we are proposing that if we map this function f to some meta-representation, we can have a potentially better notion of meta-representation, because this makes a qualitative difference. But this might be somewhat confusing, so I want to give you an example using artificial neural networks. That's the main topic of the paper. Okay, here I just want to briefly introduce the concept of a latent space, or embedding, for people who are not familiar with this kind of thing. In artificial neural networks, there is the concept of an encoder and a decoder. For example, let's say you train an autoencoder neural network where the input is an image and the output is an image. Basically you want to reconstruct the input using the encoder and decoder, but there's a low-dimensional bottleneck in the middle. Usually when you have this kind of construction, you find interesting features, where the vectors in this space capture the important characteristics of the dataset, in this case images. So that's the idea: basically it's a compressed, nice representation of the dataset. What we are proposing is to create a sort of meta-autoencoder where the input and output are neural networks. A neural network can be parameterized, or represented, by its weights and biases, and you can use many different ways to convert neural networks into vectors. And then you reconstruct the neural network. Maybe you can already see this is somewhat meta. With this construction, you can have a representation of neural networks: each point in this space corresponds to some neural network, or function. So in a way, this is a space of functions parameterized by the latent axes. That's the general idea. I hope you get the idea so far. And from here, I want to explain the actual experiments we did.
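As an aside for readers less familiar with softmax, here is a minimal sketch of the point being made above: confidence derived from a softmax is just a one-line transformation of the first-order activations. The two-class logits below are made-up numbers, not values from the paper.

```python
import numpy as np

def softmax(logits):
    """Convert raw activations into a probability-like vector that sums to one."""
    z = np.exp(logits - np.max(logits))  # subtract the max for numerical stability
    return z / z.sum()

# Hypothetical first-order activations for a red-vs-blue discrimination.
logits = np.array([2.3, 0.4])

probs = softmax(logits)            # e.g. [0.87, 0.13]
decision = int(np.argmax(probs))   # the first-order perceptual decision
confidence = float(np.max(probs))  # "confidence" is just the winning probability

print(decision, confidence)
```

Nothing in this computation looks beyond the first-order state itself, which is exactly the dissatisfaction voiced in the talk.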
And so this is a little bit more detailed construction of what we did. We first created a lot of encoders by training autoencoders on all kinds of different stimuli. For example, in one case we trained this kind of first-order neural network just on cars, or just on flowers, or sometimes just on the sound of dogs barking, and so on. So each first-order neural network is specialized for some stimulus category. We used both visual and audio stimuli, but each network is trained on just one type of stimulus, one category. This encoder can be represented as 1056 by 16 elements: in the first-order latent space we have 16 dimensions, and the input images have 1056 pixels. You can see these numbers as the parameters of the first-order network. It's a very simple neural network, just one transformation, and each column corresponds to a filter. So one column here is a filter in this network, and we took each of these filters as an input: we trained another autoencoder to represent individual filters, and then we computed a representation of the whole network using these two networks. This might be slightly confusing, but the main thing is that we get a representation of the network at this stage. And then we created another meta-autoencoder using this kind of latent representation of the first network. It might look a little different from the previous picture, but that's more to deal with the technical aspects. The main thing is that we trained a lot of small networks, each specialized for a stimulus category, and then we tried to get representations of those networks. So that's the idea. Okay. And then we applied a dimensionality reduction to visualize the latent representations. Here all the blue dots correspond to visual stimuli of some category, and the orange points correspond to auditory stimuli. You can clearly see there's a separation between the visual and the audio stimuli. That's kind of interesting, right? Because this suggests that when you have a meta-representation, you can already tell whether a network is specialized for visual or auditory stimuli. That's what we were excited about. But you might also be able to see some clustering of the same shapes. It might be a bit difficult to see in this image, but you might be able to tell whether a network is specialized for cars or buses or things like that. So we tested that by trying to predict the original dataset just by looking at the meta-representations. As you can see, there's a clear separation between modalities, but the diagonal elements are also a little brighter, which suggests that sometimes we succeed in predicting the specific stimulus category. So that's really exciting. Okay. So based on this, there's already some information in the structure of the weights of a network that tells us what kind of information it was trained on and specialized in. That's an interesting observation. We also tried this in the original weight space, but in that case we cannot really predict which category the network was trained on. So in that sense, you actually have to have this kind of meta-representation to be able to make these predictions. Okay.
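To make the two-level construction concrete, here is a minimal sketch, not the paper's actual architecture: a first-order autoencoder whose encoder is a 1056-by-16 linear map (the dimensions mentioned in the talk), and a meta-autoencoder that takes the flattened encoder weights as its input. The hidden sizes, the simple flattening, and the 8-dimensional meta-latent are assumptions for illustration; the paper's filter-wise construction is more elaborate.

```python
import torch
import torch.nn as nn

class FirstOrderAE(nn.Module):
    """One category-specialized autoencoder (e.g. trained only on cars or dog barks)."""
    def __init__(self, n_pixels=1056, n_latent=16):
        super().__init__()
        self.encoder = nn.Linear(n_pixels, n_latent)   # weight matrix: 16 x 1056
        self.decoder = nn.Linear(n_latent, n_pixels)

    def forward(self, x):
        return self.decoder(self.encoder(x))

class MetaAE(nn.Module):
    """Meta-autoencoder: its 'stimuli' are the weights of first-order networks."""
    def __init__(self, n_weights=1056 * 16, n_meta_latent=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_weights, 256), nn.ReLU(),
                                     nn.Linear(256, n_meta_latent))
        self.decoder = nn.Sequential(nn.Linear(n_meta_latent, 256), nn.ReLU(),
                                     nn.Linear(256, n_weights))

    def forward(self, w):
        return self.decoder(self.encoder(w))

first_order = FirstOrderAE()
# Flatten the trained encoder weights into one vector and embed the whole network.
w = first_order.encoder.weight.detach().flatten().unsqueeze(0)  # shape (1, 16896)
meta = MetaAE()
z_network = meta.encoder(w)  # a point in the space of networks / functions
```

In this picture, each trained first-order network lands at one point like `z_network`, and the clustering described above is clustering of such points.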
So maybe one question is: why is this possible? Especially when we compare vision versus audition, maybe the key difference is that they have different kinds of invariances or equivariances. For example, in images, the object category or label does not change if we translate the image or change its scale, and classical convolutional neural networks rely on this kind of invariance. But for auditory stimuli, if you translate the input, for example shifting the spectrogram along the frequency axis, you experience a different kind of sound, and the identity of the sound changes. So there might be modality-specific invariances. Maybe after training, a network learns to capture that kind of invariance, and the meta-representation may also find this kind of representation of invariances or equivariances in the structure of the weights. That's our current explanation of why this works, and I'll come back to this later. In a way, our experiment was also motivated by the classical question of what it's like to be a bat. The interesting thing is that bats can do echolocation by emitting sounds and then receiving the echoes, and they use it for navigation. So maybe it's like vision for them, but it's hard to say whether they are experiencing a visual quality or an auditory quality. That's the kind of question we eventually want to answer in consciousness research. To introduce an interesting context: there is a very exciting series of studies from Mriganka Sur's group at MIT in the early 2000s. Maybe many of you are already familiar with this experiment, but they managed to do an experiment where they rewired the input from the retina to the auditory cortex. I think it's really amazing that they can actually do this sort of thing. The idea is that the auditory cortex developed while receiving input from visual images, the data coming from the retina. They had several Nature papers and they were all very interesting, but one of the key findings is that after the auditory cortex was trained on visual input, its connectivity pattern looked more like that of normal visual cortex, with neurons connecting to other neurons with similar orientation properties. So somehow the connectivity seemed to reflect the statistics or structure of the data they learned. And there's an additional interesting behavioral experiment they did. The main question they asked is whether the animals see or hear the activation of this rewired auditory cortex. And the short answer is that the animals seem to see the activity in the rewired auditory cortex. That's a really amazing result. Based on this, I thought this could be a really useful way to think about whether some activity, or some neural network, is more like one for hearing or for seeing. That's something I always wanted to ask. I've been thinking about testing IIT using this kind of visual versus auditory qualia. The question is whether we can tell whether a piece of cortex is visual or auditory just by looking at its connectivity. And of course, IIT is very difficult to apply and we don't know yet whether IIT is true or not. But at least we need to think about how we can test implications of IIT with empirical research. Okay, so this was what I was thinking originally.
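A toy illustration, under simplifying assumptions, of the invariance asymmetry described above: a descriptor built on the Fourier magnitude of an image ignores circular translation, whereas shifting a spectrogram along its frequency axis moves energy into different bands, i.e. it becomes a different sound. The arrays below are synthetic placeholders, not stimuli from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((32, 32))

# Images: the magnitude spectrum is unchanged by circular translation, so a
# descriptor built on it is translation-invariant by construction.
shifted = np.roll(image, shift=5, axis=1)
print(np.allclose(np.abs(np.fft.fft2(image)), np.abs(np.fft.fft2(shifted))))  # True

# Spectrograms (frequency x time): shifting along the frequency axis is not a
# harmless transformation -- the energy lands in different bands, a different sound.
spectrogram = np.zeros((32, 32))
spectrogram[4, :] = 1.0                           # a pure tone in a low band
pitch_shifted = np.roll(spectrogram, 10, axis=0)  # the same tone, ten bands higher
print(spectrogram[4].sum(), pitch_shifted[4].sum())  # 32.0 vs 0.0
```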
So let's say you take some anatomy from visual cortex or auditory cortex, and there are many techniques to look at anatomical connections and functional connections. It's probably still very difficult to read out the weights from the actual brain, but if we had this, we should be able to compute information structures as suggested by IIT. And then we could probably match the structure to the report of visual quality or auditory quality. That's the ideal experiment I wanted to do, but there are a lot of challenges. The first one is that in the actual brain it's very difficult to have a complete characterization of the anatomical connections, and it's also impossible to look at the causal patterns in the activations of all the neurons in question. So that's very hard; it seems impossible. Another difficulty is that the computation of phi in IIT is very hard. We've also been trying to compute surrogates of phi, or earlier versions of phi, but it's still very hard if we want to compute phi for anything more than, say, ten or so neurons. So that's pretty hard. And maybe a third thing is: how can we match the structure of the report of experience to the structure of information? That's again a difficult question. But with the approach I showed you earlier, we can sidestep some of these problems. Instead of looking at the complete anatomy of human visual cortex and human auditory cortex, we can create artificial neural networks by training them on specific stimuli, in this case sounds and images. The great thing about artificial neural networks is that we can see all the weights and connectivity patterns. Of course, we don't know whether they are conscious or not, but this approach gives us something concrete to work with. Another thing, about computing phi in IIT: we can do something slightly different. There's a really interesting paper by Mediano and colleagues where they propose the concept of weak IIT. The original IIT has both mathematical and philosophical aspects, that's my interpretation. But if we use only the mathematical implications, or use phi as an index of complexity or something like that, then we can use IIT in a more pragmatic manner. That's their proposal of weak IIT. I think there has been a lot of empirical research driven by IIT, and I think that's a good thing. But here, what I'm proposing is more like conceptual IIT. In the meta-representation work I presented today, I also took some inspiration from IIT. The idea is that, in IIT, whether some network generates a visual or an auditory quality should be fully determined by its connectivity pattern. That's a conceptual implication of IIT. But instead of applying the mathematical formalism of IIT, we just use autoencoders to embed neural networks in a practical, computationally tractable manner. So I think this kind of conceptual-IIT approach could also be a useful way to make progress in consciousness research. Okay, so the final points. Maybe one question is whether meta-representations as we presented them today actually exist in biological brains. That's highly questionable. For example, today I embedded neural networks just from the weights of the first-order networks. But in the brain it seems impossible to read out the weights from other brain regions.
It may not be impossible, but it seems very unlikely. So maybe in the brain there is a different mechanism to achieve this. Instead of using weights, if you have many input-output pairs, you can also construct a representation of that kind of relationship. Without a figure or equations it might be a bit difficult to understand, but let's say you have two brain regions, like V1 and V5. If a third region receives input from both regions, that third region can learn to represent the pair of representations. So that kind of meta-representation may exist in the brain. We've also been thinking about how we might be able to find such representations with fMRI, so that's a topic for future study. Another question is: what's the point of having this kind of meta-representation? What's the functional role? Our current interpretation is that when you have this kind of meta-representation, you can compare different networks; you can have a qualitative characterization of first-order networks. Let's say you have a meta-representation of a red-color-processing network, or meta-representations of many different first-order neural networks. Then in this space you can talk about whether two processes are similar or different. For example, in that space you can say that visual experiences are very different from auditory experiences, but within the visual experiences there are many different types of experiences, and here you can talk about distances and similarities. Maybe it's a bit similar to the idea of word2vec, where words are embedded in a latent vector space. When you have that kind of space, you can actually have some reasonable representation of the semantics of the words. Here, if you embed neural networks, you can have a sort of semantic representation of those networks. That might be the potential functional role. So in a way, this already has the flavor of something like quality: when we talk about quality, we always compare certain experiences to other experiences, and we can talk about whether or not two experiences are similar, and so on. In that sense, maybe this kind of meta-representation might be very important for us to be able to report the qualitative aspects of our own experience. Okay, yeah, that's all. Thank you very much. Thank you. Awesome. Cool. I wrote down some questions, and also anyone watching live can write questions. Well, thanks for sharing it. It's definitely a very striking finding about the visual and the audio differing. So just a preliminary question here: how was time handled in the audio setting? Oh, good question. We just converted the raw signal into an image, so it's basically time and frequency, so that we can use the same network for handling the two different modalities. Yeah, interesting, because it's also a difference between those two kinds of features. And it made me wonder about video, video chatting or watching a video, perhaps as being a meta-representation with audio and visual. Because if there's a lag, or a disconnect, or any number of other relations, it can be noticeable as a contrast. Like, oh, they're talking louder than it looks like they're talking, or something like that. So have you looked at that kind of fusion, or how could you look at it with the architecture that you had here? Oh, that's a great question. We have a very simplistic architecture, since this was just a proof of concept.
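A minimal sketch of the audio-to-image step mentioned above, under assumed settings (sampling rate, window length, and a synthetic tone in place of real recordings); the actual preprocessing in the paper may differ, for example in using mel scaling or a different dynamic-range compression.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 16_000                                  # assumed sampling rate
t = np.arange(0, 1.0, 1 / fs)
audio = np.sin(2 * np.pi * 440 * t)          # stand-in for a recorded sound

# Short-time Fourier analysis: rows are frequency bands, columns are time frames,
# so the sound becomes a 2-D "image" that an image autoencoder can ingest.
freqs, times, Sxx = spectrogram(audio, fs=fs, nperseg=256)
log_image = np.log1p(Sxx)                    # compress the dynamic range
print(log_image.shape)                       # (129, n_frames): frequency x time
```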
But I think in the brain, we must have really multimodal representations at the same time, so that we can compare different experiences at the same time. That's where we get somewhat close to the idea of the global workspace. We are not directly addressing your question, but in a separate project we are actually training global-workspace-like neural networks so that they learn multimodal representations. In that context, we believe it's very important to look at the latent-space structures from different modalities and see how they're related to each other. With Rufin VanRullen, we wrote a paper in Trends in Neurosciences where we proposed that the global workspace may be a kind of shared latent space across different specialized modules. So I think that part is also an important next step for understanding theories of consciousness from the perspective of deep learning. Interesting. I also found it very interesting how you began with the uncertainty estimate, because in active inference a lot gets loaded onto uncertainty estimates of different parameters. And that made me think, yeah, thanks for the slide, it made me think that if you only have two parameters to encode, then you can encode the mean and the variance. However, with the neural network autoencoder concept, you could project down to just two dimensions, or it could be more. So there's a much richer palette and more bandwidth than just a single statistical distribution, even though it's also composed of statistical distributions. The minimal, simplest, most essential case is really a single statistical distribution, but this is basically talking about the connectivity of multiple distributions. Interesting comment. I guess maybe the crucial question is whether and how uncertainties are represented in the brain. When we think of uncertainty estimation in terms of two parameters, that's a mathematical notion. But we probably don't use just a single neuron to represent all uncertainties; there might also be a population representation for uncertainty. So in the actual implementation in the brain, you may actually use many neurons as well. I've also been very interested in this topic, especially uncertainty estimation in the thalamus. My friend and colleague found uncertainty-estimation neurons in the pulvinar of the thalamus for visual experiences. I thought that was a really cool study, and there seems to be a close link between having high confidence and conscious experience. So I thought this might be a really key ingredient. But somehow, when we think about uncertainty in terms of deep learning, it seems really trivial. So I feel like there's a gap there. That's very interesting. There are a lot of ways to go with that. In the last two points you had up there, you returned to this kind of functionalist question, or at least perspective: what is it doing? So that made me wonder, when you look at the meta-representations for the networks you construct, do they seem to convey something like summary statistics of the network, like the overall sparseness of the connections or some kind of network description? You also mentioned the differences in stimulus type. So what parts of the meta-representation reflect what the network is, and also what it does input-output-wise?
We don't really know what's in the meta-representations of our networks, but maybe this slide is relevant. Each first-order neural network tries to find good basis functions, good filters, so that it can efficiently encode the images or sounds. But those really come from the statistical patterns in the stimuli, so they should somehow be reflected in the weight structure. That may be related to these kinds of invariances or equivariances. That's our speculation, but it also holds for really specific types of things. For example, these are all quite specific stimuli, like maybe this is air conditioning or a car sound and so on, so they may have really specific structure that gets embedded in the neural networks. The interesting thing is that when you train a neural network, you get something different every time. On the surface the networks look very different, but across the results of many different training runs, different networks trained on the same stimulus, there is something common to them. And that's the kind of feature we wanted to capture with the meta-autoencoders. Okay, this may be reading too much into the image, but it's V for visual, A for audio? Oh yeah, that's right. The big off-diagonal blocks, the solid purple ones, are just the difference between the audio and the visual. But then the audio stimuli get confused a little bit with other audio, and in the visual, some categories seem to go together, so it's kind of like a natural association even though it wasn't trained, or maybe it's just because it's off the diagonal. Yeah, there might be some bias. So for this one, somehow the predictor thought all the images came from this one. Yeah, but it seems from this figure that the different audio stimuli are very distinct, whereas the visual ones seem to share more. That may be due to technical constraints. Here we had to use really small patches, so maybe instead of looking at object categories per se, those networks looked at texture patterns across different categories. So there might still be some information, but maybe it is not very specific to the individual categories. Hmm. Yeah, I mean, you're shooting for the moon with the consciousness component, but even making simpler networks to understand what aspects latent spaces learn is a very useful method, and I think this provides a strong example of it. Yeah, I hope so. But also, since we work in the domain of AI research as well, we encounter something called mechanistic interpretability, right? It's a field of AI research where people try to understand what's happening inside neural networks, because a lot of people think deep learning is a black box and we don't understand what's going on inside. I think eventually we want to understand what's happening in neural networks as well, and it should still be easier compared to understanding the brain. In the context of our idea, I talked about this a little bit, but in deep learning we know all the weights. We can do any experiment: we can do ablations, we can look at the function of part of the network, and we can give them millions of trials. So it's like an ideal experimental setup for neuroscientists.
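As a hedged aside, one concrete reason why networks trained on the same stimuli can look very different in raw weight space while computing essentially the same thing: hidden units can be permuted without changing the function. The sketch below uses made-up random weights purely to demonstrate the symmetry; a learned meta-embedding can in principle abstract over this kind of difference, which direct comparison of raw weights cannot.

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(16, 1056))   # encoder weights (hidden x input)
W2 = rng.normal(size=(1056, 16))   # decoder weights (output x hidden)

def forward(x, W1, W2):
    h = np.maximum(W1 @ x, 0.0)    # ReLU hidden layer
    return W2 @ h

# Permute the hidden units: the weight matrices now look completely different,
# yet the network computes exactly the same input-output function.
perm = rng.permutation(16)
W1_p, W2_p = W1[perm], W2[:, perm]

x = rng.normal(size=1056)
print(np.allclose(forward(x, W1, W2), forward(x, W1_p, W2_p)))  # True
print(np.linalg.norm(W1 - W1_p))   # large: far apart in raw weight space
```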
So if we don't understand deep learning neural networks, it's hopeless to understand the brain, because it's much harder to do experiments in the brain; we cannot know the weights and connectivity, so we only have very limited access to the actual material. In a way, we can practice how to understand systems using simple neural networks and then use that experience to understand the brain. I think that kind of cooperation is potentially very interesting. Yeah, very well said. It reminds me of some work by Jonas and Kording from 2017; the paper is called "Could a Neuroscientist Understand a Microprocessor?" They had an in-silico simulation of a whole microprocessor doing different operations, so they could do the lesions and the double lesions and make all the recordings. That revealed, on the one hand, the limitations of different methods that people often use to ascribe function to the brain. However, there's always this component where it's like, well, maybe that's just because the processor is a weird architecture, or because the operations it does are very synthetic and don't really have a natural component. Whereas you're proposing that the base neural networks must deal with the symmetries of audio, vision, and so on, the kinds of challenges that organisms actually have to solve with nervous systems, as opposed to software, which could have been written in an arbitrary way. So the results are a little bit different, but it makes the same kind of point, which is that having the in-silico version that you can run digital simulations on can help you identify whether your studies are well powered or not, and do all these other useful things, even if it doesn't directly answer the question itself. Yeah, I'm also a big fan of that Kording paper; I think they made a really important point. Even if we have access to everything, we still may not understand it. And it's hard to say whether organic systems are easier to understand than synthetic systems; I guess we just don't know. But without making any assumptions, I think it's just very hard to understand computation happening at many different scales, not just in the brain, but inside individual cells, or at smaller scales, or larger scales like a galaxy. I think we still lack the science to connect physics to computation. In a different paper, we also propose something called universality. The point is that a theory of consciousness should also be universal, in the sense that we can apply it to non-biological brains. A lot of times people ask whether current LLMs are conscious or not, but it depends on the theory you subscribe to. Current theories like higher-order theory or global workspace theory do not tell us whether AI is conscious or not, and I think the reason is that we need to define the theories in terms of physical systems. For example, it's hard to say whether an artificial neural network has a global workspace. We have those concepts, but when we try to analyze physical systems and see whether there's a global workspace inside, it's very hard. So we need to make such theories universally applicable, so that they can tell us whether computations happening at many different scales correspond to the concepts in those theories. Yeah, that definitely calls back to the alien slide.
And if we had a tissue sample with only the static topology and no function, what could be assessed? On a different note, in active inference, I mean, half of it is action. So when we're thinking about variational autoencoders and the transformation from observations down into state spaces, often that's in terms of the policy rather than only a compression, which is kind of a sense-making and reverse-sense-making layer. That's the predictive coding, predictive processing origin, and then also bringing in the action representations. So that's like being able to distinguish the primary auditory and the primary visual from the motor cortices. And those may have very simple or very sophisticated representations, but there are also ones that could be directly correlated with bodily movement or muscle activity, rather than being correlated only at some level with something you can't measure. Yeah, here we only talk about sensory experience, but like you said, I think some of the meaning of sensory experience comes from action, or from its relation to your body. For example, without a body, there's no up or down or left or right. There's no direction, but you get this kind of directionality in the visual image in relation to your body, for example. And maybe you also get representations related to affordances; whether something is actionable also relies on some combination of action and sensory representations. But I think it's very interesting to think about how we can extend our current research into more agent-based architectures. That makes me think of the possibility that the meta-representations could be higher judgments, whether or not those are experienced. A very cross-modal judgment would be something like: could a human make this? That could be asked across modalities, or could reflect multiple modalities, but it has to draw information not just from the sensory data, but from other kinds of memory. It connects with broader cognitive modeling and agent architecture, whereas here it's really reduced to just the streams of vision and audio, which is the right starting place; and then there are going to be other streams that aren't just sensory. Oh, okay, that's a great comment. This reminded me of something I had in mind and forgot to mention. We are thinking of this kind of meta-representation as one of the modules attached to the global workspace. When you have something like a global workspace, you want to use functions not just from the sensory inputs; you want to combine that with action, reward, and so on. So in principle, we want to use representations of potential actions, or even mental operations; adding two numbers is a kind of mental action, for example. We are actually thinking about connecting this kind of meta-representation to general intelligence. When you have a new problem, you have a representation of the task, but to solve a new task you want to find a potential combination of specialized modules you can use to solve that task. So in a way, this kind of meta-representation can be used to match the tools you have in your brain with the current task. And in that toolbox, you don't just have these kinds of visual representations; you may also want to include the potential future actions you can take.
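A minimal sketch of the "toolbox" matching just described, purely illustrative: embed a new task and a set of specialized modules in the same meta-representation space and pick the nearest module. The module names, the 8-dimensional embeddings, and the way the task embedding is produced are hypothetical stand-ins, not anything from the paper.

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(2)

# Hypothetical meta-representations of specialized modules (sensory and action alike).
module_embeddings = {name: rng.normal(size=8)
                     for name in ["cars_vision", "dog_bark_audio", "reach_action"]}

# Hypothetical embedding of a new task, assumed to come from the same meta-encoder.
task_embedding = module_embeddings["cars_vision"] + 0.1 * rng.normal(size=8)

# Match the task against the toolbox: pick the most similar specialized module.
best = max(module_embeddings,
           key=lambda name: cosine(task_embedding, module_embeddings[name]))
print(best)   # "cars_vision": the closest tool in meta-representation space
```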
But you can also use the same meta-representation approach to represent the repertoire of actions you can make. So that's the connection. Yeah, potentially in the active inference model, like the slides you had with the different filters: the B matrix in active inference, or just the transition matrix, gives the policy-dependent transitions of the world, so that's kind of like a policy-dependent filter being applied. Yeah, yeah, that's a good connection. And then also on the thalamus, I'm not very familiar with mammalian neuroanatomy, but when you said that it was important for confidence, what did you mean? Yeah, so I think a lot of the interest in confidence and metacognition comes from the literature on blindsight, or blindsight-like phenomena. It seems that even when you retain the ability to discriminate the stimulus, without confidence you report that you didn't see it, right? So this kind of confidence seems to be explicitly present in the brain, for example in the thalamus. And when you deactivate those neurons, at least in monkeys, they seem to lose the confidence; in terms of confidence ratings, they seem to become like blindsight patients. In my introduction, I criticized the simplicity of confidence as a meta-representation, but at the same time, in the consciousness research literature it's also very important, right? Somehow, without confidence, you don't really report it. But I think we need further studies on how confidence, or the lack of it, changes the way information enters something like the global workspace, or some stage of consciousness. I think that's really relevant in the context of precision weighting and so on. Maybe confidence works as a way to get information into something like the global workspace. I don't know, maybe I talk about the global workspace too much, but I think it's a good way to think about it. Yeah, I agree. These are just the labels and the models that can then be applied. What that made me think of was, again, in active inference, if we're thinking about attention or precision or confidence, it's usually just one variable in the minimal case, just changing the variance on a statistical distribution. Latent spaces make it much richer. So then let's think about which sources we're paying attention to: which sources are we allowing to ignite the global workspace, or to have more of a percolating effect? It's like the cocktail party problem. One part of the problem is just distinguishing and interpreting the sounds, and that might be simpler, like left side louder or quieter, or moving, or something like that. But then you would have a higher-order evaluation space, like, this person you can trust on this topic and that topic but not this one, and vice versa over here. That's invoking more of the world model. And then that could control to what extent the different things those people said get taken up, after they've been parsed out. So the primary sensory mapping is simpler, because that's just a sensor-actuator type of problem, whereas the higher-order latent spaces could be very large. Yeah, I guess for a simple presence-versus-absence kind of judgment, this confidence may be critical for a stimulus to enter conscious awareness, whereas the meta-representation I talked about is probably more related to evaluating the quality of the contents of consciousness.
So if we want to make comparisons across different sensory experiences, we need this kind of meta-representation. I think that's a different aspect of metacognition. In the earlier days of consciousness research, people did a lot of present-versus-absent reports, and it was all about detection. But I think detection is very different from a qualitative assessment of experience. So there's a kind of transition in the focus of consciousness research, from present-versus-absent reports to more structured reports, like comparing whether two experiences are similar or not. I know some groups have been developing these methods to capture qualia structure. We want to understand why certain experiences feel the way they do, and those things seem very difficult to approach directly, but by looking at comparisons between different experiences, it seems possible to capture the overall structure of the experience. So maybe our meta-representation approach here is also more about capturing the structure of the quality of experience, rather than whether we consciously perceive something or just process it subliminally. That's a very interesting point. It's almost like the early global workspace and ignition work focused on the extent of the causal efficacy and the binding, but it was sort of left for later what the semantics or the contents were, because the question was, at least from how I've seen it, more about whether something percolated to awareness or not, or whether there was binding, or reportable versus unreportable stimuli. So it totally makes sense that, within the space of what's already aware, the question of how it gets to awareness is kind of already taken, and then it requires a much richer state space like you have here to even approach it, because it's just not an all-or-none question at that point. Yeah, I agree. I think that's a really interesting transition. Although I feel we haven't really solved this reportable-versus-unreportable question yet, because we still don't know where consciousness is really happening in the brain. But I think we need several different approaches to tackle the same problem. Cool. Well, I guess in closing, what are you going to continue working on, or what are you excited about? So we want to do empirical research based on this. Here we are proposing a new way to look at meta-representation, and one thing we are interested in doing is finding out whether such meta-representations exist in the brain. That's an open question, but we have some ideas about how to try it. And another thing, this is kind of beyond my ability, but there might be a way to do this meta move using mathematical tools. Some people talk about category theory and things like that, and it seems like what we are doing can be made more abstract and formalized, so that we can have better insight into what we are trying to do. What we did here was a kind of naive way to express our thinking about meta-representations. But we are essentially turning functions into objects: one function becomes a point in some function space. And this point in function space seems to be somehow related to qualia, that's my current intuition. I think in mathematics there must be useful tools to deal with this sort of idea.
So I'm curious whether there's any mathematics that can formulate this nicely. I think that would be an interesting thing to do next. Very interesting, and from the little I know about it and the way people have discussed it, that is sort of a category-theoretic move: to have a handle, like a point, for a mapping, so that you can have a space of the models. So you took a very constructivist, engineering, empirical approach, and then it's an exciting open question what the formal structures are that generalize that. And then you could spin up a million experiments just like the one that you probably built by hand, learn from that, and pick each one up as a point. Any last comments? Oh, I have just one announcement, which is this one. Sorry about the advertising, but we have a consciousness conference called ASSC; I think it's one of the main consciousness conferences, and it's happening in Tokyo this July. So if you're interested, please come. I'm one of the main organizers. That sounds awesome. Cool. Thank you very much for this presentation. Good luck with the work and the conference, and hope to talk to you again. Yeah, thank you. That was fun. See you. Thank you.