Welcome to the last lecture. I'm running at slightly reduced capacity today because of a cold, as you can probably hear. I heard that in the tutorial yesterday there were some questions, which we're going to address right now. But let me first set up the projector. Presentations, slides. Oh, let's try again. Is this one of the bulbs that's missing one of the color channels? It's probably one of them. But why would it happen in both projectors at the same time? So it's probably at the software level somewhere. Yes, I can try and use the HDMI. Most likely we have another DVI. Unless it's the cable. Let's try the HDMI. Is this one? One is for the monitor. Thank you. OK. So HDMI is working. Now, let's immediately deal with these questions that came up during the tutorial yesterday. The generative process versus the inference process. Yes. Which are the experimental data. Yes. With the HGF, what is the link? How do you couple them? Yes. What is the meaning of the two things? OK. Yeah. What's the meaning of the HGF? Yes. And so why three layers, whereas in the brain there are more than three layers? Also a good question. And then: what is an ideal Bayesian agent, or what does it mean? So we are not Bayesian, probably? You're not an ideal Bayesian in the sense that you don't know the generating process. A so-called ideal Bayesian observer knows what the generative process is, but does not know the parameters of that process. So the ideal Bayesian observer only has the first kinds of uncertainty, the first two kinds we went into. It has absolutely no environmental uncertainty, in the sense that it knows how the environment is structured. It knows how the environment generates its inputs. All it doesn't know is the parameter values of the generating process, but it can learn those by observing enough inputs. So that was the last question first. 
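A minimal sketch of this idea of an ideal Bayesian observer, one that knows the inputs are generated by a Bernoulli process but has to learn the parameter from observations, could look like this (the Beta-Bernoulli observer here is an illustrative choice, not a model shown in the lecture):

```python
import random

def ideal_bayesian_observer(inputs, a=1.0, b=1.0):
    """Beta-Bernoulli updating: the observer knows the generative
    process is Bernoulli(p), but is uncertain about p itself and
    learns it from the directly observable inputs u (0 or 1)."""
    beliefs = []
    for u in inputs:
        a += u        # running count of 1-outcomes
        b += 1 - u    # running count of 0-outcomes
        beliefs.append(a / (a + b))  # posterior mean estimate of p
    return beliefs

random.seed(0)
inputs = [1 if random.random() < 0.7 else 0 for _ in range(1000)]
print(ideal_bayesian_observer(inputs)[-1])  # converges towards the true p = 0.7
```

With enough inputs, the posterior mean homes in on the true parameter, which is exactly the sense in which the ideal observer "can learn that by observing enough inputs".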
The first question was the difference between the generative process, the generative model, and the inference process. So basically, if I just copy this onto the blackboard, we have u here, we have x here, or theta. Let's say x. So x is a part of theta, a subset of theta. Here we have lambda. Here we have A. We haven't gone into this; maybe today, if there's enough time, we'll look at action some more. That's just a subset. It's like you have set A, and you have set B, and B is a subset of A. Likewise here, to make the connection to what you saw in the tutorial: these mu's and pi's are a subset of lambda. So yes, these are the states of the environment. And also, basically, we take them to be constant. So they are, in some sense, part of this arrow. So this is the recognition process. But we don't infer them, and that may be a subtlety that is perhaps not easy to understand at first. So the omegas and the kappas that we had in the model, they are taken to be constant, and they are part of the makeup of our agent. So they are sort of in this arrow; we have the kappas and the omegas there. They're part of the way this agent processes the inputs from the environment. And the agent does not update its kappas and omegas. The u is just the objective input that the agent gets. So in the example of the association learning experiment, u is simply 0 or 1, depending on the outcome of the trial. In the example of the exchange rate, it is simply this day's exchange rate. So it is sort of an objective fact that the agent can find out. Whereas the volatility of the process generating the exchange rates is, in principle, not directly observable. So u is everything that's directly observable. So this here is the generative process. And this here is the inferential process, sometimes called recognition. This is what takes place inside the agent, and this is what takes place outside in the world. Yes? If you do not have any observational noise, then u is equal to x1. 
Then x1 becomes directly observable. And this was the case in our binary experiment there, because you could directly observe whether the outcome was a 1 or a 0. But there are examples where, instead of a house or a face that is easily recognizable, you show a blurred image. You can even show a blurred image of a house and a face superimposed. And then your u is something different. So if you look at, I think, the first HGF paper from 2011, we have a model for that too. Then u is on a continuum. We've sort of done a dimensional reduction from all the kinds of possible objects. You could see them as living in an n-dimensional space, where n is a pretty large number. We've reduced this to one dimension. And this is where houses are, and this is where faces are. And now if you present a stimulus that's blurry, a superimposition of a house and a face, then your input will lie here, or lie here, or lie here. So you will have to infer what x1 is. So u is then continuous. And x1 will still be binary: x1 will be either a house or a face. But you will have a probabilistic belief about whether this was a house or a face, because it was blurred. So there are many ways to model the relation between x1 and u. In some cases, there is absolutely no perceptual uncertainty; you can directly observe x1. But in general, even x1 is not directly observable. In many experiments, we set the experiment up in a way that x1 is directly observable, and that also simplifies the modeling. Yes? u has no precision because u is known. u is the raw input. Only beliefs about unknown quantities have precision. Pi hat u is the observation precision. And this is often but not always taken to be constant. So in the time series model I showed you, and also in the models in the paper, pi hat u, I think it is, is constant. However, if you take a model, let me get the right slide. This one, for example: here, instead of pi hat u, we now have a variance alpha. 
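The blurred house/face case can be sketched like this (the positions of the two categories on the one-dimensional continuum and the noise level are illustrative assumptions, not values from the lecture): given a continuous observation u, the agent forms a probabilistic belief about the binary state x1.

```python
import math

def posterior_face(u, mu_house=-1.0, mu_face=1.0, sigma=1.0, prior_face=0.5):
    """P(x1 = face | u) for a continuous observation u on the
    house-face continuum, assuming Gaussian observation noise.
    Houses sit at -1, faces at +1 (illustrative choices)."""
    def gauss(x, mu):
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2)
    lik_face = gauss(u, mu_face) * prior_face
    lik_house = gauss(u, mu_house) * (1 - prior_face)
    return lik_face / (lik_face + lik_house)

print(posterior_face(0.0))  # maximally ambiguous stimulus: belief is 0.5
print(posterior_face(0.9))  # input near the face end: belief well above 0.5
```

So x1 stays binary, but the belief about it is graded, exactly because u is continuous and noisy.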
So this would be the inverse of pi hat u. And it's now no longer constant. It follows its own generative process, which is not constant. So here you would have a variable pi hat u, but that's a more complicated model. So it's always a modeling choice. You make certain assumptions because your data only support inference on a model of a certain richness. And if you have data that support a model like this, then you can happily go and use a model like this. But in some situations, your data aren't rich enough to support using such a model. Yes? It is constant in that example. Yes, yeah. Yes, yes, we could go and do that. Yes. Now on this kind of diagram, we will have to look at the update equations. And actually, I'm afraid this is not quite detailed enough. But here you can see the precision-weighted prediction error again. If you take the precision weight on the prediction error, and here, again, this is even more easily visible: precision-weighted prediction error, and here, precision-weighted prediction error, and here, precision-weighted prediction error. Precision-weighted prediction errors everywhere. If you do the same exercise we did in the one example we looked at, and take this apart, look at what exactly is inside, you find these three types of uncertainty again. So you can repeat the exercise we did. Yes? On your diagram. Yes, wait a minute. I even have a slide that does this explicitly. So what you're going to see is the exact same model again. And then I have to, OK. Now these are so-called VAPEs, value prediction errors. So these are prediction errors of the simple kind, where you just have the difference between your prediction and the outcome. These, however, are volatility prediction errors, VOPEs: prediction errors about how quickly a quantity is evolving. And the VAPEs are driving these updates. So from the observation of u, you update your belief about x, and you update your belief about alpha. 
And these updates are driven by this simple kind of prediction error. But then from the update to your belief about x, you update your belief about x check, and this is driven by a volatility prediction error. And the same happens over here: from the update to your belief about alpha, you update your belief about alpha check, and this is driven by a volatility prediction error. So yes, exactly. But it's now a state that evolves. It's now variable, not constant. Yes, yeah. Yes, so exactly, yeah. [Audience question, largely inaudible: why design the model this way? When you compare it to different types of models, for instance reinforcement learning models, what is the advantage of this model over them?] Yes. So there are at least three possible answers, and I'm going to give you all three of them. The first is a simple practical one. You can compare the performance of a model like this to the performance of any other kind of model by looking at the model evidence. And you can just let this model run against any other model you like, a reinforcement learning model, a hidden Markov model, whatever. And then you can choose the model that has the best performance. Now, often it will be a model like this that has the best performance. It depends a bit on, again, how rich your data set is. If you only have 60 binary outcomes, then it's not unlikely that the reinforcement learning model performs better, because it's a simpler kind of model. Yeah, it's a model that... it does model uncertainty. But that's already shifting to the second answer. 
So for instance, we had one concrete case where we did a small experiment, exactly 60 binary outcomes. And we did this in a group of patients and in a group of healthy controls. The group of patients were adolescents with ADHD, attention deficit hyperactivity disorder. And then we compared the performance of these models. And what we found is that in the healthy controls, and if we threw the whole sample together, an HGF model performed best. However, if we just looked at the patients, the reinforcement learning model performed better. And that's a difficult situation in terms of how to proceed, because you're going to draw inferences, you're going to try to find out why the patients are performing differently from the healthy controls and what exactly the mechanism is that makes them different. And because these models are richer, there's a lot more you can say about that. That's going to be the third answer. So we argued that, since this kind of model performs better if you take the whole sample, we're going to base our explanation of what's going on on this kind of model. But this is the kind of trade-off and complicated situation you can get into when you simply compare different models according to their performance at explaining the data. Because it sounds like a triviality or some kind of superficial quip, but it's actually, in some sense, a deep insight: given a data set, sometimes the simplest explanation is not the truth, or put the other way around, the truth is not always the simplest explanation. So sometimes a simpler model will be better at explaining your data than the process that actually generated the data, because you just have too little data to capture the whole complexity of the generative process. Now, the second answer is that in principle, when doing inference on unknown quantities, you always have uncertainty that you should model, because it gives you a fuller picture of the kind of inference and learning taking place. 
Your learning rate fundamentally depends on the amount of uncertainty you have about the environment. Now, I want to be fair to reinforcement learning models, and I'm sure you can somehow put uncertainty in there, in a sense, by hand, by introducing some kind of uncertainty parameter or whatever. But generically, reinforcement learning models don't have a measure of uncertainty about the quantities that they're updating. It's just the value of a state. So you have these state-action values that you're updating; perhaps you learned about this in the reinforcement learning class. But usually there is no uncertainty attached to that value. So these models, I would argue, paint a more realistic picture of what is going on when inference is taking place. We're trying to reduce uncertainty about hidden states, and this is explicitly contained in these models, whereas in reinforcement learning models you just update the values of state and action contingencies. The third answer is that we want to get a realistic picture of what the brain is actually doing and what biological agents are doing. And we have lots of evidence, from looking at the neurobiology of how the brain updates its beliefs, that these uncertainty updates, which are contained organically in these models, play a very important role in the way the brain processes outer reality. And therefore, I at least believe that this kind of model is more appropriate for describing what the brain does than a model that simply updates states. Having said all this, reinforcement learning has been extremely successful, and these are extremely powerful models. So I don't want to discourage you from using reinforcement learning models. But this is a fundamentally different approach, and I do think it has some advantages in some situations. So I think we dealt with the first and the last question you had from the tutorial. 
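The contrast between a fixed learning rate and an uncertainty-dependent one can be sketched as follows (a Kalman-filter-style toy for a single continuous quantity, not the HGF equations themselves; the observation noise value is an illustrative assumption):

```python
def rl_update(value, outcome, alpha=0.1):
    """Rescorla-Wagner style: fixed learning rate, no uncertainty."""
    return value + alpha * (outcome - value)

def bayesian_update(mu, sigma2, outcome, obs_noise=1.0):
    """Kalman-style update: the learning rate IS the relative
    uncertainty, i.e. a precision weight on the prediction error."""
    lr = sigma2 / (sigma2 + obs_noise)   # high uncertainty -> high learning rate
    mu = mu + lr * (outcome - mu)        # precision-weighted prediction-error update
    sigma2 = sigma2 * (1 - lr)           # uncertainty shrinks as data come in
    return mu, sigma2, lr

mu, sigma2 = 0.0, 10.0
for outcome in [1.0, 1.0, 1.0]:
    mu, sigma2, lr = bayesian_update(mu, sigma2, outcome)
    print(round(lr, 3))  # learning rate decreases as uncertainty is reduced
```

The RL update applies the same alpha forever; the Bayesian update starts with a large learning rate while uncertain and automatically slows down as the belief sharpens, which is the point being made here.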
Then an additional question now. Could you remind me what the second question was from the tutorial? Yes. We use three layers because adding a fourth layer, in the kinds of experiments we've done up to now, didn't give us more power to explain the behavior we see. So this may have to do with the limitations of our experiments. But we just found that if we add a fourth layer, we don't get more model evidence, and that's why we stop adding layers. And a more complicated answer is this. You have these two different kinds of updates, the VAPE updates and the VOPE updates. And when people look at the hierarchical organization of the brain, where they get about eight layers, they're mostly worrying about these VAPE updates here. So here you could argue we have even fewer layers than we're modeling. But there is a complex interaction between uncertainty-driven updates and simply positive or negative prediction-error-driven updates, and this has to be disentangled. I mean, there are signals in the brain that jump all of these hierarchies. So if you look at, for instance, the dopamine system, just anatomically: the neurons that emit dopamine at their axon terminals originate in the midbrain, which is quite deep inside the middle of the brain. It's an old part of the brain, but they project to the whole frontal part of your brain. So it's a kind of broadcast signal that you get from dopaminergic activity in the midbrain, which informs everything across all levels of the hierarchy situated towards the front of your brain. Now, if you look at serotonin, which comes from even deeper inside the brain, that doesn't only go to the front of your brain, it also goes to the back of your brain. So basically, it broadcasts to the whole of your brain. So there are these signals that... let me draw a picture of the whole situation. You have the input here; draw it as u. And then you have x1. And now we're going to have a VAPE update here. 
And then in this model, we're going to have another VAPE update here to get to x2, another VAPE update, and so on, up to the eighth level. But then here we have a VAPE update that goes to all of these levels, which is, for instance, dopamine-dependent. DA is short for dopamine, and so on. So this is the situation as we imagine it. And what we're currently working on, the research program that we're pursuing, is to look inside these states. Because if you magnify this x1 here, you find neuronal subpopulations, I would say at least four. You have superficial pyramidal cells, you have deep pyramidal cells, you have inhibitory interneurons, and you have spiny stellate cells. And these four populations interact in quite complicated ways. So here we get input; then this comes up here, and then down here, and then here. And from here it goes to the next region down in the hierarchy, and from here it goes to the next region up in the hierarchy. And all of this is both VAPE- and VOPE-driven. But here, in this part of the hierarchy, it's mostly VAPE-driven, though there are also inputs that come from lower down that are VOPE-driven, and so on. So we're trying to disentangle this. And we're not there yet, so we have our work cut out for us. These are sort of our first baby steps at describing how the brain really works. And I haven't even shown you the slides yet from the actual studies where we looked at brain activity. So that is one of the things I quickly want to do at the end: actually show you how these prediction errors and updates are represented in actual brain signals, both from fMRI and from EEG. Those are the modalities we use to infer on that. So very briefly, I have a cartoon of part of what we're doing. So here, this is one of the models we have, with the hierarchy going here. 
So this is a social inference paradigm we applied this to, where we gave our subjects two kinds of information: objective information about the probability of an outcome, in the form of a pie chart, where the outcome would be either blue or green; and they also got advice from an advisor. So they saw a video of a guy holding up a green card or a blue card, and they had to learn how trustworthy this advisor was. But the advisor knew more than the pie chart told them. So if the advisor was trustworthy, you should listen to the advisor, even if he tells you something that surprises you given the pie chart. But in the course of the experiment, the trustworthiness of the advisor changes, and you have to learn that. So it's a much more complicated experiment than the ones I showed you before. People have to integrate lots of information from a non-social domain and from a social domain, and they have to combine the information they have from the social domain with what they have from the non-social domain. And of course, they differ in how they weight these two sources of information. And then we have all these quantities in the hierarchy. So we built a model for this. And we have all these quantities here. And we can locate, using fMRI, the regions in the brain where the particular updates take place. And here you can see the... I don't know if it's here. No, we didn't depict the hierarchy here; that's another slide, I'll show you that slide. You can see how the information passes up through the hierarchy here. So we have mu1, delta1 here, and mu2, delta2 here. Pi2 is here and influences the updates on mu2. Then we have the belief prediction error here, the current belief state about the outcome here. We have the simple cue prediction error, which is the prediction error about the outcome relative to the pie chart. We have this here. And down here, as I said, there are these VOPE-driven signals that come from deep in the brain and project everywhere. 
We have this pi3, the uncertainty, or rather the precision, at the third level. Here, you can see, it shows up down here in the brainstem, and then it shows up here in the cingulate cortex. And just one slide before, I think I showed you this before: this is the same experiment, and now you can see the temporal succession of these updates in the EEG signal. We need fMRI to locate stuff, because we have good spatial resolution in fMRI, but to have good temporal resolution, we need EEG. And this is what we have here. So this is the temporal succession of these updates. And the simpler they are, the earlier they take place. But as I said, the actual situation in the brain is still much more complicated; this is what we're working on. OK. So why three layers? Because the complexity of our experiments basically allows three layers, and that's why in most published papers you will see three layers. Next question. So in that case, where were we? The estimation of individual belief trajectories? Yeah. So in that case, I'll just start running you through some of the concrete studies we did. So a slightly more interesting kind of association learning was this here. This came out about one and a half years ago in Nature Communications; Archy de Berker did the experimental work here. And what we did here was we showed them one of two images of a rock, and they had to predict whether there would be a snake or no snake under this particular rock. And then they saw the outcome. And if there was a snake, they received a mild electric shock to their finger. Now, the thing is, this electric shock is outside their control. So the accuracy of their prediction has nothing to do with the electric shock. Even if they predict accurately that there will be a snake here, they will receive the shock simply because the snake is there. And if they predict inaccurately that there will be a snake, but then there isn't one, they do not receive a shock. 
So the shock is entirely independent of the correctness of their prediction. Anyway, they make their prediction, they see the outcome, and they learn which rock is more likely to have a snake under it. And then we switch these contingencies around between the two kinds of rock: sometimes one rock is more likely, sometimes the other rock is more likely to have a snake beneath it. And after every third trial, we ask them: how stressed do you feel right now? We also measure their skin conductance, which gives us a handle on how stressed they feel, and their pupil diameter. The diameter of your pupil is closely related to noradrenergic activity in your brainstem. Noradrenaline is a neurotransmitter closely related to the better-known adrenaline; it also originates in the brainstem and is broadcast throughout the brain, and this kind of neurotransmitter is often called a neuromodulator. These signal overall states that you're in, like stress. And if your noradrenaline increases, your pupil diameter increases. So measuring the pupil diameter gives us a handle on the state of their noradrenergic system. This is the model we applied; you see the trajectories at the different levels. This is the mu2. No, sorry, these are the uncertainty trajectories. So we took the pi's and inverted them. This is the precision of the prediction at the first level, the precision of the prediction at the second level, the precision of the prediction at the third level, inverted. So: the uncertainty of the prediction at the first, the second, and the third level. And then we took these model-based quantities, these trajectories that we inferred from the subjects' behavior, and fed them into a linear model, a simple linear model. And this linear model was a model for pupil diameter, for skin conductance, and for their subjective stress ratings. 
So we looked at the different kinds of uncertainty and how they influenced pupil diameter, skin conductance, and subjective stress level. The first thing we saw was that in fitting their behavior, again, we did model comparison. So we used a reinforcement learning model, the classic Rescorla-Wagner model, and then another reinforcement learning model, the Sutton K1 model, and the hierarchical Gaussian filter vastly outperformed them. We saw an interesting correlation between theta, the volatility evolution rate, and the perceived stress score. So people who think the volatility is itself volatile, who have a high volatility evolution rate, tend to have a higher stress score. And that makes sense: if you think you're in an environment whose volatility is constantly changing, that'll make you more stressed. So this, in blue, is the stress rating that the subject actually made. This is participant 42, and this is the irreducible uncertainty according to the model. And you can see a very good correspondence between these two measures. Of course, you could say we cherry-picked this participant, and if you look at the other participants it looks really bad, but this is not the case. If we take this irreducible uncertainty as a predictor in the subjective stress model that you saw on the slide before, then the parameter here is very valuable in predicting the subjective stress. And if we do model comparison on different versions of this linear model, inserting different quantities to predict the subjective stress score, the one that uses the irreducible uncertainty is the best performing one. And we have a sanity check here. This is simply the decision time in milliseconds. When the uncertainty about the presence of the snake is greatest, so when the probability between the two stones is at about 50:50, people take the longest to decide. And this also just makes sense. 
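The analysis just described, computing the irreducible uncertainty from the inferred second-level trajectory and using it as a predictor of stress in a simple linear model, can be sketched like this (made-up data, not the authors' code; the formula for the irreducible uncertainty is the Bernoulli variance discussed later in the lecture):

```python
import numpy as np

def irreducible_uncertainty(mu2):
    """sigma_hat_1 = mu_hat_1 * (1 - mu_hat_1), with mu_hat_1 = sigmoid(mu2).
    Maximal (0.25) when the predicted outcome probability is 50:50."""
    mu_hat1 = 1.0 / (1.0 + np.exp(-np.asarray(mu2)))
    return mu_hat1 * (1.0 - mu_hat1)

def fit_linear_model(stress, predictors):
    """Ordinary least squares: stress ~ intercept + predictors.
    Returns coefficients and residual sum of squares."""
    y = np.asarray(stress)
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

# Illustrative made-up data: stress ratings that track the uncertainty trajectory
rng = np.random.default_rng(0)
mu2 = rng.normal(0.0, 2.0, size=60)        # a stand-in for the inferred 2nd-level trajectory
sigma_hat1 = irreducible_uncertainty(mu2)
stress = 2.0 + 8.0 * sigma_hat1 + rng.normal(0.0, 0.1, size=60)
beta, rss = fit_linear_model(stress, [sigma_hat1])
print(beta)  # intercept near 2, uncertainty weight near 8
```

Comparing such linear models with different predictor sets (by residual error or an evidence approximation) is what is meant by finding that the irreducible-uncertainty model performs best.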
So this is just to show that the experimental manipulation we tried actually worked. And these are the regression coefficients in the winning linear model, for previous rating, shocks, and irreducible uncertainty. This basically shows you which weight these quantities have in determining the subjective stress level. So that's that. And this is an interesting thing. Another thing we found is that performance, which you have on the vertical axis, is correlated with the subjective stress uncertainty coefficient. So if uncertainty affects your stress level, that means you will perform better. And that tells you something about the adaptive function that stress has. We are evolved animals, and for some reason we developed and kept the capacity to feel stressed. So it stands to reason that this capacity is there for a reason. And here we have a hint of what that reason is: namely, if our subjective stress level is informed by the uncertainty of the environment, this leads us to perform better. So people whose stress level reflects the uncertainty of the environment will be better at this prediction task. And people whose stress level is determined by other stuff, more than by the uncertainty of the environment, will perform worse. And there is a similar relation for the pupil. So if your pupil diameter reflects the uncertainty of the environment, then you're actually going to perform worse, and this association is very strong. Yes? [Audience question, largely inaudible, about how the stress level was measured.] Yes, so the stress level. You mean the level of stress? Yes, so every third trial, we ask them how stressed they feel right now. So we have a trajectory of stress. You saw this here: the blue line here is this participant, participant 42. This is how this participant's stress level changed during the experiment. 
And in participants where this subjective stress level mirrored the uncertainty of the environment, they were the ones who were performing well. Yes? It is simply pi hat 1. So sigma hat 1 is one over pi hat 1. And this is the variance of the Bernoulli distribution: it is mu hat 1 times one minus mu hat 1, and mu hat 1 is the sigmoid of mu2. So if we put a time index k on each of these, then this is the sigmoid of mu2 at k minus 1, which gives you the prediction, times one minus the sigmoid of mu2 at k minus 1. So you take your mu2 trajectory, you fill this in, you take the sigmoid, and you get sigma hat 1. That's the irreducible uncertainty. So it's basically just the variance of the Bernoulli distribution over the outcome. And you see that this will be maximal for a mu hat 1 of 0.5, in the middle. So in situations where you cannot predict under which stone the snake will be, you have maximal irreducible uncertainty, because if objectively the snake is as likely to be under one stone as under the other, no amount of learning will allow you to reduce your uncertainty. So with this study, Archy was actually interviewed on the radio and everything, because somehow journalists liked this idea that your stress level is caused by your uncertainty, and that the better this link worked, the better your performance would be. So people who were interested in performance in sports and so on also interviewed him; he was on the radio all the time when this came out. Now here, a study together with Archy, with Sven Bestmann, and with Louise Marshall. These people are all in London, where I worked before. So we had an interesting paradigm which is very simple but provides for an interesting kind of learning. There were four stimuli, these four abstract things. It's always a circle, and the circle is filled in some particular way. And subjects saw a succession of these four possible stimuli on their screen. And they had four buttons. 
And for each of the stimuli, they had to press a different button. Very, very simple: four possible stimuli on your screen. Stimulus one appears, you press button one; stimulus two appears, you press button two; stimulus three appears, you press button three; and so on. What we do is record their reaction time: how many milliseconds does it take them to press the button? And what we can learn from that is how surprised they are about the stimulus appearing. With a more surprising stimulus, they will take longer to press the button. And we can manipulate their surprise about the stimulus by introducing a certain regularity into the sequence of these stimuli. So if you have four stimuli, there are 16 possible transitions from one stimulus to another. And you can write this in the form of such a transition matrix here. So this is if stimulus one is followed by stimulus one; this is if stimulus two is followed by stimulus two, and so on. This is if stimulus three is followed by stimulus two; this is if stimulus two is followed by stimulus three, and so on. And now we have different transition probability matrices that we put in here. The way to read these matrices is always in the sense of from, to: from one to three, from three to four, and what is the probability of that transition? So if you sum up the columns of this matrix, you will always get one, because from a particular stimulus you have to go to some stimulus. So this is a so-called first-order sequence. After stimulus two you're most likely to go to stimulus one, after stimulus three you're most likely to go to stimulus two, after stimulus four you're most likely to go to stimulus three, and so on. But only probabilistically; it's only 85%. You've got a 5% probability of going to any of the other ones. Then we have alternating sequences, where if you're in stimulus four you're likely to go to stimulus three. 
If you're in stimulus three, you're likely to go to stimulus four. So you're likely to alternate between these two. But it's also probabilistic; there's a 5% probability of going to any of the other ones on each trial. And once you're in stimulus one or two, there's an equal probability of any of the other stimuli following. And then we have a so-called zeroth-order sequence, where each stimulus has a certain constant probability not depending on which stimulus came before. That's why this is zeroth-order. So the stimulus you had before tells you nothing about the probability of what the next stimulus will be, but each stimulus has a particular probability, and some are more probable than others. So we did this for 1,200 trials. And another manipulation we did was that we gave them drugs. So we had four conditions. PL is the placebo condition. Then we had a noradrenaline antagonist, which depresses the activity of the noradrenergic system in their brain. We had a cholinergic antagonist, which depresses the activity of acetylcholine, also a neuromodulator that is important in learning. And then we had a dopaminergic antagonist, which depresses the dopamine in their brains. So we had four different conditions: placebo, NA antagonist, cholinergic antagonist, DA antagonist. And then we looked at how manipulating the levels of these neuromodulators affected the way they learned. The model: again a similar, almost familiar three-level model, where we have the volatility trajectory here, the mu3 trajectory, and the precision-weighted contingency prediction error. We call this epsilon 3. This is basically just the precision-weighted prediction error at the third level, the prediction error that drives the third-level update; it's a prediction error about the second level, as we saw in how these hierarchical updates take place. And delta 1 is simply the sensory prediction error. This is simply the difference. 
Delta one is the difference between the outcome, the observation u, and the prediction μ̂₁. Because you're learning about a transition matrix, on each trial you have 16 prediction errors. You're doing 16 updates: on each element of this transition matrix you're doing an update, because you're not doing this consciously, but you are actually learning this. So here we have all 16 trajectories indicating the learning about these elements of the transition matrix. On the left, you have a subject in the placebo group. And you can see the blue line follows the ground truth, which is the black line, quite faithfully. And if you take into account that these are 16 updates taking place simultaneously, it's actually quite astounding. I didn't think before we did this that it would work, but it actually works. They are actually capable of tracking, and you are also capable of doing that: we are capable of tracking these 16 elements simultaneously and quite well. And you also have to consider that not each observation gives you new information about all transitions. You're observing only one transition, but on the basis of that, you're updating your beliefs. And it's astounding how well these trajectories here follow the ground truth. And this is not just an artifact of the model; it can come out differently. If a participant actually doesn't learn the contingencies, this will show up in the trajectories you see. And you see this in participant 16 from the acetylcholine antagonist group. Look for instance here: the contingency changes massively, and he doesn't learn a thing. His belief just stays right down here. Whereas if you compare this, for instance, to this here in the placebo group: even if the contingency changes for only a brief period, this guy learns. And whenever something changes, you can see the lag in the adjustment of the belief.
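As a rough sketch of the setup just described (not the actual hierarchical model used in the study; the function names and the fixed learning rate are illustrative stand-ins), the 85%/5% first-order transition matrix and a simplified per-trial belief update could look like this, using the from-column convention in which columns sum to one:

```python
import numpy as np

# Illustrative first-order transition matrix for 4 stimuli, entry (to, from):
# after stimulus 2 you most likely get 1, after 3 -> 2, after 4 -> 3, after 1 -> 4,
# each with probability 0.85, and 0.05 for every other successor.
T = np.full((4, 4), 0.05)
for frm, to in [(1, 0), (2, 1), (3, 2), (0, 3)]:   # 0-based stimulus indices
    T[to, frm] = 0.85
assert np.allclose(T.sum(axis=0), 1.0)             # columns sum to one

def update_beliefs(mu, prev, curr, alpha=0.1):
    """One simplified trial update of the 4x4 belief matrix mu.

    delta1 = u - prediction is the sensory prediction error, with u = 1 for
    the transition that occurred and 0 for the others.  A fixed learning
    rate alpha stands in for the precision weights that the hierarchical
    model adapts trial by trial.
    """
    u = np.zeros(4)
    u[curr] = 1.0
    delta1 = u - mu[:, prev]           # sensory prediction errors
    mu = mu.copy()
    mu[:, prev] += alpha * delta1      # move beliefs toward the outcome
    return mu

mu = np.full((4, 4), 0.25)             # uniform initial beliefs
mu = update_beliefs(mu, prev=2, curr=1)
assert np.isclose(mu[:, 2].sum(), 1.0) # the updated column is still a distribution
assert mu[1, 2] > 0.25                 # the observed transition is strengthened
```

Note that this sketch only updates the column of the previously seen stimulus; the lecture's point that one observation informs beliefs about the whole matrix comes from the hierarchical coupling in the full model, which is omitted here.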
So this is what happens when your brain is acetylcholine-depleted. This is how your learning is affected if your brain doesn't have enough acetylcholine. Yes? Yes, so these are the 1,200 trials here. 1,200 trials, yes. Each participant does this. Yes, the black lines indicate how the transition matrices change. So we switch them in and out; I neglected to mention that, perhaps. We change the transition matrix as the experiment goes along. And whenever the black line jumps, the transition matrix jumps, and people have to learn anew what the new contingencies are. And they're astoundingly good at it. So the black lines are the objective contingencies in the transition matrix. This is the four-by-four transition matrix, and this is also the four-by-four trajectory of transition matrices that you see here. And the blue is the inferred belief about the contingency in this particular subject. Every participant, yes, yes. One that has not been drugged, yes. Placebo means sugar pill. We didn't inform them about what they received; everybody got a pill, and for some people the pill contained nothing. There are four groups. So I'm showing you one participant from the placebo group and one participant from the acetylcholine-depleted group, and there are other groups. The effect was strongest for acetylcholine, but we also had noradrenaline-depleted and dopamine-depleted groups. So this is how the parameter estimates affect the outcomes. I think the most interesting thing here is, again, the shooting up of the learning rate when there's a change in transition matrix. It's most pronounced in a zeroth-order to zeroth-order transition. You can also see it in the zeroth-order to first-order transition. If you go from zeroth-order to alternating, it's less pronounced. But you can also see this adaptive learning rate supporting the learning here. Okay, just for reasons of time, you can read the paper and download it.
It's open access, if you want to understand the other graphs. I'll just quickly continue so that we can do the rest. So, this hasn't been published yet, but there's a preprint you can go and download. This is what I already briefly described to you. Subjects received two cues: a pie chart of the likely outcome. You can barely see it here. This is the outcome, where the white dot falls somewhere on the pie chart. And, of course, it's more probable to fall on the blue part than on the green part. And then there's also this guy giving you advice. So here the pie chart tells you that blue is going to be more probable, and he's also giving you the advice to go for blue. So you would probably go for blue in this trial. Unless you knew that he was trying to deceive you; then it would be more likely that blue was not going to be the outcome if he said blue was going to be the outcome. Because he gets advance information on the outcome. His advance information on the outcome isn't entirely reliable either, but he knows more than the pie chart tells you. So if you are good at estimating his reliability at the particular point in time you are at, and you integrate what he says with the pie chart, then you're going to perform best. So then you've got six seconds to decide. You make your decision about what you want to bet on. And here you are rewarded for correct decisions: your capital, which is indicated here, grows. And this is actually a reward you get at the end of the experiment. You get a certain amount of money, and the better you perform, the more money you get. So for each correct prediction you get a certain amount of money. The problem is, he also gets money. But the twist is this: if you end below a certain range of points, he gets nothing. Let's say you can make a hundred points, and if you have less than 50, he gets nothing. Now, if you have between 50 and 70 points, he gets 20 euros.
If you have more than 70 points, however, he gets only 10 euros. So his goal will be to get you to a score between 50 and 70, but he has to get you at least up to 50. And this induces a pattern: he will help you at the start, because he wants to make sure that you reach at least 50, but then once you get close to 50 he's going to start being less helpful, because he doesn't want you to go above 70. And you have to learn that. You don't know that. You just know that there will be phases in the experiment where he will be more helpful and phases where he will be less helpful, and you have to pay attention to that. That's all we tell you. But this is his incentive structure, and it leads to him behaving in a particular way, where he starts out helpful and then stops being helpful. And you have to learn about this. So there's the whole hierarchy that we look at. We have different kinds of prediction errors. There's a cue-related prediction error, a prediction error related simply to the pie chart. So the pie chart gives you a certain probability, 60 to 40 or 55 to 45, something like that, and there's a prediction error related to that when you see the outcome. Then this guy gives you advice, and the advice is either accurate or inaccurate. He says blue and the outcome is blue, then it was accurate; or he says blue and the outcome is green, then it was inaccurate. That gives you a prediction error, because you have a prediction about how reliable his advice is. Third is the outcome prediction error. This is after integrating these two, the advice and the cue: you weight these two sources of information and you come up with a probability for each outcome. There's a probability for blue and a probability for green, and this is your outcome prediction error. So your brain has to deal with at least all these three prediction errors. Then you have the precision of the belief about the reliability of the advice.
You have a belief about the volatility of the reliability of the advice, and you have the precision of the belief about the volatility of the reliability of the advice. So if you think about it a little, it'll become clear. And then, I've shown you this many times: the more complicated these quantities are (and the last one I mentioned to you is the most complicated one), the longer your brain takes to process them. And you're at around 500 milliseconds when you see the signature of this update in the EEG signal, and the more primitive ones are down here. So the cue-related prediction error, the one just related to the pie chart, you already have that at 134 milliseconds. Okay, we went through this. We did the same experiment again, this time using MRI, not localizing our updates in time but localizing our updates in the space of the brain. And this is where we see the signatures spatially. Okay, and then I have two other studies that came out in the past half year, and I will do that after the quick break we're going to have. So back here in 10 minutes. And you couldn't recognize them as easily as if there were no noise. Now this, as opposed to the first experiment I talked about, where people had to make a prediction whether they would see a house or a face, worked somewhat differently. They first heard a high or a low tone, and then immediately they saw the house or the face, and then we recorded their reaction time when they told us: oh, this was a house; oh, this was a face. So first you have the cue, then you have the outcome, and then you have the response by the subject. The first time we talked about high and low tones and houses and faces, we had first the cue, then the prediction, then the outcome. Now we have the cue, the outcome, and then the response. And we infer what their belief state was, again, from the response time. And we also have a slightly more interesting schedule here. So we don't just switch contingencies always after about 20 trials.
We switch contingencies first quite regularly, and then we have what we call a stable phase, where the contingencies don't change for quite a long time, and then a volatile phase. And we were especially interested in what happens to the learning rate when we go from this stable phase to the volatile phase here. And we had more trials, around 450. So you can see that with a few years you get more sophisticated in the way you design your experiments. Now, at the purely behavioral level, no modeling involved here, we see an interesting pattern. We have classed our outcomes into three categories: expected, neutral, unexpected. And this involves, as I said, no modeling. If the probability of a house is low and yet a house appears, that is classified as unexpected. If the probability of a house is low and a face appears, then this is classified as expected. The neutral trials are the ones here where both outcomes are equally likely: here, here, here, here, and here. So in this very simple classification of the expectedness of outcomes, we see that the reaction time is affected by the expectedness of the outcome. And it is differentially affected in people who are on the autism spectrum and people who are so-called neurotypicals, that is, people without a diagnosis on the autism spectrum. You see generally that people without a diagnosis are quicker. And what you also see is that the expectedness affects their reaction time more than that of the ASD subjects: the slope of this curve is stronger than the slope of this curve. And if we just look at the difference between the unexpected and the expected reaction times, then this difference is larger in neurotypicals than in ASD subjects. And the difference between the differences is statistically significant. Does this happen to everybody? Probably not. It doesn't happen to everybody, okay. What a relief. So again, our model. And again, we had a linear model for the response.
And this time we modeled log reaction times linearly. The reason we modeled log reaction times is that reaction times are distributed a bit like this. Of course, nobody has a reaction time of zero; it's just physiologically impossible. And then they go like this. Now, this is not Gaussian. And if we model this with a Gaussian, then we have the absurd situation that part of our probability mass will be below zero, which is impossible for a reaction time. However, if we logarithmically transform our reaction times, we get a nice, approximately Gaussian distribution: log RT. And of course now we have the whole real line, going down to minus infinity, with a more or less nice Gaussian distribution on it. That's why we construct a linear model for our log reaction time. And the predictors in our linear model are: an intercept, which is just this person's baseline reaction time, not modulated by anything, the mean log reaction time this person will always have. Then the reaction time will be affected by outcome surprise. It will be affected by outcome uncertainty. It will be affected by probability uncertainty. It will be affected by phasic volatility. It will be affected by decision noise. It will be affected by omega two. It will be affected by omega three. And here we have the weights of these coefficients in the linear model. And you can see that beta zero has a significant influence on the log reaction time. Beta one is big, but the variance is so large that it's not significant. Beta two is small, but very consistent, so it is significant. Beta three has no influence. Beta four has a significant influence. Zeta, which is simply the decision noise, is consistent with zero and very variable across subjects. Then omega two has no influence, or very little influence, at least not significant, and omega three has a significant influence.
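A minimal sketch of why the log transform matters, on synthetic data (the predictor names and coefficient values here are invented stand-ins, not the study's):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Hypothetical design matrix: intercept plus two of the lecture's predictors
# (outcome surprise, outcome uncertainty), here just random stand-ins.
X = np.column_stack([np.ones(n), rng.standard_normal(n), rng.standard_normal(n)])
beta_true = np.array([np.log(0.5), 0.05, 0.02])    # baseline RT of 0.5 s
log_rt = X @ beta_true + 0.1 * rng.standard_normal(n)

# Ordinary least squares on log-RT.  Exponentiating the fitted values
# always yields positive reaction times, which a Gaussian model of raw
# RTs cannot guarantee (part of its probability mass would sit below zero).
beta_hat, *_ = np.linalg.lstsq(X, log_rt, rcond=None)
rt_pred = np.exp(X @ beta_hat)
assert (rt_pred > 0).all()
```

The Gaussian assumption lives entirely on the log scale; back on the original scale, the implied RT distribution is log-normal and strictly positive.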
So that's what we learn from how the state of mind of the subject affects the reaction time. And what we see here, and this is basically the most interesting result, is the change in learning rate. The learning rate is here called alpha; at the second level, this is alpha two. And delta alpha two is the difference in learning rates between the stable phase and the volatile phase at the end of the experiment. So this is the difference in learning rates between stable and volatile at the second level, and this is the difference in learning rates at the third level. And again, blue is the ASD subjects and yellow is the neurotypicals. And you can see that both at the second and at the third level, both the ASD subjects and the neurotypicals increase their learning rates as they go from a stable to a volatile environment. But for the neurotypicals, the increase is more pronounced at the second level, and for the ASD subjects, the increase is more pronounced at the third level. And this means that the ASD subjects look at this changing contingency as something that is situated more at the third level, so they increase their learning rate at the third level, as opposed to the neurotypicals, who think it is more situated at the second level, which is objectively more appropriate. So the neurotypicals just say: okay, well, the contingencies have now switched, so I have to relearn the contingencies, and all I need to do is adjust my belief about the contingencies between the high and low tones and the faces and houses. So they don't get stressed. Whereas the ASD subjects increase their learning rate at the third level more than the neurotypicals, and at the second level less than the neurotypicals. So they think: oh, the environment has become more volatile, it's become sort of uncontrollable. But that's not true, so they could actually just go and learn.
And this is an indication of why people with ASD tend to seek out very stable, very predictable environments: they often get stressed by changes in the environment. This is reflected also at the level of pupil diameter. So we again measured their pupil diameter, and this is the influence of the precision-weighted prediction error at the third level, the volatility level, on pupil diameter. For the neurotypicals, volatility prediction errors basically don't affect their pupil diameter, which, as I said, is closely connected to the noradrenergic system. So their noradrenaline level doesn't shoot up when they have a volatility prediction error. This is different for the ASD subjects. They worry about these volatility prediction errors, and their pupil diameter increases. And this is within-trial time. So from the moment of the outcome, which is set to zero, you can see this increase in the influence of the volatility prediction error on the pupil diameter. Okay, and now in some ways my favorite study, where we actually managed to get people to hallucinate. So first of all, what subjects did we do this in? We have four groups of subjects. We had people who were P-minus, meaning they are not psychiatric patients, no psychiatric diagnosis, and H-minus, meaning not hallucinators. These are people who do not have hallucinations in their daily lives. And we were interested in auditory hallucinations: people who hear voices, sounds, mostly voices. Then we have people who were P-plus, psychiatric patients, and voice hearers. You often get this in people with schizophrenia. They hear voices; about 70% of people with a diagnosis of schizophrenia hear voices, at least when they're unmedicated. And then we also had patients, P-plus, who didn't hear voices. That's the other 30%, who have a diagnosis of psychosis at least, or even schizophrenia, but who do not hear voices despite being psychiatric patients.
And then here, and this is perhaps our most interesting group, we have people who are not psychiatric patients, not seeking treatment, who are doing fine, at least according to themselves, but who hear voices. Now, these people tend to see this as a special ability. They tend to work as mediums: they listen to their voices, and the voices can give you advice. Yes, those were psychiatric patients. We looked at people with psychosis and schizophrenia; there are many other psychiatric diagnoses, of course. So a whole conversation would be a bit much, but there are people who hear conversations. It is mostly sort of phrases that people hear, mostly deprecating, negative phrases about yourself. So you hear a phrase like: he looks like an idiot. Or: look at him, he doesn't have a clue, or something like that. Some people sometimes also hear conversations, so they hear two people speaking about them. And then there are also people, or the same people, who sometimes hear voices that address them directly. So they say: don't go there, or this person is dangerous, or something like that. Or directly: you are an idiot, you look like a fool, or something like that. It's often negative. And yes, those are our four groups of people. Now, what we did is we put them in the scanner, because we wanted to have fMRI scans of them. And at the same time we ran this paradigm, where we presented them with a screen that was usually black but sometimes turned to a checkerboard. And at the same time the screen became a checkerboard, they heard a sound. And before they went into the scanner, in a previous session, we determined their detection threshold for the stimuli, so that we could present the sounds, the auditory stimuli, at a volume where they were 75% likely to hear them, or where they were 50% likely to hear them, or where they were 25% likely to hear them. So we determined that in advance.
And at the start of the actual experiment, most of the tones, the dark part of the bar here refers to this level of volume, were presented at a level where they were 75% probable to be detected. So subjects learned this association between the checkerboard and the tone. But already at the start, sometimes we presented the tone only at the 50% level or at the 25% level, or, interestingly, we didn't present a tone at all. So there was just the checkerboard, like this, but no tone. These are blocks of 30 trials here; each bar is a block of 30 trials, and these are the probabilities inside this block. As the experiment went on over 12 blocks, we increased the probability of these trials where there was no tone but the visual cue was there. And there were still trials where a tone was presented, but the probability of the tone being very soft increased. And at the end, the tones at the 75% detection threshold were quite rare. And after each presentation of this checkerboard, subjects were asked to say whether they heard a tone or not. In trials where there actually was no tone, but they reported hearing a tone, we classified this as a hallucination. And I'll tell you why we have reason to say so. Yes, yes, yes, it's an excellent question, absolutely excellent question; I'll return to it. Do we have any independent measure of whether the person actually heard something or not? Yeah, we do, I'll show you. But for our present purposes, we record whether the subject says they heard something or they didn't. Now, as always, some sanity checks. You always want to know, when you compare different groups, that these groups were actually quite similar in all respects other than the ones you're interested in, so that you know you don't have any confounds.
You want to know that the difference you're showing really has to do with the kind of thing you're interested in and is not explainable by other factors. So we wanted to know: did their detection thresholds differ? Are these just people with much keener hearing than others? And you can see that the detection thresholds were similar across groups. Then here, the probability of answering yes during no-tone trials. You can already see that the two hallucinator groups, this is the patient hallucinator group and this is the non-patient hallucinator group, had a much higher probability of answering yes during no-tone trials. So they had many more hallucination trials, which is not surprising, because they are hallucinators. Whereas the proportion of these trials was quite low in the non-hallucinators, whether they were patients or not. And if we throw the two hallucinator groups together and the two non-hallucinator groups together, we have a very large difference between them in the probability of answering yes during no-tone trials. And then we see that if we also look at the other conditions, with the 25%, 50%, and 75% detection likelihood, this difference is abolished. So on 75% detection likelihood trials, all of them were equally likely to answer yes. Yes, yes, absolutely, you're right. There are these guys who are basically always answering yes, and these guys here, and then there are other guys whose behavior looks similar to that in these groups. So yes, you're absolutely right. And actually, one of the criticisms that many people have about psychiatric diagnostics is that it is built on very superficial observations, because the main goal since the early 80s has been to get a high consistency between two raters. So if you take one psychiatrist and you take another psychiatrist and you show them the same patient, you want them to come up with the same diagnosis.
And the only way to achieve that at present is to use very superficial criteria that can be objectively ascertained, so that two people get to the same diagnosis. And one of the goals of this kind of research is to come up with ways to diagnose that reach deeper and don't just stay at the surface. And finding distinctions within patient groups, where you have people down here, people here in the middle, and people up here, who may have a different underlying condition, gets us further and would be an improvement over just lumping them together in the same category. So yeah, that's a good observation. Good, so how did we model this? We created a model and we said, again, we have some decision noise. It's a very simple assumption: we have a certain belief that a tone was present, and this belief translates into a yes answer with some decision noise. That's why we apply a sigmoid to the belief, as we always do. Now, the more interesting parts: the belief is formed as a combination of the new input and the prior. So it's, again, a precision-weighted prediction error. And if you remember back a few lectures, we had this precision-weighted prediction error updating in general in exponential families, in the form of one over one plus nu times the prediction error. So we have: prior plus precision-weighted prediction error gives us our belief. Now, the nu here is, of course, an interesting parameter; we'll get to that in a second. But first, the input: that was given by the experimental design. The input was a stimulus, and the stimulus was at the 25, 50, or 75% detection threshold. The prior is the prior from learning. This is the learned association between the checkerboard and the tone. So again, our friend μ̂₁, like here. And nu is a subject-specific parameter here, indicating the relative weight of the prior compared to the input. We had this a few lectures back.
So if nu equals one, prior and input have equal weight; for nu greater than one, the prior has more weight; and for nu less than one, the input has more weight than the prior. And in this model we were able to estimate nu for each subject. This is the model. There are some additional twists here. We have our usual three levels, but then here we have a translation from x one to u that involves nu. And then we have some decision noise, beta. In other studies we called this zeta, but now it's beta. And this weighting between prior and input is at this level, translating from x one to u. And what you see is several interesting things. First, again, you cannot explain differences between the groups by saying one group has more decision noise than the other. They're pretty equal in the amount of decision noise they have. Now, if you look at the estimate of nu, the two hallucinating groups have a much higher nu. And that means their prior belief that there is this association between the checkerboard and the tone dominates what they're doing. You see nu is above one here, so the prior has even more weight than the input. They've learned this association and they insist on this association being there. Whereas for the non-hallucinators, the input has more weight; nu is much lower. Here, we have the phasic volatility estimate, and we have a dichotomy between non-patients and patients. The phasic volatility estimate increases for the non-patients and increases much less, basically stays the same, for the patients. So there we can distinguish patients and non-patients. At the second level, the distinction goes in a different direction: two other groups are associated. Here we have the hallucinators together and the non-hallucinators together. And these two quantities are similar things, because there's only the sigmoid transformation between them, so they contain the same amount of information.
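A toy numerical sketch of this prior-input weighting (the function names and the decision-noise value are made up for illustration; the 1/(1 + ν) weight is the form quoted in the lecture):

```python
import numpy as np

def belief_tone(prior, sensory, nu):
    """Belief that a tone was present: prior plus precision-weighted
    prediction error, with weight 1/(1 + nu).  nu = 1 weights prior and
    input equally; nu > 1 lets the prior dominate; nu < 1 the input."""
    return prior + (sensory - prior) / (1.0 + nu)

def p_yes(belief, beta=5.0):
    """Sigmoid response model: decision noise beta maps belief to P('yes')."""
    return 1.0 / (1.0 + np.exp(-beta * (belief - 0.5)))

# No-tone trial (sensory evidence near 0) after learning a strong
# checkerboard-tone association (prior 0.8):
b_high_nu = belief_tone(0.8, 0.0, nu=3.0)   # prior-dominated, like the hallucinators
b_low_nu = belief_tone(0.8, 0.0, nu=0.3)    # input-dominated, like the non-hallucinators
assert b_high_nu > 0.5 > b_low_nu                # only the prior-dominated agent
assert p_yes(b_high_nu) > 0.5 > p_yes(b_low_nu)  # tends to report a tone
```

With ν = 1 the belief is exactly the average of prior and input, matching the equal-weight case above; as ν grows, the no-tone evidence barely moves the belief away from the learned association.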
So we have the hallucinating groups, who learn less, and the non-hallucinating groups, who learn most, about the change in this contingency, because we're taking away the association between the checkerboard and the tone. At first it is appropriate to believe in this association, but then, as the experiment progresses and we take it away, you should actually learn that it's gone. And now, what I find really interesting, especially in the light of theories we have about schizophrenia and how it works, is that those who learn best are also patients: the non-hallucinating patients. And those who learned least are the hallucinating patients. And this is an indication that this refusal to learn could be a reflection of too much input at the sensory level. This is something I didn't mention because I wanted to save it up to now. We're over time and I'll finish very soon. But you remember this hollow mask illusion I showed you. It doesn't work with most schizophrenic patients. It just doesn't work: they actually see the mask from the inside. And that is because their top-down priors are not overwhelming the sensory input the way they are doing in us. And imagine you are so very attached, so glued, to the sensory input you get, and you process it very accurately: that's extremely exhausting. We tend to filter out anything that's not important to us at the moment and let our priors take over. And you see that in this face perception, but also in auditory perception: being in a room full of people is very stressful for patients with schizophrenia, because they hear every sound in the room much the way you would hear them on a tape recorder. So if there's a party and you're having a conversation with people around you, you're very well able to follow what they're saying. You may have to concentrate a little, but you're able to filter out the rest and hear their voices.
And if you record the same conversation on tape, or nowadays on some digital device, then you're barely able to hear anybody speaking, because all the background noise is there at the same level as the voices of the people speaking. And the way your tape recorder hears is how people with schizophrenia hear in their daily life. That's very, very stressful. A similar thing, we think, is going on here. They're very good learners: the ones who don't hallucinate learn very well that the association between the checkerboard and the tone has now gone away. And developing hallucinations is a kind of way of going to the other extreme. To save yourself from all this input that's coming at you, you go to the other extreme: you rely on your priors, like this here. And you start perceiving your priors as actual things happening in the outside world, as when you hear voices that are clearly generated by you yourself, internally. Now, just two more slides. This is the actual data and how well the model is able to reproduce it. For all groups and for all conditions, the model is very accurate at reproducing what is going on. So we have a good model of what's going on. And then, to return to your question, we also did the imaging, and what we see is a strong difference in activation. This is yes greater than no responses: where is the activation in the brain greater on yes responses as opposed to no responses during no-tone trials? So we take all the no-tone trials, and within the no-tone trials we take the trials where they answered yes and the trials where they answered no, and we compare the activation. And on trials where they answered yes, the regions that are much more active are here; you can see them here. This is the brain cut apart horizontally: these are horizontal slices of the brain, starting low here and ending high here, and progressing like this. And now, this is not our data. This is separate data.
There, people were put into the scanner and asked to press a button whenever they heard voices. So these were patients who were hearing voices, and whenever they heard voices as they lay in the scanner, they had to press a button. And these are the exact same regions that are activated there as those activated in our study. So we have this indication, directly from the brain, that during trials where they answered yes, even though no tone was there, they were actually hearing something, just like these hallucinating subjects here. So it's not just misreporting of what happened. Yes, why are there other parts activated? So yes, well, exactly. Here you have auditory cortex activations, and in ours, this is the auditory cortex here. But when the brain does something, it's always a network of regions that is active. It's not just one region doing one thing; these regions talk to each other, and in everything the brain does, several regions are involved. Okay, so yes. Some things we didn't have time for. Just as a last thing, I want to say: if you want to do analyses like this, or if you want to get the software behind all these studies, it's publicly available here at this address. The HGF toolbox is part of the TAPAS suite of software. It's called TAPAS because it's a very appetizing collection of totally unrelated software, and the HGF toolbox is one of them. It's basically all of us who did our PhDs at the same place: we started building our toolboxes and then we combined them into one thing, but because we did our PhDs on different stuff, it's not always very related. There's a read-me, a manual, and an interactive demo. It's MATLAB-based. There may in the future be a Python-based version two, which will be much more advanced in many respects, but that's not there yet. Okay, so we missed these extensions, but that's not very sad.
I showed you some of these slides, and this part, I think, we discussed a little bit at the blackboard. There would have been a few more slides talking about this, and about interesting models of active inference and how that works. So next time, maybe. Thanks for bearing with me. Tomorrow we'll have the exam, and I don't think there's anything to fear in the exam. Any final questions?