The common hypothesis nowadays, though it's quite recent, so I'll talk about this. If you thought the previous lecture was high level, this is even higher level and maybe more abstract. The question is whether it is useful to think of the brain as a probabilistic machine, a Bayesian machine. I'll describe what that means. The starting point is the idea that the challenge faced by the brain is that of uncertainty. In each task that we face every day there is a huge amount of uncertainty, and we have to deal with it, whether we are recognizing people, choosing a way to go through the woods, et cetera. There is a lot of ambiguity and uncertainty. The world is uncertain, and the way it is perceived by our receptors adds ambiguity: in vision, for example, a 3D world is projected onto a 2D retina. There is also neural noise. For many reasons, there are multiple interpretations of the world at every moment. The idea is that the brain has evolved to deal with this uncertainty, to represent it, and to compute with it. The idea itself is not new. It is often attributed to Hermann von Helmholtz: perception can be thought of as unconscious inference. You can think of the brain as trying to guess what is happening in the world in an unconscious way, making the best guess about what is happening. So this idea is not new, but it has only recently been formalized. In the last 20 years it has been formalized by people coming from backgrounds such as machine learning, who have proposed that the purpose of the brain really is to infer the state of the world from noisy and incomplete data.
So they think of perception in terms of Bayesian inference, where the brain would be, at each moment in time, computing the probability of some hypothesis being true, for example H1, given the evidence. The idea is that the brain would compute probabilities of hypotheses being true, and that it would do this optimally, where optimally means by using Bayes' rule, which is just this: P(H|E) = P(E|H) P(H) / P(E). So if you want to compute the probability of some hypothesis being true given some evidence, you can do that by looking at the probability of the evidence being there when you know the hypothesis is true, that is your likelihood, times the probability of the hypothesis being true. For example, suppose I want to recognize this, and my hypothesis is that it is a bottle, given everything I see. I do this by using a model of what the image should look like if this is a bottle, that's my likelihood, times the probability of bottles existing in my world. If it's a world where there are no bottles, this should constrain my interpretation so that I don't put a lot of weight on thinking that this is a bottle. That's the prior. Has everybody seen Bayes' rule before? So the idea is that the brain would do something like this: it would use likelihoods and priors to compute the probability of some hypothesis being true, and that probability distribution would be somehow implicitly or explicitly represented in the brain. Alexandre Pouget, for example, a famous researcher in the field, says very explicitly that the brain represents probability distributions: instead of manipulating single numbers, it always manipulates probabilities of things being there, or being correct, et cetera. So you can first ask yourself whether this describes human behavior well, whether people really behave as Bayesian observers.
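To make Bayes' rule concrete, here is a minimal numerical sketch of the bottle example. The likelihood and prior values are made up for illustration; the point is that even when the evidence fits "bottle" best, a strong prior against bottles can keep the posterior low.

```python
# Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E), where the evidence term
# P(E) is the sum of P(E|H) * P(H) over all competing hypotheses.
def posterior(likelihoods, priors):
    """Normalized posterior over a discrete set of hypotheses."""
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# Hypothetical numbers: H1 = "bottle" explains the image well
# (likelihood 0.9) versus H2 = "vase" (0.3), but bottles are rare
# in this world (prior 0.2 versus 0.8).
post = posterior([0.9, 0.3], [0.2, 0.8])
# The prior wins here: P(bottle | image) is about 0.43, below 0.5,
# despite the better likelihood.
```

So a rare hypothesis needs proportionally stronger evidence before it dominates the posterior, which is exactly the constraint the prior is meant to impose.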
One usefulness of these Bayesian models is that they give you a benchmark for performance: you can always compare the performance of human beings with that of the Bayesian model. So what does it mean to say that human brains would be Bayes optimal? The kind of optimality we are talking about is not that of achieving the level of performance afforded by the stimulus itself. For example, when we go to the movies, we don't see sequences of static images; we see motion, motion that is not really there, an illusion. So we are not optimal in that sense. The question is not whether we can extract all the information that is in the stimulus itself, but whether we take into account the uncertainty in the stimulus at each level of measurement, whether we manipulate this uncertainty in an optimal way, and whether we combine it optimally with previous experience, with what we have learned before about the world we live in. The Bayesian framework makes a lot of predictions at the behavioral level, and that is mostly where people work nowadays, although there is a big effort, which I'll come back to, to go down to a more neural level and try to understand what this means for the brain. So the first question people have asked is how people integrate different signals which have different reliability. This is again about uncertainty, about how people manipulate uncertainty. If you look at a situation where we are combining different cues, for example vision and audition, you can ask whether we are combining these cues optimally given the uncertainty. An example of this is ventriloquism.
Ventriloquism is a situation where we have visual inputs and auditory inputs and we combine the two, and you can ask yourself why we get tricked and how we combine those two types of input. The Bayesian models tell you that we should get tricked in this situation, and you can quantify exactly, within the Bayesian framework, how you should get tricked given the stimulus. So this is a situation where we try to localize the origin of a sound given visual input and auditory input. The Bayesian framework tells you that you should combine the uncertainties in the inputs such that if one cue is more reliable than the other, the final estimate is shifted toward the more reliable cue. If you have a visual cue which is very precise and an auditory cue which is very fuzzy, the final estimate of the position of the sound should be shifted toward the more reliable cue, the visual one. We can quantify this very precisely, and I'll show you the equations later. If you look at discriminability, the Bayesian models tell you that if you have both visual and auditory cues, the discrimination threshold should be lower than that for either modality alone, so you should be better in a way. The combined discrimination threshold is a combination of the two single-modality thresholds, and it should be lower than both. Again, you have a very quantitative prediction of what it should be if you are Bayes optimal. Unfortunately I don't have all the curves here, but the idea is that you have two likelihoods, one for vision and one for audition, and you form a posterior which is a combination of those two likelihoods; depending on how noisy or how sharp they are, the posterior is shifted toward the more reliable cue, here vision.
Here they have the same kind of width, and so your posterior is just in between the two cues. This is just to show you the kind of math involved; it's very simple, actually. We are in a situation where we are trying to localize the origin of a sound based on visual and auditory cues. X is our estimate of the position of the sound, and we are going to use maximum-likelihood estimation (with a flat prior, this is the same as taking the maximum of the posterior). So we compute the posterior over position based on the two types of evidence, D1 and D2, the visual and auditory cues. We just use Bayes' rule: the posterior is the likelihood times the prior. We assume that the noise in the visual input and in the auditory input is independent, so the combined likelihood is the product of the individual likelihoods. And we assume that those likelihoods are Gaussian, that the noise in each modality can be described as Gaussian. Then we plug the Gaussian expressions into the line before to compute the posterior. If we have two Gaussian likelihoods, we have a Gaussian posterior: we multiply the two Gaussians, and this is the expression for the product of two Gaussians. We just rearrange the terms and arrive at our posterior, a Gaussian distribution with a new mean that depends on the means of the individual likelihoods and a variance that depends on the individual variances. This gives us the position of the posterior.
To estimate the location of the sound, we choose the maximum of this Gaussian, or equivalently its mean, which is the same thing. The position of this Gaussian tells you where you are going to guess the sound comes from; it tells you about the final bias given the two cues, and the variance tells you about the precision of this estimate given the variances of the individual cues. Is that more or less clear? So it's a very simple model: a Bayesian model of combining two cues, the visual cue and the auditory cue, where we assume the noise is Gaussian and independent for each modality. We compute the posterior using Bayes' rule, and this posterior is a Gaussian whose mean depends on the means of the individual distributions and whose variance depends on the individual variances. Given this, we have a very precise prediction of the final estimate for the combined modalities and of its precision. From this, we can predict where the position of the sound will be estimated, given the position of each individual input, and the discrimination threshold, given each variance. You can look back at the math to convince yourself of it if you don't see it right now; the details don't really matter, this is just to illustrate the approach. It tells you that the final estimate is pushed toward the more reliable cue, the one with the smallest variance, and that the combined discrimination threshold depends on the individual variances; if you play with the expression, you will find that it is a quantity smaller than each individual variance.
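The cue-combination result just described can be sketched in a few lines (a minimal sketch with made-up numbers, not the actual experimental code): the posterior mean is a precision-weighted average of the two cue means, and the posterior variance is smaller than either cue's variance.

```python
def combine_gaussian_cues(mu1, var1, mu2, var2):
    """Posterior mean and variance for two independent Gaussian
    likelihoods with a flat prior: a precision-weighted average
    of the two cue means."""
    w1 = (1.0 / var1) / (1.0 / var1 + 1.0 / var2)
    mu = w1 * mu1 + (1.0 - w1) * mu2
    var = 1.0 / (1.0 / var1 + 1.0 / var2)
    return mu, var

# Hypothetical ventriloquism numbers: a sharp visual cue at 0 degrees
# (variance 1) and a fuzzy auditory cue at 10 degrees (variance 9).
mu, var = combine_gaussian_cues(0.0, 1.0, 10.0, 9.0)
# The combined estimate (1 degree) sits close to the reliable visual
# cue, and the combined variance (0.9) is below both cue variances.
```

This is the whole model: the weights depend only on the relative reliabilities, so making one cue noisier smoothly shifts the estimate toward the other.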
So you arrive at these predictions: the final estimate is pushed toward the more reliable cue, and the combined discrimination threshold is smaller than each individual threshold; if you know each individual threshold, you can predict the combined one. Is that clear? What people have done, then, is try to measure these things. This is a situation where people manipulated touch and vision: a very famous Nature paper from 2002 which, I think, triggered a lot of research in this area. It is the work of Ernst and Banks. They had people estimate the height of a ridge, which they could both touch and see, with noise superimposed on the visual input using a virtual-reality system. What they did is manipulate the noise in the visual cue but not in the haptic cue. They then measured, for each modality individually, vision only or touch only, the discrimination threshold: the smallest difference you can perceive. Given the Bayesian model, once you have measured the individual discrimination thresholds for the visual and haptic modalities, you have a very clear prediction for the bimodal discrimination threshold. So that's what they did: they measured the individual discrimination thresholds, computed the prediction of the Bayesian model, and compared it with the measured threshold for the combined condition. So that's the data. They varied the noise level. First they measured the discrimination threshold for touch alone and found this level. Then for vision alone, which depends on the level of noise: those are the blue dots. The discrimination threshold increases as the visual noise increases.
Now that you have haptic alone and vision alone, you can predict with the Bayesian model the discrimination threshold when the two modalities are combined: that's the gray curve. Then they measured the discrimination threshold for vision and touch together, the pink dots, and they found that the pink dots fall onto the prediction, meaning that here human beings behave in a Bayes-optimal way. [Question from the audience: could they effectively make the visual noise different on each trial, rather than subjects getting used to a particular level of visual noise and taking that into account?] I think there were blocks, but changing quite rapidly, so it was more or less trial-to-trial noise. And presumably it was perceptually obvious what level of noise there was: the stimulus was sometimes blurred, or there were dots moving around, something like that. And so you find that when one cue is very precise, a low noise level for the visual cue, the Bayesian estimate of the combined discrimination threshold is very similar to the visual discrimination threshold alone. On the contrary, when there is a lot of noise in the visual modality, the combined discrimination threshold is very similar to that of the haptic cue alone. So the combined discrimination threshold is lower than that of each modality, but when one modality is much more reliable than the other, it is very similar to that of the most reliable modality. The situation where the discrimination threshold is extremely close to that of one modality alone is called capture.
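The threshold prediction being tested here follows from the same variance formula. Here is a sketch with hypothetical threshold values (thresholds are taken as proportional to the likelihood standard deviations): the bimodal threshold is below both unimodal thresholds, and when one cue is far noisier, the prediction collapses onto the better cue, which is the capture regime.

```python
import math

def bimodal_threshold(t1, t2):
    """Predicted discrimination threshold for optimally combined cues,
    assuming thresholds scale with the likelihood standard deviations."""
    return math.sqrt((t1**2 * t2**2) / (t1**2 + t2**2))

# Hypothetical thresholds: haptic fixed at 0.06 while visual noise grows.
t_hap = 0.06
predictions = [bimodal_threshold(t_vis, t_hap) for t_vis in (0.02, 0.06, 0.20)]
# Low visual noise: the prediction sits near the visual threshold
# (visual capture). High visual noise: the prediction sits near the
# haptic threshold (haptic capture). Equal noise: improvement by sqrt(2).
```

This is why the pink dots hugging the gray curve is informative: the prediction has no free parameters once the unimodal thresholds are measured.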
When one modality is much more reliable than the other, it is as if it captures the perception. That is the case in ventriloquism: you have a visual cue which is much more reliable than the auditory cue, so it captures the posterior, if you want, and your final estimate is completely dominated by that one modality because it is much more reliable; you believe the sound comes from the location of the visual cue. So that was a very important paper in 2002, with the claim that humans are statistically optimal. Since then there has been a lot of debate about what it means to be statistically optimal, in what sense this is true, and whether it has anything to do with statistical optimality or just with combining uncertainty in an optimal way, which is a bit different. In any case, it has been replicated by many people with different modalities and different experimental situations, for example using vision and audition by Alais and Burr, in a situation very similar to ventriloquism, which they compare to ventriloquism. So there has been a lot of psychophysics in that direction, with claims that humans are Bayes optimal in this sense for cue combination, and this has triggered a lot of research on the Bayesian brain hypothesis. But you have to understand it is a very specific situation: cue combination, combining two modalities with different uncertainties. Now I am going to describe another line of work in this field, about priors. In Bayes' rule, the idea is to compute posteriors based on likelihoods and priors. There has been much less work on the priors that the brain would use to compute the probability of a hypothesis being true. So the question you can ask yourself is: what priors is the brain using, and how do they influence perception?
The prediction of these Bayesian models is that the more uncertain the data, the more you should rely on your prior beliefs. In Bayesian terms, you compute your posterior from the likelihood and the prior; the wider the likelihood, the more the posterior is attracted toward the prior. So the more uncertain you are about the world, the more you should rely on your previous experience: that's what it means. Then there is this other idea in Bayesian models that the priors should reflect the statistics of the world, your previous experience in the statistical sense. What is very unclear is the time scale we are talking about: if priors reflect the statistics of the sensory world, it is not clear at all whether these are statistics over minutes, days, years, a developmental time scale, et cetera. I am going to describe some work I have done in this field, trying to identify what kind of priors people work with in perception, how they learn priors, and also how they might unlearn priors they have acquired over a lifetime. Finally, I'll say something about whether this tells us anything about the brain at all. People have looked a lot at visual illusions to try to understand the kind of assumptions the brain makes when it perceives, and it is very clear that the brain works with very strong assumptions about the world. One of these is that light comes from above. If you look at these stimuli, I am sure you will see this one as convex and this one as concave, and I hope it works. This is because you assume that light comes from above; that's why this looks convex. If you assumed that light came from below, you would see the reverse: this one convex and that one concave. But we uniformly make this assumption, completely unconsciously, that light comes from above.
There are a number of other assumptions we make when we perceive. One is that objects are symmetrical: we often see things as more symmetric than they really are, and this has been measured. We also assume that contours are smooth in space and in time, so we are often biased to see things as smoother than they really are. If we look at the orientations of lines, we assume that orientations occur more frequently at the cardinal orientations. And another assumption that I am particularly interested in is that objects in the world are static or, when they move, move only slowly. Interestingly, all these assumptions have recently been formalized in Bayesian terms and used to explain visual illusions. So I'll describe a little this assumption that objects don't move fast, that they are static or move slowly, which we call the prior on slow speeds. Initially, people looked at the aperture problem, which I guess you have seen before. If you look at motion behind an aperture like this and I ask you the direction of the line, I am sure everybody will see the direction of the line as perpendicular to its orientation, even if it is not. The idea is that in a situation like this, when the line is occluded, you cannot retrieve any motion component parallel to the line; you don't see it because you don't see the endpoints, and so you assume that there is none. But in reality there is an infinite number of motion directions compatible with the stimulus I showed you. So you can ask yourself: why do you choose to see the motion as perpendicular to the line and not something else? One answer is that this interpretation corresponds to the interpretation with the slowest speed.
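The "slowest consistent interpretation" can be made concrete. This is a small sketch of my own, with the geometry reduced to 2D vectors: behind an aperture, only the velocity component along the line's normal is measurable, and among all velocities sharing that component, the one with no parallel component has the smallest speed.

```python
import math

def slowest_consistent_velocity(normal, measured_component):
    """All velocities v with dot(v, n_hat) = measured_component match
    the aperture view; the minimum-speed one points along the unit
    normal of the line."""
    nx, ny = normal
    norm = math.hypot(nx, ny)
    n_hat = (nx / norm, ny / norm)
    return (measured_component * n_hat[0], measured_component * n_hat[1])

def speed(v):
    return math.hypot(v[0], v[1])

# A line tilted at 45 degrees has normal direction (1, 1); suppose the
# measured normal component of motion is 2 (arbitrary units).
v_min = slowest_consistent_velocity((1.0, 1.0), 2.0)
# Adding any tangential component t * (1, -1)/sqrt(2) leaves the view
# through the aperture identical but can only increase the speed.
v_alt = (v_min[0] + 1.0 / math.sqrt(2), v_min[1] - 1.0 / math.sqrt(2))
```

Geometrically, the consistent velocities form a line in velocity space (the constraint line), and the perpendicular interpretation is the point on that line closest to the origin.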
If you have two of those lines, assuming the motion is perpendicular to the line, with no component parallel to it, corresponds to the smallest displacement from one moment to the next: it is the motion with the slowest speed. So people have worked with this idea that the brain assumes that the speed of objects is slow, and that in a situation of uncertainty we assume that things do not move, or move slowly. There is an important paper published by Yair Weiss, Ted Adelson, and Eero Simoncelli in 2002, which revisited these visual illusions of motion perception and proposed a very simple Bayesian model built on this single idea of a prior on slow speeds: in a situation of uncertainty, assume that the speed of objects is slow. They fed the model a number of the stimuli used in the study of motion illusions and showed that this model explained a lot of data. There are a number of illusions, which I don't have time to describe here, about how you perceive the direction of a rhombus like this depending on its fatness and its contrast: the direction you perceive actually changes as a function of the fatness of the rhombus and its contrast, even if the physical direction does not change. There are other illusions, related to the barber pole and other kinds of funky stimuli, where the perceived direction of the stimulus changes with its contrast. These are all very nicely explained by this very simple assumption of a prior on slow speeds. So it's a nice, simple framework that explains a lot of illusions. More recently, beyond describing the assumptions the brain is supposed to work with, people have tried to measure them directly. I think this was pioneered by Stocker and Simoncelli.
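The core effect in this kind of model can be sketched with Gaussians (illustrative numbers only; the actual model uses a richer likelihood): at low contrast the likelihood is wide, so a zero-centered slow-speed prior pulls the estimate down more, and perceived speed drops.

```python
def perceived_speed(v_obs, likelihood_var, prior_var):
    """Posterior-mean speed estimate with a zero-mean Gaussian prior
    on speed: the observed speed shrinks toward zero, more strongly
    when the likelihood is noisy (i.e. at low contrast)."""
    return v_obs * prior_var / (prior_var + likelihood_var)

# Hypothetical: the same physical speed at high versus low contrast,
# where low contrast means a much noisier likelihood.
v_high = perceived_speed(10.0, likelihood_var=1.0, prior_var=4.0)   # 8.0
v_low = perceived_speed(10.0, likelihood_var=16.0, prior_var=4.0)   # 2.0
```

The same shrinkage applied to the two velocity components of a moving edge is what tilts the perceived direction of the rhombus as its contrast changes.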
They looked at this prior on slow speeds again, and conducted a series of psychophysical experiments to measure precisely, in individual participants, the prior they work with. In a nutshell, they looked at behavior in a task where people compare two gratings of different speeds, and they fit a Bayesian model to the behavior, a Bayesian model with a parameterized prior. Then they looked at which prior best fits the data, so for each participant they can fit a given prior to the behavior. The idea is to find out whether people work with this prior on slow speeds, whether it describes the data in this kind of situation, whether the prior is Gaussian or not, and how it deviates. What they found, in a Nature Neuroscience paper, is that the model proposed before works quite well, although the prior people seem to work with is not Gaussian, but slightly different from Gaussian, while still centered on very slow speeds. So the game nowadays is to measure, in individual people, the kind of prior they work with. That's the idea. I have worked a bit in this field as well, and the question I am asking is about the learning of priors: whether these priors are innate or learned, and on which time scale. The question here is whether people can form new priors for very basic visual features, for example motion direction, and how this influences perception, quantitatively. This is the work of my PhD student, Matthew Chalk, who has now finished, in collaboration with Aaron Seitz at Riverside. In this experiment, people look at clouds of dots moving in a given direction on every trial, and they have to tell me two things.
They have to tell me the direction of the dots by moving an arrow, and they have to tell me whether they have actually seen dots or not. The contrast is very low, so there is ambiguity about whether there is anything on the screen at all. So they have to do an estimation task and a detection task. It looks like this: there are clouds of dots, you have to indicate the direction and then tell me whether you have seen something or not. On some trials there is nothing, or the contrast is very low, but you still have to do the estimation task and then report whether you have seen something. What subjects don't know is that not all motion directions are equally likely: there are two directions which are presented more frequently than the others. The question is whether participants learn this distribution over trials and, if they do, whether it biases their perception in some way. Is that clear? What we found is that people did learn, unconsciously, about this probability distribution of the stimuli, and we could see this in three ways. We found that detection performance was better for the most frequent directions. Here I have folded each graph around zero, and the dotted line represents the most frequent direction. People become better at detecting the most frequent directions, and they are also faster at detecting them. Interestingly, the learning is completely unconscious: if you ask them afterwards whether some directions were more likely than others, they don't know, and if you force them to indicate which directions of motion were more frequent than others, they give you random answers. But it is very clear that they have learned something. So we see this in detection, and we also see it in what we call hallucinations, a very simple kind of hallucination.
There are trials where there is actually nothing on the screen, but people are still forced to do the estimation and then report whether they have seen something or not. There are cases, the black line, where there was nothing on the screen and they told me they had seen something. In those cases, they tend to report the most frequent, expected directions more often: there is nothing on the screen and they tend to perceive what they expect. In the cases where there was nothing on the screen and they correctly told me there was nothing, they show no bias; they report all kinds of directions. So they tend to hallucinate what they expect, the most frequent directions, and we found that this effect developed very fast, within a few minutes: they would start to hallucinate the most frequent directions when there was nothing on the screen. Another effect, which was even stronger, is that when there was something on the screen and they did the estimation task, they showed a bias: they tended to report the direction as more similar to what they expect than it really is. Zero here would mean they report the exact direction of the stimulus; we found large biases, an attractive bias, where they perceive directions as more similar to the most frequent direction than they really are. Is that clear? So then we played this game of trying to measure the priors they are learning. Clearly they learn about the statistics of the stimulus, and we can measure that. What we do is fit a Bayesian model to the performance, to fit this prior for individual participants.
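The attractive bias we measured has the shape you would expect from a Bayesian observer. Here is a minimal sketch (made-up variances, and treating direction as a small linear angle rather than a circular variable) of how a learned prior centered on the frequent direction pulls reports toward it.

```python
def reported_direction(theta_stim, theta_prior, lik_var, prior_var):
    """Posterior-mean direction for a Gaussian likelihood around the
    stimulus and a Gaussian prior around the most frequent direction.
    Small-angle approximation: direction is really a circular variable."""
    w_stim = prior_var / (prior_var + lik_var)  # weight on the stimulus
    return w_stim * theta_stim + (1.0 - w_stim) * theta_prior

# Hypothetical: most frequent direction at 0 degrees, stimulus at 30.
report = reported_direction(30.0, 0.0, lik_var=5.0, prior_var=15.0)
bias = report - 30.0
# The report (22.5 degrees) is attracted toward the expected direction;
# a noisier likelihood (lower contrast) makes the attraction stronger.
```

The same machinery also predicts the hallucinations: with no stimulus at all, the likelihood is flat and the posterior is just the learned prior, peaked at the frequent directions.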
We also have to compare this Bayesian model with other kinds of models, to make sure that the behavior we observe is not due to something else which is not Bayesian at all. So we have done that: I am not going to go into the details, but we did that modeling, fit the data with the Bayesian model, and did a model comparison. We compared the Bayesian model, where we assume participants learn some approximation of the stimulus distribution and combine it with the evidence, with other kinds of models where they would also learn the stimulus distribution but would not combine this prior with the evidence: they would just sample from the prior in situations of uncertainty, or pick the maximum of the distribution. So the alternative models are models where they also learn the stimulus distribution but do not combine this representation optimally with the evidence; that's the idea. You can have a look at the paper if you want all the details. But the point is that it is always important to compare different models; having only a Bayesian model is not a good idea in general. [In answer to a question about the trial procedure:] yes, between trials they have to click, and then it goes on to the next trial. You can call this prior memory if you want; yes, we do. This is an experiment in statistical learning: the idea is that they learn the distribution over trials, and this is what we measure. What's the difference between memory and learning? To me it doesn't matter whether you call it memory or learning: they learn about the stimulus distribution, and then they use this learning, which is some form of memory, in their perception. I am showing that they learn about the statistics of the stimulus and that they use this knowledge.
[A comment from the audience:] Perhaps what this is saying is the following. Say we want to construct a model of how this actually happens, going down to the synapses and molecules and gene expression, all the stuff that Uppie was talking about. Maybe what this work says is exactly what that model has to do: what data we are actually trying to reproduce when we get the model to do this sort of task. It is a very precise characterization, a description of what is actually going on, and that is then what we try to reproduce. We might naively think the brain is just doing some winner-take-all thing, in which case we build a network that does winner-take-all, but that might be the wrong network to build: however detailed the underlying network, that is probably not what the network is actually doing, and it would have led us down the wrong path. [Answer:] Yes, exactly. What we are after is trying to understand how much we can learn in a task like this, and on what time scale, as I am going to describe in the next slides. I am now pushing this kind of work to understand how complex a distribution we can learn, and also how we transfer this knowledge to other kinds of stimuli. The idea is to characterize the process of statistical learning, and its time scale, with the hope that it may give us some idea about the mechanisms involved, although I am not at the level of mechanisms at all here. [Another question:] Is this what people mean by a normative model? [Answer:] The question of optimality is very complex. It is a normative model in that it is optimal in combining a prior distribution with the evidence: a model that works by using Bayes' rule fits the data very well.
It's not optimal in that the prior that is learned is not the exact stimulus distribution; it's just an approximation of it. So the idea is that people do the right kind of thing: they learn about the stimulus distribution, but they are not perfect, in that they only learn an approximation of it, and then they use that approximation optimally to constrain their perception. So it's normative in that it's a Bayesian strategy, but it's not optimal in that the prior that is learned is not perfect: if subjects were perfect, they would learn exactly the stimulus distribution. And we can measure this prior and see how it deviates from the stimulus distribution. Is that clear? Yeah, so now I am pushing this to look at more complicated stimulus distributions, and also at this idea of transfer. If I show you two kinds of dots, some red dots and some green dots, and you've learned about the distribution of the red dots, are you going to use that knowledge to perceive the green dots or not? In which kind of situation are you going to transfer this knowledge, depending maybe on the similarity between the stimuli I am showing you? So in the second and final project I'm going to talk about, we are looking at the slow-speed prior I told you about, and the question we are asking is whether these priors, which are thought to be learned over a long time, maybe a lifetime, can be modified or not. In other words, how plastic are we about learning these things? Is it fixed after a while, or do we remain plastic about learning even the simplest assumptions about the world? So the idea about the slow-speed prior, the fact that people assume that objects are static or don't move much, is that possibly this is true in the world.
The objects in the world are mostly static, and so the idea is that this prior would be learned over a lifetime, or over development, we don't know, but it would correspond to the statistics of the environment. So the question we asked in this experiment is: if we change the statistics of the world in a given experiment, would people update their prior to take this into account? The idea is to expose people to faster speeds and see whether the illusions that are related to this slow-speed prior would be modified or not. Does that make sense? So this is an experiment where we are looking at a field of lines, and they can move in two directions, either perpendicular to the lines or obliquely. In half the trials they move perpendicular to the lines, and in half the trials they move obliquely. The task is to report whether they go perpendicular, which is up, or oblique, which is down: you have to tell me up or down. This was shown before by Jean Lorenceau, who happened to be my previous PhD supervisor. He found 20 years ago that when the display is such that the contrast is very low or the duration is very brief, people are very biased towards perceiving a direction perpendicular to the lines, just like in the aperture problem that I showed you before. So in a situation of uncertainty, you see motion perpendicular to the lines. And the Weiss, Adelson and Simoncelli model I told you about can explain this with a prior for slow speeds. I don't know if you can see it; it's just one second, but can you see it? It's moving, just drifting coherently, and you have to tell me whether it goes up or down. So what we've done is that we had people do this task for one hour over five days, and in each session there was a test block, a training block and another test block.
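The slow-speed-prior account of this perpendicular bias can be sketched as a toy posterior computation in the spirit of the Weiss, Adelson and Simoncelli model. The specific velocities, the prior width, and the contrast-to-noise mapping below are my own illustrative assumptions, not the fitted model.

```python
import numpy as np

# Two velocity hypotheses consistent with the same perpendicular image motion
# of a drifting line: the slow perpendicular one and a faster oblique one.
v_perp = np.array([0.0, 4.0])  # deg/s, perpendicular to the line ("up")
v_obl  = np.array([4.0, 4.0])  # deg/s, oblique, hence a higher speed

def log_posterior_odds(v_a, v_b, v_true, contrast, prior_sd=1.5):
    """Log posterior odds for hypothesis a over hypothesis b.
    Likelihood: Gaussian around the true velocity, broader at low contrast.
    Prior: zero-mean Gaussian over velocity, i.e. a slow-speed prior."""
    like_sd = 1.0 / contrast              # lower contrast -> noisier evidence
    def log_post(v):
        log_like  = -np.sum((v - v_true)**2) / (2 * like_sd**2)
        log_prior = -np.sum(v**2) / (2 * prior_sd**2)
        return log_like + log_prior
    return log_post(v_a) - log_post(v_b)

# True motion is oblique. At high contrast the evidence dominates and the
# odds favour the oblique hypothesis; at low contrast the slow-speed prior
# dominates and the line is seen as moving perpendicular.
print(log_posterior_odds(v_perp, v_obl, v_obl, contrast=1.0))  # negative: oblique wins
print(log_posterior_odds(v_perp, v_obl, v_obl, contrast=0.1))  # positive: perpendicular wins
```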
We had two groups: a group of people doing the task only with slow-moving stimuli, which was four degrees per second in our case, and another group which was trained on faster speeds, so they would be tested on four degrees per second, trained on eight degrees per second, then tested again on four degrees per second. And they do the same task over and over again; that's all they do. The idea is to see whether these two groups would differ, knowing that one is exposed to faster speeds than the other. What we found is that initially, just like Jean Lorenceau found 20 years ago, people are very biased to see motion perpendicular to the lines. So in the first session, where red is one group, blue is the other group, the dashed line is the beginning of the session and the full line is the end of the session, everybody is biased to see motion as perpendicular, whereas if they were veridical, on average they should answer 0.5. Now, when we go from session to session, we find that the red group, which is exposed to the faster speed, starts to see things more veridically, so it's losing this bias. And after a while, it actually starts to see motion as oblique more often than perpendicular. It's exposed to faster speeds, eight degrees per second, and then tested on slower speeds, and now, because they expect a speed faster than the stimulus, they perceive motion as oblique more often, which is consistent with motion that is faster than the stimulus. Is that clear? So there was a bias towards perceiving things as slower than they really were, and now there's a bias towards perceiving them as faster than they are. So this experiment showed that we managed to reverse the visual illusion over the course of a few days. Interestingly, we found two sorts of time scales: one within a given session, and one over sessions, so from one day to another.
So there was one effect within a session, and then some of this learning was lost from one day to the other, but some of it was also kept, and over a few days it seems that the prior had been updated. Outside these conditions, after a while, are they still affected in everyday life? Yeah, that's possibly what this is: from one day to the other they seem to forget, but that's also because they've been exposed to the normal range of speeds. We don't know; we are going to test this now. Then what we've done is this exercise of fitting the Bayesian model to the behavior, so now the Bayesian model is the Weiss, Adelson and Simoncelli model that I told you about, to try to see what kind of priors people work with and how they change from day to day. And we find that the prior they work with changes from day to day to approximate the stimulus statistics. So in the red group, looking at the mean of the prior, where it's centered, we find that this prior is centered closer and closer to the real stimulus in this experiment. And the Bayesian model works very well here. For us, this shows that this prior, which people thought was learned over a long time scale, can actually be re-updated. We find that there's a long-lasting effect from one day to the other, and this is just due to exposure; there's no feedback in this task, it's just exposure. We think it also clarifies the link between statistical learning, the statistics of the environment, and the existence of this prior and its influence on perception. And it gives us the idea that, even for assumptions that seem very simple and related to very basic properties of the environment, the brain is constantly learning about them. And here we have a notion of time scale as well.
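The two time scales can be pictured with a toy update rule: fast learning within a session, with only part of each day's shift retained overnight. The learning rate, retention fraction, and block counts here are invented for illustration; they are not fitted values from the experiment.

```python
# Toy dynamics of the prior's preferred speed across five daily sessions.
prior_speed = 0.0         # deg/s, starting near the classic slow-speed prior
trained_speed = 8.0       # deg/s, the training speed of the fast group
alpha_within = 0.10       # learning rate per training block within a session
retained_overnight = 0.5  # fraction of the day's shift kept until the next day

for day in range(5):
    start = prior_speed
    for block in range(10):  # training blocks within one session
        prior_speed += alpha_within * (trained_speed - prior_speed)
    # overnight: part of today's learning is forgotten, part is consolidated
    prior_speed = start + retained_overnight * (prior_speed - start)
    print(f"day {day + 1}: prior centred near {prior_speed:.2f} deg/s")
```

Under these assumptions the prior's centre climbs towards the trained speed over the five days without ever quite reaching it, mirroring the gradual reversal of the illusion.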
Yeah, so we are doing new experiments related to this, to try to understand whether people have different priors depending on their own experience. For example, we are going to look at people who play video games: there are a number of studies showing that people who play video games perform differently in a lot of visual tasks, and we think they might also work with priors favoring higher speeds. And we are looking again at this question of transfer: whether this prior that you have updated for this kind of stimulus is going to be used for different kinds of stimuli as well. So that's it for the psychophysical work. I think in general what's found is that Bayesian models are an extremely elegant way to describe psychophysical data, and they are now used to quantify the priors that people work with. I think it's interesting data for looking at what's being learned and what's being used in perception by the brain. The Bayesian hypothesis is now extremely strong in the psychophysical world; it's a major tool that people use in psychophysical studies. Often you will see claims in psychophysics about people being Bayes-optimal, and it's not always clear what that means.