We're talking about the equivalence between the two: you can think of utilities as a kind of unnormalized surprise value, or of surprise as a particular kind of utility. What's important, and we'll come back to it, is that here I have introduced the utility of the whole event space, the reference utility, which is given by the log partition sum, as we said. So think about it this way: I have the space Omega and I cut it up into little pieces x, and I tell you the u of each x. Now I ask you: what is the value of the whole thing? This is what we come back to in a minute when we talk about the certainty equivalent: this log partition sum is the value of the whole thing. It's not just an expectation value of the individual u's, or a sum of these u's, or something like that. Now, if alpha goes to zero, it actually will just be an average of these u's, but in general it is the log partition sum.

We raised before this issue of the alpha: it's a translation factor, and here, for example, it translates between the units of utilities and the units of nats or bits or whatever you choose. Oh, sorry, I have to write higher — it's okay, I'll write it again here. This is the normalization of the probability; this is the case when you take the surprise, where I'm basically just solving this for p and then writing it like this. And now, if we have unnormalized utilities, then this corresponds to this: the surprise corresponds to the utilities when they're normalized, and this normalization factor corresponds, so to say, to the value of the whole thing. That's my point. Now the screen's gone — okay, I don't need the screen actually, so maybe I'll leave it down.

So we said computation is going from a prior to a posterior, and we asked how that happens if I impose an extrinsic utility. I gave the example: I put down money and all of a sudden you do something other than what you would naturally do. How can we think about this? We think that we change the energy function, the utility function, by adding a delta u. That's the operation of putting down money or something like that, and then the overall utility changes. If I want to include that in this way of thinking, I write q and p exactly in the way I introduced them here: I write p depending on these u's, with the exponential and the translation factor. Now I expand, because I assume that the new utility is the original utility plus a utility difference, and then I can re-express the part that depends on q simply as q, so that only the change in utility remains in the exponential. These are very, very simple manipulations. And then I can interpret it like this: I have a prior distribution q of doing something, which is what you would do naturally without being paid, and then there's a utility difference because the external utility has changed. If you're a particle, maybe a magnetic field is switched on; if you're a human, maybe somebody pays you to do something; if you're an animal, maybe your environment has changed;
for example, I don't know, there's a big predator coming. That means the utility function changed drastically, and your behavior is altered accordingly.

What's interesting, and this is what we're going to look at again, is this delta F, because this is sort of the baseline. This simple update has many interesting correspondences in the literature. One is, of course, that you could say it's like a Bayesian update: I go from prior to posterior, and this part corresponds to the likelihood. You will find this formula, for example, in work on Bayesian search, where it would be interpreted as something like a search cost; Jaynes, for example, has worked on this. You can also find it as a version of the replicator equation, if you're interested in evolutionary game theory or evolutionary modeling, because in that case x would be a species in your population, this would be the fitness of x, and this would be the average fitness in the population. The simple interpretation in all of these is: if my fitness as x is larger than this reference value, the probability of x will increase; if it's smaller than this reference value, the probability of x will decrease. That's true for evolution, but also for Bayesian updating and so on.

So what is this delta F? This delta F is what we can think of as the certainty equivalent for the change in utility. We regard Q now as the thing that you would do naturally — that's why we don't write it as a utility anymore — and we're asking what kind of changes this delta u induces and what this delta F means. Again, this delta F has the form of a log partition sum, except that it now has these probabilities inside as well, and that gives it the nice property that the value is well defined for all the different limits of alpha, so that we can interpret alpha as a kind of rationality parameter. For example, if alpha goes to infinity, that means you have basically unlimited computational power: you can choose the best x you want, the prior is irrelevant, you just optimize this external utility perfectly. If alpha goes to zero, that corresponds to the case where you have no computational power: you cannot move away from your prior distribution, you stick with Q, and the best you can hope to earn is the expected utility gain under Q, because you were not able to change. For any alpha in between, you're in between these two. You see that in this graph: on the x-axis there's the rationality parameter, at zero you have the expectation value, and the more alpha you have, the higher your certainty equivalent becomes. Alpha can also be negative — you can try that out — and in that case the value goes to the minimum. Why is that interesting? There are two reasons. One is, for example, if you want to anticipate an opponent, to be able to say what is the worst they can do to me; that's important when you plan, like in these minimax trees. But it's also important when you talk about robust decision-making, where you're essentially planning for worst-case scenarios; we'll come back to that later. So the important thing here is that we want to advance the idea that this delta F should replace the notion of the certainty equivalent.
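As a concrete illustration, here is a minimal numerical sketch of this generalized certainty equivalent, Delta F(alpha) = (1/alpha) log sum_x q(x) exp(alpha * Delta u(x)); the prior q and the utility gains du below are made-up numbers, not from the lecture:

```python
import numpy as np

def certainty_equivalent(q, du, alpha):
    """Free-energy certainty equivalent (1/alpha)*log sum_x q(x)*exp(alpha*du(x)).

    q     : prior probabilities over the options
    du    : utility gain of each option
    alpha : rationality / resource parameter
    """
    if alpha == 0.0:                      # limit alpha -> 0: plain expectation
        return float(np.dot(q, du))
    # log-sum-exp with the log-prior folded in, for numerical stability
    z = alpha * du + np.log(q)
    return float((z.max() + np.log(np.exp(z - z.max()).sum())) / alpha)

q  = np.array([0.25, 0.25, 0.25, 0.25])   # hypothetical prior over four options
du = np.array([1.0, 0.2, 0.1, 1.05])      # hypothetical utility gains

for alpha in [-50.0, -1.0, 0.0, 1.0, 50.0]:
    print(alpha, certainty_equivalent(q, du, alpha))
# alpha -> +inf approaches max(du) = 1.05, alpha -> -inf approaches min(du) = 0.1,
# and alpha = 0 gives the prior expectation 0.5875.
```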
It includes the expectation operation as a special case, but it allows for many more certainty-equivalent operations, depending on the resource parameter alpha. If you want to relate this to the initial picture I was showing you: we said we have this set X, we want to choose something, and there is the set of acceptable options, and we said maybe we have a distribution over that — the initial set X may also not carry a uniform distribution. That would be the distribution Q, and we then change it through deliberation into the distribution P. That's the idea, and the reason we change is that our utility function has changed.

Here's another interesting aspect of this delta F, and it looks like a simple physics example, but there is nothing special about physics here; you get it from simple mathematical considerations. The idea is that in physics this delta F gives you the best that you can get out of a system in terms of work. Work usually means pushing pistons — if you've ever sat in a thermodynamics class, that's what we do all the time — and in the best case you would express the free energy as the work that you do with the piston pushing against the pressure and so on. The details are not so important. But now imagine that instead of one piston we have several small pistons that we push, and we can express the work for each of these small pistons. You might think that the work of the big piston is the expectation value over the small pistons, but that's not true: what you get is this expectation value plus an extra information cost term. It's exactly the same as what I said a few slides earlier, where you have this space, you split it up into different regions, you know the utilities for these regions, and you ask how to get the utility of the whole thing. It's basically what physicists would call coarse-graining: depending on how far you want to zoom in, if I know the energies for these compartments, what is the energy of the bigger compartment? The important thing is that it's not just the expectation value; you have this extra term. Essentially, where this comes from is that you postulate this relationship between probabilities and utilities through the log function, and you also postulate that the laws of probability hold: if we introduce a new partition, everything should still sum to one, and if I sum over x and y, or just over x, this should equal p of y, because of marginalization and so on. If you then want to represent the marginal also as an exponential of some utility, that utility will have exactly this shape. That's where it comes from: it has this recursive property under coarse-graining. One could even argue, if you want to take this idea to the extreme, that utilities don't exist and the only things that exist are these free energies, because everything you call a utility, if you zoomed in further and partitioned the space even smaller, you would find there is again an energy for each of these smaller pieces — call them y or z or whatever — and then what you put here would again be a free energy. We could say: take each x and partition it further into y's, and you would again get another free energy. It just depends on how far you want to zoom into reality: the bottom level where you stop zooming in you call the utility, but if you could zoom in more, it would again be a free energy, if you see what I mean. It would always have this recursive relationship.
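Here is a tiny sketch of that consistency, with made-up fine-grained utilities: the coarse "utility" of each cell is itself a log-partition sum over its parts, and the coarse probabilities then match the marginals exactly:

```python
import numpy as np

alpha = 1.0
# hypothetical fine-grained utilities u(x, y): 2 coarse cells x, 3 fine cells y each
u_fine = np.array([[0.3, 0.8, 0.1],
                   [0.5, 0.4, 0.9]])

# fine-grained probabilities p(x, y) proportional to exp(alpha * u(x, y))
p_fine = np.exp(alpha * u_fine)
p_fine /= p_fine.sum()

# the coarse "utility" of each cell x is a log-partition sum over its parts ...
u_coarse = np.log(np.exp(alpha * u_fine).sum(axis=1)) / alpha

# ... and p(x) proportional to exp(alpha * u_coarse(x)) reproduces the marginal
p_coarse = np.exp(alpha * u_coarse)
p_coarse /= p_coarse.sum()
print(np.allclose(p_coarse, p_fine.sum(axis=1)))   # True: the recursion is consistent
```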
So now we want to see how we can think of this in terms of a variational principle. We start with this first line here; where does it come from? It's just this line: if I solve it for delta F, I get this, with only very simple manipulations. Now, this has to be true if P is chosen as what we call the equilibrium distribution. If it is chosen like this, then this top equation is true for every single x, and of course then it's also true if I take an expectation with respect to some W. So this holds equally, and then you notice that if I now replace this P by a W, I have a variational principle such that if W equals P, I get an extremum. That means this distribution here becomes the extremum of this functional. So basically we've created a variational principle to find the equilibrium distribution: we can try out all probability distributions W, and the one that is best is this one. And this here is again the relative entropy between W and Q, the Kullback-Leibler divergence that we mentioned before; now it has returned, as promised. And of course this is saying the same thing, just written differently: it's a variational principle, so if you try out all the different distributions, you will find that this is the best one.

So here's an example where this holds, maybe one that you know already or have discussed already: variational Bayesian inference. Variational Bayesian inference is a technique used in machine learning when you want to compute Bayesian posteriors that are so complicated that you can't compute them directly. Say this is the posterior over your hypothesis H given some data — the true one, which you cannot compute or represent — but you have some other distribution, for example a Gaussian; that would be the distribution Q, and you want to find the best Q to approximate the true posterior as closely as possible. Usually this problem is formulated like this: this is always a strictly positive quantity, so you have a lower bound given by the likelihood of the data, and you change Q — for example the mean and the variance of the Gaussian — to minimize this. That's how it's usually studied. But we can rewrite it in a form such that variational inference looks exactly like the format we've been looking at: we split the posterior into prior and likelihood, we treat the log likelihood as if it were a utility, and then we have a trade-off between two terms, one that maximizes the utility and one that is this information term. The distribution that trades off these two optimally is the same one that you get here. Maybe I didn't say that explicitly on the previous slides; I should have said that.
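To make the variational statement concrete, here is a small sketch (the prior q and utility change du are invented numbers): the equilibrium p proportional to q*exp(alpha*du) should beat every other candidate distribution w on the functional E_w[du] - (1/alpha)*KL(w||q):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 2.0
q  = np.array([0.4, 0.3, 0.2, 0.1])     # hypothetical prior
du = np.array([0.0, 1.0, 2.0, 3.0])     # hypothetical utility change

def F(w):
    """Free-energy functional: expected utility gain minus (1/alpha)*KL(w || q)."""
    kl = np.sum(w * np.log(w / q))
    return np.dot(w, du) - kl / alpha

p = q * np.exp(alpha * du)
p /= p.sum()                            # claimed equilibrium distribution

# try many random candidate distributions w: none should beat the equilibrium
best = max(F(rng.dirichlet(np.ones(4))) for _ in range(10000))
print(F(p) >= best, F(p))   # True -- and F(p) equals (1/alpha)*log(partition sum)
```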
So the idea with this variational principle is that you trade off two things: on the one hand, you want to choose a distribution that gives you as much utility gain as possible; on the other hand, you have these information-processing costs. There is something you lose in value because you have to move away from the prior distribution. If you want to give it a bounded-rationality interpretation: moving away from the prior distribution is costly, you have to spend resources. And whatever resource you spend — time, money, whatever — it should be monotonic in this DKL: for every extra resource you invest, you want to be able to rule out more bad alternatives, so to say. If it doesn't do that, don't spend the extra resource; it must bring you a little bit closer to what you want. That's why we can think of this Kullback-Leibler divergence as a general information-processing cost. That's the idea, and in the variational inference example we can think of it exactly like that: on the one hand we want to optimize the likelihood of the data, on the other hand we don't want to move too far away from the prior, and these two things are traded off.

Now here's a little example to make it a bit more intuitive. Imagine you have to choose between four different actions, a1 to a4, and imagine this is the utility function for each action. This example is a little bit engineered to hammer home the message, but it may still be useful. It's engineered so that a1 and a4 have almost the same utility, but a4 is a tiny bit better — maybe you cannot see it, which is why I'm telling you — and a2 and a3 are a little bit rubbish. Let's say initially you have no preference among the actions, so we assume a uniform prior. And this beta now is the alpha from the previous slides. If we have a low beta — but not zero — then we put most probability mass on a1 and a4. If we have a very high beta, meaning lots of rationality, we figure out that a4 is actually the best, and we put all the probability mass on a4. Now look at these plots: this is beta, the rationality parameter, against the utility. You see that very quickly your expected utility gets close to the maximum. Why? Because for the choice you actually have to make, you don't have to spend two bits but only one bit. That's what you see in this graph: with one bit of information spent, you can distinguish these two versus these two — the rubbish versus the not-so-rubbish states — and that's enough to give you a good utility. If you want to improve even more, spending the extra bit of information will be a lot of effort but result in hardly any utility gain. That's the point: you can achieve almost — not a hundred, but 95% — of the utility with one bit, and then it becomes more and more expensive. So when you have limited resources, it makes sense to ask how you should spend them. If you can only spend one bit, which two sets do you want to distinguish? Distinguish the ones that bring more utility from the bad ones; that's the best way to do it.
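Here is a quick numerical version of this four-action story; the utility values are invented to mimic the slide (two nearly tied good actions, two poor ones):

```python
import numpy as np

u = np.array([1.0, 0.1, 0.15, 1.01])   # a1 and a4 nearly tied, a4 slightly better
q = np.ones(4) / 4                     # uniform prior over the four actions

for beta in [0.1, 1.0, 5.0, 20.0, 100.0]:
    p = q * np.exp(beta * u)
    p /= p.sum()
    info = np.sum(p * np.log2(p / q))          # information cost in bits
    print(f"beta={beta:6.1f}  E[u]={np.dot(p, u):.3f}  bits={info:.2f}")
# Around one bit the posterior has already separated {a1, a4} from {a2, a3}
# and E[u] is near its maximum; the second bit (singling out a4) buys almost nothing.
```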
So here is an example of how this could work as a mechanistic process. — Yes, yes: you produce one bit of information, which, because the prior is uniform, corresponds exactly to making this distinction between these two sets. — What happens if the utility is uncertain? That would already be an advanced example, but I would say the way to do it is that the utility does not only depend on x but on some hidden variable z, and you have a distribution p of z that could or could not depend on x; then you have to apply this reasoning to two random variables and solve it that way. But that would already be an advanced example, so to say. — Yes, if you have uncertainty as well and you're learning — yes, that's right. But my point here is that even if you don't have uncertainty — this is what I was saying about chess: chess is boring, there's no uncertainty, but actually we're limited, and so for us there effectively is uncertainty. The same holds here: this example is so simple that you can query each value of the utility, so in that sense there's no uncertainty, but you could still be limited, and then it could still be optimal to do this, in a certain bounded-optimal sense. Otherwise it would indeed make sense to put 100% on the best action, and if you have uncertainty on top, of course, then even that doesn't make sense anymore. That's right.

Okay, so here's a little example, a simple mechanistic process. The process itself is not so important; it's just to give you an idea of how to think about this whole framework: rejection sampling. It's a method that was invented by von Neumann, I think, so it's very old. The task is to obtain a sample from a distribution P, but the problem is you cannot sample from P; you can only sample from a distribution P0. So they engineered this method where you say: I also sample from a uniform distribution, this U, and I sample an x from P0, which would be our prior, and then I decide whether to accept x or not. And if I accept the right x's and throw away the wrong x's, then my accepted samples from P0 will be distributed as if they were drawn from P. That's the idea of the method. Why am I telling you about this? Because the question is what the formulas I'm showing you mean. Do they mean you actually have to compute this free energy and make this trade-off explicitly? The answer is no: this is just a description. You could do that if you wanted to, but you don't have to. For example, here you have a decision maker that has never computed a free energy in his life. All he does is sample from P0, the prior, then look at the utility of the sample, and apply this acceptance rule: he has a target value, and if the utility of the sample is at the target or above, he accepts for sure; if the utility is smaller than the target, he also accepts, but only with some probability; and if it's far from the target, he will almost never accept, because the acceptance probability becomes very small. You can think of it as a stochastic version of satisficing: the T tells you what your goal is, and you accept relative to that. If you do that, you are guaranteed to draw samples from P, and it's enough to produce a single sample to make a decision.
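Here is a minimal sketch of that decision maker, with invented utilities and prior; the aspiration level T is set to the maximum utility, so the acceptance probability exp(beta*(u - T)) never exceeds one:

```python
import numpy as np

rng = np.random.default_rng(1)
beta = 3.0
u = np.array([0.2, 0.5, 1.0, 0.1])      # hypothetical utilities
q = np.array([0.4, 0.2, 0.2, 0.2])      # prior P0 that we can sample from
T = u.max()                             # target / aspiration level

def sample_choice():
    """Draw from the equilibrium p(x) ~ q(x)*exp(beta*u(x)) without ever
    computing a free energy: sample the prior, accept with a satisficing rule."""
    while True:
        x = rng.choice(4, p=q)
        if rng.random() < np.exp(beta * (u[x] - T)):   # accept-prob 1 at the target
            return x

draws = np.array([sample_choice() for _ in range(20000)])
p_target = q * np.exp(beta * u)
p_target /= p_target.sum()
print(np.bincount(draws, minlength=4) / 20000)   # empirical frequencies ...
print(p_target)                                  # ... match the equilibrium p
```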
You generate one sample from P and you're done: you don't need to know the whole distribution, you don't need to compute this free energy stuff, and you don't need to solve a constrained optimization problem. You just keep optimizing these u's until you run out of resources. And it's clear that the number of samples you look at is again where this alpha parameter comes in: how strict are you? Are you only content with the optimum — if this here is your T, and this is the acceptance probability, this is one — or are you going to accept things that are worse on the utility axis? Or are you strict and say: I accept this with probability one and everything else with probability zero? Of course, the stricter you are, the more samples you will need to see on average to find one with such a high utility when you're sampling from the prior. So the more alpha you want — the stricter, the better you want to be — the more samples you have to spend. In this simple case you can actually compute the expected number of samples, and importantly it is, as it has to be because that makes sense, a monotonic function of the DKL: the more DKL you want, the more samples you have to spend, so to say.

As I said, the idea of this slide was not so much to promote rejection sampling or anything like that; it's to say that even if you don't optimize these equations explicitly, I can look at you and say: okay, that's what you're doing. That's also an argument against people who say it doesn't make sense to treat bounded rationality as an optimization problem, because then you have to solve a constrained optimization problem, which again costs resources, and you're basically back where you started. But that's exactly the point: this guy is not solving a constrained optimization problem; he's just trying to optimize the utility until he runs out of resources. Still, we can look at him and say: he's acting as if he were computing P.

So we can use this bounded rationality as a normative framework, where we put information on one axis and expected utility on the other. I'm saying: given a particular task — a particular utility function and a prior — if I spend a certain amount of information, what is the best utility I can achieve? That gives me this efficiency frontier, and everything above the frontier is unachievable, no matter what kind of algorithm you come up with. It's a little like Shannon's rate-distortion plot, where you say: with this distortion, this is the best rate. Here it's the same: for a particular task with a particular utility, if you spend one bit of information, this line shows you the best utility you can achieve with that amount of information, and no matter how you build your agent, you will not be able to achieve more. And it's quite a simple statement, because in the end the information tells you how many distinctions you can make in the world: if you have more information, you can make more distinctions, and if you can make more distinctions, you have a better chance of finding something better.
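Here is a sketch of how one could place a measured agent against that frontier; the task (same toy utilities as above) and the "measured" behaviour are both invented for illustration:

```python
import numpy as np

q = np.ones(4) / 4                        # task prior
u = np.array([1.0, 0.1, 0.15, 1.01])      # task utilities (same toy task as above)

def frontier_point(beta):
    """Best achievable (information, expected utility) for one resource level."""
    p = q * np.exp(beta * u)
    p /= p.sum()
    return np.sum(p * np.log2(p / q)), float(np.dot(p, u))

# a hypothetical measured behaviour that misallocates some mass to bad actions
p_meas = np.array([0.45, 0.20, 0.05, 0.30])
info_meas = np.sum(p_meas * np.log2(p_meas / q))
eu_meas = float(np.dot(p_meas, u))

# frontier utility at the same information expenditure (sweep beta, interpolate)
betas = np.linspace(0.0, 50.0, 2000)
infos, eus = zip(*(frontier_point(b) for b in betas))
eu_bound = np.interp(info_meas, infos, eus)
print(f"measured: {info_meas:.2f} bits, E[u]={eu_meas:.3f}; "
      f"frontier at same bits: {eu_bound:.3f}; efficiency={eu_meas / eu_bound:.2f}")
```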
But again, this curve does not tell you any kind of mechanism for how to find such a policy — just like in Shannon's case, where it doesn't say how to find a code; it just tells you that one exists. And nobody is saying you have to lie on this line: you can be below the line, you just cannot be above it. So I could measure you in an experiment: I give you a utility function, I measure your behavior, so I have the probability p, I know what utility you achieve, I can compute from the probability what information you produced, and then I can place you as a dot in this plane. Then I can look at how close you are to this efficiency frontier, and I can even define an efficiency parameter by asking how close you are to the line — essentially, how bounded-optimal you are. This line tells you what bounded-optimal behavior is, so to say, and the perfect case would of course be the one at the top, where you spend the maximum information needed to solve the task perfectly.

So, to sum up quickly: we have one variational principle, where we vary a distribution p, and the same principle can be used to describe action, perception, and learning. Why am I saying that? Because, as I told you, we can model Bayesian inference this way if we take the utility to be the log likelihood; then we look for the best p, which in that case is just the Bayesian posterior. So via Bayesian inference we can model learning and perception; we can also use this to model bounded decision-making, as I was just arguing; and we can use the same thing to model robust decision-making — I'll come back to that later, so I don't have to explain it twice. So: we have a utility function, a prior, and a resource parameter, and we transform the prior into the posterior. That's the idea.

We're doing fine with time, so maybe I'll go a little quicker through the next slides. It's not so important to understand all the details; it's just to give you an idea that this can also be applied to more complex scenarios. So far we've really only looked at the simplest scenario the whole time — choose an apple versus a banana, basically. That's so simple that if you open an economics book, usually not much time is spent on this particular example, and we spent a lot of time on it. You can apply it, for example, to multi-step decisions. In this case you look at a trajectory x1 to xT, you assume you have a utility function for this trajectory, and then you write down the same optimization problem as before — the same trade-off — and you get the same solution. So far nothing interesting has happened. The maybe more interesting part is when we look at each individual x through time and make more assumptions: we could assume, for example, that the probabilities are Markovian and that the utilities are additive in each x. Why would we assume that? Because these are the assumptions people often make when they solve Markov decision processes or something like that. If we put these assumptions into our equation, we find that we can write it in a sort of recursive fashion. It's basically like this tree: if I wiggle p of x1, everything here moves, which means it affects everything downstream;
if I just wiggle the last decision, that affects only the last part. So what I have to do is just backward induction, like we said earlier: I solve the last problem first, that gives me the utility value for the next problem, and I work my way backwards. That's what I'm doing here: this would be the last problem, the utility of the last step, and that gives me the solution probabilities for each xt. When I do that, I realize — if I put this back into the equation there — that the value of this decision is again given by the log partition sum, which shouldn't be surprising, because that's what we've been saying the whole time. And then you get a recursive relation between these log partition sums at time t and time t minus one; you can express it like that if you want. Emanuel Todorov, for example, has done something similar, just written slightly differently — something he called z-iteration — which he proposed as a way to solve Markov decision problems very efficiently. What you have to do — we'll come to that in a minute — is to express the Markov decision problem slightly differently than you usually do: not with separate actions, but such that you can manipulate the transition probabilities directly, and if you do that, then because of this structure you can solve the MDP with just linear matrix operations. So this is something that just falls out of this bounded rationality analysis. What I'm trying to show you is that with this simple kind of reasoning you can connect a lot of different research projects.

We can also, instead of looking at the partition sum, look at the log partition sum and define that as the value; then instead of the recursion for psi you get a recursion for V, which would look like this. And now if you take alpha to zero — here I have the inverse, so that would mean you have infinite resources in this case; it depends on whether you define it as the temperature or the inverse temperature, whether the limit is zero or infinity, so don't be confused by that — what you get in the limit is that you always pick the minimum, and that is a version of the Bellman recursion. A very simple version of the Bellman recursion, where you always work backwards: the best value is the one that optimizes the cost of the next step plus the value of whatever comes afterwards, and with this backward recursion you can compute the value of everything. This is the basic idea of dynamic programming. So you could say that this dynamic programming proposal of Bellman is also a special case — the one where you have a perfect minimum or maximum operation — and with this alpha you extend it to bounded-rational decision makers that are not able to pick only the maximum, but have broader distributions around the maximum.
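A small sketch of this soft backward induction on a toy chain (the transition prior, utilities, and horizon are invented; I write it with utilities and a maximum, the cost/minimum version is analogous): the value of each state is a log-partition sum over successors, with the Bellman recursion and the plain expectation as the two limits of alpha:

```python
import numpy as np

def soft_backward_induction(q_trans, u, T, alpha):
    """Free-energy backward induction: the value at each state is a log-partition
    sum over successors; large alpha approaches the max-Bellman recursion,
    alpha near 0 the plain expectation under the prior dynamics q_trans."""
    V = np.zeros(q_trans.shape[0])        # terminal value
    for _ in range(T):
        z = alpha * (u + V)               # utility-to-go of each successor
        # V_new(x) = (1/alpha) log sum_x' q(x'|x) exp(alpha*(u(x') + V(x')))
        V = np.log(q_trans @ np.exp(z)) / alpha
    return V

# hypothetical 3-state chain with uniform prior dynamics and one rewarding state
q_trans = np.full((3, 3), 1 / 3)
u = np.array([0.0, 0.0, 1.0])

print(soft_backward_induction(q_trans, u, T=5, alpha=10.0))   # approaches max-value 5
print(soft_backward_induction(q_trans, u, T=5, alpha=0.01))   # approaches E[u]*T = 5/3
```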
The same idea can also be applied to continuous problems. Here's one example where we look at paths; this is something I did when I was in Stefan Schaal's lab, the robotics lab. An object is now a path, a continuous function, and you define the same optimization criterion and get the same solution; that part is not so interesting. But the normalization constant, as you see, is now an integral over paths, because to normalize you always have to go over all the alternatives, and here the alternatives are paths. That is a so-called path integral. So how can you have such a system? For example, you could say: I have a system that evolves according to a certain differential equation, and a bounded-rational decision maker that produces a control signal with some noise, and it has a utility function that is the continuous equivalent of additivity, I guess. So we essentially have diffusion processes, and the mathematics is a little more intricate. You can rephrase this path integral as a differential equation, related to the Hamilton-Jacobi-Bellman optimality equation that you would usually use to solve these tasks anyway, and you get the solution that the optimal bounded-rational controller just follows the gradient of the log partition sum. What you can do to operationalize this — and that's what people in Stefan Schaal's lab then did; this path integral control was invented by Bert Kappen, and Evangelos, who was working in Stefan's lab, used it for high-dimensional robotic control, while what I did was just to show how it fits into this framework — is the following. You get a controller whose drift (I said the controller is a drift-diffusion process) is a probabilistic superposition of controls, where the probability weights are just the Boltzmann factors that take into account the natural, intrinsic utility of the particle — how it would move by itself — and the cost. So in the end you can sample random controls, compute the Boltzmann factors for the trajectories you randomly generated, and that gives you a bounded-rational controller. They applied that quite robustly to all kinds of tasks, and what's nice about it is that you don't need to compute any derivatives or anything like that, so it's quite a robust method.
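In that spirit, here is a toy sampling controller; everything in it — the 1-D dynamics, the cost, the temperature lam — is invented for illustration. Roll out random controls, weight each whole trajectory by its Boltzmann factor, and average:

```python
import numpy as np

rng = np.random.default_rng(2)

def path_integral_control(x0, horizon, n_samples, lam, dt=0.1):
    """Sample random controls, weight whole trajectories by their Boltzmann
    factors, and average: a sampling version of path-integral control.
    Toy 1-D point that should reach position 1.0; no derivatives needed."""
    controls = rng.normal(0.0, 1.0, size=(n_samples, horizon))
    costs = np.zeros(n_samples)
    for k in range(n_samples):
        x = x0
        for t in range(horizon):
            x = x + controls[k, t] * dt          # roll out the random controls
            costs[k] += (x - 1.0) ** 2 * dt      # accumulate state cost on the path
    w = np.exp(-(costs - costs.min()) / lam)     # Boltzmann weight per trajectory
    w /= w.sum()
    return w @ controls[:, 0]                    # weighted average first control

print(path_integral_control(x0=0.0, horizon=20, n_samples=500, lam=0.1))
# positive: the controller drifts toward the target at 1.0
```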
So in these two things I've just shown you, we had paths instead of individual events, but the temperature, this resource parameter, was the same across time. It doesn't have to be like that in principle: as I showed you with these trees in the beginning, you could have a different resource parameter at every node. Here it is still the same for every time step, but you could make it different for each individual node as well, and that way you can create arbitrary decision trees. Depending on how you choose this parameter — positive or negative — you can create different kinds of behaviors: if you choose it, say, infinitely positive, it corresponds to a max operation; if it's zero, to the expectation operation; if it's infinitely negative, to a minimum operation; and if you choose values in between, you can have anything in between, and combine them. So you can create rational behavior, bounded-rational behavior, robust behavior — we'll look at these examples in a minute. This would be a decision tree with completely arbitrary resource parameters, and you can then of course again do a backward induction; in this case that would be a generalized Bellman equation, or dynamic programming equation, saying: the value of this point up there is the value of the immediate cost plus the value of whatever comes next, and of course all the values are given by free energies. So that was just to say that you can go beyond the simple banana-and-apple example that I was discussing with you most of the time, and that would conclude the first part.

If there are no questions, I'll start looking at some attempts of ours to take these abstract things we've talked about and relate them somehow to reality, which is also challenging. So I'm going to describe a series of experiments to you now, to see how we could apply these ideas. We start with a recent study where we looked at how close humans get to this optimum, this efficiency frontier. Then we have quite a few studies on the topic of model uncertainty, which I keep mentioning but never really explained in detail. Then there's a short, more machine-learning-style paper bringing these two ideas together: model uncertainty and limited processing capabilities for action selection. And I also want to explain the idea that bounded rationality is a source for forming abstractions: abstractions are useful because they save information processing and allow you to deal with the world in an efficient way. We're also currently working on an experiment on this.

So: how efficient are humans when they make decisions, if we want to apply this bounded rationality theory? We had a very simple task, a reaching task: subjects had to reach to one of four circles; one of them would be selected from a probability distribution, and then you have to move to this circle very fast, and we look at the endpoint distribution of your reach. Essentially, the movement time — once you started moving — was always the same, so the movement was always the same, but the reaction time, meaning the planning time, was different. We had a condition where you had to react very fast, on the order of 200 milliseconds: you see the target, you have to move immediately. And the slow condition was, I think, around 250 milliseconds or so. That doesn't sound like a big difference, but it is quite a big difference; this relationship is highly nonlinear. The idea was: if I give you different amounts of time for planning the movement, does that show in the quality of your movement — and of course in the information that you generate, which in the end here means the variance that you generate — and how efficiently do you make this trade-off?

As I said, we basically have to distinguish planning from execution. The utility function in this task is: either you hit the target or you don't. These are the four different world states, the four possible targets; if you hit, you get a utility of one, and if you miss, you get zero. The thing that makes this paradigm a bit more complicated is that there is also execution noise. People know that when you execute a movement, there is noise that comes just from your muscles. This was also illustrated with William Tell: he could aim as long as he wanted, so in that sense there was no planning restriction, but there is still noise, and this is execution noise.
We don't want to mistake the one for the other, because we want to know how much the time I give you for planning affects your movement variability — not the part that comes from your arm. So we need to estimate this execution noise somehow, and the way we did that was to include trials with only one target: you always went to the same target, which was always in the same spot. In terms of planning, you still have to plan the movement, I guess, but you always knew where the target was, so you didn't have to make a plan on the fly, and we can measure the distribution you generate there. That we take as the noise that comes from execution; this would be the execution noise. What we get in the space of planning, when we look at the aim point, is basically a convolution of this hard utility with the execution noise, because in your brain you have to take the execution noise into account: you should ask what the probability of hitting is, and so on. If you assume that you know the execution noise, that gives you the utility function you see there on the left. Now we assume there is some kind of planning process that uses this utility function on the left to find the best action, depending on how much planning time you have. That gives us the best planned action. The problem, of course, is that we cannot observe that action, so again we have to convolve it with the execution noise to get a distribution over endpoints, and that is what we can compare to our actual data.

So what we did was: the target appeared, and you had to leave the home position within the reaction time limit; if you didn't, a screen told you that you were too slow, and the next trial would start. Of course you want to avoid these too-slow trials as a subject, because they just extend your experiment, so you try to be as fast as possible. And once you moved out, the 300 milliseconds of movement time start counting, to make sure the movement time itself is always the same. Obviously you don't know exactly when 200 milliseconds are up, so you get a sort of distribution. When you try to apply these things, there are always lots of practical issues that maybe you didn't think about as a theoretician, and maybe they're not so interesting either — but if you cannot apply your theory, then it's also no good, so you have to go through this, even if it's painful.

So here we see how limited planning affects the endpoint variability. We said we have two reaction times: one is fast, two is slow. What you see here is the endpoint variability and the endpoint accuracy; both capture something more or less similar. I plot your variance in the fast condition against your variance in the slow condition, so if there were no difference, you would expect all the points to be on the diagonal. But what you see is that most of the points are not on the diagonal: the variance in the fast condition was in fact higher, even though the movement execution was the same. So this is really an increase in variability that comes from restricting your planning time.
— Yes, I just look at the endpoints of where you reach. No, these are angular variances: because the targets were on a segment of a circle, all the analyses were done in angular variables. So that's the variability; the accuracy measures your deviation from the correct target, and it shows the same pattern: decreased accuracy in the fast condition.

Then we can also look at the quantities we're interested in, namely information and utility. This is not very good to see — the diagonal, even I can't see it like this. Okay: this is the utility in the fast versus the utility in the slow condition, and what you see is that the utility in the slow condition is higher; the probability of a target hit was higher. On the left you see the information for the fast and for the slow condition, and there you see that the information in the fast condition is decreased. What does that mean — what information do we measure? It's the mutual information between the target and the endpoint. I'm asking: if I show you this endpoint, can you tell me which target it was? Obviously, the more precise you are, the easier it is to tell the correspondence between the two, so in this setup the information reflects nothing but this variability: if you're less variable, there is more mutual information between target and endpoint.

So now we have the two critical quantities, and we can plot them against the curve that represents this efficiency frontier. This is for all the different subjects, and you see two curves: one is the curve where you do not take the execution noise into account, which would of course be unrealistic, because we know there is execution noise; so we computed this highlighted curve, which is the best you can do in the presence of execution noise. This curve is fitted in the sense that we have to take into account the execution noise of each particular subject. Taking that into account, this curve tells you the best expected utility you can achieve with a certain amount of information produced. — No, the utility: that was this picture you saw at the end of the movement, whether you hit or not; if you hit the target you get one, if you miss you get zero. For planning, if we take this execution noise into account, the utility is this one, convolved with the execution noise that we measured separately, and that is the utility we plot here; of course this utility is lower than the one without execution noise. And what you see is, first of all, that all these points are fairly close together, because this is such a highly trained task, and the time differences are small; it's super stereotypical, so it's hard to see the effect. But you can clearly see it: in the slower condition you're able to produce more information, meaning the movements are more precise. And now we can look at how close you are to the optimum: you see these efficiency values here, and people were well above 90% efficiency compared to this information-theoretic bounded rationality frontier.
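To make the analysis pipeline concrete, here is a toy version — all numbers invented: four target angles, Gaussian endpoint noise, a crude hit criterion — of estimating the two critical quantities, mutual information and expected utility, from reach data:

```python
import numpy as np

rng = np.random.default_rng(3)
targets = np.array([-0.3, -0.1, 0.1, 0.3])          # hypothetical target angles (rad)

def simulate(n, sigma):
    """Endpoint = target angle + Gaussian noise; sigma mimics planning quality."""
    t = rng.integers(0, 4, size=n)
    return t, targets[t] + rng.normal(0.0, sigma, size=n)

def mutual_info_bits(t, x, n_bins=30):
    """Plug-in estimate of I(target; endpoint) from binned samples."""
    joint, _, _ = np.histogram2d(t, x, bins=[4, n_bins])
    joint /= joint.sum()
    pt = joint.sum(1, keepdims=True)
    px = joint.sum(0, keepdims=True)
    nz = joint > 0
    return np.sum(joint[nz] * np.log2(joint[nz] / (pt @ px)[nz]))

for label, sigma in [("fast", 0.12), ("slow", 0.06)]:
    t, x = simulate(20000, sigma)
    hit = np.abs(x - targets[t]) < 0.08             # crude hit/miss utility
    print(label, f"I={mutual_info_bits(t, x):.2f} bits, E[u]={hit.mean():.2f}")
# slower planning -> less endpoint variance -> more information and more utility
```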
Then we asked what would happen if we change the world-state distribution, meaning the distribution with which these targets appear. We had a uniform distribution; now we ask what happens if we take a non-uniform one. Of course, if we do that, the whole efficiency function will change. Take an extreme example: if target one always appears, you can solve this problem with basically zero information expenditure. So with a non-uniform distribution you can achieve more with less information. An important prediction here is that if we look at the entropy of the action given the target, this entropy should change in the non-uniform scenario: frequent states should have lower entropy and infrequent states higher entropy. It's basically the same idea we have in coding — short code words for frequent symbols — and here we would say: frequent targets should have less endpoint variance, infrequent targets more. These are all small effects, but that's the prediction. And this is what we found: here, for the two reaction times, you see the theoretical prediction — depending on the frequency, the variance should go down — and here you see what we measured in the experiment. We indeed found this effect: you adapt your variance to the frequency of the world state. It's like a basic information-theoretic idea.

Then we also looked, as I said, at the efficiencies in this changed environment; here again this would be the efficiency line, and what you can see is that here people are not as efficient, maybe around 80 percent on average. The question is why. A possible reason could be that they had not trained enough, even though we had, I think, 5,000 trials in total, because we're now measuring really small differences; this could take thousands of trials. I presume the decrease in efficiency comes mainly from that, because these small training effects can really take many thousands of trials — like the example of the Cuban cigar rollers, where you can measure small improvements even after decades on a logarithmic scale, although on a linear scale you wouldn't see any improvement anymore.

Then there is an earlier study where we asked: can we find deviations from expected-utility decision-making in sensorimotor control, using these free energy functions? This was one of our early attempts, when I was still in Cambridge. It was a simple task where a ball undergoes random Brownian motion in this direction while moving with constant velocity towards this bar, and this was the target line. When the ball hits the target line, you get an error cost, as indicated by this parabola, but you also pay a control cost. The control cost depends on how much effort you spend directing the ball close to the optimum, which would be here in the middle: if you make a lot of corrections, you have a lot of control cost; if you just let go, you have no control cost, the ball just hits somewhere, and you get whatever error cost there is. And the prediction here is that if you
just care about the expected cost, then you can show that the optimal thing to do is linear feedback control: the more the ball deviates to the right, the more you correct back towards the middle, and if it deviates to the left, the more you correct towards the middle from that side. But if you optimize this sort of free energy cost function — again a log of exponentials — then you have an interaction with the noise level. What I can do here is change the strength of the noise that governs the Brownian motion: it can be small, or it can be big, in which case the ball fluctuates much more. If you only care about expectation values, these fluctuations are irrelevant to you: you choose exactly the same control strategy at both noise levels. But if you have this kind of free energy cost function, it depends on the sign of this parameter — this theta would be like the alpha; we're moving slowly towards this model uncertainty — and then what happens is: if you are risk-averse, you control more. That's what you see here: when the noise becomes higher, the slope of this controller becomes larger, meaning you become more of a control freak. Every little deviation you see makes you nervous, because you expect the worst. On the other hand, if you're risk-seeking, you have this kind of gambling attitude: you think uncertainty is great, nature is friendly, good things will happen to me — so the more uncertainty there is, the less you have to do, because nature is doing it for you, and you decrease your control. That's the other attitude. But the important thing is that both are different from what you get here, which is the case where you don't care about the noise level.

What's interesting is that you can think of this as the amount of control you believe you have over the environment, or as how friendly the environment is to you. If you believe the environment essentially works in your favor, you'll be risk-seeking; you could think of the environment as an extension of yourself in some sense — it doesn't do exactly what you want, otherwise theta would be infinite, but it doesn't do just random things either, it does things that help you. If you're risk-averse, it's the other way around: you think the world is mostly an unfriendly place; maybe it's not always doing the worst to you, but in general it's good to be careful, and as a consequence you increase your control.

And that's what we observed. At the top you see a single subject: in the high-noise condition, control increased. We also did this for two different cost levels — that's not so important, maybe just ignore one half — and we found, as you see down here, that for five out of six subjects the gain increased. So for five out of six subjects the behavior could not be described by expected-utility maximization; they were essentially risk-averse.
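Here is a one-step toy in the spirit of that ball task (the model, cost weights, and theta values are all my own invention for illustration): we see a unit deviation, correct it with a gain, then noise perturbs the outcome; the cost is remaining error squared plus effort:

```python
import numpy as np

rng = np.random.default_rng(4)
w = rng.normal(0.0, 1.0, size=200_000)        # unit noise, shared across all gains

def certainty_equiv(c, theta):
    """Risk-sensitive value (1/theta)*log E[exp(theta*c)]; theta=0 is the mean."""
    if theta == 0.0:
        return float(c.mean())
    c0 = c - c.mean()                         # stabilise the exponent
    return float(c.mean() + np.log(np.mean(np.exp(theta * c0))) / theta)

gains = np.linspace(0.0, 1.0, 201)
for noise_sd in [0.3, 0.9]:
    for theta in [0.0, 0.4]:                  # risk-neutral vs risk-averse
        ce = [certainty_equiv((1.0 - g + noise_sd * w) ** 2 + 0.5 * g ** 2, theta)
              for g in gains]
        print(f"noise={noise_sd}, theta={theta}: best gain={gains[np.argmin(ce)]:.2f}")
# theta=0: the optimal gain is ~2/3 at both noise levels (noise only adds a constant);
# theta>0: the risk-averse controller stiffens -- its gain grows with the noise.
```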
Then here is another study, looking at the same kind of phenomenon but slightly differently. This was basically an extension of previous experiments — maybe this comes here, yeah — so maybe I'll explain the task quickly and then explain how it relates to the previous work. In this task there was a target area, and at the beginning of the trial a target would be drawn from — yes? Yes, you mean Bert, in the sense that Bert Kappen has also published on risk-sensitive control. What I didn't say explicitly was that you only get this simple linear control law when you have linear system dynamics and quadratic costs — sorry, yes, that's why I put the parabola there — and you get this simple solution. For these linear-quadratic systems you can also solve the risk-sensitive case analytically; that has been done, I think, by Whittle in the 70s or so, and Bert Kappen has worked on nonlinear extensions of both of these, the risk-neutral and the risk-sensitive case. So in that sense yes, but not directly. Okay, as I was saying — any other questions?

So there was this target area, from which a target would be drawn from a Gaussian distribution, and you had to move to this target, if you can, through the force area, to the goal bar, and then back. That was the trial. In the beginning this force area was switched off, so you just had to try to hit the target, move to the goal bar, and come back. And the important thing is that the target was displayed either precisely, so you knew exactly where it was, or as a cloud, so you had to guess where it was — this cloud was also drawn from a Gaussian distribution — or it was not displayed at all. If it's not displayed at all, the best you can do is learn the statistics of the target over many trials, so that you know the most probable location; you basically have to learn the prior. — Yes, I will explain that in a minute; it's a bit awkward right now, but just think of it like this: you try to hit the target, but then you have to move a little further; you can touch the goal bar anywhere, and you move back. This is the movement we're interested in, and the other part of the movement we will need in a minute, to make you pay a cost immediately, so to say.

So the original experiment was like this, without any force or anything like that — it was done in my old lab, way before I was there. What you see illustrated here is, at the top, the prior distribution: you know the target appears at these locations with this distribution. And here you see the likelihood, basically your observation under these different visual feedback conditions: if you know exactly where the target is, the likelihood is fairly concentrated; if it's a cloud, it gets broader; if you have no idea, the likelihood is flat and gives no information. That's what's illustrated here: this is good visual feedback, and then the likelihood becomes broader — more noisy, let's say. Then you combine these two sources of information and you get the posterior, which is this: x would be the location, y your observation. And what you see is that the mean of this posterior moves towards the prior as the visual information becomes less reliable. That makes sense. The example they had was: imagine you play tennis with your friend at dusk or dawn or something like that; you have some kind of prior distribution of where your friend will play, and the less you see, the more you rely on your prior knowledge.
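For this no-force baseline, the optimal aim point is just Gaussian cue combination; here is a tiny sketch with invented numbers for the prior and the three feedback conditions:

```python
import numpy as np  # (only for consistency with the other sketches)

def posterior_mean(obs, mu_prior, sd_prior, sd_like):
    """Gaussian prior times Gaussian likelihood: the posterior mean is a
    reliability-weighted average of the observation and the prior mean."""
    w = sd_prior**2 / (sd_prior**2 + sd_like**2)   # weight on the observation
    return w * obs + (1 - w) * mu_prior

# hypothetical numbers: prior centred at 0 with sd 1; three feedback conditions
for sd_like in [0.0, 1.0, 1e6]:                    # clear, cloud, no feedback
    aim = posterior_mean(obs=2.0, mu_prior=0.0, sd_prior=1.0, sd_like=sd_like)
    print(f"sigma_i={sd_like}: aim point = {aim:.3f}")
# perfect vision -> go to the observation; no vision -> fall back on the prior;
# the slope of the error against the true location is set by this weight.
```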
— Sorry, the green line, you mean? I'm sorry, I'm red-green colorblind — this one? This would be an example of visual feedback, a likelihood, that's even broader, meaning more visual noise: say this would be clear daylight and this would be dusk or something like that. No, this is just an illustration; in my experiment the third line would have to be completely flat. Does it make sense? No — this picture comes from their paper, and I think even in their paper... no, they had two intermediate likelihood conditions, so in their paper it made sense.

Now what you can plot is where you point. If you have perfect vision — I plot the error on this axis and the true position of the target here — then you will always have zero error, because you can just always go perfectly to the target. If you have blurred vision — take the other extreme, you don't see anything, you play tennis at night — then the best you can do is go to the mean of your prior, which gives you the highest probability of getting the ball; that would be this location. If the target really is there, you have zero error, but if the real location is to the right or to the left, you get this diagonal line that gives you the error as a function of the true location of the ball. So you have these two extremes, and all the other cases with intermediate vision lie in between. You can therefore express the reliability of your stimulus as this slope in your behavior, if you act optimally. That was the idea of the original paper.

In our experiment we also did that. This would be the Bayesian posterior of where we believe the target is given our observation, and we have a quadratic cost for trying to hit the target. And here it doesn't matter whether you optimize expected utility or this kind of free energy cost function: the solution is always the same. What you take into account is the sigma of the prior, your prior uncertainty, and sigma i, which corresponds to the feedback uncertainty — one of these three conditions. If sigma i is zero, meaning perfect vision, you just go to the point y that you observe; if sigma i is infinite, meaning you play at night, this goes to zero, and zero in our experiment was the mean of the prior. That's how to read this. So what you see is that the predictions are the same.

Now we wanted different predictions, and to get them you have to add a linear cost term. That's what we did, and that's the reason we introduced this force area in the experiment. We had two scenarios: one where the force was lower on the left and more expensive on the right, and the other scenario the other way around. Now, if you have such a force, the prediction is simple if you care about expected utility: you still do the trade-off as before, but now you bias towards the side that has less force, and how much depends on the strength of the force and on the parameter that tells you how much you want to get to the target — how desperate you are to reach it. These two are traded off.
Importantly, this is a constant bias: it does not matter what your observation is, you always bias by the same amount, and it depends only on the force. In our experiment this force was small enough that we could neglect this factor; whether you neglect it or not does not actually matter for the argument, but it was a small effect, so we could not see a bias from it.

In the case of the free-energy cost function, however, you get a third term that does not exist under expected utility, again a term for risk sensitivity or model uncertainty. This term also contains the two sigmas, the prior uncertainty and the feedback uncertainty, but now additionally the risk-sensitivity or model-uncertainty parameter and the strength of the force field, so you get an interaction between three things. This is not a constant bias: it depends on the feedback condition, on whether you see well or not so well. That is the prediction. The intuition behind it is that you have to be sensitive to the risk: if you know exactly where you are, you just go there; if you do not know where you are, you might as well deviate a little towards the less costly side. That is the intuitive interpretation of this formula.

So these are the predictions for the three sigma conditions. When you see perfectly, you always have zero error; when you see nothing, you go with your prior; and there is the intermediate condition with the blur. The three force conditions are the three different colours: with no force you are in the middle, and with a force to one side or the other you deviate; it is this offset that is predicted. And this is exactly what is not predicted if you only care about the expectation value: there you would just have the constant bias, which we said is negligible, so it would always be the line in the middle.

Here you see the result for one subject, showing exactly this kind of behaviour, and here the summary over all subjects. The three columns are the three force conditions. The first is the no-force condition, which had been measured before: you just see that as you increase the sensory uncertainty, the slope in this plot increases, meaning you go more and more with your prior. That is what was previously reported, and we saw it in the other force conditions as well. What is new here is to look at the intercept, this offset. The key prediction was that the offset should depend on the degree of uncertainty: a small offset here, a larger offset there, and no offset, or an even smaller one, in the third case. And that is exactly what you see: the offset increases with the model uncertainty. Put into words: if you do not know where the target is anyway, and you have model uncertainty, meaning you do not trust your model entirely, you may as well deviate a little towards the less costly option.

Okay, we have until one o'clock, right? So now I will quickly talk about the promised model uncertainty in a bit more detail. In the risk-sensitivity discussion this has already come to the fore a little.
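As a sketch of how such an interaction can arise, continue the stylized one-dimensional example from above, but replace the expected cost by the exponential (risk-sensitive) certainty equivalent; the model form and sign conventions here are my assumptions, not necessarily the paper's. For $x \sim \mathcal{N}(\mu, \sigma^2)$ and total cost $L(a) = (a-x)^2 + c\,a$,

$$
C_\beta(a) \;=\; \frac{1}{\beta}\log \mathbb{E}\!\left[e^{\beta L(a)}\right]
\;=\; c\,a + \frac{(a-\mu)^2}{1-2\beta\sigma^2} - \frac{1}{2\beta}\log\!\left(1-2\beta\sigma^2\right),
\qquad 2\beta\sigma^2 < 1,
$$

which is minimized at

$$
a^\ast \;=\; \mu - \frac{c}{2}\left(1 - 2\beta\sigma^2\right).
$$

For $\beta = 0$ this recovers the constant bias $-c/2$ from before; for $\beta \neq 0$ the bias picks up the factor $(1 - 2\beta\sigma^2)$, so it scales with the posterior variance: an interaction between the risk-sensitivity parameter $\beta$, the force strength $c$, and the feedback reliability through $\sigma^2$. Depending on the sign of $\beta$, the bias grows or shrinks with the uncertainty.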
Mathematically, this happens because when you use the free energy as the cost function, you can do a Taylor series in beta, and what you find is that you do not only care about the expectation value of the utility, you also care about higher-order moments, like the variability and so on. That is the effect we have seen here. All the experiments I have shown you were designed so that if you only care about the expectation, you see nothing; if you care about higher-order moments of the utility, a change in behaviour is predicted, and that is what we found. So you could also phrase what I just demonstrated like this: people are sensitive to higher-order moments of the utility. This has been applied in machine learning too: do you want to maximize expected reward, or do you also want to consider higher-order moments? And in economics it is relevant, for example, in portfolio theory: do you just maximize the return, or do you also consider the variance of the return? Depending on your risk attitude, you will choose a different portfolio.

Now more in the direction of model uncertainty. It is all the same formula, but the viewpoint shifts slightly: the last two studies can be seen under the aspect of higher-order moments, and here I want to emphasize another aspect. The debate about different kinds of uncertainty, which started with Knight, as I mentioned earlier, is illustrated here in a simple example. On the right you have the standard decision-making problem, where all probabilities and outcomes are known and you have to decide between two lotteries. On the left, for one lottery, we do not know the probabilities; I give you no information, so we have a situation that involves ambiguity. There are two ways to argue about this, and these are basically the kinds of arguments that are, I guess, still going on in the economic sciences; the debate has been going on for a long time.

One argument says: if I believe in subjective probabilities, then since I do not know the probabilities, let me make a big list of all the possibilities, all the possible lotteries from a 100% chance of 0 up to a 100% chance of 1000, and put a probability distribution over them reflecting my subjective belief. Assume this distribution is uniform, no preference. You then end up with a compound lottery that is equivalent to the known one: because all possibilities are equally probable, you effectively get a 50-50 probability. If that were the case, there would be no difference between the scenario on the right and on the left; all the ambiguity would be translated into risk. The other way is to say: no, I do not have a probability, and therefore I cannot apply expected utility, because to take an expectation I need a probability. I need a different kind of decision criterion, and the two scenarios are fundamentally different. That is the route taken by the people who have studied ambiguity. The point here is that the same formula we have been using all along can also be used to deal with this kind of scenario.
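To write out the Taylor-series point from the start of this passage, this is the standard cumulant expansion of the free-energy value:

$$
\frac{1}{\beta}\log \mathbb{E}_p\!\left[e^{\beta U}\right]
\;=\; \mathbb{E}_p[U] \;+\; \frac{\beta}{2}\,\mathrm{Var}_p[U] \;+\; \mathcal{O}(\beta^2),
$$

so at $\beta = 0$ only the expected utility survives, while for $\beta \neq 0$ the variance and higher moments enter with weights set by $\beta$. The same terms are what will separate the ambiguous from the risky urn below: the two options can share the same mean while differing in the higher-order moments.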
You can think of the model uncertainty also as a kind of resource or information constraint, and the way this works is the following. First, the Ellsberg example I promised; let's quickly do it. I have these two urns, the risky one and the ambiguous one, and I tell you: you get ten dollars if you draw a red ball. Which urn do you choose? Most people choose the left one; I told you that already. Now I will reveal why this creates a paradox. If you choose the left urn and I presume you to be Bayesian, that means you believe there are fewer red than blue balls in the other urn; otherwise it would be irrational not to choose that one. So now, with the same two urns, I give you ten dollars if a blue ball is drawn. Which urn do you want? Again, most people choose the left one; maybe you find yourself doing that in your mind as well. So now the question is: are you crazy or not? If we put the two implied probabilities next to each other, it means you believe at the same time that there are fewer red than blue balls and more red than blue balls in the right urn, and it is quite difficult to justify that as a rational belief. Salvation is coming, don't worry. But that is why it is called a paradox.

So, as I said, we need a decision criterion, and in the extreme case it is not probabilistic. A simple one that has been used a lot is the maximin criterion, which is worst-case reasoning: if you do not know, assume the worst, and then things can only get better. That is old wisdom, I guess, but it is also extreme: it makes sense if you know absolutely nothing, and maybe that is too drastic. Maybe instead you have a model of the world, but you are not sure about it. Not unsure to the degree where you say you know nothing; rather, you say: I have a model, but maybe it is not quite right, and a distribution in its neighbourhood would also be a possible description, one I would regard as plausible. So what you can do is take your model and look at the models that lie in its neighbourhood; this is what people who work on robust decision-making do. Obviously we recognize the KL divergence here, and we require it to be smaller than some number. As a picture, imagine p0 in the space of all probability distributions, and we draw a circle around it: all the distributions inside the circle I consider reasonable, they could still be true; the ones outside I dismiss. Of course you can make the circle very small, in fact tiny, so that only p0 is inside, which means you know the model for sure; or you can make it infinitely large, in which case you know nothing. Perfect knowledge and knowing nothing are the two extremes of this. Then basically you take exactly this functional and choose the probability that leads to the worst case, so you take a minimization. When you look for an action you would maximize, but as I said before, minimization is what you do when you anticipate an opponent, or when you do robust inference, if you remember, or robust decision-making.
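Here is a minimal numerical sketch of that worst-case valuation (my own toy implementation; the outcomes and the grid of $\beta$ values are made up). In the soft-constrained form, the worst case inside the KL neighbourhood gives the free-energy value $-\tfrac{1}{\beta}\log \mathbb{E}_{p_0}\!\left[e^{-\beta U}\right]$:

```python
import numpy as np

def robust_value(u, p0, beta):
    """Soft worst-case value of a lottery with outcomes u under the
    nominal belief p0:
        min_q  E_q[u] + (1/beta) * KL(q || p0)
             = -(1/beta) * log( sum_x p0(x) * exp(-beta * u(x)) ).
    beta -> 0 recovers expected utility, beta -> inf the worst outcome."""
    if beta == 0.0:
        return float(np.dot(p0, u))
    return float(-np.log(np.dot(p0, np.exp(-beta * u))) / beta)

u = np.array([0.0, 1000.0])  # the ambiguous 0-or-1000 lottery
p0 = np.array([0.5, 0.5])    # nominal 50-50 belief

for beta in [0.0, 1e-3, 1e-2, 1e-1]:
    print(beta, robust_value(u, p0, beta))

# The value falls from 500 (expected utility) towards 0 (worst case), so
# beyond some model-uncertainty level the sure 500 becomes preferable.
```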
This would be the second case: we are looking for the worst model inside the circle of models you regarded as reasonable. You go through the circle, ask of each distribution whether it is the worst, and the one you find is your robust belief, your robust model, because any other model in the circle can only be better. Yes, it is exactly the same way to look at it; the mathematics is exactly the same. I was telling a similar story in the previous example, about whether you believe the world is fundamentally out to harm you or not; you could tell the same kind of story here, with exactly the same mathematics, and in fact people do that; there is a lot of research on this adversarial decision-making.

So here you see what happens if we take the value of the lottery, the free energy as before, now evaluated as a minimization problem. I plot the value against the model-uncertainty parameter. If I have no model uncertainty, the value is just the expected utility under my belief; as the model uncertainty grows, the value of the lottery decreases, down to the minimum in the extreme case. So if I now look at my two lotteries, 1000-or-0 versus a sure 500: if I wanted to be completely robust, I would ask what the worst outcome is, namely that I get nothing, so the value of the left lottery would be 0 in that case, and between nothing and 500 I take the 500. But it does not have to be like that. You could also say: I believe it is 50-50, but I could be wrong to some extent; maybe 70-30 would still be okay, but I do not believe it is totally rigged. That then determines the value of the lottery, depending on how much you trust your model. And how much you trust your model can depend, for example, on how often this model has been borne out through learning.

That is an experiment we did. We tried to translate this Ellsberg urn task, the one sketched up here: on the left the risky urn, which always shows its red and blue balls, and on the right the ambiguous urn. The payoff was again implemented with a force: whenever a blue ball was drawn, with the probability of the urn, you had to make this effortful forced movement, which subjects want to avoid. More importantly, the ambiguous urn was not necessarily fully ambiguous, because we could show you samples from it; that is what I meant by learning. If I show you no samples from the ambiguous urn, it is like Ellsberg. If I show you two balls from the urn, you already have some idea; with ten you have more, with twenty more still, and by fifty you essentially know. So we now have degrees of ambiguity, and the question is how they affect your choice behaviour. At the bottom we tried to construct an equivalent motor task, a target-hitting task, where you had to decide: do you want to aim at the visible target, on the right here, whose size was calibrated for each subject so that you would hit it exactly 50 percent of the time, or at the target on the left?
The left target is partially or fully occluded, so it could be smaller or bigger; of course you do not know.

Now let me explain this in the case of the urn, because it is easy to understand there. For the urn we were interested in particular trials that we constructed, namely trials in which I show you the same proportion of red and blue: for example one red and one blue, two red and two blue, five red and five blue. Why are we interested in these trials? Because a Bayesian posterior over the urn composition would have the same mean in all of them, but the more balls you have seen, the more it concentrates around the idea that the ratio of the urn really is 50-50. So again, the idea boils down to this: if you only care about the expectation, all these cases are the same to you; if you care about higher-order moments, they are not. That is essentially the mathematical difference that kicks in here.

If you only care about expected utility, then the ambiguous urn always says 50-50, and the known, risky urn was also 50-50. That makes a simple prediction: under expected utility theory you should be indifferent between the ambiguous urn and the risky urn, independent of how many balls I show you. If, however, you are sensitive to the amount of information I am giving you, then the value you assign to the ambiguous option depends on how much information you have gathered, because that changes the degree of model uncertainty, so to say. And again it depends on whether your attitude towards this uncertainty is positive or negative: you can treat not knowing as a great thing or as a bad thing. If we stick with the idea that it is a bad thing, which was the Ellsberg finding, then you devalue the ambiguous urn, but the more you learn about it, the smaller this devaluation, until in the end it has the same value as the risky urn. These are just two curves for the two different attitudes.

These are the single-subject data; let me show you the summary, where you can see it more easily. For the urn task you see the amount of information on the x-axis and the probability of choosing the risky option on the y-axis. If I reveal no information about the ambiguous urn, there is, say, an 80% tendency to choose the known urn, just as in Ellsberg's case. The more information I reveal, the more this tendency goes down, until you have seen so much that you are indifferent; effectively the ambiguous urn becomes a risky urn if I show you enough balls. In the motor task, finally, we saw the opposite trend: there, people preferred the ambiguous option. We puzzled over why for a long time, I think one and a half years in the end, and tried lots of things. Initially we thought it might be because the uncertainty here is extrinsic while in the motor task the uncertainty comes from your own body, and maybe you think your body is friendly to you. That all turned out not to be true. The simple explanation is that it is the way you represent the uncertainty that makes the difference.
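Before coming back to that, here is a quick check of the equal-proportion claim above: a Beta-Bernoulli posterior over the urn's red fraction keeps its mean at one half while concentrating as more balls are shown. The uniform Beta(1,1) prior is my illustrative assumption, not necessarily the model used in the study:

```python
# Posterior over the urn's red fraction after seeing k red and k blue
# balls, starting from a uniform Beta(1,1) prior: Beta(1 + k, 1 + k).
for k in [1, 2, 5, 25]:
    a = b = 1 + k
    mean = a / (a + b)                            # always 0.5
    var = (a * b) / ((a + b) ** 2 * (a + b + 1))  # shrinks with k
    print(k, mean, var ** 0.5)

# Same mean for every k, but the spread shrinks: if you only care about
# the expectation these trials look identical; the higher moments are
# what tell them apart.
```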
Because in the end we also did the following: if I represented the uncertainty as in the earlier example, with these sorts of bars, and explained to you that the size of a bar relates to the probability of green or red or whatever, then people would show the opposite sort of behaviour towards it. That is quite interesting. The conclusion was that it really depends on how you represent uncertainty to people: say you are a climate researcher showing error bars to a politician, think twice about how you plot them. That is the side lesson here. For us the important thing was that, either way, you cannot explain the choices with expected utility: there was always a bias that took the amount of ambiguity into account, and this you can describe with a free-energy valuation that also takes into account how much information you have seen; then you can get these response curves. So I think I will stop here, and we can continue tomorrow.