Now let me talk about another way of looking at mixed strategies, one that also goes into this question of limited or bounded rationality. Let us go back to the case where there are finitely many pure strategies for each player. What happens in that sort of game is that, generically, a player becomes indifferent between multiple pure strategies; that is the nature of the Nash equilibrium. The player becomes indifferent between several pure strategies and randomizes over those, and all of those pure strategies are the ones that give him the best expected payoff, assuming the other players are playing their mixed strategies. This is the way a Nash equilibrium behaves. What that effectively means is that if any pure strategy is even slightly suboptimal, if it gives even a slightly lower payoff than the others, the player gives it zero probability. It is irrational for him to give even a slight weight to that strategy. People have wondered whether this is really representative and whether it can be generalized, and they have thought about it from many different angles. There is also an effort to reconcile game theory with observed data: when you observe how the players have played in the past, can you match that with the outcomes that game theory suggests?
So, one of the ways people have thought about this is that this particular property, that you give zero probability to anything suboptimal, this kind of hyper-optimization, is a little too strong. Instead of players playing best responses, they say we should just expect players to play better responses. Players avoid anything that is obviously bad and stay close to the best, but not necessarily at the best, which means you have to allow players to play a suboptimal strategy occasionally, say with some small probability. So some suboptimal strategies could also get positive weight. If you think about it this way, you are enlarging the space of strategies in a behavioral way: you are bringing in behavioral characteristics, that players can make mistakes, players can get nervous, players can develop cold feet at the time of play, and so on. All of these human behavioral characteristics are being brought in, and this leads to the concept of what is called a quantal response equilibrium. Best response really means only whatever is optimal given what others are playing, and in a Nash equilibrium the best response would typically make the player indifferent between a certain subset of his pure strategies. (A student asks: does the player not simply maximize utility? Yes, but that is all best response means: fixing what the others are playing, what is the optimal thing for this player to play.) Relaxing this leads, as I said, to the concept of what is called quantal response.
Now, the interesting thing about quantal response is that, because all these behavioral elements are brought in, it is very easy to merge with any kind of data science. Suppose you have lots of past data in which players have played in a certain way. For example, there could be a pursuer and an evader, or an intruder trying to get into a network and detectors trying to detect it, and this cat-and-mouse game between intruder and detector has gone on for multiple rounds, and you have data from those rounds. But some elements of the game are not fully specified, and all you know is this past data. For example, you do not know exactly what the utility function of the intruder is, or exactly what the utility function of the detector is; you may not even know the entire space of strategies that are available. You just have this data of what has happened in the past, and from there you want to reason about the game. Quantal response blends very easily with other data-scientific tools, so I will explain it now. To motivate it, let us look at the problem of what is called matching pennies. It is a very simple game: two players, each with a coin, and each coin has two sides, heads and tails. Throughout, I am writing this assuming players are maximizing.
The deal is this: each player has his own coin and has to decide whether to put the coin down with heads facing up or tails facing up. If both put heads up, player one gets 1 and player two gets 0; the row player gets 1, the column player gets 0. If both put tails, again player one gets 1 and player two gets 0. On the other hand, if they do not coordinate, that is, if player one puts heads and player two puts tails, then player one gets 0 and player two gets 1, and vice versa: if player one puts tails and player two puts heads, player two gets 1 and player one gets 0. In other words, player one wins when the faces are the same for both players, and player two wins when the faces are different. So this is the setup. You can see that the payoffs are extremely symmetric, and in fact you can also see that there is no pure-strategy Nash equilibrium in this game. The main thing is that this is a non-cooperative game, so players have to decide what to put down without seeing what the other has put; otherwise the game becomes trivial. The way it is usually played by kids is that they hide the face, then reveal it, and then you decide who won. So there is no pure-strategy Nash equilibrium, but there is a mixed-strategy Nash equilibrium, and in fact there is exactly one. What do you think it would be? Yes: because of the symmetry of the problem, each player plays heads or tails with probability half. This is the Nash equilibrium.
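Since the payoff matrices are so small, the no-pure-equilibrium claim is easy to check by brute force. Here is a minimal sketch; the matrix encoding and the helper `pure_nash` are my own illustration, not from the lecture:

```python
# Payoff matrices for matching pennies, as in the lecture:
# the row player wins (gets 1) on a match, the column player on a mismatch.
U1 = [[1, 0], [0, 1]]  # row player's payoffs; rows/cols indexed H=0, T=1
U2 = [[0, 1], [1, 0]]  # column player's payoffs

def pure_nash(U1, U2):
    """Return all pure-strategy profiles (i, j) from which neither
    player can profitably deviate unilaterally."""
    eqs = []
    for i in range(2):
        for j in range(2):
            row_ok = all(U1[i][j] >= U1[k][j] for k in range(2))
            col_ok = all(U2[i][j] >= U2[i][k] for k in range(2))
            if row_ok and col_ok:
                eqs.append((i, j))
    return eqs

print(pure_nash(U1, U2))  # → [] : no pure-strategy Nash equilibrium
```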
Now we will change this a little bit. The payoff of 1 in the heads-heads cell is going to be replaced with a capital X, where X is something positive. When X is 1 you get back the old matching pennies; with X not equal to 1, this is what is called asymmetric matching pennies. Now, with X positive but otherwise general, there is again a unique mixed-strategy Nash equilibrium; can you tell me what it is? Who plays what, and what is the row player's strategy? Remember, the row player has to make the column player indifferent between the column player's pure strategies. The row player actually ends up uniformly randomizing: he plays heads with probability half and tails with probability half, and then the column player becomes indifferent between playing heads or tails. And what about the column player? He selects heads with probability 1/(X+1) and tails with probability X/(X+1). Now, what people have argued is that, in terms of experiments and empirical data, this seems a little odd. For example, even if you keep raising X to infinity, the row player continues to play half and half, although X affects his payoff. X affects the row player's payoff and does not affect the column player's payoff, yet the row player's strategy, half heads and half tails, is independent of the value of X.
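As a sanity check, the indifference principle behind this equilibrium can be verified numerically; this is a small sketch with an arbitrary X = 3 (my choice, not the lecture's):

```python
# Asymmetric matching pennies with an arbitrary positive X (X = 3 here).
X = 3.0
U1 = [[X, 0], [0, 1]]            # row player's payoffs, (H, T) x (H, T)
U2 = [[0, 1], [1, 0]]            # column player's payoffs

p = [0.5, 0.5]                   # row player's claimed equilibrium mix
q = [1 / (X + 1), X / (X + 1)]   # column player's claimed equilibrium mix

# Row player's expected payoff from each pure strategy against q:
row_H = sum(U1[0][j] * q[j] for j in range(2))
row_T = sum(U1[1][j] * q[j] for j in range(2))
# Column player's expected payoff from each pure strategy against p:
col_H = sum(U2[i][0] * p[i] for i in range(2))
col_T = sum(U2[i][1] * p[i] for i in range(2))

assert abs(row_H - row_T) < 1e-12  # q makes the row player indifferent
assert abs(col_H - col_T) < 1e-12  # p makes the column player indifferent
```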
You can make X extremely large, and looking at this payoff matrix you might think the row player would get more inclined towards heads as X gets larger and larger; behaviorally, that is exactly what people have found: when X is larger, there is a tendency for the row player to play heads more. In the Nash equilibrium, on the other hand, as X gets larger and larger it is the column player whose strategy changes, and the column player drifts more and more towards playing tails. Now, one way of working around this is to say that the reason the row player plays half and half while the column player plays 1/(X+1) and X/(X+1) is that there is a strict expectation of rationality. Suppose we allow players to play suboptimal strategies with some probability; then the nature of the outcome would change, and we will see that now. What people have done is look at how people respond to alternatives. Again, as I told you at the very outset, we are not looking for a behavioral theory to begin with; I am just giving you another perspective on how things have been developed. One observation about responding to different alternatives is this: suppose a player is shown two different sources of light and is asked to pick the brighter one. If the two sources have intensities L1 and L2, it turns out that he picks the first with probability L1/(L1+L2) and the second with probability L2/(L1+L2).
This is an observation people have made: if you show someone two different sources of light and ask him to quickly check which one is brighter, the one with intensity L1 is chosen with probability L1/(L1+L2) and the one with intensity L2 with probability L2/(L1+L2). In other words, if L2 is much, much brighter than L1, then people will almost always pick the light source L2, but otherwise they pick L1 with a somewhat lower yet positive probability. Anyway, the point is that this was a motivation, and the question was: can we use it to think of a different way of modeling the response of a player? You can think of it like this: a player has these choices, he has to put down either heads or tails, and you can think of them as intensities he is choosing from. He looks at the expected utility he gets from each of these alternatives, the expectation taken over what the others are going to play. So, averaging over what the others would play, each player looks at the pure-strategy alternatives he has and then, instead of playing only the one that is the largest, he plays them in proportion, the way he was picking the light source. Imagine that heads was one light source and tails was another, each giving you a certain utility, and you pick them with probabilities based on the values of the utilities: the one that gives you higher utility you give higher probability, the one with lower utility you give lower probability, but you do give them both some probability.
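This light-intensity rule, picking each alternative with probability proportional to its value, can be written as a one-line function; the name `luce_choice_probs` is my own label for it:

```python
# The light-intensity rule: pick alternative j with probability
# value_j / (sum of all values).
def luce_choice_probs(intensities):
    total = sum(intensities)
    return [v / total for v in intensities]

# A source three times as bright is picked three times as often:
print(luce_choice_probs([1.0, 3.0]))  # → [0.25, 0.75]
```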
This is how a player becomes a better responder as opposed to a best responder. A best responder would simply pick the best response and ignore everything that is suboptimal; now, even alternatives that are suboptimal are being played with some probability. So this motivates a particular type of response from the player; let us write that out. First the setup: N is your set of players, S_i the pure strategies of player i, and u_i his utility. Let σ_i denote a mixed strategy, so Σ_i, the set of mixed strategies of player i, is the set of probability distributions on S_i. Now define v_ij, which I am going to write as a function of σ: this is the expected utility that player i gets when he plays pure strategy j, namely

v_ij(σ) = Σ_{x_-i in S_-i} σ_-i(x_-i) u_i(j, x_-i).

Let me carefully show you what I have written there. In v_ij, the i denotes the player, i in N, and j is simply an index for a strategy of player i; I am looking at some particular strategy j of player i. Here σ_-i(x_-i) is simply the product of the mixed strategies of all the other players, σ_-i(x_-i) = Π_{k ≠ i} σ_k(x_k), with the others playing x_k. So if player i plays his j-th pure strategy, this is the expected payoff he would get when the other players play the mixed strategies σ_-i. Is this clear? This is what he is going to get by playing his j-th pure strategy, and you can kind of see where I am headed.
So now you have an expected payoff, a light intensity if you like, from every pure strategy, and you can ask: with what probabilities should I be playing my various pure strategies, keeping this in mind? What you assume is that player i plays pure strategy j in his set of pure strategies with probability

σ_ij = v_ij / Σ_{j' in S_i} v_ij',

where v_ij is his expected utility from playing pure strategy j. There is obviously an assumption here: the v's have to be positive for this to work, so you may have to normalize, add constants and so on, to make sure this works out to be an actual probability. But once the v_ij's are all positive, you can see that this is a mixed strategy for the player: the sum over j in S_i of σ_ij is 1, and with the v's positive each σ_ij is also greater than or equal to 0. But remember, v_ij the way we wrote it is a function of σ; in particular, I should write it more clearly as a function of σ_-i. It is a function of what the others are playing: you evaluate it for a fixed pure strategy of your own, and the expectation is taken over the pure strategies of the other players. You are summing over x_-i with weight σ_-i(x_-i), which is simply the product of the probabilities with which the others play their own pure strategies: player k plays pure strategy x_k with probability σ_k(x_k), and you take the product of that.
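For a two-player game, the expected utilities v_ij and this proportional response can be sketched in a few lines. The function names and the payoff-matrix encoding (U_i[j][k] is player i's payoff when he plays j and the opponent plays k) are my own illustration, and the v's are assumed positive, as noted above:

```python
# U_i[j][k]: player i's payoff when he plays pure strategy j and the
# opponent plays pure strategy k; sigma_other is the opponent's mix.
def expected_utilities(U_i, sigma_other):
    """v_ij = sum over opponent strategies k of sigma_other[k] * U_i[j][k]."""
    return [sum(s * u for s, u in zip(sigma_other, row)) for row in U_i]

def luce_response(U_i, sigma_other):
    """Play each pure strategy in proportion to its expected utility
    (all v's are assumed positive)."""
    v = expected_utilities(U_i, sigma_other)
    total = sum(v)
    return [vj / total for vj in v]

# Row player of asymmetric matching pennies (X = 4) facing a uniform
# opponent: v = [2.0, 0.5], so heads is played with probability 0.8.
X = 4.0
U1 = [[X, 0], [0, 1]]
print(luce_response(U1, [0.5, 0.5]))  # → [0.8, 0.2]
```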
That product gives you the probability with which x_-i is going to be played, and you fix your own pure strategy at j and ask what payoff you are going to get from that j-th pure strategy. Is this clear? Writing an extra symbol here would be a little confusing; I can just put j directly: j itself is the pure strategy, an element of S_i, and u_i(j, x_-i) is what player i gets when he plays strategy j and the others play x_-i. So we just suppose that this is going to be his model of response: he looks at the various utilities he gets from his pure strategies and plays them in proportion to those utilities. Remember this is different from the reasoning in Nash. Nash asks you to play the best response and only that; here we are saying you look at the utilities and play the various pure strategies with probabilities in proportion to them. Now remember that the v_ij are all functions of σ_-i, that is, functions of what the other players are playing. So this equation defines for you σ_i, the mixed strategy of player i, and you can write a similar response equation for every player. You therefore get multiple coupled response equations for all the players. In particular, take asymmetric matching pennies. Writing heads as 1 and tails as 2, you can see that v_11 = X σ_21, v_12 = σ_22, v_21 = σ_12, and v_22 = σ_11.
What have I written here? v_11 is the expected utility that player 1 gets when he plays heads, assuming player 2 is playing σ_2. Why do you get X times σ_21? Because it is X times σ_21 plus 0 times σ_22. Here σ_ij is the probability of player i's j-th pure strategy, and I am just writing heads as 1 and tails as 2. So σ_21 is the probability that player 2 plays heads, σ_22 the probability that player 2 plays tails, σ_12 the probability that player 1 plays tails, and σ_11 the probability that player 1 plays heads. With this we get these v's, and remember the v's are functions of σ_-i. So you get equations with σ on the left and σ on the right, one set of such equations for each player. How many equations would you have for player i? As many as there are strategies in S_i, because that is what defines his mixed strategy, and you will have one such set for every player. You have to solve these simultaneously for the σ's, and that then is your equilibrium: the simultaneous solution of these equations gives rise to what is called a quantal response equilibrium. Now, for asymmetric matching pennies we can in fact calculate this in closed form and see what it turns out to be: σ*_11, the probability that player 1 plays heads, turns out to be √X / (1 + √X), and σ*_21 becomes 1 / (1 + √X).
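One can check this closed form numerically by solving the two coupled response equations as a fixed point. The damped iteration below (averaging each iterate with the previous one so the iteration settles rather than oscillates) is my own device for the check, not part of the lecture:

```python
import math

X = 4.0            # an arbitrary positive X for the check
p, q = 0.5, 0.5    # p = sigma_11 (P1 plays heads), q = sigma_21 (P2 plays heads)
for _ in range(500):
    # Response equations from the v's above:
    # v_11 = X*q, v_12 = 1-q (player 1); v_21 = 1-p, v_22 = p (player 2)
    p_target = (X * q) / (X * q + (1 - q))
    q_target = (1 - p) / ((1 - p) + p)
    # Damping: average with the previous iterate so the map converges.
    p = 0.5 * (p + p_target)
    q = 0.5 * (q + q_target)

# Compare with the closed form quoted in the lecture:
assert abs(p - math.sqrt(X) / (1 + math.sqrt(X))) < 1e-6  # sigma*_11
assert abs(q - 1 / (1 + math.sqrt(X))) < 1e-6             # sigma*_21
```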
Earlier, remember, the row player was playing half-half regardless of the value of X; that is not the case anymore. You get a quantal response equilibrium that depends on the value of X. Now, this is all well and good for trying to explain behavior, but once you go away from the very demanding rigor of the theory we had when we were developing the rest of game theory, the problem is that too many variations become possible. Once you come up with a model like this, that a player should play in proportion to utility, you can ask: why not something else? For instance, you can posit

σ_ij = v_ij^λ / Σ_{j'} v_ij'^λ,

where λ ≥ 0 is a parameter. You can see this is also a model of responding, and you can solve for a quantal response equilibrium with this kind of quantal response model as well. In fact, I can tell you what it looks like in closed form:

σ*_11 = X^(λ/(λ²+1)) / (1 + X^(λ/(λ²+1))) and σ*_21 = 1 / (1 + X^(λ²/(λ²+1))).

This is obviously a generalization of what we wrote earlier: if you take λ = 1, we get back the light-intensity model. But you can actually take any λ that is non-negative, and it gives you a valid quantal response model.
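Again one can verify that this closed form is indeed a fixed point of the power-λ response equations; a short sketch with arbitrary X = 4 and λ = 2 (my choices for the check):

```python
X, lam = 4.0, 2.0            # arbitrary positive X and lambda for the check
a = lam / (lam**2 + 1)
p = X**a / (1 + X**a)        # claimed sigma*_11 = X^(l/(l^2+1)) / (1 + X^(l/(l^2+1)))
q = 1 / (1 + X**(lam * a))   # claimed sigma*_21 = 1 / (1 + X^(l^2/(l^2+1)))

# Plug into the power-lambda response equations (same v's as before):
p_resp = (X * q)**lam / ((X * q)**lam + (1 - q)**lam)
q_resp = (1 - p)**lam / ((1 - p)**lam + p**lam)
assert abs(p - p_resp) < 1e-12   # p is a fixed point of player 1's response
assert abs(q - q_resp) < 1e-12   # q is a fixed point of player 2's response
```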
But here also there is something interesting: we can actually interpret this λ in a very precise way, as the responsiveness of the player. That is, for a small change in utility, how responsive is the player: if you change the utility by a small amount, how big a change does that bring about in his mixed strategy? How sensitive his mixed strategy is to changes in utility can be tuned by changing λ. If λ is larger, small changes in utility will lead to bigger changes in his mixed strategy. So λ is often interpreted as responsiveness, but you can also interpret it in another way. Can you tell me what happens when λ is 0? Yes: when λ is 0, the model becomes uniform, which means the player is indifferent between everything; he does not care, and whatever the utilities are, he is just going to blindly play his pure strategies uniformly at random. So when λ is 0 you have a player who basically does not care about his utility, in short an apathetic player. And what happens as λ goes to infinity? You get back best response: these are ratios of powers, and you can quickly check that, in the limit, the only strategies that get positive probability are the best responses. Anything that is even slightly suboptimal gets probability 0 as λ goes to infinity.
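Both limiting interpretations can be read off the closed form above: at λ = 0 the play is uniform, and for large λ the quantal response equilibrium approaches the Nash equilibrium of the asymmetric game (half-half for the row player, heads with probability 1/(1+X) for the column player). A sketch:

```python
# Closed-form QRE of asymmetric matching pennies as a function of lambda.
def qre(X, lam):
    a = lam / (lam**2 + 1)
    p = X**a / (1 + X**a)        # sigma*_11
    q = 1 / (1 + X**(lam * a))   # sigma*_21
    return p, q

X = 4.0
assert qre(X, 0.0) == (0.5, 0.5)        # lambda = 0: uniform, apathetic play

p_big, q_big = qre(X, 1000.0)           # large lambda: near best response
assert abs(p_big - 0.5) < 1e-2          # row player -> 1/2 (the Nash mix)
assert abs(q_big - 1 / (1 + X)) < 1e-2  # column player -> 1/(1+X)
```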
And in fact it is possible to show this more generally: take the quantal response equilibrium and let λ go to infinity, and in a very precise way you get back the Nash equilibrium.