Let us now start discussing the equilibrium notions in imperfect information extensive form games (IIEFGs). We have seen that in perfect information extensive form games (PIEFGs) the equilibrium notion is pretty straightforward: we look at subgame perfection. That is, in every subgame of a PIEFG you can find a Nash equilibrium, and a strategy profile that is a Nash equilibrium in every subgame, which is guaranteed to exist, is a subgame perfect Nash equilibrium (SPNE) of the PIEFG.
Now, there are ways to extend the same idea to IIEFGs as well. The only trouble is that the nodes, or histories, of these games are now uncertain, because the information sets can contain more than one node. In PIEFGs the information sets are singletons, so no player is uncertain about which state the game is in. So, unlike in PIEFGs, in IIEFGs we have to bring in a notion called belief: the probability distribution that a player who is playing at an information set holds over the different nodes, or states of the game, in that information set. If we have, say, two different nodes in an information set, the probability with which the player at that information set believes it is at this node versus that node determines what the final equilibrium notion will be. In the equilibrium notion we will also see that, in addition to the mixed strategy that we had for normal form games, there is another component, known as the belief, and the belief and the mixed strategy have to satisfy some sort of consistency.
Okay, so a belief, as we just said, is a conditional probability distribution over the histories of an information set. If you have multiple nodes, which are nothing but histories, the probability distribution over those histories in the information set is essentially the belief.
From now on we are going to discuss only games with perfect recall. We know that in such games behavioral strategies and mixed strategies are equivalent, so we are going to use them interchangeably; sometimes we will also use the same notation for both.
Let us start with an example, which is from Maschler's book. There are two players. Player 1 plays at two different levels: it plays in the first round and again in the third round. So it has three information sets, I_1^1, I_1^2 and I_1^3. Player 2 has one information set, and it plays in the second round.
Let us look at a behavioral strategy profile, shown using the bold lines in the figure. In its first information set, player 1 plays a mixed strategy that puts probabilities 5/12, 4/12 and 3/12 on the actions capital L, capital M and capital R respectively. In the second round, when player 2 starts to play, it plays the action l, even though it also has the actions m and r. In the third round player 1 comes again and plays R1 in its second information set and L2 in its third information set. That is the behavioral strategy profile of the two players.
Now we are going to ask whether this is an equilibrium, and by equilibrium we mean the same Nash notion: if the other player plays the strategy prescribed by this profile, is it beneficial for the current player to deviate, or is the strategy it has picked in this profile the optimal one? How should we talk about rationality here? Rationality, as we said, depends on the belief. So let us focus on the information set I_1^3.
Suppose the belief of player 1 at this information set is given by p: with probability p it believes it is at x6, and with probability 1 - p it believes it is at x7. Then you can do this small calculation. If it plays L2, its expected utility is 2 * p + 0 * (1 - p) = 2p, because it gets a utility of 2 at x6, which has probability p, and a utility of 0 at x7, which has probability 1 - p. If it plays R2, its expected utility is 0 * p + 5 * (1 - p) = 5(1 - p). Now, if you compare these two, you can see that R2 is better exactly when 5(1 - p) > 2p. Solving that inequality gives 1 - p > 2/7, which means p < 5/7. So if the belief in favor of x7 is larger than 2/7, the player would choose R2; if p is larger than 5/7, it would choose L2.
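The comparison above can be checked with a few lines of Python. This is only a sketch: the payoff numbers (2 and 0 for L2, 0 and 5 for R2) are the ones read out in the lecture for player 1's third information set, and the function names are made up for illustration.

```python
from fractions import Fraction

# Payoffs to player 1 at its third information set, as stated in the lecture:
# L2 gives 2 at x6 and 0 at x7; R2 gives 0 at x6 and 5 at x7.
PAYOFF = {"L2": {"x6": 2, "x7": 0}, "R2": {"x6": 0, "x7": 5}}


def expected_utility(action, p):
    """Expected utility of `action` when the belief puts p on x6 and 1 - p on x7."""
    return PAYOFF[action]["x6"] * p + PAYOFF[action]["x7"] * (1 - p)


def best_actions(p):
    """Actions that maximize expected utility under belief p (a tie returns both)."""
    values = {a: expected_utility(a, p) for a in PAYOFF}
    best = max(values.values())
    return sorted(a for a, v in values.items() if v == best)


threshold = Fraction(5, 7)  # solves 2p = 5(1 - p)
for p in (Fraction(1, 2), threshold, Fraction(1)):
    print(p, best_actions(p))
```

Exact fractions are used so that the tie at p = 5/7 is detected without floating-point error. Below the threshold R2 is the best response, above it L2 is, and at p = 1 (the belief induced when the information set is reached only through x6) L2 is strictly better.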
This is what we are going to call sequential rationality: given the belief, the player takes the decision that is rational according to it. Based on this notion of rationality we are going to evaluate whether the behavioral strategy profile that we have just given is indeed an equilibrium or not.
Let us go back to the figure; it is easier to explain it that way. We now know that if p were less than 5/7, R2 would be the better choice. So what can player 1 think in this case? It knows that if player 2 plays l, its belief on x6 is equal to 1 and its belief on x7 is equal to 0. So it knows for sure that it is at x6, and it can choose L2, because L2 gives it a higher utility than R2 there. So the action L2 at this information set, a behavioral strategy that happens to be a pure strategy in this case, is definitely the better option.
Now what about the other information set of player 1 in the third round? Given player 2's choice of l, the probability of reaching this information set is itself 0, so the conditional probability here is undefined and we cannot really calculate a belief on it. Therefore R1 is vacuously rational: there is no harm in player 1 playing R1, because this information set will never be reached.
Now let us go up one stage. After player 1's first stage of play we are in the information set of player 2. Player 2's belief can again be described by some numbers, say q and 1 - q, which denote the probabilities of the two different nodes, the two different histories, in the same information set. It knows that if it plays l, it gets a utility of 1 at the first node and a utility of 2 at the second node. Now you can calculate what is
the utility that player 2 gets from its other actions, given that player 1 is still holding on to the same strategy, that is, L2 in the third information set and R1 in the second information set. If player 2 plays m, its utility turns out to be 0 at one node and -1 at the other, so l is certainly better than that. Similarly, if it plays r, it gets 0 at one node and 2 at the other.
The question now becomes whether it plays l or r, and that depends on its belief at this information set. If q is much larger than 1 - q, it will go with l, because at the node with probability q playing l is strictly better; at the other node it gets the same utility from l and r. So l weakly dominates r.
We can do the calculation. If player 1 uses this mixed strategy in the first round, then you can find the belief of player 2: q is (5/12) / (5/12 + 4/12) = 5/9, and the belief on the other node is 4/9. So the expected utility of playing l is strictly better than that of playing r under this belief system. So, based on the same strategy profile, we have derived the probability of reaching every node. The conditional probability of being at a specific node, given that the player is at a specific information set, is a ratio: the denominator is the total probability of reaching that information set, and in the numerator you have the probability of reaching that particular node, so
that is the belief; those are the numbers 5/9 and 4/9 that we have calculated. For the strategy to be sequentially rational, we have to look at the expected utility of player 2 when it plays that strategy. By picking l, its expected utility is 1 * 5/9 + 2 * 4/9 = 13/9, and this is larger than for any other choice of action: if you pick m or r, you can do the same calculation and you will find that the expected utility is smaller than this number. Therefore, under this belief system, where player 1 has already picked this behavioral strategy in the first round, it is appropriate for player 2 to stick to l, because that gives it the highest expected utility.
Now look at the first stage; we are going in a backward manner. Has player 1 done the right thing there? Given that player 2 has chosen l and that player 1 has chosen what it has chosen in the third round, player 1 gets a utility of 2 in the first case, a utility of 2 again at x3, and a utility of 2 at R as well. So all the utilities are equal, and it makes sense to put positive probability mass on all of these actions; therefore these numbers are meaningful. Compare this with the notion of mixed strategy Nash equilibrium. We said there that if the equilibrium strategy puts positive probability mass on the actions in its support, then, when the other players play their equilibrium strategies, the expected utilities of all the actions in the support have to be equal. In this case you can even apply it at a specific stage, that is, in this
stage of the behavioral strategy. Because behavioral strategies are defined at every information set, you can also hold the strategy of the same player at the other information sets fixed, and that is exactly what we have done: the strategy of player 1 at I_1^2 and I_1^3 is held fixed, and player 2 plays l. One can see that under L, M and R the expected utility is the same, and that is why they can all live in the support of this behavioral strategy.
Fair enough. That is one way of showing that this particular strategy profile is an equilibrium point. However, there could be more equilibrium points; you can take a look at the same game tree and try to find out whether some other equilibrium exists. Let me give you one small exercise. Consider the pure strategy in which player 1 plays M in the first round and R2 in the last round; because player 1 has three information sets, we should write it as M R1 R2. The second player just chooses m, because it has only one information set. Do you see whether the profile (M R1 R2, m) is an equilibrium or not?
For completeness, let us now formally define these notions, which we have already seen in practice in that example, so that we can also define the equilibrium notion for IIEFGs. The first definition is that of belief. It is defined for a specific information set of a player: for player i at information set I_i^j, the belief mu_i^j is a probability distribution over all the histories in that information set. Now this belief becomes a Bayesian belief, and
now we are bringing in another probability distribution, which is the mixed strategy, or the behavioral strategy in this case. A belief mu_i, which as before puts probability masses on the nodes of each information set of player i, is Bayesian with respect to the behavioral strategy profile sigma if it is derived from sigma using Bayes rule: mu_i^j(x) = P_sigma(x) / (sum over y in I_i^j of P_sigma(y)) for every node x in I_i^j, whenever the denominator is positive.
What does that mean? This is what we used to find the belief of the player at a specific information set. What is P_sigma(y)? It is the probability that the game reaches node y under the behavioral strategy profile sigma: player 1 picks some action at every information set, player 2 similarly picks something, and based on that you reach a specific node y. You sum the probabilities of reaching all the nodes in the information set, so the denominator is the total probability of reaching that information set. Now that you are in that information set, what is the probability that you are at one particular node x in the same information set of player i? You are already conditioning on the fact that you are in the information set I_i^j, and the probability that you are at node x is given by this belief. Because it is consistent with sigma through Bayes rule, we call it a Bayesian belief.
Now, the notion of sequential rationality is very similar to the notion of Nash equilibrium; the only difference is that now we also take the expectation over the nodes in the information set. You can see this notation: the utility of agent i when it is playing
sigma_i' and the other players are playing sigma_-i, conditioned on the fact that you are at node x, written u_i(sigma_i', sigma_-i | x). What does this utility mean? You are in a tree-like structure here. Say we have a bunch of nodes in a specific information set, with the corresponding subtrees below them. Once you are at a node x, the utility you are looking at is conditioned on x: you are only looking at the subtree rooted at x, and u_i is the utility you get when you restrict the game to that subtree under the behavioral strategy profile (sigma_i', sigma_-i). Then you take the expectation, because as agent i you do not know exactly which node you are at: you might be at x, but you might be at some other node in the same information set. So you are looking at the expected utility at that information set.
What sequential rationality says is that sigma_i is sequentially rational at the information set I_i^j, given sigma_-i and mu_i, if, while everybody else still commits to sigma_-i, player i does not benefit by deviating to any other sigma_i': the sum over x in I_i^j of mu_i^j(x) * u_i(sigma_i, sigma_-i | x) is at least the sum over x in I_i^j of mu_i^j(x) * u_i(sigma_i', sigma_-i | x), for every sigma_i'. This is very similar to Nash equilibrium, but now you are looking at the information sets of an IIEFG; that is the only difference.
So now we have a combination of two things: the behavioral strategy profile, given by sigma, and the corresponding Bayesian belief, mu. We call the tuple (sigma, mu) sequentially rational if it is sequentially rational for every player at every information set. We will sometimes also refer to this tuple as an assessment; this is just terminology. As you can see, it is a
refinement: this notion of sequential rationality is a refinement of Nash equilibrium. The notion essentially coincides with subgame perfect Nash equilibrium when you apply the ideas to PIEFGs. That is not very surprising, because in a PIEFG all the information sets are singletons, so the Bayesian belief is a degenerate distribution. The theorem says that in a PIEFG a behavioral strategy profile sigma is an SPNE if and only if the tuple (sigma, mu-hat) is sequentially rational, where mu-hat is that Bayesian belief, which has to be a degenerate distribution.
With all these definitions we are now in a position to define the equilibrium notion, which is the perfect Bayesian equilibrium, and it very naturally depends on the assessment itself. An assessment (sigma, mu) is a perfect Bayesian equilibrium if, for every player i in N, two conditions hold. First, mu_i is a Bayesian belief with respect to sigma. Second, sigma_i is sequentially rational given sigma_-i and mu_i. We have just seen the definition of sequential rationality, which is very similar to Nash equilibrium, and the mu_i that you derive comes from the same behavioral strategy profile sigma. If these two conditions hold, we call the assessment a perfect Bayesian equilibrium.
It is often represented with sigma alone, but you should remember that there is an underlying mu that is Bayesian with respect to that sigma. It is very similar to SPNE: it is self-enforcing, but in a Bayesian way. The players do not know exactly which state they are in, so they have to use this sort of belief system, and there should also be some sort of consistency, which is why mu_i has to be Bayesian with respect to sigma, so
once that happens, we have an equilibrium notion, and that is the equilibrium notion in imperfect information extensive form games.
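To tie the two conditions back to the example, here is a minimal Python sketch of the Bayes-rule belief and of player 2's sequential rationality check. Several things in it are assumptions rather than part of the lecture: the node names "after_L" and "after_M" are hypothetical labels for the two nodes in player 2's information set, player 2's actions are written l, m, r, and the assignment of the payoffs for m and r to the two nodes follows the order in which they were read out. Only the payoffs for l (1 and 2) are pinned down by the 13/9 computed in the lecture.

```python
from fractions import Fraction


def bayesian_belief(reach_prob, info_set):
    """Bayes rule: mu(x) = P_sigma(x) / (sum of P_sigma(y) over y in info_set).

    Returns None when the information set is reached with probability 0,
    in which case the conditional probability is undefined.
    """
    total = sum(reach_prob[x] for x in info_set)
    if total == 0:
        return None
    return {x: reach_prob[x] / total for x in info_set}


def expected_utility(belief, payoff_by_node):
    """Expectation of the node-wise payoffs under the belief."""
    return sum(belief[x] * payoff_by_node[x] for x in belief)


# Player 1 mixes L, M, R with probabilities 5/12, 4/12, 3/12 in the first round.
# Hypothetical node names: the nodes of player 2's information set.
reach = {"after_L": Fraction(5, 12), "after_M": Fraction(4, 12)}
mu2 = bayesian_belief(reach, ["after_L", "after_M"])

# Player 2's payoffs per action and node, with player 1's third-round play held fixed.
# The node order for m and r is an assumption.
payoff2 = {
    "l": {"after_L": 1, "after_M": 2},
    "m": {"after_L": 0, "after_M": -1},
    "r": {"after_L": 0, "after_M": 2},
}
eu2 = {a: expected_utility(mu2, payoff2[a]) for a in payoff2}
best = max(eu2, key=eu2.get)

print(mu2)        # beliefs 5/9 and 4/9
print(eu2, best)  # l has expected utility 13/9 and is the best action
```

Under these assumptions the belief comes out as 5/9 and 4/9, the expected utility of l is 13/9, and l is the unique best action, which matches the conclusion in the lecture. The function returns None for an unreached information set, mirroring the remark that the belief at player 1's second information set is undefined.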