We have been talking about the Nash equilibrium all along, but there is another equilibrium concept that is commonly used, called the Stackelberg equilibrium. Because this is something of a stray topic, I thought I would first just tell you about it. Let us take the case of just two players. In a Stackelberg equilibrium there is a player who commits first, and that player is called the leader. So the leader is the player who commits first. But what is the logic by which he commits? This is where it differs from the Nash equilibrium. He not only commits to a strategy, he also announces it: he commits to a particular strategy and makes it known to the other player. The other player then has to take into account the knowledge that the leader has committed to such and such a strategy. Note that it is not just an action that he is committing to; he is committing to an entire strategy and making that strategy known to the other player, and the other player then responds to it. And no, it is not just a verbal thing; it is an enforced commitment, and it is announced as such. An example of a player that commits first is any law enforcement: a law is written down. Or take a currency note: the RBI governor has signed there, "I promise to pay the bearer the sum of so many rupees." That is the commitment. It is a firm commitment, and he makes it known to all.
This is also common in a security situation: at least some part of your security apparatus is deployed, and it is apparent that it has been deployed. You see police standing on the street, you see metal detectors and so on. All of these are visible security measures that you have put in place, and they are actually seen, so you are effectively announcing that this is your strategy. If you are going to announce your strategy, the other player is going to respond to it, so the question is how you pick your strategy. Step one: the leader commits to his strategy. Step two: the follower responds to it, let us say with a best response. If the follower is going to respond with a best response, then how does the leader choose his strategy? The leader chooses his strategy anticipating the response of the follower. Let us write this out. Suppose there are two players, P1 and P2, with costs J1 and J2: J1 is the cost of player 1, J2 is the cost of player 2, and gamma 1 and gamma 2 are their strategies. Say player 1 is the leader. He announces a strategy gamma 1, and player 2 then responds with a best response gamma 2. How is gamma 2 chosen? It is chosen from the best response set of player 2; let us call this R2 of gamma 1.
What is R2 of gamma 1? It is the set of those strategies gamma 2 dash of player 2 that give player 2 the lowest cost with gamma 1 held fixed. So with gamma 1 known, player 2 is going to respond with a best response. But here is the thing: there are usually multiple best responses. Player 2 may have several best responses, and he is quite okay with any of them. In that case, how does player 1 choose what to do? Let us take this step by step. Suppose there is just one best response for player 2. Then how does player 1 choose his strategy? He has announced gamma 1, and gamma 1 is known to player 2. If R2 of gamma 1 is a singleton, player 1 can simply plug it in: he anticipates that gamma 2 is going to be R2 of gamma 1, and then minimizes J1 of gamma 1 and R2 of gamma 1 over all gamma 1. You can see how this is different from the Nash equilibrium. Here the strategy is announced and player 2 picks a best response. Player 1 anticipates how that best response changes as a function of his own strategy, puts that functional dependence in, and then computes his best strategy. This is a Stackelberg equilibrium, and I will explain how it is related to dynamic games. The exact same thing can be done in a static game as well; it is not necessary for this to be a dynamic game. What is important is that you have a mechanism for communicating your commitment. Once the commitment is communicated, the other player must take it into account as part of his information.
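It may help to see the spoken definitions written out as formulas. The following is a sketch only: the strategy sets \Gamma_1 and \Gamma_2 and the argument order J_i(\gamma_1, \gamma_2) are notational assumptions, not something fixed in the lecture, and the second line applies only in the case where the best response set is a singleton.

```latex
% Best response set of player 2 (the follower) to an announced \gamma_1
R_2(\gamma_1) = \bigl\{ \gamma_2' \in \Gamma_2 \;:\;
    J_2(\gamma_1, \gamma_2') \le J_2(\gamma_1, \gamma_2)
    \ \ \text{for all } \gamma_2 \in \Gamma_2 \bigr\}

% Leader's problem when R_2(\gamma_1) is a singleton for every \gamma_1
\gamma_1^* \in \arg\min_{\gamma_1 \in \Gamma_1} J_1\bigl(\gamma_1, R_2(\gamma_1)\bigr),
\qquad
\gamma_2^* = R_2(\gamma_1^*)
```

The point to notice is that the second argument of J_1 moves with \gamma_1; the leader is not minimizing against a fixed \gamma_2.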
The slight difference from a dynamic game is in what is committed and what is seen by the other player. If you remember, we wrote out something similar in the case of two producers producing quantities Q1 and Q2. There, what the second-acting player saw was the quantity, which is an action. Here, what the other player sees is the entire strategy. The thing you need to be clear about is that the leader is the player who commits first; the leader need not be the player who acts first. What actions are revealed during gameplay is a completely separate matter: there could be perfect or imperfect information about actions, and those actions may or may not be available to be seen by the other player. The leader could even be a player who acts second in the game. He could commit first, say "this is what I am going to do," and make that commitment known. This obviously goes beyond the assumptions of the Nash equilibrium, because in the Nash equilibrium players compute their strategies independently, without communication with each other. Here there is a commitment, and the commitment is broadcast to the other player. The other player actually sees it and then acts, and the commitment is made with the explicit knowledge that this broadcast is going to happen. You are committing, you are making your strategy known, and you are choosing your strategy anticipating the response that will come out of it. So the mode is actually different.
He is playing that strategy, but the way it is chosen is by anticipating the other player's response. Let me write out the equations and it will be clear. How is this different from the Nash equilibrium? Suppose this were a static game, and suppose gamma 1 star is the strategy that minimizes J1 of gamma 1 and R2 of gamma 1, with gamma 2 star the follower's best response to it. Let me denote a Nash equilibrium by double stars instead. In a Nash equilibrium, gamma 1 double star is a best response to gamma 2 double star, that is, it lies in R1 of gamma 2 double star, and gamma 2 double star is a best response to gamma 1 double star. The second part is true in a Stackelberg equilibrium as well: gamma 2 star is a best response to gamma 1 star. The first part is where the difference is. In a Nash equilibrium, player 1 holds player 2's strategy fixed at gamma 2 double star; he is not anticipating how player 2's strategy will change with gamma 1. That is the big difference. In a Stackelberg equilibrium the commitment is announced first and passed on to the other player, so there is necessarily a dependence of gamma 2 on gamma 1. The setting in which you use a Stackelberg equilibrium is one where there is a huge separation in time scales. The first commitment happens, and it is irrevocable after that. You deploy your security apparatus and everyone gets to see it; you really cannot argue that your security apparatus and the attacker's strategy are being chosen simultaneously. That is why this concept is used. And because of the announcement of the strategy, there necessarily has to be a sequentiality to the whole process.
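To make the contrast with the Nash equilibrium concrete, here is a minimal sketch on a two-by-two static game with pure strategies only. The cost matrices are made-up illustrative numbers, not from the lecture, and the helper names are hypothetical. The sketch assumes the follower's best response is unique for every leader choice, which is the singleton case discussed above.

```python
# Costs (both players minimise). Rows are the leader's (player 1's) choices,
# columns are the follower's (player 2's) choices. Illustrative numbers only.
J1 = [[3, 1],
      [4, 2]]
J2 = [[0, 1],
      [1, 0]]

def best_responses_p2(i):
    """Columns that minimise player 2's cost when player 1 plays row i."""
    lowest = min(J2[i])
    return [j for j, c in enumerate(J2[i]) if c == lowest]

def best_responses_p1(j):
    """Rows that minimise player 1's cost when player 2 plays column j."""
    column = [J1[i][j] for i in range(len(J1))]
    lowest = min(column)
    return [i for i, c in enumerate(column) if c == lowest]

def pure_nash():
    """Pure profiles where each strategy is a best response to the other."""
    return [(i, j)
            for i in range(len(J1))
            for j in range(len(J1[0]))
            if i in best_responses_p1(j) and j in best_responses_p2(i)]

def stackelberg():
    """Leader minimises J1(i, R2(i)); assumes R2(i) is a singleton."""
    best = None
    for i in range(len(J1)):
        responses = best_responses_p2(i)
        assert len(responses) == 1, "this sketch covers the singleton case only"
        j = responses[0]
        if best is None or J1[i][j] < J1[best[0]][best[1]]:
            best = (i, j)
    return best
```

With these numbers row 0 is dominant for player 1, so the only pure Nash equilibrium is (0, 0) with leader cost 3. A leader who commits to row 1 induces the follower to switch to column 1 and ends up with cost 2. The profile (1, 1) is not a Nash equilibrium, since player 1 would like to deviate to row 0; it is sustained only because the commitment is irrevocable.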
The correct way of seeing this is that the first player is parametrizing the second player's decision problem. It is not that both players are playing simultaneously against each other; the first player parametrizes the other player's decision problem, and the other player then just picks something as a function of whatever the first player has announced. Now let us generalize. So far the leader could plug the follower's response into his own cost, and for that the best response set had to be a singleton. If there are multiple best responses, what would the leader do? This creates a problem for the Stackelberg equilibrium. The follower is okay with playing anything in this set, since all of these are best responses for him, but you as the leader are not okay with all of them; you may prefer one over another. So you have to bring in some other means of dealing with this, and this brings in what are called attitudes: you have an optimistic leader and a pessimistic leader. An optimistic leader looks for a gamma 1 star that minimizes the best case amongst all the best responses. He looks at all the best responses that the follower could play and picks, amongst those, the one that is most favorable to him. This is called the optimistic formulation: all of these are best responses for the follower, and you look for the one that is most in your favor.
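The optimistic formulation, together with its worst-case counterpart (the pessimistic formulation, where the leader minimizes the highest cost over the follower's best responses), can be sketched as follows. The matrices are illustrative numbers chosen so that the follower is indifferent in one row, and the function names are hypothetical.

```python
# Leader picks a row, follower picks a column; both minimise cost.
# In row 0 the follower is indifferent, so R2(0) has two elements.
J1 = [[0, 10],
      [4, 6]]
J2 = [[1, 1],
      [0, 2]]

def best_response_set(J2, i):
    """All follower columns achieving the lowest follower cost in row i."""
    lowest = min(J2[i])
    return [j for j, c in enumerate(J2[i]) if c == lowest]

def leader_choice(J1, J2, attitude):
    """Return (row, value) for an 'optimistic' or 'pessimistic' leader."""
    if attitude not in ("optimistic", "pessimistic"):
        raise ValueError("attitude must be 'optimistic' or 'pessimistic'")
    # Optimistic: assume the most favourable best response (inner min).
    # Pessimistic: assume the least favourable best response (inner max).
    pick = min if attitude == "optimistic" else max
    values = [pick(J1[i][j] for j in best_response_set(J2, i))
              for i in range(len(J1))]
    i_star = min(range(len(J1)), key=values.__getitem__)
    return i_star, values[i_star]
```

Here the optimistic leader commits to row 0, hoping for cost 0, while the pessimistic leader sees that row 0 could cost 10 and commits to row 1 for a guaranteed cost of 4. The two attitudes give different commitments from the same game.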
The pessimistic one is where you look at the worst case: amongst all the best responses, you look at the one that gives you the highest cost, and then you minimize that highest cost. That is how you disambiguate. Let me tell you one interesting thing that happens because of this. Suppose there is not one follower but n followers. Say you introduce a new type of law, or say you build a new road and then the traffic adjusts itself. The followers are all the people in the cars, who are trying to decide which route to take. You can model that as a game, and it leads to something like a Nash equilibrium. The game is then not really between you and the people in the cars. You are building the road anticipating how people are going to drive and what the result of building that road is going to be. Once the road has been built, that is your irrevocable commitment, and the game is actually at the level of the drivers. So there is not just one follower; there are n followers, and they are competing amongst themselves, each trying to get the least travel time. That leads to a game amongst the drivers, and you have to decide which way to build the road anticipating the outcome of that game. With one follower you were looking at the best response. With n followers, what you need to look at is the Nash equilibrium of the game at the level of the followers.
So if there is one leader and n followers, every strategy of the leader gets announced and is seen by all the followers. The followers then play a game amongst themselves, and you as the leader can ask what the outcome of that game is. The outcome is the Nash equilibrium of that game. Let us go to a new page. Suppose you have multiple followers: player 0 is the leader and players 1 to n are the followers. Then there is a Nash equilibrium amongst the followers for each given strategy of the leader: for each follower i, the cost J i of gamma 0, gamma i star, gamma minus i star is no larger than J i of gamma 0, gamma i, gamma minus i star for every gamma i. So you look at the Nash equilibria as a function of the strategy of the leader. That gives gamma 1 star to gamma n star satisfying these conditions; that is the set of Nash equilibria, and that set depends on gamma 0. If you now want to decide, as a leader, where you should be building the road, you again have to recognize that this game could in general have multiple Nash equilibria, and based on some attitude, either optimistic or pessimistic, you decide what your strategy should be.
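The road example can be sketched in a few lines. Everything concrete here is an assumption made for illustration: two drivers, two routes, two hypothetical road designs "A" and "B" with made-up latency tables, pure strategies only, and a leader whose cost is the total travel time. The structure is the one described above: for each leader strategy, enumerate the Nash equilibria of the followers' game, then apply an attitude.

```python
from itertools import product

# LATENCY[design][route][k - 1] = travel time on `route` when k drivers use it.
# Designs and numbers are hypothetical, chosen only to illustrate the idea.
LATENCY = {
    "A": {0: [2, 4], 1: [3, 3]},
    "B": {0: [1, 3], 1: [3, 3]},
}
N_DRIVERS = 2
ROUTES = (0, 1)

def travel_time(design, profile, i):
    """Cost of driver i: latency of its route at the current load."""
    route = profile[i]
    load = profile.count(route)
    return LATENCY[design][route][load - 1]

def follower_equilibria(design):
    """Pure Nash equilibria of the drivers' game for a given road design."""
    equilibria = []
    for profile in product(ROUTES, repeat=N_DRIVERS):
        def gains(i, r):
            # True if driver i strictly lowers its cost by switching to r.
            deviated = profile[:i] + (r,) + profile[i + 1:]
            return travel_time(design, deviated, i) < travel_time(design, profile, i)
        if not any(gains(i, r) for i in range(N_DRIVERS) for r in ROUTES):
            equilibria.append(profile)
    return equilibria

def leader_cost(design, profile):
    """Assumed leader objective: total travel time of all drivers."""
    return sum(travel_time(design, profile, i) for i in range(N_DRIVERS))

def leader_choice(attitude):
    """Best design for an 'optimistic' (min) or 'pessimistic' (max) leader.

    Assumes every design admits at least one pure follower equilibrium.
    """
    pick = min if attitude == "optimistic" else max
    value = {d: pick(leader_cost(d, p) for p in follower_equilibria(d))
             for d in LATENCY}
    best = min(value, key=value.get)
    return best, value[best]
```

In this toy instance design "A" has two follower equilibria, both with total travel time 5, while design "B" has three, with totals 6, 4 and 4. An optimistic leader builds "B" expecting 4; a pessimistic leader builds "A" to guarantee 5. The set of follower equilibria, and not a single best response, is what the leader has to reason about.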
An optimistic leader looks for a gamma 0 star that minimizes the best case over that set of equilibria, and the pessimistic formulation minimizes the worst case. This now leads to one more level of complexity. So far you have multiple followers but one leader. What if you have multiple leaders and multiple followers? Suppose there are two leaders, say Airtel and Jio, both trying to decide what kind of plans they should offer, or where to deploy their towers. As a function of that, the common public will choose this plan or that plan and distribute themselves. You can think of a game at the level of the common public: people do not want to be in a network that is overcrowded, because they will get too many call drops, so they want to be in a network where there is less crowding. The two telecom providers want to decide what their strategies should be. For each profile of strategies at the level of the leaders, you have a Nash equilibrium amongst the followers. Now suppose we fix an attitude for both leaders, optimistic for both or pessimistic for both, it does not matter which. There could be multiple Nash equilibria at the level of the followers. What the leaders are doing is playing a game amongst themselves, anticipating the effect of their strategies on the game between the followers, that is, anticipating the Nash equilibrium amongst the followers that will result from the game they play at their own level.
That is, the Nash equilibrium amongst the followers that results from the profile of strategies the leaders play at their level. Is this clear? The trouble that arises is that, out of the whole set of Nash equilibria that could result, one leader may think one will occur and the other leader may think another will occur. They choose their strategies anticipating that, but only one of them will actually occur. This issue never arose in a dynamic game. When we solved for the Nash equilibrium in a dynamic game, for every player who was playing later you fixed the strategy of that player, and for each fixed strategy you moved forward. If you found a static sub-extensive form that had multiple equilibria, you solved for each fixed equilibrium of that sub-extensive form and moved upward; that was the backward induction argument. But that is not what you are doing here. Here each leader has some attitude, and each needs to disambiguate using something or the other. This leads to some trouble in applying the concept, because both players are going to make plans, but on possibly divergent imaginations of what the followers will do. In that case the resulting solution does not have any equilibrium property left, because both have made their plans taking into account possibly different projections.
They are each choosing, out of multiple equilibria, the one that fits their attitude. But the equilibrium that gets realized is only one of them, or none of them, and then the plans that have been made are no longer in equilibrium given the equilibrium that has actually been realized. Both would have some regret, because each would say, "If I had known that this is what was going to happen, I would have done something else." So this leads to an issue. When I was doing my thesis I wrote a paper on this particular issue, saying that there is a problem with the multiple-leader version of this formulation. The main problem is that you are allowing each player to anticipate an equilibrium out of the set of Nash equilibria, but you are not enforcing anything to ensure that the anticipations are in fact consistent. What I argued was that if you want to write out what is called a Nash-Stackelberg equilibrium, that is, a Nash equilibrium at the level of leaders who are each playing in a Stackelberg fashion, then you have to include another constraint that demands that the players be consistent in their conjectures about the equilibrium they anticipate. Then everything falls into place. Otherwise you also get a whole lot of other issues which I cannot go into right now; a lot of other pathologies and paradoxes emerge if you do not impose this.
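The inconsistency can be seen in a very small toy instance. This is only an illustration of the problem described above, not the construction from the paper mentioned in the lecture. All names and numbers are hypothetical: two leaders with actions "x" and "y", and for each leader profile a hand-written list of follower equilibria, each tagged with the cost it imposes on leader 1 and on leader 2. Both leaders are assumed optimistic.

```python
# FOLLOWER_EQ[(a1, a2)] = list of (label, cost to leader 1, cost to leader 2),
# one entry per Nash equilibrium of the followers' game. Illustrative data.
FOLLOWER_EQ = {
    ("x", "x"): [("E1", 1, 4), ("E2", 4, 1)],
    ("x", "y"): [("E3", 2, 2)],
    ("y", "x"): [("E4", 2, 2)],
    ("y", "y"): [("E5", 3, 3)],
}
ACTIONS = ("x", "y")

def conjecture(profile, leader):
    """Follower equilibrium an optimistic leader (0 or 1) expects to occur."""
    return min(FOLLOWER_EQ[profile], key=lambda eq: eq[1 + leader])

def perceived_cost(profile, leader):
    """Cost the leader believes it will pay, under its own conjecture."""
    return conjecture(profile, leader)[1 + leader]

def is_leader_equilibrium(profile):
    """No leader gains by deviating, each judged by its own conjectures."""
    a1, a2 = profile
    ok1 = all(perceived_cost((a1, a2), 0) <= perceived_cost((d, a2), 0)
              for d in ACTIONS)
    ok2 = all(perceived_cost((a1, a2), 1) <= perceived_cost((a1, d), 1)
              for d in ACTIONS)
    return ok1 and ok2

def conjectures_consistent(profile):
    """True if both leaders anticipate the same follower equilibrium."""
    return conjecture(profile, 0)[0] == conjecture(profile, 1)[0]
```

In this instance ("x", "x") is the only profile that passes `is_leader_equilibrium`, yet leader 1 is counting on E1 and leader 2 on E2, so `conjectures_consistent` fails. If E1 is the one realized, leader 2 pays 4 and would have preferred "y" (cost 2): that is the regret. Requiring the conjectures to coincide is the kind of extra constraint the lecture argues for.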
And this is not just to do with the Nash equilibrium. This issue, that there could be multiple outcomes and each player is trying to anticipate one of them, is an issue with any system of interacting agents in general. We are trying to analyze it with a Nash equilibrium, and there are multiple Nash equilibria. But a dynamical system could have multiple equilibrium points, and a Markov chain could have multiple stationary distributions. Whenever there are multiple settling points, you are not clear about which one will actually occur, and you try to put some kind of attitude on top to disambiguate, this problem is created. For one player there is basically no issue. All these paradoxes happen when there are multiple players, and then it is a messy problem.
This is also an issue in how we structure multi-stage markets. If you look at the way markets for renewable energy, or in fact markets for electricity in general, are structured, you have to commit to a certain level of generation beforehand. A nuclear power plant, for instance, takes about 6 or 8 hours to come online. You cannot just turn on a nuclear plant; it takes some amount of preparation. So if it has to come online, it needs to be given a certain amount of advance notice. There are markets for deciding who gets to come online, and these big power plants need a long lead time.
So those markets are held well in advance. Then there are separate markets for generators that can come online almost instantly, like gas. They need only a few minutes' notice, so those markets are almost a real-time sort of thing. Now, if you do not design these markets properly, you can create distortions where a player perversely under-commits in what is called the day-ahead market so that he can later make money using his other resources in real time. The idea behind all these markets is that nuclear, or whatever the bulk power plants are, is cheaper, so those should be prioritized: the bulk of the power should come from them, and all the minor last-minute adjustments should be done in the real-time market. But that is not how it necessarily plays out if you do not design these things properly. So again there is this issue: there is a commitment, there is a difference in the time scales of the commitment, and the players can exploit that to their advantage by playing appropriately.