Welcome back to the NPTEL course on game theory. In earlier sessions we discussed zero-sum games. Now we switch our focus to non-zero-sum games. In a zero-sum game, the payoffs of player 1 and player 2 always add up to zero. In a non-zero-sum game we do not impose that condition, so the sum can be anything. In fact, one example we have seen earlier, the coordination game, is a non-zero-sum game.

Let me recall the coordination game first. The version we are going to see is known as the battle of the sexes. There are two players, say a wife and a husband: player 1 is the wife and player 2 is the husband. Each has two choices: going to a movie or going shopping. The wife prefers shopping to the movie, and the husband prefers the movie to shopping. So the husband gets a higher benefit if he goes to the movie, whereas the wife gets a higher benefit if she goes shopping. But the most important thing is that they get any benefit only when they go together. Writing the wife's payoff first, when both go to the movie the wife gets 2 and the husband gets 4, so the payoffs are (2, 4). When both go shopping the payoffs are (4, 2). The other entries can be any suitably chosen numbers; right now let me take them to be 0.

So this is a game where the sum of the payoffs is not zero. It is an example of a non-zero-sum game which we have seen earlier. In fact, both going to the movie is an equilibrium, and both going shopping is also an equilibrium.

As I have been pointing out, the most important thing in game theory is that the players make their decisions simultaneously. Neither knows what the other is going to do. If each of them knew what the other was doing, it would become an optimization problem, and the game flavor would be lost.
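The equilibrium claim for the battle of the sexes can be checked mechanically. The following is a minimal Python sketch, assuming the payoffs quoted in the lecture with the wife's payoff listed first; the names `ACTIONS`, `PAYOFFS` and `is_pure_nash` are made up for this illustration.

```python
from itertools import product

ACTIONS = ["movie", "shopping"]

# PAYOFFS[(wife_action, husband_action)] = (wife_payoff, husband_payoff).
# Player 1 is the wife, player 2 is the husband; both are maximizers.
PAYOFFS = {
    ("movie", "movie"): (2, 4),
    ("movie", "shopping"): (0, 0),
    ("shopping", "movie"): (0, 0),
    ("shopping", "shopping"): (4, 2),
}

def is_pure_nash(a1, a2):
    """True if neither player can gain by deviating alone from (a1, a2)."""
    u1, u2 = PAYOFFS[(a1, a2)]
    wife_ok = all(PAYOFFS[(d, a2)][0] <= u1 for d in ACTIONS)
    husband_ok = all(PAYOFFS[(a1, d)][1] <= u2 for d in ACTIONS)
    return wife_ok and husband_ok

pure_equilibria = [
    (a1, a2) for a1, a2 in product(ACTIONS, ACTIONS) if is_pure_nash(a1, a2)
]
print(pure_equilibria)  # [('movie', 'movie'), ('shopping', 'shopping')]
```

Only pure strategies are enumerated here; the sketch says nothing about mixed equilibria.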
So here, once again, I stress that the players choose their decisions simultaneously and independently of each other. That is very important. So this is one example of a non-zero-sum game.

Now we will see another very important example, known as the Prisoner's Dilemma. This example was framed by Merrill Flood and Melvin Dresher, working at the RAND Corporation in the 1950s, and it was Albert Tucker who formalized the version that we are going to see now.

In this game there are two individuals who have committed a serious crime. Both of them are apprehended, but there is no evidence against them. The police have no evidence that they did it, but they strongly believe that they did. So the police try to persuade each of them to confess against the other, because a confession is going to help the police. That is basically the situation.

What the interrogators do is the following. The two are isolated in separate cells so that they cannot communicate with each other, and each is given the following choices. If you confess and your friend refuses to confess, you will be released from custody and receive immunity as a state's witness, and your friend will be prosecuted using your evidence. If you refuse to confess and your friend confesses, the situation is symmetric with the roles reversed. If both of you refuse to confess, the police do not have sufficient evidence, so they can only give a very light punishment. But if both of you confess, then there is evidence against each of you, and both get a somewhat reduced term of imprisonment.

This situation is a two-player strategic form game. Each player has two choices. D is for defection, which means betraying your fellow criminal by confessing.
So D means you have confessed that the crime was committed, and in that sense you are betraying your fellow criminal. C means cooperation. Note that this is cooperation with the fellow criminal, not with the police: C means you are not confessing the crime.

You can see this situation as a picture. There are two players and it is a 2 by 2 matrix game. Suppose both players defect: player 1 defects against player 2 and player 2 defects against player 1, which means both have confessed their crime. Then the police have sufficient evidence against both, and in this case the years of imprisonment are 6 and 6. If player 1 defects and player 2 cooperates, then player 1 is immediately released and player 2 gets 10 years, so the entry is (0, 10). Similarly, if player 1 cooperates with player 2 and player 2 defects against player 1, the entry is (10, 0): player 1 gets 10 years of imprisonment and player 2 gets nothing. And if both players cooperate with each other, the police do not have much evidence, so both of them get only one year of imprisonment.

So what is going to be the equilibrium in this setup? Remember that here the players are minimizing, because imprisonment is a cost. So far, in zero-sum games, we assumed the players are maximizers, but here both players are minimizing their years of imprisonment. No player would like to get 10 years. Therefore we can clearly see that (10, 0) and (0, 10) are not going to be equilibria; we can easily verify this. What about DD? Is this an equilibrium? Let us look at it.
The way to see it is the same as in zero-sum games. Suppose player 1 has fixed D. What is best for player 2? Once I know that player 1 has used D, player 2 has only two choices, D and C. For D he gets 6 years and for C he gets 10 years of imprisonment, so D is better. Similarly, when player 2 decides to play D, the best for player 1 is to play D. Therefore DD is a Nash equilibrium.

The thought process goes like this. Player 1 will think: suppose I do not defect, that is, I cooperate with my fellow criminal. What is going to happen to me? I am in the row with entries (10, 0) and (1, 1), so I will get either 10 years or 1 year. Can I assume that the other player cooperates? Why would he cooperate? If he cooperates he gets 1 year, and if he does not cooperate he gets 0. Not cooperating is better for him, so he will never cooperate. So if I cooperate, I must assume that my fellow criminal will not cooperate; he will only defect. Therefore cooperating is not a good choice for me, and defecting is better. The other criminal goes through the same thought process, and therefore both defect.

This is essentially what I was saying earlier about rationality. The players are rational; each looks after his own benefit, which here means minimizing his own imprisonment. The selfish behavior is very important here. Using that rationality we see that DD is a Nash equilibrium.

But here is an interesting situation. At CC, when both of them cooperate, they get only one year of imprisonment each, but that is not a Nash equilibrium. They will never play CC.
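The best-response reasoning above can be written out as a small check. This is a sketch in Python, assuming the years of imprisonment quoted in the lecture; `YEARS`, `best_response` and `is_nash` are illustrative names, and the players minimize rather than maximize.

```python
ACTIONS = ["D", "C"]  # D = defect (confess), C = cooperate (stay silent)

# YEARS[(a1, a2)] = (years for player 1, years for player 2); lower is better.
YEARS = {
    ("D", "D"): (6, 6),
    ("D", "C"): (0, 10),
    ("C", "D"): (10, 0),
    ("C", "C"): (1, 1),
}

def best_response(player, other_action):
    """Action that minimizes `player`'s years given the other's action.

    `player` is 0 or 1. This game has no ties, so the minimizer is unique.
    """
    def years(own_action):
        if player == 0:
            profile = (own_action, other_action)
        else:
            profile = (other_action, own_action)
        return YEARS[profile][player]
    return min(ACTIONS, key=years)

def is_nash(a1, a2):
    """Each action must be a best response to the other player's action."""
    return best_response(0, a2) == a1 and best_response(1, a1) == a2

equilibria = [profile for profile in YEARS if is_nash(*profile)]
print(equilibria)  # [('D', 'D')]
```

In this sketch `best_response` returns D against either action of the opponent, which is why CC cannot survive: each player would switch to D and serve 0 years instead of 1.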
This is exactly the thought process I have been describing. Each player will always think: if I cooperate with my fellow criminal, the other player is going to defect, therefore defecting is better. By symmetry the same happens for the other player, and therefore the pair of strategies DD is a Nash equilibrium.

So here are a few lessons that we need to see. The first is about dominant strategies. We used domination earlier in zero-sum games: if a specific column is dominated by some other column, you do not want to use that column. Here, in fact, D dominates C for each player, because whatever the other player does, defecting gives fewer years. If domination is possible, you play the dominating strategy. But if there is no domination, what we look at is the thought process: if I play this, what will my opponent think, and how will he react to it? Based on that we decide. That is exactly the way we reasoned above.

For example, look at a penalty kick. There are two players. If the other player knows that I am very good on the left side, then he is certainly going to wait for me at the left corner. So for me it is not good to play the left corner. This is essentially the thought process that we have seen earlier, and this is a good example which illustrates that aspect.

Now the most important question that comes up is: is there a way to make the prisoners cooperate? This situation arises in many instances, and I would like you to think about various such situations. One example is two neighboring houses. When the occupants are cleaning, each throws the dirt onto the other side, thinking that their own side is then cleaner. But the other side does the same thing, and the end result is that both end up with dirt outside their house. This is a common phenomenon that we observe in many situations. In such situations the most important question is how we can get cooperation.
This is a non-trivial question which we may not discuss much in this course, but it is a question that economists, biologists, behavioral economists and several others have been studying.

We will now see another example, known as the Cournot duopoly. What is a duopoly? A duopoly is a situation where there are two firms who want to control the market for a certain commodity. So we are considering a market, and there is a commodity that two firms are selling, and they want to control the market. When there are more than two firms we call it an oligopoly. The duopoly problem is to decide how the firms adjust their production to maximize their profits. It was studied by Cournot a very long time back, in 1838. His work can be seen as a precursor to Nash equilibrium, so in economics the Nash equilibrium, particularly in this oligopoly framework, is sometimes called the Cournot-Nash equilibrium.

Let us look at how this goes. There are two firms, 1 and 2. They produce some product and sell it in the same market. The price of the product decreases proportionally to the supply. Let $q_i$ be the number of items produced by firm $i$, and let $Q_0$ and $P_0$ be the highest reasonable production level and the highest possible price. When firm 1 produces $q_1$ and firm 2 produces $q_2$, the total quantity available in the market is $q_1 + q_2$, which I denote by $Q$. The price when the total quantity available in the market is $Q$ is given by $P(Q) = P_0 (1 - Q/Q_0)$. So if the quantity $Q$ increases, the price reduces, and if the quantity available is smaller, the price increases. If the quantity available in the market is bigger than $Q_0$, the price is going to be 0.
This is one of the simplest examples of a price-quantity relation. We assume the cost of producing one unit of the product is $c$, and we assume that the price of the product can never be less than this marginal cost. So $P_0 \le c$ is meaningless, and we always make sure that $P_0$ is bigger than $c$.

Now the strategies of the firms. Both $q_1$ and $q_2$ can be taken from the interval $[0, Q_0]$: 0 is the least they can produce, and $Q_0$ is the maximum they would like to produce. So both firms have the same possibilities, and they choose $q_1$ and $q_2$ simultaneously.

What are the payoffs? If firm 1 chooses $q_1$ and firm 2 chooses $q_2$, the price of the product is $P(Q)$. Firm $i$ produces $q_i$, so its revenue is $q_i P(Q)$, and it also incurs a cost $c\, q_i$, which we subtract. So the payoff function of firm $i$ is $\pi_i(q_1, q_2) = q_i P(Q) - c\, q_i$. Once we know this, we are in a game setup: there are two firms, they make their decisions, and they have payoff functions. Each firm's objective is to maximize its profit.

So what exactly will they do? As we have been doing earlier, we look at what is known as the best response. Say firm 2 decides on a strategy $q_2$. What should firm 1 do? Let $\hat q_1$ be the quantity that firm 1 produces when firm 2 produces $q_2$. This $\hat q_1$ should maximize firm 1's profit. That is, with $q_2$ fixed, firm 1 should produce the $q_1$ which maximizes $\pi_1(q_1, q_2)$. Let me write the profit function here again: $\pi_1(q_1, q_2) = P(Q)\, q_1 - c\, q_1$.
Here $P(Q)$ is the price curve given on the previous slide, and $\pi_1$ is the profit that firm 1 gets. We are looking for the maximizer, so we take the first-order derivative with respect to $q_1$ and equate it to 0. That gives $P(Q) + q_1 \, \partial P/\partial q_1 - c = 0$, and solving it you get $\hat q_1(q_2) = \frac{Q_0}{2}\left(1 - \frac{c}{P_0}\right) - \frac{q_2}{2}$. This is simple calculus, which I am not carrying out here. In a similar fashion, look at firm 2's decision when firm 1 has fixed $q_1$: firm 2 will produce the $\hat q_2$ which maximizes its profit, and the same analysis with $\pi_2$ gives $\hat q_2(q_1) = \frac{Q_0}{2}\left(1 - \frac{c}{P_0}\right) - \frac{q_1}{2}$.

Now another very important point. We have done only the first-order analysis, that is, we set the derivative equal to 0. A critical point need not be a maximizer, so how do we know that this one is? In this case the price curve is linear in $Q$, and the profit $q_1 P(Q) - c\, q_1$ contains the product term $q_1 P(Q)$. You can verify that the profit is concave in the variable $q_1$. Similarly, the profit of firm 2 is concave in the decision variable of firm 2. The concavity, together with the first-order conditions, ensures that the $\hat q_1$ and $\hat q_2$ we have found are maximizers. So when firm 2 fixes $q_2$, $\hat q_1$ is the maximizer of firm 1's profit, and similarly when firm 1 fixes $q_1$, $\hat q_2$ is the maximizer of firm 2's profit.

Now what is an equilibrium? A pair $(q_1^*, q_2^*)$ is an equilibrium if, when firm 2 fixes $q_2^*$, $q_1^*$ is the best response to $q_2^*$.
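The first-order calculation for firm 1's best response, which the lecture leaves to the listener, can be written out as follows. This is a worked sketch, assuming $q_1 + q_2 \le Q_0$ so that the linear branch $P(Q) = P_0(1 - Q/Q_0)$ applies, and assuming the resulting quantity is non-negative; the symbols $q_i$ and $c$ follow the lecture's quantities and unit cost.

```latex
\begin{align*}
\pi_1(q_1, q_2) &= q_1 P_0\left(1 - \frac{q_1 + q_2}{Q_0}\right) - c\, q_1, \\
\frac{\partial \pi_1}{\partial q_1}
  &= P_0\left(1 - \frac{q_1 + q_2}{Q_0}\right) - \frac{P_0}{Q_0}\, q_1 - c = 0
  \quad\Longrightarrow\quad
  \hat q_1(q_2) = \frac{Q_0}{2}\left(1 - \frac{c}{P_0}\right) - \frac{q_2}{2}, \\
\frac{\partial^2 \pi_1}{\partial q_1^2} &= -\frac{2 P_0}{Q_0} < 0 .
\end{align*}
```

The negative second derivative is the concavity in $q_1$ that turns the critical point into a maximizer. The computation for firm 2 is identical with the indices swapped.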
Similarly, if firm 1 fixes $q_1^*$, then $q_2^*$ should be the best response to it. Such a pair is called a Cournot equilibrium. Of course, in modern language this is exactly a Nash equilibrium, and that is the reason why it is also called a Cournot-Nash equilibrium.

So we need to consider the following equations: $\hat q_1(q_2^*) = q_1^*$ and $\hat q_2(q_1^*) = q_2^*$. These are the conditions that we have. If you substitute the best-response functions, you get $q_1^* = \frac{Q_0}{2}\left(1 - \frac{c}{P_0}\right) - \frac{q_2^*}{2}$ and $q_2^* = \frac{Q_0}{2}\left(1 - \frac{c}{P_0}\right) - \frac{q_1^*}{2}$. These are two equations in the two unknowns $q_1^*$ and $q_2^*$. Solving them, we get $q_1^* = q_2^* = \frac{Q_0}{3}\left(1 - \frac{c}{P_0}\right)$. Under this, the payoffs are also exactly the same: $\pi_1(q_1^*, q_2^*) = \pi_2(q_1^*, q_2^*) = \frac{P_0 Q_0}{9}\left(1 - \frac{c}{P_0}\right)^2$.

So this is the Cournot duopoly. Of course, one can do the same problem with an arbitrary number of firms instead of two; then it is an oligopoly.

This is an example where the firms decide the quantity to produce. There is a very similar model where the firms determine the price rather than the quantity. That is known as the Bertrand model. In the Bertrand model the strategies are not quantities but prices, and the firm with the lower price captures the market. That firm sells the whole product and the other one sells nothing. In case of equal prices, the firms share the market equally. The demand function, once the price is fixed, is given by $Q(p) = Q_0 (1 - p/P_0)$.
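Returning briefly to the Cournot model, the equilibrium quantity $\frac{Q_0}{3}(1 - c/P_0)$ and profit $\frac{P_0 Q_0}{9}(1 - c/P_0)^2$ can be checked numerically. The sketch below is in Python and uses repeated simultaneous best responses rather than solving the two equations algebraically as the lecture does; the parameter values `Q0 = 100`, `P0 = 10`, `C = 4` and all function names are assumptions chosen only for illustration.

```python
def best_response(q_other, q0, p0, c):
    """Best response on the linear price branch, truncated at zero output."""
    return max(0.0, 0.5 * q0 * (1.0 - c / p0) - 0.5 * q_other)

def profit(q_own, q_other, q0, p0, c):
    """Profit q_i * P(Q) - c * q_i with price P(Q) = P0 * (1 - Q / Q0), floored at 0."""
    price = max(0.0, p0 * (1.0 - (q_own + q_other) / q0))
    return q_own * price - c * q_own

def cournot_by_iteration(q0, p0, c, rounds=200):
    """Let both firms best-respond to each other repeatedly, starting from (0, 0)."""
    q1 = q2 = 0.0
    for _ in range(rounds):
        q1, q2 = best_response(q2, q0, p0, c), best_response(q1, q0, p0, c)
    return q1, q2

# Hypothetical parameter values, with P0 > C as the model requires.
Q0, P0, C = 100.0, 10.0, 4.0

q1, q2 = cournot_by_iteration(Q0, P0, C)
q_star = Q0 / 3.0 * (1.0 - C / P0)               # closed-form equilibrium quantity
pi_star = P0 * Q0 / 9.0 * (1.0 - C / P0) ** 2    # closed-form equilibrium profit

print(q1, q2, q_star)
print(profit(q1, q2, Q0, P0, C), pi_star)
```

The iteration converges here because each best response has slope $-1/2$ in the opponent's quantity, so the distance to the fixed point halves every round. It is a convenience for checking the formulas, not a claim about how firms actually reach equilibrium.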
If the price is higher, the quantity that can be sold is small, and if the price is small, more can be sold. That is exactly what this demand function captures.

The payoff functions are simple. If $p_1$ and $p_2$ are the prices chosen by the firms, firm 1's profit is $(p_1 - c)\, Q(p_1)$ when firm 1 has the smaller price. If both prices are the same, the firms share the market, so firm 1 gets half of that. (There is a small mistake on the slide here: it should read $p_1$, not "$p + 1$".) If firm 1's price is higher, then of course firm 1 gets nothing. Firm 2's profit $\pi_2$ is defined similarly. We have exactly the same kind of conditions as before: the price has to be bigger than $c$, since $c$ is the marginal cost, and therefore $P_0$ has to be bigger than $c$. What is $P_0$? It is the maximum price that they would like to consider; beyond $P_0$ they cannot sell anything. So the prices lie between $c$ and $P_0$.

Now we can go through the cases one by one and verify the Nash equilibrium conditions. First, a price $p_1 > p_2$ cannot be a best response to $p_2$ when $p_2 > c$: if firm 2 has decided on a price $p_2$, firm 1 will never go above $p_2$. Similarly, if firm 1 decides on $p_1$, firm 2 will never go above $p_1$. Next consider the case $c = p_2 < p_1 \le P_0$. In this case $p_2$ is not a best response to $p_1$, because choosing any price strictly between $c$ and $p_1$ yields a better payoff for firm 2. So this case cannot give a Nash equilibrium.

Likewise you can analyze all the situations. I would like you to work out the details and compute the Nash equilibrium in this setup. You will get a different outcome than in the Cournot model. Of course, there are a lot of differences between Cournot and Bertrand; we will not go into those aspects. We only give these as examples of non-zero-sum games.
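The Bertrand payoff rule and the two case checks above can be sketched in Python. The demand function and the three-way payoff rule follow the lecture; the numeric values (`Q0 = 100`, `P0 = 10`, `C = 4`, and the prices 6 and 8) and the function names are assumptions chosen only for illustration.

```python
def demand(p, q0, p0):
    """Demand Q(p) = Q0 * (1 - p / P0), floored at zero."""
    return max(0.0, q0 * (1.0 - p / p0))

def bertrand_profit(p_own, p_other, q0, p0, c):
    """Profit of a firm charging p_own against a rival charging p_other."""
    if p_own < p_other:      # lower price captures the whole market
        return (p_own - c) * demand(p_own, q0, p0)
    if p_own == p_other:     # equal prices: the market is shared equally
        return 0.5 * (p_own - c) * demand(p_own, q0, p0)
    return 0.0               # higher price sells nothing

# Hypothetical parameter values, with C < P0.
Q0, P0, C = 100.0, 10.0, 4.0

# Case c = p2 < p1 <= P0: firm 2 earns nothing at p2 = c, but any price
# strictly between c and p1 gives it a positive profit.
p1 = 8.0
stay = bertrand_profit(C, p1, Q0, P0, C)
deviate = bertrand_profit(6.0, p1, Q0, P0, C)
print(stay, deviate)

# Pricing above a rival who charges more than c earns nothing, whereas
# matching the rival's price earns a positive profit.
above = bertrand_profit(8.0, 6.0, Q0, P0, C)
match = bertrand_profit(6.0, 6.0, Q0, P0, C)
print(above, match)
```

The sketch only evaluates the payoffs at a few prices; it does not search for the equilibrium, which the lecture leaves as an exercise.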
With this I will stop this session. In the next session we will formally define Nash equilibrium for non-zero-sum games, and we will proceed to prove the existence of a Nash equilibrium.