Today is a brief introduction to the basics of game theory, based on the paper by Matthew Jackson, a well-known economist currently at Stanford, and presented by me. Let's open with the fun stuff, which is a bunch of definitions. First, what is a game? A game is defined by players, actions, timing, and the payoffs, or utilities, of the players. Those are very generic, so I've written down the actual definitions for you beforehand. Only look at the section lit up by the projector. Players: you have to define who is playing, a set of people or entities numbered 1 through N. Each player i then has a set of actions, usually written A_i, which are also called pure strategies. And since there are pure strategies, you might ask: what is a not-pure strategy? The opposite of a pure strategy is a mixed strategy, and we'll talk more about that later; for now, just bear with me through the definitions. There's also the set A. In the paper the author writes a lowercase a here, which I find super confusing; I think he means the uppercase A, which represents the set of all possible profiles of pure strategies. You take the set of actions for player 1, the set for player 2, and so on, and pick one action from each set; every such combination is a profile, and A is the set of all of them. A generic element is written a, which is just one action from each player.
The last definition is payoffs, sometimes called utilities: a function from an action profile to a value. A single player's utility is defined over an action profile, which includes the player's own action and the other players' actions. These are dry definitions, so I just want to get them out of the way; since you all read the paper, you know what I'm talking about. Now, everyone who has heard of game theory has heard of the prisoner's dilemma, so I'll use that game to explain the table notation. You have two prisoners, player 1 and player 2, and each has two available actions, C or D. C stands for cooperate, or as I like to think of it, cover each other up; D stands for defect. When they both cover for each other, they get a much lighter sentence, so both payoffs are -1: they're still punished, just less severely. But if player 1 covers for player 2 while player 2 rats player 1 out, player 2 gets away with the crime, a payoff of 0, while player 1 is punished more severely. That outcome is mirrored in the opposite cell. In each cell of the table, the first number is player 1's utility and the second is player 2's.
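These definitions map almost directly onto code. Here's a minimal sketch of the prisoner's dilemma in Python: the -1/-1 payoffs and the defector's 0 are from the talk, while the -3 and -2 are illustrative stand-ins for "punished more severely" and "both punished", since those exact numbers weren't read out.

```python
# The definitions, made concrete for the prisoner's dilemma.
from itertools import product

players = [1, 2]
# A_i: each player's set of actions (pure strategies).
actions = {1: ["C", "D"], 2: ["C", "D"]}
# u maps an action profile a = (a_1, a_2) to the payoff pair (u_1(a), u_2(a)).
# (-1, -1) and the 0s are from the talk; -3 and -2 are illustrative.
u = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
     ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}
# Big A: the set of all action profiles, one action picked from each A_i.
A = list(product(actions[1], actions[2]))

print(A)              # all four profiles
print(u[("D", "D")])  # (-2, -2): both defect
```
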
Each cell corresponds to an action profile; the last cell here is where both player 1 and player 2 defected. Is that clear? Questions? Cool. This is the famous example showing that individual incentives and overall welfare need not align. For the time being we'll talk about pure strategies, and we'll get to mixed strategies if we have time. Next, something called a dominant strategy. Reading off the definition: a dominant strategy is a strategy that produces the highest payoff of any strategy available, for every possible action by the other players. This is a very powerful concept. Why? Because if a player has a dominant strategy, he or she doesn't need to care what the other players do at all; they just always play it. Let's look at an example: the production-choice game, where firm 1 chooses high or low production, and firm 2 does the same. Don't worry about the exact scenario; just pay attention to the numbers, which are the payoffs for firm 1 and firm 2. If firm 1 chooses high, it gets 2 or 4, depending on what firm 2 does; if firm 1 chooses low, it gets 2 or 5. So if you're firm 1, which would you choose? Low, exactly. Low is a dominant strategy for firm 1. Does firm 2 have a dominant strategy here? No: if firm 2 chooses high it gets 2 or 7, and if it chooses low it gets 5 or 3, so neither action is at least as good in both cases. So firm 1 has a dominant strategy, which is low, and firm 2 has none.
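The dominance check can be mechanized. This is a sketch under my reconstruction of the two tables described in the talk (the second differs from the first in a single cell, which comes up next); the cell layout and helper names are mine, so treat them as assumptions.

```python
# Classify each action of a two-player game as strictly dominant,
# weakly dominant, or not dominant. Cells are my reconstruction of
# the talk's production-choice tables.
def payoff(u, i, ai, aj):
    # u is keyed by (firm 1's action, firm 2's action); i is 0 or 1.
    profile = (ai, aj) if i == 0 else (aj, ai)
    return u[profile][i]

def dominant_actions(actions, u, i):
    j = 1 - i
    result = {}
    for a in actions[i]:
        # At least as good as every alternative, for every opponent action.
        geq = all(payoff(u, i, a, b) >= payoff(u, i, alt, b)
                  for alt in actions[i] for b in actions[j])
        # Strictly better than every alternative, for every opponent action.
        gt = all(payoff(u, i, a, b) > payoff(u, i, alt, b)
                 for alt in actions[i] if alt != a for b in actions[j])
        result[a] = "strict" if gt else ("weak" if geq else None)
    return result

actions = (["High", "Low"], ["High", "Low"])
game1 = {("High", "High"): (2, 2), ("High", "Low"): (4, 5),
         ("Low", "High"): (2, 7), ("Low", "Low"): (5, 3)}
game2 = dict(game1)
game2[("High", "High")] = (1, 2)   # the one changed cell

print(dominant_actions(actions, game1, 0))  # Low is weakly dominant
print(dominant_actions(actions, game2, 0))  # Low is strictly dominant
print(dominant_actions(actions, game1, 1))  # firm 2: no dominant action
```
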
Now look at this second table. It's almost identical to the first; the only difference is a 1 here instead of a 2. Firm 1 again has a dominant strategy, low, but here it is also a strictly dominant strategy. Why the distinction? In the first table, high and low both give firm 1 a payoff of 2 against one of firm 2's actions, so low is only weakly dominant there; in this table, low is strictly better in both cases. That concept becomes useful later; it's just good to know for now. All right, next let's cover the very famous Nash equilibrium. The definition: a pure strategy Nash equilibrium is a profile of strategies such that each player's strategy is a best response against the equilibrium strategies of the other players. It's a mouthful, so I want to highlight a few words. First, "pure strategy": in this context, every player picks one action and sticks with it. Think of rock-paper-scissors. A mixed strategy would be: over ten games I'll choose paper a third of the time, rock a third, and scissors a third. A pure strategy is rock all the way. Second, "profile": this is not just your own choice of strategy; the condition has to hold for every player, with every player picking a best response against whatever the rest are playing. Great. Since everybody's read the paper, I'm sure you've seen this formula appear many times; it's part of the formal definition. Before that, one note, because the notation might be confusing.
So a_i means player i's action, and a_{-i} means the other players' actions. The formula u_i(a_i, a_{-i}) ≥ u_i(a_i', a_{-i}) says: player i's utility when picking a_i, while the rest of the players play a_{-i}, is at least as great as from any other action a_i' that player i could have chosen. All this says is that a_i is a best response for player i when a_{-i} is fixed. That's the definition of best response, and the paper reuses this same formula for three definitions. For a dominant strategy, the inequality has to hold for all a_{-i}: regardless of what the other players do, playing this action always yields at least as much utility as anything else. For a best response, as we just discussed, the other players' actions are given, and we pick the action that yields player i the most utility against them. And a Nash equilibrium says that everybody in the game is simultaneously playing a best response against everybody else. It's confusing, I know, especially in the formal notation: each concept is understandable on its own, but in the formal definitions things start to blur together. I just like to put them all side by side; they do all make sense. So what are the properties of a Nash equilibrium? First, it's stable. This is very, very important: at a Nash equilibrium, nobody has any incentive to move away from the situation. Second, it's possible for a game to have multiple Nash equilibria.
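The best-response condition reads almost directly as code. Here's a sketch applied to the prisoner's dilemma; as before, the -3 and -2 payoffs are illustrative assumptions, since only the -1/-1 and the 0 were stated.

```python
# a_i is a best response to a_{-i} when u_i(a_i, a_{-i}) >= u_i(a_i', a_{-i})
# for every alternative a_i', holding the others fixed.
from itertools import product

actions = {1: ["C", "D"], 2: ["C", "D"]}
u = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
     ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}

def best_response(i, a_other):
    """Player i's best responses when the other player's action is fixed."""
    def ui(ai):
        profile = (ai, a_other) if i == 1 else (a_other, ai)
        return u[profile][i - 1]
    best = max(ui(ai) for ai in actions[i])
    return [ai for ai in actions[i] if ui(ai) == best]

def pure_nash():
    """Profiles where every player is best-responding to the other."""
    return [(a1, a2) for a1, a2 in product(actions[1], actions[2])
            if a1 in best_response(1, a2) and a2 in best_response(2, a1)]

print(pure_nash())   # [('D', 'D')]: mutual defection is the unique pure NE
```
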
Let's look at some examples, because I think I've rambled about definitions too much. This game is called the battle of the sexes. The setup: Tim, my husband, and I want to go see a movie. I want to see Trainspotting and Tim wants to see 500 Days of Summer. If we go to 500 Days of Summer, Tim gets more utility out of it: he gets 3 and I get 1. Why do I still get a positive utility? Because I get to be with Tim; we go to the cinema together and spend some time together, and that's worth something. Vice versa: if we go see Trainspotting, I get 3 and Tim gets 1. However, if we split up, each seeing our own preferred film, neither of us gets anything, because I'll be thinking, now I don't get to spend time with my husband, and that offsets the value I get out of the movie. That's the setup of the battle of the sexes. So, do we know what the Nash equilibria are here? Yes: we both go to the same movie, either movie. Why is that Nash? Because nobody has any incentive to switch unilaterally; you would just be worse off, unless both change at the same time. Correct, and here's the important point: you can't discuss it and both switch together. You have to think of it as only one of the players being able to change at a time. If Tim deviates alone, both our utilities decrease; if I deviate alone, same thing. So there are two Nash equilibria here, (X, X) and (Y, Y). The next game also has two Nash equilibria; can someone tell me why? OK, I should probably explain the game first. This is the hawk-dove game: hawks are aggressive and doves are peaceful.
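Before the next game, here's a quick sanity check of the "only one player moves at a time" logic for the battle of the sexes. The 3/1 payoffs are from the talk; the 0/0 for splitting up is my assumption, based on the offsetting story above.

```python
# Checking the battle-of-the-sexes equilibria by unilateral deviation.
# T = Trainspotting, S = 500 Days of Summer; profile = (my action, Tim's).
u = {("T", "T"): (3, 1), ("S", "S"): (1, 3),
     ("T", "S"): (0, 0), ("S", "T"): (0, 0)}
other = {"T": "S", "S": "T"}

def is_nash(a1, a2):
    # Only one player may move at a time: me switching, or Tim switching.
    i_cant_gain = u[(a1, a2)][0] >= u[(other[a1], a2)][0]
    tim_cant_gain = u[(a1, a2)][1] >= u[(a1, other[a2])][1]
    return i_cant_gain and tim_cant_gain

print([p for p in u if is_nash(*p)])   # the two 'same movie' profiles
```
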
If the two players are both aggressive, they both get 0. If one is aggressive and the other is peaceful, the hawk gets more utility and the dove gets less. And if both are peaceful, they get equal amounts: less than what a lone aggressor gets, but more than what a dove gets against an aggressive opponent. An audience member asks: isn't that essentially the same as the prisoner's dilemma? Not quite; similar, though. Another points out: isn't the difference that in the asymmetric case, one hawk and one dove, the players are jointly better off than in the prisoner's dilemma? In the prisoner's dilemma, if you both defect, it's still better than if only one of you does, isn't it? We go back and forth: behind the labels the ordering looks the same; ideally you're the one who confesses while the other guy doesn't, that's the best outcome, the second best is that you both stay quiet, and so on. For a moment everyone is boggled: one outcome is better, one is worse, the games look very similar. But wait, the equilibria are different in these two cases, so they're definitely different games. Hang on, are they different? Someone ventures: so the equilibrium in this case would be hawk-hawk?
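Before settling that question, here's the comparison done mechanically. The hawk-dove numbers are illustrative picks that respect the ordering just described, and the prisoner's dilemma's -3/-2 are likewise assumptions.

```python
# Hawk-dove next to the prisoner's dilemma, to settle the
# "aren't these the same game?" question.
from itertools import product

def pure_nash(actions, u):
    def ok(a1, a2):
        # Each player's action must be payoff-maximal given the other's.
        return (u[(a1, a2)][0] == max(u[(d, a2)][0] for d in actions) and
                u[(a1, a2)][1] == max(u[(a1, d)][1] for d in actions))
    return [p for p in product(actions, actions) if ok(*p)]

hawk_dove = {("H", "H"): (0, 0), ("H", "D"): (3, 1),
             ("D", "H"): (1, 3), ("D", "D"): (2, 2)}
pd = {("C", "C"): (-1, -1), ("C", "D"): (-3, 0),
      ("D", "C"): (0, -3), ("D", "D"): (-2, -2)}

print(pure_nash("HD", hawk_dove))  # asymmetric: one hawk, one dove
print(pure_nash("CD", pd))         # unique: both defect
```
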
Oh no, the equilibrium is different in this case: in this game, dove-hawk and hawk-dove are the equilibria. Look at the dove-hawk cell: the only player who could gain by switching to dove is already playing dove, and the only player who could gain by switching to hawk is already playing hawk. Moving in either direction makes the mover worse off, so neither player has an incentive to deviate. So this is definitely a different game, because its equilibria are different from the prisoner's dilemma's. Does that make sense, or have I lost you all? The intuition: in the prisoner's dilemma, the worst that can happen to you is that your opponent rats you out; here, the worst that can happen is that you both fight to the death. And in the prisoner's dilemma, when you're in the bottom-left cell, playing (D, C), player 1 would lose by moving, but player 2 has an incentive to defect and take the smaller penalty. That's why (D, D) is an equilibrium there: from (C, C), either player gains by moving toward a corner. Someone asks whether it matters that the setup says you don't know the other's pick and aren't supposed to change your choice afterward; isn't that fundamentally different? But that's exactly what the definition of equilibrium captures: if everyone chooses the equilibrium profile, then no one can, on their own, improve by changing. If you look at the formula, it's only a_i that changes to a_i'; all the other coordinates stay the same, and the rest of the definition is just verbiage around that. In the table, this corresponds to moving only horizontally or vertically for yourself. The setup, in terms of what a player can and cannot do, is the same for both games: you can't go from (C, C) to (D, D) in one step, though you can in two steps, one move here and one move there. The difference is that in this game, once you've moved, you're stuck: there's no more incentive for anybody to move again. So, again: a Nash equilibrium is stable; no single player has an incentive to deviate from his or her current action. The next question is: are there games with no Nash equilibrium at all? How about this one? This is a penalty-kick game: there's a goalie and a kicker, and each can go left or right. If they both go left, the goalie saves the shot: the kicker gets -1 and the goalie gets 1. If the kicker goes left while the goalie goes right, the kicker scores: the kicker gets 1 and the goalie gets -1, and so on. This is also known as the matching-pennies game. So in this case, is there a Nash equilibrium? Right: there is no pure strategy Nash equilibrium here. If the kicker always goes left and the goalie always goes left, the kicker thinks, all right, I'd be better off moving to the right; but when the kicker moves to the right
then the goalie thinks, OK, I'd be better off moving right too; and once the goalie moves, the kicker thinks, now I'd rather not be on the right after all. It goes around in a circle and never settles, so there is no equilibrium in pure strategies. But there is a mixed strategy Nash equilibrium. Can someone tell me what it is? Yes: they both go 50-50. And what's the expected utility for each player then? Zero, exactly. Conceptually: if I go left half the time and right half the time, I win half the time, and the same holds for both the goalie and the kicker. Someone asks whether that's because it's a zero-sum game with random choices; the real question is how you decide what percentage to put on each strategy. In this case it's obvious by symmetry: 50 left, 50 right, the same for the kicker and the goalie. We'll look at how to calculate it in general in a moment. First, the formal definition of a mixed strategy Nash equilibrium. Here s_i(a_i), or s_j(a_j), represents the probability that an action is chosen; in our case, 50 percent for the goalie going left. And u(a), as we know, is the utility yielded by the profile a. All this is saying is: for the action I'm picking, sum over all the possible cases, weighting each case's utility by its probability, and that expected value has to be at least as great as for any other action I could have chosen. All we've thrown in, different from the pure strategy equilibrium, is this probability. Yes, that's the expectation, correct: it's the expected return given the mixture. A question from the audience: on the left-hand side you choose your strategy randomly, but on the right-hand side you always commit to a given a_i'; isn't that comparing a mixed strategy against a pure one? The point is that the condition has to hold for every a_i': you never introduce an s_i' assigning probabilities over the deviations; you just commit to a single a_i' on the right, for each possible a_i' in turn. The mixed strategies of the other players are already given; that's not changing. What changes is only agent i's action, a_i' here versus a_i there. It's saying: compare a_i to any other action I could have taken, while the mixture over everyone else stays the same. It might be clearer when I go into an example and expand this formula; the ratio of the actions stays fixed, because you're evaluating one
mixture. I'm going to give you an example, and if it's still not clear then let's argue afterwards, because I have about a million more slides to go through; that was just the definition. Go back to this game, with one modification: the kicker is better at kicking to the right than to the left. When the kicker kicks right and the goalie also goes right, instead of -1 and 1, the kicker gets 0 and the goalie gets 0. This slight modification makes the game biased, and yes, it assumes the goalie knows the kicker is better on the right. Knowing this, how should the kicker and the goalie each allocate, or strategize, between left and right? This is where we figure out the mixed strategies for the kicker and the goalie. The condition that pins down the kicker's strategy is that the kicker wants to make the goalie indifferent between going left and going right. Let me write this out. Say the kicker kicks right with probability p and left with probability 1 - p. The goalie's expected utility from going left is (1 - p)(1) + p(-1), and from going right it's (1 - p)(-1) + p(0). Setting them equal: 1 - 2p = -1 + p, and moving everything to one side, p = 2/3. So the kicker is better off kicking more often to the right. If you go through the same process for the goalie, you will see that the goalie also favors the right-hand side. Does that make sense? What's odd about this: if you look at the whole matrix, going right gives the goalie less utility in that cell, but the goalie goes right as an adjustment to the kicker's strategy. The expected utilities you can then just calculate: -1/3 for the goalie and +1/3 for the kicker. Does the formula make sense now? Enough; we can discuss later, and we should leave time for the next speaker. Awesome. So the next step is the extensive form game. Until now we were talking about the normal form game, which is the simple form. Notice that we did not touch timing, or the order of the players, at all: we were assuming everybody plays at the same time, everyone knows the payoff structure, and you can formulate your strategy based on what the other players' strategies are. The extensive form game is more complicated: we throw in the order of the moves, while keeping the property that everything is transparent, so players know the others' moves. There is a formal definition, but, just kidding, we're not going through it; even the author doesn't go through the full definition in the paper, because the notation is crazy. Instead he zoomed in on a subset of an extensive
form game: a finite game of perfect information. Finite means every player has a fixed number of moves in a fixed order; perfect information means every player knows what the previous players' actions were. In the paper there's a game first presented as a normal form table: two firms each choose to advertise or not. When both choose not to advertise, they both get a higher number; when one of them advertises, the advertiser gets a higher number and the other one gets hurt; and when both advertise, they both get a lower number. This can be converted into a finite game of perfect information simply by letting firm 1 move first. A game like this is normally represented by a tree: firm 1 chooses to advertise or not; firm 2, knowing firm 1's move, then chooses to advertise or not; and the numbers at the bottom denote firm 1's utility and firm 2's utility at the end of play. Straightforward enough. To the whiteboard again; it's a shame about the glare, so can we bring up these lights? Focus your attention on the bottom-right corner: I've drawn the same tree and utilities, and annotated each branch to make it less confusing. For firm 1, a1 is advertise and a2 is not-advertise; for firm 2, b1/b2 are advertise/not-advertise after a1, and c1/c2 are advertise/not-advertise after a2. Why annotate like this? Look at the strategies. Firm 1's are just advertise or not, straightforward enough. But for firm 2, a strategy is a full plan: imagine you're a person at firm 2, planning your moves; you have to prepare both for the case where firm 1 advertises and for the case where it doesn't. So firm 2 can choose b1c1, b1c2, b2c1, or b2c2; you can't choose b1 and b2 at the same time. That's why the table has all these combinations. Now let's fill in the numbers. When firm 1 picks a1 and firm 2's strategy starts with b1, play comes down the b1 path and the result is (6, 6). The c1-or-c2 part of the strategy isn't triggered, but as firm 2 you don't know that beforehand, which is why it's still part of your strategy. With a1 and b2, the result is (13, 7), whatever the c component is. With a2 and c1, it's (7, 13), whatever the b component is; and with a2 and c2, it's (16, 12). So the cells are (6, 6), (13, 7), (7, 13), and (16, 12), repeated across the strategies that don't affect play. Now the interesting bit: the strategies. When firm 1 chooses a1, what will firm 2 choose? Compare firm 2's utilities: firm 2 is better off picking b2, not advertising, and whether its plan says c1 or c2 makes no difference. And when firm 1 is going with a2, then firm 2 is better
off with c1, advertising, in either of the rows where that can happen. So what's interesting: note these two profiles, (a2, b1c1) and (a1, b2c1). Both are equilibria according to this table. Can we have the projector on and this off? Keep those in mind; don't forget, we'll get there in a minute. Here's the interesting part. The highlighted one, (a2, b1c1), is an equilibrium, but what's the problem with it? It amounts to firm 2 threatening firm 1: if you come down the advertise route, I'm just going to advertise too, and we'll both end up with 6. Firm 1 says, all right, I won't advertise, and we end up at (7, 13). But this doesn't make sense, because if firm 1 actually advertised, firm 2, by choosing to advertise in that subgame, would be getting less utility. It's a non-credible threat. So how do we deal with such a case? That's where backward induction comes in. Look at each subgame on its own. In the subgame after firm 1 advertises, firm 2 is the only one choosing, and its sensible choice is obviously not to advertise. In the subgame after firm 1 doesn't advertise, firm 2's sensible choice is obviously to advertise. So we reduce the game to those choices; firm 2 is now out of the picture, and firm 1 picks between the resulting utilities: 13 for advertising versus 7 for not. Obviously firm 1 chooses to advertise. By doing this we arrive at the one and only remaining equilibrium, (a1, b2c1). This concept is known as subgame perfect equilibrium: given a game represented as a tree, every subtree has to be in equilibrium too. That's how we arrived at this solution, and why the other solution doesn't make sense: it isn't subgame perfect. An audience member asks: but if you're firm 2 and you want to force firm 1 down that route, you basically want to make the threat credible; does some kind of pre-commitment, firm 2 saying "whatever happens, I'm going to advertise," fit into the model? The thing is, then it needs to be incorporated into the game. Here, firm 2's threat is not credible, because if firm 1 actually goes down that branch, firm 2 will pick not-advertise. If firm 2 could truly commit first, then you're letting firm 2 pick first, which means firm 2 has to sit closer to the root of the tree, or be the root. And that's the whole point: order matters. Adding just one more condition suddenly makes your game different, and you have to deal with it differently. So, I think I'm about done. The only thing left: when I posted this idea on Facebook, a certain professor protested: of all the game papers we CS people love, why are you doing an economics paper? So, to link it back to CS: we all know the person who came up with the minimax theorem, which is fundamental to game theory. And just like what we did here, where the kicker's goal is to make the goalie indifferent between the two strategies, the result can also be derived from the minimax theorem. That's homework. What's the point of all this, you might ask? Well, if you've heard of the paperclip game, and you get far enough in it, you'll see there are actually some games in there you can play; you can write an AI and use the strategies that we've talked
about today to do some proper strategy picking. So that's the whole point of this talk. Thank y'all! Questions? We have four minutes for questions. No questions? Great.
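As an addendum, the two worked examples from today can be checked with exact arithmetic. This is a sketch: the payoff cells are the numbers as I read them out from the tables, so treat them as assumptions, and the indifference formula is the standard one for 2x2 games rather than anything from the paper itself.

```python
# Verifying the penalty-kick mixed equilibrium and the advertising-game
# backward induction, with exact fractions.
from fractions import Fraction as F

def mixed_2x2(u_row, u_col):
    """Fully mixed equilibrium of a 2x2 game via the indifference condition:
    each player mixes so the *opponent* is indifferent between their actions.
    Returns (p, q) = (P(row plays action 0), P(col plays action 0))."""
    q = F(u_row[1][1] - u_row[0][1],
          u_row[0][0] - u_row[0][1] - u_row[1][0] + u_row[1][1])
    p = F(u_col[1][1] - u_col[1][0],
          u_col[0][0] - u_col[1][0] - u_col[0][1] + u_col[1][1])
    return p, q

# Matching pennies (the symmetric penalty kick): both mix 50-50.
p, q = mixed_2x2([[-1, 1], [1, -1]], [[1, -1], [-1, 1]])
assert p == q == F(1, 2)

# Modified penalty kick, actions ordered (Left, Right): the kicker is
# stronger on the right, so (Right, Right) pays (0, 0) instead of (-1, 1).
kicker = [[-1, 1], [1, 0]]   # row player
goalie = [[1, -1], [-1, 0]]  # column player
p, q = mixed_2x2(kicker, goalie)
print(1 - p, 1 - q)          # both favor Right with probability 2/3

# Expected utilities at equilibrium (using action 0 for each player works
# because indifference makes both of their actions equally good).
eu_kicker = q * kicker[0][0] + (1 - q) * kicker[0][1]
eu_goalie = p * goalie[0][0] + (1 - p) * goalie[1][0]
print(eu_kicker, eu_goalie)  # 1/3 for the kicker, -1/3 for the goalie

# Backward induction on the advertising tree. An internal node is
# (mover index, {action: subtree}); a leaf is a (firm 1, firm 2) payoff pair.
def solve(node):
    mover, rest = node
    if not isinstance(rest, dict):      # leaf: node itself is the payoffs
        return node, []
    best = None
    for action, child in rest.items():  # the mover picks their best subtree
        payoffs, path = solve(child)
        if best is None or payoffs[mover] > best[0][mover]:
            best = (payoffs, [action] + path)
    return best

tree = (0, {"Advertise": (1, {"Advertise": (6, 6), "Not": (13, 7)}),
            "Not":       (1, {"Advertise": (7, 13), "Not": (16, 12)})})
print(solve(tree))   # firm 1 advertises, firm 2 backs down: (13, 7)
```

The indifference step is exactly the 1 - 2p = -1 + p algebra from the talk, just done once and for all for any 2x2 payoff matrix, and the tree solver is the "every subtree must be in equilibrium" idea applied recursively.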