This lecture is going to introduce the idea of mixed strategies and extend our previous concept of Nash equilibrium to this new definition. Let's begin by looking at the matching pennies game. Recall that it would be a pretty bad idea to play any deterministic strategy in this game. For example, if player two were to play heads, then player one would want to respond by playing heads to get a payoff of one. But then player two would prefer to change to tails so that he can get a payoff of one, which means player one would prefer to change to tails as well. Then player two would prefer to change back to heads to get a payoff of one, and player one would prefer to change back to heads too, which is where we started. So you can see there's a cycle: we just bounce around between the different cells of the game matrix, and essentially we've argued that no pair of deterministic strategies works for both players. So what does work for both players? Well, essentially it makes sense for the players to confuse each other by choosing to play randomly. Intuitively, instead of saying "I'm going to commit to playing heads" or "I'm going to commit to playing tails," I can say "I'm going to commit to flipping this coin and playing whatever side comes up." So let's try to make this idea formal. Before, we talked about the idea of pure strategies, which we just equated with playing actions. Now let's think of things in terms of probability distributions. So let's say that a strategy for an agent is any probability distribution over the actions that the player has available to him. A pure strategy is then the special case where I play only one action with positive probability. A mixed strategy says I'm going to play more than one action with positive probability. There might be several different actions that I assign positive probability to, like in my example with matching pennies.
I'm going to call the support of my mixed strategy the set of actions that get positive probability. So, for example, when I flip a coin while playing matching pennies, both heads and tails are in the support of my mixed strategy; my support is the set {heads, tails}. I'm going to define the set of all strategies for an agent i to be capital S sub i, and the set of all strategy profiles, capital S, to be the Cartesian product of these strategy sets for the different agents. Now I have a problem: I've expanded my definition of strategies in the game to include not just this finite set of things players can do, but the infinite set of all probability distributions over those finite sets. The reason this is a problem is that I only have a utility definition for action profiles, and now I'm allowing things to happen that I don't have utilities for. That is to say, I can't just read a number out of the game matrix to figure out how happy the players are when something happens, because under a mixed strategy with a support of size greater than one, I won't always end up in the same cell of the matrix. So I can extend my definition of utility here by leveraging the idea of expected utility from decision theory. These equations explain what this means; it looks a lot more complicated than it is. What I'm saying here is that i's utility under mixed strategy profile s, where little s is some element of the set of all possible mixed strategy profiles, big S, is equal to a sum over all action profiles in the game. You can think of this intuitively as the sum over all of the cells in the normal form of the game, where I take the utility of each cell and multiply it by the probability that that cell will be reached under the given mixed strategy profile.
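As a minimal sketch of these definitions (the variable and function names here are my own, not from the lecture), we can represent a strategy as a mapping from actions to probabilities, and read off the support directly:

```python
def support(strategy):
    """Return the set of actions that get positive probability."""
    return {action for action, prob in strategy.items() if prob > 0}

# A mixed strategy: flip a fair coin in matching pennies.
coin_flip = {"heads": 0.5, "tails": 0.5}
# A pure strategy: the special case with a support of size one.
always_heads = {"heads": 1.0, "tails": 0.0}

print(support(coin_flip))     # {'heads', 'tails'}
print(support(always_heads))  # {'heads'}
```

A pure strategy is just a strategy whose support is a single action, which is why it fits this representation as a special case.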
That is, the probability of reaching action profile a given strategy profile s. And then of course I need to define what this probability actually is, and that's given here: the probability that I'll get to a given action profile a, given a strategy profile s, is just the product of the probability of each player playing his part of that action profile. So, for example, if this player was playing with probability point five on each action and this player was also playing with probability point five, then the probability that I would get to this cell is point two five. This action profile arises with probability point two five because this thing happens half the time and this thing happens half the time, and we have to multiply those two probabilities together to get the joint probability of the action profile. That's what this definition here is saying. So, in total, my utility for a strategy profile is my expected utility, taking an expectation over all of the action profiles in the support of that strategy profile and weighting each of them by the probability that that action profile would actually arise. Now that I've defined what strategies are, I can go back to my definitions of best response and Nash equilibrium. Basically, they work the way they did before, except I change all of the a's to s's. That means I have to write these definitions again, and I'll go through them again, but conceptually, if you understood what best response and Nash equilibrium meant in the case of actions, then everything will work the same way here. I will say that a strategy s-star-i is an element of the set of best responses to strategy profile s-minus-i when the following condition is true: for all other strategies s-i that player i could take, that is, for all of the strategies in the set of possible strategies for that player (notice that this is an infinite set, but that's okay, the definition still works).
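The arithmetic in the example above can be written out directly. This is a worked instance of the expected-utility sum for matching pennies when both players flip fair coins (payoffs assumed as in the lecture: 1 when the pennies match for player one, and -1 otherwise):

```python
# Player one's payoff in each cell of matching pennies.
cells = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}

# Each cell is reached with probability 0.5 * 0.5 = 0.25, so the
# expected utility is the payoff-weighted sum over all four cells.
u1 = sum(0.5 * 0.5 * payoff for payoff in cells.values())
print(u1)  # 0.0
```

The wins and losses cancel exactly, so each player's expected utility under the 50-50 profile is zero.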
Then the utility that the player would get for playing s-star-i when everybody else plays the strategy profile s-minus-i is at least as big as when he plays this other strategy s-i. So let me say that again: s-star-i is a best response to strategy profile s-minus-i if it's at least as good as anything else, given that everybody else is playing s-minus-i. Now we can say that a strategy profile s is a Nash equilibrium if it's the case that every agent is playing a best response. Incidentally, you might notice that I'm using a set membership operator here rather than an equal sign, which is what you might have expected to see. The reason I don't use an equal sign is that the set of best responses might have more than one thing in it: there might not be only one best response, sometimes there will be multiple best responses. So what I'm saying here is that a strategy is one of the best responses if this condition is true, and a strategy profile is a Nash equilibrium if everybody is playing one of their best responses. Now, this might seem like much ado about nothing. I've introduced this idea of randomizing as a strategy, and I've redefined utility. I've then leveraged that redefinition of utility, which is incidentally what I'm using here when I talk about the utility of a strategy profile, to define best response. I've then leveraged that definition of best response to talk about Nash equilibrium. And in total, I've just ended up kind of saying everything that you've already heard us say. But what does matter is that now that we have a new definition of Nash equilibrium, we're able to state a theorem that we didn't have before. And this is Nash's famous theorem. This is the reason why Nash gets the Nash equilibrium named after him, and it's one of the main reasons why Nash got the Nobel Prize. This theorem actually didn't take very long to prove, but it's a really important thing for game theory.
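These two definitions can be checked mechanically. A minimal self-contained sketch (the helper names are my own): even though the set of deviations is infinite, a player's expected utility is linear in his own probabilities, so a strategy is a best response if and only if no pure deviation does strictly better, and that reduces the equilibrium check to a finite loop.

```python
from itertools import product

def expected_utility(i, profile, utilities):
    """u_i(s): sum over all action profiles a of u_i(a) * Pr(a | s)."""
    total = 0.0
    for a in product(*(s.keys() for s in profile)):
        pr = 1.0
        for j, action_j in enumerate(a):
            pr *= profile[j][action_j]  # Pr(a | s) = product of s_j(a_j)
        total += pr * utilities[a][i]
    return total

def is_nash(profile, utilities, tol=1e-9):
    """True iff every player's strategy is a best response to the rest."""
    for i, s_i in enumerate(profile):
        current = expected_utility(i, profile, utilities)
        for action in s_i:  # pure deviations suffice, by linearity
            pure = {a: 1.0 if a == action else 0.0 for a in s_i}
            deviation = profile[:i] + [pure] + profile[i + 1:]
            if expected_utility(i, deviation, utilities) > current + tol:
                return False
    return True

# Matching pennies (payoffs as in the lecture's example).
u = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
     ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
mix = {"H": 0.5, "T": 0.5}
heads = {"H": 1.0, "T": 0.0}
print(is_nash([mix, mix], u))      # True: 50-50 for both is an equilibrium
print(is_nash([heads, heads], u))  # False: player two would deviate to tails
```

Note that this brute-force check only verifies a candidate profile; finding an equilibrium in the first place is a harder problem.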
And the theorem is that every finite game has a Nash equilibrium. First of all, what is a finite game? This sounds like I'm hedging here, but it's not much of a hedge. A finite game just means that the game takes a finite amount of space to write down: it has a finite number of players, and it has a finite number of actions for every player. That means it has a finite number of utility values in the game, because the number of utility values is determined by the number of players and the number of actions for each player. So as long as the game has a finite number of players (not just two players, but any finite number of players) and a finite number of actions for each player (not just two actions, but maybe a very big game), then no matter what the payoff values are, no matter what strategic situation we're talking about here, no matter what real-world interaction this is modeling, there's going to be at least one Nash equilibrium in this game. That's a pretty deep thing. That's saying there will always be some stable thing that all of the players can do which has the property that, even if they knew what everyone else was doing, none of them would want to change their strategy. That's basically one of the main reasons why we care about this idea of Nash equilibrium: we know that no matter what the game is, we can find such a Nash equilibrium and reason about it. That's why Nash equilibrium is such a powerful thing. And that's only true when we have this fuller definition of Nash equilibrium that we've just defined in terms of strategies. When we talked about Nash equilibrium in terms of just actions before, that's what we'll from this point on refer to as a pure strategy Nash equilibrium. So pure strategy Nash equilibrium is when we do all of this with a's instead of s's. And the sad thing about that is that we don't get a theorem saying that every finite game has one of those.
But this mixed strategy Nash equilibrium always exists. Let's do some examples. So remember matching pennies? Well, we effectively argued at the beginning of this video that matching pennies doesn't have a pure strategy Nash equilibrium. But it does have a mixed strategy Nash equilibrium. It has one, and that is, as I suggested before, for both players to randomize 50-50. That doesn't mean a mixed equilibrium always has to be 50-50; that just happens to be what the Nash equilibrium is here, and it comes from the symmetry of the payoffs. Let's come back to the coordination game. We've previously seen that these two strategy profiles are equilibria. I'm circling outcomes here, but remember, an outcome isn't an equilibrium. One-one isn't the equilibrium; that would be wrong to say. The right thing to say is that left, left is an equilibrium, and right, right is an equilibrium. But it turns out there's another equilibrium here: again, 50-50, 50-50 is a Nash equilibrium here as well. And that's kind of funny, because it doesn't seem like 50-50 is such a good thing to play in this game. But you can confirm for yourself that if player one is randomizing by playing 50-50, then player two can do no better than to randomize 50-50. Now, you'll notice that player two could do just as well by playing something else: if player one is playing 50-50, player two is just as happy to go left all the time. But in particular, if player one is going 50-50, player two can do no better than to go 50-50 himself. The reverse is also true, and that makes 50-50, 50-50 a Nash equilibrium of the coordination game as well. And let's look at the prisoner's dilemma. In the prisoner's dilemma, we've previously seen that this is an equilibrium, and it's an equilibrium in strictly dominant strategies. And we argued before that equilibria in strictly dominant strategies are unique.
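The indifference claim for the coordination game is a quick calculation. A small check under assumed payoffs (1 to each player when the actions match, 0 when they don't): if player one goes left with probability 0.5, both of player two's pure strategies earn the same expected payoff, so player two is indifferent and can do no better than 50-50 himself.

```python
p_left = 0.5  # player one's probability of going left

# Player two's expected payoff from each pure strategy.
u2_left = p_left * 1 + (1 - p_left) * 0   # match on left half the time
u2_right = p_left * 0 + (1 - p_left) * 1  # match on right half the time

print(u2_left, u2_right)  # 0.5 0.5 -- indifferent between left and right
```

Since every mix of left and right also earns 0.5, no deviation improves on 50-50, and by symmetry the same holds for player one, which is exactly what makes the 50-50, 50-50 profile an equilibrium.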
And so what that means is indeed there aren't any mixed Nash equilibria of the prisoner's dilemma. This is in fact the only Nash equilibrium.