Hi folks, welcome back. This is Matt again, and we're going to talk now about repeated games when players discount future payoffs. Let's talk a little more about what that means. When we're looking at discounted repeated games, the idea is that players are playing the same game over and over and over again. But instead of looking at the limit of the means, some limit of the average of the payoffs in the distant future, people value today versus tomorrow differently. So the idea of discounted repeated games is that the future is uncertain. You're often motivated somewhat by what happens today, and you trade off today against the future. So it's not just the infinite future that you care about; you say, I really care about today, and a little bit less about tomorrow. Tomorrow's value is, say, 80% or 90% of what today's value is. And that means that if today is worth 1, then with a discount factor of 0.9 tomorrow is worth 0.9, the next day 0.81, then 0.729, et cetera. So things are decaying exponentially with the discounting. And the idea here is, if I misbehave today, now I have to think about how people are going to react to that. So if we're trying to support cooperative behavior in a prisoner's dilemma, I can behave today, or I can cheat and defect. And if I do that, I'm going to get a temporary gain, and then I'm possibly going to be punished in the future. So the kinds of questions that are going to be important here are: will people want to punish me in the future? Is it going to be in their interest? How much do I care about that? What's my discount? Do I care a lot about the future or just a little bit? So we're looking at a stage game. Again, a stage game is just a normal form game, and we're going to play it repeatedly over time. And now each player has a discount factor.
So player 1 has a discount factor, and so forth. Discount factors are going to be taken to be in [0, 1]. Generally we'll take beta i to be strictly less than 1. And the case of more interest is beta i greater than 0: if it's equal to 0, it means you don't care about the future at all, and it's basically just a one-stage game. So generally the interesting case is going to be when players care somewhat about the future, but they care more about today than tomorrow and so forth. Often in these games, people look at situations with a common discount factor, so everybody has the same discount factor, which will make things fairly easy in some cases. And then the idea of discounting is that the payoff you get from a whole sequence of actions, a profile of actions a_1 played in the first period, a_t in the t-th period, and so forth, is just the sum of these payoffs, but now weighted by an exponentially decreasing function: the discount factor raised to the power of t. So if this payoff was 1 every day, I'd be getting 1 today, plus 0.9, plus 0.81, plus 0.729, et cetera. So that's the idea. OK, so when we look at these games, again, players can condition their play on past history. So a history, a finite history of some length t, is just going to be a list of everything that's happened at every date. So here, a_1 is a profile of what every player did in period 1: the first time we played this game, what did everyone do? And generally, a_t is going to be what everybody did at time t, so we've got a_t,1 through a_t,n. So these things are vectors, and they tell us what everybody did in the first period, what everybody did in the second period, and so forth. And then we can talk about all finite histories, so all possible histories that I could be faced with when I'm playing this game, all the kinds of things I'm going to have to think about. What am I going to do if this happens? What am I going to do if that happens?
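The discounting arithmetic above can be checked with a few lines of code. This is a minimal sketch (names are mine, not from the lecture): a payoff of 1 every period, discounted at beta = 0.9, should sum to the closed form 1/(1 - beta) = 10.

```python
# Numeric check: a payoff of 1 every period, discounted at beta = 0.9.
# The partial sums of 1 + beta + beta^2 + ... approach 1 / (1 - beta).
beta = 0.9
horizon = 500  # truncate the infinite sum; terms beyond this are negligible
discounted_sum = sum(beta**t for t in range(horizon))
closed_form = 1 / (1 - beta)  # geometric series formula
print(discounted_sum, closed_form)
```

With beta = 0.9 both come out at (essentially) 10, matching the "1 plus 0.9 plus 0.81 plus 0.729..." sequence in the text.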
So in an infinitely repeated game, I've got all these histories. What am I going to do in each of these circumstances? A strategy is a map from every possible history into a possibly mixed strategy over what the player can do in the given period, faced with that history. So if we're looking at a prisoner's dilemma, people can either cooperate or defect in a given period. If we're thinking about a history of length 3, one possibility would be the following: we both cooperated in the first period, maybe player 2 defected in the second period, and then both of us defected in the third period. So that would be a possible history. And then they could say, OK, now what are we going to do in the fourth period? Maybe we'll let bygones be bygones and try to get back to cooperation. Maybe we'll just defect; we're angry at each other. Who knows? OK, so a strategy for the fourth period specifies what you do after you've seen each of the different possible histories of the first three periods. So subgame perfection, again, same as usual: a profile of strategies that are Nash in every subgame. What is a subgame here? A subgame just starts in some period and consists of everything that remains. So the strategies have to form a Nash equilibrium following every possible history: if you take some history and start at that point, play has to be Nash forever on. So strategies now are going to be specifications of what we would do in every situation, and then we need Nash play after every history. One thing to check, and it's important here: repeatedly playing a Nash equilibrium of the stage game, so just find a static Nash equilibrium of whatever game it is, for instance defect in the prisoner's dilemma, and play that forever no matter what's happened in the past, is always going to be subgame perfect. So for every possible history, everybody plays the static Nash equilibrium forever going forward. You can check that that's a subgame perfect equilibrium; it's going to be Nash in every possible subgame.
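The "defect forever if anyone has ever defected" idea discussed below (the grim trigger) can be written directly as a strategy in the sense just defined: a map from a history of past action profiles to this period's action. This is an illustrative sketch; the function name and encoding are assumptions, not from the lecture.

```python
# A strategy as a map from histories to actions, per the definition above.
# A history is a list of action profiles: one tuple with a "C" (cooperate)
# or "D" (defect) entry per player, one tuple per past period.
def grim_trigger(history):
    """Cooperate if every player cooperated in every past period; else defect."""
    for profile in history:
        if "D" in profile:
            return "D"  # someone defected at some point: punish forever
    return "C"          # clean history: keep cooperating
```

On the length-3 history from the text, (C, C), then (C, D), then (D, D), this strategy plays "D" in the fourth period; on the empty history it plays "C".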
So check: if everyone else is doing that, I wouldn't want to deviate. Think a little bit about the logic of that, because there are a lot of possible subgames to consider, but you can convince yourself that it's true. OK, so let's solve the repeated prisoner's dilemma, thinking about it now in the context of discounting. Suppose that what we want to do is sustain cooperation. We've got our standard prisoner's dilemma, with payoffs of 3, 3 for both cooperating, 5 and 0 when you defect and the other person cooperates, and 1, 1 if you both defect. So the only Nash equilibrium of the static game is defect-defect with payoffs 1, 1. We want to support 3, 3 if we can. So: cooperate as long as everyone has in the past, and defect forever in the future if anyone deviates. When is this an equilibrium? Clearly it isn't always: if we set beta i equal to 0 for both players, we can't make this work, right? Because nobody cares about the future. Then we'd end up with defect-defect in every period as the only subgame perfect equilibrium. Players only care about the present; they're always just going to myopically defect. So nothing's going to work. The question here is, for which betas can we sustain this kind of strategy, which is cooperate as long as everyone has, and if cooperation breaks down, then we just say forget it, we're going to defect forever after? OK, let's have a peek. If you cooperate and the other player is cooperating, and no one has failed to cooperate in the past, what do we get? We get 3 in perpetuity, right? So, taking a common discount factor for now, we get 3, plus beta times 3, plus beta squared times 3 in the third period, beta cubed times 3 in the fourth, and so forth. So in perpetuity, if you remember your sums of series, the value of that is just 3 over (1 minus beta).
OK, what happens if I defect while people are playing this grim trigger strategy? Well, the other person is cooperating in the first period. So if I switch from cooperate to defect, I'm going to get a 5 in the first period. But then they're going to see that, and in the next period they react to it: they defect, and everybody defects forever after. So then in perpetuity we get a bunch of 1s, right? So what do we get? We get 5, and then beta times 1, beta squared times 1, and so forth. And if you remember your sums of series, that tail is just beta times (1 plus beta plus beta squared and so forth), which is beta times 1 over (1 minus beta). So if I deviate, what happens is in the first period I get a gain, but then I lose in the subsequent periods. So there's a trade-off, and how big that trade-off is depends on the size of the discount factor. So we've got these two different payoffs, and we can look at the difference between them. If I stay cooperating instead of defecting, I'm giving up the 2 I could gain today by defecting. But then I keep the benefits of cooperation in the future: I don't ruin things, and that means I'm getting a bunch of extra 2s in the future. And so when you look at this, the value of the difference is beta times 2 over (1 minus beta), minus the 2 I'm forgoing today. When do I want to keep cooperating? As long as this is not negative, right? If this becomes negative, then I'm worse off cooperating and I might as well just defect. So the difference is non-negative if beta is greater than or equal to 1 minus beta, or basically beta has to be greater than or equal to one half. If you just go through the algebra of solving this inequality, you'll get beta greater than or equal to one half. So as long as people care about tomorrow at least half as much as today, they're going to be willing to cooperate in the repeated prisoner's dilemma with these particular payoffs that we looked at before.
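The two payoff streams being compared can be sketched in a few lines, using the stage payoffs from the text (3 for mutual cooperation, 5 from a one-shot deviation, 1 forever after under the punishment). The function name is mine, for illustration only.

```python
# The two streams under grim trigger with payoffs 3 (cooperate), 5 (deviate),
# 1 (mutual defection), valued at common discount factor beta.
def stream_values(beta):
    cooperate = 3 / (1 - beta)           # 3 every period, in perpetuity
    deviate = 5 + beta * 1 / (1 - beta)  # 5 today, then 1 in perpetuity
    return cooperate, deviate
```

At beta = 0.5 the two streams are both worth exactly 6, the knife-edge case; below one half deviating is strictly better, and above one half cooperating is strictly better, matching the threshold derived above.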
So with this payoff structure, if each beta i is at least one half, then the players can sustain cooperation in this infinitely repeated prisoner's dilemma. Okay, so let's change the numbers a little bit and see what happens. Now let's make defection a little more attractive: instead of 5, we'll make it worth 10 to defect. So now defection looks really attractive. What has to happen? Well, we can go through the same exact calculations we just did, but with the new numbers. Cooperating in perpetuity is still worth 3 over (1 minus beta); the only difference is we're getting a higher number from deviating, and then we're still going back to defect forever. So there's a little bit more temptation today. And when you do the differences here, you'll get the same kind of thing, except now instead of forgoing 2, you're forgoing 7 units by not defecting today. So when you go through and solve for that, now beta has to be at least seven-ninths before players are going to be willing to cooperate. You have to care about tomorrow at least seven-ninths as much as today, okay? And so you can see the basic logic here, right? The trade-off is punishments tomorrow versus a good payoff today. And what determines whether or not something can hold together as an equilibrium? We have to know: how big is the future versus the present? How tempting is the deviation versus what we're doing in the current period? How big is the threat, that is, how bad is whatever we're resorting to in the future? All these things are going to matter for holding cooperation together in these kinds of settings. And that gets back to the discussion we had a little earlier about, say, OPEC, right? There's a temptation to pump more oil today.
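Both calculations are instances of one general pattern under the same grim trigger logic. Writing c for the cooperation payoff, d for the one-shot deviation payoff, and p for the punishment payoff, cooperation holds when beta times (c minus p) over (1 minus beta) is at least d minus c, that is, beta at least (d minus c)/(d minus p). This sketch (the function name and parameterization are mine, not the lecture's) checks that it reproduces both thresholds:

```python
# Minimum common discount factor for cooperation under grim trigger:
#   gain today from deviating:        d - c
#   loss per future period:           c - p
#   condition: beta*(c - p)/(1 - beta) >= d - c  =>  beta >= (d - c)/(d - p)
def grim_trigger_threshold(c, d, p):
    return (d - c) / (d - p)
```

With the first payoffs (c = 3, d = 5, p = 1) this gives one half; with the more tempting deviation (d = 10) it gives seven-ninths, matching the two cases worked through above.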
How much do you care about the future? What's your beta? What's the reaction going to be? If I start pumping more oil, how are they going to react to that? Are they going to start pumping more oil and drive the price down? How much is that going to hurt me? All those things matter, and they determine whether an equilibrium can hang together or not. Okay, so the basic logic: play something with relatively high payoffs, and even if it's not an equilibrium of the static game, you can sustain it. You sustain it by having punishments: if anyone deviates, you resort to something that has lower payoffs, at least for that player. And the important thing is that it all has to be credible. It has to be an equilibrium in the subgame going forward in order to make that work. And the lower payoffs in the future have to be enough to deter people from deviating in the present.