How Game Theory can help in environmental treaties

September 7, 2021

The bank robbery that you planned with your best friend, Lloyd Shapley, did not go as well as you both expected. Several things went wrong after you hid the stolen money, and as a result you both ended up in the police station. You are each taken to a separate interrogation room, and the police officers are not sure who was the mastermind behind the robbery. As these officers were keen economics students in their younger years, they decide to give each of you two choices: either you keep silent or you betray your best friend. What should you do?

To illustrate the problem better, let us put some numbers on it. If you betray your friend but he keeps silent, he will go to jail for 10 years and you will be free (and vice versa). If you both keep silent, you will both be in jail for 3 years. Finally, if you both betray each other, you will both be in jail for 8 years. The payoff table below summarizes this dilemma. In each cell, the first number is the number of years in prison for you and the second number is the number of years for Lloyd Shapley, given that combination of actions.

| You \ Lloyd Shapley | Betray | Keep silent |
| --- | --- | --- |
| Betray | 8, 8 | 0, 10 |
| Keep silent | 10, 0 | 3, 3 |

How do we find the optimal strategy? Consider your choices, given that Lloyd plays ‘Betray’. It is then optimal to betray as well (it gives you 8 years of prison time instead of the 10 you would get by choosing ‘Keep silent’). Given that Lloyd plays ‘Keep silent’, it is again optimal to play ‘Betray’ (it gives you 0 years of prison time instead of the 3 you would get by choosing ‘Keep silent’). ‘Betray’ is therefore your dominant strategy: it is the best choice whatever Lloyd does. As the game is symmetric (both players face the same payoffs), ‘Betray’ is also the dominant strategy for Lloyd. So, the intersection of the dominant strategies yields a so-called Nash equilibrium where both players betray each other.
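The reasoning above can be checked mechanically. The following is a minimal Python sketch, not part of the original article: the names `YEARS`, `best_response` and `dominant_strategy` are made up for illustration, and the numbers are the prison terms from the example. Because the payoffs are years in prison, lower is better.

```python
# Years in prison as (you, friend), indexed by (your action, friend's action).
# Numbers are taken from the example in the text; lower is better.
YEARS = {
    ("Betray", "Betray"): (8, 8),
    ("Betray", "Keep silent"): (0, 10),
    ("Keep silent", "Betray"): (10, 0),
    ("Keep silent", "Keep silent"): (3, 3),
}
ACTIONS = ("Betray", "Keep silent")


def best_response(friend_action):
    """Return your action that minimises your own prison time
    against a fixed action of your friend."""
    return min(ACTIONS, key=lambda a: YEARS[(a, friend_action)][0])


def dominant_strategy():
    """Return the action that is a best response to every action
    of your friend, or None if no single action qualifies."""
    responses = {best_response(b) for b in ACTIONS}
    return responses.pop() if len(responses) == 1 else None


if __name__ == "__main__":
    for b in ACTIONS:
        print(f"If your friend plays {b!r}, your best response is {best_response(b)!r}")
    print("Dominant strategy:", dominant_strategy())
```

Since the game is symmetric, running the same check from your friend's point of view gives the same answer, which is why both players end up betraying each other.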

In the previous paragraph, we discussed the classic example of the prisoner’s dilemma. It arises in many forms in real life. Think for example of two store owners who each need to decide whether to spend money on advertising, or of countries deciding whether to sign environmental treaties. However, the outcome is slightly disappointing, as both keeping silent would have yielded a better outcome for both players. This is partly because it is a one-shot game (there is no next period in which the game is repeated). Deviating from the social optimum goes unpunished because there are no repercussions in the future. But how does this extend to a multi-period game where players can choose to cooperate or maximize their own wellbeing, and where deviation from the social optimum can be punished in future periods?

Cooperating in environmental treaties

We now discuss a multi-period model where two countries have to set pollution ceilings (for instance, think of the amount of CO2 emitted in the coming year). They can choose to stick to their usual policy and emit 20 units or choose to participate in reducing emissions and emit zero units. Let us also define a payoff table for each year, where the two numbers in a certain cell correspond to the welfare associated with a certain course of action of country 1 and country 2, respectively.

| Country 1 \ Country 2 | Emit 20 | Emit 0 |
| --- | --- | --- |
| Emit 20 | 15, 15 | 35, 0 |
| Emit 0 | 0, 35 | 25, 25 |

Because emissions are a so-called public bad (country 1 is bothered by emissions from country 2 and vice versa), it is not optimal for country 1 to cut emissions given that country 2 does not cut emissions. In that case, country 2 benefits from the cut in emissions by country 1 without having to partake in costly emission-reducing activities (and vice versa). If the game is only played once, the dominant strategy for both is to play ‘Emit 20’. However, the social optimum is reached when both countries play ‘Emit 0’, as this yields a total welfare of 50.
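As a rough check of the one-shot emissions game, the sketch below enumerates all four action pairs, keeps those from which neither country gains by changing only its own action (the Nash equilibria), and separately finds the pair with the highest total welfare. The function names are illustrative assumptions; the welfare numbers are those of the payoff table, and here higher is better.

```python
from itertools import product

# Welfare as (country 1, country 2), indexed by
# (action of country 1, action of country 2). Higher is better.
WELFARE = {
    ("Emit 20", "Emit 20"): (15, 15),
    ("Emit 20", "Emit 0"): (35, 0),
    ("Emit 0", "Emit 20"): (0, 35),
    ("Emit 0", "Emit 0"): (25, 25),
}
ACTIONS = ("Emit 20", "Emit 0")


def is_nash(a1, a2):
    """True if neither country can raise its own welfare
    by changing only its own action."""
    w1, w2 = WELFARE[(a1, a2)]
    no_gain_1 = all(WELFARE[(d, a2)][0] <= w1 for d in ACTIONS)
    no_gain_2 = all(WELFARE[(a1, d)][1] <= w2 for d in ACTIONS)
    return no_gain_1 and no_gain_2


def nash_equilibria():
    """List every pure-strategy Nash equilibrium of the one-shot game."""
    return [cell for cell in product(ACTIONS, repeat=2) if is_nash(*cell)]


def social_optimum():
    """Return the action pair with the highest total welfare."""
    return max(WELFARE, key=lambda cell: sum(WELFARE[cell]))


if __name__ == "__main__":
    print("Nash equilibria:", nash_equilibria())
    best = social_optimum()
    print("Social optimum:", best, "with total welfare", sum(WELFARE[best]))
```

The only equilibrium found is the one where both countries emit 20 units, while the welfare-maximising pair is the one where both emit 0: the same gap between individual and collective interest as in the prisoner's dilemma.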

Now we look at the situation in which both countries have to make their choice every year. In the second period they make their choice having observed the choices made by both countries in period 1. They could sign a treaty in which both agree to emit 0 units (yielding each of them a welfare of 25 units per period). However, if one country deviates from this strategy and emits 20 units, the ‘bad’ equilibrium, in which both countries get 15 units of welfare, is played in all following periods. Whether this is a stable treaty depends, among other things, on the extent to which countries value welfare obtained in future periods. So, we introduce a discount factor, p, between 0 and 1. Now we can derive the condition on p for a stable treaty.

First, if a country never deviates from the treaty, it will get a total welfare of:

25 + 25p + 25p^2 + 25p^3 + …

This is a geometric series, so for 0 <= p < 1 the sum is equal to 25/(1-p). Now, suppose a country deviates in a single period. It will get 35 units of welfare in that period (which is attractive in the short run, as this is higher than the 25 units obtained when cooperating), and it will get 15 units of welfare in all later periods, as the countries then reach the ‘bad’ equilibrium where both play ‘Emit 20’. Total welfare will then be equal to:

35 + 15p + 15p^2 + 15p^3 + … = 35 + 15p/(1-p)

Now we can set up the condition for a stable treaty. The treaty is stable if and only if deviating from the treaty doesn’t yield a higher total welfare for a country, i.e.:

25/(1-p) >= 35 + 15p/(1-p)

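Solving for p, we obtain the condition for stability. Multiplying both sides by (1-p) gives 25 >= 35(1-p) + 15p = 35 - 20p, so 20p >= 10 and therefore p >= 1/2. To put this result into words, deviating from the treaty is only worth it if the discount factor is small (in this case smaller than 50%), that is, if a country cares little about future welfare. For people interested in game theory and its applications: the result derived above is a simple instance of what is known as the Folk Theorem for repeated games.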
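The stability condition can also be verified numerically. The sketch below is an illustration under the article's assumptions (25 units per period when cooperating, 35 in the period of deviation, 15 in every period after it); the function and parameter names are hypothetical. It evaluates both closed-form sums for a given p and computes the threshold in general form, (35 - 25)/(35 - 15).

```python
def cooperate_value(p, coop=25.0):
    """Discounted welfare from sticking to the treaty forever: coop / (1 - p)."""
    return coop / (1.0 - p)


def deviate_value(p, temptation=35.0, punishment=15.0):
    """Discounted welfare from deviating once and then receiving the
    'bad' equilibrium payoff in every later period."""
    return temptation + punishment * p / (1.0 - p)


def treaty_is_stable(p):
    """The treaty is stable when cooperating is at least as good as deviating."""
    return cooperate_value(p) >= deviate_value(p)


def critical_discount_factor(coop=25.0, temptation=35.0, punishment=15.0):
    """Smallest p for which the treaty is stable:
    (temptation - coop) / (temptation - punishment)."""
    return (temptation - coop) / (temptation - punishment)


if __name__ == "__main__":
    print("Critical discount factor:", critical_discount_factor())
    for p in (0.3, 0.4, 0.6, 0.9):
        print(f"p = {p}: cooperate = {cooperate_value(p):.2f}, "
              f"deviate = {deviate_value(p):.2f}, stable = {treaty_is_stable(p)}")
```

Writing the threshold in terms of the three payoffs shows what drives the result: a larger one-off gain from deviating raises the required p, while a harsher punishment payoff lowers it.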

The Tragedy of the Commons

In 1968, Garrett Hardin published a paper dealing with another problem that is very famous in game theory. Imagine countries having to decide how much CO2 they want to emit in the coming year. No country wants total emissions to be too high, as a high concentration has large consequences for our planet. But when a country takes into account only its own emissions and the consequences for itself, emitting a bit more CO2 does not seem so harmful. Yet, if each country thinks like this, every country ends up emitting too much. In this case, managing a public bad is difficult, as each country considers only its own marginal benefit instead of the public marginal benefit.

As we have seen in the examples above, cooperation is key in achieving socially optimal outcomes. So, the next time you are captured by the police and under interrogation, do think twice and apply the lessons learned above!


Hardin, Garrett. “The Tragedy of the Commons.” Science 162, no. 3859 (1968): 1243-1248.

Mas-Colell, Andreu, Michael Whinston, and Jerry Green. Microeconomic Theory. Oxford University Press, 1995.

For students interested in Environmental and Resource Economics, I highly recommend the bachelor course taught by Pim Heijnen.

This article is written by Simon Elgersma

