Let’s use our imagination to take a magical trip to Candyland. In Candyland, there are many tribes. Each day these tribes compete against one another for a supply of six pieces of candy, distributed by the almighty Candygod. Each tribe can choose either to cooperate and give the other tribe the opportunity to take some candy, or to defect and take as much candy as possible for itself. If both tribes cooperate, they both receive three pieces of candy. If one tribe cooperates while the other one defects, the defecting tribe receives five pieces while destroying one piece in its eagerness, leaving nothing for the cooperating tribe. If both tribes decide to defect, a horrific candy battle breaks out, leaving only one piece of candy for each tribe. This situation leads to a great divide within the tribes. Some people propose to cooperate with the other tribe, while others shout: ‘Don’t be so naive, don’t you know that nice guys finish last!’
Candy for (tribe A, tribe B) | B cooperates | B defects |
A cooperates | 3,3 | 0,5 |
A defects | 5,0 | 1,1 |
You will probably recognize this situation immediately as a case of the classic Prisoner’s Dilemma. Mutual cooperation maximizes the total amount of candy distributed, but in a single game it is rational for every tribe to defect, as this always yields more candy than cooperating: 5>3 if the other tribe cooperates and 1>0 if it defects. However, in Candyland we are not dealing with a single game: each day six new pieces of candy are distributed by the Candygod, creating a situation called the iterated Prisoner’s Dilemma. Could this change the optimal strategy for tribes in Candyland?
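The single-game reasoning can be checked mechanically. The sketch below encodes the payoff table from this article and tests whether defecting pays more whatever the other tribe does. The names `PAYOFF`, `my_payoff`, `defect_dominates` and `total_candy` are illustrative choices, not part of any library.

```python
# Payoffs as (my candy, other tribe's candy), copied from the table above.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def my_payoff(mine, theirs):
    """Candy I receive when I play `mine` and the other tribe plays `theirs`."""
    return PAYOFF[(mine, theirs)][0]

def defect_dominates():
    """True if defecting beats cooperating against every possible opposing move."""
    return all(my_payoff("D", other) > my_payoff("C", other) for other in "CD")

def total_candy(mine, theirs):
    """Candy handed out to both tribes together."""
    return sum(PAYOFF[(mine, theirs)])

print(defect_dominates())             # True
print(max(PAYOFF, key=lambda moves: total_candy(*moves)))  # ('C', 'C')
```

Defection wins for each tribe individually, while mutual cooperation hands out the most candy overall, which is exactly the tension of the dilemma.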
Tournament 1
In 1980, Robert Axelrod hosted a computer tournament to find out. Anybody could submit a strategy for this iterated Prisoner’s Dilemma game. Each strategy played a game of 200 rounds against every other strategy. Fourteen strategies were sent in, and they competed together with the 50-50 Random strategy, which picks defect or cooperate at random with equal odds. Note that when playing against 50-50 Random, always defecting is the best strategy, just as in the single game. Defecting against 50-50 Random has no consequences, because your opponent does not become more likely to defect against you in the future. Against strategies that do respond to the opponent, however, the best response changes drastically.
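To see why defecting is best against 50-50 Random, compare the expected candy per round for each move. This is a small sketch using the article's payoff table. The function name `expected_vs_random` and the `p_coop` parameter are assumptions made for illustration.

```python
from fractions import Fraction

# My candy for (my move, opponent's move), from the article's payoff table.
MY_PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def expected_vs_random(my_move, p_coop=Fraction(1, 2)):
    """Expected candy per round against an opponent who cooperates with probability p_coop."""
    return p_coop * MY_PAYOFF[(my_move, "C")] + (1 - p_coop) * MY_PAYOFF[(my_move, "D")]

print(expected_vs_random("D"))  # 3
print(expected_vs_random("C"))  # 3/2
```

Because Random ignores what you do, the same comparison holds in every round, so defecting throughout is optimal against it.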
Tit for Tat
The winner of this contest, achieving the highest average score against the other strategies, is a strategy named Tit for Tat. The strategy is surprisingly simple: it cooperates in the first round and afterwards copies what the opponent did in the last round. So how can this strategy be so effective? Well, first of all it instantly punishes defectors that try to take advantage of Tit for Tat. Secondly, it is forgiving, immediately switching to cooperating again if the opponent starts cooperating. Thereby, it creates an incentive for the opponent to stop defecting (D) and instead cooperate (C) the rest of the game.
Round | 1 | 2 | 3 | 4 | 5 | Total score |
Tit for Tat | C | C | D | C | C | 11 |
Opponent | C | D | C | C | D | 16 |
As you might notice, Tit for Tat will never beat its opponent in a head-to-head game. Depending on the last move of its opponent, it will either tie or lose. This is the opposite of Always Defect, a strategy that either ties or wins. However, this doesn’t matter. The winner of the contest is the strategy with the most points in the end, not the one that beats the most opponents head-to-head.
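The five-round example can be reproduced with a few lines of code. This is a minimal sketch: the opponent's moves are simply scripted as in the table, and `play_against_script` is a hypothetical helper written for this article.

```python
# Payoffs as (my candy, opponent's candy), from the table at the top of the article.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]

def play_against_script(strategy, script):
    """Play `strategy` against a fixed sequence of opponent moves."""
    my_moves, my_score, opp_score = "", 0, 0
    for round_index, opp_move in enumerate(script):
        my_move = strategy(script[:round_index])  # the strategy sees earlier rounds only
        mine, theirs = PAYOFF[(my_move, opp_move)]
        my_moves += my_move
        my_score += mine
        opp_score += theirs
    return my_moves, my_score, opp_score

print(play_against_script(tit_for_tat, "CDCCD"))  # ('CCDCC', 11, 16)
```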
Be nice!
A much less forgiving tactic than Tit for Tat is a strategy we will call Grudger. This strategy cooperates until the opponent defects and then becomes so pissed off that it defects for the rest of the game. Below is an example of a game between Grudger and the same opponent as in the previous example.
Round | 1 | 2 | 3 | 4 | 5 | Total score |
Grudger | C | C | D | D | D | 14 |
Opponent | C | D | C | C | D | 9 |
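Grudger's rule fits in one line of code. The sketch below replays the game from the table against the same scripted opponent. As before, `play_against_script` is a hypothetical helper rather than an existing library function.

```python
# Payoffs as (my candy, opponent's candy), from the table at the top of the article.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def grudger(opponent_history):
    """Cooperate until the opponent has defected once, then defect forever."""
    return "D" if "D" in opponent_history else "C"

def play_against_script(strategy, script):
    """Play `strategy` against a fixed sequence of opponent moves."""
    my_moves, my_score, opp_score = "", 0, 0
    for round_index, opp_move in enumerate(script):
        my_move = strategy(script[:round_index])  # the strategy sees earlier rounds only
        mine, theirs = PAYOFF[(my_move, opp_move)]
        my_moves += my_move
        my_score += mine
        opp_score += theirs
    return my_moves, my_score, opp_score

print(play_against_script(grudger, "CDCCD"))  # ('CCDDD', 14, 9)
```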
What Tit for Tat and Grudger have in common is that they will never be the first to defect. In a way you could call them ‘nice’ strategies, as opposed to the ‘rude’ strategies that do sometimes defect first. As can be seen in the figure below, being nice seems to be a good trait to have in this tournament, with the entire top 8 being nice. Note that Grudger is the lowest-scoring nice strategy: its lack of forgiveness often leads to endless mutual defection against rude strategies.
So is Tit for Tat always the best thing to do? Not necessarily. A weakness of Tit for Tat is exposed by one of the rude strategies, which we will call Joss. This strategy works like Tit for Tat, but occasionally sneaks in an extra defection to try to exploit the opponent. Let’s look at an example of Tit for Tat against Joss, where Joss ‘tries out’ an unprovoked defection at moves 1 and 6. Because each strategy copies the other’s previous move, every defection keeps echoing back and forth, and games between the two will eventually end in mutual defection, leaving both with a low score.
Round | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total score |
Tit for Tat | C | D | C | D | C | D | D | 12 |
Joss | D | C | D | C | D | D | D | 17 |
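The echo effect is easy to reproduce once both players react to each other. In this sketch Joss's extra defections are scripted at rounds 1 and 6, as in the table, rather than drawn at random, so the result is repeatable. The names `make_scripted_joss` and `play` are hypothetical helpers.

```python
# Payoffs as (player A's candy, player B's candy), from the article's payoff table.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opponent_history, round_number):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]

def make_scripted_joss(extra_defections):
    """Tit for Tat that also defects in the given round numbers (an assumption for this example)."""
    def joss(opponent_history, round_number):
        if round_number in extra_defections:
            return "D"
        return tit_for_tat(opponent_history, round_number)
    return joss

def play(strategy_a, strategy_b, rounds):
    """Let two strategies play each other; both choose simultaneously each round."""
    moves_a, moves_b, score_a, score_b = "", "", 0, 0
    for round_number in range(1, rounds + 1):
        move_a = strategy_a(moves_b, round_number)
        move_b = strategy_b(moves_a, round_number)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        moves_a += move_a
        moves_b += move_b
        score_a += gain_a
        score_b += gain_b
    return (moves_a, score_a), (moves_b, score_b)

joss = make_scripted_joss({1, 6})
print(play(tit_for_tat, joss, 7))  # (('CDCDCDD', 12), ('DCDCDDD', 17))
```

From round 6 onward both players are copying a defection, so extending the game only adds more rounds of one candy each.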
Be even nicer!
One strategy that could have prevented this doom scenario of mutual defection against Joss is called Forgiving Tit for Tat. This strategy is even nicer than the regular Tit for Tat: it only defects after the opponent has defected twice in a row. It thereby prevents the echo effects that result in endless mutual defection. It turns out that this strategy could even have won the tournament had one of the contestants submitted it: the gains outweigh the losses that result from sometimes being taken advantage of. Below is an example of this tactic against the same Joss player as in the previous example.
Round | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total score |
Forgiving Tit for Tat | C | C | C | C | C | C | C | 15 |
Joss | D | C | C | C | C | D | C | 25 |
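The forgiving variant changes only the retaliation rule. The sketch below plays it against a Joss whose extra defections are scripted at rounds 1 and 6, matching the table. The helper names `make_scripted_joss` and `play` are assumptions introduced for this example.

```python
# Payoffs as (player A's candy, player B's candy), from the article's payoff table.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def forgiving_tit_for_tat(opponent_history, round_number):
    """Defect only after two opponent defections in a row."""
    return "D" if opponent_history[-2:] == "DD" else "C"

def make_scripted_joss(extra_defections):
    """Copy the opponent's previous move, but defect in the given rounds as well."""
    def joss(opponent_history, round_number):
        if round_number in extra_defections or opponent_history[-1:] == "D":
            return "D"
        return "C"
    return joss

def play(strategy_a, strategy_b, rounds):
    """Let two strategies play each other; both choose simultaneously each round."""
    moves_a, moves_b, score_a, score_b = "", "", 0, 0
    for round_number in range(1, rounds + 1):
        move_a = strategy_a(moves_b, round_number)
        move_b = strategy_b(moves_a, round_number)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        moves_a += move_a
        moves_b += move_b
        score_a += gain_a
        score_b += gain_b
    return (moves_a, score_a), (moves_b, score_b)

joss = make_scripted_joss({1, 6})
print(play(forgiving_tit_for_tat, joss, 7))  # (('CCCCCCC', 15), ('DCCCCDC', 25))
```

Forgiving Tit for Tat loses this game by a wider margin than Tit for Tat would, yet it collects more candy in total because cooperation resumes after each isolated defection.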
Tournament 2
From this first tournament it became clear that success in an iterated Prisoner’s Dilemma game depends not only on your own strategy’s characteristics, but also on the strategies your competitors use. For this reason, a second tournament was hosted. In this tournament, it wasn’t known beforehand which round would be the last, removing the advantage of strategies that always defect in the predetermined last round(s). Additionally, all contestants now knew the results of tournament 1. You could therefore argue that the level of sophistication and rationality was much higher than in the first tournament. Out of 62 strategies sent in for the second tournament, the winner was again Tit for Tat! All contestants knew about its success in the first tournament, but nobody was able to design an entry that did any better.
What about Candyland?
So what do these results mean for the various tribes in Candyland? Let’s assume that we are dealing with a reproduction scenario: if a tribe with a certain strategy earns a lot of candy, it will produce more tribe members using the same strategy. So which strategy will dominate here? Well, if all other tribes use an aggressive strategy like Always Defect, it is hard for a nice strategy like Tit for Tat to reproduce itself, as it will simply get exploited. However, if there are a couple of Tit for Tats, they could gain more from cooperating with each other than they lose from being exploited by rude strategies, similarly to how they won in the two tournaments. By reproducing themselves they would eventually displace all defectors, thereby showing (in contrast to the real world, corny sayings are still allowed in Candyland) how contagious kindness can be.
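The reproduction argument can be sketched as a toy simulation. Everything beyond the payoff table is an assumption made for illustration: only Tit for Tat and Always Defect exist, each game lasts 200 rounds, a tribe meets opponents in proportion to their share of the population, and a strategy's share grows in proportion to its average score.

```python
# Payoffs as (player A's candy, player B's candy), from the article's payoff table.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opponent_history):
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return "D"

def match(strategy_a, strategy_b, rounds):
    """Total scores of two strategies after playing `rounds` rounds against each other."""
    moves_a, moves_b, score_a, score_b = "", "", 0, 0
    for _ in range(rounds):
        move_a, move_b = strategy_a(moves_b), strategy_b(moves_a)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        moves_a += move_a
        moves_b += move_b
        score_a += gain_a
        score_b += gain_b
    return score_a, score_b

def evolve(tft_share, generations, rounds=200):
    """Return Tit for Tat's population share after the given number of generations."""
    tft_vs_tft, _ = match(tit_for_tat, tit_for_tat, rounds)
    tft_vs_alld, alld_vs_tft = match(tit_for_tat, always_defect, rounds)
    alld_vs_alld, _ = match(always_defect, always_defect, rounds)
    for _ in range(generations):
        fit_tft = tft_share * tft_vs_tft + (1 - tft_share) * tft_vs_alld
        fit_alld = tft_share * alld_vs_tft + (1 - tft_share) * alld_vs_alld
        mean_fit = tft_share * fit_tft + (1 - tft_share) * fit_alld
        tft_share = tft_share * fit_tft / mean_fit  # share grows with relative score
    return tft_share

print(evolve(0.01, 200))   # a 1% minority of Tit for Tat grows
print(evolve(0.001, 200))  # a 0.1% minority shrinks
```

Under these assumptions Tit for Tat out-earns Always Defect as soon as its share exceeds 1/397, roughly 0.25% of the population, which is why a small cluster of nice tribes is enough to get started while a lone one is not.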
What about the real world?
Iterated Prisoner’s Dilemmas can be found in many human interactions as well as interactions in nature. Think, for example, about countries deciding on whether to cut their CO2 emissions, something from which everyone profits, or be selfish and hope others will cut more. Or take vampire bats after a successful night of hunting either eating all their food or sharing it with others, knowing that the next night they might be less lucky themselves. These problems are often a bit more complex than the one described in this article, as usually some uncertainty is involved. However, the point remains that deciding to cooperate does not necessarily require you to be extremely altruistic. For a self-interested human, vampire bat or Candyland tribe, it can often be the best thing to do, showing indeed that nice guys finish first.
Resources:
Robert Axelrod, The Evolution of Cooperation, 1984
This article was written by Sjors Keet