The prisoner's dilemma is a fundamental problem in game theory that
demonstrates why two people might not cooperate even if it is in
both their best interests to do so. It was originally framed by
Merrill Flood and Melvin Dresher working at RAND in 1950. Albert W.
Tucker formalized the game with prison sentence payoffs and gave it
the "prisoner's dilemma" name (Poundstone, 1992).
In its classical form, the prisoner's dilemma ("PD") is presented
as follows:
 Two suspects are arrested by the police. The police have
insufficient evidence for a conviction, and, having separated both
prisoners, visit each of them to offer the same deal. If one
testifies (defects from the other) for the prosecution against the
other and the other remains silent (cooperates with the other), the
betrayer goes free and the silent accomplice receives the full
10-year sentence. If both remain silent, both prisoners are
sentenced to only six months in jail for a minor charge. If each
betrays the other, each receives a five-year sentence. Each
prisoner must choose to betray the other or to remain silent. Each
one is assured that the other would not know about the betrayal
before the end of the investigation. How should the prisoners
act?
If we assume that each player cares only about minimizing his or
her own time in jail, then the prisoner's dilemma forms a
non-zero-sum game in which two players may each cooperate with or
defect from (betray) the other player. In this game, as in all game
theory, the only concern of each individual player (prisoner) is
maximizing his or her own payoff, without any concern for the other
player's payoff. The unique equilibrium for this game is a
Pareto-suboptimal solution, that is, rational choice leads the two
players to both play defect, even though each player's individual
reward would be greater if they both played cooperatively.
In the classic form of this game, cooperating is
strictly dominated by defecting, so
that the only possible
equilibrium
for the game is for all players to defect. No matter what the other
player does, one player will always gain a greater payoff by
playing defect. Since in any situation playing defect is more
beneficial than cooperating, all
rational players will play defect, all
things being equal.
In the
iterated prisoner's dilemma, the game is
played repeatedly. Thus each player
has an opportunity to punish the other player for previous
noncooperative play. If the number of steps is known by both
players in advance, economic theory says that the two players
should defect again and again, no matter how many times the game is
played. However, this analysis fails to predict the behavior of
human players in a real iterated prisoner's dilemma situation, and
it also fails to predict the optimum algorithm when computer
programs play in a tournament. Only when the players play an
indefinite or random number of times can cooperation be an
equilibrium (technically, a subgame perfect equilibrium). Even
then, both players always defecting remains an equilibrium, and
there are many other equilibrium outcomes. In this case, the
incentive to defect can be overcome by the threat of punishment.
In casual usage, the label "prisoner's dilemma" may be applied to
situations not strictly matching the formal criteria of the classic
or iterative games, for instance, those in which two entities could
gain important benefits from cooperating or suffer from the failure
to do so, but find it merely difficult or expensive, not
necessarily impossible, to coordinate their activities to achieve
cooperation.
Strategy for the classical prisoner's dilemma
The classical prisoner's dilemma can be summarized thus:

                        | Prisoner B Stays Silent | Prisoner B Betrays
Prisoner A Stays Silent | Each serves 6 months    | Prisoner A: 10 years; Prisoner B: goes free
Prisoner A Betrays      | Prisoner A: goes free; Prisoner B: 10 years | Each serves 5 years
In this game, regardless of what the opponent chooses, each player
always receives a higher payoff (lesser sentence) by betraying;
that is to say that betraying is the strictly dominant strategy.
For instance, Prisoner A can accurately say, "No matter what
Prisoner B does, I personally am better off betraying than staying
silent. Therefore, for my own sake, I should betray." However, if
the other player acts similarly, then they both betray and both get
a lower payoff than they would get by staying silent. Rational
self-interested decisions result in each prisoner being worse off
than if each chose to lessen the sentence of the accomplice at the
cost of staying a little longer in jail himself (hence the seeming
dilemma). In game theory, this demonstrates very elegantly that in
a non-zero-sum game a Nash equilibrium need not be a Pareto optimum.
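The dominance argument can be checked mechanically. The following is a minimal sketch, assuming the sentences from the table above are encoded in years (six months as 0.5); the names `SENTENCE`, `strictly_dominant_move_for_a` and `is_pareto_improvement` are illustrative and not part of any standard library.

```python
SILENT, BETRAY = "silent", "betray"
MOVES = (SILENT, BETRAY)

# SENTENCE[(a_move, b_move)] -> (years for A, years for B); lower is better.
SENTENCE = {
    (SILENT, SILENT): (0.5, 0.5),
    (SILENT, BETRAY): (10, 0),
    (BETRAY, SILENT): (0, 10),
    (BETRAY, BETRAY): (5, 5),
}

def strictly_dominant_move_for_a():
    """Return A's move that yields a strictly shorter sentence than every
    alternative against every move of B, or None if no such move exists."""
    for mine in MOVES:
        alternatives = [m for m in MOVES if m != mine]
        if all(SENTENCE[(mine, b)][0] < SENTENCE[(other, b)][0]
               for other in alternatives for b in MOVES):
            return mine
    return None

def is_pareto_improvement(outcome, baseline):
    """True if `outcome` is no worse for both prisoners and strictly
    better (shorter) for at least one of them."""
    no_worse = all(x <= y for x, y in zip(outcome, baseline))
    better = any(x < y for x, y in zip(outcome, baseline))
    return no_worse and better

print(strictly_dominant_move_for_a())
print(is_pareto_improvement(SENTENCE[(SILENT, SILENT)],
                            SENTENCE[(BETRAY, BETRAY)]))
```

Betraying comes out as the dominant move for Prisoner A, while mutual silence is a Pareto improvement over the mutual-betrayal outcome that dominance leads to.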
Generalized form
We can expose the skeleton of the game by stripping it of the
prisoner
framing device. The
generalized form of the game has been used frequently in
experimental economics. The following
rules give a typical realization of the game.
 There are two players and a banker. Each player holds a set of
two cards, one printed with the word "Cooperate", the other printed
with "Defect" (the standard terminology for the game). Each player
puts one card face-down in front of the banker. By laying them face
down, the possibility of a player knowing the other player's
selection in advance is eliminated (although revealing one's move
does not affect the dominance analysis; see the note that follows).
At the end of the turn, the banker turns over both cards and gives
out the payments accordingly.
 Note: A simple tell that partially or wholly reveals one player's
choice, such as the Red player playing her Cooperate card face-up,
does not change the fact that Defect is the dominant strategy. When
one is considering the game itself, communication has no effect
whatsoever. When the game is being played in real life,
communication may matter due to considerations outside of the game
itself; however, when external considerations are not taken into
account, communications do not affect the single-instance
prisoner's dilemma. Even in the single-instance prisoner's dilemma,
meaningful prior communication about issues external to the game
could alter the play environment by raising the possibility of
enforceable side contracts or credible threats. For example, if the
Red player plays her Cooperate card face-up and simultaneously
reveals a binding commitment to blow the jail up if and only if Blue
Defects (with additional payoff -11, -10), then Blue's Cooperation
becomes dominant. As a result, players are screened from each other
and prevented from communicating outside of the game.
Given two players, "red" and "blue": if the red player defects and
the blue player cooperates, the red player gets the Temptation to
Defect payoff of 5 points while the blue player receives the
Sucker's payoff of 0 points. If both cooperate they get the Reward
for Mutual Cooperation payoff of 3 points each, while if they both
defect they get the Punishment for Mutual Defection payoff of 1
point each. The payoff matrix showing these payoffs is given below.
Example PD payoff matrix
          | Cooperate | Defect
Cooperate | 3, 3      | 0, 5
Defect    | 5, 0      | 1, 1
In "win-lose" terminology the table looks like this:
          | Cooperate           | Defect
Cooperate | win-win             | lose much, win much
Defect    | win much, lose much | lose-lose

These point assignments are given arbitrarily for illustration. It
is possible to generalize them, as follows:
Canonical PD payoff matrix
          | Cooperate | Defect
Cooperate | R, R      | S, T
Defect    | T, S      | P, P
Where T stands for Temptation to defect, R for Reward for mutual
cooperation, P for Punishment for mutual defection and S for
Sucker's payoff. To be defined as a prisoner's dilemma, the
following inequalities must hold:
T > R > P > S
This condition ensures that the equilibrium outcome is defection,
but that cooperation Pareto dominates equilibrium play. In addition
to the above condition, if the game is repeatedly played by two
players, the following condition should be added.
2R > T + S
If that condition does not hold, then full cooperation is not
necessarily Pareto optimal, as the players are collectively better
off by having each player alternate between Cooperate and
Defect.
These rules were established by cognitive scientist
Douglas Hofstadter and form the formal
canonical description of a typical game of prisoner's
dilemma.
A simple special case occurs when the advantage of defection over
cooperation is independent of what the co-player does and the cost
of the co-player's defection is independent of one's own action,
i.e. T + S = P + R.
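The conditions on T, R, P and S are easy to test for any candidate payoff set. The sketch below is illustrative only; the function name `classify_payoffs` and the keys of the returned dictionary are assumptions made for this example.

```python
def classify_payoffs(T, R, P, S):
    """Check a payoff set against the conditions stated in the text."""
    return {
        # Defining inequality of the prisoner's dilemma.
        "is_prisoners_dilemma": T > R > P > S,
        # Extra condition for the repeated game: mutual cooperation
        # must beat taking turns at exploiting each other.
        "cooperation_beats_alternation": 2 * R > T + S,
        # Special case: the gain from defecting does not depend on
        # what the co-player does.
        "additive_special_case": T + S == P + R,
    }

print(classify_payoffs(T=5, R=3, P=1, S=0))
print(classify_payoffs(T=3, R=2, P=1, S=0))
```

The example payoffs 5, 3, 1, 0 satisfy both inequalities but not the special case (5 + 0 differs from 1 + 3), whereas 3, 2, 1, 0 satisfies all three.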
Human behavior in the prisoner's dilemma
One experiment based on the simple dilemma found that approximately
40% of participants played "cooperate" (i.e., stayed silent).
The iterated prisoner's dilemma
If two players play the prisoner's dilemma more than once in
succession, remember the previous actions of their opponent and
change their strategy accordingly, the game is called the iterated
prisoner's dilemma.
The iterated prisoner's dilemma game is fundamental to certain
theories of human cooperation and trust. On the assumption that the
game can model transactions between two people requiring trust,
cooperative behaviour in populations may be modelled by a
multiplayer, iterated, version of the game. It has, consequently,
fascinated many scholars over the years. In 1975, Grofman and Pool
estimated the count of scholarly articles devoted to it at over
2,000. The iterated prisoner's dilemma has also been referred to as
the "Peace-War game".
If the game is played exactly N times and both players know this,
then it is always game-theoretically optimal to defect in all
rounds. The only possible Nash equilibrium is to always defect. The
proof is inductive: one might as well defect on the last turn, since
the opponent will not have a chance to punish the player. Therefore,
both will defect on the last turn. Thus, the player might as well
defect on the second-to-last turn, since the opponent will defect on
the last no matter what is done, and so on back to the first turn.
Unlike the standard prisoner's dilemma, in the iterated prisoner's
dilemma the defection strategy is counterintuitive and fails badly
to predict the behavior of human players. Within standard economic
theory, though, this is the only correct answer. The superrational
strategy in the iterated prisoner's dilemma with fixed N is to
cooperate against a superrational opponent, and in the limit of
large N, experimental results on strategies agree with the
superrational version, not the game-theoretic rational one.
For cooperation to emerge between game-theoretic rational players,
the total number of rounds N must be random, or at least unknown to
the players. In this case always defect is no longer a strictly
dominant strategy, only a Nash equilibrium. Among the results shown
by Nobel Prize winner Robert Aumann in his 1959 paper is that
rational players repeatedly interacting for indefinitely long games
can sustain the cooperative outcome.
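One common way to make "indefinitely long" concrete is the following sketch. It rests on assumptions that are not in the text above: after every round the game continues with a fixed probability `delta`, and both players use a "grim trigger" rule (cooperate until the opponent defects once, then defect forever). Under those assumptions, cooperating forever is worth R / (1 - delta) in expectation, while defecting once and being punished afterwards is worth T + delta * P / (1 - delta), so cooperation holds up when delta is at least (T - R) / (T - P).

```python
def min_continuation_probability(T, R, P):
    """Smallest continuation probability at which grim-trigger
    cooperation is self-enforcing (an assumption-laden textbook bound)."""
    return (T - R) / (T - P)

def cooperation_sustainable(T, R, P, delta):
    """Compare expected totals for delta in [0, 1)."""
    cooperate_forever = R / (1 - delta)
    defect_once_then_punished = T + delta * P / (1 - delta)
    return cooperate_forever >= defect_once_then_punished

print(min_continuation_probability(T=5, R=3, P=1))
print(cooperation_sustainable(T=5, R=3, P=1, delta=0.6))
print(cooperation_sustainable(T=5, R=3, P=1, delta=0.4))
```

With the example payoffs the threshold is one half: a game that is more likely than not to continue can support cooperation under these assumptions, and one that is likely to end cannot.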
Iterated prisoner's dilemma experiments
Interest in the iterated prisoner's dilemma (IPD) was kindled by
Robert Axelrod in his book The Evolution of Cooperation (1984). In
it he reports on a tournament he organized of the N-step prisoner's
dilemma (with N fixed) in which participants have to choose their
strategy again and again, and have memory of their previous
encounters. Axelrod invited academic colleagues all over the world
to devise computer strategies to compete in an IPD tournament. The
programs that were entered varied widely in algorithmic complexity,
initial hostility, capacity for forgiveness, and so forth.
Axelrod discovered that when these encounters were repeated over a
long period of time with many players, each with different
strategies, greedy strategies tended to do very poorly in the long
run while more
altruistic strategies did
better, as judged purely by self-interest. He used this to show a
possible mechanism for the evolution of altruistic behaviour from
mechanisms that are initially purely selfish, by
natural selection.
The best deterministic strategy was found to be tit for tat, which
Anatol Rapoport developed and entered into the tournament. It was
the simplest of any program entered, containing only four lines of
BASIC, and won the contest. The strategy is simply to cooperate on
the first iteration of the game; after that, the player does what
his or her opponent did on the previous move. Depending on the
situation, a slightly better strategy can be "tit for tat with
forgiveness": when the opponent defects, on the next move the player
sometimes cooperates anyway, with a small probability (around 1% to
5%). This allows for occasional recovery from getting trapped in a
cycle of defections. The exact probability depends on the lineup of
opponents.
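The two strategies are short enough to write out. This is a minimal sketch in Python rather than Rapoport's original BASIC entry, which is not reproduced here; the payoff values are the example 5, 3, 1, 0 points, and the function names and the `(my_history, their_history)` calling convention are assumptions made for illustration.

```python
import random

C, D = "C", "D"
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return C if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return D

def make_forgiving_tit_for_tat(forgive_probability, rng):
    """Tit for tat that lets a defection pass with the given probability."""
    def strategy(my_history, their_history):
        if their_history and their_history[-1] == D:
            return C if rng.random() < forgive_probability else D
        return C
    return strategy

def play_match(strategy_a, strategy_b, rounds):
    """Play a fixed number of rounds and return both total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play_match(tit_for_tat, tit_for_tat, 10))
print(play_match(tit_for_tat, always_defect, 10))
forgiving = make_forgiving_tit_for_tat(0.05, random.Random(0))
print(play_match(forgiving, always_defect, 10))
```

Against a pure defector, tit for tat loses only the first round and then matches defection with defection; the forgiving variant gives up a few extra points there in exchange for the chance to break out of a defection cycle against other opponents.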
By analysing the top-scoring strategies, Axelrod stated several
conditions necessary for a strategy to be successful.
 Nice: The most important condition is that the strategy must be
"nice", that is, it will not defect before its opponent does (this
is sometimes referred to as an "optimistic" algorithm). Almost all
of the top-scoring strategies were nice; therefore a purely selfish
strategy will, for purely utilitarian reasons, not be the first to
"cheat" on its opponent.
 Retaliating: However, Axelrod contended, the successful
strategy must not be a blind optimist. It must sometimes retaliate.
An example of a non-retaliating strategy is Always Cooperate. This
is a very bad choice, as "nasty" strategies will ruthlessly exploit
such players.
 Forgiving: Successful strategies must also be forgiving. Though
players will retaliate, they will once again fall back to
cooperating if the opponent does not continue to defect. This stops
long runs of revenge and counter-revenge, maximizing points.
 Non-envious: The last quality is being non-envious, that is, not
striving to score more than the opponent (impossible for a "nice"
strategy, i.e., a "nice" strategy can never score more than the
opponent).
Therefore, Axelrod reached the oxymoronic-sounding conclusion that
selfish individuals, for their own selfish good, will tend to be
nice and forgiving and non-envious.
The optimal (points-maximizing) strategy for the one-time PD game
is simply defection; as explained above, this is true whatever the
composition of opponents may be. However, in the iterated-PD game
the optimal strategy depends upon the strategies of likely
opponents, and how they will react to defections and cooperations.
For example, consider a population where everyone defects every
time, except for a single individual following the tit for tat
strategy. That individual is at a slight disadvantage because of
the loss on the first turn. In such a population, the optimal
strategy for that individual is to defect every time. In a
population with a certain percentage of always-defectors and the
rest being tit for tat players, the optimal strategy for an
individual depends on the percentage, and on the length of the
game.
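The dependence on the population mix and on the game length can be made concrete with a small calculation. The sketch below assumes a large population, an opponent drawn at random from it, a known number of rounds, and the example payoffs 5, 3, 1, 0; the function name `expected_scores` and its parameters are illustrative.

```python
def expected_scores(tft_share, rounds, T=5, R=3, P=1, S=0):
    """Expected match totals (tit-for-tat player, always-defect player)
    when a fraction `tft_share` of opponents plays tit for tat and the
    rest always defect."""
    tft_vs_tft = rounds * R
    tft_vs_alld = S + (rounds - 1) * P    # exploited once, then mutual defection
    alld_vs_tft = T + (rounds - 1) * P    # exploits once, then mutual defection
    alld_vs_alld = rounds * P
    tft_score = tft_share * tft_vs_tft + (1 - tft_share) * tft_vs_alld
    alld_score = tft_share * alld_vs_tft + (1 - tft_share) * alld_vs_alld
    return tft_score, alld_score

print(expected_scores(tft_share=0.0, rounds=10))
print(expected_scores(tft_share=0.5, rounds=10))
print(expected_scores(tft_share=0.5, rounds=1))
```

With no other tit for tat players, the lone cooperator trails by the first-round loss; with half the population playing tit for tat over ten rounds it is clearly ahead; and in a single round defection wins regardless of the mix.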
A strategy called Pavlov (an example of Win-Stay, Lose-Switch)
cooperates at the first iteration and whenever the player and
co-player did the same thing at the previous iteration; Pavlov
defects when the player and co-player did different things at the
previous iteration. For a certain range of parameters, Pavlov beats
all other strategies by giving preferential treatment to co-players
which resemble Pavlov.
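The Pavlov rule as just described fits in a few lines. This is a sketch under the same illustrative conventions as a history-based strategy function; the helper `pavlov_pair_after` is a hypothetical name used only to show how two Pavlov players behave after a round in which their moves differed.

```python
C, D = "C", "D"

def pavlov(my_history, their_history):
    """Cooperate on the first move and whenever both players made the
    same move last time; defect when the last moves differed."""
    if not my_history:
        return C
    return C if my_history[-1] == their_history[-1] else D

def pavlov_pair_after(last_a, last_b, rounds):
    """Moves of two Pavlov players in the rounds that follow a round
    in which they played (last_a, last_b)."""
    hist_a, hist_b = [last_a], [last_b]
    for _ in range(rounds):
        move_a = pavlov(hist_a, hist_b)
        move_b = pavlov(hist_b, hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return list(zip(hist_a[1:], hist_b[1:]))

print(pavlov_pair_after(C, D, 3))
```

After a single mismatched round, both Pavlov players defect once and then return to mutual cooperation, which is one way such a pair treats each other preferentially.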
Deriving the optimal strategy is generally done in two ways:
 Bayesian Nash Equilibrium: If the statistical distribution of
opposing strategies can be determined (e.g. 50% tit for tat, 50%
always cooperate) an optimal counter-strategy can be derived
analytically.
 Monte Carlo simulations of populations have been made, where
individuals with low scores die off, and those with high scores
reproduce (a genetic algorithm for finding an optimal strategy).
The mix of algorithms in the final population generally depends on
the mix in the initial population. The introduction of mutation
(random variation during reproduction) lessens the dependency on
the initial population; empirical experiments with such systems
tend to produce tit for tat players (see for instance Chess 1988),
but there is no analytic proof that this will always occur.
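The population approach can be illustrated with a much simpler stand-in. The sketch below is not a Monte Carlo simulation or a genetic algorithm with mutation: it is a deterministic replicator-style update in which each strategy's share of the population grows in proportion to its average match score. The three strategies, the 10-round match length and the 5, 3, 1, 0 payoffs are assumptions chosen for illustration.

```python
# Total score of the row strategy against the column strategy over a
# 10-round match with payoffs T=5, R=3, P=1, S=0.
MATCH_SCORE = {
    "tit_for_tat":      {"tit_for_tat": 30, "always_defect": 9,  "always_cooperate": 30},
    "always_defect":    {"tit_for_tat": 14, "always_defect": 10, "always_cooperate": 50},
    "always_cooperate": {"tit_for_tat": 30, "always_defect": 0,  "always_cooperate": 30},
}

def next_generation(shares):
    """One update step: shares are rescaled by relative average score."""
    fitness = {
        name: sum(MATCH_SCORE[name][other] * shares[other] for other in shares)
        for name in shares
    }
    mean_fitness = sum(shares[name] * fitness[name] for name in shares)
    return {name: shares[name] * fitness[name] / mean_fitness for name in shares}

def evolve(shares, generations):
    for _ in range(generations):
        shares = next_generation(shares)
    return shares

even_start = {name: 1 / 3 for name in MATCH_SCORE}
mostly_defectors = {"tit_for_tat": 0.01, "always_defect": 0.99,
                    "always_cooperate": 0.0}
print(evolve(even_start, 200))
print(evolve(mostly_defectors, 200))
```

From an even start, tit for tat ends up with the largest share and the defectors die out (unconditional cooperators survive as a minority, since they score the same as tit for tat once no defectors remain). From a start dominated by defectors, the few tit for tat players are driven out instead, which mirrors the dependence on the initial population noted above.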
Although tit for tat is considered to be the most robust basic
strategy, a team from Southampton University in England (led by
Professor Nicholas Jennings and consisting of Rajdeep Dash,
Sarvapali Ramchurn, Alex Rogers and Perukrishnen Vytelingum)
introduced a new strategy at the 20th-anniversary iterated
prisoner's dilemma competition, which proved to be more successful
than tit for tat. This strategy relied on cooperation between
programs to achieve the highest number of points for a single
program. The University submitted 60 programs to the competition,
which were designed to recognize each other through a series of five
to ten moves at the start. Once this recognition was made, one
program would always cooperate and the other would always defect,
assuring the maximum number of points for the defector. If the
program realized that it was playing a non-Southampton player, it
would continuously defect in an attempt to minimize the score of the
competing program. As a result, this strategy ended up taking the
top three positions in the competition, as well as a number of
positions towards the bottom.
This strategy takes advantage of the fact that multiple entries
were allowed in this particular competition, and that the
performance of a team was measured by that of the highest-scoring
player (meaning that the use of self-sacrificing players was a form
of min-maxing). In a competition where one has control of only a
single player, tit for tat is certainly a better strategy. Because
of this new rule, this competition also has little theoretical
significance when analysing single-agent strategies as compared to
Axelrod's seminal tournament. However, it provided the framework for
analysing how to achieve cooperative strategies in multi-agent
frameworks, especially in the presence of noise. In fact, long
before this new-rules tournament was played, Richard Dawkins in his
book The Selfish Gene pointed out the possibility of such strategies
winning if multiple entries were allowed, but remarked that most
probably Axelrod would not have allowed them if they had been
submitted. The strategy also circumvents the rule of the prisoner's
dilemma that no communication is allowed between the two players.
That the Southampton programs engage in an opening "ten move dance"
to recognize one another only reinforces just how valuable
communication can be in shifting the balance of the game.
Another odd case is "play forever" prisoner's dilemma. The game is
repeated infinitely many times and the player's score is the
average (suitably computed).
Continuous iterated prisoner's dilemma
Most work on the iterated prisoner's dilemma has focused on the
discrete case, in which players either cooperate or defect, because
this model is relatively simple to analyze. However, some
researchers have looked at models of the continuous iterated
prisoner's dilemma, in which players are able to make a variable
contribution to the other player. Le and Boyd found that in such
situations, cooperation is much harder to evolve than in the
discrete iterated prisoner's dilemma. The basic intuition for this
result is straightforward: in a continuous prisoner's dilemma, if
a population starts off in a non-cooperative equilibrium, players
who are only marginally more cooperative than non-cooperators get
little benefit from assorting with one another. By contrast, in a
discrete prisoner's dilemma, tit for tat cooperators get a big
payoff boost from assorting with one another in a non-cooperative
equilibrium, relative to non-cooperators. Since Nature arguably
offers more opportunities for variable cooperation rather than a
strict dichotomy of cooperation or defection, the continuous
prisoner's dilemma may help explain why real-life examples of tit
for tat-like cooperation are extremely rare in nature (e.g.
Hammerstein) even though tit for tat seems robust in theoretical
models.
Learning psychology and game theory
Where game players can learn to estimate the likelihood of other
players defecting, their own behaviour is influenced by their
experience of the others' behaviour. Simple statistics show that
inexperienced players are more likely to have had, overall,
atypically good or bad interactions with other players. If they act
on the basis of these experiences (by defecting or cooperating more
than they would otherwise) they are likely to suffer in future
transactions. As more experience is accrued a truer impression of
the likelihood of defection is gained and game playing becomes more
successful. The early transactions experienced by immature players
are likely to have a greater effect on their future playing than
would such transactions have upon mature players. This principle
goes part way towards explaining why the formative experiences of
young people are so influential and why, for example, those who are
particularly vulnerable to bullying sometimes become bullies
themselves.
The likelihood of defection in a population may be reduced by the
experience of cooperation in earlier games allowing trust to build
up. Hence self-sacrificing
behaviour may, in some instances, strengthen the moral fibre of a
group. If the group is small the positive behaviour is more likely
to feed back in a mutually affirming way, encouraging individuals
within that group to continue to cooperate. This is allied to the
twin dilemma of encouraging those people whom one would aid to
indulge in behaviour that might put them at risk. Such processes
are major concerns within the study of
reciprocal altruism,
group selection,
kin selection and
moral philosophy.
Douglas Hofstadter's Superrationality
Douglas Hofstadter in his Metamagical Themas proposed that the
conception of rationality that led "rational" players to defect is
faulty. He proposed that there is another type of rational behavior,
which he called "superrational", where players take into account
that the other person is presumably superrational, like them.
Superrational players behave identically and know that they will
behave identically. They take that into account before they maximize
their payoffs, and they therefore cooperate with each other.
This view of the one-shot PD leads to cooperation as follows:
 Any superrational strategy will be the same for both
superrational players, since both players will think of it.
 Therefore the superrational answer will lie on the diagonal of
the payoff matrix.
 When you maximize return from solutions on the diagonal, you
cooperate.
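The three steps above amount to restricting attention to the diagonal of the payoff matrix and taking the best entry there. A minimal sketch, using the example payoffs 5, 3, 1, 0 and an illustrative function name `superrational_choice`:

```python
C, D = "C", "D"
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def superrational_choice(payoff):
    """Pick the move with the best payoff among symmetric outcomes only,
    on the assumption that both players reach the same decision."""
    return max((C, D), key=lambda move: payoff[(move, move)][0])

print(superrational_choice(PAYOFF))
```

The off-diagonal cells, which drive the ordinary dominance argument, are never consulted here; that restriction is the whole difference between the two notions of rationality.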
If a superrational player plays against a known rational opponent,
he or she will defect. A superrational player only cooperates with
other superrational players, whose thinking is correlated with his
or hers. If a superrational player plays against an opponent of
unknown superrationality in a symmetric situation, the result can
be either to cooperate or to defect depending on the odds that the
opponent is superrational (Pavlov strategy).
Superrationality is not studied by academic economists, because the
economic definition of rationality excludes any superrational
behavior by definition. Nevertheless, analogs of one-shot
cooperation are observed in human culture, wherever religious or
ethical codes exist. Hofstadter discusses the example of an
economic transaction between strangers passing through a town
where either party stands to gain by cheating the other, with
little hope of retaliation. Still, cheating is the exception rather
than the rule.
Morality
While it is sometimes thought that morality must involve the
constraint of self-interest, David Gauthier famously argues that
cooperating in the prisoner's dilemma on moral principles is
consistent with self-interest and the axioms of game theory. In his
opinion, it is most prudent to give up straightforward maximizing
and instead adopt a disposition of constrained maximization,
according to which one resolves to cooperate in the belief that the
opponent will respond with the same choice, while in the classical
PD it is explicitly stipulated that the response of the opponent
does not depend on the player's choice. This form of
contractarianism claims that good moral thinking is just an elevated
and subtly strategic version of basic means-end reasoning.
Douglas Hofstadter expresses a
strong personal belief that the mathematical symmetry is reinforced
by a moral symmetry, along the lines of the
Kantian categorical
imperative: defecting in the hope that the other player
cooperates is morally indefensible. If players treat each other as
they would treat themselves, then they will cooperate.
Real-life examples
These particular examples, involving prisoners and bag switching
and so forth, may seem contrived, but there are in fact many
examples in human interaction as well as interactions in nature
that have the same payoff matrix. The prisoner's dilemma is
therefore of interest to the
social
sciences such as
economics,
politics and
sociology, as
well as to the biological sciences such as
ethology and
evolutionary biology. Many natural
processes have been abstracted into models in which living beings
are engaged in endless games of prisoner's dilemma. This wide
applicability of the PD gives the game its substantial
importance.
In politics
In
political science, for
instance, the PD scenario is often used to illustrate the problem
of two states engaged in an
arms race.
Both will reason that they have two options, either to increase
military expenditure or to make
an agreement to reduce weapons. Either state will benefit from
military expansion regardless of what the other state does;
therefore, they both incline towards military expansion. The
paradox is that both states are acting rationally, but producing an
apparently irrational result. This could be considered a
corollary to
deterrence theory.
In science
In
environmental studies, the
PD is evident in crises such as global
climate change. All countries will benefit from a stable climate,
but any single country is often hesitant to curb emissions. The
benefit to an individual country of maintaining its current behavior
is greater than the benefit to all countries if behavior were
changed, which explains the current impasse concerning climate
change.
In program management and technology development, the PD applies to
the relationship between the customer and the developer. Capt Dan
Ward, an officer in the US Air Force, examined
The Program
Manager's Dilemma in an article published in Defense AT&L,
a defense technology journal.
In social science
In
sociology or
criminology, the PD may be applied to an actual
dilemma facing two inmates. The game theorist Marek Kaminski, a
former political prisoner, analysed the factors contributing to
payoffs in the game set up by a prosecutor for arrested defendants
(see
references below). He concluded
that while the PD is the ideal game of a prosecutor, numerous
factors may strongly affect the payoffs and potentially change the
properties of the game.
Steroid use
The prisoner's dilemma applies to the decision whether or not to
use performance-enhancing drugs in athletics. Given that the drugs
have an approximately equal impact on each athlete, it is to all
athletes' advantage that no athlete take the drugs (because of the
side effects). However, if any one athlete takes the drugs, they
will gain an advantage unless all the other athletes do the same.
In that case, the advantage of taking the drugs is removed, but the
disadvantages (side effects) remain.
In economics
Advertising is sometimes cited as a real-life example of the
prisoner’s dilemma. When
cigarette
advertising was legal in the United States, competing cigarette
manufacturers had to decide how much money to spend on advertising.
The effectiveness of Firm A’s advertising was partially determined
by the advertising conducted by Firm B. Likewise, the profit
derived from advertising for Firm B is affected by the advertising
conducted by Firm A. If both Firm A and Firm B chose to advertise
during a given period the advertising cancels out, receipts remain
constant, and expenses increase due to the cost of advertising.
Both firms would benefit from a reduction in advertising. However,
should Firm B choose not to advertise, Firm A could benefit greatly
by advertising. Nevertheless, the optimal amount of advertising by
one firm depends on how much advertising the other undertakes. As
the best strategy is dependent on what the other firm chooses there
is no dominant strategy and this is not a prisoner's dilemma but
rather is an example of a
stag hunt. The
outcome is similar, though, in that both firms would be better off
were they to advertise less than in the equilibrium. Sometimes
cooperative behaviors do emerge in business situations. For
instance, cigarette manufacturers endorsed the creation of laws
banning cigarette advertising, understanding that this would reduce
costs and increase profits across the industry. This analysis is
likely to be pertinent in many other business situations involving
advertising.
Without enforceable agreements, members of a cartel are also
involved in a (multiplayer) prisoner's dilemma. 'Cooperating'
typically means keeping prices at a pre-agreed minimum level.
'Defecting' means selling under this minimum level, instantly
stealing business (and profits) from other cartel members.
Antitrust authorities want potential cartel members to mutually
defect, ensuring the lowest possible prices for consumers.
In law
The theoretical conclusion of PD is one reason why, in many
countries,
plea bargaining is
forbidden. Often, precisely the PD scenario applies: it is in the
interest of both suspects to confess and testify against the other
prisoner/suspect, even if each is innocent of the alleged crime.
Arguably, the worst case is when only one party is guilty —
here, the innocent one is unlikely to confess, while the guilty one
is likely to confess and testify against the innocent.
Multiplayer dilemmas
Many real-life dilemmas involve multiple players. Although
metaphorical, Hardin's tragedy of the commons may be viewed as an
example of a multiplayer generalization of the PD: Each villager
makes a choice for personal gain or restraint. The collective reward
for unanimous (or even frequent) defection is very low payoffs
(representing the destruction of the "commons"). Such multiplayer
PDs are not formal as they can always be decomposed into a set of
classical two-player games. The commons are not always exploited:
William Poundstone, in a book about the prisoner's dilemma (see
References below), describes a situation in New Zealand where
newspaper boxes are left unlocked. It is possible for people to
take a paper without paying (defecting) but very few do, feeling
that if they do not pay then neither will others, destroying the
system.
Quantum game theory
In
quantum game theory, a player
in the prisoner's dilemma can implement a quantum strategy. Unlike
a
mixed strategy, which can't improve
on the payoff to the dominant strategy, a quantum strategy can
increase the player's expected payoff. The results can be explained
in terms of efficient quantum algorithms.
Related games
Closed-bag exchange
Hofstadter once suggested that people often find problems such as
the PD problem easier to understand when it is illustrated in the
form of a simple game, or trade-off. One of several examples he used
was "closed bag exchange":
 Two people meet and exchange closed bags, with the
understanding that one of them contains money, and the other
contains a purchase. Either player can choose to honor the deal by
putting into his or her bag what he or she agreed, or he or she can
defect by handing over an empty bag.
In this game, defection is always the best course, implying that
rational agents will never play. However, in this case both players
cooperating and both players defecting actually give the same
result, assuming there are no
gains
from trade, so chances of mutual cooperation, even in repeated
games, are few.
Friend or Foe?
Friend or Foe? is a game show that aired from 2002 to 2005 on the
Game Show Network in the United States. It is an example of the
prisoner's dilemma game tested by real people, but in an artificial
setting. On the game show, three pairs of people compete. As each
pair is eliminated, it plays a game similar to the prisoner's
dilemma to determine how the winnings are split. If they both
cooperate (Friend), they share the winnings 50-50. If one cooperates
and the
other defects (Foe), the defector gets all the winnings and the
cooperator gets nothing. If both defect, both leave with nothing.
Notice that the payoff matrix is slightly different from the
standard one given above, as the payouts for the "both defect" and
the "cooperate while the opponent defects" cases are identical.
This makes the "both defect" case a weak equilibrium, compared with
being a strict equilibrium in the standard prisoner's dilemma. If
you know your opponent is going to vote Foe, then your choice does
not affect your winnings. In a certain sense, Friend or Foe has a
payoff model between prisoner's dilemma and the game of Chicken.
The payoff matrix is
          | Cooperate | Defect
Cooperate | 1, 1      | 0, 2
Defect    | 2, 0      | 0, 0
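The difference between a strict and a weak equilibrium can be checked by brute force over the four cells of each matrix. The sketch below is illustrative; `pure_nash_equilibria` is a hypothetical helper name, and the standard matrix uses the example payoffs 5, 3, 1, 0.

```python
C, D = "C", "D"
MOVES = (C, D)

STANDARD_PD = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}
FRIEND_OR_FOE = {(C, C): (1, 1), (C, D): (0, 2), (D, C): (2, 0), (D, D): (0, 0)}

def pure_nash_equilibria(payoff):
    """Map each pure-strategy Nash equilibrium to 'strict' or 'weak'.

    A cell is an equilibrium if neither player gains by switching alone;
    it is strict if each player would actually lose by switching."""
    result = {}
    for row in MOVES:
        for col in MOVES:
            row_pay, col_pay = payoff[(row, col)]
            row_alts = [payoff[(r, col)][0] for r in MOVES if r != row]
            col_alts = [payoff[(row, c)][1] for c in MOVES if c != col]
            if all(row_pay >= a for a in row_alts) and all(col_pay >= a for a in col_alts):
                strict = (all(row_pay > a for a in row_alts)
                          and all(col_pay > a for a in col_alts))
                result[(row, col)] = "strict" if strict else "weak"
    return result

print(pure_nash_equilibria(STANDARD_PD))
print(pure_nash_equilibria(FRIEND_OR_FOE))
```

In the standard matrix mutual defection is the only equilibrium and it is strict. In the Friend or Foe matrix mutual defection is only weak, and the two outcomes in which one player cooperates against a defector are weak equilibria as well, because the cooperator receives nothing either way.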
This payoff matrix was later used on the British television
programmes Shafted and Golden Balls. It was also used earlier in the
UK Channel 4 gameshow Trust Me, hosted by Nick Bateman, in 2000.
References
 Aumann, Robert (1959). "Acceptable points in general cooperative
n-person games", in R. D. Luce and A. W. Tucker (eds.),
Contributions to the Theory of Games IV, Annals of Mathematics
Study 40, 287–324, Princeton University Press, Princeton NJ.
 Axelrod, R. (1984). The Evolution of Cooperation.
ISBN 0-465-02121-2
 Bicchieri, Cristina (1993). Rationality and Coordination.
Cambridge University Press.
 Binmore, Kenneth. Fun and Games.
 Chess, David M. (1988). Simulating the evolution of behavior:
the iterated prisoners' dilemma problem. Complex Systems,
2:663–670.
 Dresher, M. (1961). The Mathematics of Games of Strategy: Theory
and Applications. Prentice-Hall, Englewood Cliffs, NJ.
 Flood, M.M. (1952). Some experimental games. Research memorandum
RM-789. RAND Corporation, Santa Monica, CA.
 Kaminski, Marek M. (2004). Games Prisoners Play. Princeton
University Press. ISBN 0-691-11721-7
http://webfiles.uci.edu/mkaminsk/www/book.html
 Poundstone, W. (1992). Prisoner's Dilemma. Doubleday, New York,
NY.
 Greif, A. (2006). Institutions and the Path to the Modern
Economy: Lessons from Medieval Trade. Cambridge University Press,
Cambridge, UK.
 Rapoport, Anatol and Albert M. Chammah (1965). Prisoner's
Dilemma. University of Michigan Press.
 Le, S. and R. Boyd (2007). "Evolutionary Dynamics of the
Continuous Iterated Prisoner's Dilemma". Journal of Theoretical
Biology, Volume 245, 258–267.
 Rogers, A., R. K. Dash, S. D. Ramchurn, P. Vytelingum and N. R.
Jennings (2007). "Coordinating team players within a noisy iterated
Prisoner's Dilemma tournament". Theoretical Computer Science 377
(1–3) 243–259.
Further reading
 Bicchieri, Cristina and Mitchell Green (1997). "Symmetry
Arguments for Cooperation in the Prisoner's Dilemma", in G.
Holmstrom-Hintikka and R. Tuomela (eds.), Contemporary Action
Theory: The Philosophy and Logic of Social Action, Kluwer.
 Plous, S. (1993). Prisoner's Dilemma or Perceptual Dilemma?
Journal of Peace Research, Vol. 30, No. 2, 163–179.