
Tuesday, January 15, 2008




ANARCHIST THEORY:

ANARCHISM AND GAME THEORY:

PART TWO:

NEWDICK'S OBJECTIONS:


In Part One of Doug Newdick's essay, presented the day before yesterday here at Molly's Blog, the author outlined the basic format of what he calls the "anti-anarchist argument" of the prisoners' dilemma, as set out by Michael Taylor. The next part of this essay is devoted to Newdick's objections to this argument. It should be noted in passing that a rather simplistic form of the "Prisoner's Dilemma" is used by authors Joseph Heath and Andrew Potter in their book 'The Rebel Sell'. The book is somewhat reminiscent of older generations of political recanters as they went from being communists to being neo-conservatives. The main point of the book is a criticism of fashionable counter-culture leftism, a stance that the authors apparently held in their youth (or before their present academic and journalistic careers anyways), and a stance that the authors take to be some sort of "anarchism". To say that, whatever their earlier views were, they hardly resembled historical anarchism would be understating the case. Their central claim is that so-called "culture-jamming" is both fruitless and ultimately hypocritical. That is largely true. The authors, however, don't go as far to the opposite extreme, from barbarism to barbarism, as ex-communist neo-cons do. They park themselves as basically right-wing social democrats, a political viewpoint whose goals are at least as self-serving to people like them as "trendy leftism" is to others. Better than signing up with the Ayatollah Bush for Jihad, I guess.




Whatever passed for "anarchism" in the rarified social circles that they travelled in when younger was, however, simply a crude "feeling", as is made plain by their discussions of it, where they expose what can only be termed "cosmic ignorance" of what the word actually means. So, rather than going from barbarism to barbarism, they have gone from crude to crude. Their presentation of the Prisoner's Dilemma, often under the alias of 'The Tragedy of the Commons', is crude in the extreme. It is as if everything that their professors threw at them in the economics classes where they "grew out" of their youthful naivety was taken from the state of game theory in the 1950s, without any regard to all the research that has been done since. Regular readers of Molly's Blog will know that I have a low estimate of the qualifications of many leftist academics. Reading Heath and Potter I came away with the "comforting" (???) feeling that the other side of the coin in academia is often just as lazy, thick and time-serving.




Heath and Potter's book has been reviewed in Issue Number One of the new Ontario platformist publication Linchpin (see earlier here at Molly's Blog) and has also been the subject of discussion on one of the forums over at LibCom. As might be expected, such reviewers made much of the authors' criticism of subcultural politics and its futility, but they devoted little space to how distorted both the views of anarchism and the presentation of game theory were in the book.


All this is well and good, and it shows the other side of the coin: how game theory has become very much an in-topic outside of the leftist ghetto, even if some of its uses have all the airworthiness of lead bricks. If the reader is interested in this subject here's a further reference, 'Can Cooperation Ever Occur Without the State?'. Also, the long-time zinester and sceptical anarchist John Johnson has recently written on anarchism and game theory in his zine 'Imagine: Anarchism for the Real World', issue #7. Sorry guys, no internet reference here, but you can get a copy for (I presume) a small donation at Imagine, Box 8145, Reno, NV 89507, USA. Now what a place to write about game theory from! But now, on to the Newdick article...
5. PROVISION OF PUBLIC GOODS ISN'T ALWAYS A PRISONERS' DILEMMA.
For a game to be a prisoners' dilemma it must fulfill certain conditions: "each player must (a) prefer non-cooperation if the other player does not cooperate, (b) prefer non-cooperation if the other player does cooperate. In other words: (a') neither individual finds it profitable to provide any of the public good by himself; and (b') the value to the player of the amount of the public good provided by the other player alone (ie. the value of being a free rider) exceeds the value to him of the total amount of the public good provided by joint cooperation less his costs of cooperation" (Taylor, 1987: 35).

5.1 CHICKEN GAMES.
For many public good situations either (a'), (b') or both fail to obtain. If condition (a') fails we can get what Taylor calls a Chicken Game, ie. a situation where it pays a player to provide the public good even if the other player defects. But both players would prefer to let the other provide the good, and we get this payoff matrix:
         C      D
C       3,3    2,4
D       4,2    1,1


Taylor (1987: 36) gives an example of two neighbouring farms maintaining an irrigation system, where the result of mutual defection is so disastrous that either individual would prefer to maintain the system herself. Thus this game will model certain kinds of reciprocal arrangements that are not appropriately modelled by a prisoners' dilemma.
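To make the structure of the Chicken Game concrete, here is a minimal sketch (mine, not from Newdick's essay) that finds the pure-strategy equilibria of the matrix above by checking best responses. The payoffs are the ordinal values used in the text (4 = best, 1 = worst).

```python
# (row move, column move) -> (row payoff, column payoff), from the matrix above
CHICKEN = {
    ("C", "C"): (3, 3), ("C", "D"): (2, 4),
    ("D", "C"): (4, 2), ("D", "D"): (1, 1),
}

def pure_equilibria(game):
    """Return strategy pairs where neither player gains by deviating alone."""
    moves = ["C", "D"]
    eq = []
    for r in moves:
        for c in moves:
            row_ok = all(game[(r, c)][0] >= game[(r2, c)][0] for r2 in moves)
            col_ok = all(game[(r, c)][1] >= game[(r, c2)][1] for c2 in moves)
            if row_ok and col_ok:
                eq.append((r, c))
    return eq

print(pure_equilibria(CHICKEN))  # [('C', 'D'), ('D', 'C')]
```

The two equilibria are the asymmetric outcomes where exactly one farmer maintains the system alone, which is exactly Taylor's point: unlike the Prisoners' Dilemma, mutual defection is not an equilibrium here.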


5.2 ASSURANCE GAMES.
If condition (b') fails to obtain we get what Taylor (1987: 38) calls an Assurance Game, that is, a situation where neither player can provide a sufficient amount of the good if they contribute alone. Thus, for each player, if the other defects then she should also defect, but if the other cooperates then she should prefer to cooperate as well. The payoff matrix looks like this:
         C      D
C       4,4    1,2
D       2,1    3,3
5.3 COOPERATION IN A CHICKEN OR ASSURANCE GAME.
There should be no problem with mutual cooperation in an Assurance Game (Taylor 1987: 39) because the preferred outcome for both players is that of mutual cooperation. With the one-off Chicken Game mutual cooperation is not assured. Mutual cooperation, however, is more likely than in a one-off Prisoners' Dilemma (5).
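The claim about the Assurance Game can be checked directly. The sketch below (my own illustration, using the ordinal payoffs from the Assurance matrix above) verifies that both (C,C) and (D,D) are equilibria, and that (C,C) is better for both players, which is why mutual cooperation is expected once each player trusts the other.

```python
# Assurance Game payoffs from the matrix above: 4 = best, 1 = worst
ASSURANCE = {
    ("C", "C"): (4, 4), ("C", "D"): (1, 2),
    ("D", "C"): (2, 1), ("D", "D"): (3, 3),
}

def is_equilibrium(game, r, c, moves=("C", "D")):
    """True if neither player can improve by deviating unilaterally."""
    return (all(game[(r, c)][0] >= game[(r2, c)][0] for r2 in moves)
            and all(game[(r, c)][1] >= game[(r, c2)][1] for c2 in moves))

for pair in [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]:
    print(pair, is_equilibrium(ASSURANCE, *pair))
# Only (C,C) and (D,D) are equilibria, and (C,C) pays both players more.
```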
6. COOPERATION IS RATIONAL IN AN ITERATED PRISONERS' DILEMMA.
6.1 WHY ITERATION ?
Unequivocally there is no chance for mutual cooperation in a one-off Prisoners' Dilemma, but, as has been pointed out, the one-off game is not a very realistic model of social interactions, especially public goods interactions (Taylor 1987: 60). Most social interactions involve repeated interactions, sometimes as a group (an N-person game) or between specific individuals (which might be modelled as a game between two players). The question then becomes: is mutual cooperation more likely with iterated games (specifically the iterated Prisoners' Dilemma)? As one would expect, the fact that the games are repeated (with the same players) opens up the possibility of conditional cooperation, ie. cooperation dependent upon the past performance of the other player.
6.2 ITERATED PRISONERS' DILEMMA.
There are two important assumptions to be made about iterated games. Firstly, it is assumed (very plausibly) that the value of future games to a player is less than the value of the current game. The amount by which the value of future games is discounted is called the discount value; the higher the discount value, the less future games are worth (Taylor 1987: 61). Secondly, it is assumed that the number of games to be played is indefinite. If the number of games is known to the players then the rational strategy will be to defect on the last game because they cannot be punished for this by the other. Once this is assumed by both players the second to last game becomes in effect the last game, and so on (Taylor 1987: 62).


Axelrod (1984) used an ingenious method to test what would be the best strategy for an iterated Prisoners' Dilemma. He held two round-robin computer tournaments where each different strategy (computer program) competed against each of its rivals a number of times. Surprisingly, the simplest program, one called TIT FOR TAT, won both tournaments as well as all but one of a number of hypothetical tournaments. Axelrod's results confirmed what Taylor had proven in 1976 (6). TIT FOR TAT is the strategy of choosing C for the first game and thereafter choosing whatever the other player chose the last game (hereafter TIT FOR TAT will be designated strategy B, following Taylor (1987)).
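A toy re-creation (my construction, not Axelrod's actual tournament code) of an iterated Prisoners' Dilemma match shows how TIT FOR TAT behaves. The per-game payoffs are the ones used throughout this essay: 4 for exploiting, 3 for mutual cooperation, 2 for mutual defection, 1 for being exploited.

```python
# Per-game payoffs: (row move, column move) -> (row payoff, column payoff)
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (1, 4),
          ("D", "C"): (4, 1), ("D", "D"): (2, 2)}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"

def always_defect(opponent_history):
    return "D"

def play(s1, s2, rounds):
    """Run an iterated match; each strategy sees only the other's history."""
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h2), s2(h1)
        p1, p2 = PAYOFF[(m1, m2)]
        score1, score2 = score1 + p1, score2 + p2
        h1.append(m1)
        h2.append(m2)
    return score1, score2

print(play(tit_for_tat, tit_for_tat, 10))    # (30, 30): cooperation throughout
print(play(tit_for_tat, always_defect, 10))  # (19, 22): one sucker payoff, then mutual defection
```

Note that in a single match the unconditional defector still scores slightly more than TIT FOR TAT; TIT FOR TAT won Axelrod's tournaments on its total across all pairings, because pairs of conditional cooperators rack up the big mutual-cooperation payoffs.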


An equilibrium in an iterated game is defined as "a strategy vector such that no player can obtain a larger payoff using a different strategy while the other players' strategies remain the same. An equilibrium then is such that, if each player expects it to be the outcome, he has no incentive to use a different strategy" (Taylor 1987: 63). Put informally, an equilibrium is a pair of strategies such that any move by a player away from that strategy will not improve the player's payoff. Then mutual cooperation will arise if B is an equilibrium, because no strategy will do better than B when played against B (7).


The payoff for a strategy in an indefinitely iterated Prisoners' Dilemma is equal to the sum of an infinite series (X + Xw + Xw² + ...):
X/(1-w)
where X = payoff per game
and w = discount parameter (1 - discount value)
UD playing with UD gets a payoff of two per game for mutual defection. If we set w = 0.9, then UD's payoff is:
2/(1-0.9) = 20
(MOLLY NOTE: I retain the terminology "UD" as it appeared in the original essay, but I alert the reader that this probably should have been "AD" for "always defect" as it is usually referred to in game theory.)
B Playing with B gets a payoff of three per game for mutual cooperation. Thus with w = 0.9 B gets:
3/(1-0.9) = 30
(B,B) is an equilibrium when the payoff for B from (B,B) is higher than the payoff for UD from (UD,B):
B's payoff against B is
3/(1-w)
UD's payoff against B is
4 + 2w/(1-w)
Therefore UD cannot do better than B when:
3/(1-w) > 4 + 2w/(1-w)
ie. w > (4 - 3)/(4 - 2)
ie. w > 0.5
(Axelrod 1984: 208) (8) (9)
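A quick numerical check (mine, not from the essay) of the derivation above: B's discounted payoff against B versus UD's discounted payoff against B, for a few values of the discount parameter w. UD should only match or beat B when w is at or below 0.5.

```python
def payoff_B_vs_B(w):
    """R = 3 every round: 3 + 3w + 3w^2 + ... = 3/(1-w)."""
    return 3 / (1 - w)

def payoff_UD_vs_B(w):
    """T = 4 once, then P = 2 forever: 4 + 2w + 2w^2 + ... = 4 + 2w/(1-w)."""
    return 4 + 2 * w / (1 - w)

for w in (0.3, 0.5, 0.9):
    print(w, payoff_B_vs_B(w), payoff_UD_vs_B(w))
# At w = 0.3 defection pays better; at w = 0.5 they are equal;
# at w = 0.9 B gets 30 against UD's 22, matching the figures above.
```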


Can any other strategy fare better against B than B itself? Informally we can see that this is not possible (assuming future interactions are not too heavily discounted). For any strategy to do better than B it must at some point defect. But if the strategy defects then B will punish this defection with a defection of its own, which must result in the new strategy doing worse than it would have had it cooperated. Thus no strategy can do better playing with B than B itself. Now, if B is an equilibrium then the payoff matrix for the iterated game is:
          B      UD
B        4,4    1,3
UD       3,1    2,2
This is an Assurance Game. Thus if B is an equilibrium then we should expect mutual cooperation (Taylor 1987: 67). If, however, B isn't an equilibrium (ie. the discount value is too high) then the payoffs resemble a Prisoners' Dilemma and thus mutual defection will be the result (Taylor 1987: 67).
This is a good place to stop until another day. The actual situation in game theory - and real life - is much more complex than what has been described above. In particular, the game described above tends to drive towards iterated defection under a simple TIT FOR TAT strategy. Other refinements such as "forgiving tit for tat" are optimal in some situations, and the role of "spite" has been much further investigated in recent years. In the next section Newdick will describe "N-person" games. In such situations not just "spite" but also what has been called "altruistic punishment" comes into play. All this is to alert the reader that, while what will come in the next section is valuable, the present state of the theory is far more advanced than what will be presented here.

Sunday, March 04, 2007

THE NEVER ENDING REVIEW: CHAPTER THREE OF 'A BEAUTIFUL MATH' : 'NASH'S EQUILIBRIUM':
Welcome back to this continuing review of Tom Siegfried's 'A Beautiful Math', on the growth of game theory. I spend a goodly amount of time on this review because it is, in my opinion, important for a clear-sighted view of social action - not because I agree with the author's often overinflated claims for the relevance of game theory. Even a sceptic of some of the grander claims can see just how important this matter is for a rational radicalism of the future. Anyways...
It's the third chapter of this book before the character of the title, John Nash, makes an appearance. Nash entered Princeton University as a graduate student in 1948. This was Von Neumann's stomping grounds. Morgenstern worked in the economics department, and Von Neumann was at the Institute for Advanced Study a mile away.
To this point game theory had been restricted to the rather sparse world of "two player zero-sum" games. Nash rapidly broke into new fields of analysis. His 1950 paper ('The Bargaining Problem', Econometrica 18 (1950), pp. 155-162), on which he was advised by both Morgenstern and Von Neumann, expanded the world of game theory into "cooperative games" in which the two sides work together to achieve a mutual benefit. What he provided was a mathematical map for finding the optimal bargain that maximized the utilities of both players.
In the same year that he published the above paper he also presented his doctoral thesis, Non-Cooperative Games. This introduced the idea of an "equilibrium strategy" towards which a repeated-round game will evolve. At this equilibrium, Nash wrote in his thesis, the situation is such that,
"...each player's mixed strategy maximizes his payoff if the strategies of the other players are held fixed."
What this idea did was to make it possible for game theory to describe multi-player games, something that Von Neumann's ideas foundered on. Nash's proof depended upon something known as the fixed point theorem, an idea borrowed from topology. The ideas presented in this thesis were also published in PNAS (the Proceedings of the National Academy of Sciences) as 'Equilibrium Points in N-Person Games' in 1950 and in 1951 as 'Non-Cooperative Games' in the Annals of Mathematics.
As a side note, "cooperative" and "non-cooperative" have a rather restricted meaning here. Cooperative refers to the coalition forming that Von Neumann and Morgenstern used to get around the fact that their theories couldn't deal with more than two players. Non-cooperative refers to Nash's expansion, which can deal with any number of players who don't collaborate or communicate with each other. What Nash showed is that there is at least one "equilibrium strategy" (sometimes more than one) for each player which maximizes their payoff no matter what the other players do, assuming the others also try to maximize their payoffs.
This is, of course, a simplified version of the real world, where the "equilibrium states" of perfectly rational actors who try to maximize their self-interest rarely exist. But armed with this general description the author goes on to describe specific "games" that have been analyzed, especially 'The Prisoner's Dilemma' (1), first described by Nash's Princeton professor Albert W. Tucker in 1950 (2).
The game is set up as follows: two criminals, call them "Alice" and "Bob", are arrested. The police interrogate them separately. They have enough evidence to convict each of them on a minor charge, but they need confessions for convictions on more serious charges. If both refuse to confess they each get one year on the lesser charge. If one confesses and the other stays silent the squealer goes free and the other gets five years. If both confess they each get 3 years (two years off for "copping a plea"). The payoff matrix is as below (once more excuse the limitations of blogger).

                      Alice
               Keep Mum    Rat
Bob  Keep Mum    1,1       5,0
     Rat         0,5       3,3

The above game is actually set up as something like a "routine procedure" by police interrogators, often with the predictable outcome (criminals are usually not heroes after all).

The 'Nash equilibrium' for the above is for both players to confess. From the point of view of either player the best choice is to rat no matter what the other player does. The outcome where both players squeal is "worse for the group", as 6 combined years is the maximum combined sentence, but it is the best outcome for an individual acting in their own self-interest. A real-life example of this can be seen in the continual news of "eco-terrorists" acting as squealers time after time in the USA. The ideology of those who promote such acts (while often remaining aloof from same) is insufficient to overcome the self-interest of those who are caught in such acts (which they usually are), and their prospective "punishment" is quite puny compared to that of ordinary criminals. Hence the great incentive to confess on the part of people who are caught for such crimes. "Spite" may play a part in this as well, as those who have been caught may come to realize the self-interest of many who have "egged them on" while remaining out of danger themselves.
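A small check (my sketch, not Siegfried's) that mutual confession is the unique Nash equilibrium of the sentencing matrix above, where each player minimizes years served rather than maximizing a payoff.

```python
# (Bob's move, Alice's move) -> (Bob's years, Alice's years), from the matrix above
YEARS = {
    ("Mum", "Mum"): (1, 1), ("Mum", "Rat"): (5, 0),
    ("Rat", "Mum"): (0, 5), ("Rat", "Rat"): (3, 3),
}

def equilibria(game, moves=("Mum", "Rat")):
    """Outcomes where neither prisoner can shorten their own sentence alone."""
    return [(b, a) for b in moves for a in moves
            if all(game[(b, a)][0] <= game[(b2, a)][0] for b2 in moves)
            and all(game[(b, a)][1] <= game[(b, a2)][1] for a2 in moves)]

print(equilibria(YEARS))  # [('Rat', 'Rat')]
```

Note that mutual silence (1,1) is better for both than mutual confession (3,3), yet it is not an equilibrium: either prisoner can walk free by ratting on a silent partner.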

Another game mentioned, one closer to actual reality rather than anarchist cultism, is the 'Public Goods Game'. The question of this game revolves around the provision of public goods by voluntary donation - something closer to the heart of real anarchism than the posturing of certain American cults. In this case "defectors", otherwise known as "free riders", who don't voluntarily contribute can still reap the benefits of a "public good". This seems fine for the defectors, but if too many decide to "free ride" then the public good becomes unavailable and the defectors get no benefit.

One of the variants of the public goods game that Siegfried mentions is set up as follows. Four players are given monetary tokens and told that they can keep as many as they want or put them into a "public pot", where the amount will be doubled by the experimenter. There are a certain number of "rounds" in this game, wherein each player is told how much has been contributed to the pot and is offered the chance to change their contribution, either decreasing or increasing it.

When the game was played repeatedly a stable pattern began to emerge. As Siegfried says,

"Players fell into three identifiable groups: cooperators, defectors (or free riders) and reciprocators. Since all the players learned at some point how much had been contributed, they could adjust their behavior accordingly. Some players remained stingy (defectors), some continued to contribute generously (cooperators) and others contributed more if others in the group had donated significantly (reciprocators).

Over time, the members of each group earned equal amounts of money, suggesting that something like a Nash Equilibrium had been achieved - they all won as much as they could, given the strategy of others. In other words, in this kind of game, the human race plays a mixed strategy - about 13% cooperators, 20% defectors (free riders) and 60% reciprocators in this particular experiment". (Molly Note: this emphasizes the importance of what is called "altruistic punishment" in evolutionary psychology. In a "game" where knowledge of an "opponent's" previous interactions with other players is given, the percentage of "defectors" can be reduced by such punishment inflicted by players who were not part of the original rounds.)
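A toy simulation (my own construction, loosely following the setup described above, with illustrative numbers rather than the experiment's data): four players with 10 tokens each, pot contributions doubled and split equally, and a "reciprocator" who matches the average of the others' last contributions.

```python
def round_payoffs(contributions, endowment=10, multiplier=2.0):
    """Each player keeps uncontributed tokens plus an equal share of the doubled pot."""
    pot = sum(contributions) * multiplier
    share = pot / len(contributions)
    return [endowment - c + share for c in contributions]

def simulate(rounds=5):
    """One cooperator, one defector, two reciprocators; returns final contributions and payoffs."""
    last = [10, 0, 5, 5]  # starting contributions
    for _ in range(rounds):
        avg_others = [(sum(last) - last[i]) / 3 for i in range(4)]
        # cooperator gives everything, defector nothing, reciprocators match the others
        last = [10, 0, round(avg_others[2]), round(avg_others[3])]
    return last, round_payoffs(last)

contributions, payoffs = simulate()
print(contributions, payoffs)  # [10, 0, 5, 5] [10.0, 20.0, 15.0, 15.0]
```

Even in this crude sketch the free rider earns the most each round, which is exactly the incentive problem the game is meant to expose - and why the altruistic punishment mentioned in the Molly Note matters for keeping defection rates down.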

The author ends the chapter with an overview of 'Game Theory Today'. He notes that the field has been broadened considerably to cover "games where coalitions form, where information is incomplete, where players are less than perfectly rational". He also notes that there are arguments about whether game theory predicts behavior or "prescribes" what a rational person should do. He goes on to answer some of the criticisms of game theory's ability to "predict" in real-world situations.

Siegfried notes that game theory, like other scientific theories, is a model of reality, not reality itself. It makes reality comprehensible by simplifying it. As it is tested in experimental situations it is modified and grows, just like any other scientific theory. The author quotes Colin Camerer in 'Behavioral Game Theory':

"The goal is not to disprove game theory...but it is to improve it"

Siegfried goes on to describe the contributions of Thomas Schelling, who won the 2005 Nobel Prize in economics. Schelling focused on games where there is more than one Nash equilibrium. He particularly analyzed conflict in international relations and the role of "bluff" in same. He also analyzed games where a "coordinated outcome" is better than any particular outcome. These are situations where, as Siegfried says,

"...where it is better for everybody to be on the same page, regardless of what the page is."

The work of the other 2005 economics Nobel Prize winner, Robert Aumann, is also mentioned. Aumann analyzed the prisoner's dilemma game as a "repeated rounds" situation rather than a "one shot" affair and showed how cooperation could evolve in such situations (much closer to everyday life: Molly Note). He identified situations where cooperation is less likely, ie. many players, limited communication or limited game time (fewer rounds). (Molly Note: The eventual "goal" of studying matters such as these is to identify what sort of conditions lead to "increased cooperation" in the presumed society that we want. Some things are obvious from the above. "Fewer players" means a decentralized society. "Full communication" means not just decentralization but also the elimination of "socialist managers" who mediate such communication and add "noise" to same.)

Siegfried finishes this chapter by naming multiple applications of game theory, not just in economics but also in medicine, politics, ecology and especially !!! evolutionary biology.