Saturday, 23 March 2013

The Greater Good

Let me start this post with the famous Prisoner's Dilemma. It has many different formulations, so I'll stick with the first formal one I saw, in a Game Theory course at the Technical University of Prague (March 2012).

"It is 1930's. In the Soviet Union at that time a conductor travels by train to Moscow, to the symphony orchestra concert. He studies the score and concentrates on the demanding performance. Two KGB agents are watching him, who - in their ignorance - think that the score is a secret code. All conductor's efforts to explain that it is yet Tchajkovskij are absolutely hopeless. He is arrested and imprisoned. The second day our couple of agents visit him with the words "You have better speak. We have found your comrade Tchajkovskij and he is already speaking...
Two innocent people, one because he studied a score and the second because his name was coincidentally Tchajkovskij, find themselves in prison, faced the following problem: if both of them bravely keep denying, despite physical and psychical torture, they will be sent to Gulag for three years, then they will be released. If one of them confesses the fictive espionage crime of them both, and the second one keeps denying, then the first one will get only one year in Gulag, while the second one 25. If both of them confess, they will be sent to Gulag for 10 years."

Although at first glance the solution might seem simple (both of them should deny), that is in fact not true. Keep in mind that they cannot strike a deal, and thus must decide independently. Moreover, notice that no matter what one decides, it is always more profitable for the other to confess. To fully realise this, imagine you are the conductor. If Tchajkovskij stands strong and denies, you'll get 3 years if you deny as well. But were you to confess in the same situation, you'd only get 1 year! Poor Tchajkovskij would spend 25 years in the Gulag, but that's hardly your problem. If, on the other hand, Tchajkovskij confesses, denying would mean 25 years for you, while confessing would mean you'd both get 10 years. Again, confessing is the smartest option for you! Of course the same reasoning applies to Tchajkovskij, since his situation is exactly the same as yours. Indeed, confessing is the dominant strategy for both players and (confess, confess) is the only equilibrium point in the game. In non-technical terms, if both Tchajkovskij and the conductor act rationally (i.e. each acts to get the best possible outcome for himself and feels no pity or remorse for the other), they will both confess and thus be sent to the Gulag for 10 years, when they could have got a much better outcome had they both denied.
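The reasoning above can be checked mechanically. Here is a minimal Python sketch, assuming the sentences from the story as the payoff table; the names `YEARS`, `best_reply` and `equilibria` are made up for this illustration, and only pure strategies (no coin flipping) are considered.

```python
# Years in the Gulag, indexed as YEARS[(my_move, other_move)].
# The numbers come from the story; the table layout is an assumption of this sketch.
YEARS = {
    ("deny", "deny"): 3,
    ("deny", "confess"): 25,
    ("confess", "deny"): 1,
    ("confess", "confess"): 10,
}
MOVES = ("deny", "confess")


def best_reply(other_move):
    """Return the move that minimises my own sentence, given the other's move."""
    return min(MOVES, key=lambda my_move: YEARS[(my_move, other_move)])


def equilibria():
    """Return the move pairs in which each move is a best reply to the other."""
    return [
        (a, b)
        for a in MOVES
        for b in MOVES
        if best_reply(b) == a and best_reply(a) == b
    ]


print(best_reply("deny"))     # confess: 1 year beats 3
print(best_reply("confess"))  # confess: 10 years beat 25
print(equilibria())           # [('confess', 'confess')]
```

Because the game is symmetric, a single `best_reply` function serves both prisoners. Confessing is the best reply to either move, which is exactly what makes it dominant.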

The prisoner's dilemma can be generalised to any two-player game in which each player can choose to cooperate or defect and
  • if they both cooperate (deny) they'll both get a reward (only 3 years in Gulag, as opposed to the 10 given by the rational solution);
  • if they both defect (confess) they'll both be punished (10 years in Gulag);
  • if only one of them defects and the other cooperates, the defector will get a bonus even better than the one he'd get if both cooperated (only 1 year in Gulag), while the other gets a really bad outcome (25 years in Gulag). In this situation the first is said to have given in to temptation while the second is called the sucker.
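In the usual textbook shorthand, the three cases above boil down to one chain of inequalities. The symbols T (temptation), R (reward), P (punishment) and S (sucker's payoff) are not used elsewhere in this post; they are introduced here only for illustration, counting each year in the Gulag as one negative point:

```latex
T > R > P > S, \qquad \text{here } T = -1,\quad R = -3,\quad P = -10,\quad S = -25 .
```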

Interesting as it is as a theoretical construct, one must wonder whether the prisoner's dilemma ever actually occurs in practice. And it does. A lot. Consider for instance the issue of tax paying (in Sweden) and imagine there's some kind of magical way to evade taxes without getting caught. If you pay taxes, you get lots of benefits (education, health, public transport, etc.) but you lose the money. If you don't pay taxes, you get the same benefits without the nuisance of losing any money. Not paying therefore looks like the better choice. Thing is, if no one pays taxes, then everyone will lose the benefits. This simple and somewhat ridiculous example is only meant to show how an instance of the dilemma might look in practice. Other examples are a typical duopoly, the payment of fares on public transport, soldiers on the front line of a battle, and the stockpiling or renunciation of nuclear weapons.

The thing with this kind of problem is that it requires communication and co-operation in order to reach the overall best solution. If each individual acts so as to maximise their own profit, everyone will end up in a nasty situation. Still, it seems like the rational thing to do. And most of Economics is based on the assumption that people will act rationally, like the so-called Econs. Even worse, we all know that people really hate to end up being the sucker, so in fact their irrationality will bias them even more towards defecting. So, are we doomed to this worst possible outcome?

Maybe not. Most practical instances of the Prisoner's Dilemma occur repeatedly; that is, you are called to make a decision not just once but over and over again. While this may seem a minor detail, it makes all the difference in practice. Going back to rational agents and Game Theory, the best approach to the repeated game is no longer simply to defect all the time. Without further ado, let's talk about Axelrod's Tournament. Around 1980 Robert Axelrod invited prominent game theorists to a computer tournament. The idea was that each participant would develop their own computer agent (with a given strategy) to play the Prisoner's Dilemma game. The agents would then play each other for 200 rounds, and at the end the points would be counted (each year in prison is worth one negative point). Just for the sake of clarity I'll introduce three of the strategies used:
  • Tit for Tat: cooperates on the first round and then always mimics the other's last move;
  • Random: cooperates with probability 0.5;
  • Grudger: cooperates until the other has defected, then defects forever (it doesn't forgive the betrayal).
Twelve other strategies were employed, most of them more complex than these.
There are many ways to categorise these strategies; an interesting one is to divide them into two groups: nice strategies and nasty strategies. The former are characterised by never defecting first (only in retaliation), while the latter defect first in at least some situations. You can therefore picture an agent following a nice strategy as someone trusting, as opposed to someone with guile. Quite surprisingly, the simple Tit for Tat strategy won the tournament*, reminding us yet again that complexity does not necessarily make something better or smarter (and believe me, people do tend to forget this a lot!). But what I really find worth mentioning is that the first eight places in the tournament were taken by the eight nice strategies present, which means that every nasty strategy scored below every nice strategy. This seems to indicate that, even in a rational sense, it can be profitable to play nice in life. Maybe there is hope for Humanity after all...
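To get a feel for how such a tournament works, here is a small Python sketch of a single 200-round match between two of the strategies listed above. It is not Axelrod's actual program: the function names, the `(own_history, other_history, rng)` signature and the seeded random generator are all assumptions of this sketch, and `always_defect` is an extra nasty strategy added purely for illustration (it is not one of the three described above). Points are counted as in this post, one negative point per year in the Gulag.

```python
import random

# Years in the Gulag for (my_move, other_move).
# "C" = cooperate (deny), "D" = defect (confess).
YEARS = {("C", "C"): 3, ("C", "D"): 25, ("D", "C"): 1, ("D", "D"): 10}


def tit_for_tat(own_history, other_history, rng):
    # Cooperate first, then copy the other's last move.
    return other_history[-1] if other_history else "C"


def random_strategy(own_history, other_history, rng):
    # Cooperate with probability 0.5.
    return "C" if rng.random() < 0.5 else "D"


def grudger(own_history, other_history, rng):
    # Cooperate until the other has defected once, then defect forever.
    return "D" if "D" in other_history else "C"


def always_defect(own_history, other_history, rng):
    # Hypothetical nasty strategy, added only for comparison.
    return "D"


def play_match(strategy_a, strategy_b, rounds=200, seed=0):
    """Play one match and return both scores (one negative point per year)."""
    rng = random.Random(seed)  # seeded so that Random is reproducible
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b, rng)
        move_b = strategy_b(history_b, history_a, rng)
        score_a -= YEARS[(move_a, move_b)]
        score_b -= YEARS[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play_match(tit_for_tat, grudger))          # (-600, -600)
print(play_match(tit_for_tat, always_defect))    # (-2015, -1991)
print(play_match(always_defect, always_defect))  # (-2000, -2000)
```

Two nice strategies facing each other cooperate throughout and lose only 600 points each. Against the always-defecting agent, Tit for Tat is the sucker exactly once and then retaliates, so it ends up barely worse off than its nasty opponent, who in turn does far worse than it would have by cooperating with a nice partner.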


Filipe Baptista de Morais

* The tournament was repeated with somewhat different rules some time later, and this simple strategy won again.
