THE BEST OF RON MAMMON
From: rmaimon@husc9.Harvard.EDU (Ron Maimon)
Newsgroups: sci.physics
Subject: THE BEST OF RON MAMMON (superrationality)
Date: 8 Feb 1995 03:38:10 GMT
Organization: Harvard University, Cambridge, MA
Lines: 180
Sender: rmaimon@husc9 (Ron Maimon)
Distribution: world
Message-ID: <3h9eb2$rpp@decaxp.harvard.edu>
NNTP-Posting-Host: scws3.harvard.edu
Superrationality
----------------
Not really intuition. I guess I have to go into detail after all.
I'll discuss a "toy model" of rational choice. It is instructive, but clearly not realistic.
The model will be as follows: I imagine two rational beings in two rooms, and there is a button on the wall of each room. If exactly one of the two presses the button, the other is killed. If neither presses, both survive. If both of them press the button, one of them is killed at random.
What course of action maximizes each individual's probability of survival?
This seems like a question that has a definite answer, and it is surprising that when you do the analysis, you run into a problem. Well, I won't talk about it, I'll do it.
There are four possibilities to consider: I press and he presses, I press and he doesn't, I don't and he does, I don't and he doesn't. My probability of survival is 50%, 100%, 0%, 100% respectively.
A naive argument that someone could make now is as follows. Clearly, his choice doesn't depend on mine, and clearly no matter what he does, I'm at least as well off pressing as not pressing. Therefore, if I want to maximize my chances of survival, I should press the button.
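The table and the naive argument can be written out as a small Python sketch. The names `SURVIVAL` and `weakly_dominates` are made up here for illustration; the numbers are just the four cases listed above.

```python
# My survival probability for each (my_move, his_move) pair,
# copied from the four cases in the post. "hold" means "don't press".
PRESS, HOLD = "press", "hold"

SURVIVAL = {
    (PRESS, PRESS): 0.5,  # one of us is killed at random
    (PRESS, HOLD): 1.0,   # only he is killed
    (HOLD, PRESS): 0.0,   # only I am killed
    (HOLD, HOLD): 1.0,    # nobody is killed
}


def weakly_dominates(mine, other):
    """True if `mine` is never worse for me than `other`, whatever he does."""
    return all(
        SURVIVAL[(mine, his)] >= SURVIVAL[(other, his)]
        for his in (PRESS, HOLD)
    )


print(weakly_dominates(PRESS, HOLD))  # True
print(weakly_dominates(HOLD, PRESS))  # False
```

Note that the check treats his move as a free variable that does not depend on mine, which is exactly the assumption the naive argument makes.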
Well, this argument is very convincing- so much so that it convinced game theorists for years. This led to a big puzzle: why is it that two rational beings who both want to live end up trying to kill each other with this button, when if they just didn't press it, they would both be just fine?
Well, the argument is wrong, and the reason it's wrong is in the first "obvious" statement. It is _not_at_all_ obvious that his choice does not depend on mine.
What is obvious is that my choice doesn't physically influence his choice. But, since we are both rational, and we are working out the same problem, we should get the same answer- there should be a correlation between us, which has nothing to do with physical causation. If I decide that I should press the button, I know he will decide to do that too, not because I influence him, but because he is rational just like me. So there is a correlation between my decision and his.
What is the degree of this correlation?
My argument above seems to suggest that the correlation between two rational beings is perfect. This is _almost_ true. It is true if you assume there is a unique answer to the problem posed above. If this is true, then two rational beings will hit upon it, and there will be a perfect correlation between their decisions. If there is no unique answer, there is no way of saying what the correlation is.
Assume that there is a unique answer. What is it?
Well, since both rational beings will hit upon it, and do it, this means that the only options are both pressing or both not pressing the button. A quick inspection says that the best option for me is for both of us not to press the button, and so that is the unique way of maximizing my probability of survival.
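As a rough sketch of that step in Python: with perfect correlation, only the outcomes where both of us make the same move are left, and I pick the better of those two. The function name `superrational_choice` is hypothetical; the table is the one from the four cases earlier.

```python
PRESS, HOLD = "press", "hold"  # "hold" means "don't press"

# My survival probability for each (my_move, his_move) pair.
SURVIVAL = {
    (PRESS, PRESS): 0.5,
    (PRESS, HOLD): 1.0,
    (HOLD, PRESS): 0.0,
    (HOLD, HOLD): 1.0,
}


def superrational_choice():
    """Best move for me when he is assumed to make the same move I do."""
    return max((PRESS, HOLD), key=lambda move: SURVIVAL[(move, move)])


print(superrational_choice())  # hold
```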
If I don't assume there is a unique answer, I can't find it. This is the troubling point. There is no unique solution to this problem, unless you assume there is, in which case you can find it. This is tied in to the problem of what it means to be "rational", which is vague and undefined.
Well, now I will define it.
I will follow Douglas Hofstadter in his incredibly good articles in Scientific American, and define "superrational" as follows.
A strategy is superrational when it maximizes my probability of survival (my utility in a more general case) in this game, when all the superrational players follow it. For instance, in order to analyze the game above in a superrational way, I want to find the superrational strategy which, when all the superrational players follow it, gives me an optimal result. The answer is either both players press the button or both don't press the button. A quick glance at the probability of survival shows me that the strategy of a superrational player playing against another superrational player will be not to press the button. Against any other player, a button pushing machine, a nonsuperrational person, it will be to press the button. This is sort of like rule utilitarianism, except it is much more precise, since it doesn't have any ambiguity in figuring out things like "maximum good for maximum number of people".
This Hofstadter also calls "renormalized rationality", which is a nice analogy between this situation and perturbation theory in quantum field theory. In order to show the analogy, I will define "zeroeth order rationality".
This is rationality done in the following way- you make a table of options and assume your actions are wholly independent of anyone else, and then make the choice that maximizes your probability of survival. The zeroeth order rationality is the only type of rationality game theorists ever talk about, and it is an inadequate description of human rationality. A zeroeth order rational person playing against anything will press the button and try to kill the other thing, even when playing against another zeroeth order rational person, who is perfectly correlated with him.
There is also "first order rationality", which attempts to correct for the correlations. You first find the zeroeth order strategy, and then you look at the other person's situation, and then choose the option that is best for you assuming he will behave according to zeroeth order rationality. Then you can define second order rationality, etc. In the case of this game, rationality to nth order doesn't converge to superrationality- this is just a stupid analogy, don't pay too much attention to it- and therefore there is a need for some sort of "renormalization" before rationality makes sense. This "renormalization" takes into account the fact that a decision isn't just a "bare" isolated decision, but it's "dressed" with all the other decisions you assume it induces through the correlations of rational thought. This analogy suggests calling the mode of analysis which assumes no correlation, and then attempts to correct for other people's behaviors, "perturbative rationality".
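Here is one way to sketch the perturbative recipe in Python, under an assumption of mine: order n is taken to be the best reply to a player who uses order n-1, starting from the dominant move at order zero. The function names are invented for illustration, and ties are broken in favor of pressing.

```python
PRESS, HOLD = "press", "hold"  # "hold" means "don't press"
MOVES = (PRESS, HOLD)

# My survival probability for each (my_move, his_move) pair.
SURVIVAL = {
    (PRESS, PRESS): 0.5,
    (PRESS, HOLD): 1.0,
    (HOLD, PRESS): 0.0,
    (HOLD, HOLD): 1.0,
}


def zeroth_order():
    """The move that is never worse, treating his choice as independent."""
    for mine in MOVES:
        if all(SURVIVAL[(mine, his)] >= SURVIVAL[(alt, his)]
               for alt in MOVES for his in MOVES):
            return mine
    raise ValueError("no dominant move in this table")


def best_response(his_move):
    """My best move with his move held fixed (a tie goes to press)."""
    return max(MOVES, key=lambda mine: SURVIVAL[(mine, his_move)])


def nth_order(n):
    """Assumed reading: order n best-responds to an order n-1 player."""
    move = zeroth_order()
    for _ in range(n):
        move = best_response(move)
    return move


print([nth_order(n) for n in range(4)])
# ['press', 'press', 'press', 'press']
```

Under this reading every order presses, so the sequence never reaches the superrational answer of not pressing.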
Now, I can show that superrationality is, in fact, the "best" sort of rationality, in the sense that it is a rationality that does a weird sort of utilitarian analysis in deciding what options to choose. In that sense, it looks a lot like human rationality, even if very few people ever come to this realization of the concept of superrationality on their own. They try to do "perturbative rationality" usually only to zeroeth order. Then when this doesn't agree with superrationality, they get a funny feeling that something is wrong, and that they need a new "moral principle" or something. Some people have enough conviction in perturbative rationality to ignore this feeling. They just do whatever they want to do, ignoring any possible correlations between their behavior and other people's. We call them "amoral" sometimes. They only behave nicely when they have a fear of getting caught being bad, and punished.
But this is just interpretation. It doesn't change the fact that superrationality is the best way to make decisions. In that sense, utilitarianism isn't based on instinct- it's based on rational behavior with the rather wrong assumption that people are often superrational. Even without this assumption, you still don't get problems in the real world, because even perturbatively rational people will make jails for criminals and put cops on the streets, and basically behave well towards each other most of the time, so that nobody retaliates against them. This is ordinary utilitarianism, and it works well enough, except in a few cases like the example of the game I gave. In an ideal world, no one would press the button. In the real world it depends who you play against, and what confidence you have in their superrationality.
Another analysis and then I'll be done. I will now assume that there is some probability p that the person in the room is not superrational, and show that so long as p is small, I shouldn't press the button, so that the whole thing is stable to small changes.
I will do this as follows. I will assume the same game as before, except that now there is a probability p that the person in the other room is, in fact, a robot, who presses the button automatically. Just to be fair, I will give the same probability for me being a robot, so that the situation is symmetrical. Now I ask, what is the strategy that maximizes my probability of survival?
Well, assume it's to press the button. Then I know that I will press the button, and therefore the other guy will press the button, and whether he’s in the room or the robot is, the button will get pressed, so my probability of survival, if I am in the room, is 50%.
Assume it's not to press the button. Then I have a probability p of dying, since if the robot is in the room I will certainly be killed, and if the superrational player is in the room, he won't press the button, since I don’t, and he is superrational too, and by assumption the superrational strategy is not to press the button.
So as long as p<50% it is not in my best interest to press the button.
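The comparison can be checked with a few lines of Python. This is only a sketch of the two cases just described; the function names are hypothetical, and `p` is the probability that the other room holds the robot.

```python
def survival_if_press(p):
    """I press: the robot presses, and a superrational player presses too,
    so both buttons get pressed and one of us dies at random."""
    return 0.5


def survival_if_hold(p):
    """I don't press: the robot (probability p) kills me for sure,
    a superrational player (probability 1 - p) doesn't press either."""
    return p * 0.0 + (1 - p) * 1.0


def should_press(p):
    """True only when pressing gives a strictly better chance of survival."""
    return survival_if_press(p) > survival_if_hold(p)


print(should_press(0.1))  # False
print(should_press(0.9))  # True
```

At exactly p = 50% the two strategies tie, so the sketch reports no strict advantage to pressing there.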
A utilitarian might say, well what's the utilitarian way of doing it?
He would say, there are four possibilities- two robots, two people, I'm a robot he’s a person, I'm a person he's a robot. In each case he would compute the expectation value of the number of people that survive given either strategy, and then he would compare the two strategies (pressing or not pressing the button).
Strangely enough, this utilitarian gets that so long as p<50%, you shouldn’t press the button.
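A sketch of that bookkeeping in Python, with two assumptions of mine stated loudly: only people are counted as survivors (a robot counts for nothing), and when both buttons are pressed the random victim can be a robot as easily as a person. The function name is made up for illustration.

```python
from fractions import Fraction


def expected_people_surviving(p, humans_press):
    """Expected number of surviving people, counting people only (assumption).

    p is the probability that a given room holds a robot, which always
    presses. humans_press is the strategy every person follows.
    """
    p = Fraction(p)
    q = 1 - p  # probability that a given room holds a person
    if humans_press:
        two_people = Fraction(1)     # both press, one dies at random
        one_person = Fraction(1, 2)  # both press, the person dies half the time
    else:
        two_people = Fraction(2)     # nobody presses, both live
        one_person = Fraction(0)     # the robot presses, the person dies
    # Two robots (probability p*p) contribute no people at all.
    return q * q * two_people + 2 * p * q * one_person


for p in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    press = expected_people_surviving(p, True)
    hold = expected_people_surviving(p, False)
    print(p, press, hold)
# 1/4 3/4 9/8
# 1/2 1/2 1/2
# 3/4 1/4 1/8
```

Under these assumptions pressing gives 1-p people on average and not pressing gives 2(1-p)^2, so not pressing wins exactly when p<50%.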
Hmm, they're the same...
I'll leave that for you to ponder.
-- Ron Maimon
