Until last week, I had never heard of the paradox of the absent-minded driver, but I was recently told that it has some relevance to my encyclopedia article on quantum game theory. That plus the fact that I am a notoriously absent-minded driver myself made me think I should check out the original source. Here’s what I extracted:
Each day, Albert leaves his office (at the bottom of the map), gets on the Main Highway and attempts to drive home to his house on Second Street. If he turns too soon (onto First Street) or if he overshoots (going all the way to the north end of the Main Highway), he is mauled by dinosaurs.
Obviously, Albert’s best strategy is to go straight at the first intersection and turn right at the second. Unfortunately, both intersections look identical. Doubly unfortunately, Albert can never remember whether he’s already passed the first intersection.
Since Albert can’t tell the intersections apart, he needs a single strategy for both of them. Strategy A is to turn at every intersection. This delivers him directly to the First Street dinosaur mob. Strategy B is to go straight at every intersection, putting him on a direct route to the North Side crew. Neither of these strategies has any chance of getting him home.
Therefore, Albert adopts Strategy C, which is to flip a fair coin at every intersection. This gives him a 50% chance of going straight at First Street and a 50% chance of turning right at Second, for an overall 25% chance of arriving home safely. It’s easy to compute that Albert can do no better.
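For anyone who would rather check the "can do no better" claim than take it on faith, here is a minimal sketch in Python. The function name `home_probability` and the 101-point grid are my own illustrative choices, not anything from the original source; the only modeling assumption is the one stated above, that Albert uses the same turn probability at both intersections.

```python
from fractions import Fraction

def home_probability(turn_prob):
    """Chance of getting home if Albert turns with probability turn_prob
    at every intersection (hypothetical helper, for illustration only)."""
    # He has to go straight at First Street and then turn at Second Street.
    return (1 - turn_prob) * turn_prob

# Scan turn probabilities 0, 1/100, 2/100, ..., 1 and keep the best one.
grid = [Fraction(k, 100) for k in range(101)]
best_turn_prob = max(grid, key=home_probability)

print(best_turn_prob, home_probability(best_turn_prob))  # prints: 1/2 1/4
```

The scan is crude, but since the payoff is a downward parabola in the turn probability, the grid maximum at 1/2 is also the true maximum.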
Fortunately, Albert is smart enough to figure this out. Perhaps unfortunately, he’s also smart enough to reason a bit further.
You see, with Strategy C in place, Albert makes it as far as First Street every day and continues on to Second Street only half the time. Which means that every time Albert reaches an intersection, the odds that he’s at First Street are 2 to 1. So going straight is always the better bet. Of course if he takes this computation to heart, Albert is sure to make the North Side dinosaurs very happy.
Alternatively, Albert might reason that he’s probably, but not certainly at First Street, so he should flip a weighted coin that increases his chance of going straight. Strategy D is to flip a coin that comes up “straight” 1/4 of the time. Why 1/4? Because Albert knows enough calculus to recognize that this is the best he can do on the assumption that he gets to revise his plan only once. With some other assumption, he might have come up with some other probability—but what matters is that it’s surely not 1/2.
With Strategy D, Albert goes straight at First Street 1/4 of the time, and turns at Second Street 3/4 of the time, which reduces his probability of a safe homecoming to 3/16. Albert would have been better off if he’d been dumb enough to stick with Strategy C.
There are multiple paradoxes here. First, the very fact that Albert is committed to Strategy C is what leads him to compute that he should switch to Strategy D. Second, even though his computation seems correct, he’s better off ignoring it.
Albert can avoid the first paradox if he starts off with Strategy E, turning right at each intersection with probability 2/3. A little calculus reveals that in that case, the recalculation leads him right back to Strategy E. But Strategy E is clearly the wrong strategy, since it gets him home only 2/9 of the time.
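Here is one way to reproduce the numbers for Strategies D and E, as a hedged sketch rather than the definitive model. The assumption (mine, chosen because it matches the figures above) is this: if the plan in force is to turn with probability x, Albert believes he is at First Street with probability 1/(2-x), and he picks a new turn probability z to maximize his chance of getting home, treating z as what he will use for the rest of the trip. The function names are hypothetical.

```python
from fractions import Fraction

def home_probability(turn_prob):
    """Chance of getting home when the same turn probability is used at both intersections."""
    return (1 - turn_prob) * turn_prob

def revised_turn_prob(planned_turn_prob):
    """Albert's recalculated turn probability, under the assumption described above.

    Belief: at First Street with probability 1/(2-x), at Second with (1-x)/(2-x).
    Payoff of turning with probability z: [(1-z)*z + (1-x)*z] / (2-x).
    Setting the derivative to zero gives (2-x) - 2z = 0, i.e. z = (2-x)/2.
    """
    return (2 - planned_turn_prob) / 2

strategy_c = Fraction(1, 2)
strategy_d = revised_turn_prob(strategy_c)
strategy_e = Fraction(2, 3)

print(strategy_d, home_probability(strategy_d))  # prints: 3/4 3/16
print(revised_turn_prob(strategy_e))             # prints: 2/3
print(home_probability(strategy_e))              # prints: 2/9
```

Turning 3/4 of the time is the same as going straight 1/4 of the time, which is Strategy D. Under a different assumption about what Albert expects of himself at the next intersection, the recalculation would come out differently.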
So the question is: How can a sensible calculation lead to a clearly wrong answer? Given that you’re absent-minded does it pay to also be dumb?
I think I have a useful framework for thinking about this problem, but it’s a bit technical and I think I should save it for another post. Meanwhile, what do you think?
I think he should get satnav.
Is this similar to the Sleeping Beauty problem?
“Which means that every time Albert reaches an intersection, the odds that he’s at First Street are 2 to 1. So going straight is always the better bet.”
It is indeed 2:1 that Albert is at First Street. However, the payoffs for going straight and turning are not identical.
Turning at the first intersection guarantees that you’ll be mauled by dinosaurs, while turning at the second guarantees you’ll get home safely. However, going straight at the second intersection guarantees you’ll get mauled by dinosaurs, while going straight at the first only gets you home with the probability that you’ll turn at the next intersection.
Since we’ve already calculated this to be 0.5, we can trivially see that turning and going straight have the same expected payoffs. You’re at the second intersection a third of the time: if you go straight you’ll be mauled for a payoff of 0, if you turn you’ll get home for a payoff of 1. You’re at the first intersection two thirds of the time: if you turn, you’re mauled for a payoff of 0. If you go straight, you know that 50% of the time, you’re going to turn at the next intersection for a payoff of 1. Given your knowledge, turning and going straight will both get you home with probability 1/3.
Probability of survival p = (1-x)*x, where x is the probability of turning. Assuming that is what Albert wants to maximise, differentiate this and you get 1-2x=0 or x=0.5. So Strategy C is not the dumb one :)
As with most probability stuff, there’s no paradox. It’s just a question of expressing the odds of success correctly.
Since Albert can apply only a single ‘turn or no-turn’ strategy with no other information, his odds of getting home are:
(1 – p) * p
Maximising the result yields p = 0.5.
“How can a sensible calculation lead to a clearly wrong answer?”
By being the wrong calculation.
“Given that you’re absent-minded does it pay to also be dumb?”
Nope!
Aumann, Hart, and Perry have a nice (and short) paper on this topic, which I think provides a useful way of thinking about the problem.
I particularly like the observation that you can do better (than strategy C) with the aid of a very simple trick. Place a coin in the seat next to you (heads or tails up at random, you’ll forget anyway). Now, whenever you enter an intersection, if the coin is tails, exit. If the coin is heads, continue, but flip the coin over.
Half the time the coin will be heads when you arrive at the first intersection, so you will stay on the road, flip the coin, and exit at the second intersection, reaching home!
Half the time the coin will be tails up when you arrive at the first intersection, and sadly, the dinosaurs will have their way with you.
But this is still twice the chance of getting home (compared with strategy C). The reason is of course that the coin allows you to correlate your choices, so you have some “probabilistic memory”.
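The coin trick is easy to check by brute force, since there are only two starting states. This is a small sketch of the rule as described above; the function name `drive_with_coin` and the string labels are made up for illustration.

```python
from fractions import Fraction

def drive_with_coin(initial_coin):
    """Follow the coin rule and return True if Albert gets home."""
    coin = initial_coin
    for intersection in ("First", "Second"):
        if coin == "tails":
            # Tails means exit here; only the Second Street exit leads home.
            return intersection == "Second"
        # Heads means continue, but flip the coin over for next time.
        coin = "tails"
    # Went straight at both intersections: north end of the Main Highway.
    return False

# The coin starts heads or tails with equal probability.
success = sum(Fraction(1, 2)
              for start in ("heads", "tails")
              if drive_with_coin(start))

print(success)  # prints: 1/2
```

The driver still remembers nothing; all of the "memory" lives in the coin, which is why this beats the 1/4 from Strategy C.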
link to paper:
http://delivery.acm.org/10.1145/1030000/1029702/p97-aumann.pdf?key1=1029702&key2=5126434721&coll=GUIDE&dl=GUIDE&CFID=89003080&CFTOKEN=20720761
Solved in the comments at Less Wrong, wasn’t it? (http://lesswrong.com/lw/182/the_absentminded_driver/13z1)
The Less Wrong post is slightly different: going straight at the second intersection produces a non-zero payoff (but less than ‘getting home’). That leads to a different payoff formula, which yields a different value for p.
However, the conclusion that attempting to account for ‘the probability of being at the first intersection’ introduces error (and is therefore a Bad Plan) is correct.
mcp: Introducing a memory device into a problem based on lacking memory should give a 100% chance of getting home, not an improved one! The driver is capable of making decisions, just not remembering them – introducing a memory device removes the problem (he can decide to always place the coin face up when he begins his journey, etc). Turns out a link in a comment on that Less Wrong post that Bo linked to has a nice explanation of that sort of thing:
http://lesswrong.com/lw/2k/the_least_convenient_possible_world/
Of course, the charity example in that post is wrong, but that’s another topic…
My feeling is that if you get mauled by dinosaurs often enough, you learn to pay attention when you drive… or you die. :)
MWC puts it very well; read his first answer.
As with many of these things, sometimes a hidden assumption is slipped in that makes things seem odd. The hidden assumption that causes the problem is that the decisions at corner 1 and 2 are somehow separate. As MWC notes there is only one decision to make. The temporal sequencing of events just misleads here. You make one choice, and each choice has a payoff.
Seems to me this “solution” that provides a 25% chance of getting home is bizarre. Why would we apply a probabilistic solution to a system with a fixed and known state? In this case, there is no statistical solution that is a sensible calculation.
Web pages didn’t use to preserve state. They solved that problem, not by guessing, but by using cookies or other equivalent mechanisms. This is what Albert needs.
Once the problem is stated in correct terms, there are any number of reliable solutions. There are too many perfect solutions to even list. You can adjust the external world (e.g., street signs or a painted curb), the car (e.g., GPS), or the driver (e.g., take a cab).
In fact, no solution at all may be necessary. Unless Albert is blind (in which case, he has more problems than dinosaurs), he can do a U-turn when he sees a dinosaur ahead. Dinosaurs just don’t hide very well in cities. After all, which is worse: the slight chance of a ticket for an illegal U-turn or the 100% chance of being a dinosaur meal?
Albert needs to rent a condo on Main.
Why not just carry a bazooka to kill the dinosaur? If you end up firing the bazooka, you know you made the wrong choice and go back. Eventually, the dinosaurs go extinct, or, if you’re very unlucky, evolve to resist bazooka shells.
“That’s absurd,”…”That may be true, but it’s completely accurate, and as long as the answer is right, who cares if the question is wrong? If you want sense, you’ll have to make it yourself.”
-The Phantom Tollbooth
Albert’s calculation of 2 to 1 odds that he is at first street is based on the assumption he turns half the time. But if he changes his strategy by using a weighted coin, he no longer can assume 2 to 1 odds. For example, if he turns 1/4 of the time the odds that he is at first street are only 4/3, not 2, so he shouldn’t use a weighted coin so heavily weighted for going straight. The only consistent strategy is 1/2.
Neil: No, by your reasoning the only consistent strategy is in fact 2/3.
By increasing the probability of going straight at the first turn, you are simultaneously reducing the probability of turning at the second. When you take both into account, the optimal strategy is 1/2.
Where’s the paradox?
Nitin: Yes, it’s clear that the optimal strategy is 1/2. The paradox is that there is an argument that seems to conclude otherwise. A resolution would consist of finding a flaw in that argument. Repeating the (correct) argument in favor of 1/2 adds nothing to that endeavor.
As I see it, the question is why this iterative procedure doesn’t help when our intuition tells us it should. That is, what is wrong with the intuition that Albert is more likely to be at 1st Street than 2nd (he’s at 2nd only if he has been at 1st), so he should be more willing to go straight than turn right? I think I agree with Henry that it’s because of the payoff structure: going straight at the 1st intersection isn’t as good as going right at the 2nd.
If we change this to (countably) infinitely many blocks with dinos to the right of each street except for the 2nd, it’s clearly the identical problem, but it makes my intuition a bit clearer about the fact that we’re more likely to be at the 2nd street than (say) the 47th.
“I think I have a useful framework for thinking about this problem, but it’s a bit technical and I think I should save it for another post.”
At this, I pictured Professor Landsburg playing roughly with plastic dinosaurs and a matchbox car in his office, and a colleague knocking.
“Landsburg, are you playing with your dinosaur toys again?”
“No! Don’t open the door!”
Sorry, couldn’t help myself.
“Given Strategy X, Strategy Y is optimal.” Is that the confusion?
I think Nitin has pointed out the flaw. When Albert arrives at an intersection, lacking memory, he correctly assumes he is more likely to be at the first intersection than the second. But it does not follow that it is advantageous to him to increase the probability of going straight, because when (if) he reaches the next intersection he will have lost his memory and will make the same decision. If he decides to increase the probability of going straight, then going straight has less value to him because he is less likely to turn when he reaches the next intersection, if there is a next intersection.
“Yes, it’s clear that the optimal strategy is 1/2. The paradox is that there is an argument that seems to conclude otherwise. A resolution would consist of finding a flaw in that argument.”
The flaw is in the statement, “Which means that every time Albert reaches an intersection, the odds that he’s at First Street are 2 to 1. So going straight is always the better bet.” It’s true that the odds are 2 to 1, but it doesn’t follow that going straight is the better bet.
Others may have said this verbally, but I prefer a mathematical proof:
Let D = the utility of meeting the dinosaur.
Let H = the utility of getting home.
If you reach an unknown intersection and turn, you get expected utility of:
EU(turn) = (2/3)D + (1/3)H
If you reach an unknown intersection and go straight, you get expected utility of:
EU(straight) = (2/3)[(1/2)D + (1/2)H] + (1/3)D = (2/3)D + (1/3)H
So the expected utilities of turning and going straight are equal, and thus there is no reason to adjust your probability of going straight.
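The same check can be run with numbers plugged in. This is just a sketch of the two formulas above, assuming (as they do) 2-to-1 odds of being at First Street and a fair coin at the next intersection; the function and argument names are my own.

```python
from fractions import Fraction

def expected_utilities(dinosaur_utility, home_utility):
    """Return (EU of turning, EU of going straight) at an unknown intersection."""
    at_first = Fraction(2, 3)    # chance this is First Street
    at_second = Fraction(1, 3)   # chance this is Second Street
    next_turn = Fraction(1, 2)   # fair coin at the following intersection

    # Turning: dinosaurs if this is First Street, home if it is Second.
    eu_turn = at_first * dinosaur_utility + at_second * home_utility

    # Going straight: from First Street you face the coin again at Second;
    # from Second Street you run into the north-end dinosaurs.
    eu_after_first = next_turn * home_utility + (1 - next_turn) * dinosaur_utility
    eu_straight = at_first * eu_after_first + at_second * dinosaur_utility

    return eu_turn, eu_straight

print(expected_utilities(0, 1))    # D = 0, H = 1: both come out to 1/3
print(expected_utilities(-10, 5))  # any other D and H still give a tie
```

With D = 0 and H = 1, both values are the 1/3 quoted earlier in the thread, so the tie does not depend on how much Albert dislikes dinosaurs.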
This can be done without the utility values. The key point is realizing that if you go straight, there is a 2/3 chance that you will have to flip another coin (because you were only at 1st street). As a result, while there’s a 2/3 chance that going straight will get you closer to home, there’s only a 1/3 chance you will actually get there.
“The paradox is that there is an argument that seems to conclude otherwise. A resolution would consist of finding a flaw in that argument.”
As I said, this is purely a question of stating the odds correctly, and the very notion of ‘action-optimality’ introduces error.
Piccione and Rubinstein’s initial hypothesis is simply wrong. From their paper:
“Planning his trip at the bar, the decision maker must conclude that it is impossible for him to get home and that he should not exit when reaching an intersection. Thus, his optimal plan will lead him to spend the night at the motel and yield a payoff of 1.”
This is the initial mistake, since this is not in fact the optimal strategy at planning time. It is followed with:
“Having chosen the strategy to continue, he concludes that he is at the first intersection with probability 1/2. Then, reviewing his plan, he finds that it is optimal for him to leave the highway since it yields an expected payoff of 2. Despite no new information and no change in his preferences, the decision maker would like to change his initial plan once he reaches an intersection!”
This is where the initial mistake is compounded by introducing unnecessary error. The rest of the paper is taken up making assertions such as:
“Therefore, no mixed strategy can be strictly preferred to all the pure strategies.”
Which is apparently also derived from their initial misconception about what constitutes the initial optimal plan (and, in fact, the optimal plan regardless).
The only way to justify the existence of the paper is to focus on this conclusion:
“In all its forms the absent-minded driver example exhibits a conflict between two types of reasoning.”
If the paper is treated as a study of bad human reasoning (the same sort of bad human reasoning that causes a baseball manager to call for a sacrifice bunt to move a runner over late in the game but not early in the game) then it has value. But statements such as this make it seem unlikely that was the intent:
“We have investigated one resolution which requires dividing a decision maker into multiple independent selves.”
Glen: Your calculations depend on a very specific model of how you’re going to behave at the *next* intersection. Next week, I will post a general model of which yours is a special case, but different special cases yield different conclusions.
Steve, isn’t the point of the problem that how we behave at the ‘next’ intersection is independent of (and hence identical to) how we behave at this one? So Glen’s ‘special case’ is in fact what is specified in the problem statement?
If there’s any asymmetry whatever between the two intersections, then obviously the driver can exploit this to do better than p = 1/4.
I will admit here to never having found any of the arguments for anything other than 1/2 even vaguely plausible, which either means I saw through them immediately, or don’t entirely understand them.
Optimal strategy: Bring many boxes of raw meat drenched in dinosaur pheromone. Turn right at every street. You people and your fractions.
GregS, I seem to recall Professor Calvin engaging in research on the subject.
I never really saw any paradox in this problem at all, merely faulty reasoning on the part of Albert.
It’s true that if you want to optimize for the first intersection, the algorithm should be “never turn”. He saw that this will be a bad idea, for obvious reasons. So, instead, he opted for a weighting that would handle the first intersection better without obviously dooming him. However, he didn’t bother optimizing for the second intersection, and he’s neglecting the effects of applying the same algorithm to the second intersection. Since the goal is to correctly get past BOTH, this is a clear mistake.
As in the case of “always go straight”, any strategy that (correctly) takes into account that he’s likely at the first intersection, but seriously lowers his chances when he reaches the second, is likely to be a bad one.
Ahh, so you have access to coins of various weights. Every day, you should bring one coin with you. At each intersection, reach for the coin. If you find it, throw it out the window and drive straight. If you don’t find it, turn right. 100% success.
A possible framework for thinking about this problem is to suppose that there are really two different Alberts: Albert at the office (AAO); and Albert behind the wheel (ABW). ABW behaves in a manner that while probabilistic, is predictable to AAO. AAO therefore, must design a mechanism that, given ABW’s behavior, will get AAO home with the highest probability.
The paradox stems from considering two distinct types of behavior exhibited by ABW. In the first, ABW is told to turn right with some predetermined probability, call it X, set by AAO, and ABW DOES EXACTLY AS HE IS TOLD. If ABW does exactly as he is told, AAO should set X equal to 1/2, which gets AAO home with probability 1/4. This is equivalent to Strategy C as described by Professor Landsburg.
In the second type of behavior, ABW has his instructions from AAO, but he also has the ability to DECIDE FOR HIMSELF the best strategy for getting himself home. Given that the instructions from AAO require him to turn right with probability X, ABW knows that he will be at the intersection at First Street with probability 1/(2-X) and the intersection at Second Street with probability (1-X)/(2-X). Since ABW can decide for himself the best strategy for getting himself home, he can choose a strategy that has him turn right with probability Z. After some math, we can show that ABW will choose Z equal to (2-X)/2. If X is equal to 1/2, ABW will choose Z equal to 3/4, as was the case in Strategy D described by Professor Landsburg. Further, as described by Professor Landsburg, if AAO chooses a strategy of X equal to 1/2 when ABW chooses a strategy of Z equal to 3/4, ABW gets home with probability 3/16, which is lower than his probability of getting home if he were to do exactly as he is told.
But the story isn’t over yet, because if AAO knows that ABW will choose Z in response to his choice of X, then it benefits AAO to take this into account when choosing X. For instance, if AAO chooses X equal to 1, then substituting 1 into ABW’s solution for Z above gives us Z equal to (2-1)/2, which is equal to 1/2. This choice of X gets ABW home with probability 1/4. QED
I agree with those who argue that this is not a paradox but an error on Albert’s part. His error consists of failing to distinguish between two different probabilities, the probability of turning right at 1st Street and the probability of turning right at 2nd Street. Obviously the optimal choices are p1 = 0 and p2 = 1. Since Albert can’t know whether he is choosing p1 or p2, it is a mistake to act as if he can. It is suboptimal to choose a joint value for p1 and p2 as if they can be chosen distinctly when they cannot. Albert should recognize this and, quite rationally, not try to do something he is incapable of doing. It seems to me, therefore, that this isn’t really an example of time inconsistency, because t.i. arises precisely because the chooser DOES know which decision node he’s at at all times.
BTW, I may be committing an elementary mistake, but it seems to me that the probability of being at 1st Street is (1-p). My thinking is this: Albert is either at 1st or 2nd, so the probability of being at 1st is 1 minus the probability of being at 2nd. Since the probability of being at 2nd is p (or, strictly speaking, p1), it seems that the probability of being at 1st is therefore (1-p). When I set up the problem in this way, I get a value of p* equal to about .243 when Albert tries to outsmart himself, so he still does worse but not by so much as you’ve said.
Sandy,
No, it is 1/(2-p). If he has N trials, he will show up at the first intersection N times and the second intersection N(1-p) times, for a total of N(2-p) intersection visits. Thus, he can deduce there is a likelihood of N/[N(2-p)] = 1/(2-p) that he is at the first intersection and (1-p)/(2-p) that he is at the second.
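The counting argument is short enough to write down directly. A minimal sketch, with `prob_at_first` and `odds_at_first` as hypothetical helper names, and with p read as the probability of turning right at each intersection:

```python
from fractions import Fraction

def prob_at_first(turn_prob):
    """Share of all intersection visits that happen at First Street."""
    first_visits = Fraction(1)       # every trip reaches First Street once
    second_visits = 1 - turn_prob    # only trips that went straight reach Second
    return first_visits / (first_visits + second_visits)

def odds_at_first(turn_prob):
    """Odds (First Street : Second Street) expressed as a single ratio."""
    p_first = prob_at_first(turn_prob)
    return p_first / (1 - p_first)

print(prob_at_first(Fraction(1, 2)), odds_at_first(Fraction(1, 2)))  # prints: 2/3 2
print(prob_at_first(Fraction(1, 4)), odds_at_first(Fraction(1, 4)))  # prints: 4/7 4/3
```

The first line is the 2-to-1 figure from the post, and the second is the 4/3 mentioned earlier in the thread for a driver who turns 1/4 of the time.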
Neil,
Thanks for your clear presentation. My initial puzzlement arose from the fact that it seemed that the state of being alive had to convey some information to Albert, but I did not work out the answer carefully. I’ve tried to do that in the next paragraph. If this approach to the problem doesn’t seem useful or sensible to you, then this is the place to stop reading this comment.
The probability that Albert is at 2nd Street is the probability that he made a decision at 1st Street, conditional on the fact that he is alive. Call this posterior probability Pr(D|A), where D refers to the state of having made a decision at 1st Street and A refers to the state of being alive. Bayes’s theorem tells us that this posterior probability is equal to Pr(A|D)Pr(D)/Pr(A). We know that Pr(A|D) = (1-p) [which I mistook for Pr(D|A) in my previous comment]. What we need are prior probabilities for A and D. The possible states in which Albert is alive are (i) he hasn’t yet made a decision [with probability 1 – Pr(D)] or (ii) he has made a lucky decision [with probability (1-p)Pr(D)]. Therefore, it seems reasonable to propose that the prior probability assigned to the state of being alive is Pr(A) = [1 – Pr(D)] + (1-p)Pr(D) = 1 – pPr(D), so that Pr(D|A) = (1-p)Pr(D)/[1 – pPr(D)]. IF we consider it equally likely ex ante that Albert has or has not yet chosen, then the prior probability on that event is Pr(D) = 1/2, in which case Pr(D|A) = (1-p)/(2-p) = the posterior probability of being at 2nd Street.
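For what it’s worth, the Bayes arithmetic above can be checked mechanically. This sketch just evaluates the formula with exact fractions; the function name and the default prior of 1/2 mirror the assumption in the comment and are not a claim about the right prior.

```python
from fractions import Fraction

def posterior_at_second(turn_prob, prior_decided=Fraction(1, 2)):
    """Pr(D|A): chance Albert already chose at 1st Street, given he is alive."""
    alive_given_decided = 1 - turn_prob                               # Pr(A|D)
    alive = (1 - prior_decided) + alive_given_decided * prior_decided  # Pr(A)
    return alive_given_decided * prior_decided / alive

# Compare against the closed form (1-p)/(2-p) for a few turn probabilities.
for p in (Fraction(1, 4), Fraction(1, 2), Fraction(2, 3)):
    assert posterior_at_second(p) == (1 - p) / (2 - p)
    print(p, posterior_at_second(p))
```

With a prior other than 1/2 the posterior no longer matches (1-p)/(2-p), so the agreement with the counting argument rests on that choice of prior.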
Hmm. I’ll have to think about this. I’ve always, like Descartes, taken the posterior probability that I am alive as equal to one.