Monty Hall Problem

I would imagine most people who would bother reading a blog like this are very familiar with the Monty Hall problem. Of course, in the interest of completeness, I'm going to state the problem here:
You are on a game show where you must choose between three doors. Behind one door is a prize, and the other two doors have nothing behind them. After you select a door the show host, Monty, will open one of the doors that has nothing behind it. Monty, of course, knows exactly where the prize is and is always able to select such a door. You are then given the option to select a new door. What is the optimal strategy to select the door with the prize?

Given a bit of analysis, its not too hard to determine the optimal solution, though some people have a hard time wrapping their minds around it. However, opponents of the standard solution often point out an unstated assumption, and that is that Monty will always offer you the choice to switch. If Monty gets paid based on contestants not finding the prize, then perhaps Monty would rather only offer the switch to contestants who correctly guessed the prize door. In this case, it is clear to see that you should never switch. Of course, perhaps Monty will occasionally offer the switch to a contestant who is wrong, just to mess things up a bit.

After thinking about this, I came up with a extension problem to work out:
Suppose Monty will offer the switch with probability x to a contestant who picked the right door, and will offer the switch with probability y to a contestant who picked the wrong door. Assuming the contestant watches the show frequently enough to determine the values of x and y, what is the optimal solution for the contestant? What are the optimal values of x and y for Monty?

If you feel like working this out yourself, I encourage it. Its not a particularly hard problem, and the answer is very intuitive.

Anyway, I'm going to post the answer here now.

Suppose the contestant decides he will accept the switch with probability c. c cannot depend on the contestant getting the right door, as the contestant doesn't know, however, c can depend on x and y.

The probability the contestant wins is given by the sum of two terms. First, the probability they got the wrong door (2/3), times the probability of being offered the switch (y), times the probability they accepted the switch (c). The second term is in two pieces, the probability they got the right door (1/3) times the probability they were not offered the switch (1-x), and the probability they got the right door (1/3), were offered the switch (x), and refused it (1-c).

Adding it up, the probability the contestant wins is given by:
P=1/3(2yc+(1-x)+x(1-c))

P=c/3(2y-x)+1/3

Since this is just a linear function of c, it will be maximized at the endpoints, either c=0 or c=1, depending on if 2y-x is positive. If Monty offers "good switches" (y) more than half as often as he offers "bad switches" (x), then you should switch, otherwise don't. If 2y-x=0, then either switch or do not, it doesn't matter.

The optimal solution for Monty is to make sure 2y-x is not positive, because he has no way to reduce P less than 1/3 (the contestant can always choose c=0), but if 2y-x is positive then P will be greater than 1/3 with c>0.

Finally, note that in the original case, if Monty always offers the switch then x=y=1, and the optimal solution is to switch. c=1 gives that P=2/3, whereas never switching gives P=1/3. So this method also solves the original Monty Hall problem.

No comments: