Balls In A Bag

I have been pretty consistent in posting just once a week, and thats mostly just been so I don't run out of blogging material, but Matt recently solved a problem that we have been working on for nearly 3 years so I really wanted to post the solution as soon as I was able to work out the wrinkles. This is the problem as I first learned it on the forums at xkcd:
Consider you have a bag filled with N balls, each with a different colour. At each step you will randomly select a ball from the bag, look at its colour, and select another ball from the bag, and repaint the second ball to the colour of the first one. After doing that, place both balls back into the bag and begin a new step.
After doing this enough times, it will eventually happen that all the balls in the bag will have the same colour. On average, how many steps does it take for this to happen?

Doing the problem "properly" is very difficult, and I would only suggest it if you are crazy. However, solving it in the cases of N=2,3,4 is not particularly hard, and from that you will be able to guess what the answer should be.

6 comments:

Lionel Brits said...

N! good night :)

Kory Stevens said...

Probably a reasonable first guess, however, I can see that you haven't even solved the N=2 case if that is actually your guess.

Surprisingly, the actual answer is smaller than exponential in N.

McAnerbot said...

I think I did the math right and ended up at N^2 (that is, a guess after looking at N=2,3,4 and THINKING that I did the math right). Which, you know, highly suspicious.

Kory Stevens said...

You probably got it right, but you either typoed to brain-oed, because N^2 isn't quite correct. Anyway, I'll put up the answer in the next day or so.

McAnerbot said...

O(N^2) I meant.

Since you know, I'm totally not REVEALING the answer. Also I hope you have a convincing proof because I really don't want to invert a 6x6 matrix to do N=5 (although i did already generate the matrix...)

Yay for stochastic systems.

It is interesting to see if you consider the arbitrary N case, how the function that is 'how many steps on average it will take to remove the NEXT color' looks.

Since clearly going from 2 colors to 1 is probably the vast share of total steps. Also for large N the first few colors to 'die out' will all take on average 1 step until the number of colors is significantly less than N.

I didn't invest much thought into it but it looks like inductioning it won't be pretty to prove the behavior.

Kory Stevens said...

Yeah, I think you almost hit on something when you talk about how long it takes for the "next colour to be removed", but I'm not sure.

Getting the general formula and trying to prove it by induction on N is basically impossible, there is no good way to relate the case K to the case K+1.