Forward Is Backward

Some time ago, I had presented Jon with the balls in a bag puzzle, and he had suggested the following reverse puzzle:
Consider a bag with N red balls. At each step, you select two balls from the bag and if they are the same colour, you repaint one of them a new unique colour. If they are different, then do nothing. Either way, place the two balls back in the bag and begin a new step. Given long enough, eventually all the balls in the bag will be of different colours, on average how many steps does it take?

Its clear that this is some sort of reverse of the original problem, however, it is not clear that this is actually the "best" reverse. One could also give the newly coloured balls "random colours from the list of N colours" instead of new unique ones, but it turns out this problem is much more awful.

Anyway, I'm going to just solve this reverse problem on the spot, it seems like the thing to do. Let F(k) be the average number of steps it takes to get from a state with k red balls to a state with one red ball. Note that knowing the number of red balls totally specifies a state, since all non-red balls are of unique colours. F(1)=0 trivially, and we seek F(N), the state with N red balls. The probability that we move from a state with k red balls to a state with k-1 red balls (I will call this P(k)) is the chance that we draw exactly two red balls from the bag, that is:
P(k)=k/N*(k-1)/(N-1)

Of course, if something happens with probability p, then on average we need 1/p tries to have it happen once (I proved this back in the post where I gave the solution to the prisoner problem). Naturally then, we have:
F(k)-F(k-1)=(N-1)N/(k-1)k

and since F(1)=0, we can just express F(m) as F(m)-F(m-1)+F(m-1)-F(m-2)+...+F(2)-F(1), to give
F(m)=&Sigma N(N-1)/k(k-1)

summing k from 2 to m.
Therefore, for F(N) we have:
N(N-1)&Sigma 1/k(k-1)
=N(N-1)&Sigma [1/(k-1)-1/k]

summed k from 2 to N. This series telescopes, the terms look like 1/1-1/2+1/2-1/3+1/3-1/4+...-1/(N-1)+1/(N-1)-1/N. The final result is then that:
F(N)=N(N-1)[1-1/N]
=N(N-1)[(N-1)/N]
=(N-1)^2

Weird.

Alright, so this (much easier) problem has the same answer as the "forward" ball problem. I guess in some sense it is the correct reversal of the initial problem, since it has the same answer. I would love to find a way to show that these two problems have the same answer (that does not rely on using the fact that they both give (N-1)^2), so that way we can solve the harder one just by doing the easier one. Anyway, thats all I got here, if you can find a way to show that these problems are equivalent, feel free to give it in the comments.

Its All Red

Time to post the solution to the balls in a bag problem from last time. The solution actually gets pretty crazy, and I actually had to make up a tiny amount of new math (new to me, anyway) in order to justify the solution that Matt came to me with, but all in all it was a good adventure.

First of all, we are going to select a single colour of ball from the bag (let it be called "red") and only watch the progress of that colour. If red is ever eliminated we will declare the experiment a failure and try again, if red is ever the only colour left we ask how many steps it took. Naturally this means that only 1/N of our experiments even end in success, so I will have to account for that at the end, but we will cross that bridge when we come to it. Next, let us define a function F(k,m) as the probability that we are in a state with k red balls after m steps. Naturally F(1,0)=1, F(k,0)=0 for k &ne 1, and the asymptotic behaviour of F in m is
limm->&infinF(N,m)=1/N
limm->&infinF(0,m)=(N-1)/N
limm->&infinF(k,m)=0 for all other k

In words, this is that our initial condition has exactly 1 red ball, and in the long run we will either have all red balls (probability 1/N) or no red balls (otherwise). It is worth noting that we expect that there are no experiments that run forever (that would be limm->&infinF(k,m)=nonzero for k neither N nor 0), that have a nonzero probability. This is to be contrasted with the chance that the expectation value of the path length being infinity, which I have not ruled out yet. Of course, summing F(k,m) over k always gives 1.

Alright, now that we have that, we need to find the recursion relation on F. If we are entering the (m+1)th step, the chance that we get to having k red balls is given by:
F(k,m+1)=[(k-1)/N]*[(N-(k-1))/(N-1)]F(k-1,m)
+[(N-(k+1))/N]*[(k+1)/(N-1)]F(k+1,m)
+[k/N*(k-1)/(N-1)+(N-k)/N*(N-k-1)/(N-1)]F(k,m)

that it, the chance we drew a red then non-red from k-1, plus the chance we drew a non-red then red from k+1 and the chance we drew two red or two non-red from k. This can be written as:
F(k,m+1)=1/(N(N-1))*
[(k-1)(N-k+1)F(k-1,m)+
(N-k-1)(k+1)F(k+1,m)-
2k(N-k)F(k,m)]
+F(k,m)

Why I separated out that F(k,m) term will be apparent later.

Let us now make a series of vectors, Vm with N-1 elements so that the kth element of the vector Vm is given by F(k,m). Then the recursion relation gives us
Vm+1=MVm

and the matrix M is specified by the recursion relation. Note that V only keeps track of having 1 through N-1 red balls left, so information "leaks" out the ends of it and the sum of the elements of V is not conserved. It is also worth noting that Vm tends to the zero vector as m tends to infinity, no matter how the initial data for V looked (that point will turn out to be extremely important later, make sure you convince yourself of it now).

Ok, let us try to get a feel for the matrix M that iterates on V. The diagonal of M is the term that keeps F(k,m) at F(k,m+1), that is to say that the kth element on the diagonal is 1-2k(N-k)/(N(N-1)). There are also off-diagonal terms adjacent to the diagonal, but everything else is zero. For notational convenience, I will denote N(N-1)=Z. For some large N, the first part of the matrix looks like:
[1-2(N-1)/Z, 2(N-2)/Z, 0, 0, 0, ....]
[(N-1)/Z, 1-2*(2(N-2))/Z, 3(N-3)/Z, 0, 0, ....]
[0, 2(N-2)/Z, 1-2*(3(N-3))/Z, 4(N-4)/Z, 0, ....]
....

The kth diagonal element is 1-2*k(N-k)/Z, above and below the kth diagonal element is k(N-k)/Z. Every column adds to 1, except for the first and last columns, which represent probability leaking out the ends.

Now that we have this, what the heck do we do with it. Well, we want to find the probability that we end on the mth step, times m, summed over all m. That gives us the expectation value of what step we end on. The probability that we end on the mth step is the last element of Vm-1 (the penultimate state) times (N-1)/N (the chance we draw a red ball) times 1/(N-1) (the chance we draw the final non-red ball). So we want
&Sigma m/N [0,0,0,....0,1]Vm-1

summed over m. Actually this is not quite right, since only 1/N of the experiments even end the way we want, we must also multiply by N. Anyway, we know that Vm-1=Mm-1V0, and V0 is given as the column vector [1,0,0,0,...] (which I will write as [1,0,0,0,...]T, for lack of ability to write column vectors).

To put it together, we seek to find
&Sigma m [0,0,0,....0,1]Mm-1[1,0,0,0,...]T

summed over m=0 to &infin of course.

Much of that is just constants relative to the sum, and we really just need to evaluate
&Sigma m Mm-1

Now, if instead of a matrix M, we had a real number, x, we could easily use
d &Sigma xm/dx = &Sigma m xm-1

so we would like to say:
&Sigma m Mm-1=d &Sigma Mm/dM

Ok, so can we take the derivative with respect to a matrix? Well, its not at all uncommon to do that in physics, we often consider scalar function S(M) as define the derivative (dS/dM)ij as the derivative of S with respect to the ijth component of M. This means dS/dM is a matrix. Following this up, we would expect that if we have a matrix function T(M) with two component indices (such as M2 or M-1), then dT/dM would have four indices. If we define the derivative dT/dM to have two indices by using the definition
dT/dMij=&Sigma (d/dMik)Tkj

summing over k (as we do in matrix multiplication), then using
(d/dMij)Mkl=&delta ik&delta jl

it is easy to show that
dMn/dM=nMn-1

for all integers n (to get the inverses, you must use that d/dM (M-1M)=0).

Alright, se we can now say with confidence that
&Sigma m Mm-1=d &Sigma Mm/dM

So now we just need a way to evaluate &Sigma Mm. Again, if this were a real number x, we would simply say that
&Sigma xm=1/(1-x)

(as long as |x|<1) so how can we do something similar for a matrix?

First of all, let us consider the pth partial sum, &Sigma Mm summed m from 0 to p. It is easy to see that
(1-M)&Sigma Mm=1-Mp+1

so as long as we have (1-M) being invertible and limp->&infinMp+1=0, then we can say that &Sigma Mm=(1-M)-1

Remember when I said that Vm tends to zero no matter what the initial data is? this fact can prove both of the statements that we need. A matrix is really just defined in terms of how it acts on vectors (this is an important point, understand it), thus if a matrix obeys the relationship limp->&infinMpV=0 for all vectors V, then it means that limp->&infinMp=0, since it really apparently is the zero matrix. We knew from before that this is true for our matrix M. I'll admit that I have not really proven it, but it is apparent from the physics of the problem (people who know me will understand that when I say something is "apparent from the physics of the problem", I really mean "I don't know how to prove this, but it is so obvious from the way things are set up, I feel no need to").

So we can now say with confidence that
(1-M)&Sigma Mm=1

when we take the sum to infinity. Now, can we invert the matrix (1-M) (which I will call W)? Well, first let us assume that W has no inverse. This means that W must have a null space, that is to say there is a nonzero vector U such that WU=0. This naturally implies that (1-M)U=0 so MU=U, U is an eigenvector with eigenvalue 1. But this means that MpU=U for all p, and so that is even true in the limit of large p, but for large p we have Mp must vanish. This is a contradiction proving that W (=(1-M)) is invertible.

So, earlier, we had seen that the answer to the question we seek was given by
&Sigma m [0,0,0,....0,1]Mm-1[1,0,0,0,...]T

which we can now rewrite as
[0,0,0,....0,1]W-2[1,0,0,0,...]T

That is, we seek the inverse of the matrix (1-M) and we want the element in the last row and first column in the square of said matrix.

W looks like
[2*(N-1)/Z, 2(2-N)/Z, 0, 0, 0, 0,.....]
[(1-N)/Z, 2*2(N-2)/Z, 3(N-3)/Z, 0, 0, 0,.....]
[0, 2(N-2)/Z, 2*3(N-3)/Z, 4(N-4)/Z, 0, 0,.....]
.....

Z being N(N-1) as before. We can actually just factor out 1/Z now, and get
[2*(N-1), 2(2-N), 0, 0, 0, 0,.....]
[(1-N), 2*2(N-2), 3(3-N), 0, 0, 0,.....]
[0, 2(2-N), 2*3(N-3), 4(4-N), 0, 0,.....]
[0, 0, 3(3-N), 2*4(N-4), 5(5-N), 0,.....]
.....

as the expression for WZ.

Now, if we want the last element in W-2, we want the last row and the first column of W-1 and we need no other information about W-1 except that it exists (and it does). Find a row vector that gives 1 when it acts on the last column of W and gives 0 for all other columns (you might want to write W on paper now, to get a feel for it). We find that the last row in the matrix W-1 is [1,2,3,4,5...,N-1], you might want to try out W in a few simple cases of N=3,4,5 to convince yourself of this. We also need the first column in the matrix W-1, that is, we need a column vector that gives 1 when it acts on the first row of W, and 0 when it acts on any other row. The vector is [Z/N, Z/2N, Z/3N, Z/4N, ....., Z/(N(N-1))]T. Naturally, the last element of the first column and the first element of the last row agree.

Finally, we multiply this last row and first column to get the element of the last row and first column of W-2 Z/N+Z/N+Z/N+..., and we have a total of N-1 terms, but Z is N(N-1).

So we have (N-1)^2, our final answer. I have a bit more to write about this puzzle, but this seems like quite enough as it is.

Balls In A Bag

I have been pretty consistent in posting just once a week, and thats mostly just been so I don't run out of blogging material, but Matt recently solved a problem that we have been working on for nearly 3 years so I really wanted to post the solution as soon as I was able to work out the wrinkles. This is the problem as I first learned it on the forums at xkcd:
Consider you have a bag filled with N balls, each with a different colour. At each step you will randomly select a ball from the bag, look at its colour, and select another ball from the bag, and repaint the second ball to the colour of the first one. After doing that, place both balls back into the bag and begin a new step.
After doing this enough times, it will eventually happen that all the balls in the bag will have the same colour. On average, how many steps does it take for this to happen?

Doing the problem "properly" is very difficult, and I would only suggest it if you are crazy. However, solving it in the cases of N=2,3,4 is not particularly hard, and from that you will be able to guess what the answer should be.

Equilateral Roads

Time for the solution to the Houses on a Square problem, in spite of the fact that all (both, it seems) of my readers have already solved it.

Not much is gained by building up to the solution, so I'll just sort of blurt it out. I should use pictures here, but rather than figure out how pictures work on blogger, I'd prefer to just use words:
Place a node at the center top and center bottom of the square, connect the top node to each of the top two houses with roads, and connect the bottom node to the bottom two houses with roads and connect the top node and the bottom node to eachother. I assume that you have a picture in your head that looks like a capital i. Now we are going to move the top and bottom nodes, and the edges that terminate on them will move with them. Move the top node down a distance x, and the bottom node up a distance x. Now it is simply a matter of finding the x which optimizes the solution.

The total length as a function of x is
L=1-2x+4 &radic (1/4+x2)

1-2x is the length of the center piece and &radic (1/4+x2) is the length of one of the diagonals. Maximizing L(x), we find the derivative is
L'(x)= -2+4*x/&radic (1/4+x2)=0
x=1/(2 &radic 3)

Giving a value of L=1+&radic 3. At this point we haven't exactly shown that this is the optimal solution, however, we have shown that it is the optimal solution of this form. Before I go into more about that, first let us consider the rectangular case.

Consider a rectangle, with side lengths A and B. We will assume that A>B for defeniteness, and we will visualize that the longer side is the vertical one (I do this so that when I refer to "the top side" or something, we all have the same picture in mind). Trying the same form of solution as last time, we can see that we want our middle line to be vertical rather than horizontal, it will save more space that way. Letting x once represent the amount that the top node is down from the top, the total road length is given by
L=A-2x+4 &radic (B2/4+x2)

Extremizing this gives that x=B/2 &radic 3, and L=A+B &radic 3 (we can see that the length is minimized when A>B rather than B>A, as anticipated). Note that the ratio of x to B is the same in the rectangle case as it is in the square one. This means that the angle that the diagonal line hits the vertical line at is independent of the measurements of the rectangle. In fact, the vertex where the lines meet has angles entirely of 120 degrees. This is not surprising, because if we just had 3 houses we were trying to connect, we would place a point somewhere in the middle of them and have the lines connecting them to the point meet at 120 degrees. There is an exception if the triangle has an internal angle of greater than 120, then it is optimal to just draw straight lines (as the relevant point is outside of the triangle).

Actually, this is a feature of any solution, all angles involved in connections are at least 120 degrees. If you draw a solution with an angle of less than that, you are better to add a point and draw 3 lines connecting to it. I found a paper that talks about this problem at http://www.denison.edu/academics/departments/mathcs/herring.pdf. The full proof is nontrivial, but one conclusion is that to connect N nodes together, you will never need more than N-2 extra nodes, so for the square there are no solutions that involve needing 3 extra nodes, 2 is optimal.

You can also see that for any regular polygon with 6 or more sides, the internal angles are 120 degrees or larger, so it is optimal to just connect along the edges. A pentagon is a bit funny though, since the internal angles are 108 degrees, you do need to find a solution with extra verticies on the inside. The actual solution looks sort of funny, and isn't hard to find after some thinking, anyway, thats all I got on this problem.

Houses On A Square

Guess we once again find ourselves in the "new puzzle" part of the cycle. This particular one I heard a long time ago, can't recall when exactly, but I was recently reminded of it by James:
Consider 4 houses located at the corners of a square with side length 1. We must build roads so that one can drive from any house to any other house along the roads. What is the minimum total length of road that needs to be built?

Just to get you started through the first few layers of solution, you can easily solve the problem with road length 3, connect 3 of the houses along sides of the square, and you have connected them all. You can do better with a 2 &radic 2 solution, by drawing in the diagonals of the square, this will connect all the houses to eachother by roads. As amazing as it might seem, you can do even better than that.

As a generalization of this problem, try solving it on a rectangle. You can further generalize it to other shapes, and even arbitrary distributions of points, but I wouldn't suggest it.

Triangular Hypermarbles

Due to some discussion in the solution to the breaking marbles puzzle, I have decided to make a post of the extension problem. Now let us suppose we have K marbles and N floors, what is the minimum number of moves it takes to find the critical floor, assuming you are minimizing the worst case scenario.

For some definiteness, we will assume that if you eliminate all the floors but the end one, you must still check the end one. This is so that, say, in the 2 marble 3 floor case, you can do it in 2 steps optimally by checking floor 2, then checking floor 3 if it does not break, and checking floor 1 if it does (specifically you cannot just assume that if it didn't break on floor 2 it will then break on 3, and that if it broke on floor 2 it can't break on floor 1). The reason for this "handling of the endpoints" is so that in the two marble case, you cannot do better than Triangle(M) floors with M steps. Triangle(M), the Mth triangle number, will be denoted TM, and is given by M(M+1)/2.

Now, lets try the 3 marble case. Let us suppose we have N floors, and we have a solution that works in never more than 5 steps (I'll generalize 5 to M later). Then with our first move, we might as well go to floor 11. If we go any higher, then if it breaks we cannot solve the remaining case with over 10 floors with only 2 marbles and 4 steps (2 marbles, 4 steps, 10 floors is optimal), and nothing is to be gained by going any lower. Next, we move 7 more floors, since if it breaks we can solve through the 6 we skipped with 2 marbles and 3 steps. Next, we will move up another 4 floors, so that if it breaks, there are 3 floors unchecked which we do can with the remaining 2 steps and 2 marbles. Next up 2 more floors, because we will only have 1 step left (now steps is more limiting than marbles, one gets the impression that when you have at least as many steps as you have marbles, binary search will be best). And finally we get 1 last step.

The result is 11+7+4+2+1=25 (looks like powers of 2 at the end, neat), this is better written as 10+1+6+1+3+1+1+1+0+1, that is T4+1+T3+1+T2+1+T1+1+T0+1. Which is the same as P4+4+1 (PK is the Kth pyramidal number, the sum of the first K triangle numbers). The general statement, which is easy enough to verify by induction, is that with 3 marbles and M steps, you can do at most PM-1+M floors. We will denote this as F(3,M), the maximum you can do with 3 marbles in M steps.

We already know that
F(1,M)=M
F(2,M)=TM=M(M+1)/2

And now we know
F(3,M)=PM-1+M
=(M-1)M(M+1)/6+M
=(M2+5)*M/6

The formula PM=M*(M+1)*(M+2)/6 can be found either by induction or by realizing the connection between Pascal's triangle and the generalized triangle numbers (more on that in a moment).

Next we want to try 4 marbles. With M steps, we cannot go higher than F(3,M-1)+1 for our first step, and there is no sense in going lower. We then move up another F(3,M-2)+1 steps and so on until we are out of moves. The result is that
F(4,M)=&Sigma F(3,M-i)+M

The sum being i from 1 to M-1. This argument will hold quite generally, to get the result of
F(N,M)=&Sigma F(N-1,i)+M

i now being summed from 1 to M-1, or if we define F(N,0)=0 (the most number of floors you can do with N marbles and zero steps) we can start i from 0.

Before going further into F, I would like to consider another function. Let us define the "hypertriangular numbers" H(N,M) so that each one is the sum of the first M previous ones. Specifically,
H(0,M)=1
H(N,M)=&Sigma H(N-1,i)

i being summed from 0 to M-1. These generalize the triangle numbers cleanly, H(1,M)=M+1, H(2,M)=TM+1, H(3,M)=PM+1. One can easily confirm that
H(N,M)=H(N-1,M)+H(N,M-1)
H(N,0)=1
H(0,M)=1

So these numbers basically form up Pascal's triangle, so that
H(N,M)=(N+M)!/(N!M!)

Note that in the definition, N and M were not symmetric, but in the end they are.

I'll come back to that function later, but the connection between hypertriangle numbers and Pascal's triangle will be important in a bit. Recall F's recursion relation is:
F(0,M)=0
F(N,0)=0
F(N,M)=&Sigma F(N-1,i)+M

(i summed from 0 to M-1). One can start filling this in, its slightly illustrative to do this. Labeling x-axis as M value and y-axis as N value, we get (apologies for formatting issues):
0 0 0 0 0 0 0 0 0 0
0 1 2 3 4 5 6 7 8 9
0 1 3 6 10 15
0 1 3 7 14 25
0 1 3 7 15 30
0 1 3 7 15
0 1 3 7
0 1 3 7

and so on. Each number is the sum of the numbers in the row above it from 0 to M-1 and then adds the number M. This results in the first non-zero row coming up just as M, the second row comes up as triangle numbers, but then the third row comes up funny. Note that when N gets bigger than M, the numbers just turn into 2M-1, somewhat demonstrating the fact that having more marbles than steps means the marbles breaking isn't an issue, and then you get the result of a binary search.

One can construct Pascal's triangle in a similar way, starting with 1's and letting each element be the sum of the row above it, from that one can prove the relation H(N,M)=H(N-1,M)+H(N,M-1) (I'm going to start calling this the Pascal relation). In this case, one can find by inspection of the graph that F will obey a similar linear relation:
F(N,M)=F(N-1,M)+F(N,M-1)-F(N-2,M-1)

It isn't exactly clear to me how to prove this formally, but I am confident in it. You should confirm for yourself that it is true, it is somewhat pivotal. Anyway, we now have a semi-Pascal relation that F obeys, and now our relation is linear in F. At this point, I asked myself, "what would Matt do?" and got the only obvious answer, define the derivative function:
D(N,M)=F(N,M)-F(N-1,M)

I suppose its the derivative with respect to N, but its one nonetheless.

Using our semi-Pascal relation on F, we can find a similar one on D, rewrite the F one as:
F(N,M)-F(N-1,M)=F(N,M-1)-F(N-2,M-1)
=F(N,M-1)-F(N-1,M-1)+F(N-1,M-1)-F(N-2,M-1)
=> D(N,M)=D(N,M-1)+D(N-1,M-1)

This is almost a Pascal relation on D, but things are shifted over slightly, to fix this we will define a new function K according to
K(N,L)=D(N,N+L)
K(N,M-N)=D(N,M)

I'm really just defining L=M-N and considering a function of N,L rather than N,M (this is logical, since D sort of "breaks down" if N>M, thats were F stopped varying). Anyway, from D's relation, we can find that K obeys the relation:
K(N,M-N)=K(N,M-N-1)+K(N-1,M-N)
K(N,L)=K(N,L-1)+K(N-1,L)

Hurrah, K obeys the Pascal relation, now we need to find its boundary conditions. Trying to check K(0,L) immediately causes problems involving D(-1,L), so lets try K(1,L)
K(1,L)=D(1,L+1)=F(1,L+1)-F(0,L+1)
=L+1

Good, next lets check K(N,0)
K(N,0)=D(N,N)=F(N,N)-F(N-1,N)

Uh oh, this could put a stop to all of our fun right here. Fortunately, we can handle this, it turns out (one can inspect the chart) that F(N,N)-F(N-1,N)=1. We can prove this using the earlier recursion relation for F to show that
F(N,N)=F(N-1,N)+F(N,N-1)+F(N-2,N-1)
F(N,N)-F(N-1,N)=F(N,N-1)+F(N-2,N-1)

Next, we must handle F(N,N-1) by considering the "physics" of the problem. Consider F(N,M) when N>M, so you have more marbles than you have steps remaining. In this case, the extra marbles are irrelevant, so you can just ignore their existence, F(N,M)=F(M,M) when N>M. This is also present in the table of F. Thus we have F(N,N-1)=F(N-1,N-1), therefore
F(N,N)-F(N-1,N)=F(N-1,N-1)+F(N-2,N-1)

So apparently F(N,N)-F(N-1,N) is constant in N (shifting N by 1 has no effect, so shifting it by 100 won't either). Evaluating when N=1 gives us

F(N,N)-F(N-1,N)=F(1,1)-F(0,1)=1-0=1

Proving that K(N,0)=1.

Ok, current data on K:
K(N,L)=K(N,L-1)+K(N-1,L)
K(1,L)=L+1
K(N,0)=1

We can extend K to K(0,L) using its Pascal relation on N=1:
K(1,L)=K(1,L-1)+K(0,L)
L+1=L+K(0,L)

Demonstrating that it is consistent to take K(0,L)=1. so, K has the same initial conditions as H (the hypertriangle function) and the same Pascal relation, so K=H must be the case.
K(N,M)=(N+M)!/(N!M!)

Now we can find D as D(N,M)=K(N,M-N)
D(N,M)=M!(N!(M-N)!)

So it seems D is just M choose N, naturally if N>M then D(N,M)=0.

Finally we are ready to construct F. D was the "derivative" of F, so F is the "integral" of D. Specifically,
F(N,M)-F(0,M)=&Sigma D(i,M)

The sum running i from 1 to N. We know F(0,M)=0, and D(N,M) is zero if N>M, and so we have
F(N,M)=&Sigma M!/((M-i)! i!)

Summing i from 1 to min(N,M). Note that the summation starts at 1, where a typical summation over choose functions would start at 0.

So when N is less than M, F(N,M) is the sum of the first N elements in the Mth row of Pascal's triangle, minus 1. In words, one can consider a set of M elements and ask how many nonempty subsets with at most N elements are there. I'm not sure if there is a clean form for that.

Finally, if N>M, F(N,M)=F(M,M), the number of nonempty subsets of size at most M in the set of M elements. A set with M elements has 2M subsets, and one of those is the empty set. So, F(M,M)=2M-1, as one would expect from a binary search solution.

This completes the solution for how many floors you can cover with N marbles in M steps. If somebody asked your the reverse (and more standard) problem of solving K floors with N marbles, you simply look for the minimal value of M such that F(N,M)>K, and that gives you the worst case M.

Invisible Ants

Time for the solution to the ants on a line problem. The solution is actually stupidly simple, it really just is about one realization: Two pointlike ants bouncing off eachother is identical to two pointlike ants passing through eachother. At this point, you can easily see that there theoretical maximum is that you have an ant right on the left end initially walking to the right. No matter how the other ants are distributed, it will take exactly 100s for the last ant to fall off.

Pretty dumb puzzle, to be honest, but I thought it was a bit of a cute trick.