Weighing Medals

Almost a new year already. Apparently I didn't do quite as much blogging this year as usual: I typically manage about 40 posts a year, and this year I only have about 30. I choose to blame it on my thesis rather than on my general laziness.

Anyway, I found a new puzzle over at Tanya Khovanova's Math Blog that I thought was sort of neat:
There is a collection of medals and one of them is known to be fake. There is 1 gold medal, 3 silver medals, and 5 bronze medals. A fake medal weighs slightly less than the corresponding real medal. All real medals of the same type weigh the same amount, but medals of differing types might not weigh the same amount. Using a balance scale and two weighings, find a strategy that is guaranteed to identify the fake medal.

Standard balance scale rules, of course. The balance scale can be envisioned as having two places to put stuff, and then you push a button and it will tell you either "left side heavy", "right side heavy", or "balanced". The button will only work two times and then the scale will break.

It's actually a pretty simple puzzle if you have done the other balance scale problems; it only took me a few minutes to get it, but somehow I was very satisfied with the answer.

Finding Coins

Alright, time to finish off the envelope puzzle from last time. This post is going to wind up getting into Bayes' theorem, which has never come up before in my blogging, to my surprise. I suppose I might as well also give a little statement and proof of Bayes' theorem while I'm at it, since it's sort of cool.

First though, I want to explicitly demonstrate the solution to the follow-up problem of last time; then I will give a more general derivation using Bayes' theorem. The first thing to remember here, and the general theme of this post actually, is that Bob gained information when he selected a random envelope and found a coin. The explicit two envelope solution of last time served as a good example of this, but it is important enough that I want it stated explicitly. When Bob found a coin in a random envelope, he had to reevaluate his stance on how Alice put coins in envelopes. At first he thought it was a 50-50 chance between zero and two coins, but after finding one, he knew there were two and not zero.

So, I'm going to just give the answer to the follow-up puzzle and then show that it is, in fact, an answer (I will later show that it is 'the' answer). The answer is N=4. Before opening an envelope, the cases k=0,1,2,3,4 are all equally likely, and so the expectation of a random envelope is 1/5(0/4)+1/5(1/4)+1/5(2/4)+1/5(3/4)+1/5(4/4)=1/2. A random envelope is worth $1/2, not very surprising. Next, when Bob finds a coin, he knows that k=0 is impossible, and if k had been 1 there would have been only a 1/4 chance of finding a coin, while if k was 4 he was certain, thus it is 4 times more likely that k=4 than k=1. Thus, Bob thinks the values k=0,1,2,3,4 have the relative likelihoods 0:1:2:3:4, meaning k=0 has probability 0, k=1 probability 1/10, k=2 probability 2/10, k=3 probability 3/10, and k=4 probability 4/10. For a given k, now that a coin has been removed, the expectation value of an envelope is (k-1)/4, so a random envelope now has expectation value 0/4(1/10)+1/4(2/10)+2/4(3/10)+3/4(4/10)=1/2. So we can see N=4 is a solution.
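If you want to check this arithmetic mechanically, here is a short Python sketch using exact fractions (the variable names are just my own bookkeeping, not anything from the puzzle):

```python
from fractions import Fraction

N = 4
# Prior: k = 0..4 equally likely; likelihood of finding a coin given k is k/N.
prior = [Fraction(1, N + 1)] * (N + 1)
likelihood = [Fraction(k, N) for k in range(N + 1)]

# Posterior after Bob finds a coin: proportional to prior times likelihood.
unnorm = [p * l for p, l in zip(prior, likelihood)]
posterior = [u / sum(unnorm) for u in unnorm]

# With the found coin removed, an envelope is worth (k-1)/N given k.
expectation = sum(post * Fraction(k - 1, N) for k, post in enumerate(posterior))
print([str(p) for p in posterior])  # ['0', '1/10', '1/5', '3/10', '2/5']
print(expectation)                  # 1/2
```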

To prove N=4 is the only solution, and to sort of derive it in the first place, we need Bayes' theorem. Suppose we have some experiment that we are doing, and there are events that can occur; call the events A,B,C... and so on. Now, we can talk about the probability of an event A, which we call P(A), and we can talk about the probability of an event A given that some other event B has occurred, which we call P(A|B). We can of course talk about the probability that both A and B occur, P(A and B), and such things. It can be seen that the probability that both A and B occur is equal to the probability that A occurs times the probability that B occurs given that A has already occurred, so we can say:
P(A and B)=P(A)P(B|A)

Of course I could switch A and B in that statement to get:
P(A and B)=P(B)P(A|B)

Giving us that
P(A|B)P(B)=P(B|A)P(A)
P(A|B)=P(B|A)P(A)/P(B)

This last line is the statement of Bayes' Theorem.

So, this theorem can be used to tell us how likely it was that 'Alice put k coins in the envelopes' given that 'Bob found one at random', as long as we can calculate the chance that 'Bob found one at random' given that 'Alice put k coins in the envelopes'. That second one sounds pretty easy to calculate. This is the use of Bayes' theorem: it allows us to switch the order of conditionals, which can often make things easier to calculate.

In most practical cases, we have a list of mutually exclusive events Ai and another list of mutually exclusive events Bi such that exactly one Ai and one Bi must occur. Of course, for particular i and j, we can read off Bayes' theorem as:
P(Ai|Bj)=P(Bj|Ai)P(Ai)/P(Bj)

and to find P(Bj) we can find P(Bj|Ak) for each k, multiply by P(Ak), and sum over k. Essentially we can see that in this case the denominator is serving as a normalizing constant (constant for each choice of j, that is) and that summing both sides of that last equation with respect to i will give 1 (since exactly one of the Ai must happen).

Now lets try to use Bayes' theorem to solve the follow-up problem. Alice has N envelopes, placed coins in k of them, Bob found a coin. So for our purposes let the event Ai be 'Alice placed coins in exactly i envelopes', and the events B0 and B1 are 'Bob didn't find a coin' and 'Bob found a coin', respectively. So we have
P(Ai|B1)=P(B1|Ai)P(Ai)/P(B1)

P(B1|Ai) is easy, it's i/N, and P(Ai) is just 1/(N+1). P(B1), the chance Bob found a coin, is the numerator summed over i: Σ i/(N(N+1)), which is just 1/2. So we have
P(Ai|B1)=2i/(N(N+1))

Now, what is the expectation value of an envelope given that Bob found and then removed a coin? It is
Σ (i-1)/N 2i/(N(N+1))

Which is the chance of finding one of the remaining i-1 coins times the chance there were i coins (given that Bob found one), summed over i. We can rewrite this as
2/(N^2(N+1)) Σ (i^2 - i)

That sum evaluates to N^3/3 - N/3, so we now have
2(N-1)/3N

as the expectation value of a random envelope given that Bob has found a coin. If we set this to 1/2 we get N=4. Somewhat interesting is that in the limit of large N this tends to 2/3, which is the statement that Bob pushes his expectation of a random envelope upward quite a bit if he finds a coin; when N is large, the act of removing one coin is pretty harmless, since he probably won't pick that envelope again anyway.
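As a sanity check on the closed form 2(N-1)/3N, here is a quick Python verification (the function name is my own; it just computes the posterior directly, as above):

```python
from fractions import Fraction

def expectation_after_find(N):
    """Expected value of a random envelope after Bob finds and removes a coin,
    when Alice chose k uniformly from 0..N: posterior P(k) is proportional to k,
    and an envelope is then worth (k-1)/N."""
    unnorm = [Fraction(k, N * (N + 1)) for k in range(N + 1)]
    total = sum(unnorm)  # this is P(B1) = 1/2
    return sum(u / total * Fraction(k - 1, N) for k, u in enumerate(unnorm))

for N in (2, 3, 4, 5, 100):
    assert expectation_after_find(N) == Fraction(2 * (N - 1), 3 * N)
print(expectation_after_find(4))  # 1/2
```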

Alright, that was fun, and in my mind that was the end of the puzzle. However, in the comments of the initial post somebody showed me that you can classify all the solutions to the puzzle, so let's do that now (because it's actually really cool).

First, note that the list of probabilities P(Ai) completely specifies Alice's strategy. Once she has placed coins in the envelopes and shuffled them, the only information that matters is how many envelopes have coins in them. She can use whatever method she wants to place coins, but she might as well just whisper the list P(Ai) to Bob; that is the only thing that matters. Alright, so we have already seen that
P(Ai|B1)=P(B1|Ai)P(Ai)/P(B1)

P(B1|Ai) is still i/N, but now P(Ai) is encapsulating Alice's strategy. P(B1) is still the sum of the numerator, that is
P(B1)=Σ i/N P(Ai)

Actually, that's the average number of envelopes that Alice put coins in, divided by N (sort of obviously); we will call this K/N.

So we have
P(Ai|B1)=i P(Ai)/K

Alright, after having found a coin, the expectation value of a random envelope is now (i-1)/N times the chance there are i coins, summed over i, so that is
Σ (i-1) i P(Ai)/(K N)

If we want this to be equal to the expectation of an envelope before Bob selected one the first time, we must set this equal to K/N, meaning we now have
K^2 = Σ (i-1) i P(Ai)
= Σ i^2 P(Ai) - Σ i P(Ai)

That second term on the right is just K, and the first term on the right is the expectation of the square of the number of coins in envelopes. Some rearranging gives us
K = Σ i^2 P(Ai) - K^2

The right side is the expectation of the square of the number of coins minus the square of the expectation of the number of coins; this is the variance, the standard deviation of Alice's strategy squared. So a strategy for Alice is a solution to this puzzle if and only if the expected number of coins Alice places in envelopes is equal to the variance of the number of coins Alice places in envelopes. Neat.

You can confirm explicitly that the 50-50 {0,2} strategy and the uniform {0,1,2,3,4} strategy both have this feature. Anyway, that's basically it for this.
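The mean-equals-variance condition is easy to verify for both strategies; a sketch in Python (the dictionaries map Alice's coin counts to probabilities, and the names are mine):

```python
from fractions import Fraction

def mean_and_variance(dist):
    """dist maps k (number of coins Alice places) to its probability."""
    mean = sum(p * k for k, p in dist.items())
    variance = sum(p * k * k for k, p in dist.items()) - mean ** 2
    return mean, variance

fifty_fifty = {0: Fraction(1, 2), 2: Fraction(1, 2)}  # the two-envelope strategy
uniform = {k: Fraction(1, 5) for k in range(5)}       # the N=4 follow-up strategy

for dist in (fifty_fifty, uniform):
    mean, variance = mean_and_variance(dist)
    assert mean == variance  # the solution condition
```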

Half And Half

So, I wasn't sure if I should put up a complete solution to the envelope puzzle from last time or just post a partial solution and have follow-up questions, but I have since decided on the latter.

Anyway, a solution to the puzzle (which was also given in the comments, but was my "intended easy solution") is: Alice has two envelopes and flips a coin, filling both envelopes with coins on heads and neither of them on tails. With this, the expected value of a random envelope is $1/2. After Bob finds and removes a coin he knows for a fact the other envelope has a coin, so the expected value of an envelope is again $1/2.

Next a follow-up problem:
Suppose instead of Alice whispering her strategy, you overheard her say "I selected an integer k from 0 to N uniformly at random and placed coins in k of the envelopes." What is the value of N so that the scenario is possible?
It's funny how our brains can work: I was again convinced that this was just impossible, even after knowing the earlier answer, but there is a way to solve it. It even has a unique answer.

Random Envelopes

So, apparently I misread that snake puzzle from last time; the intention was to get to a position where there is only one type of snake, rather than one where there are equal numbers of all snakes. The solution is basically the same, but the "brute force" method of applying A^n B^m C^k is slightly less effective, since you must prove that you cannot reach any of 45-0-0, 0-45-0, 0-0-45, so hitting on the mod 3 solution is more necessary.

Anyway, I found a new puzzle on the xkcd forums, I thought it was sort of cool:
Alice and Bob are having a conversation that you are overhearing. Alice has a collection of N envelopes and says to Bob, "for each of these envelopes, it contains either a dollar coin or nothing, I chose to fill the envelopes at random, what is the expected value of the contents of a random envelope?"

Bob responds, "It depends on the method you used to decide to fill the envelopes. You could have flipped a coin for each one, filling it if you got heads and not if you got tails, in that case the expected value of an envelope is $1/2."

Alice decides to tell Bob the method she used to fill the envelopes, she whispers it to him. "I see," Bob replies, "I will purchase a random envelope from you for the expected monetary value, then." Alice agrees, Bob hands her some money and selects an envelope. Bob finds that the envelope does contain a coin and pockets it.

Alice takes the now-empty envelope from Bob and shuffles it back in with the other envelopes. Alice says to Bob, "what is the expected value of a random envelope now?" To which Bob replies, "it is the same as before my purchase."

What is a possible value for N and random method that Alice could have used to make this scenario possible?

When I first read this puzzle, I was certain it was impossible. There is no way that Bob could have taken a coin away and not changed the expected value of a random envelope. However I was missing something. Obviously not something stupid like "Bob can see which envelope he opened earlier because it is now torn" or something like that, that would hardly make an interesting puzzle. There is a well-defined method that Alice can use to make it so that after Bob finds and removes a coin the expected value of a random envelope has not changed.

There are actually many possible solutions, so when I do post the solution next time it will by no means be a complete listing of all possibilities (there are far too many), but try to see if you can come up with at least one possible solution.

Invariant Snakes

Solution time. Puzzle last time was that weird snake puzzle.

First thing, if you played around with it for a bit, you almost certainly noticed that there was no way to get to 15-15-15. This is probably a good thing, given that the puzzle would have been stupid if you could, but sort of awesome if you couldn't. Playing around with it, you probably noticed that there was some sort of parity thing going on, some hidden quantity is conserved, stopping us from getting to 15-15-15.

For a first solution, I'm going to work through what most people probably did (and was posted in the comments). Consider three operators, A, B, and C, which act on three-vectors (x,y,z) such that
A(x,y,z)=(x+2,y-1,z-1)
B(x,y,z)=(x-1,y+2,z-1)
C(x,y,z)=(x-1,y-1,z+2)

These A,B,C aren't quite matrices, because they aren't linear, but whatever, they are operators. They almost form the generators of an abelian group with the relation ABC=I, the identity operator. I say almost because there is a bit of an issue; specifically, note that AB(1,0,2)=A(0,2,1)=(2,1,0), but BA(1,0,2)=B(3,-1,1)=(2,1,0). In the first order we applied B then A and it works fine, but in the second order, with A then B, the number of snakes ran negative. If we allow the numbers of snakes to run negative, then A,B,C do form the generators of an abelian group. If we manage to prove the statement "you cannot reach 15-15-15 even allowing snakes to run negative", then this proves the statement "you cannot reach 15-15-15 being restricted to nonnegative numbers", so let's just allow the snakes to run negative.

Now, consider a general operator A^n B^m C^k acting on (13,15,17); you get
A^n B^m C^k (13,15,17) = (13+2n-m-k, 15+2m-n-k, 17+2k-n-m)

setting that equal to (15,15,15) you get
2n-m-k=2
2m-n-k=0
2k-m-n=-2

which can be rearranged to give n-k=4/3, which cannot happen for integers n and k.

This is enough to give us the solution to the original problem, but I wasn't happy with this, I wanted to find the invariant hiding in the bushes. Clearly we have one invariant x+y+z (the total number of snakes), and explicitly checking the operators A,B,C from before we see this is conserved. Let's reconsider what happens when we apply A^n B^m C^k to just (0,0,0) and see what (x,y,z) we can access. We get
x=2n-m-k
y=2m-n-k
z=2k-n-m

from which we can see x-y=3(n-m), which tells us that x-y is a multiple of three. So if y=2, then x must be -1, 2, 5, 8, or so; it must be 2 mod 3. This suggests that in general (x-y) mod 3 is the invariant (and by symmetry (y-z) mod 3 and (z-x) mod 3). Checking the operators A,B,C explicitly, we can see that in fact all of them preserve (x-y) mod 3.

Actually, since x+y+z is conserved, it is also the case that (x+y+z) mod 3 is conserved, and so if (x-y) mod 3 is conserved then so is (x-y+x+y+z) mod 3, which is (z+2x) mod 3, but 2=-1 mod 3 so this is (z-x) mod 3. Also (x-y-x-y-z) mod 3 must be conserved, and this is (-2y-z) mod 3 and -2=1 mod 3 so this is (y-z) mod 3. We can see there are really only 2 real conserved quantities here, x+y+z and (x-y) mod 3, the others (y-z) mod 3 and (z-x) mod 3 will be guaranteed conserved if those other two are.

Anyway, if we start at 13-15-17, we have x+y+z=45 and (x-y) mod 3 = 1, so we will be able to reach 15-15-15 about as easily as we could reach 11-13-12; we simply cannot do it, because reaching either target would violate the two invariants.
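Since the total of 45 snakes bounds the state space, the invariant argument can also be confirmed by brute force; here is a Python sketch (a breadth-first search over configurations, with names of my own choosing):

```python
from collections import deque

def reachable(start):
    """All snake configurations reachable from start = (red, green, blue):
    a meeting takes one snake from each of two colours and adds two of the third."""
    seen = {start}
    queue = deque([start])
    while queue:
        x, y, z = queue.popleft()
        for nxt in ((x + 2, y - 1, z - 1),
                    (x - 1, y + 2, z - 1),
                    (x - 1, y - 1, z + 2)):
            if min(nxt) >= 0 and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

states = reachable((13, 15, 17))
assert (15, 15, 15) not in states
# every reachable state has the same two invariants as the start
assert all(x + y + z == 45 and (x - y) % 3 == 1 for x, y, z in states)
```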

Snakes On An Island

I sort of wanted to work through the solution to the tic-tac-toe variant from last time, but in truth it's probably not very interesting to actually do it. Anyway, when I get desperate for new blogging material I might go back to that. For now, I learned a new puzzle from the people in the math room on KGS:
There is an island with snakes on it. The snakes can be either red, green, or blue. If two snakes of different colours meet, they each become the third colour. The snakes never reproduce or die.

Starting with 13 red snakes, 15 green snakes, and 17 blue snakes, is it possible to get to a configuration with equal numbers of snakes of each colour? Find the steps that would need to happen, or prove that it is impossible.

So, you can reformulate this as balls in three jars, and the legal move is to select a jar and move one ball from each of the other two jars into the selected one. Starting from 13-15-17, try to get to 15-15-15.

Cats And Dogs

Hmm... been some time since I blogged, something of a commentary on the complete lack of interesting puzzles I've come across in the past while. A few months ago I was lying in bed thinking about tic-tac-toe and my mind wandered into a bit of a neat game.

When I was young, it was common for a game of tic-tac-toe (which I will start calling TTT, to avoid typing that out every time) that ended in a tie to be referred to as "cats", and I remember one friend of mine who would occasionally call a tie game "cats" only sometimes and "dogs" at other times. It took me some weeks of playing games with her before I managed to decipher the rule that she used in calling some games "cats" and some "dogs". Basically it came down to whether there was a letter "C" in the final position. Consider a game of TTT with X going first (when I was young, O always went first, but I have discovered in my adult life that I am apparently the only person on the planet who thinks that O goes first, so I will submit and use the majority convention here); if the game ends in a tie there will be 5 X's on the board and 4 O's, and those 5 X's can form one of only three possible shapes (they aren't hard to find, and I don't feel like figuring out a good way to draw them). One of those shapes looks like a "C" when oriented correctly, and the draw would be called "cats" if that shape appeared, and "dogs" otherwise.

Lying in bed, I considered using this "cats" vs "dogs" distinction as a tiebreaker on the game of TTT. Since it is trivial for X to achieve "dogs" by making their first move in the center, it is natural to break ties as "cats" is a win for X and "dogs" is a win for O.

Looking at the possible tie configurations, we can see that "cats" occurs if and only if X controls three of the four side squares, which basically tells us that the game is about controlling the side squares. So we can reword the game as "play TTT as normal, X going first; if the game ends in a tie, then whoever controls more of the side squares wins, and if that is a tie then O wins." Actually this game is somewhat interesting: sides are the worst places in normal TTT, but this puts an extra emphasis on controlling the sides, so there is some sort of balance. You want to take the sides as soon as possible, but make sure not to let the opponent get an old-fashioned three in a row while you are doing that.

Anyway, I'm going to leave you with that. If you are like me, you will take the time to solve out this game; it's only slightly harder than TTT, and it's sort of fun. As adults we never get to solve out TTT because we solved it so young; it's nice to be able to start over again.

James suggested calling this game "Cats and Dogs," which seems like a smart enough thing to do, so I guess that will be that.

Onto The Nimbers

Alright, in principle this is where I should post the solution to the puzzle from last time, but I want to talk about some game theory first, just to give a reason why I found this puzzle sort of neat in spite of the fact that it is quite simple.

Let us consider a two player game, played with piles of coins. On your turn you choose a pile of coins and remove some nonzero number of coins from that pile. The players take turns doing this, and eventually piles run out of coins. If it is your turn and there are no coins left, you lose.

If you know some game theory, you will recognize this game as Nim, and you might as well stop reading this post right now, since I'm not going to say anything you don't already know. Anyway, if you haven't seen this game, I encourage you to try it out; it's a fun and simple game until you have seen the complete solution, and you can probably find some programs online that will play with you (and beat you, because the solution is simple to program).

Before I go through the solution, I want to set up some notation. *x will denote a pile of x coins, and *x+*y will denote a game with two piles of coins with sizes x and y. {a,b,c} will denote a game where the legal move is for a player to move to any of positions a, b, or c, so we can say that *4={*0, *1, *2, *3}, and *2={*0, *1}; of course *0={ }. Let's begin by trying to solve a few simple cases of the game.

The one heap game is trivial, you simply remove all the coins in the heap with your first move and the game is over. The two heap game is fairly simple: suppose that we have a position like *7+*4, then we simply reduce the *7 by three coins to *4. With two heaps of the same number of coins, we simply have to mirror everything our opponent does to win; if he reduces one of the 4-heaps to x, we reduce the other 4-heap to x. Eventually our opponent will remove a pile's final coin, and we sweep the other pile away to win.

The larger games are harder, but we can solve a few cases right away. For example, if it is our turn from the position *8+*6+*6, we should simply remove the heap of size 8, and mirror our opponent in the two heap game. Or if we are in a position *9+*8+*6+*6, we can simply reduce 9 to 8 and play two mirror games. Actually, we can see that if ever we are in some position with *a+*b+*c+*x+*x, we can ignore the last two heaps; our opponent can find no winning move in those two heaps (we just respond in the other one) and we can find no winning move in those two heaps (our opponent will do the same to us). So, from an optimal strategy point of view, adding *x+*x to any game does not change anything. We can also see that adding *0 to any game does not change anything, so we can see that there is some sense in which
*x+*x=*0

Naturally, in the game with only *0, the active player loses, and in the game with only *x+*x, the active player loses.

Now we are in a position to analyze a slightly more complicated position, *2+*1. What are the legal moves from this position? We can reach any of *2+*0, *1+*1, or *0+*1, which are *2, *0, and *1, respectively, so we can see that *2+*1={*2, *0, *1}, which is *3
*2+*1=*3

Which means that in any game with *a+*b+*c+*2+*1, you can treat this as the same as *a+*b+*c+*3. In particular *3+*2+*1 is a losing position for the first player, as it is the same as *3+*3. It also means that a game like *a+*b+*c+*3+*2+*1 is just the same as the game *a+*b+*c+*3+*3, which we have seen is the same as the game *a+*b+*c.
*3+*2+*1=*0

Actually, one might try to derive this last equation from *2+*1=*3 by adding *3 to both sides and using *3+*3=*0. Sounds reasonable enough, and if it weren't valid then our choosing to use "+" to denote these games would be questionable indeed. Actually, this means that we should also be able to add *1 to *2+*1=*3 to arrive at *2=*3+*1, is this equation valid? Well, from the position *3+*1 we can reach any of *3+*0, *2+*1, *1+*1, *0+*1, so we can see that *3+*1={*3, *3, *0, *1}={*3, *0, *1}. But we know that *2={*0, *1}, so are these things really the same?

Suppose we have a winning strategy in the game G+*2 (where G is any collection of heaps, so G represents something like *a+*b+*c), I claim we can use that as a winning strategy in G+*3+*1. If your winning strategy ever tells you to move in G, then do so. If your winning strategy is to move in *2, it must be to *1 or *0, both of which are accessible in *3+*1, so you can follow your strategy. Our only concern is that our opponent makes use of the move from *3+*1 to *3, but we can immediately revert that to *2 (since *2 is a legal move from *3), and then we can actually continue to follow our strategy.

The conclusion of this is that if you have analyzed a game and found it looks like {*0, *1, *2, *5, *9} or so, you can treat this game as *3. It can access any game that *3 can, and if your opponent moves to one of the larger numbers you can immediately revert it to *3. 3 is said to be the Minimal EXcluded number of the set {0,1,2,5,9}, denoted 3=mex{0,1,2,5,9}.
{*a, *b, *c, ...} = *n, where
n=mex{a,b,c,...}

This gives us {*0, *1, *3}=*2 which validates the equation *3+*1=*2. We can confirm that *3+*2=*1 through a similar method.

This actually gives us a complete solution for the N-pile game, as long as the piles are not larger than size 3, any two piles *x and *y can be replaced by one pile, using one of *x+*x=*0, *1+*2=*3, *1+*3=*2, *2+*3=*1. In the end, there is one pile, and you simply must reduce it to zero, if the one pile is already *0, you are in a losing position and must hope for a mistake.

Next, what do we do if we are in a position with a pile of size 4? Well, let's see what *1+*4 gives us. We can move to any of *0+*4, *1+*3, *1+*2, *1+*1, *1+*0, which means *4+*1={*4, *2, *3, *0, *1}, which is clearly *5. We can find *1+*5 as well: we can move to any of *0+*5, *1+*4, *1+*3, *1+*2, *1+*1, *1+*0, which means *1+*5={*5, *5, *2, *3, *0, *1}, which by the mex rule is the same as *4. The mex rule gives us our general answer, actually:
*a+*b=*c, where
*c is the smallest number not of the form *a'+*b or *a+*b'
for any a' less than a and b' less than b

This is enough for us to fill in an entire addition table for nim if we want (these numbers are often called nimbers, just to be whimsical). A partial nimber addition table is
0 1 2 3 4 5 6 7
1 0 3 2 5 4 7 6
2 3 0 1 6 7 4 5
3 2 1 0 7 6 5 4
4 5 6 7 0 1 2 3
5 4 7 6 1 0 3 2
6 7 4 5 2 3 0 1
7 6 5 4 3 2 1 0
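The table above can be generated directly from the mex rule; here is a Python sketch (function names are mine) that builds it recursively and checks it against bitwise XOR, which is what nim addition turns out to be:

```python
from functools import lru_cache

def mex(values):
    """Minimal excluded value: smallest nonnegative integer not in the set."""
    n = 0
    while n in values:
        n += 1
    return n

@lru_cache(maxsize=None)
def nim_add(a, b):
    """Value of *a+*b: the mex of everything reachable by shrinking one pile."""
    return mex({nim_add(x, b) for x in range(a)} | {nim_add(a, y) for y in range(b)})

for a in range(8):
    print(*(nim_add(a, b) for b in range(8)))
# the table turns out to be exactly bitwise XOR
assert all(nim_add(a, b) == a ^ b for a in range(16) for b in range(16))
```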

If you extend this, you'll see it has some nice structure in terms of blocks of powers of two. Actually, you can find a general rule, which I will not prove here:
*2^a + *2^b = *(2^a + 2^b)
unless a=b

So, nimber addition for different powers of two is the same as number addition. Of course if a=b then the nim sum is zero. One can also see from the nim addition table that nim addition is associative and commutative, meaning that the nimbers form an abelian group under nim addition with identity *0 in which every element is its own inverse. We can use this to do some more complicated nim addition, for example *24+*7+*25 = (*16+*8)+(*4+*2+*1)+(*16+*8+*1) = *16+*16+*8+*8+*4+*2+*1+*1 = *0+*0+*6+*0 = *6. One can extend the nimbers to allow for a multiplication and make it a full blown field (sometimes called On_2), but let's not bother with that.

One might find this addition easier to do in binary, in fact, some people describe nim addition as "convert to binary and add without carrying". This means that Patrick's addition rule from last time is actually just nim addition.
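In code, "binary addition without carrying" is exactly the bitwise XOR operator, so Patrick's sums can be reproduced in one line (nim_sum is my own helper name):

```python
def nim_sum(*piles):
    """Binary addition without carrying is exactly bitwise XOR."""
    total = 0
    for p in piles:
        total ^= p
    return total

assert nim_sum(13, 6) == 11  # Patrick's 13 + 6
assert nim_sum(3, 5) == 6
assert nim_sum(4, 10) == 14
assert nim_sum(7, 4) == 3
assert nim_sum(24, 7, 25) == 6
```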

At this point it is fairly natural to solve the puzzle from last time. We can see that *a+*b+*c+...=*x+*y+*z+... if and only if *a+*b+*c+...+*x+*y+*z+...=*0, so we have an immediate check to see if Sean has any split that does not make his brother cry. If he has such a split, any split will work, so he simply gives his brother the smallest candy to keep the most for himself.
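A sketch of Sean's procedure in Python (the naming is mine; the key fact is that a zero overall nim-sum makes every split look fair to Patrick):

```python
def sean_best_pile(candies):
    """If the nim-sum (carry-less sum) of all values is 0, every split looks
    fair to Patrick, so Sean keeps everything except the single smallest candy.
    Otherwise no split avoids tears."""
    total = 0
    for c in candies:
        total ^= c
    if total != 0:
        return None
    return sum(candies) - min(candies)

assert sean_best_pile([3, 4, 7]) == 11    # give Patrick the 3
assert sean_best_pile([2, 3, 4]) is None  # no fair-looking split exists
```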

Alright, so any time you are in a situation in nim, simply check the nim sum, if it is not zero then there is some move that will make it zero. If it is already zero then you have lost and must either resign or hope for a mistake.
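Finding the move that makes the nim-sum zero takes only a couple of lines; a sketch, with my own function name:

```python
def winning_move(piles):
    """Return (pile_index, new_size) making the nim-sum zero, or None if the
    position is already lost (nim-sum zero means every move loses)."""
    total = 0
    for p in piles:
        total ^= p
    if total == 0:
        return None
    for i, p in enumerate(piles):
        if p ^ total < p:  # shrinking pile i to p ^ total is legal and wins
            return i, p ^ total

assert winning_move([7, 4]) == (0, 4)   # reduce the 7-pile to 4 and mirror
assert winning_move([3, 2, 1]) is None  # same as *3+*3, a lost position
```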

Nim is actually very important in game theory because any two player, perfect information, finite, impartial game with the normal ending condition can be reduced to nim. To clarify those technical terms, finite means that there is no infinite sequence of legal moves anywhere in the game, the normal ending condition is that if you have no legal moves you lose. Finally impartial means that in any position of the game, both players have the same set of legal moves available to them (contrast to something like chess where one player plays white and the other black, chess is not impartial).

As an example of another impartial game, consider the game of Cram. This game is played on a chessboard and on a players turn they place a 2x1 domino on the board to cover two squares. If it is your turn and you have no room to place a domino, you lose. One can reduce this game to nim by considering what moves are available from any given position. For example, if there is only a 1x1 square, this is the same as *0 (no legal moves). A 2x1 or 3x1 box has exactly one move, to *0, so those positions are *1. A 4x1 box or an L-shape can move either to *0 or to *1, so it is *2. You can just sort of explore the entire game like this if you feel like it.

Anyway, I think that's all I've got for this. I really find the entire theory fascinating for some bizarre reason, and that puzzle reminded me that I wanted to make a post working out the solution. If you find this sort of game theory interesting, I recommend you look up Winning Ways For Your Mathematical Plays, by Berlekamp, Conway, and Guy.

Splitting Candy

Alright, new puzzle time. This is another one that Bart sent me from CodeJam:
Two brothers, Sean and Patrick, are splitting a bag of candy. The candies come in different sizes, each represented by a positive integer value. Sean, as the older brother, will split the candy into two nonempty piles, and give one to Patrick, keeping the other for himself. Patrick will then examine the two piles and decide if he thinks the split is fair. If he decides the split isn't fair, he will start crying.
Unfortunately, Patrick is very young and bad at adding. He can almost add numbers in binary, but he always forgets to carry the 1. So if he wanted to add 13 (=1101) and 6 (=0110), he will get 1011, which is 11. Some other examples are 3+5=6, 4+10=14, 7+4=3.
Sean is good at adding, and wants to take as much candy as possible while not causing his brother to cry. Given a list of numbers that represent the candy values, is it possible to easily determine if Sean can split the candies in such a way that will not make his brother cry? What is the maximum value of candy that Sean's pile can be?

As an example, if the candies are value 3,4,7, we know that Sean can split them (7, 4) vs (3) and keep 11 total for himself, Patrick will perceive this as fair because he thinks 7+4=3. As another example, if the candies are valued 2,3,4, then there is no splitting that Sean can do that will keep Patrick happy.
The puzzle is sort of neat, and gets into some math that I really like that has a lot of importance in combinatorial game theory. Basically I'm just posting the puzzle so that I can make a game theory post while making it look like a puzzle post.

Flipping Vases

Alright, back to blogging before I stop blogging for another month or something. Time for the solution to the vases in a line puzzle from last time.

First of all, suppose we have a solution for N vases; then by removing vases M+1 through N, we will have a solution for M vases for any M less than N. Thus, we only need to find solutions in a few particular cases of N, as long as those cases get arbitrarily large (for example, we are going to wind up with a solution that works whenever N is a power of 2, and since any number is less than some power of two, you can simply grab that larger solution and remove the extra vases).

For future considerations, it will be slightly easier to subtract 1 from the heights of all the vases, so they are numbered 0 through N-1, rather than 1 through N.

We must arrange things so that no number is between two numbers it is the average of, so x cannot appear between x+a and x-a for any a. Note that the average of two even numbers is even and the average of two odd numbers is odd, and the average of an even number and an odd number is not an integer. This suggests that we place all the even numbers on one side, and all the odd numbers on the other. That way there is no concern of the "left side" and the "right side" having anything dangerous in between them, we can simply treat them independently.

Within the even ones, we can simply halve them and solve the N/2 case. With the odd ones, we can subtract 1 from them, halve them, and use the N/2 solution. Thus we can see that if we can solve N/2, we can solve N. This is why we get solutions for powers of two; we simply must start by solving the case N=2 (this is trivial, the solution is (0 1)).

So, if we want to solve N=4, we have 0 1 2 3 to place in order; begin with the evens on the left and odds on the right and we get (0 2 1 3) as our solution. For N=8 we begin by sorting them to (0 2 4 6 1 3 5 7), next treat 0 2 4 6 as 0 1 2 3 to get (0 4 2 6) and treat 1 3 5 7 as 0 1 2 3 to get (1 5 3 7), so the solution is (0 4 2 6 1 5 3 7).
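The recursive construction can be sketched directly (function names are mine). This version recurses on the two halves for any N, which is valid by the same even/odd argument: halving (or shifting and halving) preserves the "no number between two numbers it averages" condition, and even and odd values never interact:

```python
def arrange(n):
    # Arrange heights 0..n-1 so no value sits between two values it is
    # the average of: evens (halved) on the left, odds (shifted and
    # halved) on the right, recursing on each half.
    if n <= 2:
        return list(range(n))
    evens = [2 * v for v in arrange((n + 1) // 2)]
    odds = [2 * v + 1 for v in arrange(n // 2)]
    return evens + odds

def is_valid(seq):
    # Check the museum condition directly, by brute force.
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            for k in range(i + 1, j):
                if 2 * seq[k] == seq[i] + seq[j]:
                    return False
    return True
```

For example, `arrange(4)` gives (0 2 1 3) and `arrange(8)` gives (0 4 2 6 1 5 3 7), matching the worked examples above.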

There is another way to generate this solution, begin by writing out the numbers 0 through N-1 in binary
000
001
010
011
100
101
110
111
Next, reverse the digits
000
100
010
110
001
101
011
111
Next, read them off as they are to get (0 4 2 6 1 5 3 7), which is the same solution.
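The bit-reversal version is a one-liner to sketch (naming is mine):

```python
def bit_reversal_order(bits):
    # Write 0 .. 2^bits - 1 in binary, reverse the digits, read back.
    n = 1 << bits
    return [int(format(i, "0{}b".format(bits))[::-1], 2) for i in range(n)]
```

For instance, `bit_reversal_order(3)` reproduces (0 4 2 6 1 5 3 7), and `bit_reversal_order(2)` gives (0 2 1 3).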

That first algorithm to get the solution is clean enough, and I was pretty happy to consider it the solution, but the second solution involving binary numbers in reverse is really cool (and I did not come up with it, I saw it on xkcd). It's a bit harder to prove that it always works, but it's not so counterintuitive that it's impossible to believe. Naturally, if you want the solution for an N that is not a power of two, just solve a larger N that is a power of two and ignore the extra numbers; so for N=5 we have (0 4 2 1 3), which after adding 1 becomes (1 5 3 2 4), which works as a solution.

Vases In A Line

Blogging is for chumps, I'm on extended vacation (also known as semi-unemployed). Anyway, time for a puzzle that I found on the forums at xkcd:
There is a museum with N vases they wish to arrange on a display. The vases are such that the first one has a height of 1cm, the second one has height of 2cm, and so on such that the Nth one has a height of N centimeters. The people at the museum wish to arrange the vases in a linear display in a special way, such that if we have three vases with heights a,b,c, with b being the average of a and c, b does not appear between a and c.

The puzzle is to come up with an algorithm that will generate an arrangement of numbers 1 through N such that no number appears between two numbers it is the average of.
So, for example if N=5, we could not have 3 appear anywhere between 2 and 4, or anywhere between 1 and 5, 2 could not appear between 1 and 3, and 4 could not appear between 3 and 5. An acceptable answer there would be (1 5 3 2 4).

Arrangements and Derangements

So, it seems that finishing my thesis has slowed down my blogging. Blogging was done as a way to slack from work, and now I don't really do much work anymore, leaving me with nothing to slack from. Basically, I need to start looking for a job so I can slack off with maximum efficiency.

Alright, time to look at the solution to the sorting puzzle from last time. I guess the solution was given in the comments, which means that I actually have something of a readership (which is a good thing?). Anyway, I have no problem with people posting solutions in the comments, but I'm going to do my own solution as usual.

First, let's try to solve it in some simple cases. We already know that the sequence (2 1) takes two moves on average; how about the sequence (3 1 2)? There seems to be no reason to hold any of them in place, and a randomization has an equal chance of going to any of (1 2 3), (1 3 2), (2 1 3), (2 3 1), (3 1 2), (3 2 1). The number of steps those are away from the sorted form is 0, 2, 2, k, k, 2, respectively (knowing that, say, (1 3 2) is two steps away on average), where k is the number of steps (3 1 2) is away. This allows us to say that
k = (1/6)(0+1) + (3/6)(2+1) + (2/6)(k+1)

which can be solved to get k=3. So this solves the N=3 case completely.

For N=4 we have a bit more of a complicated situation, it is clear that anything such as (_ _ _ 4) reduces to the N=3 case, but we have two separate things to solve, one is things of the form (2 3 4 1) where we have everything out of place, and the other is of the form (2 1 4 3) where we have two separate groups. We already know that it is possible to solve (2 1 4 3) in two steps by doing the first two separate from the second two, but it might be better somehow to just ignore that they are two separate cycles and randomize the whole thing. If it is better, then (2 3 4 1) will be less than 4 steps away on average. If it is more than 4 on average, then we should treat cycles separately, and if it is exactly 4 then it does not matter if we isolate cycles.

First, we will treat cycles separately, and let g(n) be the number of moves it takes on average to sort an n-cycle with each element out of place by one to the right (or left if you like). We have seen g(2)=2, g(3)=3, and trivially g(0)=0; g(1) isn't really a thing, since you can't have a cycle of 1 element that is out of place. Anyway, from a 4-cycle there are 24 places we can go: 1 of them is (1 2 3 4), the solution; 4C2 of them are 2-cycles (like (2 1 3 4)); 4C2/2 of them are two 2-cycles (like (2 1 4 3)); 8 of them are 3-cycles (you can see this by locking one of them down, and there are two ways the remaining three are all out of place); and the remaining 6 of them are 4-cycles. We can then see that
g(4) = (1/24)(0+1) + (6/24)(g(2)+1) + (3/24)(g(2)+g(2)+1) + (8/24)(g(3)+1) + (6/24)(g(4)+1)

which we can solve to get g(4)=4. If we chose the other strategy of just treating two 2-cycles as a 4-cycle, we would replace the term g(2)+g(2) with g(4) instead, and still get the solution g(4)=4, showing that the strategies are the same.

Given that, we might as well only lock down ones that are in the correct place, and not worry about the smaller cycles. Let f(n) be the number of steps it takes to get from a completely unsorted list of n elements to a sorted one ("completely unsorted" means none are in the correct place). Mathematicians have a name for a completely unsorted list, they call it a derangement, and the standard notation for the number of derangements of n elements is !n. A few early ones are !0=1, !1=0, !2=1, !3=2, !4=9. Our earlier strategy for finding f(4) can be expressed as
f(4) = (4C0)!0/4!·(f(0)+1) + (4C1)!1/4!·(f(1)+1) + (4C2)!2/4!·(f(2)+1) + (4C3)!3/4!·(f(3)+1) + (4C4)!4/4!·(f(4)+1)

that is, f(4) is the chance that exactly k elements land in the wrong place, times f(k)+1, summed over k. The chance that exactly k elements are in the wrong place is (4Ck)!k/4!: choose which k positions are wrong, then count the derangements of them (note that summing (4Ck)!k over k gives exactly 4!). f(1) isn't really defined, but the !1 factor is zero anyway, I just included that term for completeness.

From this we can deduce the general form
f(N) = Σ (NCk)!k/N!·(f(k)+1)

summed k from 0 to N. If we hypothesize that f(k)=k for all k less than N, we can then proceed to prove f(N)=N, though it is rather tedious to do it this way.
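The recurrence is easy to check numerically. Here is a small sketch (names are mine) that computes the derangement numbers and solves the recurrence for f(N), using the count (NCk)·!k of permutations with exactly k misplaced elements:

```python
from math import comb, factorial

def derange(n):
    # !n, the number of derangements of n elements,
    # via the recurrence !n = (n-1)(!(n-1) + !(n-2))
    if n == 0:
        return 1
    d_prev, d = 1, 0  # !0, !1
    for m in range(2, n + 1):
        d_prev, d = d, (m - 1) * (d + d_prev)
    return d

def f(N, _memo={0: 0.0}):
    # Expected randomizations to sort a derangement of N elements:
    # f(N) = sum over k of (NCk)*!k/N! * (f(k)+1), solved for f(N).
    if N not in _memo:
        fact = factorial(N)
        known = sum(comb(N, k) * derange(k) * (f(k) + 1) for k in range(N))
        p_same = derange(N) / fact  # chance all N elements are again misplaced
        _memo[N] = (known / fact + p_same) / (1 - p_same)
    return _memo[N]
```

Running it confirms f(2)=2, f(3)=3, f(4)=4, and so on up the line, in agreement with the f(N)=N hypothesis.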

The more intuitive way to get f(N)=N was given in the comments: throw the N unsorted elements into the air; each one has a 1/N chance of landing in the right spot, and you threw N of them, so on average one of them will land in the right spot each step. Thus it takes N steps.
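That intuition is easy to test with a quick seeded simulation of the lock-and-rerandomize strategy (a sketch of my own; the names are mine):

```python
import random

def steps_to_sort(perm, rng):
    # Lock every element already in its place, shuffle the rest, repeat.
    perm = list(perm)
    steps = 0
    while any(v != i for i, v in enumerate(perm)):
        wrong = [i for i, v in enumerate(perm) if v != i]
        vals = [perm[i] for i in wrong]
        rng.shuffle(vals)
        for i, v in zip(wrong, vals):
            perm[i] = v
        steps += 1
    return steps

rng = random.Random(0)
trials = 40000
# (1,2,3,0) is a derangement of 4 elements, so the average should be near 4
avg = sum(steps_to_sort([1, 2, 3, 0], rng) for _ in range(trials)) / trials
```

The average lands close to 4, as expected.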

Sorting Numbers

Alright, new puzzle time. Bart showed me this one from the Code Jam this year:
You have a list of the integers 1 through N in some random order, and you are going to attempt to sort them. The only operation you have is to randomize any subset of these integers. Specifically, you may select any subset of the integers and lock them in place, and then the other ones get rearranged randomly. You may perform this operation as many times as you like, but want to do it as few times as possible.

Given the initial list of integers, what is the optimal average number of steps it will take to get the list in order?

The original puzzle had a bit more coding bubble-wrap around it, but not too much. It also had a nicer explanation of why the algorithm is so weird, but whatever.

As a few specific examples, if the initial list is (2 1 3), you can hold the 3 in place and randomize the other two until they fall in place, this will take on average 2 steps. If the initial list is (2 1 4 3) you can hold the first two in place until the second two land in order and then hold the second two in place until the first two land in order. This will take 4 steps on average. Given the initial list, is there an easy way to determine how many steps it will take on average? (Hint: there is, or I wouldn't consider this puzzle interesting).

Cards and Coins

Alright, time for the solution to the score cards puzzle I put up last time. The comments contained the essentials of the solution, but I still want to type it up anyway.

First, suppose we have a target number N and a solution using bases a,b,c,d... That is to say, the product abcd... is at least N with the sum a+b+c+d+... minimized. Suppose the proper solution had a 7 as one of the bases; it is just as good to replace the 7 with a 5 and a 2, as it makes the sum no larger and in fact makes the product bigger. We can similarly replace 6 with 4 and 2, and even 4 with 2 and 2. In all cases where a is larger than 4 it is better to replace a with a-2 and 2 (this is sort of obvious, but if you wanted to prove it, note that 2(a-2) > a can be rearranged into a > 4). So a proper solution has no need to contain numbers greater than 3; it should only have 2's and 3's.

Next, note that if a proper solution ever had three 2's, we might as well replace them with two 3's, since the product is larger and the sum the same. So any proper solution need not contain more than two 2's. Thus, we find the algorithm for the optimal bases for a target N. Select k such that 3^k is at least as large as N and 3^(k-1) is smaller than N. Compare 3^(k-1)·2 and 3^(k-2)·2·2 to N. Whichever of those is smallest while still being at least N is the expression of the solution.
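That algorithm is short enough to sketch (my own naming): find the smallest k with 3^k ≥ N, then downgrade as many 3's to 2's as the product allows, each downgrade saving one card:

```python
def min_cards(n):
    # Optimal bases are all 2s and 3s with at most two 2s: find the
    # smallest k with 3^k >= n, then see how many 3s can be swapped
    # for 2s while keeping the product at least n.
    if n <= 1:
        return 0
    k = 0
    while 3 ** k < n:
        k += 1
    best = 3 * k                          # k threes
    if k >= 1 and 2 * 3 ** (k - 1) >= n:
        best = 3 * (k - 1) + 2            # one 2, saving a card
    if k >= 2 and 4 * 3 ** (k - 2) >= n:
        best = 3 * (k - 2) + 4            # two 2s, saving another
    return best
```

For example, `min_cards(60)` gives 12 (either 3^4 or 2^6), and `min_cards(100)` gives 13, using bases 3, 3, 3, 2, 2, which beats the 14-card base 5/5/4 scheme mentioned below.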

It is somewhat surprising that base 3 is the optimal solution; most people (myself included) would intuitively think that base 2 is better, but that is just not the case. In some cases base 2 does work fine: our solution gives an optimal answer, but it is not the only one. For example, if the target number is 60 then this solution tells us to use 3^4, however using 2^6 works just as well. As the numbers get sufficiently large, however, base 3 will always win out.

One can "generalize" this problem somewhat by assuming the numbers do not have to be integers. So, we have to select numbers a,b,c,... to minimize their sum and have their product be N. There is nothing to be gained by overshooting N if we are using reals rather than integers, so we might as well hit it exactly. What's more, if two different numbers a and b appear in the solution, how do we minimize the sum of a and b while keeping the product equal? This is just minimizing the perimeter of a rectangle of fixed area: make it a square, set a=b. So we actually should only use a single number a. That is, we want p copies of a such that the sum a+a+a+... is minimized while the product equals N. That is, minimize pa with N=a^p. Minimizing a·(Ln(N)/Ln(a)) over a gives us a=e. Giving us that the optimal base is base e. This is sort of why 3 was best, it is the closest integer to e, with 2 also sort of being part of it (I'm not sure how to interpret that, but it feels right).
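The cost function in that derivation is easy to plot or compare numerically (a sketch with my own naming):

```python
import math

def cards_per_target(a, N):
    # Using p digits of base a with a^p = N costs p*a = a*ln(N)/ln(a)
    # cards; minimizing a/ln(a) over real a gives a = e.
    return a * math.log(N) / math.log(a)
```

Comparing values for a fixed target, base 3 beats both base 2 and base 4 (which in fact tie, since 2/ln 2 = 4/ln 4), and base e does slightly better still.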

This also can be taken as a commentary on currency, the optimal base for a currency is base 3 (base e would be better, but then your coins are really weird and making change is hard). Base 3 currency is optimal in the sense of needing the fewest total number of coins to represent a variety of different values. Of course, I doubt we will see a switch to base 3 coinage anytime soon, a single coin of value 3 could probably happen (and probably has, but I know of no real world examples), but coins of value 27 and 81 would be pretty hard to get used to for people who are used to thinking in base 10.

Score Cards

Alright, time for a new puzzle. Though first, I meant to post last time that the lemma I used didn't actually need the numbers ai to be positive reals; it is sufficient for them to be real. The proof is the exact same even: the only things that needed to be positive in the proof were the zi, and positivity is guaranteed by L being the minimum Li. Anyway, whatever. New puzzle, which I first learned from Bart:
There is a school that is using flip cards to keep track of scores in their sporting events. They use a series of cards to represent the ones place of the score, and another series for the tens place, and so on for as many places as they feel they need. They want to minimize the number of cards used, however, and (being a very mathematically inclined school) they are willing to change from base 10 to another base to accomplish this. For example, if you wanted to represent scores 0-99 in base 10 you would need 20 cards (10 for the ones place and 10 for the tens place); however, you could do even better by using base 5, having 3 sets of 5 cards (15 cards total), and being able to represent everything from 0-124. Naturally, having the ability to represent extra numbers is not harmful at all; if it uses fewer cards, it is better.

Actually, this school is even more mathematically inclined than that, they are even willing to use different bases for each place, for example the first digit could be base 5, the second digit base 5 and the third digit base 4, this uses only 14 cards and can represent 100 different possible numbers.

The puzzle is this: given a target number N, find the minimum number of cards it would take to represent at least the numbers 0 through N-1 on this style of scoreboard.


The solution to this puzzle is somewhat surprising, if you do think you have found a solution, make sure you have checked it out in some medium-sized cases (like numbers 10 through 25) to make sure it works, or better yet have a full proof done that it is actually optimal.

Nonlinear Prisoners

OK, I just finished my PhD departmental defense, so now it's time to look over the solution to the prisoners and light switches puzzle. I wanted to have a post calculating out how long the actual strategy would take for the prisoners to complete, so here we go.

First, we need to figure out how long each procedure will take to perform, starting with the upper bound procedure. The upper bound procedure has k nights of activating people, followed by k nights of deactivating people. If the number of prisoners is N, they will have to run k from 1 to N before the warden is forced to reveal there is an upper bound; this will take N(N+1) nights, and they will have an upper bound B=2^N. They cannot say the upper bound is lower than that, as they do not know what the warden chose to do.

A ping procedure takes B nights to perform, obviously, and the bing procedure takes 1 night. The group division procedure is the awful one, for it takes 2^z nights if it is called for on night z.

Now, how many times can the warden make them call the group division procedure? Well, that depends on how he groups them up. If he groups the prisoners into very large groups and then splits them off one at a time until they are all in their own groups of size one, the warden will maximize the number of times they perform the group division procedure.

Suppose we have N prisoners. As they run the upper bound procedure, the warden cannot do better than to split them into lg(N) groups. Actually, odds are that if he does this maximally, they will probably find their value of B faster and it will be smaller, but I don't want to do a more complicated calculation than this. So, they have taken on the order of N^2 nights to find B; the group division procedure then takes something like 2^(N^2) nights. Next they do lg(N) pings, costing B·lg(N) nights, but that is nothing compared to 2^(N^2). The adversary makes one more group, so they call the group division procedure again and it takes 2^(2^(N^2)) more nights.

We can see where this is going: the warden can make them call the group division procedure N-lg(N) times, resulting in the prisoners needing x·N^2 nights, where x is 2^2^2^2^..., N-lg(N) times, also known as 2 tetrad (N-lg(N)). I tried calculating out that number for something like N=20, but it was huge beyond my comprehension. So then I tried N=5, and again, it was huge beyond my comprehension. Basically, this strategy is stupid.
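For a sense of how hopeless these towers are, here is a tiny sketch of the tower function (my own naming):

```python
def tetrate(base, height):
    # base^(base^(...^base)), a tower `height` levels tall
    v = 1
    for _ in range(height):
        v = base ** v
    return v
```

A tower of height 4 is already 65536, and a tower of height 5 is 2^65536, a number with 19729 decimal digits; heights in the ballpark of N-lg(N) are simply beyond comprehension.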

Linear Prisoners

Alright, time to do the second half of the solution to the prisoners and light switches problem. In order to explain the solution in full, we first must make use of a lemma (that's right, this solution is so epic that it actually needs a lemma).

First, the statement of the lemma:
Suppose we have a system of unknowns a1, a2, ....ak that obey a series of equations. Let S={a1, a2, .... ak} and suppose the equations have the following properties:
1) a1=1 is one of the equations
2) For every nonempty, proper subset P of S, there exists a subset Q of S such that there exists an x in Q that is not in P and one of the equations reads "the sum of P" = "the sum of Q"
3) There exists a positive solution to the equations

Then, the solution is unique over the positive numbers.

Alright, let's try to make sense of this lemma before I go proving it. First, we have k unknowns that obey some equations; how many equations exactly? Well, for every nonempty proper subset P of S there is an equation, and there are 2^k-2 such subsets; also there is a1=1. In total we have 2^k-1 equations for this system, so it is probably extremely overspecified. The lemma says that if we are so lucky that a solution actually exists, there is not going to be more than one.

As an explicit example, let's suppose we have three unknowns, a, b, c, that obey these equations:
a=1
a=b
b=a
c=a+b
a+b=c
a+c=b+c
b+c=a+c

We can see that it satisfies the conditions of the lemma: for every nonempty proper subset of {a,b,c}, that subset appears summed on the left side of some equation, and there is such an equation where the right side has at least one element that is not on the left. Clearly one can see that a=1, b=1, c=2 is a solution, and it is fairly obvious that it is unique. In fact, we don't even need to restrict a, b, and c to be positive to get uniqueness, and I think the lemma is true without it, but the proof is very simple with that assumption. What's more, for our purposes of solving the puzzle, we only need the lemma to be true over the positive integers.

Note that instead of b+c=a+c, we could have had b+c=a instead. Then we would still have satisfied condition 2), but not condition 3), making it so that the lemma gives us nothing useful.

Alright, let's go with the proof now. Suppose we have our system of equations and we have two sets of positive solutions {x1...xk} and {y1...yk}; we will now prove that the xi and yi represent the same solution. Define Li=xi/yi. The set of the Li is finite, so it has a smallest element; call that L. Define zi=xi-Lyi. Since L is no larger than xi/yi, it must be the case that each zi is nonnegative, but since L=xi/yi for some i, it must be the case that at least one zi is zero. We also know that the zi satisfy the same sum equations, being a linear combination of the x's and y's.

Next, I claim that all the zi are zero. Suppose some of them are not. Let P be the set of indices with zi=0, and suppose P is not everything; then by 2) there exists a Q such that the sum over P equals the sum over Q and Q has an element that is not in P. But the sum over P is zero, and the sum over Q cannot be. This contradiction shows that P is either empty or everything. Since P is nonempty, we have that every zi=0 and so xi=Lyi. Since a1=1 is one of the equations, x1=y1=1, so L=1 and all the x's equal the corresponding y's, completing the proof of the lemma.

Good times.

Now, lets construct the solution to the puzzle.

Recall we had three procedures previously, the "upper bound procedure", to find an upper bound B on the number of prisoners, the "ping", a B round procedure to send out a common knowledge signal to everybody, and the "bing", a one round procedure to send out a signal with the knowledge that the number of signals out equals the number of signals received.

We will need to define one more procedure for this solution, the "group division procedure". If this procedure is called for on night W, each prisoner looks at their history for the past W-1 nights and treats it as a binary number between 1 and 2^W (treat a history of all zeros as 2^W, I guess). We then have 2^W rounds of pinging, where if a prisoner's history was i, they ping on the ith round. This divides people into groups. First, "you" are in group 1, then the first people to ping during the group division procedure are the people in group 2, the next people to ping are group 3, the next are group 4, and so on.

At the end of the group division procedure we have groups G1 through Gk with G1 having only "you" in it. The number of groups k is common knowledge and every person knows what group they belong to.

Note that if we call the group division procedure and get k groups, and then later we call it again and get h different groups, it must be the case that h is no smaller than k; people who had different histories at one point cannot later "merge". What's more, if h=k, each person must be in the same group as they were last time.

The main goal of the strategy is to have every nonempty, proper subset of {G1....Gk} send out a bing, which will be received by people in other groups. We then check to see if everybody in the same group received the same signals; if they did, we will have a series of equations that we can solve, and the lemma will guarantee the uniqueness of the solution. If it was not the case that everybody in a group received the same bings, then we can use that to make a finer group division. The adversary will have to choose between making finer and finer groups or giving us our series of equations. Eventually the groups will all be size 1, and then the adversary will have no choice.

Alright, explicitly the strategy is:

First, perform the upper bound procedure, obtain a common knowledge upper bound B on the number of prisoners.

Next, perform the group division procedure, there are now groups G1 through Gk and each prisoner belongs to some group and knows what group they belong to. k is also common knowledge.

Next, for each nonempty proper subset P of {G1....Gk}, send out a bing if you are in a group which belongs to P. There are 2^k-2 such subsets, so there will be that many rounds of bings. Make a note of any bings you receive and the particular set P that each came from.

Next, call the group division procedure again. If the number of groups is larger than it used to be, go back to the previous paragraph using the new groups; if the number of groups is the same as before, it must be the case that everybody in a given group has had the exact same history the whole time, even throughout the bing rounds.

Next, for each nonempty proper subset P of {G1....Gk}, we will have k rounds of pings. Send out a ping on the ith round if you belong to group Gi and you received a signal from P during the most recent bing round.

Now, for each nonempty proper subset P we know what groups received a signal from that set, meaning that we have an equation that reads "the sum of P" = "the sum of Q" for some known Q. What's more, there must be something in Q not in P, as there cannot be a closed loop that is less than all the prisoners. We know the size of group G1 is exactly 1, as that group is "you". Finally, a solution to these equations must exist, since the prisoners actually performed it. Our lemma guarantees that there is not more than one solution to this set of equations, so we can uniquely solve them to get the size of every single group. Since every prisoner was in some group, we can just add the sizes of the groups to get N.

This completes the deterministic solution to the puzzle. It is a bit funny that the final value of N is common knowledge to the prisoners, none of the solutions presented made use of the fact that any individual prisoner can declare knowledge of N, it didn't have to be common knowledge. Anyway, I doubt that there is much improving on this solution, because you have such limited communication available and the adversary can really restrict what sort of things you can attempt. Anyway, next time I want to do a post about how long this solution might take to implement. My guess is something like 2 tetrad N, but I haven't calculated it out yet.

Random Prisoners

Alright, time to do the solution to the latest prisoners and light switches puzzle. I'm not going to post everything about the full solution here, because that would take too long, but I do want to fully examine all the probabilistic solutions.

First of all, it is quite easy to establish an upper bound on the number of prisoners that there are. We will have two types of prisoners, called "active" and "inactive". If you are active, turn your switch on every night; if you are inactive, leave it off. Suppose we selected a number k that all the prisoners knew, and we begin with everybody inactive, but "you" begin active. If the light in your room comes on on a given night, you change your state to active and leave it that way. If we keep doing this for k nights, then at the end of the process there will be somewhere between k+1 and 2^k activated prisoners (assuming there are that many prisoners). After the k nights have passed, we can then check if there are any inactive prisoners still by reversing the strategy: make it so that if you see your light off, you turn inactive and stay that way. If there were still any inactive prisoners, then it will take no more than 2^k nights to deactivate everybody.

The procedure is as follows: for each k starting at 1, spend k rounds activating people, followed by 2^k rounds deactivating people. If at the end anybody is inactive, then it must be the case that everybody is inactive, and the number of prisoners must be greater than k+1 (else everybody would have been active for sure), and we move on to the next value of k. If at the end of the deactivating process there are no inactive people, then the number of people can be no greater than 2^k.

We will call this entire process the "upper bound procedure"; it will be used to establish an upper bound B on the number of prisoners. B will always be a power of 2, but that doesn't really matter for what we will do; it is sufficient that B is determined in a well prescribed way and at the end of the process we have determined an upper bound that every prisoner is aware of. What's more, they are all aware that they are all aware of it, and so on, thus making the upper bound "common knowledge" among the prisoners.

Once we have an upper bound on the number of prisoners, we can construct another procedure which I will call a "ping". A ping is a method for prisoners to send out a signal to everybody, such that a signal being sent becomes common knowledge. If the strategy calls for a ping on day M, the procedure lasts for B (=upper bound) days, from M to M+B-1. First, each prisoner decides if they would like to send out a signal or not; if a person would like to send out a signal they are "active", otherwise they are "inactive". As always, turn on your switch if you are active, and off if you are not. If your light comes on, change your state to active. If anybody was active, then by the end of the procedure everybody will be active; if nobody was active at the start then nobody will be active at the end. Thus, if our strategy calls for a ping procedure on some specified day M, the prisoners will be able to distinguish "somebody wanted to send out a signal" from "nobody wanted to send out a signal". Note that they cannot distinguish "one person wanted to send a signal" from "two people wanted to send a signal"; also note that the existence of a signal being sent becomes common knowledge.
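The spreading behavior of a ping is easy to simulate. The sketch below (my own naming) ignores the warden's nightly rearrangement for simplicity — rearranging can slow the spread but never stop it, since in any circular arrangement some active prisoner sits next to an inactive one:

```python
def run_ping(active, nights):
    # Each night, switch i lights the next cell clockwise (cell i+1);
    # anyone whose light comes on becomes, and stays, active.
    active = list(active)
    n = len(active)
    for _ in range(nights):
        lit = [active[(i - 1) % n] for i in range(n)]
        active = [a or l for a, l in zip(active, lit)]
    return active
```

With B at least the number of prisoners, everybody ends active exactly when somebody started active, which is the property the ping relies on.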

We now have enough structure to set up our first solution. Begin by performing the upper bound procedure to find an upper bound B for use in the ping procedure. Next, have an agreed on number F which depends on B, such that if B independent integers are chosen between 1 and F, it is unlikely that any two of them are the same (I will make the concept of "unlikely" more precise later). Next, each prisoner selects a random integer between 1 and F. Now we call for F rounds of pings. On the ith round of pinging, send out a signal if your random number is i. After that, count the number of rounds that signals were sent out, guess that that number is the number of prisoners N.

As an example, suppose we determined that the number of prisoners was no greater than 8 (B=8), and we have agreed that in the case B=8, we will use F=100 000 000 (perhaps F=10^B was agreed). You perform the 100 000 000 rounds of pinging, and signals are sent on the following rounds:
18398520
19216152
41001462
53182675
71507596
90195088

Then odds are that there are 6 prisoners. It's possible that there are 7 and two of them generated the same random number, but that is unlikely; it is even possible that there are 8 and there were two duplicates, but that is very very unlikely.

Given B random integers from 1 to F, what is the chance that two of them are the same? I did that calculation back when I was working out stuff about the birthday paradox; the answer comes out to be approximately B^2/F, provided that B^2 is much smaller than F. So, if you choose F=B^2/e, then you fail with probability no more than e (this is approximately true as long as e is small; you might want to choose F=B^3/e to be sure).
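A quick seeded simulation backs up that estimate (a sketch with my own naming):

```python
import random

def collision_rate(B, F, trials, rng):
    # Fraction of trials in which B choices from F values are not all distinct
    hits = 0
    for _ in range(trials):
        picks = [rng.randrange(F) for _ in range(B)]
        if len(set(picks)) < B:
            hits += 1
    return hits / trials

rng = random.Random(1)
p = collision_rate(8, 1000, 100000, rng)
```

With B=8 and F=1000 the observed collision rate comes out near the pairwise estimate of about B(B-1)/(2F) ≈ 0.028, comfortably under the B^2/F bound of 0.064.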

This completes the strategy that will work with probability 1-e for any e>0. This is also as far as I was able to get in figuring out this puzzle; everything after this was me cheating and reading other people's answers.

For the next strategy, we will attempt to number the prisoners, and then each round we will check to see if anybody is still unnumbered, and then try to number them. If nobody is unnumbered we will declare success. The prescription will be such that everybody knows their own number and if m people are numbered the number m is common knowledge. It will begin with "you" being number 1 and everybody else being unnumbered. The strategy will gradually assign new numbers to people and will keep the total m common knowledge as we do it.

We will need a new, fairly simple, procedure for this solution (we will also need it for the final solution, so I might as well set it up now). A "bing" is a one night procedure: a person who wants to send out a signal turns on their switch that night, and people who do not want to send out a signal do not. Some other prisoners receive the signals, and some do not. The only thing that is common knowledge here is that the number of signals sent must equal the number of signals received (look, I said it was a simple procedure, but I need a name for it because I will use it a lot).

Suppose that so far we have numbered people 1 through m, with m being common knowledge. First we have a ping round: send out a ping if you are unnumbered. If nobody sends out a ping, we are done, N=m. If somebody out there is unnumbered, we will next do something that will select a group of people S such that every person in S is an unnumbered person, and we will then assign numbers to the people in S.

We perform a bing where if you have a number you send out a signal, and if you are unnumbered you do not. If you received a signal that night and you are unnumbered then you are in S, if you received a signal that night and are numbered just remember it for now, we will need it later. S is necessarily nonempty, because there must have been an unnumbered person next to a numbered person. We must find the size of S and make its size common knowledge. We do this by having m rounds of pings, where on the ith round, you send out a signal if your number is i and you received a signal during the earlier bing night. This will determine how many "wasted" signals there were during the bing night. If there were no wasted signals then the size of S would equal m, and for each wasted signal then S will be 1 fewer than that. So at this point we have a set S of unnumbered people, each person who is in S knows that they are, and the size of S is common knowledge. We will denote the size of S with s.

Each person in S chooses a random integer between 1 and s, and then we have s rounds of pings: on the ith round, send out a ping if your random number was i. If there is a round where nobody sends a signal, then two people in S must have chosen the same random number; go back to the start of this paragraph with the same set S. If a ping was sent every round, we are good: each person in S has selected a different number between 1 and s. The person in S who chose i can now claim the number m+i, and each numbered person has a unique number, so we have now numbered people m+1 through m+s. Go back to the part of the strategy where we asked if anybody was still left unnumbered.
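
This random renumbering loop is simple to sketch in code (a toy version of my own, with invented names); the protocol-level fact that a silent round reveals a collision becomes the all-distinct check here:

```python
import random

def claim_numbers(s, rng=random):
    """The s members of S each pick a random number in 1..s, then we run s
    ping rounds. The attempt succeeds only if every round has at least one
    ping, i.e. the picks form a permutation; otherwise everybody retries."""
    attempts = 0
    while True:
        attempts += 1
        picks = [rng.randint(1, s) for _ in range(s)]
        if len(set(picks)) == s:  # no silent round anywhere
            return picks, attempts

# the member of S who picked i then claims the number m + i
```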

This strategy will always work eventually: a nonempty set S will get chosen, and after some number of attempts its members will eventually pick unique random numbers that they can use to number themselves (each attempt succeeds with probability s!/s^s, so it takes about s^s/s! attempts on average). Gradually everybody will get numbered. Notice this works even if the adversary is aware of what random numbers the random number generators are going to generate, though he might be able to delay it for quite some time in that case.

That is all I am going to post about this for now. Next time I will go over the full deterministic solution.

Crazy Warden

Alright, so I found a new puzzle on the xkcd forums, and it is just far too epic to not post here:
There are N prisoners and N cells in a prison (though the prisoners do not know the value of N). The cells are all identical from the inside, you cannot tell them apart. The cells are arranged in a circle, and in each cell there is a switch. The switch controls a light in the next cell clockwise around the circle. However, the power supply for these lights is usually off, and it only comes on at midnight each night for a tenth of a second (a tenth of a second not being long enough for a prisoner to see the state of their light and make a decision about their switch).

Each day, the warden will take the prisoners out of their cells and clean out the cells, setting the light switches back to "off". The warden will then place the prisoners back into the cells as he sees fit (the prisoners do not see each other during this process). At any time, any prisoner may announce "I know how many prisoners there are." If they are then able to state the value of N, the prisoners win and are released; if they are incorrect, they lose and all the prisoners are killed.

You are one of the N prisoners. Before the game, you may send out a single email outlining the strategy the prisoners must follow. Your email cannot refer to any prisoner in particular, as you know nothing about their identities. Find a strategy that guarantees the prisoners escape against the warden acting as an adversary.

Man, those wardens are nuts. Anyway, the light switch thing can be equivalently expressed as "every night each prisoner must simultaneously transmit a '0' or a '1' to the next prisoner clockwise from them". Your strategy may contain two types of prisoners, "you" and "everybody else".

The puzzle is pretty hard; I was only able to make a small amount of progress before giving up and just looking at the solution. There are various degrees of solution one can arrive at: there is a fairly easy strategy that works with probability 1-e for any e>0, and there is a better strategy that works with probability 1. In both of these cases, it is necessary to assume the prisoners have a private random number generator at their disposal to actually generate said probabilities. The full solution, however, is a guaranteed escape, and in fact guarantees that for each N there is a K such that "if the number of prisoners is no greater than N, we will escape before K days have passed". Though, K is a really really huge number compared to N.

The Lowest One

Solution time! Last puzzle was about Infinite Hats. Actually, it's funny: I found a solution and then I told it to Mark, and he found a slightly different solution with the exact same probability of working, but then those clever people over at Tanya Khovanova’s Math Blog found a better solution than mine. Anyway, let's go over my solution first, because I'm egotistical like that.

First let's consider the two person case. As soon as one person guesses, we are at probability 1/2 of being correct, and we simply need to make sure that both of them are right together or wrong together. Specifically, I want to make sure that if Alice guesses wrong, Bob will also guess wrong (say). In this case, suppose Alice looks at Bob's hats and writes down the position of the lowest black hat, and similarly Bob writes down the position of the lowest black hat on Alice's head. So if Bob's first hat is black, Alice will write down 1, and then if she is wrong Bob will also be wrong.

What is the actual chance of this working? We will calculate it by looking at the hats on their heads slowly, starting at the bottom. If both are black then we lose; if one is white while the other is black, then the person with the white hat is correct; and if both are white we move up to the next pair of hats. Let P be the chance that one person ends up correct; in this case we have
P = 0*(1/4) + 1*(1/4) + 1*(1/4) + P*(1/4)

that is, 1/4 of the time we get BB and we are wrong, in the cases where we get BW or WB we get one person correct, and in the case of WW we haven't really changed anything. One can solve this to get P=2/3.

Now, given that one person is correct, what is the chance that the other person is also correct? Suppose we have already seen WB go by, so person 1 is already correct; now WB and WW both have no effect, while BB results in a loss and BW results in a win. Thus the odds are 1/2 that the second person is correct. All in all, the probability of winning the game is 1/3, better than 1/4.

Actually, there is a faster way to arrive at that answer of 1/3. Consider the sequence of hat pairs from the bottom up (each pair is one of WW, WB, BW, BB), and look for the lowest occurrence of each of BW, WB, and BB. If BW and WB each occur before BB, then we win. That is, of those three, if the highest first occurrence is BB then we win, which is clearly probability 1/3.
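
A quick Monte Carlo check of the two-person strategy (my own sketch, truncating each head to 50 hats, which only misbehaves with probability about 2^-50) lands right at 1/3:

```python
import random

def trial(rng, cap=50):
    # 1 = black, 0 = white; cap hats is plenty, no black in cap hats is ~2^-50
    alice = [rng.randint(0, 1) for _ in range(cap)]
    bob = [rng.randint(0, 1) for _ in range(cap)]
    try:
        ga = bob.index(1)    # Alice names the lowest black hat she sees on Bob
        gb = alice.index(1)  # and vice versa
    except ValueError:
        return False         # no black hat in the first cap hats: call it a loss
    return alice[ga] == 0 and bob[gb] == 0  # win iff both named hats are white

rng = random.Random(42)
trials = 30_000
wins = sum(trial(rng) for _ in range(trials))
# wins / trials lands near 1/3
```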

Naturally this strategy generalizes to 100 people quite easily. Simply look for the lowest level at which all of your teammates' hats are black and write down that number. The relevant sequences are BBB...BBW, BBB...BWB, BBB...BWBB, ..., BWB...BB, WBB...BBB, and BBB...BBB, of which there are 101. There is a 1/101 chance that the last of those to appear is the all-black one, and then we win. This generalizes to a strategy that works with probability 1/(N+1).

Mark's strategy was slightly different: write down the lowest occurrence of all white hats rather than all black hats (attempting to align when the players are right rather than when they are wrong). The relevant sequences are the all-white one and the all-white-with-one-black ones, again N+1 of them in total. Naturally the game is won if all-white occurs before any of the others, and that is again 1/(N+1).

The better strategy can be arrived at in a similar fashion to some other hat puzzles, using modular arithmetic. First, let Ki be the location of the lowest white hat on person i's head, and let S be the sum of all the Ki. Each person can see every Kj except their own, so a person who knew S could deduce their own Ki. Fix a constant CN (the constant can depend on N, the number of players). Each person is to make their guess assuming the following two things:
1) S mod CN = 0
2) each Ki is no larger than CN

If these statements are actually true, then this is enough for the players to guess correctly. What is the chance that those two statements are actually correct? Well, the largest Ki is going to be about as large as lg(N) (lg being log base 2), so if CN - lg(N) gets large as N goes to infinity, we should be safe as far as the second point is concerned. As for the first one, it holds with probability about 1/CN, since S is essentially a random number, and the two points are approximately independent. For example, you could use CN = 2·lg(N), and then you have a chance of winning of about 1/(2·lg(N)), maybe a little worse, but this is asymptotically much better than 1/(N+1). One of the posters at Tanya Khovanova's Blog showed that it is basically optimal to use CN = lg(N) + lg(lg(N)) - lg(lg(lg(N))), but that's neither here nor there.
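
Here is a rough Monte Carlo check of this strategy (my own toy code, not from the blog), with N=16 players, CN=8, and heads truncated to 40 hats; the win rate lands around 1/CN, comfortably above the 1/(N+1) of the earlier strategy:

```python
import random

rng = random.Random(7)
N, C, CAP = 16, 8, 40        # players, modulus C_N = 2*lg(N), hats kept per head
wins, trials = 0, 20_000
for _ in range(trials):
    # bit p of a head is hat p+1; a set bit means white
    heads = [rng.getrandbits(CAP) | (1 << (CAP - 1)) for _ in range(N)]
    # (the forced white hat at the top only distorts things with prob ~2^-39)
    K = [(h & -h).bit_length() for h in heads]   # lowest white hat, 1-indexed
    S = sum(K)
    ok = True
    for i, h in enumerate(heads):
        g = (-(S - K[i])) % C or C               # guess assuming S = 0 mod C
        if not (h >> (g - 1)) & 1:               # the named hat must be white
            ok = False
            break
    wins += ok
rate = wins / trials
# rate comes out roughly 1/C, versus 1/(N+1) = 1/17 for the simpler strategy
```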

It is interesting that this better solution still always points at the lowest white hat on your head, rather than just any white hat, so it seems clear one should be able to do better than this, but it is not clear how much better.

Infinite Hats

Time for a new puzzle. I found this one on Tanya Khovanova’s Math Blog, as so many of my puzzles are:
There is a room with 100 logicians, and each of them has an infinite tower of hats upon their head. Each hat can be either white or black, and the sequence of hats upon a given logician's head is random. Standard hat rules apply, of course.

At a specified time, each logician must simultaneously write down a number. For each logician, the hat of the number they wrote down is checked (so if Bob wrote down 13, we look at the 13th hat on his head), and if that hat is white, that logician is "correct". If all the logicians are correct, they win, if any of them are incorrect they will play the game again tomorrow with new random hats.

Before the game, they may strategize. The naive strategy gives you a 1/2^100 chance to win, so it will take about 2^100 days to be released, but they do not want to wait that long. Find a strategy that will probably get them out within the year.


Because the hats are random, you may use a strategy that only specifies a number with probability one. For example, Bob could look at Alice's hats for "the first occurrence of 578 consecutive black hats", which will appear somewhere on her head with probability one.

Unfair Auction

Time for the solution to that stupid auction game puzzle from last time.

As with so many puzzles like this, it basically comes down to solving the simpler cases first and moving up. Actually, we will generalize the puzzle a bit, and assume Alice is starting with $N.

Alright, in the "one coin means winning the game" case, it is obvious Bob simply must open the bidding with $N. A bid of any less hands Alice the win, and a bid of any more is unnecessary, so we can see that Bob must start with $N also. We can actually say something a bit stronger, if we are in a position where Alice needs one more coin to win and Bob needs one more coin to win, then Bob wins iff he has at least as much money as Alice has left.

Next, the two coin game. There are four positions to consider here; we will first introduce some variables to help the notation. Let A be the number of coins Alice still needs in order to win, and let B be the number of coins Bob still needs. Then the positions can be represented as (A,B) being any of (1,1), (1,2), (2,1), (2,2). Obviously if we are in (1,1) then Bob needs as much money as Alice, and if we are in (1,2) then Bob needs twice as much money as Alice, as he must bid enough that she can't outbid him, twice. In the position (2,1), Alice will lose if she lets Bob get a coin, so when Bob bids x, Alice must pay x+1, and then Bob must have as much money as Alice has left so that he can win the resulting (1,1) position. Thus Bob's money - x = Alice's money - (x+1), so Bob can have 1 less dollar than Alice in this position, and then he can bid anything he wants and win. Finally, the position (2,2): Bob will have F dollars and Alice will have N dollars. Bob bids z dollars on the first coin. If he wins it, he will have F-z against Alice's N, and we know that he can then win the game if F-z = N-1. If Alice outbids him for z+1, Bob will have F-z dollars and Alice will have N-(z+1), and we know that Bob can win the game if F-z = 2(N-(z+1)).

Combining those equations, we get z = (N-1)/2 and F = (3/2)(N-1), which tells you the amount of money Bob needs to start with.

Alright, we can just keep that pattern going. Define F(A,B,N) as the amount of money Bob needs to win from the position where Alice needs A more coins, Bob needs B more coins, and Alice has N dollars left. Clearly F(1,B,N) = B*N, as Bob must bid N every round, and F(A,1,N) = N-(A-1), as Alice can outbid Bob each of A-1 times, losing one dollar on him each time, and then be unable to outbid him the last time. Finally, from some position (A,B,N) where Bob has F(A,B,N) dollars, Bob will bid x, and Alice can either let him have it, moving to (A,B-1,N) with Bob having F(A,B,N)-x, or outbid him, moving to (A-1,B,N-(x+1)) with Bob having F(A,B,N)-x. This means that F(A,B-1,N) = F(A-1,B,N-(x+1)) = F(A,B,N)-x.

Looking back at the strategy in the two coin game, Bob needed $3z while Alice had $2z+1 when Bob bid z on the first round. Actually, he can bid z on all three rounds, and Alice does not have enough money to outbid him twice, so he trivially wins (we also showed this was a minimal win). If one expects this strategy to continue upwards, Bob simply identifies a z such that Alice cannot pay Az+A, so that she cannot outbid him A times (Az = N-A+1 will work), and then Bob will need (A+B-1)z dollars to win (so he can pay $z each of the A+B-1 times a coin is won). Thus, we expect that F(A,B,N) = (A+B-1)(N-A+1)/A is the solution. One can confirm it satisfies our earlier relations, with the x=z that we have found.

Actually this has a lot of discretization error in it, since Bob cannot bid non-integer values. By our solution we expect that in the case of A=B, the amount of money Bob needs is z(2A-1), with z = (N-A+1)/A, so if N=100 and A=3, say, then z = 32+2/3 and Bob needs 163+1/3. Bob cannot actually bid thirds of a dollar, though. If Bob is given $163, it is not enough: the best he can do is bids like $33, $33, $33, $32, $32, and Alice can outbid the two $32 bids and one of the $33 bids for exactly $100. She cannot outbid two $33's and one $32, so Bob wants the bids 33+33+33+33+32, meaning Bob needs $164 to win.
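
These small cases can be checked by brute force; here is a minimax sketch of my own (not from the original post) that computes Bob's minimum winning stake under the integer-bid rules:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def bob_wins(a, b, alice, bob):
    """True if Bob, bidding first, can force a win when Alice needs a more
    coins, Bob needs b more, Alice has $alice and Bob has $bob."""
    # Bidding more than $alice is dominated by bidding exactly $alice,
    # since Alice cannot afford to outbid either way.
    for x in range(min(bob, alice) + 1):
        # Alice passes: Bob takes the coin, having paid x
        if b > 1 and not bob_wins(a, b - 1, alice, bob - x):
            continue
        # Alice outbids for x+1, if she can afford it
        if alice >= x + 1:
            if a == 1 or not bob_wins(a - 1, b, alice - x - 1, bob - x):
                continue
        return True
    return False

def min_stake(a, b, alice):
    bob = 0
    while not bob_wins(a, b, alice, bob):
        bob += 1
    return bob

# e.g. min_stake(2, 2, 11) gives 15, matching F = (3/2)(11 - 1)
```

In principle the same search handles the three-coin, $100 game, though the state space makes it slow in practice.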

It's a bit sad the solution is so boring; Bob has some freedom in how he bids, but the strategy of just "pick a number large enough that Alice cannot afford to outbid it enough times" is basically optimal. Asymptotically, in the case A=B with N >> A >> 1, the solution is that Bob needs 2N dollars and bids N/A each round, Alice being unable to outbid him A times.

Auctioning Coins

Time for a new puzzle. I thought I learned this one on the forums at xkcd, but I was incorrect; what actually happened was that I misread a puzzle that somebody had posted and created my own inadvertent variant. Anyway, here it is:
Bob and Alice are playing an auction game. In a round of the auction a single coin is going to be put up for auction and Bob makes a bid on that coin. Alice can then either outbid him or pass. If Alice outbids Bob, she wins the bid, Bob does not get to bid again (the auction is once-around). Additionally, it is an all-pay auction, any bid you make you must pay, even if you lose the bid. The first person to acquire three coins wins the game.

It is clear that Alice has a large advantage in this game, so Bob gets to start with extra money. Given that Alice starts with $100, what is the minimum amount of money that Bob needs to start with so that he has a winning strategy?

To be clear, only nonnegative integer bids are allowed, no fractions. In a given round, Bob bids $k (and immediately pays it) and Alice gets the choice of either paying $(k+1) for the coin or letting Bob have it. Bob is allowed to bid zero (and will be forced to if he runs out of money), and Alice may then take the coin for $1 (which she will probably do unless she is out of money).

On the forums, the actual puzzle had an infinite number of rounds of bidding, and a player won as soon as they had a three coin lead. That puzzle had a much more complicated answer, but if I am honest the answer to this one is actually sort of boring once you get it, I just found it fun to solve out.

Pigeonhole'd

Solution o'clock! Hopefully you have all solved out the balls in boxes problem from last time. Given that you probably have, writing a solution is a totally pointless act, done simply to amuse myself. Here we go.

First of all, let's find an N for which we cannot always win. Specifically, consider N=1, so there are 1000 balls. Box 999 might have 500 of them and box 1000 the other 500. In this case we have no legal moves and cannot win. How large can N be while we still might have no legal moves? To put it another way, when is N big enough to guarantee that we always have a legal move?

The most balls we can have with no legal move is 1 ball in box 2, 2 balls in box 3, 3 balls in box 4, and so on, with k-1 balls in box k all the way up to box 1000. Any more balls than that and we are guaranteed to have at least one legal move. That number of balls is 1+2+3+...+999 = 999*1000/2. So if N is larger than 999/2, that is, 500 or more, then we always have a legal move. If N is less than 500, we might have started in a position with no legal moves at all.

Now, if N is 500 or larger, we know we always have a legal move; is this enough to guarantee that we can eventually wander our way to a solution? First note that if all the balls were in box 1, we would basically be done: we can move balls from box 1 to any target box simply by moving them 1 at a time. In fact, if we want to move balls from box i to box j, it costs us nothing to move from box i to box 1 and then from box 1 to box j, so we might as well only consider moves to and from box 1.

So, assume N is 500 or larger. We have some boxes with enough balls in them to perform a move; make all the moves you can from those boxes to box 1, so that when we are done the only legal moves left are from box 1. At this point box 1 must be nonempty, for if box 1 were empty we would have no moves at all, and that is impossible for N at least 500. Now, either all the balls are in box 1 or they are not. If all the balls are in box 1 we are done; we can solve the problem as mentioned earlier. If not, there exist some k and x where box k has x balls in it, with k>1, k>x, and x>0. One can see that box 1 has at least k-x balls in it: if it did not, I could move all the balls from box 1 into box k and leave myself with no legal moves, which is impossible. So we move k-x balls from box 1 into box k, bringing box k up to k balls, and then move all k balls from box k back to box 1. We repeat this process for each nonempty box k with k>1. In the end all the balls are in box 1 and the game is trivially won.
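
The whole procedure is mechanical enough to write down directly; this is my own toy sketch with 10 boxes rather than 1000, following exactly the three steps above (drain everything possible to box 1, top up and empty each straggler box, then deal the balls back out):

```python
def equalize(boxes, N):
    """boxes[k] is the count in box k (index 0 unused). Requires
    N >= (B-1)/2 so that a legal move always exists."""
    B = len(boxes) - 1
    def move(k, dest):            # the only legal move: k balls out of box k
        assert boxes[k] >= k
        boxes[k] -= k
        boxes[dest] += k
    # Step 1: make every move we can into box 1
    for k in range(2, B + 1):
        while boxes[k] >= k:
            move(k, 1)
    # Step 2: for each nonempty box k > 1 with x < k balls, top it up with
    # k - x balls from box 1 (one at a time), then dump all k back to box 1
    for k in range(2, B + 1):
        x = boxes[k]
        if x > 0:
            for _ in range(k - x):
                move(1, k)
            move(k, 1)
    # Step 3: all balls now sit in box 1; deal N out to every other box
    for k in range(2, B + 1):
        for _ in range(N):
            move(1, k)
    return boxes

# 10 boxes, N = 5: a worst-ish case where box k holds k-1 balls
# equalize([0, 5, 1, 2, 3, 4, 5, 6, 7, 8, 9], 5) -> every box holds 5
```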

I'm not sure why I liked this problem so much, I guess I was just pleasantly surprised to see something so simple on a Putnam. I cannot say that their questions lack creativity, but this particular problem is just the kind of creativity I like.

Balls In Boxes

New puzzle o'clock! I should probably be working instead of blogging, but blogging is so much more fun. Anyway, I found something of a neat puzzle on the Putnam 2010 exam. Usually those exams are just filled with semi-impossible math problems involving stupid functions, but there was one question on that exam I thought was pretty cool as a puzzle:
There are 1000 boxes, labelled 1 through 1000. There are 1000*N balls in the boxes, distributed randomly between the boxes. You will be playing a 1 player game, trying to average out the balls in the boxes, so that each box has exactly N balls in it.

The only legal move in the game is to select a number K and move exactly K balls from box K into any other box. So you could go to box 247 and move exactly 247 balls from that box into any other box. If box 247 has fewer than 247 balls in it, you cannot do this, so you don't get to move any of them. You are allowed to keep making legal moves as long as you like, and you win if every box has exactly N balls in it.

Given that in total there are 1000*N balls in the 1000 boxes, for what values of N is it always possible to win, regardless of the initial configuration of the balls?

Naturally, the fundamental nature of the solution is not particularly sensitive to the number 1000; in the actual Putnam they used 2010, 'cause that's how they roll.

Correlated Hats

Well, my thesis is essentially done, now it's just time for the eternal editing of the damn thing (also, it's only like 65 pages, somehow I had hoped for more). Anyway, time to post the solution to the hat problem from last time.

For the first part of the puzzle, consider that the logicians have some sort of strategy, which must be based on the hats of their fellow logicians, possibly some constant which varies for each individual logician, and possibly some random variable. You get into the room ("you" being one of the logicians) and you check your strategy, and it says to say that your hat is white. OK, great, but since your strategy is independent of your own hat colour, we could randomize your hat colour and your answer would not change. Thus you are 50% to be correct, no matter what. This is true for each and every logician, which means that no strategy can save more than 50% on average. In fact, no strategy can save less than 50% either: all strategies have an expected value of saving 50 of the logicians.

You might think that is weird, since in so many other hat puzzles the logicians pull off something amazing with no information about their own hat colours. If you check carefully though, in all of those puzzles each individual guess still had a 50% chance of being correct; the most they ever pulled off was correlating which logicians were right.

Alright, now, is there a strategy that guarantees saving 50? Note that if the logicians knew whether the total number of white hats was even, they could always deduce their own hat, and similarly if they knew it was odd. So the strategy is to have half of the logicians assume the total number of white hats is even and the other half assume it is odd. Exactly one of those two groups will be right, and you will save exactly the 50 logicians in it.
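
The parity strategy is two lines of arithmetic per logician; this little simulation (my own sketch) checks that exactly 50 are saved every single time:

```python
import random

def play(hats):
    """hats[i] in {0, 1}, 1 = white. The first half assume an even number
    of white hats in total, the second half assume odd; each logician then
    deduces what their own hat must be under their assumption."""
    n, total = len(hats), sum(hats)
    saved = 0
    for i, h in enumerate(hats):
        assumed = 0 if i < n // 2 else 1     # assumed parity of white hats
        seen = total - h                     # white hats on everyone else
        guess = (assumed - seen) % 2         # forced value of own hat
        saved += guess == h
    return saved

rng = random.Random(1)
assert all(play([rng.randint(0, 1) for _ in range(100)]) == 50
           for _ in range(1000))             # exactly 50 saved, every time
```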

Finally, a strategy that saves all of them or none of them. This is simple given the last paragraph: simply have everybody assume the total number of white hats is even (or odd, if you prefer). They will either all be correct or all be wrong, with a 50-50 chance.

Friendly Hats

Been some time since I have done an actual hat puzzle. There was one recently on Tanya Khovanova’s Math Blog that I had assumed I had done before, since it was so basic. Interestingly enough, I have never actually done this one, so I might as well put it up now:
There is a room of 100 logicians wearing hats. Each hat can be either black or white, and standard hat rules apply. The logicians must simultaneously guess what colour hat they have. Every logician who guesses right is released and every one that guesses wrong is killed.

Before the hats are assigned, the logicians may get together and plan. What strategy maximizes the number of logicians you can save on average?

That's simple enough, but there are a few follow-up questions that Tanya also added:
Suppose the maximum you can save on average is N, is there a strategy that guarantees that you will save N of them?

and,
Suppose the logicians are close friends and do not want to see their friends die; they would rather die with them. Is there a strategy that guarantees they will all live together or die together?

Fairly simple if you have been following all the other hat posts, most of the basic results you need to solve this out have been proven somewhere or another on this blog.

Going In Circles

Alright, some time ago I promised to do more on the dots on a line puzzle, so I guess I'll do that now.

On the puzzle page over at cut-the-knot, they had a few writeups of possible ways to arrive at the solution. They varied in their degree of rigour, but they certainly were all more elegant than my method. One in particular led me to examine a few other things about the puzzle (that I honestly do not understand the implications of), and I will go over them here.

This solution was put forward by Stuart Anderson, and it begins by assuming that the dots do not necessarily follow a uniform distribution. Suppose x is the length of an interval between adjacent points, and let f(x)dx be the probability that a particular interval has length between x and x+dx. F(x) will be the cumulative distribution function, so F'(x) = f(x), and F(x) ranges from 0 to 1 as x ranges from 0 to 1.

We will call an interval "small" if it is smaller than both its neighbours, "large" if it is larger than both its neighbours, and "medium" otherwise. Intervals that are small get painted twice, medium ones get painted once, and large ones are unpainted. Let's try to find out how many of each type we have and how large each type is. Suppose we have an interval of size x, which occurs with probability f(x)dx; its left hand neighbour will be smaller than it with probability F(x) (F(x) being the cumulative distribution function, the integral of f(y) from 0 to x). Similarly, the right hand neighbour will be smaller with probability F(x), so the total chance that we have a large interval is
∫F(x) F(x) f(x) dx
= ∫F(x)^2 dF

where we have used f(x) = F'(x). The limits of integration are 0 to 1, since that is what F ranges over. Thus we get 1/3 for the chance that an interval is large. A similar calculation shows that you also get 1/3 for small and for medium intervals. This means that if an interval is selected at random, it is equally likely to be small, medium, or large, and this is independent of the probability distribution that the points are selected with.

The fact that P(small)=P(large) is somewhat obvious if you consider a plot of the interval lengths in order. Small intervals are local minima of the plot, while large intervals are local maxima. Between any two local minima there must be a local maximum and vice-versa, so the number of small intervals must be the same as the number of large intervals (well, within 1 anyway), and in the infinite limit they have the same probability. It is not clear why medium intervals would be equally likely, but there is probably some underlying reason.

Moving on, the expected length of a large interval, weighted by the probability of being large, is given by
∫x F(x)^2 f(x) dx

which we can integrate by parts. Noting that the derivative of F(x)^3 is 3F(x)^2 f(x), let u = x and v = (F(x)^3 - 1)/3, so then we have
x(F(x)^3 - 1)/3 + (1/3)∫(1 - F(x)^3) dx

v was chosen so that the first term vanishes at the endpoints, so the remaining integral gives us the weighted average size of a large interval.

One can do a similar calculation for small and medium intervals, and the expressions are only slightly uglier, but this is as far as we can go without giving a specific form to the function F(x).

To proceed, assume that we have distributed the N dots on the line, and that we scale things up by a factor of N, so there is on average 1 dot per unit length. Then one can show that the function F(x) will tend to 1 - e^(-x) for large N (this is essentially the statement that the scaled points look locally like a Poisson process, whose gaps are exponentially distributed; I cannot really prove it, it makes sense, but the people who wrote the solutions quoted at cut-the-knot took this result as obvious, so I suspect it is some elementary thing that I simply am unfamiliar with (to be honest, it's why I found their solutions unconvincing and did my own awful one, I couldn't get past this point)).

Anyway, using that F for the integral, one gets that the average size of a large interval is 11/18. This must be multiplied by N to account for the fact that there are N intervals, but then divided by N because we scaled everything up by a factor of N.

A similar calculation gives that the average size of a medium interval is 5/18 and small intervals are 2/18.

So the ratio of small:medium:large interval lengths is 2:5:11. The small+medium intervals account for (2+5)/18 of the total length, giving the answer to the original problem. Note that if the small intervals are counted twice, then the total amount painted is (2+2+5)/18, which is exactly 1/2. I have not yet found a way to explain why this answer is so clean.
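
I never did find a slick explanation, but the numbers are easy to confirm numerically; this Monte Carlo sketch (my own code) drops points on the unit interval, classifies the interior gaps, and tallies the painted fraction:

```python
import random

rng = random.Random(3)
reps, n = 300, 2000
counts = {"small": 0, "medium": 0, "large": 0}
painted = total = 0.0
for _ in range(reps):
    pts = sorted(rng.random() for _ in range(n))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    for i in range(1, len(gaps) - 1):     # interior gaps have two neighbours
        g, left, right = gaps[i], gaps[i - 1], gaps[i + 1]
        if g < left and g < right:
            kind = "small"
        elif g > left and g > right:
            kind = "large"
        else:
            kind = "medium"
        counts[kind] += 1
        total += g
        if kind != "large":               # small and medium gaps get painted
            painted += g
frac_large = counts["large"] / sum(counts.values())
paint_frac = painted / total
# frac_large comes out near 1/3, and paint_frac near 7/18 ≈ 0.389
```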

When I first found out that small, medium, and large segments all appear with equal probability, I was inspired to try a slightly different problem. Suppose we take the unit circle and place 3 points on it; that divides the circle into 3 arcs, one small, one medium, and one large. On average, what are the relative sizes of these three arcs?

This is simple enough to calculate. Let the points be located at 0, x, and y, where x and y vary from 0 to 1 (1 being the full circle). By the symmetry of the problem, assume that x is smaller than y, and assume that the interval between 0 and x is the smallest. Then y varies from 2x to 1-x, and x varies from 0 to 1/3. The average length can be found from the integral
∫∫x dy dx

over the region specified. We also need to divide by the integral of 1 over that same region, to account for the probability that the interval is actually the smallest one, essentially just normalizing properly.

Naturally, you can do a similar calculation for the largest interval and for the middle one. In the end, you get the answers 2/18, 5/18, 11/18, unsurprising given that I chose to bring this whole thing up. So one does not need to work in the infinite case; somehow using only 3 points and 3 intervals gives the same answer, but I have no idea why.
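
The three-point circle averages are also easy to confirm by simulation; this sketch (my own) reproduces 2/18, 5/18, and 11/18:

```python
import random

rng = random.Random(0)
trials = 200_000
sums = [0.0, 0.0, 0.0]   # smallest, middle, largest arc
for _ in range(trials):
    a, b = sorted((rng.random(), rng.random()))  # third point fixed at 0
    arcs = sorted((a, b - a, 1 - b))
    for i in range(3):
        sums[i] += arcs[i]
means = [s / trials for s in sums]
# means comes out near [2/18, 5/18, 11/18] ≈ [0.111, 0.278, 0.611]
```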