Showing posts with label random walk. Show all posts
Showing posts with label random walk. Show all posts

The Value Of One Bit

I was looking over my "solution" threats, and it seems that about 90% of them begin with "time to post the solution to", so I figure I'm about sick of that. Sick of it this week anyway, next time I'll be back to it. Anyway, now that I have actually began the post with something different, time to post the solution to the bit in a bottle problem.

First I will give my formulation of the solution, and then I will show that it is optimal in a sense (a pretty convincing sense as well, not just some esoteric one). First, consider what happens if you just bet heads all the time. Your total money will simply preform a random walk on the integers starting with N, and will terminate if it ever hits zero. With probability 1, it will hit zero in finite time.

Now, I will construct two strategies, first define the following strategy
Strategy H: Every game, bet heads until you have reached $0 or $2N. If you ever reach $0 or $2N, quit playing.

Similarly, define a strategy for tails
Strategy T: Every game, bet tails until you have reached $0 or $2N. If you ever reach $0 or $2N, quit playing.

Now, we can show that one of these strategies will hit $0 and the other will hit $2N (this should be pretty intuitive, but lets prove it anyway). For a particular sequence of flips, let M(S,k) be the amount of money that strategy S will have after k flips. Then I claim that
M(H,k)+M(T,k)=$2N

This is easy to prove by induction, as it is true when k=0 and every time strategy H gains $1, strategy T loses $1 and vice-versa. The strategies also terminate at the same time, as failing to do so would violate the equality. It is theoretically possible that the strategies do not ever terminate, but that has zero probability of happening.

Now, the question we ask the genie is simple. First define strategy H for the genie, then ask:
If I follow strategy H, will I end with $0?

If the answer is yes follow strategy T, if the answer is no follow strategy H. This guarantees that you will end with $2N (where "guarantees" means "with probability one"). You can also change strategy H to any specific betting sequence rather than all heads if you like, as long as strategy T is the compliment of that sequence, this will still work.

Next I want to prove that this solution is optimal. It might seem surprising that we can optimize over the space of all possible questions, since math usually is not very good at handling human language. To deal with this, first consider C to be the set of all the possible coin flip results, so C contains things like HHTHH..., THTTH.., TTTHT..., and so on. Next consider a possible yes or no question, denoted Q, that for every element of C the question must answer exactly "yes" or "no". Thus Q partitions C into two disjoint subsets, called Y and N. Really, any Q is equivalent to selecting the sets Y and N and asking the genie "is the coming sequence of coin flips in Y?" So, to optimize over all questions Q, we really just need to optimize over choice of sets Y and N, and it is less surprising that math can handle that.

Before moving on, I must state an obvious but important fact of probability theory. Consider we have two disjoint events, X and Y (that is to say, exactly one of them must happen (thats not actually what disjoint means, but I don't know the correct term)), and a third event called A. Let P(x) is the probability that event x happens, and P(x|y) is the probability that event x happens given that event y happened. Then, the probability that event A happens can be written as:
P(A)=P(A|X)P(X)+P(A|Y)P(Y)

That is, A must happen with X or Y, so we calculate the chance it happens with X plus the chance it happens with Y. We need not subtract the chance it happens with both, as X and Y are disjoint. It is easy to generalize this to more disjoint events X,Y,Z,... We can also make a similar statement about expectation values, let E(A) be the expectation value of some variable A and E(A|x) is the expectation value of A given that event x is true. Then for disjoint events X and Y we have:
E(A)=E(A|X)P(X)+E(A|Y)P(Y)

An intuitive result, to be sure, but an important one anyway.

OK, now that that is out of the way, the next important step in the proof is to consider what will happen with no genie. Suppose you have some strategy S, and after k flips the strategy has an expected value E(S,k). The expected value will be determined by the chance that you went up a dollar on that flip (call that P(+)) and that chance that you went down a dollar on that flip (call that P(-)), and the chance that your strategy had told you to stop (call that P(O)). Specifically:
E(S,k+1)=(E(S,k)+1)P(+)+(E(S,k)-1)P(-)+E(S,k)P(O)

It is clear that P(+)+P(-)+P(O)=1 (since every round you either gain a dollar or lose a dollar or stop), and also P(+)=P(-) since without access to a genie we have no knowledge about the coin flips. This means that:
E(S,k+1)=E(S,k)

So, E(S,k) is a constant, and that value must equal $N, since E(S,0) is $N. This proves that with no genie, all strategies have the same expectation value after a fixed number of flips. Note that something funny happens if you just let k=∞ since strategies such as "play until I run out of money" are guaranteed to have zero money at the end, even though after a finite number of flips there is a chance of the player having a large amount of money for the time being (this is known as the "gamblers ruin").

Alright, now let us find what is the best we can do given our genie scenario. First let x represent the coming sequence of coin flips, the genie knows the value of x. Consider that we have a strategy specified by a question Q (which simply selects the sets Y and N), a strategy S(Y) (which is what we will do if the genie tells us x is in Y), and a strategy S(N) (which is what we will do if the genie tells us x is in N). We want to find E(S,k), the expectation value of our strategy S (=(Q,S(Y),S(N))) after k flips. Letting Y represent the event "x is in Y" and N represent the event "x is in N", we see that Y and N are disjoint events, so that:
E(S)=E(S|Y)P(Y)+E(S|N)P(N)

I really should have some k's in there also, but they are going to get cumbersome so I am just going to drop them for now. Now, E(S|Y) is just E(S(Y)|Y), since S says to play S(Y) when Y is true, also E(S|N)=E(S(N)|N) for the same reason. That is to say
E(S)=E(S(Y)|Y)P(Y)+E(S(N)|N)P(N)

To get any more information out of this, consider again what would happen to us with no genie. We could just play S(Y) anyway, but we already know that E(S(Y))=$N (one would expect that E(S(Y)|Y) would be higher, since then we also know that x is in Y, but if we just play S(Y) with no other information then its expectation value is $N) (I must say this is only true after k flips, if the number of flips is infinite weird things might happen). How else could we calculate E(S(Y))? we could just consider if x is in Y or N, specficially
E(S(Y))=E(S(Y)|Y)P(Y)+E(S(Y)|N)P(N)

That is the value of strategy S(Y) when Y is true plus the value of strategy S(Y) when N is true (remember, we have no genie right now). Now since E(S(Y)|N) cannot be negative, and P(N) cannot be negative, that second term cannot be negative. Thus we have a bound on E(S(Y)|Y)P(Y), given by
E(S(Y)|Y)P(Y) ≤ $N

Similarly, switching Y for N we have
E(S(N))=E(S(N)|Y)P(Y)+E(S(N)|N)P(N)

and therefore
E(S(N)|N)P(N) ≤ $N

We can put this together in our last equation for E(S) to get
E(S) ≤ $2N

Proving the optimality of the solution presented earlier. It appears that the value of one bit is to exactly double your cash. If you can instead ask the genie a question with possible answers 1,2,3,... up to some fixed K then you can extend this proof to show it is optimal to multiply your money by K.

Naturally, it is possible to get other strategies to end with more than $2N, they just cannot guarantee ending with that much. Even with no genie one can simply play the game and only quit if you hit $0 or $3N, you might end with $3N in this case, but it cannot be guaranteed. I will also point out that the proof relied on the game not taking an arbitrarily long amount of time, my intuition is that the result is still valid even without that assumption, but I do not see a good proof.

I do like this solution in that it is practical, one would reasonably use it if actually presented with this situation, and the genie does not even need to do an infinite amount of work (probably not anyway, its possible that the random walk never terminates, but thats probability zero). The genie can even be replaced by a mathematically competent human who simply has access to the coming flips.

There is a harder problem posed of the xkcd forums where losing a flip costs you a $1 and winning a flip gains you $(1-q) where q is a small positive number. I have not made any real progress to finding a solution to this problem, but anything involving random walks seems risky, so another approach might be needed.

Ray's Random Walk

So, I was thinking about just how good Sam and Ray can do by simply taking the random strategy. Sam always guesses the right answer and Ray simply guesses randomly. Its easy to see that this strategy has at least 50% chance to work, since you might just get the first one right and quit (of course the argument about just repeating the strategy n times to make $n fails, but whatever). So I started to wonder, the probability of at some point getting to $1 within the first k steps is lower than 1 and can not decrease as k goes upward. Thus in the limit as k goes to infinity it must converge to some real number, what real number does it converge to?

I suppose I could post it as a logic problem, but I'd rather just work out the solution right now, as its actually sort of tricky. Also I'm not 100% about my proof anyway, so maybe somebody can point out to me a flaw or something.

First, lets reformulate the problem:
Consider a random walk on the integers, starting at 0. At each step, we have a 1/2 chance to move up one step from n to n+1 and a 1/2 chance to move down 2 steps from n to n-2. What is the probability that we ever reach 1?

If you are concerned about the ambiguity of the term "ever reach", then define f(k) to be the probability of reaching 1 within the first k steps (well defined, nondecreasing, and bounded above by 1). Now ask what is the limit of f(k) as k tends to infinity.

Suppose we let P(n) represent the probability of ever reaching point n, we are interested in P(1) given that P(0)=1. Notice that P(n) will obey the relation:
P(n)=1/2P(n-1)+1/2P(n+2)

Its not so easy to prove that statement rigorously (I'm not even confident I know how to), but it is intuitive to say that to be at n, you came in from n-1 or n+2.

Next we must define the 'derivative' function:
d(n)=P(n+1)-P(n)

Then, we can rearrange our earlier equation to
P(n)-P(n-1)=P(n+2)-P(n)
d(n-1)=d(n)+d(n+1)

d(n) seems to obey some sort of "reverse Fibonacci" relationship, d(n) is determined as the sum of d(n+1) and d(n+2). Equivalently, and in a way that makes more sense:
d(n+1)=d(n-1)-d(n)

So, it is clear d(n) would be uniquely determined by any initial two values. There is a clean relationship you can use to find d(n) given d(0) and d(1) (similar to the one for the Fibonacci Squence in terms of the golden ratio), I will give an outline of the derivation.

Define a vector D(n) by [d(n+1),d(n)]. Then D(n+1)=A D(n), where A is the matrix [[-1,1],[1,0]]. You can diagonalize A into UTU-1, so that D(n)=AnD(0)=UTnU-1D(0). Tn is easy to calculate, as T is diagonal.

This will give you a relationship of d(n) in terms of d(1) and d(0). The golden ratio φ will show up everywhere (φ=(sqrt(5)+1)/2 and obeys φ=1+1/φ).

We can lock down d(0) and d(1) by using the boundary conditions. It is clear that P(n) must be a non-increasing function for n>0, because in order to reach 5, you must pass through 4 first, so P(5) cannot be greater than P(4). This means that P(n) must converge to some number as n tends to infinity. This guarantees that d(n), as the difference of successive P(n) terms, must converge to zero.

If one has been following along with the math and actually found the exact expression for d(n) (its not very pretty, so I didn't want to illustrate it), you find that it has terms of φn and φ-n. The condition d(n) converges to zero as n tends to infinity guarantees that the coefficient of φn must vanish, giving us d(1)=d(0)/φ, and cleans up the relationship to
d(n)=d(0)/φn

Now we must only lock down d(0). We know that P(n) is the sum of the first n d(i) values plus P(0) (which is 1). But this sum is an easy geometric sum:
P(n)=1+d(0)(1-φ-n)/(φ -1)
=1+d(0)φ(1-φ-n)

So, as n tends to infinity, P(n) tends to 1+d(0)φ. It seemed pretty self-evident to me that P(n) either tends to zero or one, but I have found a pretty convincing argument for zero.

After the first N steps, you will have moved right a spaces and left 2*(N-a) spaces (a is a number between 0 and N). You have ended on space 3a-2N, and this is clearly positive if a>2N/3. The probability of any particular value of a is (N choose a)/2N. The probability of ending to the right of zero is the sum of that as a goes from 2N/3 to N. This is the sum of the right third of the Nth row of Pascals triangle, divided be the sum of the whole Nth row. This will tend to zero as N tends to infinity (as the right third and the left third are equal in size, but the middle third is always bigger, eventually much bigger).

From this, I would conclude that P(n) tends to zero as n tends to infinity (not rigorous mind you, but I'm convinced). This guarantees that d(0) is -φ-1 and so:
P(n)=φ-n for n>0

Note that we needed the argument that P(n) is nondecreasing as we go to infinity, which fails for n<0, so I don't know what the behavior is there. In conclusion, if you use the random strategy in the Sam and Ray problem to try to get $n, it will work with probability φ-n.