Salmon Ray

Alright, time for further musings on the Sam and Ray problem. Actually there are two separate realizations here, but both of them seem to go nowhere, and I will post them anyway because this blog is to allow me to get this stuff out of my brain so I can one day actually finish my thesis.

So, the first thing was that I thought about the possibility of constructing the set S referred to in the Sam and Ray solution, but this time using a Hamming distance of two instead of one. Due to the fact that the best set was constructed out of efficiency in the initial solution, lets try that again.

We wish to construct a set S of binary strings of length N. S must have the feature that every possible binary string of length N is within Hamming distance 2 of exactly one element of S. The demand of exactly one element is an efficiency demand, if there were two elements, then we have "wasted elements" in some sense. Also making that demand will at least make this problem tractable.

What do we know about S? There are 2N possible binary strings that must be covered by S, so how many strings are covered by a single element of S? Each element covers itself (1) plus neighbors that are one bitflip away (N) plus neighbors that are two biflips away (N(N-1)/2), thus N2/2+N/2+1. So the number of elements that S has is at least 2N/(N2/2+N/2+1), which may or may not be an integer. However, if that number is not an integer, we will not meet our efficiency demand, so we will insist that 2N/(N2/2+N/2+1) is an integer. 2N/k will only be an integer if k is also a power of two, so we know that there exists an M such that:
N2+N+2=2M+1

then |S|=2N-M. Also, keep in mind, I have not proved that this S actually exists when N satisfies the above, but I have proven it cannot exist (with the efficiency demand) if N does not. Of course, ignoring the efficiency demand S always exists, but its probably ugly.

Solving our polynomial for N, we get
N=1/2(√2M+3-7-1)

or, in a format which is less visually offensive and more illustrative:
N=1/2(J-1)
J2=2M+3-7

Since N is an integer, J will have to be an odd integer, further in order for M to exist as we wanted, J2+7 will have to be a power of two (which also demands J be an odd integer, good).

Alright, lets try out some numbers to find candidates for J.
J=1 ⇒ J2+7=8, good. Then M=0 and N=0. Well, OK, this seems to be the trivial solution.

J=3 ⇒ J2+7=16, good. Then M=1 and N=1. Another trivial solution.

J=5 ⇒ J2+7=32, wow, this works great. Now M=2 and N=2. Still a trivial solution, as S={00} is good enough in the N=2 case.

J=7 ⇒ J2+7=56, finally not a power of two, I was worried.

J=9 ⇒ J2+7=89, again nothing.

J=11 ⇒ J2+7=128, cool. So M=4 and N=5. S must have cardinality 2N-M=2 and the set is simply S={00000,11111}.

This is good, we already knew about the cases N=0,1,2, and 5. In fact the N=5 case is what our information sending strategy is based on. So the next one will give us what we really want. You can try the next few values of J yourself, but they won't give you any powers of two for some time, instead I resorted to Maple. Actually, it is more practical to ask Maple for numbers such that 2N-7 is a perfect square rather than J2+7 being a power of two, because then you go through the numbers exponentially rather than quadratically. You will find that J=181 is the next solution.
J=181 ⇒ J2+7=32768=215, so M=12 and N=90. S must have cardinality 2N-M, so |S|=278.

OK, now I'm interested. We have a sequence of numbers that has N going 0,1,2,5,90, M goes 0,1,2,4,12, J goes 1,3,5,11,181, and |S| goes 1,1,1,2,278, anybody else out there see the pattern?

I'm mostly interested in the next J value, and Maple was able to tell me there aren't any before 220000. This was a bit worrisome, as the numbers 1,3,5,11,181 sort of make me think of things like the busy beaver function or the Ackermann function, so I expect the next number to be Graham's number or something silly (probably not that big, but you get the idea). Anyway, I decided to take a stroll over to the On-Line Encyclopedia of Integer Sequences and type in 1,3,5,11,181. Turns out its in there as A038198, and that there is no next number in the sequence, its well known as the solutions to Ramanujan's Square Equation.

OK, so my guess that the next number would be huge wasn't far off, at least I was correct in guessing that Maple would never find it for me. Anyway, this means that there is no more than one set S satisfying our efficiency demand that we don't know about already (for example {00000,11111}) (there might not even be any of course, I still haven't proven existence). I guess this means that there won't be worries about the general construction like there was in the Hamming distance one case, one just needs to explicitly construct the set in the case N=90 and you have done all the possible ones. Now, I have no idea how to construct the set, if anybody out there knows I would love to hear it, until then I'll give it a shot in my spare time and if I get anywhere I'll post it.

So, let us assume we actually had this set S, we can now give it to Sam and Ray. Sam now has the ability to specify any 90 digit string two within two digits by only using 78 bits of data. 78 bits of data costs $79, and getting 90 bits with only two wrong earns $88-$4=$84. So they earn $5, and have two wrong to send out new bits. Assuming they just convert those bits directly into cash, they earn $7 (they can do better than just convert to cash, but who's counting anyway). How long does it take this strategy to earn said $7? This algorithm takes 77x5 steps to send out the data and then a string of 90, finally the first bit and the last two bits gives us 478, and average of 68 steps/$, compared to my earlier strategies of 46 steps/$ and 72 steps/$. Though I will concede that I am being a bit wasteful about my endpoints, so my analysis isn't great. Anyway, I will conclude that using a Hamming distance of 2 does not improve your efficiency by much, and in fact costs you quite a bit if you only intend to earn $1 before quitting.

I still have more musings on this problem, but this post seems long enough for now.

Last thing to say before I close this one up mostly a reminder to myself that I would still like to construct S with N=90. It seems very interesting that there would be exactly one such set, it probably has neat properties. Or does not exist, which is also a neat property.

No comments: