Let the numbers to be sorted be in an array called *A*.

We can freely assume that all values in *A* are between 0 and 1. (If you
cannot assume this, first calculate, in O(*n*) time, the maximum and
minimum values in the array; if they are equal, all elements of the array
are the same and the sorting result is trivial; if they are not equal, one can
use them to compute an order-preserving linear transformation that maps all
elements of *A* into [0,1).)

Now, for any such *A* there exists an integer *s*, so that for any
*i* and *j*, if *A*[*i*]≠*A*[*j*] then the
two numbers also differ in their first *s* bits. By taking
"int((1<<*s*)**A*[*i*])", we can transform our problem of
sorting reals to a problem of sorting integers, which we know to be doable in
O(*n*). (See, e.g., the last solution given to the
December 2008
riddle.)

The question therefore boils down to whether we can find such an *s* in
O(*n*) time. Zhang's solution is the following.

We will work with a candidate *s*, beginning with *s*=0, and will
read the values in *A* sequentially. For each value read, we will check
whether it collides, in its first *s* digits, with a previously-read value
in *A*, and if so, we will increase *s* accordingly, until the two
no longer have an overlap. Because the previously read values in *A* are
known to differ in their first *s* digits, the collision can only ever
be with a single element of *A*.

The question now becomes how to find the element of *A* with which there
is a collision. This must be done in O(1). In principle, this should be easy to
do: we can maintain an array of size 2^{s}, where for each
possible sequence of *s* bits we store which element of *A* the
sequence collides with (if at all), but how do we update this array when
*s*
needs to be increased? We do not have time to re-read all elements of the array
in order to find out what their values are in the previously discarded bits.

The ingenious part of Zhang's solution is in realising that we don't have to
re-read *A* and don't need to know what the previously discarded bits are.

We can, instead, when increasing *s* to *s*', simply place any value
in the 2^{s}-length array in 2^{(s'-s)}
distinct positions in the new array, each corresponding to one of the
2^{(s'-s)} *possible* completions of the
*s*-length bit prefix into an *s*'-length prefix.

The critical observation is that because no two elements of *A* coincided
in their first *s* bits, no two can overlap in *any* of their
extensions to *s*' bits. While the look-up in the table will no longer
ensure that one will find an *A* value beginning in the desired sequence
of *s*' bits, one *is* guaranteed that one will find the only value
in *A* that *potentially* has the desired prefix. All other elements
in *A* are known not to share this prefix.

Techniques regarding how to adjust the look-up table in O(1) for a new value of
*s* have been discussed in the solutions for the
January 2009
riddle, in the context of width changing, but those techniques require the
use of Boolean operations, which are here not permitted: we showed how to
copy all values to their desired positions in O(1), but then needed a Boolean
AND operation to zero out other values that migrated in the process to
undesired positions.

Zhang's approach to this problem, circumventing the need for Boolean operations,
is the following: the undesired values are simply ignored. Effectively, the
new array created is not 2^{s'} values long, but rather
2^{s'}*x* values long, for some *x*, and when wanting
to read value *i* of the 2^{s'} possible ones, one
reads address *ix* of the array. All values that are in position numbers
that do not divide by *x* are simply ignored.

Zhang's code, annotated, can be found here.

Once again, thanks Zhang for this wonderful riddle!