## February 2017 solution

Spoiler alert: This month's riddle is a harder version of the January 2017 riddle. As such, its solution is also a solution to that previous riddle. If you still mean to tackle the January riddle, you may want to hold on before you read the below.

Yes, the harder version is also possible.

The basic method is to never construct the full sieve, but only the portions of it that cannot be easily filtered out in advance.

Consider the state of the sieve after filtering out the first few primes. For example, suppose that we have already filtered out 2, 3 and 5. The sieve now contains all numbers whose smallest prime factor is greater than 5. Consider the state of the sieve as a function, S5, where S5(x) is 0 if x is not in the sieve (it has a divisor that is at most 5) and 1 otherwise. This function is periodic, with a cycle of length 2×3×5=30.

Instead of creating the entire sieve up to n, we can therefore compute only the first 30 elements in the sieve. All other elements can be viewed as repetitions of the same.

Looking at the sieve in this state, we see that the next prime is 7. In order to update the sieve, we merely need to extend the 30-length sieve 7 times until it reaches length 210 (or until it reaches length n, if n<210), and then employ the fast method of Euler's sieve in order to filter out all elements divisible by 7.

How much is saved by pruning the sieve as it is built up, rather than as a last step?

Let us name the primes p1, p2,... At the k'th step, the sieve expands from a length of p1×...×pk-1 to a length of p1×...×pk. As it starts, it has (p1-1)×...×(pk-1-1) elements, as this is the number of elements in the range that do not divide any of the first k-1 primes. The algorithm then creates a sieve with pk times as many elements, and removes from these as many as there were in the sieve to begin with, for a total multiplicative expansion by a factor of pk-1 in the number of remaining elements.

Let r=ceil(n/(p1×...×pk-1)).

If k is the last step to be performed, then rpk. Let us switch to a slightly less efficient algorithm, which builds the sieve not just up to n, but up to p1×...×pk-1×r. We will show that even this less efficient algorithm is still an o(n)-time algorithm.

Because we know that each sieve element is generated exactly once and filtered out at most once, the work done for the k'th step in the algorithm is

O((p1-1)×...×(pk-1-1)×r).

Because the work done in all previous steps put together is negligible compared to this, the expression above is a complexity bound for the entire run-time of the algorithm.

This, in turn, is the same as

O(n×(1-1/p1)×...×(1-1/pk)).

To prove that this is o(n) we merely need to show that

limk→∞ (1-1/p1)×...×(1-1/pk) = 0.

To demonstrate this, consider the logarithm of this value. The logarithm of 1-1/pi is asymptotically -1/pi for large i, so up to a constant multiple, the log of the entire product behaves as the negative of the sum of the prime harmonic series.

This, in turn, is known to behave, up to an additive constant, as log(log k), which, in turn, diverges, proving our claim.

To calculate the complexity more specifically, we see that the product behaves as 1/log k, where k is such that

p1×...×pkn.

In complexity terms, this indicates that log n is approximately

Σi=1k log(i log i) = Σi=1k (log(i)+log(log i)) ≈ k log(k),

indicating that k≈log n.

Hence, the total complexity of the algorithm is on the order of Θ(n/log(log n)).

This is, as desired, better than O(n), but it is still worse than the number of primes that are output, which is Θ(n/log n).

Once again, I thank Jim Boyce for sending in this riddle.

For completion, a Python program implementing the solution can be found here. (The implementation is not perfect, in the sense that it uses 'a*b' and 'a+b' without verifying ahead of time that these values do not exceed n, causing a calculation overflow, but this was done for clarity of presentation only. Readers should find it not difficult to augment the program so as to avoid these pitfalls.)