Talk:Sieve of Eratosthenes

Optimizations

My Forth is very rusty, but it looks like your Forth version of the Sieve_of_Eratosthenes lacks both important features of the sieve: The outer loop should stop at the sqrt of the total limit, and the inner loop should start with the square of the just detected prime (any other multiples must already have been crossed out). Without those, the sieve won't be faster than to trial division for each number. Some other implementations seem to have the same problem. (And some only sieve odd numbers, while others don't, making overall comparison between languages more difficult). Maybe all this should also be clarified in the task description? Dirkt 02:48, 2 December 2007 (MST)

I"m ambivalent about these optimizations. The half dozen or so sieve implementations I based this on did not contain these optimizations (including those on similar comparison sites, such as the Computer Language Shootout). Also when I measured this a while ago, the speedup of these optimizations was minimal. For small ranges, the other optimization to assume 2 is prime (i.e. only examine odd numbers) had a greater effect, both in time and space. (Out of curiosity, did Eratosthenes himself document these optimizations, or were they found later?)

As to trial division, you are assuming division and square root are cheap. The simple implementation only uses addition (multiplication may not be implemented on a given processor).

I think keeping the tasks simple should be a goal of Rosetta Code. Keeping things simple encourages folks to donate their time in writing task solutions.

I agree that the examples should all use the same algorithm. The algorithm we decide upon should be described in the task description. --IanOsgood 07:53, 2 December 2007 (MST)

The difference only starts to kick in for large numbers. It's a similar story with the purely functional Haskell version I mentioned - nobody really tried this for large numbers, so nobody discovered it was flawed for a long time. And yes, only examining odd numbers (or using wheels other than this simple one that is only based on the prime 2) do indeed speed up things quite a lot, even for large limits.

By "trial division", I didn't mean actual cost, but asymptotic cost. The 'trick' behind the Sieve of Eratosthenes is to avoid checking whether a prime p divides some remaining candidate k if it is already known that k is composite. Trial division, in contrast, performs this check for every p and every k. And because the sieve only does a fraction of those checks, it's much faster.

The actual cost of sum versus multiplication is recovered quickly by the better asymptotic behavior (I can do the math for you, if you want me to).

I agree that things should be kept simple (and therefore I would prefer to simply test for all numbers, not only odd, or not those with remainder 1 or 5 (mod 6), etc.), but these two things (which are really two sides of the same idea - if you start crossing at p*p for each prime p, than of course you can stop crossing once p*p is over the limit) are really easy to implement and do affect asymptotic cost, so I'd like to see them included. Dirkt 09:24, 2 December 2007 (MST)

FYI, if you apply the outer loop optimization (loop from 2 to sqrt n), then you must also print the primes in a second loop from 2 to n. Many of the examples are now incorrect as they stand. --IanOsgood 10:15, 2 December 2007 (MST)

Yes, if one insists that the primes must actually be printed. But the task only says "find all the primes", and keeping the sieve around is actually the less space intensive way to store them (as I found out the hard way some time ago).

Anyway, I think that's not the main point :-) And the examples are first of all *different*. If you think that from the point of view of comparing languages, an inefficient Sieve of Eratosthenes is better, then the task should clearly say so, and all examples should be adapted. However, as I said, I think that those two optimasations are simple enough, and worth including. Dirkt 01:38, 3 December 2007 (MST)

If I might chime in...

I'm not really familiar with the Sieve of Eratosthenes, but perhaps a page could be taken from FizzBuzz and 100 Doors? Language sections in those pages are subsected to reflect optimized vs unoptimized approaches. For the purposes of easy comparison, perhaps such a subsection would be appropriate here?--Short Circuit 12:23, 3 December 2007 (MST)