Legendre prime counting function: Difference between revisions

Content added Content deleted
m (→‎Non-memoized: Oops, corrected memory usage.)
(→‎Non-Memoized Version: Nim, adjustments to comments and clarifications...)
Line 754: Line 754:
π(10^9) = 50847534</pre>
π(10^9) = 50847534</pre>


===Non-Memoized Version===
===Non-Memoized Versions===


The above code takes a huge amount of memory to run, especially if using the full size "Naturals" which not using them will severely limit the usable counting range (which of course is limited anyway by the huge memory use by the memoization). As per the comments on the task, memoization is not strictly needed, as the exponential growth in execution time can be limited by a very simple optimization that stops splitting of the recursive "tree" when there are no more contributions to be had by further "leaves" to the right of the "tree", as per the following code:
The above code takes a huge amount of memory to run, especially if using the full size "Naturals" which not using them will severely limit the usable counting range (which of course is limited anyway by the huge memory use by the memoization). As per the comments on the task, memoization is not strictly needed, as the exponential growth in execution time can be limited by a very simple optimization that stops splitting of the recursive "tree" when there are no more contributions to be had by further "leaves" to the right of the "tree", as per the following code:
Line 1,059: Line 1,059:
let elpsd = (getMonoTime() - strt).inMilliseconds
let elpsd = (getMonoTime() - strt).inMilliseconds
echo "This took ", elpsd, " milliseconds."</syntaxhighlight>
echo "This took ", elpsd, " milliseconds."</syntaxhighlight>
The output of the above code is the same as the previous version except that it is about ten times faster still and the execution time is almost too small to be measured for this trivial range (for prime counting functions) of only a billion; it takes a little over a second to calculate the count of primes up to 1e13 and just over 30 seconds to calculate the number of primes up to 1e16.
The output of the above code is the same as the previous version except that it is almost ten times faster still and the execution time is almost too small to be measured for this trivial range (for prime counting functions) of only a billion; it takes a little over a second to calculate the count of primes up to 1e13 and just over 30 seconds to calculate the number of primes up to 1e16 on a modern CPU.


The above program uses about 250,000 long integer divisions to calculate the count of primes to a billion as here, which is small enough to be able to calculate it by hand in several man-years of work. The Meissel-LMO algorithm with usual basic optimizations would take even less at perhaps a third to a half as many integer long divisions so as to be even more possible to hand calculate, although still subject to math errors as Meissel made in this computation to a billion in the late 1800's.
The above program uses about 250,000 long integer divisions to calculate the count of primes to a billion as here, which is small enough to be able to calculate it by hand in several man-years of work. The Meissel-LMO algorithm with usual basic optimizations greatly reduces the number of integer long divisions (to only about 17 thousand with a "TinyPhi" LUT of degree eight) so as to be even more possible to hand calculate in only a few man-months, although still subject to math errors as Meissel made in this computation to a billion in the about 1870's. Meissel did not have our advantage in electronic computational power, but just using partial sieving and even a small degree of "TinyPhi" table would have made it easily possible to hand calculate in a few man-years.


The advantage of this algorithm is its relative simplicity, but that comes at the cost of fairly high memory use such that it is only really usable to about 1e16 even for a modern computer.
The advantage of this "k-roughs" algorithm is its relative simplicity, but that comes at the cost of fairly high memory use due to using Legendre such that it is only really usable to about 1e16 even for a modern computer. Another disadvantage is that each partial sieving pass must be completed before being able to proceed with the next, which precludes using page segmentation and the ease of introducing multi-threading that provides. One could use page segmented sieving as for the the LMO algorithm but if one were to add that much added code complexity, one may as well implement full LMO and also enjoy the reduction in memory use. Once one has page-segmented sieving, one can then easily convert the code to multi-threading for gains in speed proportional to the number of effective threads for a given computer.


For larger ranges, a better algorithm such as that of LMO will perform much better than this and use much less memory (to proportional to the cube root of the range), although at the cost of greater code complexity and more lines of code (about 500 LOC). The above code can be considered to be a variation of the Legendre algorithm which is to the basic Legendre algorithm as the LMO algorithm is to the Meissel algorithm (which directly followed and improved the Legendre work), and which is much faster due to reduced computational complexity although it still uses a lot of memory (proportional to the square root of the range).
As counting range is increased above these trivial ranges, a better algorithm such as that of LMO will likely perform much better than this and use much less memory (to proportional to the cube root of the range), although at the cost of greater code complexity and more lines of code (about 500 LOC). The trade-off between algorithms of the Legendre type and those of the Meissel type is that the latter decreases the time complexity but at the cost of more time spent sieving; modern follow-on work to LMO produces algorithms which are able to tune the balance between Legendre-type and Meissel-type algorithm in trading off the execution time costs between the different parts of the given algorithm for the best in performance, with the latest in Kim Walisch's tuning of Xavier Gourdon's work being about five to ten times faster than LMO (plus includes multi-threading abilities).


=={{header|Perl}}==
=={{header|Perl}}==