Legendre prime counting function: Difference between revisions
Content added Content deleted
m (Minor edit to C++ and Java code) |
GordonBGood (talk | contribs) (→{{header|Erlang}}: add Crystal non-memoized versions as per Nim...) |
||
Line 154: | Line 154: | ||
10^9 50847534 |
10^9 50847534 |
||
</pre> |
</pre> |
||
=={{header|Crystal}}== |
|||
There doesn't seem to be much point to contributing memoized (Hash table) versions as they seem to be the result of the task author not realizing that there is a simple way of keeping the computation complexity from having exponential growth with counting range by limiting the "splitting" of the "phi" calculation when the result must be just one. Accordingly, the following Crystal versions are [translated from the non-memoizing Nim versions](https://rosettacode.org/wiki/Legendre_prime_counting_function#Non-Memoized_Versions), which explanations can be referenced for an understanding of how they work: |
|||
'''The Basic Recursive Legendre Algorithm''' |
|||
<syntaxhighlight lang="crystal">require "bit_array" |
|||
def count_primes(n : Int64) |
|||
if n < 3_i64 |
|||
return 0_i64 if n < 2_i64 |
|||
return 1_i64 |
|||
end |
|||
rtlmt = Math.sqrt(n.to_f64).to_i32 |
|||
mxndx = (rtlmt - 3) // 2 |
|||
cmpsts = BitArray.new(mxndx + 1) |
|||
i = 0 |
|||
while true |
|||
c = (i + i) * (i + 3) + 3 |
|||
break if c > mxndx |
|||
unless cmpsts[i] |
|||
bp = i + i + 3 |
|||
until c > mxndx |
|||
cmpsts[c] = true |
|||
c += bp |
|||
end |
|||
end |
|||
i += 1 |
|||
end |
|||
oprms = Array(Int32).new(cmpsts.count { |e| !e }, 0) |
|||
pi = 0 |
|||
cmpsts.each_with_index do |e, i| |
|||
unless e |
|||
oprms[pi] = (i + i + 3).to_i32; pi += 1 |
|||
end |
|||
end |
|||
phi = uninitialized Proc(Int64, Int32, Int64) # recursion target! |
|||
phi = ->(x : Int64, a : Int32) { |
|||
return x - (x >> 1) if a < 1 |
|||
p = oprms.unsafe_fetch(a - 1) |
|||
return 1_i64 if x <= p |
|||
phi.call(x, a - 1) - phi.call((x.to_f64 / p.to_f64).to_i64, a - 1) |
|||
} |
|||
phi.call(n, oprms.size) + oprms.size |
|||
end |
|||
start_time = Time.monotonic |
|||
(0 .. 9).each { |i| puts "π(10**#{i}) = #{count_primes(10_i64**i)}" } |
|||
elpsd = (Time.monotonic - start_time).total_milliseconds |
|||
puts "This took #{elpsd} milliseconds."</syntaxhighlight> |
|||
{{out}} |
|||
<pre>π(10**0) = 0 |
|||
π(10**1) = 4 |
|||
π(10**2) = 25 |
|||
π(10**3) = 168 |
|||
π(10**4) = 1229 |
|||
π(10**5) = 9592 |
|||
π(10**6) = 78498 |
|||
π(10**7) = 664579 |
|||
π(10**8) = 5761455 |
|||
π(10**9) = 50847534 |
|||
This took 272.428954 milliseconds.</pre> |
|||
The time is as run on an Intel i5-6500 (3.6 GHz single-threaded boost). |
|||
Although this is about ten times faster than if it were using memoization (and uses much less memory), it is only reasonably usable for trivial ranges such as this (trivial in terms of prime counting functions). |
|||
'''The Less Recursive "Bottom-Up" with a TinyPhi LUT Algorithm''' |
|||
Just substitute the following Crystal code for the `count_primes` function above to enjoy the gain in speed: |
|||
<syntaxhighlight lang="crystal">Tiny_Phi_Primes = [ 2, 3, 5, 7, 11, 13 ] |
|||
Tiny_Phi_Odd_Circ = Tiny_Phi_Primes.product // 2 |
|||
Tiny_Phi_Tot = Tiny_Phi_Primes.reduce(1) { |acc, p| acc * (p - 1) } |
|||
CC = Tiny_Phi_Primes.size - 1 |
|||
def make_Tiny_Phi_LUT() |
|||
rslt = Array(UInt16).new(Tiny_Phi_Odd_Circ, 1_u16) |
|||
Tiny_Phi_Primes.skip(1).each { |bp| |
|||
i = (bp - 1) >> 1; rslt[i] = 0; c = (i + i) * (i + 1) |
|||
while c < Tiny_Phi_Odd_Circ |
|||
rslt[c] = 0; c += bp |
|||
end } |
|||
acc = 0_u16; i = 0 |
|||
while i < Tiny_Phi_Odd_Circ |
|||
acc += rslt[i]; rslt[i] = acc; i += 1 |
|||
end |
|||
rslt |
|||
end |
|||
Tiny_Phi_LUT = make_Tiny_Phi_LUT() |
|||
@[AlwaysInline] |
|||
def tiny_Phi(x : Int64) : Int64 |
|||
ndx = (x - 1) >> 1; numtot = ndx // Tiny_Phi_Odd_Circ.to_i64 |
|||
tpli = ndx - numtot * Tiny_Phi_Odd_Circ.to_i64 |
|||
numtot * Tiny_Phi_Tot.to_i64 + |
|||
Tiny_Phi_LUT.unsafe_fetch(tpli).to_i64 |
|||
end |
|||
def count_primes(n : Int64) |
|||
if n < 169_i64 # below 169 whose sqrt is 13 is where TinyPhi doesn't work... |
|||
return 0_i64 if n < 2_i64 |
|||
return 1_i64 if n < 3_i64 |
|||
# adjust for the missing "degree" base primes |
|||
return 1 + (n - 1) // 2 if n < 9_i64 |
|||
return (n - 1) // 2 if n <= 13_i64 |
|||
return 5 + Tiny_Phi_LUT[(n - 1).to_i32 // 2].to_i64 |
|||
end |
|||
rtlmt = Math.sqrt(n.to_f64).to_i32 |
|||
mxndx = (rtlmt - 3) // 2 |
|||
cmpsts = BitArray.new(mxndx + 1) |
|||
i = 0 |
|||
while true |
|||
c = (i + i) * (i + 3) + 3 |
|||
break if c > mxndx |
|||
unless cmpsts[i] |
|||
bp = i + i + 3 |
|||
until c > mxndx |
|||
cmpsts[c] = true |
|||
c += bp |
|||
end |
|||
end |
|||
i += 1 |
|||
end |
|||
oprms = Array(Int32).new(cmpsts.count { |e| !e }, 0) |
|||
opi = 0 |
|||
cmpsts.each_with_index do |e, i| |
|||
unless e |
|||
oprms[opi] = (i + i + 3).to_i32; opi += 1 |
|||
end |
|||
end |
|||
lvl = uninitialized Proc(Int32, Int32, Int64, Int64) # recursion target! |
|||
lvl = ->(pilo : Int32, pilmt : Int32, m : Int64) : Int64 { |
|||
pi = pilo; answr = 0_i64 |
|||
while pi < pilmt |
|||
p = oprms.unsafe_fetch(pi).to_i64; nm = p * m |
|||
return answr + (pilmt - pi) if n <= nm * p |
|||
q = (n.to_f64 / nm.to_f64).to_i64; answr += tiny_Phi(q) |
|||
answr -= lvl.call(CC, pi, nm) if pi > CC |
|||
pi += 1 |
|||
end |
|||
answr |
|||
} |
|||
tiny_Phi(n) - lvl.call(CC, oprms.size, 1_i64) + oprms.size |
|||
end</syntaxhighlight> |
|||
Use of the above function gets a gain in speed of about a further ten times over the above version due to less recursion, the use of the "Tiny_Phi" Look Up Table (LUT) for smaller "degrees" of primes which greatly reduces the number of integer divisions, and by using floating point division for integer division because it is faster, although it is limited in precision to 53-bits, one wouldn't want to use these algorithms over a range where that would cause a problem. This version of prime counting function might be considered to be reasonably useful for counting primes up to a trillion (1e12) in a few tens of seconds. |
|||
'''The Non-Recursive Partial Sieving Algorithm''' |
|||
Just substitute the following Crystal code for the `count_primes` function above to enjoy the further gain in speed: |
|||
<syntaxhighlight lang="crystal">def count_primes(n : Int64) : Int64 |
|||
if n < 3 |
|||
if n < 2 |
|||
return 0_i64 |
|||
else |
|||
return 1_i64 |
|||
end |
|||
end |
|||
half = ->(n : Int64) : Int64 { (n - 1) >> 1 } |
|||
divide = ->(n : Int64, d : Int64) : Int64 { (n.to_f64 / d.to_f64).to_i64 } |
|||
rtlmt = Math.sqrt(n.to_f64).to_i32; mxndx = (rtlmt - 1) // 2 |
|||
cmpsts = BitArray.new(mxndx + 1) |
|||
smalls = Array(Int32).new(mxndx + 1) { |i| i } |
|||
roughs = Array(Int32).new(mxndx + 1) { |i| i + i + 1 } |
|||
larges = |
|||
Array(Int64).new(mxndx + 1) { |i| ((n // (i + i + 1)).to_i64 - 1) >> 1 } |
|||
i = 1; nbps = 0; mxri = mxndx |
|||
while true |
|||
c = (i + i) * (i + 1); break if c > mxndx |
|||
if !cmpsts.unsafe_fetch(i) |
|||
bp = i + i + 1; cmpsts.unsafe_put(i, true) |
|||
until c > mxndx |
|||
cmpsts.unsafe_put(c, true); c += bp |
|||
end # partial sieving for bp completed here! |
|||
j = 0; ri = 0 # adjust `larges` according to partial sieve... |
|||
while j <= mxri |
|||
q = roughs.unsafe_fetch(j); qi = q >> 1 |
|||
if !cmpsts.unsafe_fetch(qi) |
|||
d = bp.to_i64 * q.to_i64 |
|||
larges.unsafe_put(ri, larges.unsafe_fetch(j) - |
|||
if d <= rtlmt.to_i64 |
|||
ndx = smalls.unsafe_fetch(d >> 1) - nbps |
|||
larges.unsafe_fetch(ndx) |
|||
else |
|||
ndx = half.call(divide.call(n, d)) |
|||
smalls.unsafe_fetch(ndx) |
|||
end + nbps) |
|||
roughs.unsafe_put(ri, q); ri += 1 |
|||
end; j += 1 |
|||
end |
|||
si = mxndx; bpm = (rtlmt // bp - 1) | 1 |
|||
while bpm >= bp # adjust smalls according to partial sieve... |
|||
c = smalls.unsafe_fetch(bpm >> 1) - nbps; e = (bpm * bp) >> 1 |
|||
while si >= e |
|||
smalls.unsafe_put(si, smalls.unsafe_fetch(si) - c); si -= 1 |
|||
end |
|||
bpm -= 2 |
|||
end |
|||
mxri = ri - 1; nbps += 1 |
|||
end; i += 1 |
|||
end |
|||
ans = larges.unsafe_fetch(0); i = 1 |
|||
while i <= mxri # combine results; adjust for over subtraction base primes... |
|||
ans -= larges.unsafe_fetch(i); i += 1 |
|||
end |
|||
ans += (mxri.to_i64 + 1 + 2 * (nbps.to_i64 - 1)) * mxri.to_i64 // 2 # adjust! |
|||
ri = 1 # do final phi calculation for pairs of larger primes... |
|||
while true # break on condition when up to cube root of range! |
|||
p = roughs.unsafe_fetch(ri).to_i64; q = n // p |
|||
e = smalls.unsafe_fetch(half.call(divide.call(q, p))) - nbps |
|||
break if e <= ri; ori = ri + 1 |
|||
while ori <= e |
|||
ndx = half.call(divide.call(q, roughs.unsafe_fetch(ori).to_i64)) |
|||
ans += smalls.unsafe_fetch(ndx).to_i64; ori += 1 |
|||
end |
|||
ans -= (e - ri).to_i64 * (nbps.to_i64 + ri.to_i64 - 1); ri += 1 |
|||
end |
|||
ans + 1 # for only even prime of two! |
|||
end</syntaxhighlight> |
|||
The above function enjoys quite a large gain in speed of about a further ten times over the immediately preceding version due to not using recursion at all and the greatly reduced computational complexity of O(n**(3/4)/((log n)**2)) instead of the almost-linear-for-large-counting-ranges O(n/((log n)**2)) for the previous two versions, although the previous two versions differ in performance by a constant factor due to the overheads of recursion and division. This version of prime counting function might be considered to be reasonably useful for counting primes up to a 1e16 in almost two minutes as long as the computer used has the required free memory of about 800 Megabytes. This Crystal version is about twice as slow as the Nim version from which it was translated because Nim's default GCC back-end is better at optimization (at least for this algorithm) than the LLVM back-end used by Crystal and also there is more numeric calculation checking such as numeric overflow used in Crystal that can't be turned off. At that, this version is about the same speed as when translated to Rust, which also uses a LLVM back-end. |
|||
=={{header|Erlang}}== |
=={{header|Erlang}}== |
||
<syntaxhighlight lang="erlang"> |
<syntaxhighlight lang="erlang"> |
||
Line 222: | Line 450: | ||
10^9 50847534 |
10^9 50847534 |
||
</pre> |
</pre> |
||
=={{header|Go}}== |
=={{header|Go}}== |