User talk:Wherrera: Difference between revisions

m
no edit summary
m (Intentional task demotion?)
mNo edit summary
 
(3 intermediate revisions by 2 users not shown)
Line 47:
== Intentional task demotion? ==
Did you mean to remove [[Bioinformatics/Global alignment]] and [[Doomsday rule]] as tasks? I see you removed the task markup, so they are no longer listed as tasks, and that is throwing off statistical calculations in the reports I generate. (Phix and Wren have over 100% completion.). I rather suspect that you meant to remove the '''draft''' part of the markup rather than the '''task''' part. I am provisionally going to re-add that. --[[User:Thundergnat|Thundergnat]] ([[User talk:Thundergnat|talk]]) 11:06, 31 October 2022 (UTC)
 
:: :: No this was supposed to change from draft status to regular task -- i did not know that had happened! So, the "task" label is always needed! I will try to not just remove the draft without adding the 'task' label then. --[[User:Wherrera|Wherrera]] ([[User talk:Wherrera|talk]]) 18:19, 31 October 2022 (UTC)
 
== Speed of your non-recursive Legendre prime counting function in Julia... ==
 
I ran your contribution on my machine to see how it compared to other language versions I had contributed, to find your translation of Nim is quite slow for the following reasons:
# The type of the argument counting function is the generic "::Integer" abstract type, which means that whatever type comes in will be the type of all the integers within the function (default ::Int64 on a 64-bit machine).
# You didn't keep the optimization of doing the division using floating point, which while not as fast in Julia as in Nim due to the speed of the conversion, does make a difference. While it is true that the precision of using floating point divisions means that counts to ranges over about 0.9e16 won't be accurate, this is a very high range for this algorithm due to high memory use and that alternate algorithms use much less memory and and a little faster.
# you are using even more memory because all of the arrays are this default size, meaning that "smalls" and "roughs" which only need to be 32-bits for a 64-bit argument (this algorithm is too slow and takes too much memory, even with these suggested improvement, to use anywhere near the 64-bit limit let alone for higher ranges), so the algorithm as you wrote it will be using about twice as much memory as the Nim version: 1.6 Gigabytes counting to 1e16 rather than 0.9 Gigabytes.
 
I fixed the above problems in the following Julia code, but didn't want to submit it myself as Julia isn't one of my commonly used languages and I might have missed something:
<syntaxhighlight lang="julia">function countprimes(n::UInt64)
n < 3 && return typeof(n)(n > 1)
@inline half(i::Integer) = (i - 1) >> 1
@inline divide(nm::Integer, d::Integer) = Integer(trunc(Float64(nm) / Float64(d)))
rtlmt = isqrt(n)
mxndx = (rtlmt - 1) ÷ 2
smalls::Array{UInt32} = [i for i in 0:mxndx+1]
roughs::Array{UInt32} = [i + i + 1 for i in 0:mxndx+1]
larges::Array{Int64} = [(n ÷ (i + i + 1) - 1) ÷ 2 for i in 0:mxndx+1]
cmpsts = falses(mxndx + 1)
bp, npc, mxri = 3, 0, mxndx
while true
i = bp >> 1
sqri = (i + i) * (i + 1)
sqri > mxndx && break
if !cmpsts[i + 1]
cmpsts[i + 1] = true
for c in sqri:bp:mxndx
cmpsts[c + 1] = true
end
ri = 0
for k in 0:mxri
q = roughs[k + 1]
qi = q >> 1
cmpsts[qi + 1] && continue
d = UInt64(bp) * UInt64(q)
larges[ri + 1] = larges[k + 1] + npc -
(d <= rtlmt ? larges[smalls[d >> 1 + 1] - npc + 1]
: smalls[half(divide(n, d)) + 1])
roughs[ri + 1] = q
ri += 1
end
m = mxndx
for k in (rtlmt ÷ bp - 1) | 1 : -2 : bp
c = smalls[k >> 1 + 1] - npc
ee = (k * bp) >> 1
while m >= ee
smalls[m + 1] -= c
m -= 1
end
end
mxri = ri - 1
npc += 1
end
bp += 2
end
 
result = larges[1]
for i in 2:mxri+1
result -= larges[i]
end
result += (mxri + 1 + 2 * (npc - 1)) * mxri ÷ 2
 
for j in 1:mxri
p = UInt64(roughs[j + 1])
m = n ÷ p
ee = smalls[half(m ÷ p) + 1] - npc
ee <= j && break
for k in j+1:ee
result += smalls[half(divide(m, UInt64(roughs[k + 1]))) + 1]
end
result -= (ee - j) * (npc + j - 1)
end
return result + 1
end
 
@time(println("π(10^9) = ", countprimes(UInt64(10)^9)))
@time(for i in 0:14
println("π(10^$i) = ", countprimes(UInt64(10)^i))
end)</syntaxhighlight>
Unless I am missing something, the above code takes about ten seconds on my machine and about three minutes to count to 1e16 (incorrectly as noted due to precision problems), which makes it about the speed of my JavaScript contribution of the same algorithm and slower than my Kotlin and F# contributions as well as most other native code compiling languages, which you might want to look into to see if something more can be done. --[[User:GordonBGood|GordonBGood]] ([[User talk:GordonBGood|talk]]) 03:33, 12 November 2022 (UTC)
 
That is right. I see no speedup with your corrections, so I think that the problem may lie elsewhere. --[[User:Wherrera|Wherrera]] ([[User talk:Wherrera|talk]]) 05:58, 12 November 2022 (UTC)
 
It looks as if using the precalculated values from Nim example speeds it by about 20%. I suspect more can be wrung out. --[[User:Wherrera|Wherrera]] ([[User talk:Wherrera|talk]]) 21:22, 12 November 2022 (UTC)
4,102

edits