Jump to content

User talk:WillNess: Difference between revisions

(reply to your comment in my talk page...)
(→‎Space complexity improvement -- Hamming numbers: added more of what I know now...)
Line 40:
Hi, glad you enjoyed my tweak.
 
There is an edge case I don't think your argument checking covers: what if n is 0 or negative?
I believe the way you apply it works out the same at the end as the way I do. I add the empirical error correction factor (which quickly becomes about a constant factor of the log of output number for even reasonable ranges) times the original estimate of the log value directly to the desired 'n' number as in the nth Hamming number to calculate a new estimated value as the upper bound of the band. The lower bound is then just the same difference lower (or twice the difference lower than the upper bound). Using my method, I don't have to produce fractions for both the upper bound and the difference.
 
I believe the way you apply itthe tweak works out the same at the end as the way I do. I add the empirical error correction factor (which quickly becomes about a constant factor of the log of output number for even reasonable ranges) times the original estimate of the log value directly to the desired 'n' number as in the nth Hamming number to calculate a new estimated value as the upper bound of the band. The lower bound is then just the same difference lower (or twice the difference lower than the upper bound). Using my method, I don't have to produce fractions for both the upper bound and the difference.
 
As to the space complexity, since the log estimate is basically n^(1/3), and the error band is proportional to the log, then the space complexity is as calculated and quite quickly approaches this limit.
Line 46 ⟶ 48:
Of course the space complexity makes very little difference in speed just as the sorting algorithm chosen makes very little difference as operations are limited to the size of the error band which is negligible as compared to the amount of time spent in the inner loops; for very large ranges there is a slight slowdown from ideal time complexity as the size of the error band gets larger than the various CPU caches.
 
Yes, you are right that 32-bit Haskell is quite slow at handling Word64's, which in my version is why I limit the inner loop i, j, and k values to Word32 (which under the covers Integers are calculated as when the ranges are small enough). The GHC Native Code Generator (NCG) has a known [https://ghc.haskell.org/trac/ghc/ticket/8971 problem, still there in version 8.0.1], especially for 32-bit code, in efficiently allocating registers that is known to be at least twice as slow as 64-bit code for many algorithms, so it is actually amazing that you managed to make this run as fast as you did on IdeOne under 32-bit. It looks like the extra slow-down for 320bit code is computing the count for each value, which for Integers reverts to something close to 32 bit values when the ranges are small enough instead of always having to compute by two chunks.
 
I've always avoided using Int's where I can because they used to have a few tag bits that limited their range for 32 bit use even more, but it seems were new versions of GHC these tag bits are no longer used.
 
The main time expended is due to the inner loops, and even in 64-bit code GHC is still slower by a constant factor than many other languages, even lowly C#/F#, likely due to the time spent building the (lazy) error list instead of using arrays. I think that one could write code that would be as fast as the others, but one would have to define an instance of an STUArray to hold the unboxed tuples used (perhaps as their own packed type class instead of tuples) and the code would get messy for submission to RosettaCode. For [http://rosettacode.org/wiki/Hamming_numbers#Extremely_fast_non-enumerating_version_sorting_values_in_error_band my F# version] I use non-lazy lists (at a constant factor cost in memory use for the tail links) very effectively, running almost as fast as a version using mutable arrays such as C#. The fastest versions in various languages fun about twice as fast as the Haskell version (64-bit code).
Line 52 ⟶ 56:
Yes, we have had the discussion about type signatures before re: Sieve of Eratosthenes; there is a reason they are recommended in the Haskell style guides for outer definitions.
 
You know me from previous discussions on this and other forums: I like maximum efficiency from any given algorithm including this one, and I think that with this tweak we are pretty well there as far as the algorithm goes. However, there are still some problems with your Haskell implementation as posted on the page when I run it on my actual Windows machine. First, it currently doesn't compile as it is missing some changes you intended to make in moving it from the IdeOne listing: missing 'ww' and changes between estval2 back to estval and so on. There is still a time slow down due to garbage collection: with both 64 and 32 bit code, your non-foldl' version spends over half the time garbage collecting on my machine due to the list comprehensions used and even without the garbage collection is still slow - my version can calculate the trillionth number in just over five seconds for 64/32 using Integer instead of Word64 whereas yours takes about three times as long plus the garbage collection. It looks like you should correct the RosettaCode version to the same foldl' version you use on IdeOne. Even with foldl' to prevent the effects of being non-strict, the overhead of using the list comprehensions still makes your version in 64-bit code about twice as slow as my version. My test program on IdeOne is at http://ideone.com/1Np5Ik and runs just a little bit faster than yours in 32-bit code with Integer but quite a bit faster under 64-bit code with Word64. --[[User:GordonBGood|GordonBGood]] ([[User talk:GordonBGood|talk]]) 08:10, 19 August 2016 (UTC)
You know me from previous discussions on this and other forums: I like maximum efficiency from any given algorithm including this one, and I think that with this tweak we are pretty well there. --[[User:GordonBGood|GordonBGood]] ([[User talk:GordonBGood|talk]]) 02:13, 19 August 2016 (UTC)
474

edits

Cookies help us deliver our services. By using our services, you agree to our use of cookies.