Humble numbers: Difference between revisions

Line 2,816:

(24,29458)

(25,33188)</pre>

===Much Faster Version with Linear Response for Range===

The above version is quite "brute force" in that it collects all possible humble numbers in a Set persistent data structure, extracting them from smallest first while adding all of the new humble numbers based on each minimum. The Set takes care of the problem of eliminating repeat values, but at quite a large constant overhead as well as an extra `log n` execution performance cost where `n` is the size of the Set due to all persistent data structures. As well, there is a huge processing time cost due to grouping the values by their length after conversion to a string at the cost of a divide by ten for every digit and multiple levels of list processing.

From [[Hamming_numbers#Avoiding_generation_of_duplicatessk page|this Haskell implementation on the Hamming numbers task page]], the method of finding the sequence by non-duplicates as well as using logarithmic approximations for size comparisons costs linear execution time with the length of the sequence, and can be easily adapted for use here just by adding the stage of using the additional prime of seven. In addition, the counting of the digit groups can be greatly simplified and thus faster when accomplished with a single loop based on the logarithmic approximations to reduce the garbage collection overheads of multiple steps of list processing, as per the following code:

<syntaxhighlight lang="haskell">{-# OPTIONS_GHC -O2 #-}

import Data.Word (Word16)

import Data.Bits (shiftR, (.&.))

import Data.Function (fix)

import Data.List (group, intercalate)

import Data.Time.Clock.POSIX (getPOSIXTime)

--------------------- HUMBLE NUMBERS ----------------------

data LogRep = LogRep {-# UNPACK #-} !Double

{-# UNPACK #-} !Word16

{-# UNPACK #-} !Word16 deriving Show

instance Eq LogRep where

(==) (LogRep la _ _ _ _) (LogRep lb _ _ _ _) = la == lb

instance Ord LogRep where

(<=) (LogRep la _ _ _ _) (LogRep lb _ _ _ _) = la <= lb

logrep2Integer :: LogRep -> Integer

logrep2Integer (LogRep _ w x y z) = xpnd 2 w $ xpnd 3 x $ xpnd 5 y $ xpnd 7 z 1 where

xpnd m v = go v m where

go i mlt acc =

if i <= 0 then acc else

go (i `shiftR` 1) (mlt * mlt) (if i .&. 1 == 0 then acc else acc * mlt)

cOneLR :: LogRep

cOneLR = LogRep 0.0 0 0 0 0

cLgOf2 :: Double

cLgOf2 = logBase 10 2

cLgOf3 :: Double

cLgOf3 = logBase 10 3

cLgOf5 :: Double

cLgOf5 = logBase 10 5

cLgOf7 :: Double

cLgOf7 = logBase 10 7

cLgOf10 :: Double

cLgOf10 = cLgOf2 + cLgOf5

mulLR2 :: LogRep -> LogRep

mulLR2 (LogRep lg w x y z) = LogRep (lg + cLgOf2) (w + 1) x y z

mulLR3 :: LogRep -> LogRep

mulLR3 (LogRep lg w x y z) = LogRep (lg + cLgOf3) w (x + 1) y z

mulLR5 :: LogRep -> LogRep

mulLR5 (LogRep lg w x y z) = LogRep (lg + cLgOf5) w x (y + 1) z

mulLR7 :: LogRep -> LogRep

mulLR7 (LogRep lg w x y z) = LogRep (lg + cLgOf7) w x y (z + 1)

humbleLRs :: () -> [LogRep]

humbleLRs() = cOneLR : foldr u [] [mulLR2, mulLR3, mulLR5, mulLR7] where

u nmf s = fix (merge s . map nmf . (cOneLR:)) where

merge a [] = a

merge [] b = b

merge a@(x:xs) b@(y:ys) | x < y = x : merge xs b

| otherwise = y : merge a ys

-------------------------- TEST ---------------------------

main :: IO ()

main = do

putStrLn "First 50 Humble numbers:"

mapM_ (putStrLn . concat) $

chunksOf 10 $ justifyRight 4 ' ' . show <$> take 50 (map logrep2Integer $ humbleLRs())

strt <- getPOSIXTime

putStrLn "\nCount of humble numbers for each digit length 1-255:"

putStrLn "Digits Count Accum"

mapM_ putStrLn $ take 255 $ groupFormat $ humbleLRs()

stop <- getPOSIXTime

putStrLn $ "Counting took " ++ show (1.0 * (stop - strt)) ++ " seconds."

------------------------- DISPLAY -------------------------

chunksOf :: Int -> [a] -> [[a]]

chunksOf n = go where

go [] = []

go ilst = take n ilst : go (drop n ilst)

justifyRight :: Int -> a -> [a] -> [a]

justifyRight n c = (drop . length) <*> (replicate n c ++)

commaString :: String -> String

commaString =

let grpsOf3 [] = []

grpsOf3 is = let (frst, rest) = splitAt 3 is in frst : grpsOf3 rest

in reverse . intercalate "," . grpsOf3 . reverse

groupFormat :: [LogRep] -> [String]

groupFormat = go (0 :: Int) (0 :: Int) 0 where

go _ _ _ [] = []

go i cnt cacc ((LogRep lg _ _ _ _) : lrtl) =

let nxt = truncate (lg / cLgOf10) :: Int in

if nxt == i then go i (cnt + 1) cacc lrtl else

let ni = i + 1

ncacc = cacc + cnt

str = justifyRight 4 ' ' (show ni) ++

justifyRight 14 ' ' (commaString $ show cnt) ++

justifyRight 19 ' ' (commaString $ show ncacc)

in str : go ni 1 ncacc lrtl</syntaxhighlight>

<pre>First 50 Humble numbers:

1 2 3 4 5 6 7 8 9 10

12 14 15 16 18 20 21 24 25 27

28 30 32 35 36 40 42 45 48 49

50 54 56 60 63 64 70 72 75 80

81 84 90 96 98 100 105 108 112 120

Count of humble numbers for each digit length 1-255:

Digits Count Accum

1 9 9

2 36 45

3 95 140

4 197 337

5 356 693

6 579 1,272

7 882 2,154

8 1,272 3,426

9 1,767 5,193

10 2,381 7,574

11 3,113 10,687

12 3,984 14,671

13 5,002 19,673

14 6,187 25,860

15 7,545 33,405

16 9,081 42,486

17 10,815 53,301

18 12,759 66,060

19 14,927 80,987

20 17,323 98,310

21 19,960 118,270

22 22,853 141,123

23 26,015 167,138

24 29,458 196,596

25 33,188 229,784

.

. results as for the C++ or Pascal versions...

.

250 30,938,881 1,954,289,627

251 31,310,645 1,985,600,272

252 31,685,379 2,017,285,651

253 32,063,093 2,049,348,744

254 32,443,792 2,081,792,536

255 32,827,496 2,114,620,032

Counting took 169.193563058s seconds.</pre>

The above abbreviated results match the results from [[Humble_numbers#Direct_Generation_-_Variant|the C++ Direct-Generation contribution]] as well [[Humble_numbers#Pascal|as the Pascal contribution]]. The above code, as run on an Intel i5-6500 at 3.6 GHz with single-thread boost, still spends almost two thirds of the execution time doing garbage collection related to lazy list processing, which could almost be eliminated by replacing the inner lists with "deque" first in/first out mutable array-based data structures, which would also reduce memory use by some constant factor, but that hardly seems worth the effort when there are even faster techniques available for this specific set of tasks that almost eliminate the huge memory use and most of the processing time overheads.

===Direct Generation of Digit Counts from Logarithms===

From [[Humble_numbers#Direct_Generation_-_Variant|the C++ Direct Generation contribution]], there is no need to generate the ordered sequence of humble numbers to be able to bin them according to digits length; that technique, which reduces all of the execution time expended in sorting the humble numbers stream in increasing order, works fine in Haskell, as well reducing the huge memory requirements as it only requires space for the binning array. The use of the extended precision logarithmic approximations as in 64-bit rather than the 53-bit precision of 64-bit floats may be faster to compare 64-bit integers rather than `Double`'s. For the trivial exercise of generating the first 50 humble numbers in order, any algorithm including a brute force one may be used, but here a more direct version of the non-duplicates version is used, as per the following code:

<syntaxhighlight lang="haskell">{-# OPTIONS_GHC -O2 #-}

{-# LANGUAGE FlexibleContexts #-}

import Data.Word (Word64)

import Data.Bits (shiftR)

import Data.Function (fix)

import Data.Array.Unboxed (UArray, elems)

import Data.Array.Base (MArray(newArray, unsafeRead, unsafeWrite))

import Data.Array.IO (IOUArray)

import Data.List (intercalate)

import Data.Time.Clock.POSIX (getPOSIXTime)

import Data.Array.Unsafe (unsafeFreeze)

cNumDigits :: Int

cNumDigits = 877

cShift :: Int

cShift = 50

cFactor :: Word64

cFactor = 2^cShift

cLogOf10 :: Word64

cLogOf10 = cFactor

cLogOf7 :: Word64

cLogOf7 = round $ (logBase 10 7 :: Double) * fromIntegral cFactor

cLogOf5 :: Word64

cLogOf5 = round $ (logBase 10 5 :: Double) * fromIntegral cFactor

cLogOf3 :: Word64

cLogOf3 = round $ (logBase 10 3 :: Double) * fromIntegral cFactor

cLogOf2 :: Word64

cLogOf2 = cLogOf10 - cLogOf5

cLogLmt :: Word64

cLogLmt = fromIntegral cNumDigits * cLogOf10

humbles :: () -> [Integer]

humbles() = 1 : foldr u [] [2,3,5,7] where

u n s = fix (merge s . map (n*) . (1:)) where

merge a [] = a

merge [] b = b

merge a@(x:xs) b@(y:ys) | x < y = x : merge xs b

| otherwise = y : merge a ys

-------------------------- TEST ---------------------------

main :: IO ()

main = do

putStrLn "First 50 humble numbers:"

mapM_ (putStrLn . concat) $

chunksOf 10 $ justifyRight 4 ' ' . show <$> take 50 (humbles())

putStrLn $ "\nCount of humble numbers for each digit length 1-"

++ show cNumDigits ++ ":"

putStrLn "Digits Count Accum"

strt <- getPOSIXTime

mbins <- newArray (0, cNumDigits - 1) 0 :: IO (IOUArray Int Int)

let loopw w =

if w >= cLogLmt then return () else

let loopx x =

if x >= cLogLmt then loopw (w + cLogOf2) else

let loopy y =

if y >= cLogLmt then loopx (x + cLogOf3) else

let loopz z =

if z >= cLogLmt then loopy (y + cLogOf5) else do

let ndx = fromIntegral (z `shiftR` cShift)

v <- unsafeRead mbins ndx

unsafeWrite mbins ndx (v + 1)

loopz (z + cLogOf7) in loopz y in loopy x in loopx w

loopw 0

stop <- getPOSIXTime

bins <- unsafeFreeze mbins :: IO (UArray Int Int)

mapM_ putStrLn $ format $ elems bins

putStrLn $ "Counting took " ++ show (realToFrac (stop - strt)) ++ " seconds."

------------------------- DISPLAY -------------------------

chunksOf :: Int -> [a] -> [[a]]

chunksOf n = go where

go [] = []

go ilst = take n ilst : go (drop n ilst)

justifyRight :: Int -> a -> [a] -> [a]

justifyRight n c = (drop . length) <*> (replicate n c ++)

commaString :: String -> String

commaString =

let grpsOf3 [] = []

grpsOf3 s = let (frst, rest) = splitAt 3 s in frst : grpsOf3 rest

in reverse . intercalate "," . grpsOf3 . reverse

format :: [Int] -> [String]

format = go (0 :: Int) 0 where

go _ _ [] = []

go i cacc (hd : tl) =

let ni = i + 1

ncacc = cacc + hd

str = justifyRight 4 ' ' (show ni) ++

justifyRight 14 ' ' (commaString $ show hd) ++

justifyRight 19 ' ' (commaString $ show ncacc)

in str : go ni ncacc tl</syntaxhighlight>

<pre>First 50 humble numbers:

1 2 3 4 5 6 7 8 9 10

12 14 15 16 18 20 21 24 25 27

28 30 32 35 36 40 42 45 48 49

50 54 56 60 63 64 70 72 75 80

81 84 90 96 98 100 105 108 112 120

Count of humble numbers for each digit length 1-877:

Digits Count Accum

1 9 9

2 36 45

3 95 140

4 197 337

5 356 693

6 579 1,272

7 882 2,154

8 1,272 3,426

9 1,767 5,193

10 2,381 7,574

11 3,113 10,687

12 3,984 14,671

13 5,002 19,673

14 6,187 25,860

15 7,545 33,405

16 9,081 42,486

17 10,815 53,301

18 12,759 66,060

19 14,927 80,987

20 17,323 98,310

21 19,960 118,270

22 22,853 141,123

23 26,015 167,138

24 29,458 196,596

25 33,188 229,784

.

. results as for the C++ or Pascal versions...

.

860 1,252,394,180 270,098,254,942

861 1,256,764,708 271,355,019,650

862 1,261,145,413 272,616,165,063

863 1,265,536,277 273,881,701,340

864 1,269,937,307 275,151,638,647

865 1,274,348,541 276,425,987,188

866 1,278,769,968 277,704,757,156

867 1,283,201,615 278,987,958,771

868 1,287,643,503 280,275,602,274

869 1,292,095,618 281,567,697,892

870 1,296,557,975 282,864,255,867

871 1,301,030,613 284,165,286,480

872 1,305,513,506 285,470,799,986

873 1,310,006,698 286,780,806,684

874 1,314,510,190 288,095,316,874

875 1,319,023,979 289,414,340,853

876 1,323,548,095 290,737,888,948

877 1,328,082,553 292,065,971,501

Counting took 177.070331827 seconds.</pre>

This result is as run on an Intel i5-6500 at 3.6 GHz for single-threaded boost and the results are the same as for [[Humble_numbers#Direct_Generation_-_Variant|the C++ Direct-Generation contribution]] as well [[Humble_numbers#Pascal|as the Pascal contribution]]. The results are perhaps a little faster than the C++ code from which this was translated, likely due to scaling so that the bin index is calculated by a simple bit shift rather than a division.

This code can likely be used to count the number of humble numbers for all digits up to 4000 digits in under a day (untested).

=={{header|J}}==