Hamming numbers
You are encouraged to solve this task according to the task description, using any language you may know.
Hamming numbers are numbers of the form
H = 2^i × 3^j × 5^k, where i, j, k ≥ 0
Hamming numbers are also known as ugly numbers and as 5-smooth numbers (numbers whose prime divisors are all less than or equal to 5).
- Task
Generate the sequence of Hamming numbers, in increasing order. In particular:
- Show the first twenty Hamming numbers.
- Show the 1691st Hamming number (the last one below 2^31).
- Show the one millionth Hamming number (if the language – or a convenient library – supports arbitrary-precision integers).
- Related tasks
- References
- Wikipedia entry: Hamming numbers (this link is re-directed to Regular number).
- Wikipedia entry: Smooth number
- OEIS entry: A051037 5-smooth or Hamming numbers
- Hamming problem from Dr. Dobb's CodeTalk (dead link as of Sep 2011; parts of the thread here and here).
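Most of the entries below use the classic algorithm from the Dr. Dobb's thread referenced above (often credited to Dijkstra): each new Hamming number is the smallest of the pending multiples by 2, 3 and 5 of previously generated Hamming numbers. As a point of reference only (several entries mention "the Python version" of this approach), here is a minimal illustrative sketch in Python; it is a sketch, not one of the language entries:
def hamming(limit):
    # h[0..limit-1] will hold the first `limit` Hamming numbers in order
    h = [1] * limit
    x2, x3, x5 = 2, 3, 5        # next candidate multiples of 2, 3 and 5
    i = j = k = 0               # indexes of the values those candidates came from
    for n in range(1, limit):
        h[n] = min(x2, x3, x5)
        if h[n] == x2: i += 1; x2 = 2 * h[i]
        if h[n] == x3: j += 1; x3 = 3 * h[j]
        if h[n] == x5: k += 1; x5 = 5 * h[k]
    return h[-1]

print([hamming(i) for i in range(1, 21)])   # first twenty
print(hamming(1691))                        # 2125764000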
11l
F hamming(limit)
V h = [1] * limit
V (x2, x3, x5) = (2, 3, 5)
V i = 0
V j = 0
V k = 0
L(n) 1 .< limit
h[n] = min(x2, x3, x5)
I x2 == h[n]
i++
x2 = 2 * h[i]
I x3 == h[n]
j++
x3 = 3 * h[j]
I x5 == h[n]
k++
x5 = 5 * h[k]
R h.last
print((1..20).map(i -> hamming(i)))
print(hamming(1691))
- Output:
[1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36]
2125764000
360 Assembly
* Hamming numbers 12/03/2017
HAM CSECT
USING HAM,R13 base register
B 72(R15) skip savearea
DC 17F'0' savearea
STM R14,R12,12(R13) save previous context
ST R13,4(R15) link backward
ST R15,8(R13) link forward
LR R13,R15 set addressability
LA R6,1 ii=1
DO WHILE=(C,R6,LE,=F'20') do ii=1 to 20
BAL R14,PRTHAM call prtham
LA R6,1(R6) ii++
ENDDO , enddo ii
LA R6,1691 ii=1691
BAL R14,PRTHAM call prtham
L R13,4(0,R13) restore previous savearea pointer
LM R14,R12,12(R13) restore previous context
XR R15,R15 rc=0
BR R14 exit
PRTHAM EQU * ---- prtham
ST R14,R14PRT save return addr
LR R1,R6 ii
XDECO R1,XDEC edit
MVC PG+2(4),XDEC+8 output ii
LR R1,R6 ii
BAL R14,HAMMING call hamming(ii)
XDECO R0,XDEC edit
MVC PG+8(10),XDEC+2 output hamming(ii)
XPRNT PG,L'PG print buffer
L R14,R14PRT restore return addr
BR R14 ---- return
HAMMING EQU * ---- hamming(ll)
ST R14,R14HAM save return addr
ST R1,LL ll
MVC HH,=F'1' h(1)=1
SR R0,R0 0
ST R0,I i=0
ST R0,J j=0
ST R0,K k=0
MVC X2,=F'2' x2=2
MVC X3,=F'3' x3=3
MVC X5,=F'5' x5=5
LA R7,1 n=1
L R2,LL ll
BCTR R2,0 -1
ST R2,LLM1 ll-1
DO WHILE=(C,R7,LE,LLM1) do n=1 to ll-1
L R4,X2 m=x2
IF C,R4,GT,X3 THEN if m>x3 then
L R4,X3 m=x3
ENDIF , endif
IF C,R4,GT,X5 THEN if m>x5 then
L R4,X5 m=x5
ENDIF , endif
LR R1,R7 n
SLA R1,2 *4
ST R4,HH(R1) h(n+1)=m
IF C,R4,EQ,X2 THEN if m=x2 then
L R1,I i
LA R1,1(R1) i+1
ST R1,I i=i+1
SLA R1,2 *4
L R2,HH(R1) h(i+1)
MH R2,=H'2' *2
ST R2,X2 x2=2*h(i+1)
ENDIF , endif
IF C,R4,EQ,X3 THEN if m=x3 then
L R1,J j
LA R1,1(R1) j+1
ST R1,J j=j+1
SLA R1,2 *4
L R2,HH(R1) h(j+1)
MH R2,=H'3' *3
ST R2,X3 x3=3*h(j+1)
ENDIF , endif
IF C,R4,EQ,X5 THEN if m=x5 then
L R1,K k
LA R1,1(R1) k+1
ST R1,K k=k+1
SLA R1,2 *4
L R2,HH(R1) h(k+1)
MH R2,=H'5' *5
ST R2,X5 x5=5*h(k+1)
ENDIF , endif
LA R7,1(R7) n++
ENDDO , enddo n
L R1,LL ll
SLA R1,2 *4
L R0,HH-4(R1) return h(ll)
L R14,R14HAM restore return addr
BR R14 ---- return
R14HAM DS A return addr of hamming
R14PRT DS A return addr of print
LL DS F ll
LLM1 DS F ll-1
I DS F i
J DS F j
K DS F k
X2 DS F x2
X3 DS F x3
X5 DS F x5
PG DC CL80'H(xxxx)=xxxxxxxxxx'
XDEC DS CL12 temp
LTORG positioning literal pool
HH DS 1691F array h(1691)
YREGS
END HAM
- Output:
H(   1)=         1
H(   2)=         2
H(   3)=         3
H(   4)=         4
H(   5)=         5
H(   6)=         6
H(   7)=         8
H(   8)=         9
H(   9)=        10
H(  10)=        12
H(  11)=        15
H(  12)=        16
H(  13)=        18
H(  14)=        20
H(  15)=        24
H(  16)=        25
H(  17)=        27
H(  18)=        30
H(  19)=        32
H(  20)=        36
H(1691)=2125764000
Ada
GNAT provides the datatypes Integer, Long_Integer and Long_Long_Integer, which are not large enough to store the larger Hamming numbers required by this task. In this program, we represent each number by the exponents of its prime factors 2, 3 and 5, and only convert it to a base-10 number for display. We use the gmp library binding that is part of GNATCOLL, though a simple 'pragma Import' would be enough.
This version is very fast (20 ms for the millionth Hamming number), thanks to a good algorithm. We also do not manipulate large numbers directly (gmp lib), but only the exponents of the prime factors.
It will fail to compute the billionth number because we use an array on the stack to store all numbers. It is possible to get rid of this array, though it would make the code slightly less readable.
with Ada.Numerics.Generic_Elementary_Functions;
with Ada.Text_IO; use Ada.Text_IO;
with GNATCOLL.GMP.Integers;
with GNATCOLL.GMP.Lib;
procedure Hamming is
type Log_Type is new Long_Long_Float;
package Funcs is new Ada.Numerics.Generic_Elementary_Functions (Log_Type);
type Factors_Array is array (Positive range <>) of Positive;
generic
Factors : Factors_Array := (2, 3, 5);
-- The factors for smooth numbers. Hamming numbers are 5-smooth.
package Smooth_Numbers is
type Number is private;
function Compute (Nth : Positive) return Number;
function Image (N : Number) return String;
private
type Exponent_Type is new Natural;
type Exponents_Array is array (Factors'Range) of Exponent_Type;
-- Numbers are stored as the exponents of the prime factors.
type Number is record
Exponents : Exponents_Array;
Log : Log_Type;
-- The log of the value, used to ease sorting.
end record;
function "=" (N1, N2 : Number) return Boolean
is (for all F in Factors'Range => N1.Exponents (F) = N2.Exponents (F));
end Smooth_Numbers;
package body Smooth_Numbers is
One : constant Number := (Exponents => (others => 0), Log => 0.0);
Factors_Log : array (Factors'Range) of Log_Type;
function Image (N : Number) return String is
use GNATCOLL.GMP.Integers, GNATCOLL.GMP.Lib;
R, Tmp : Big_Integer;
begin
Set (R, "1");
for F in Factors'Range loop
Set (Tmp, Factors (F)'Image);
Raise_To_N (Tmp, GNATCOLL.GMP.Unsigned_Long (N.Exponents (F)));
Multiply (R, Tmp);
end loop;
return Image (R);
end Image;
function Compute (Nth : Positive) return Number is
Candidates : array (Factors'Range) of Number;
Values : array (1 .. Nth) of Number;
-- Will result in Storage_Error for very large values of Nth
Indices : array (Factors'Range) of Natural :=
(others => Values'First);
Current : Number;
Tmp : Number;
begin
for F in Factors'Range loop
Factors_Log (F) := Funcs.Log (Log_Type (Factors (F)));
Candidates (F) := One;
Candidates (F).Exponents (F) := 1;
Candidates (F).Log := Factors_Log (F);
end loop;
Values (1) := One;
for Count in 2 .. Nth loop
-- Find next value (the lowest of the candidates)
Current := Candidates (Factors'First);
for F in Factors'First + 1 .. Factors'Last loop
if Candidates (F).Log < Current.Log then
Current := Candidates (F);
end if;
end loop;
Values (Count) := Current;
-- Update the candidates. There might be several candidates with
-- the same value
for F in Factors'Range loop
if Candidates (F) = Current then
Indices (F) := Indices (F) + 1;
Tmp := Values (Indices (F));
Tmp.Exponents (F) := Tmp.Exponents (F) + 1;
Tmp.Log := Tmp.Log + Factors_Log (F);
Candidates (F) := Tmp;
end if;
end loop;
end loop;
return Values (Nth);
end Compute;
end Smooth_Numbers;
package Hamming is new Smooth_Numbers ((2, 3, 5));
begin
for N in 1 .. 20 loop
Put (" " & Hamming.Image (Hamming.Compute (N)));
end loop;
New_Line;
Put_Line (Hamming.Image (Hamming.Compute (1691)));
Put_Line (Hamming.Image (Hamming.Compute (1_000_000)));
end Hamming;
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
519312780448388736089589843750000000000000000000000000000000000000000000000000000000
ALGOL 68
Hamming numbers are generated in a trivial iterative way as in the Python version below. This program keeps the series needed to generate the numbers as short as possible using flexible rows; on the downside, it spends considerable time on garbage collection.
PR precision=100 PR
MODE SERIES = FLEX [1 : 0] UNT, # Initially, no elements #
UNT = LONG LONG INT; # A 100-digit unsigned integer #
PROC hamming number = (INT n) UNT: # The n-th Hamming number #
CASE n
IN 1, 2, 3, 4, 5, 6, 8, 9, 10, 12 # First 10 in a table #
OUT # Additional operators #
OP MIN = (INT i, j) INT: (i < j | i | j), MIN = (UNT i, j) UNT: (i < j | i | j);
PRIO MIN = 9;
OP LAST = (SERIES h) UNT: h[UPB h]; # Last element of a series #
OP +:= = (REF SERIES s, UNT elem) VOID:
# Extend a series by one element, only keep the elements you need #
(INT lwb = (i MIN j) MIN k, upb = UPB s;
REF SERIES new s = HEAP FLEX [lwb : upb + 1] UNT;
(new s[lwb : upb] := s[lwb : upb], new s[upb + 1] := elem);
s := new s
);
# Determine the n-th hamming number iteratively #
SERIES h := 1, # Series, initially one element #
UNT m2 := 2, m3 := 3, m5 := 5, # Multipliers #
INT i := 1, j := 1, k := 1; # Counters #
TO n - 1
DO h +:= (m2 MIN m3) MIN m5;
(LAST h = m2 | m2 := 2 * h[i +:= 1]);
(LAST h = m3 | m3 := 3 * h[j +:= 1]);
(LAST h = m5 | m5 := 5 * h[k +:= 1])
OD;
LAST h
ESAC;
FOR k TO 20
DO print ((whole (hamming number (k), 0), blank))
OD;
print ((newline, whole (hamming number (1 691), 0)));
print ((newline, whole (hamming number (1 000 000), 0)))
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
519312780448388736089589843750000000000000000000000000000000000000000000000000000000
ALGOL W
Algol W only has 32-bit integers, so we just show the first 20 Hamming numbers and Hamming number 1691.
This uses the algorithm from Dr. Dobb's (as in the Python version). The Coffee Script solution has some notes on how it works.
begin
% returns the minimum of a and b %
integer procedure min ( integer value a, b ) ; if a < b then a else b;
% find and print Hamming Numbers %
% Algol W only supports 32-bit integers so we just find %
% the 1691 32-bit Hamming Numbers %
integer MAX_HAMMING;
MAX_HAMMING := 1691;
begin
integer array H( 1 :: MAX_HAMMING );
integer p2, p3, p5, last2, last3, last5;
H( 1 ) := 1;
last2 := last3 := last5 := 1;
p2 := 2;
p3 := 3;
p5 := 5;
for hPos := 2 until MAX_HAMMING do begin
integer m;
% the next Hamming number is the lowest of the next multiple of 2, 3, and 5 %
m := min( min( p2, p3 ), p5 );
H( hPos ) := m;
if m = p2 then begin
last2 := last2 + 1;
p2 := 2 * H( last2 )
end if_used_power_of_2 ;
if m = p3 then begin
last3 := last3 + 1;
p3 := 3 * H( last3 )
end if_used_power_of_3 ;
if m = p5 then begin
last5 := last5 + 1;
p5 := 5 * H( last5 )
end if_used_power_of_5 ;
end for_hPos ;
i_w := 1;
s_w := 1;
write( H( 1 ) );
for i := 2 until 20 do writeon( H( i ) );
write( H( MAX_HAMMING ) )
end
end.
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
Arturo
hamming: function [limit][
if limit=1 -> return 1
h: map 0..limit-1 'z -> 1
x2: 2, x3: 3, x5: 5
i: 0, j: 0, k: 0
loop 1..limit-1 'n [
set h n min @[x2 x3 x5]
if x2 = h\[n] [
i: i + 1
x2: 2 * h\[i]
]
if x3 = h\[n] [
j: j + 1
x3: 3 * h\[j]
]
if x5 = h\[n] [
k: k + 1
x5: 5 * h\[k]
]
]
last h
]
print map 1..20 => hamming
print hamming 1691
print hamming 1000000
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
519312780448388736089589843750000000000000000000000000000000000000000000000000000000
ATS
//
// How to compile:
// patscc -DATS_MEMALLOC_LIBC -o hamming hamming.dats
//
#include
"share/atspre_staload.hats"
fun
min3
(
A: arrayref(int, 3)
) : natLt(3) = i where
{
var x: int = A[0]
var i: natLt(3) = 0
val () = if A[1] < x then (x := A[1]; i := 1)
val () = if A[2] < x then (x := A[2]; i := 2)
} (* end of [min3] *)
fun
hamming
{n:pos}
(
n: int(n)
) : int = let
//
var A = @[int](2, 3, 5)
val A = $UNSAFE.cast{arrayref(int, 3)}(addr@A)
var I = @[int](1, 1, 1)
val I = $UNSAFE.cast{arrayref(int, 3)}(addr@I)
val H = arrayref_make_elt<int> (i2sz(succ(n)), 0)
val () = H[0] := 1
//
fun
loop{k:pos}
(k: int(k)) : void =
(
//
if
k < n
then let
val i = min3(A)
val k =
(
if A[i] > H[k-1] then (H[k] := A[i]; k+1) else k
) : intBtwe(k, k+1)
val ii = I[i]
val () = I[i] := ii+1
val ii = $UNSAFE.cast{natLte(n)}(ii)
val () = if i = 0 then A[i] := 2*H[ii]
val () = if i = 1 then A[i] := 3*H[ii]
val () = if i = 2 then A[i] := 5*H[ii]
in
loop(k)
end // end of [then]
else () // end of [else]
//
) (* end of [loop] *)
//
in
loop (1); H[n-1]
end (* end of [hamming] *)
implement
main0 () =
{
val () =
loop(1) where
{
fun
loop
{n:pos}
(
n: int(n)
) : void =
if
n <= 20
then let
val () =
println! ("hamming(",n,") = ", hamming(n))
in
loop(n+1)
end // end of [then]
// end of [if]
} (* end of [val] *)
val n = 1691
val () = println! ("hamming(",n,") = ", hamming(n))
//
} (* end of [main0] *)
- Output:
hamming(1) = 1
hamming(2) = 2
hamming(3) = 3
hamming(4) = 4
hamming(5) = 5
hamming(6) = 6
hamming(7) = 8
hamming(8) = 9
hamming(9) = 10
hamming(10) = 12
hamming(11) = 15
hamming(12) = 16
hamming(13) = 18
hamming(14) = 20
hamming(15) = 24
hamming(16) = 25
hamming(17) = 27
hamming(18) = 30
hamming(19) = 32
hamming(20) = 36
hamming(1691) = 2125764000
AutoHotkey
SetBatchLines, -1
Msgbox % hamming(1,20)
Msgbox % hamming(1690)
return
hamming(first,last=0)
{
if (first < 1)
ans=ERROR
if (last = 0)
last := first
i:=0, j:=0, k:=0
num1 := ceil((last * 20)**(1/3))
num2 := ceil(num1 * ln(2)/ln(3))
num3 := ceil(num1 * ln(2)/ln(5))
loop
{
H := (2**i) * (3**j) * (5**k)
if (H > 0)
ans = %H%`n%ans%
i++
if (i > num1)
{
i=0
j++
if (j > num2)
{
j=0
k++
}
}
if (k > num3)
break
}
Sort ans, N
Loop, parse, ans, `n, `r
{
if (A_index > last)
break
if (A_index < first)
continue
Output = %Output%`n%A_LoopField%
}
return Output
}
AWK
# syntax: gawk -M -f hamming_numbers.awk
BEGIN {
for (i=1; i<=20; i++) {
printf("%d ",hamming(i))
}
printf("\n1691: %d\n",hamming(1691))
printf("\n1000000: %d\n",hamming(1000000))
exit(0)
}
function hamming(limit, h,i,j,k,n,x2,x3,x5) {
h[0] = 1
x2 = 2
x3 = 3
x5 = 5
for (n=1; n<=limit; n++) {
h[n] = min(x2,min(x3,x5))
if (h[n] == x2) { x2 = 2 * h[++i] }
if (h[n] == x3) { x3 = 3 * h[++j] }
if (h[n] == x5) { x5 = 5 * h[++k] }
}
return(h[limit-1])
}
function min(x,y) {
return((x < y) ? x : y)
}
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
1691: 2125764000
1000000: 519312780448388736089589843750000000000000000000000000000000000000000000000000000000
BASIC256
print "The first 20 Hamming numbers are :"
for i = 1 to 20
print Hamming(i);" ";
next i
print
print "H( 1691) = "; Hamming(1691)
end
function min(a, b)
if a < b then return a else return b
end function
function Hamming(limit)
dim h(1000000)
h[0] = 1
x2 = 2 : x3 = 3 : x5 = 5
i = 0 : j = 0 : k = 0
for n = 1 to limit
h[n] = min(x2, min(x3, x5))
if x2 = h[n] then i += 1: x2 = 2 *h[i]
if x3 = h[n] then j += 1: x3 = 3 *h[j]
if x5 = h[n] then k += 1: x5 = 5 *h[k]
next n
return h[limit -1]
end function
BBC BASIC
@% = &1010
FOR h% = 1 TO 20
PRINT "H("; h% ") = "; FNhamming(h%)
NEXT
PRINT "H(1691) = "; FNhamming(1691)
END
DEF FNhamming(l%)
LOCAL i%, j%, k%, n%, m, x2, x3, x5, h%()
DIM h%(l%) : h%(0) = 1
x2 = 2 : x3 = 3 : x5 = 5
FOR n% = 1 TO l%-1
m = x2
IF m > x3 m = x3
IF m > x5 m = x5
h%(n%) = m
IF m = x2 i% += 1 : x2 = 2 * h%(i%)
IF m = x3 j% += 1 : x3 = 3 * h%(j%)
IF m = x5 k% += 1 : x5 = 5 * h%(k%)
NEXT
= h%(l%-1)
- Output:
H(1) = 1
H(2) = 2
H(3) = 3
H(4) = 4
H(5) = 5
H(6) = 6
H(7) = 8
H(8) = 9
H(9) = 10
H(10) = 12
H(11) = 15
H(12) = 16
H(13) = 18
H(14) = 20
H(15) = 24
H(16) = 25
H(17) = 27
H(18) = 30
H(19) = 32
H(20) = 36
H(1691) = 2125764000
Bc
cat hamming_numbers.bc
define min(x,y) {
if (x < y) {
return x
} else {
return y
}
}
define hamming(limit) {
i = 0
j = 0
k = 0
h[0] = 1
x2 = 2
x3 = 3
x5 = 5
for (n=1; n<=limit; n++) {
h[n] = min(x2,min(x3,x5))
if (h[n] == x2) { x2 = 2 * h[++i] }
if (h[n] == x3) { x3 = 3 * h[++j] }
if (h[n] == x5) { x5 = 5 * h[++k] }
}
return (h[limit-1])
}
for (lab=1; lab<=20; lab++) {
hamming(lab)
}
hamming(1691)
hamming(1000000)
quit
- Output:
$ bc hamming_numbers.bc
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
1
2
3
4
5
6
8
9
10
12
15
16
18
20
24
25
27
30
32
36
2125764000
519312780448388736089589843750000000000000000000000000000000000000000000000000000000
Bracmat
( ( hamming
= x2 x3 x5 n i j k min
. tbl$(h,!arg) { This creates an array. Arrays are always global in Bracmat. }
& 1:?(0$h)
& 2:?x2
& 3:?x3
& 5:?x5
& 0:?n:?i:?j:?k
& whl
' ( !n+1:<!arg:?n
& !x2:?min
& (!x3:<!min:?min|)
& (!x5:<!min:?min|)
& !min:?(!n$h) { !n is index into array h }
& ( !x2:!min
& 2*!((1+!i:?i)$h):?x2
|
)
& ( !x3:!min
& 3*!((1+!j:?j)$h):?x3
|
)
& ( !x5:!min
& 5*!((1+!k:?k)$h):?x5
|
)
)
& !((!arg+-1)$h) (tbl$(h,0)&) { We delete the array by setting its size to 0 }
)
& 0:?I
& whl'(!I+1:~>20:?I&put$(hamming$!I " "))
& out$
& out$(hamming$1691)
& out$(hamming$1000000)
);
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
519312780448388736089589843750000000000000000000000000000000000000000000000000000000
Bruijn
n1000000 takes a very long time but eventually reduces to the correct result.
:import std/Combinator .
:import std/Number .
:import std/List .
merge y [[[∅?1 0 (1 [[2 [[go]]]])]]]
go 3 <? 1 (3 : (6 2 4)) (1 : (6 5 0))
# classic version while avoiding duplicate generation
hammings-classic (+1) : (foldr u empty ((+2) : ((+3) : {}(+5))))
u [[y [merge 1 ((mul 2) <$> ((+1) : 0))]]]
:test ((hammings-classic !! (+42)) =? (+162)) ([[1]])
# enumeration by a chain of folded merges (faster)
hammings-folded ([(0 ∘ a) ∘ (0 ∘ b)] (foldr merge1 empty)) $ c
merge1 [[1 [[1 : (merge 0 2)]]]]
a iterate (map (mul (+5)))
b iterate (map (mul (+3)))
c iterate (mul (+2)) (+1)
:test ((hammings-folded !! (+42)) =? (+162)) ([[1]])
# --- output ---
main [first-twenty : (n1691 : {}n1000000)]
first-twenty take (+20) hammings-folded
n1691 hammings-folded !! (+1690)
n1000000 hammings-folded !! (+999999)
C
Using a min-heap to keep track of numbers. Does not handle big integers.
#include <stdio.h>
#include <stdlib.h>
typedef unsigned long long ham;
size_t alloc = 0, n = 1;
ham *q = 0;
void qpush(ham h)
{
int i, j;
if (alloc <= n) {
alloc = alloc ? alloc * 2 : 16;
q = realloc(q, sizeof(ham) * alloc);
}
for (i = n++; (j = i/2) && q[j] > h; q[i] = q[j], i = j);
q[i] = h;
}
ham qpop()
{
int i, j;
ham r, t;
/* outer loop for skipping duplicates */
for (r = q[1]; n > 1 && r == q[1]; q[i] = t) {
/* inner loop is the normal down heap routine */
for (i = 1, t = q[--n]; (j = i * 2) < n;) {
if (j + 1 < n && q[j] > q[j+1]) j++;
if (t <= q[j]) break;
q[i] = q[j], i = j;
}
}
return r;
}
int main()
{
int i;
ham h;
for (qpush(i = 1); i <= 1691; i++) {
/* takes smallest value, and queue its multiples */
h = qpop();
qpush(h * 2);
qpush(h * 3);
qpush(h * 5);
if (i <= 20 || i == 1691)
printf("%6d: %llu\n", i, h);
}
/* free(q); */
return 0;
}
Alternative
Standard algorithm. Numbers are stored as exponents of factors instead of big integers, while GMP is only used for display. It's much more efficient this way.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <gmp.h>
/* number of factors. best be mutually prime -- duh. */
#define NK 3
#define MAX_HAM (1 << 24)
#define MAX_POW 1024
int n_hams = 0, idx[NK] = {0}, fac[] = { 2, 3, 5, 7, 11};
/* k-smooth numbers are stored as their exponents of each factor;
v is the log of the number, for convenience. */
typedef struct {
int e[NK];
double v;
} ham_t, *ham;
ham_t *hams, values[NK] = {{{0}, 0}};
double inc[NK][MAX_POW];
/* most of the time v can be just incremented, but eventually
* floating point precision will bite us, so better recalculate */
inline
void _setv(ham x) {
int i;
for (x->v = 0, i = 0; i < NK; i++)
x->v += inc[i][x->e[i]];
}
inline
int _eq(ham a, ham b) {
int i;
for (i = 0; i < NK && a->e[i] == b->e[i]; i++);
return i == NK;
}
ham get_ham(int n)
{
int i, ni;
ham h;
n--;
while (n_hams < n) {
for (ni = 0, i = 1; i < NK; i++)
if (values[i].v < values[ni].v)
ni = i;
*(h = hams + ++n_hams) = values[ni];
for (ni = 0; ni < NK; ni++) {
if (! _eq(values + ni, h)) continue;
values[ni] = hams[++idx[ni]];
values[ni].e[ni]++;
_setv(values + ni);
}
}
return hams + n;
}
void show_ham(ham h)
{
static mpz_t das_ham, tmp;
int i;
mpz_init_set_ui(das_ham, 1);
mpz_init_set_ui(tmp, 1);
for (i = 0; i < NK; i++) {
mpz_ui_pow_ui(tmp, fac[i], h->e[i]);
mpz_mul(das_ham, das_ham, tmp);
}
gmp_printf("%Zu\n", das_ham);
}
int main()
{
int i, j;
hams = malloc(sizeof(ham_t) * MAX_HAM);
for (i = 0; i < NK; i++) {
values[i].e[i] = 1;
inc[i][1] = log(fac[i]);
_setv(values + i);
for (j = 2; j < MAX_POW; j++)
inc[i][j] = j * inc[i][1];
}
printf(" 1,691: "); show_ham(get_ham(1691));
printf(" 1,000,000: "); show_ham(get_ham(1e6));
printf("10,000,000: "); show_ham(get_ham(1e7));
return 0;
}
- Output:
 1,691: 2125764000
 1,000,000: 519312780448388736089589843750000000000000000000000000000000000000000000000000000000
10,000,000: 16244105063830431823239 ..<a gadzillion digits>.. 000000000000000000000
C#
using System;
using System.Numerics;
using System.Linq;
namespace Hamming {
class MainClass {
public static BigInteger Hamming(int n) {
BigInteger two = 2, three = 3, five = 5;
var h = new BigInteger[n];
h[0] = 1;
BigInteger x2 = 2, x3 = 3, x5 = 5;
int i = 0, j = 0, k = 0;
for (int index = 1; index < n; index++) {
h[index] = BigInteger.Min(x2, BigInteger.Min(x3, x5));
if (h[index] == x2) x2 = two * h[++i];
if (h[index] == x3) x3 = three * h[++j];
if (h[index] == x5) x5 = five * h[++k];
}
return h[n - 1];
}
public static void Main(string[] args) {
Console.WriteLine(string.Join(" ", Enumerable.Range(1, 20).ToList().Select(x => Hamming(x))));
Console.WriteLine(Hamming(1691));
Console.WriteLine(Hamming(1000000));
}
}
}
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
519312780448388736089589843750000000000000000000000000000000000000000000000000000000
Generic version for any set of numbers
The algorithm is similar to the one above.
using System;
using System.Numerics;
using System.Linq;
namespace Hamming {
class MainClass {
public static BigInteger[] Hamming(int n, int[] a) {
var primes = a.Select(x => (BigInteger)x).ToArray();
var values = a.Select(x => (BigInteger)x).ToArray();
var indexes = new int[a.Length];
var results = new BigInteger[n];
results[0] = 1;
for (int iter = 1; iter < n; iter++) {
results[iter] = values[0];
for (int p = 1; p < primes.Length; p++)
if (results[iter] > values[p])
results[iter] = values[p];
for (int p = 0; p < primes.Length; p++)
if (results[iter] == values[p])
values[p] = primes[p] * results[++indexes[p]];
}
return results;
}
public static void Main(string[] args) {
foreach (int[] primes in new int[][] { new int[] {2,3,5}, new int[] {2,3,5,7} }) {
Console.WriteLine("{0}-Smooth:", primes.Last());
Console.WriteLine(string.Join(" ", Hamming(20, primes)));
Console.WriteLine(Hamming(1691, primes).Last());
Console.WriteLine(Hamming(1000000, primes).Last());
Console.WriteLine();
}
}
}
}
- Output:
5-Smooth:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
519312780448388736089589843750000000000000000000000000000000000000000000000000000000

7-Smooth:
1 2 3 4 5 6 7 8 9 10 12 14 15 16 18 20 21 24 25 27
3317760
4157409948433216829957008507500000000
Fast version
Like some of the other implementations on this page, this version represents each number as a list of exponents which would be applied to each prime number. So the number 60 would be represented as int[3] { 2, 1, 1 } which is interpreted as 2^2 * 3^1 * 5^1.
As often happens, optimizing for speed caused a marked increase in code size and complexity. Clearly the versions I wrote above are easier to read & understand. They were also much quicker to write. But the generic version above runs in 3+ seconds for the 1000000th 5-smooth number whereas this version does it in 0.35 seconds, 8-10 times faster.
I've tried to comment it as best I could, without bloating the code too much.
--Mike Lorenz
using System;
using System.Linq;
using System.Numerics;
namespace HammingFast {
class MainClass {
private static int[] _primes = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 };
public static BigInteger Big(int[] exponents) {
BigInteger val = 1;
for (int i = 0; i < exponents.Length; i++)
for (int e = 0; e < exponents[i]; e++)
val = val * _primes[i];
return val;
}
public static int[] Hamming(int n, int nprimes) {
var hammings = new int[n, nprimes]; // array of hamming #s we generate
var hammlogs = new double[n]; // log values for above
var primelogs = new double[nprimes]; // pre-calculated prime log values
var indexes = new int[nprimes]; // intermediate hamming values as indexes into hammings
var listheads = new int[nprimes, nprimes]; // intermediate hamming list heads
var listlogs = new double[nprimes]; // log values of list heads
for (int p = 0; p < nprimes; p++) {
listheads[p, p] = 1; // init list heads to prime values
primelogs[p] = Math.Log(_primes[p]); // pre-calc prime log values
listlogs[p] = Math.Log(_primes[p]); // init list head log values
}
for (int iter = 1; iter < n; iter++) {
int min = 0; // find index of min item in list heads
for (int p = 1; p < nprimes; p++)
if (listlogs[p] < listlogs[min])
min = p;
hammlogs[iter] = listlogs[min]; // that's the next hamming number
for (int i = 0; i < nprimes; i++)
hammings[iter, i] = listheads[min, i];
for (int p = 0; p < nprimes; p++) { // update each list head if it matches new value
bool equal = true; // test each exponent to see if number matches
for (int i = 0; i < nprimes; i++) {
if (hammings[iter, i] != listheads[p, i]) {
equal = false;
break;
}
}
if (equal) { // if it matches...
int x = ++indexes[p]; // set index to next hamming number
for (int i = 0; i < nprimes; i++) // copy each hamming exponent
listheads[p, i] = hammings[x, i];
listheads[p, p] += 1; // increment exponent = mult by prime
listlogs[p] = hammlogs[x] + primelogs[p]; // add log(prime) to log(value) = mult by prime
}
}
}
var result = new int[nprimes];
for (int i = 0; i < nprimes; i++)
result[i] = hammings[n - 1, i];
return result;
}
public static void Main(string[] args) {
foreach (int np in new int[] { 3, 4, 5 }) {
Console.WriteLine("{0}-Smooth:", _primes[np - 1]);
Console.WriteLine(string.Join(" ", Enumerable.Range(1, 20).Select(x => Big(Hamming(x, np)))));
Console.WriteLine(Big(Hamming(1691, np)));
Console.WriteLine(Big(Hamming(1000000, np)));
Console.WriteLine();
}
}
}
}
- Output:
5-Smooth:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
519312780448388736089589843750000000000000000000000000000000000000000000000000000000

7-Smooth:
1 2 3 4 5 6 7 8 9 10 12 14 15 16 18 20 21 24 25 27
3317760
4157409948433216829957008507500000000

11-Smooth:
1 2 3 4 5 6 7 8 9 10 11 12 14 15 16 18 20 21 22 24
296352
561912530929780078125000
C# Enumerator Version
I wanted to fix the enumerator (old) version, as it wasn't working. It became a bit of an obsession... after a few iterations I came up with the following, which is the fastest C# version on my computer - your mileage may vary. It uses the speed of the Log method (Log(2)+Log(3) = Log(2*3)) to help determine which candidate is the next one to use. I have also added some logic (using the series property) to ensure that exponent sets are never duplicated, which speeds the calculations up a bit. Adding this trick to the Fast Version would probably result in the fastest version, but I'll leave that to someone else to implement. Finally, it is all enumerated through a crazy one-way-linked-list-type structure that only exists as long as the enumerator, leaving it up to the garbage collector to remove the bits that are no longer needed... I hope it's commented enough - follow it if you dare!
using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
namespace HammingTest
{
class HammingNode
{
public double log;
public int[] exponents;
public HammingNode next;
public int series;
}
class HammingListEnumerator : IEnumerable<BigInteger>
{
private int[] primes;
private double[] primelogs;
private HammingNode next;
private HammingNode[] values;
private HammingNode[] indexes;
public HammingListEnumerator(IEnumerable<int> seeds)
{
// Ensure our seeds are properly ordered, and generate their log values
primes = seeds.OrderBy(x => x).ToArray();
primelogs = primes.Select(x => Math.Log10(x)).ToArray();
// Start at 1 (log(1)=0, exponents are all 0, series = none)
next = new HammingNode { log = 0, exponents = new int[primes.Length], series = primes.Length };
// Set all exponent sequences to the start, and calculate the first value for each exponent
indexes = new HammingNode[primes.Length];
values = new HammingNode[primes.Length];
for(int i = 0; i < primes.Length; ++i)
{
indexes[i] = next;
values[i] = AddExponent(next, i);
}
}
// Make a copy of a node, and increment the specified exponent value
private HammingNode AddExponent(HammingNode node, int i)
{
HammingNode ret = new HammingNode { log = node.log + primelogs[i], exponents = (int[])node.exponents.Clone(), series = i };
++ret.exponents[i];
return ret;
}
private void GetNext()
{
// Find which exponent value is the lowest
int min = 0;
for(int i = 1; i < values.Length; ++i)
if(values[i].log < values[min].log)
min = i;
// Add it to the end of the 'list', and move to it
next.next = values[min];
next = values[min];
// Find the next node in an allowed sequence (skip those that would be duplicates)
HammingNode val = indexes[min].next;
while(val.series < min)
val = val.next;
// Keep the current index, and calculate the next value in the series for that exponent
indexes[min] = val;
values[min] = AddExponent(val, min);
}
// Skip values without having to calculate the BigInteger value from the exponents
public HammingListEnumerator Skip(int count)
{
for(int i = count; i > 0; --i)
GetNext();
return this;
}
// Calculate the BigInteger value from the exponents
internal BigInteger ValueOf(HammingNode n)
{
BigInteger val = 1;
for(int i = 0; i < n.exponents.Length; ++i)
for(int e = 0; e < n.exponents[i]; e++)
val = val * primes[i];
return val;
}
public IEnumerator<BigInteger> GetEnumerator()
{
while(true)
{
yield return ValueOf(next);
GetNext();
}
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
}
class Program
{
static void Main(string[] args)
{
foreach(int[] primes in new int[][] {
new int[] { 2, 3, 5 },
new int[] { 2, 3, 5, 7 },
new int[] { 2, 3, 5, 7, 11 }})
{
HammingListEnumerator hammings = new HammingListEnumerator(primes);
System.Diagnostics.Debug.WriteLine("{0}-Smooth:", primes.Last());
System.Diagnostics.Debug.WriteLine(String.Join(" ", hammings.Take(20).ToArray()));
System.Diagnostics.Debug.WriteLine(hammings.Skip(1691 - 20).First());
System.Diagnostics.Debug.WriteLine(hammings.Skip(1000000 - 1691).First());
System.Diagnostics.Debug.WriteLine("");
}
}
}
}
- Output:
5-Smooth:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
519312780448388736089589843750000000000000000000000000000000000000000000000000000000

7-Smooth:
1 2 3 4 5 6 7 8 9 10 12 14 15 16 18 20 21 24 25 27
3317760
4157409948433216829957008507500000000

11-Smooth:
1 2 3 4 5 6 7 8 9 10 11 12 14 15 16 18 20 21 22 24
296352
561912530929780078125000
Alternate Generic Enumerating version
YMMV, but unlike the author of the above code, I found the above version to be much slower on my machine than the "Generic version". The following version is actually just a little slower than the Generic version but uses much less memory, due to avoiding duplicates and only keeping in memory those "lazy list" streams necessary for calculation, from 1/5 of the current range up to 1/2 of it (for 5-smooth numbers) - and not successive values in those ranges, but only the values that are multiples of previous ranges. Like the Haskell code from which it is translated, the heads of the streams are not retained, so they can be garbage collected when no longer necessary; it is also recommended that the primes be processed in reverse order so that the least dense streams are processed first, for slightly less memory use and fewer operations.
It also shows that one can use somewhat functional programming techniques in C#.
The class implements its own partial version of a lazy list using the Lazy class and uses lambda closures for the recursive use of the successive streams to avoid stack use. It uses Aggregate to implement the Haskell "foldl" function. The code demonstrates that even though C# is primarily imperative in paradigm, with its ability to implement closures using delegates/lambdas, it can express some algorithms such as this mostly functionally.
It isn't nearly as fast as the Haskell, Scala, or even Clojure and Scheme (GambitC) versions of this algorithm; being about five times slower is primarily due to its use of many small heap-based class instances, both for the LazyList's and for the closures (implemented using at least one class to hold the captured free variables), and the inefficiency of DotNet's allocation and garbage collection of many small instance objects (although it is about twice as fast as F#'s implementation, whose closures must require even more small object instances); it seems Haskell and the (Java) JVM are much more efficient at doing these allocations/garbage collections for many small objects. To a relatively minor extent, the slower speed is also due to less efficient BigInteger operations:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
namespace Hamming {
class Hammings : IEnumerable<BigInteger> {
private class LazyList<T> {
public T v; public Lazy<LazyList<T>> cont;
public LazyList(T v, Lazy<LazyList<T>> cont) {
this.v = v; this.cont = cont;
}
}
private uint[] primes;
private Hammings() { } // must have an argument!!!
public Hammings(uint[] prms) { this.primes = prms; }
private LazyList<BigInteger> merge(LazyList<BigInteger> xs,
LazyList<BigInteger> ys) {
if (xs == null) return ys; else {
var x = xs.v; var y = ys.v;
if (BigInteger.Compare(x, y) < 0) {
var cont = new Lazy<LazyList<BigInteger>>(() =>
merge(xs.cont.Value, ys));
return new LazyList<BigInteger>(x, cont);
}
else {
var cont = new Lazy<LazyList<BigInteger>>(() =>
merge(xs, ys.cont.Value));
return new LazyList<BigInteger>(y, cont);
}
}
}
private LazyList<BigInteger> llmult(uint mltplr,
LazyList<BigInteger> ll) {
return new LazyList<BigInteger>(mltplr * ll.v,
new Lazy<LazyList<BigInteger>>(() =>
llmult(mltplr, ll.cont.Value)));
}
public IEnumerator<BigInteger> GetEnumerator() {
Func<LazyList<BigInteger>,uint,LazyList<BigInteger>> u =
(acc, p) => { LazyList<BigInteger> r = null;
var cont = new Lazy<LazyList<BigInteger>>(() => r);
r = new LazyList<BigInteger>(1, cont);
r = this.merge(acc, llmult(p, r));
return r; };
yield return 1;
for (var stt = primes.Aggregate(null, u); ; stt = stt.cont.Value)
yield return stt.v;
}
IEnumerator IEnumerable.GetEnumerator() {
return this.GetEnumerator();
}
}
class Program {
static void Main(string[] args) {
Console.WriteLine("Calculates the Hamming sequence of numbers.\r\n");
var primes = new uint[] { 5, 3, 2 };
Console.WriteLine(String.Join(" ", (new Hammings(primes)).Take(20).ToArray()));
Console.WriteLine((new Hammings(primes)).ElementAt(1691 - 1));
var n = 1000000;
var elpsd = -DateTime.Now.Ticks;
var num = (new Hammings(primes)).ElementAt(n - 1);
elpsd += DateTime.Now.Ticks;
Console.WriteLine(num);
Console.WriteLine("The {0}th hamming number took {1} milliseconds", n, elpsd / 10000);
Console.Write("\r\nPress any key to exit:");
Console.ReadKey(true);
Console.WriteLine();
}
}
}
Fast enumerating logarithmic version
The so-called "fast" generic version above isn't really all that fast, due to all the extra array accesses required by the generic implementation and because it doesn't avoid duplicates as the functional code above does. It also uses a lot of memory, as its arrays are the size of the whole range for which the Hamming numbers are calculated.
The following code eliminates or reduces all of those problems by being non-generic, eliminating duplicate calculations, and saving memory by "draining" the growable List's in blocks as their back-pointer indexes are consumed (thus using memory at the same rate as the functional version), avoiding excessive allocations/garbage collections; it also enumerates through the Hamming numbers, although that comes at a slight cost in overhead function calls:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
class HammingsLogArr : IEnumerable<Tuple<uint, uint, uint>> {
public static BigInteger trival(Tuple<uint, uint, uint> tpl) {
BigInteger rslt = 1;
for (var i = 0; i < tpl.Item1; ++i) rslt *= 2;
for (var i = 0; i < tpl.Item2; ++i) rslt *= 3;
for (var i = 0; i < tpl.Item3; ++i) rslt *= 5;
return rslt;
}
private const double lb3 = 1.5849625007211561814537389439478; // Math.Log(3) / Math.Log(2);
private const double lb5 = 2.3219280948873623478703194294894; // Math.Log(5) / Math.Log(2);
private struct logrep {
public double lg;
public uint x2, x3, x5;
public logrep(double lg, uint x, uint y, uint z) {
this.lg = lg; this.x2 = x; this.x3 = y; this.x5 = z;
}
public logrep mul2() {
return new logrep (this.lg + 1.0, this.x2 + 1, this.x3, this.x5);
}
public logrep mul3() {
return new logrep(this.lg + lb3, this.x2, this.x3 + 1, this.x5);
}
public logrep mul5() {
return new logrep(this.lg + lb5, this.x2, this.x3, this.x5 + 1);
}
}
public IEnumerator<Tuple<uint, uint, uint>> GetEnumerator() {
var one = new logrep();
var s2 = new List<logrep>(); var s3 = new List<logrep>();
s2.Add(one); s3.Add(one.mul3());
var s5 = one.mul5(); var mrg = one.mul3();
var s2hdi = 0; var s3hdi = 0;
while (true) {
if (s2hdi >= s2.Count) { s2.RemoveRange(0, s2hdi); s2hdi = 0; } // assume capacity stays the same...
var v = s2[s2hdi];
if ( v.lg < mrg.lg) { s2.Add(v.mul2()); s2hdi++; }
else {
if (s3hdi >= s3.Count) { s3.RemoveRange(0, s3hdi); s3hdi = 0; }
v = mrg; s2.Add(v.mul2()); s3.Add(v.mul3());
s3hdi++; var chkv = s3[s3hdi];
if (chkv.lg < s5.lg) { mrg = chkv; }
else { mrg = s5; s5 = s5.mul5(); s3hdi--; }
}
yield return Tuple.Create(v.x2, v.x3, v.x5);
}
}
IEnumerator IEnumerable.GetEnumerator() {
return this.GetEnumerator();
}
}
class Program {
static void Main(string[] args) {
Console.WriteLine(String.Join(" ", (new HammingsLogArr()).Take(20)
.Select(t => HammingsLogArr.trival(t))
.ToArray()));
Console.WriteLine(HammingsLogArr.trival((new HammingsLogArr()).ElementAt((int)1691 - 1)));
var n = 1000000UL;
var elpsd = -DateTime.Now.Ticks;
var rslt = (new HammingsLogArr()).ElementAt((int)n - 1);
elpsd += DateTime.Now.Ticks;
Console.WriteLine("2^{0} times 3^{1} times 5^{2}", rslt.Item1, rslt.Item2, rslt.Item3);
var lgrthm = Math.Log10(2.0) * ((double)rslt.Item1 +
((double)rslt.Item2 * Math.Log(3.0) + (double)rslt.Item3 * Math.Log(5.0)) / Math.Log(2.0));
var pwr = Math.Floor(lgrthm); var mntsa = Math.Pow(10.0, lgrthm - pwr);
Console.WriteLine("Approximately: {0}E+{1}", mntsa, pwr);
var s = HammingsLogArr.trival(rslt).ToString();
var lngth = s.Length;
Console.WriteLine("Decimal digits: {0}", lngth);
if (lngth <= 10000) {
var i = 0;
for (; i < lngth - 100; i += 100) Console.WriteLine(s.Substring(i, 100));
Console.WriteLine(s.Substring(i));
}
Console.WriteLine("The {0}th hamming number took {1} milliseconds", n, elpsd / 10000);
Console.Write("\r\nPress any key to exit:");
Console.ReadKey(true);
Console.WriteLine();
}
}
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
2^55 times 3^47 times 5^64
Approximately: 5.19312780448414E+83
Decimal digits: 84
519312780448388736089589843750000000000000000000000000000000000000000000000000000000
The 1000000th hamming number took 55 milliseconds
The above code is about 30 times faster than the functional code due to both eliminating the lambda closures that were the main problem with that code as well as eliminating the BigInteger operations. It has about O(n) empirical performance and can find the billionth Hamming number in about 60 seconds.
Extremely fast non-enumerating version calculating the error band
The above code is about as fast as one can go while generating sequences; however, if one is willing to forego sequences and just calculate the nth Hamming number (again), then some reading on the relationship between the size of the numbers and their position in the sequence is helpful (Wikipedia: regular number). One finds that there is a very distinct relationship, which quite quickly reduces to a small error band proportional to the logarithm of the output value for larger ranges. Thus, the following code just scans the logarithmic representations that fall in this top error band, inserts them into a list, and extracts the correct nth representation from that band (a short sketch of the size estimate used is given after the code below). It reduces the time complexity to O(n^(2/3)) from O(n) for the sequence versions but, even more remarkably, reduces the memory requirements to O(n^(1/3)) from O(n^(2/3)), and thus makes it possible to calculate very large values in the sequence on common personal computers. The code is as follows:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
static class NthHamming {
public static BigInteger trival(Tuple<uint, uint, uint> tpl) {
BigInteger rslt = 1;
for (var i = 0; i < tpl.Item1; ++i) rslt *= 2;
for (var i = 0; i < tpl.Item2; ++i) rslt *= 3;
for (var i = 0; i < tpl.Item3; ++i) rslt *= 5;
return rslt;
}
private struct logrep {
public uint x2, x3, x5;
public double lg;
public logrep(uint x, uint y, uint z, double lg) {
this.x2 = x; this.x3 = y; this.x5 = z; this.lg = lg;
}
}
private const double lb3 = 1.5849625007211561814537389439478; // Math.Log(3) / Math.Log(2);
private const double lb5 = 2.3219280948873623478703194294894; // Math.Log(5) / Math.Log(2);
private const double fctr = 6.0 * lb3 * lb5;
private const double crctn = 2.4534452978042592646620291867186; // Math.Log(Math.sqrt(30.0)) / Math.Log(2.0)
public static Tuple<uint, uint, uint> findNth(UInt64 n) {
if (n < 1) throw new Exception("NthHamming.findNth: argument must be > 0!");
if (n < 2) return Tuple.Create(0u, 0u, 0u); // trivial case for argument of one
var lgest = Math.Pow(fctr * (double)n, 1.0/3.0) - crctn; // from WP formula
var frctn = (n < 1000000000) ? 0.509 : 0.105;
var lghi = Math.Pow(fctr * ((double)n + frctn * lgest), 1.0/3.0) - crctn;
var lglo = 2.0 * lgest - lghi; // upper and lower bound of upper "band"
var count = 0UL; // need 64 bit precision in case...
var bnd = new List<logrep>();
for (uint k = 0, klmt = (uint)(lghi / lb5) + 1; k < klmt; ++k) {
var p = (double)k * lb5;
for (uint j = 0, jlmt = (uint)((lghi - p) / lb3) + 1; j < jlmt; ++j) {
var q = p + (double)j * lb3;
var ir = lghi - q;
var lg = q + Math.Floor(ir); // current log2 value (estimated)
count += (ulong)ir + 1;
if (lg >= lglo) bnd.Add(new logrep((UInt32)ir, j, k, lg));
}
}
if (n > count) throw new Exception("NthHamming.findNth: band high estimate is too low!");
var ndx = (int)(count - n);
if (ndx >= bnd.Count) throw new Exception("NthHamming.findNth: band low estimate is too high!");
bnd.Sort((a, b) => (b.lg < a.lg) ? -1 : 1); // sort in descending order
var rslt = bnd[ndx];
return Tuple.Create(rslt.x2, rslt.x3, rslt.x5);
}
}
class Program {
static void Main(string[] args) {
Console.WriteLine(String.Join(" ", Enumerable.Range(1,20).Select(i =>
NthHamming.trival(NthHamming.findNth((ulong)i))).ToArray()));
Console.WriteLine(NthHamming.trival(NthHamming.findNth(1691)));
var n = 1000000000000UL;
var elpsd = -DateTime.Now.Ticks;
var rslt = NthHamming.findNth(n);
elpsd += DateTime.Now.Ticks;
Console.WriteLine("2^{0} times 3^{1} times 5^{2}", rslt.Item1, rslt.Item2, rslt.Item3);
var lgrthm = Math.Log10(2.0) * ((double)rslt.Item1 +
((double)rslt.Item2 * Math.Log(3.0) + (double)rslt.Item3 * Math.Log(5.0)) / Math.Log(2.0));
var pwr = Math.Floor(lgrthm); var mntsa = Math.Pow(10.0, lgrthm - pwr);
Console.WriteLine("Approximately: {0}E+{1}", mntsa, pwr);
var s = NthHamming.trival(rslt).ToString();
var lngth = s.Length;
Console.WriteLine("Decimal digits: {0}", lngth);
if (lngth <= 10000) {
var i = 0;
for (; i < lngth - 100; i += 100) Console.WriteLine(s.Substring(i, 100));
Console.WriteLine(s.Substring(i));
}
Console.WriteLine("The {0}th hamming number took {1} milliseconds", n, elpsd / 10000);
Console.Write("\r\nPress any key to exit:");
Console.ReadKey(true);
Console.WriteLine();
}
}
The output is the same as above except that the time is too small to be measured. The billionth number in the sequence can be calculated in just about 10 milliseconds, the trillionth in about one second, the thousand trillionth in about a hundred seconds, and it should be possible to calculate the 10^19th value in less than a day (untested) on common personal computers. The (2^64 - 1)th value (18446744073709551615) cannot be calculated due to a slight overflow problem as it approaches that limit.
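For reference, the size estimate used by findNth above follows from the approximate counting formula for regular (5-smooth) numbers given in the Wikipedia "Regular number" entry: the count of Hamming numbers whose base-2 logarithm is at most L is roughly (L + lb √30)^3 / (6 × lb 3 × lb 5), where lb is the base-2 logarithm; inverting this gives the code's `lgest` value, approximately (6 × lb 3 × lb 5 × n)^(1/3) − lb √30. A minimal sketch of that estimate, in Python just for illustration (the function name is hypothetical):
import math

def log2_estimate(n):
    # mirrors the lb3, lb5, fctr and crctn constants in the C# code above
    lb3 = math.log2(3.0)                    # log2(3)
    lb5 = math.log2(5.0)                    # log2(5)
    fctr = 6.0 * lb3 * lb5
    crctn = math.log2(math.sqrt(30.0))      # log2(sqrt(30))
    return (fctr * n) ** (1.0 / 3.0) - crctn

# log2_estimate(1691) is about 30.97, close to log2(2125764000), which is about 30.99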
C++
C++11 For Each Generator
#include <iostream>
#include <vector>
// Hamming like sequences Generator
//
// Nigel Galloway. August 13th., 2012
//
class Ham {
private:
std::vector<unsigned int> _H, _hp, _hv, _x;
public:
bool operator!=(const Ham& other) const {return true;}
Ham begin() const {return *this;}
Ham end() const {return *this;}
unsigned int operator*() const {return _x.back();}
Ham(const std::vector<unsigned int> &pfs):_H(pfs),_hp(pfs.size(),0),_hv({pfs}),_x({1}){}
const Ham& operator++() {
for (int i=0; i<_H.size(); i++) for (;_hv[i]<=_x.back();_hv[i]=_x[++_hp[i]]*_H[i]);
_x.push_back(_hv[0]);
for (int i=1; i<_H.size(); i++) if (_hv[i]<_x.back()) _x.back()=_hv[i];
return *this;
}
};
5-Smooth
int main() {
int count = 1;
for (unsigned int i : Ham({2,3,5})) {
if (count <= 62) std::cout << i << ' ';
if (count++ == 1691) {
std::cout << "\nThe one thousand six hundred and ninety first Hamming Number is " << i << std::endl;
break;
}
}
return 0;
}
Produces:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 40 45 48 50 54 60 64 72 75 80 81 90 96 100 108 120 125 128 135 144 150 160 162 180 192 200 216 225 240 243 250 256 270 288 300 320 324 360 375 384 400 405
The one thousand six hundred and ninety first Hamming Number is 2125764000
7-Smooth
int main() {
int count = 1;
for (unsigned int i : Ham({2,3,5,7})) {
std::cout << i << ' ';
if (count++ == 64) break;
}
std::cout << std::endl;
return 0;
}
Produces:
1 2 3 4 5 6 7 8 9 10 12 14 15 16 18 20 21 24 25 27 28 30 32 35 36 40 42 45 48 49 50 54 56 60 63 64 70 72 75 80 81 84 90 96 98 100 105 108 112 120 125 126 128 135 140 144 147 150 160 162 168 175 180 189
Avoiding Duplicates with Functional Code
If converted to use multi-precision integers (via GMP, as in this code), the above code is slower than it needs to be for several reasons: 1) it uses an adaptation of Dijkstra's original algorithm, which is somewhat slower due to repeated calculations (2 times 3, 3 times 2, etc.), 2) the generator is written generally to handle any series of multiples, and 3) for finding the nth Hamming number, code such as `for (auto hmg : Ham({5, 3, 2}))` makes a copy of the sometimes very large multi-precision number, which can consume more time than the actual computation. The following code isn't particularly fast either, for other reasons that will be discussed, but it avoids duplication of calculations by a modification of the algorithm; it is written functionally as a LazyList (which could easily also have iteration abilities added), with a basic LazyList type defined here since there is no library available:
#include <chrono>
#include <iostream>
#include <gmpxx.h>
#include <functional>
#include <memory>
template<class T>
class Lazy {
public:
T _v;
private:
std::function<T()> _f;
public:
explicit Lazy(std::function<T()> thnk)
: _v(T()), _f(thnk) {};
T value() { // not thread safe!
if (this->_f != nullptr) {
this->_v = this->_f();
this->_f = nullptr;
}
return this->_v;
}
};
template<class T>
class LazyList {
public:
T head;
std::shared_ptr<Lazy<LazyList<T>>> tail;
LazyList(): head(T()) {} // only used in initializing Lazy...
LazyList(T head, std::function<LazyList<T>()> thnk)
: head(head), tail(std::make_shared<Lazy<LazyList<T>>>(thnk)) {}
// default Copy/Move constructors and assignment operators seem to work well enough
bool isEmpty() { return this->tail == nullptr; }
};
typedef std::shared_ptr<mpz_class> PBI;
typedef LazyList<PBI> LL;
typedef std::function<LL(LL)> FLL2LL;
LL merge(LL a, LL b) {
auto ha = a.head; auto hb = b.head;
if (*ha < *hb) {
return LL(ha, [=]() { return merge(a.tail->value(), b); });
} else {
return LL(hb, [=]() { return merge(a, b.tail->value()); });
}
}
LL smult(int m, LL s) {
const auto im = mpz_class(m);
const auto psmlt =
std::make_shared<FLL2LL>([](LL ss) { return ss; });
*psmlt = [=](LL ss) {
return LL(std::make_shared<mpz_class>(*ss.head * im),
[=]() { return (*psmlt)(ss.tail->value()); });
};
return (*psmlt)(s); // worker wrapper pattern with recursive closure as worker...
}
LL u(LL s, int n) {
const auto r = std::make_shared<LL>(LL()); // interior mutable...
*r = smult(n, LL(std::make_shared<mpz_class>(1), [=]() { return *r; }));
if (!s.isEmpty()) { *r = merge(s, *r); }
return *r;
}
LL hammings() {
auto r = LL();
for (auto pn : std::vector<int>({5, 3, 2})) {
r = u(r, pn);
}
return LL(std::make_shared<mpz_class>(1), [=]() { return r; });
}
int main() {
auto hmgs = hammings();
for (auto i = 0; i < 20; ++i) {
std::cout << *hmgs.head << " ";
hmgs = hmgs.tail->value();
}
std::cout << "\n";
hmgs = hammings();
for (auto i = 1; i < 1691; ++i) hmgs = hmgs.tail->value();
std::cout << *hmgs.head << "\n";
auto start = std::chrono::steady_clock::now();
hmgs = hammings();
for (auto i = 1; i < 1000000; ++i) hmgs = hmgs.tail->value();
auto stop = std::chrono::steady_clock::now();
auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
std::cout << *hmgs.head << " in " << ms.count() << "milliseconds.\n";
}
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
519312780448388736089589843750000000000000000000000000000000000000000000000000000000 in 1079 milliseconds.
Note that the repeat loop used to advance to the desired value is written so as not to copy unnecessary Hamming values, unlike the use of the first Generator class.
This shows that one can program functionally in C++, but the performance is many times slower than in a language better suited to functional paradigms such as Haskell or even Kotlin; this is likely due to the cost of memory allocation with (multi-threading-safe) reference counting, and because the memory system isn't tuned for the many small allocations/de-allocations that functional programming generally requires. One can easily see how to adapt this algorithm to the general case by just adding an argument that is a vector of the required multipliers used in the `hammings` function.
There is another problem in using languages such as this that cannot break cyclic references: the code will leak memory due to the cyclic reference chains, and it is perhaps impossible to change the functional algorithm to free them manually, even though the code uses "shared"/reference-counting facilities.
Avoiding Duplicates with Imperative Code
To show that it is the overhead of the functional LazyList that takes most of the time, here is the same algorithm implemented imperatively using vectors, also avoiding duplicate calculations; it is not written as a general function for any set of multipliers, since the extra vector addressing would take some extra time; again, it minimizes copying of values:
#include <chrono>
#include <iostream>
#include <vector>
#include <gmpxx.h>
class Hammings {
private:
const mpz_class _two = 2, _three = 3, _five = 5;
std::vector<mpz_class> _m = {}, _h = {1};
mpz_class _x5 = 5, _x53 = 9, _mrg = 3, _x532 = 2;
int _i = 1, _j = 0;
public:
Hammings() {_m.reserve(65536); _h.reserve(65536); };
bool operator!=(const Hammings& other) const { return true; }
Hammings begin() const { return *this; }
Hammings end() const { return *this; }
mpz_class operator*() { return _h.back(); }
const Hammings& operator++() {
if (_i > _h.capacity() / 2) {
_h.erase(_h.begin(), _h.begin() + _i);
_i = 0;
}
if (_x532 < _mrg) {
_h.push_back(_x532);
_x532 = _h[_i++] * _two;
} else {
_h.push_back(_mrg);
if (_x53 < _x5) {
_mrg = _x53;
_x53 = _m[_j++] * _three;
} else {
_mrg = _x5;
_x5 = _x5 * _five;
}
if (_j > _m.capacity() / 2) {
_m.erase(_m.begin(), _m.begin() + _j);
_j = 0;
}
_m.push_back(_mrg);
}
return *this;
}
};
int main() {
auto cnt = 1;
for (auto hmg : Hammings()) {
if (cnt <= 20) std::cout << hmg << " ";
if (cnt == 20) std::cout << "\n";
if (cnt++ >= 1691) {
std::cout << hmg << "\n";
break;
}
}
auto start = std::chrono::steady_clock::now();
auto&& hmgitr = Hammings();
for (auto i = 1; i < 1000000; ++i) ++hmgitr;
auto stop = std::chrono::steady_clock::now();
auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
std::cout << *hmgitr << " in " << ms.count() << "milliseconds.\n";
}
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36
2125764000
519312780448388736089589843750000000000000000000000000000000000000000000000000000000 in 79 milliseconds.
This code takes about the same amount of time as Haskell for the functional algorithm calculating multi-precision values (also uses GMP; not including the list processing time), but greatly reduces the C++ processing time compared to the functional version due to the use of imperative code and vectors.
Chapel
Chapel is by no means a functional language: although it has some Higher Order Function (HOF) concepts such as zippering, scanning, and reducing of iterations, it lacks closures (functions that can capture variable bindings from the enclosing scope(s)) even though it has first-class functions that can be passed as values and lambdas (anonymous functions), and tail-call optimization of recursive functions and iterators is not guaranteed. However, now that Chapel supports class fields of any type, including references to other classes of any storage type, we can emulate closures using shared classes (shared classes are automatically de-allocated when they have no more references, currently using reference counting). The following code does that for the non-duplicating version of the sequence of Hamming numbers:
Hamming_numbers#Avoiding_generation_of_duplicates
use BigInteger; use Time;
// Chapel doesn't have closure functions that can capture variables from
// outside scope, so we use a class to emulate them for this special case;
// the member fields mult, mrglst, and mltlst, emulate "captured" variables
// that would normally be captured by the `next` continuation closure...
class HammingsList {
const head: bigint;
const mult: uint(8);
var mrglst: shared HammingsList?;
var mltlst: shared HammingsList?;
var tail: shared HammingsList? = nil;
proc init(hd: bigint, mlt: uint(8), mrgl: shared HammingsList?,
mltl: shared HammingsList?) {
head = hd; mult = mlt; mrglst = mrgl; mltlst = mltl; }
proc next(): shared HammingsList {
if tail != nil then return tail: shared HammingsList;
const nhd: bigint = mltlst!.head * mult;
if mrglst == nil then {
tail = new shared HammingsList(nhd, mult,
nil: shared HammingsList?,
nil: shared HammingsList?);
mltlst = mltlst!.next();
tail!.mltlst <=> mltlst;
}
else {
if mrglst!.head < nhd then {
tail = new shared HammingsList(mrglst!.head, mult,
nil: shared HammingsList?,
nil: shared HammingsList?);
mrglst = mrglst!.next(); mrglst <=> tail!.mrglst;
mltlst <=> tail!.mltlst;
}
else {
tail = new shared HammingsList(nhd, mult,
nil: shared HammingsList?,
nil: shared HammingsList?);
mltlst = mltlst!.next(); mltlst <=> tail!.mltlst;
mrglst <=> tail!.mrglst;
}
}
return tail: shared HammingsList;
}
}
proc u(n: uint(8), s: shared HammingsList?): shared HammingsList {
var r = new shared HammingsList(1: bigint, n, s,
nil: shared HammingsList?);
r.mltlst = r; // lazy recursion!
return r.next();
}
iter hammings(): bigint {
var nxt: shared HammingsList? = nil: shared HammingsList?;
const mlts: [ 0 .. 2 ] int = [ 5, 3, 2 ];
for m in mlts do nxt = u(m: uint(8), nxt);
yield 1 : bigint;
while true { yield nxt!.head; nxt = nxt!.next(); }
}
write("The first 20 Hamming numbers are: ");
var cnt: int = 0;
for h in hammings() { write(" ", h); cnt += 1; if cnt >= 20 then break; }
write(".\nThe 1691st Hamming number is ");
cnt = 0;
for h in hammings() { cnt += 1; if cnt < 1691 then continue; write(h); break; }
writeln(".\nThe millionth Hamming number is ");
var timer: Timer; timer.start(); cnt = 0;
for h in hammings() { cnt += 1; if cnt < 1000000 then continue; write(h); break; }
timer.stop(); writeln(".\nThis last took ",
timer.elapsed(TimeUnits.milliseconds), " milliseconds.");
- Output:
The first 20 Hamming numbers are: 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36.
The 1691st Hamming number is 2125764000.
The millionth Hamming number is 519312780448388736089589843750000000000000000000000000000000000000000000000000000000.
This last took 224.652 milliseconds.
The above time is as run on an Intel Skylake i5-6500 at 3.6 GHz (turbo, single-threaded).
It isn't as fast as the following versions due to the many memory allocations and de-allocations typical of functional styles of code, but it is in the same order of speed as many actual functional languages and faster than some, depending on how well their memory management is adapted to this programming paradigm, and also because the "bigint" implementation isn't as fast as the "gmp" package used by many languages for multi-precision integer calculations.
This shows that the functional forms of most algorithms can be translated into Chapel, although some concessions need to be made for the functional facilities that Chapel doesn't have. However, there is one major problem in using languages such as this for functional algorithms when they do not have cycle-breaking capabilities: the code will leak memory due to the cyclic references, and it is generally impossible to rework the functional algorithm to free that memory manually, even though the code uses "shared"/reference-counting facilities.
Alternate Imperative Version Using "Growable" Arrays
In general, we can convert functional algorithms into imperative-style algorithms by using arrays to emulate memoizing lazy lists and simple mutable variables to express the recursion within a while loop, as in the following code (as also used where necessary in the above code). Rather than implement the true Dijkstra merge algorithm, which is slower and uses more memory, the following code implements the better non-duplicating algorithm:
use BigInteger; use Time;
iter nodupsHamming(): bigint {
var s2dom = { 0 .. 1023 }; var s2: [s2dom] bigint; // init so can double!
var s3dom = { 0 .. 1023 }; var s3: [s3dom] bigint; // init so can double!
s2[0] = 1: bigint; s3[0] = 3: bigint;
var x5 = 5: bigint; var mrg = 3: bigint;
var s2hdi, s2tli, s3hdi, s3tli: int;
while true {
s2tli += 1;
if s2hdi + s2hdi >= s2tli { // move in place to avoid allocation!
s2[0 .. s2tli - s2hdi - 1] = s2[s2hdi .. s2tli - 1];
s2tli -= s2hdi; s2hdi = 0; }
const s2sz = s2.size;
if s2tli >= s2sz then s2dom = { 0 .. s2sz + s2sz - 1 };
var rslt: bigint; const s2hd = s2[s2hdi];
if s2hd < mrg { rslt = s2hd; s2hdi += 1; }
else {
s3tli += 1;
if s3hdi + s3hdi >= s3tli { // move in place to avoid allocation!
s3[0 .. s3tli - s3hdi - 1] = s3[s3hdi .. s3tli - 1];
s3tli -= s3hdi; s3hdi = 0; }
const s3sz = s3.size;
if s3tli >= s3sz then s3dom = { 0 .. s3sz + s3sz - 1 };
rslt = mrg; s3[s3tli] = rslt * 3;
s3hdi += 1; const s3hd = s3[s3hdi];
if s3hd < x5 { mrg = s3hd; }
else { mrg = x5; x5 = x5 * 5; s3hdi -= 1; }
}
s2[s2tli] = rslt * 2;
yield rslt;
}
}
// test it...
write("The first 20 hamming numbers are: ");
var cnt = 0: uint(64);
for h in nodupsHamming() {
if cnt >= 20 then break; cnt += 1; write(" ", h); }
write("\nThe 1691st hamming number is "); cnt = 1;
for h in nodupsHamming() {
if cnt >= 1691 { writeln(h); break; } cnt += 1; }
write("The millionth hamming number is ");
var timer: Timer; cnt = 1;
timer.start(); var rslt: bigint;
for h in nodupsHamming() {
if cnt >= 1000000 { rslt = h; break; } cnt += 1; }
timer.stop();
write(rslt);
writeln(".\nThis last took ",
timer.elapsed(TimeUnits.milliseconds), " milliseconds.");
- Output:
The first 20 hamming numbers are: 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 The 1691st hamming number is 2125764000 The millionth hamming number is 519312780448388736089589843750000000000000000000000000000000000000000000000000000000. This last took 114.867 milliseconds.
The above time is as run on an Intel Skylake i5-6500 at 3.6 GHz (turbo, single-threaded).
As you can see, this algorithm is quite fast, as it minimizes the memory allocations/de-allocations, but it still takes considerable time for the many multi-precision (BigInteger) calculations even though the GMP library is being used under-the-covers.
Alternate version using logarithm approximations for sorting order
To greatly reduce the time used for BigInteger calculations, the following algorithm uses logarithmic approximations (real(64)) for internal calculations for sorting and only outputs the final answer(s) as BigInteger(s):
use BigInteger; use Math; use Time;
config const nth: uint(64) = 1000000;
const lb2 = 1: real(64); // log base 2 of 2!
const lb3 = log2(3: real(64)); const lb5 = log2(5: real(64));
record LogRep {
var lg: real(64); var x2: uint(32);
var x3: uint(32); var x5: uint(32);
inline proc mul2(): LogRep {
return new LogRep(this.lg + lb2, this.x2 + 1, this.x3, this.x5); }
inline proc mul3(): LogRep {
return new LogRep(this.lg + lb3, this.x2, this.x3 + 1, this.x5); }
inline proc mul5(): LogRep {
return new LogRep(this.lg + lb5, this.x2, this.x3, this.x5 + 1); }
proc lr2bigint(): bigint {
proc xpnd(bs: uint, v: uint(32)): bigint {
var rslt = 1: bigint; var bsm = bs: bigint; var vm = v: uint;
while vm > 0 { if vm & 1 then rslt *= bsm; bsm *= bsm; vm >>= 1; }
return rslt;
}
return xpnd(2: uint, this.x2) *
xpnd(3: uint, this.x3) * xpnd(5: uint, this.x5);
}
proc writeThis(lr) throws {
lr <~> this.lr2bigint();
}
}
operator <(const ref a: LogRep, const ref b: LogRep): bool { return a.lg < b.lg; }
const one = new LogRep(0, 0, 0, 0);
iter nodupsHammingLog(): LogRep {
var s2dom = { 0 .. 1023 }; var s2: [s2dom] LogRep; // init so can double!
var s3dom = { 0 .. 1023 }; var s3: [s3dom] LogRep; // init so can double!
s2[0] = one; s3[0] = one.mul3();
var x5 = one.mul5(); var mrg = one.mul3();
var s2hdi, s2tli, s3hdi, s3tli: int;
while true {
s2tli += 1;
if s2hdi + s2hdi >= s2tli { // move in place to avoid allocation!
s2[0 .. s2tli - s2hdi - 1] = s2[s2hdi .. s2tli - 1];
s2tli -= s2hdi; s2hdi = 0; }
const s2sz = s2.size;
if s2tli >= s2sz then s2dom = { 0 .. s2sz + s2sz - 1 };
var rslt: LogRep; const s2hd = s2[s2hdi];
if s2hd.lg < mrg.lg { rslt = s2hd; s2hdi += 1; }
else {
s3tli += 1;
if s3hdi + s3hdi >= s3tli { // move in place to avoid allocation!
s3[0 .. s3tli - s3hdi - 1] = s3[s3hdi .. s3tli - 1];
s3tli -= s3hdi; s3hdi = 0; }
const s3sz = s3.size;
if s3tli >= s3sz then s3dom = { 0 .. s3sz + s3sz - 1 };
rslt = mrg; s3[s3tli] = mrg.mul3(); s3hdi += 1;
const s3hd = s3[s3hdi];
if s3hd.lg < x5.lg { mrg = s3hd; }
else { mrg = x5; x5 = x5.mul5(); s3hdi -= 1; }
}
s2[s2tli] = rslt.mul2();
yield rslt;
}
}
// test it...
write("The first 20 hamming numbers are: ");
var cnt = 0: uint(64);
for h in nodupsHammingLog() {
if cnt >= 20 then break; cnt += 1; write(" ", h); }
write("\nThe 1691st hamming number is "); cnt = 1;
for h in nodupsHammingLog() {
if cnt >= 1691 { writeln(h); break; } cnt += 1; }
write("The ", nth, "th hamming number is ");
var timer: Timer; cnt = 1;
timer.start(); var rslt: LogRep;
for h in nodupsHammingLog() {
if cnt >= nth { rslt = h; break; } cnt += 1; }
timer.stop();
write(rslt);
writeln(".\nThis last took ",
timer.elapsed(TimeUnits.milliseconds), " milliseconds.");
- Output:
The first 20 hamming numbers are: 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 The 1691st hamming number is 2125764000 The 1000000th hamming number is 519312780448388736089589843750000000000000000000000000000000000000000000000000000000. This last took 6.372 milliseconds.
The above time is as run on an Intel Skylake i5-6500 at 3.6 GHz (turbo, single-threaded).
As you can see, the time expended for the required task is almost too small to measure, meaning that much of the time expended in the previous versions was just the time doing multi-precision arithmetic; the program takes about 8.1 seconds to find the billionth Hamming number.
Very Fast Algorithm Using a Sorted Error Band
The above code is about as fast as one can go when generating sequences; however, if one is willing to forego sequences and just calculate the nth Hamming number (repeatedly), then some reading on the relationship between the size of the numbers and their indices in the sequence is helpful (Wikipedia: Regular Number). One finds that there is a very distinct relationship which, for larger ranges, quickly narrows to a small error band proportional to the log of the output value. Thus, the following code just scans for logarithmic representations to insert into a sequence for this top error band and extracts the correct nth representation from that band. It reduces time complexity to O(n^(2/3)) from O(n) for the sequence versions, but even more importantly it reduces memory requirements to O(n^(1/3)) from O(n^(2/3)), and thus makes it possible to calculate very large values in the sequence on common personal computers. The size estimate the scan relies on is noted just below, and the code follows:
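As a point of reference (not in the original text), the `lgest` value computed from the `fctr` and `crctn` constants in the code below is just the inverted form of the Wikipedia "Regular number" counting estimate, which in log base two terms reads approximately
\[ \log_2 h(n) \;\approx\; \sqrt[3]{6\,\log_2 3\,\log_2 5\;n} \;-\; \log_2\sqrt{30} \]
so only the candidate exponent triples whose logarithms fall within a narrow band around this estimate need to be generated, counted, and sorted.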
use BigInteger; use Math; use Sort; use Time;
config const nth = 1000000: uint(64);
type TriVal = 3*uint(32);
proc trival2bigint(x: TriVal): bigint {
proc xpnd(bs: uint, v: uint(32)): bigint {
var rslt = 1: bigint; var bsm = bs: bigint; var vm = v: uint;
while vm > 0 { if vm & 1 then rslt *= bsm; bsm *= bsm; vm >>= 1; }
return rslt;
}
const (x2, x3, x5) = x;
return xpnd(2: uint, x2) * xpnd(3: uint, x3) * xpnd(5: uint, x5);
}
proc nthHamming(n: uint(64)): TriVal {
if n < 1 {
writeln("nthHamming - argument must be at least one!"); exit(1); }
if n < 2 then return (0: uint(32), 0: uint(32), 0: uint(32)); // TriVal for 1
type LogRep = (real(64), uint(32), uint(32), uint(32));
record Comparator {} // used for sorting in reverse order!
proc Comparator.compare(a: LogRep, b: LogRep): real(64) {
return b[0] - a[0]; }
var logrepComp: Comparator;
const lb3 = log2(3.0: real(64)); const lb5 = log2(5.0: real(64));
const fctr = 6.0: real(64) * lb3 * lb5;
const crctn = log2(sqrt(30.0: real(64))); // log base 2 of sqrt 30
// from Wikipedia Regular Numbers formula...
const lgest = (fctr * n: real(64))**(1.0: real(64) / 3.0: real(64)) - crctn;
const frctn = if n < 1000000000 then 0.509: real(64) else 0.105: real(64);
const lghi = (fctr * (n: real(64) + frctn * lgest))**
(1.0: real(64) / 3.0: real(64)) - crctn;
const lglo = 2.0: real(64) * lgest - lghi; // lower limit of the upper "band"
var count = 0: uint(64); // need to use extended precision, might go over
var bndi = 0; var dombnd = { 0 .. bndi }; // one value so doubling size works!
var bnd: [dombnd] LogRep; const klmt = (lghi / lb5): uint(32);
for k in 0 .. klmt { // i, j, k values can be just uint(32) values!
const p = k: real(64) * lb5; const jlmt = ((lghi - p) / lb3): uint(32);
for j in 0 .. jlmt {
const q = p + j: real(64) * lb3;
const ir = lghi - q; const lg = q + floor(ir); // current log value (est)
count += ir: uint(64) + 1;
if lg >= lglo {
const sz = dombnd.size; if bndi >= sz then dombnd = { 0..sz + sz - 1 };
bnd[bndi] = (lg, ir: uint(32), j, k); bndi += 1;
}
}
}
if n > count {
writeln("nth_hamming: band high estimate is too low!"); exit(1); }
dombnd = { 0 .. bndi - 1 }; const ndx = (count - n): int;
if ndx >= dombnd.size {
writeln("nth_hamming: band low estimate is too high!"); exit(1); }
sort(bnd, comparator = logrepComp); // descending order leaves zeros at end!
const rslt = bnd[ndx]; return (rslt[1], rslt[2], rslt[3]);
}
// test it...
write("The first 20 Hamming numbers are: ");
for i in 1 .. 20 do write(" ", trival2bigint(nthHamming(i: uint(64))));
writeln("\nThe 1691st hamming number is ",
trival2bigint(nthHamming(1691: uint(64))));
var timer: Timer;
timer.start();
const answr = nthHamming(nth);
timer.stop();
write("The ", nth, "th Hamming number is 2**",
answr[0], " * 3**", answr[1], " * 5**", answr[2]);
const lgrslt = (answr[0]: real(64) + answr[1]: real(64) * log2(3: real(64)) +
answr[2]: real(64) * log2(5: real(64))) * log10(2: real(64));
const whl = lgrslt: uint(64); const frac = lgrslt - whl: real(64);
write(",\nwhich is approximately ", 10: real(64)**frac, "E+", whl);
const bganswr = trival2bigint(answr);
const answrstr = bganswr: string; const asz = answrstr.size;
writeln(" and has ", asz, " digits.");
if asz <= 2000 then write("Can be printed as: ", answrstr);
else write("It's too long to print");
writeln("!\nThis last took ",
timer.elapsed(TimeUnits.milliseconds), " milliseconds.");
- Output:
The first 20 Hamming numbers are: 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 The 1691st hamming number is 2125764000 The 1000000th Hamming number is 2**55 * 3**47 * 5**64, which is approximately 5.19313E+83 and has 84 digits. Can be printed as: 519312780448388736089589843750000000000000000000000000000000000000000000000000000000! This last took 0.0 milliseconds.
As you can see, the execution time is much too small to be measured. The billionth number in the sequence can be calculated in about 15 milliseconds and the trillionth in about 0.359 seconds. The (2^64 - 1)th value (18446744073709551615) cannot be calculated due to a slight overflow problem as it approaches that limit. As well, this version gives inaccurate results above about the 1e13th Hamming number because the log base two (double) approximate representation does not have enough precision to accurately sort the values put into the error band array.
Alternate version with a greatly increased range without error
To solve the problem of inadequate precision in the double log base two representation, the following code uses a BigInt representation of the log value with about twice the significant bits, which is then sufficient to extend the usable range well beyond any reasonable requirement:
use BigInteger; use Math; use Sort; use Time;
config const nth = 1000000: uint(64);
type TriVal = 3*uint(32);
proc trival2bigint(x: TriVal): bigint {
proc xpnd(bs: uint, v: uint(32)): bigint {
var rslt = 1: bigint; var bsm = bs: bigint; var vm = v: uint;
while vm > 0 { if vm & 1 then rslt *= bsm; bsm *= bsm; vm >>= 1; }
return rslt;
}
const (x2, x3, x5) = x;
return xpnd(2: uint, x2) * xpnd(3: uint, x3) * xpnd(5: uint, x5);
}
proc nthHamming(n: uint(64)): TriVal {
if n < 1 {
writeln("nthHamming - argument must be at least one!"); exit(1); }
if n < 2 then return (0: uint(32), 0: uint(32), 0: uint(32)); // TriVal for 1
type LogRep = (bigint, uint(32), uint(32), uint(32));
record Comparator {} // used for sorting in reverse order!
proc Comparator.compare(a: LogRep, b: LogRep): int {
return (b[0] - a[0]): int; }
var logrepComp: Comparator;
const lb3 = log2(3.0: real(64)); const lb5 = log2(5.0: real(64));
const bglb2 = "1267650600228229401496703205376": bigint;
const bglb3 = "2009178665378409109047848542368": bigint;
const bglb5 = "2943393543170754072109742145491": bigint;
const fctr = 6.0: real(64) * lb3 * lb5;
const crctn = log2(sqrt(30.0: real(64))); // log base 2 of sqrt 30
// from Wikipedia Regular Numbers formula...
const lgest = (fctr * n: real(64))**(1.0: real(64) / 3.0: real(64)) - crctn;
const frctn = if n < 1000000000 then 0.509: real(64) else 0.105: real(64);
const lghi = (fctr * (n: real(64) + frctn * lgest))**
(1.0: real(64) / 3.0: real(64)) - crctn;
const lglo = 2.0: real(64) * lgest - lghi; // lower limit of the upper "band"
var count = 0: uint(64); // need to use extended precision, might go over
var bndi = 0; var dombnd = { 0 .. bndi }; // one value so doubling size works!
var bnd: [dombnd] LogRep; const klmt = (lghi / lb5): uint(32);
for k in 0 .. klmt { // i, j, k values can be just uint(32) values!
const p = k: real(64) * lb5; const jlmt = ((lghi - p) / lb3): uint(32);
for j in 0 .. jlmt {
const q = p + j: real(64) * lb3;
const ir = lghi - q; const lg = q + floor(ir); // current log value (est)
count += ir: uint(64) + 1;
if lg >= lglo {
const sz = dombnd.size; if bndi >= sz then dombnd = { 0..sz + sz - 1 };
const bglg =
bglb2 * ir: int(64) + bglb3 * j: int(64) + bglb5 * k: int(64);
bnd[bndi] = (bglg, ir: uint(32), j, k); bndi += 1;
}
}
}
if n > count {
writeln("nth_hamming: band high estimate is too low!"); exit(1); }
dombnd = { 0 .. bndi - 1 }; const ndx = (count - n): int;
if ndx >= dombnd.size {
writeln("nth_hamming: band low estimate is too high!"); exit(1); }
sort(bnd, comparator = logrepComp); // descending order leaves zeros at end!
const rslt = bnd[ndx]; return (rslt[1], rslt[2], rslt[3]);
}
// test it...
write("The first 20 Hamming numbers are: ");
for i in 1 .. 20 do write(" ", trival2bigint(nthHamming(i: uint(64))));
writeln("\nThe 1691st hamming number is ",
trival2bigint(nthHamming(1691: uint(64))));
var timer: Timer;
timer.start();
const answr = nthHamming(nth);
timer.stop();
write("The ", nth, "th Hamming number is 2**",
answr[0], " * 3**", answr[1], " * 5**", answr[2]);
const lgrslt = (answr[0]: real(64) + answr[1]: real(64) * log2(3: real(64)) +
answr[2]: real(64) * log2(5: real(64))) * log10(2: real(64));
const whl = lgrslt: uint(64); const frac = lgrslt - whl: real(64);
write(",\nwhich is approximately ", 10: real(64)**frac, "E+", whl);
const bganswr = trival2bigint(answr);
const answrstr = bganswr: string; const asz = answrstr.size;
writeln(" and has ", asz, " digits.");
if asz <= 2000 then write("Can be printed as: ", answrstr);
else write("It's too long to print");
writeln("!\nThis last took ",
timer.elapsed(TimeUnits.milliseconds), " milliseconds.");
The above code has the same output as before and doesn't take an appreciably different amount of time to execute; it can produce the billionth Hamming number in about 31 milliseconds, the trillionth in about 0.546 seconds, and the thousand trillionth (which is now possible without error) in about 39.36 seconds. Thus, it successfully extends the usable range of the algorithm to near the maximum expressible 64-bit number in a few hours of execution time on a modern desktop computer, although the (2^64 - 1)th Hamming number can't be found due to the restrictions of the expressible range limit in sizing the required error band.
That said, if one actually needed a sequence of Hamming numbers over a fairly large range, one would likely be better off applying this last adjustment to the final logarithmic sequence version above: although this error-band version is extremely fast for single values, the accumulated cost of repeated use will exceed the incremental cost of the sequence version beyond some range.
Clojure
This version implements Dijkstra's merge solution, so is closely related to the Haskell version.
(defn smerge [xs ys]
(lazy-seq
(let [x (first xs),
y (first ys),
[z xs* ys*]
(cond
(< x y) [x (rest xs) ys]
(> x y) [y xs (rest ys)]
:else [x (rest xs) (rest ys)])]
(cons z (smerge xs* ys*)))))
(def hamming
(lazy-seq
(->> (map #(*' 5 %) hamming)
(smerge (map #(*' 3 %) hamming))
(smerge (map #(*' 2 %) hamming))
(cons 1))))
Note that the above version uses a lot of space and time after calculating a few hundred thousand elements of the sequence. This is no doubt due to not avoiding the generation of duplicates in the sequences as well as its "holding on to the head": it maintains the entire generated sequences in memory.
Avoiding duplicates and reducing memory use
In order to fix the problems with the above program as to memory use and extra time expended, the following code implements the Haskell idea as a function so that it does not retain the heads of the streams used, allowing them to be garbage collected from the beginning as they are consumed. It avoids duplicate number generation by using intermediate streams for each of the multiples, building each on the results of the last; also, it orders the streams from least dense to most dense so that the intermediate streams retained are as short as possible, with the "s5" stream spanning only from one fifth to a third of the current value, the "s35" stream only between a third and a half of the current output value, and the "s235" stream only between a half and the current output - as the sequence is not very dense with increasing range, not many values need be retained:
(defn hamming
"Computes the unbounded sequence of Hamming 235 numbers."
[]
(letfn [(merge [xs ys]
(if (nil? xs) ys
(let [xv (first xs), yv (first ys)]
(if (< xv yv) (cons xv (lazy-seq (merge (next xs) ys)))
(cons yv (lazy-seq (merge xs (next ys)))))))),
(smult [m s] ;; equiv to map (* m) s -- faster
(cons (*' m (first s)) (lazy-seq (smult m (next s))))),
(u [s n] (let [r (atom nil)]
(reset! r (merge s (smult n (cons 1 (lazy-seq @r)))))))]
(cons 1 (lazy-seq (reduce u nil (list 5 3 2))))))
Much of the time expended for larger ranges (say 10 million or more) is due to the time doing extended precision arithmetic, with also a significant percentage spent in garbage collection. Following is the output from the REPL after compiling the program:
After compiling code in REPL:
- Output:
(take 20 (hamming)) (1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36) (->> (hamming) (drop 1690) (first) (time)) "Elapsed time: 1.105582 msecs" 2125764000 (->> (hamming) (drop 999999) (first) (time)) "Elapsed time: 447.561128 msecs" 519312780448388736089589843750000000000000000000000000000000000000000000000000000000N
So that generated '.class' files in a folder or a generated '.jar' file (possibly standalone, containing the run time library) run at about the same speed as inside the IDE (after compilation), the Leiningen "project.clj" file needs to be modified to contain the following line so as to eliminate JVM options that slow the performance:
:jvm-opts ^:replace []
CoffeeScript
# Generate hamming numbers in order. Hamming numbers have the
# property that they don't evenly divide any prime numbers outside
# a given set, such as [2, 3, 5].
generate_hamming_sequence = (primes, max_n) ->
# We use a lazy algorithm, only ever keeping N candidates
# in play, one for each of our seed primes. Let's say
# primes is [2,3,5]. Our virtual streams are these:
#
# hammings: 1,2,3,4,5,6,8,9,10,12,15,16,18,20,...
# hammings*2: 2,4,6,8,10,12,16,18,20,24,30,32,36,40,...
# hammings*3: 3,6,9,12,15,18,24,27,30,36,45,...
# hammings*5: 5,10,15,20,25,30,40,45,50,...
#
# After encountering 40 for the last time, our candidates
# will be
# 50 = 2 * 25
# 45 = 3 * 15
# 50 = 5 * 10
# Then, after 45
# 50 = 2 * 25
# 48 = 3 * 16 <= new
# 50 = 5 * 10
hamming_numbers = [1]
candidates = ([p, p, 1] for p in primes)
last_number = 1
while hamming_numbers.length < max_n
# Get the next candidate Hamming Number tuple.
i = min_idx(candidates)
candidate = candidates[i]
[n, p, seq_idx] = candidate
# Add to sequence unless it's a duplicate.
if n > last_number
hamming_numbers.push n
last_number = n
# Replace the candidate with its successor (based on
# p = 2, 3, or 5).
#
# This is the heart of the algorithm. Let's say, over the
# primes [2,3,5], we encounter the hamming number 32 based on it being
# 2 * 16, where 16 is the 12th number in the sequence.
# We'll be passed in [32, 2, 12] as candidate, and
# hamming_numbers will be [1,2,3,4,5,6,8,9,10,12,15,16,18,...]
# by now. The next candidate we need to enqueue is
# [36, 2, 13], where the numbers mean this:
#
# 36 - next multiple of 2 of a Hamming number
# 2 - prime number
# 13 - 1-based index of 18 in the sequence
#
# When we encounter [36, 2, 13], we will then enqueue
# [40, 2, 14], based on 20 being the 14th hamming number.
q = hamming_numbers[seq_idx]
candidates[i] = [p*q, p, seq_idx+1]
hamming_numbers
min_idx = (arr) ->
# Don't waste your time reading this--it just returns
# the index of the smallest tuple in an array, respecting that
# the tuples may contain integers. (CS compiles to JS, which is
# kind of stupid about sorting. There are libraries to work around
# the limitation, but I wanted this code to be standalone.)
less_than = (tup1, tup2) ->
i = 0
while i < tup2.length
return true if tup1[i] <= tup2[i]
return false if tup1[i] > tup2[i]
i += 1
min_i = 0
for i in [1...arr.length]
if less_than arr[i], arr[min_i]
min_i = i
return min_i
primes = [2, 3, 5]
numbers = generate_hamming_sequence(primes, 10000)
console.log numbers[1690]
console.log numbers[9999]
Common Lisp
Maintaining three queues, popping the smallest value every time.
(defun next-hamm (factors seqs)
(let ((x (apply #'min (map 'list #'first seqs))))
(loop for s in seqs
for f in factors
for i from 0
with add = t do
(if (= x (first s)) (pop s))
;; prevent a value from being added to multiple lists
(when add
(setf (elt seqs i) (nconc s (list (* x f))))
(if (zerop (mod x f)) (setf add nil)))
finally (return x))))
(loop with factors = '(2 3 5)
with seqs = (loop for i in factors collect '(1))
for n from 1 to 1000001 do
(let ((x (next-hamm factors seqs)))
(if (or (< n 21)
(= n 1691)
(= n 1000000)) (format t "~d: ~d~%" n x))))
A much faster method:
(defun hamming (n)
(let ((fac '(2 3 5))
(idx (make-array 3 :initial-element 0))
(h (make-array (1+ n)
:initial-element 1
:element-type 'integer)))
(loop for i from 1 to n
with e with x = '(1 1 1) do
(setf e (setf (aref h i) (apply #'min x))
x (loop for y in x
for f in fac
for j from 0
collect (if (= e y) (* f (aref h (incf (aref idx j)))) y))))
(aref h n)))
(loop for i from 1 to 20 do
(format t "~2d: ~d~%" i (hamming i)))
(loop for i in '(1691 1000000) do
(format t "~d: ~d~%" i (hamming i)))
- Output:
1: 1 2: 2 3: 3 4: 4 5: 5 6: 6 7: 8 8: 9 9: 10 10: 12 11: 15 12: 16 13: 18 14: 20 15: 24 16: 25 17: 27 18: 30 19: 32 20: 36 1691: 2125764000 1000000: 519312780448388736089589843750000000000000000000000000000000000000000000000000000000
Crystal
require "big"
def hamming(limit)
h = Array.new(limit, 1.to_big_i) # h = Array.new(limit+1, 1.to_big_i)
x2, x3, x5 = 2.to_big_i, 3.to_big_i, 5.to_big_i
i, j, k = 0, 0, 0
(1...limit).each do |n| # (1..limit).each do |n|
h[n] = Math.min(x2, Math.min(x3, x5))
x2 = 2 * h[i += 1] if x2 == h[n]
x3 = 3 * h[j += 1] if x3 == h[n]
x5 = 5 * h[k += 1] if x5 == h[n]
end
h[limit - 1]
end
start = Time.monotonic
print "Hamming Number (1..20): "; (1..20).each { |i| print "#{hamming(i)} " }
puts
puts "Hamming Number 1691: #{hamming 1691}"
puts "Hamming Number 1,000,000: #{hamming 1_000_000}"
puts "Elasped Time: #{(Time.monotonic - start).total_seconds} secs"
System: I7-6700HQ, 3.5 GHz, Linux Kernel 5.6.17, Crystal 0.35 Run as: $ crystal run hammingnumbers.cr --release
- Output:
Hamming Number (1..20): 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 Hamming Number 1691: 2125764000 Hamming Number 1,000,000: 519312780448388736089589843750000000000000000000000000000000000000000000000000000000 Elapsed Time: 0.21420532 secs
Functional Non-Duplicates Version
The above implementation is true to the original Dijkstra algorithm, but it's one of the few times where Dijkstra's analysis wasn't complete; a later algorithm was developed that is at least twice as fast because it only processes non-duplicate Hamming numbers and keeps only the numbers necessary for further extension of the sequence (the tails of the lists). Although Crystal isn't really a functional language, it is capable of enough functional forms of code to implement this newer algorithm. The algorithm requires lazy lists, for which Crystal currently has no library module, but as Crystal does have full first-class functions, including the ability to capture environment variables as closures, the `LazyList` type is easy enough to implement, as in the following code:
require "big"
# Unlike some languages like Kotlin, Crystal doesn't have a Lazy module,
# but it has closures, so it is easy to implement a LazyList class;
# Memoizes the results of the thunk so only executed once...
class LazyList(T)
getter head
@tail : LazyList(T)? = nil
def initialize(@head : T, @thnk : Proc(LazyList(T)))
end
def initialize(@head : T, @thnk : Proc(Nil))
end
def initialize(@head : T, @thnk : Nil)
end
def tail # not thread safe without a lock/mutex...
if thnk = @thnk
@tail = thnk.call; @thnk = nil
end
@tail
end
end
class Hammings
include Iterator(BigInt)
private BASES = [ 5, 3, 2 ] of Int32
private EMPTY = nil.as(LazyList(BigInt)?)
@ll : LazyList(BigInt)
def initialize
rst = uninitialized LazyList(BigInt)
BASES.each.accumulate(EMPTY) { |u, n| Hammings.unify(u, n) }
.skip(1).each { |ll| rst = ll.not_nil! }
@ll = LazyList.new(BigInt.new(1), ->{ rst } )
end
protected def self.unify(s : LazyList(BigInt)?, n : Int32)
r = uninitialized LazyList(BigInt)?
if ss = s
r = merge(ss, mults(n, LazyList.new(BigInt.new(1), -> { r.not_nil! })))
else
r = mults(n, LazyList.new(BigInt.new(1), -> { r.not_nil! }))
end
r
end
private def self.mults(m : Int32, lls : LazyList(BigInt))
mlts = uninitialized Proc(LazyList(BigInt), LazyList(BigInt))
mlts = -> (ill : LazyList(BigInt)) {
LazyList.new(ill.head * m, -> { mlts.call(ill.tail.not_nil!) }) }
mlts.call(lls)
end
private def self.merge(x : LazyList(BigInt), y : LazyList(BigInt))
xhd = x.head; yhd = y.head
if xhd < yhd
LazyList.new(xhd, -> { merge(x.tail.not_nil!, y) })
else
LazyList.new(yhd, -> { merge(x, y.tail.not_nil!) })
end
end
def next
rslt = @ll.head; @ll = @ll.tail.not_nil!; rslt
end
end
print "The first 20 Hamming numbers are: "
Hammings.new.first(20).each { |h| print(" ", h) }
print ".\r\nThe 1691st Hamming number is "
Hammings.new.skip(1690).first(1).each { |h| print h }
print ".\r\nThe millionth Hamming number is "
start_time = Time.monotonic
Hammings.new.skip(999_999).first(1).each { |h| print h }
elpsd = (Time.monotonic - start_time).total_milliseconds
printf(".\r\nThis last took %f milliseconds.\r\n", elpsd)
- Output:
The first 20 Hamming numbers are: 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36. The 1691st Hamming number is 2125764000. The millionth Hamming number is 519312780448388736089589843750000000000000000000000000000000000000000000000000000000. This last took 162.713293 milliseconds.
The time is as run on an Intel Skylake i5-6500 CPU at 3.6 GHz, single-threaded, as for the other timings here. The code is a little slower than the fastest functional languages, such as Haskell or Kotlin, because the Boehm garbage collector used by Crystal isn't as well tuned for the many small allocations required by functional forms of code such as the `LazyList` as those other languages, which use memory pools to reduce the allocation/deallocation time for many small blocks of memory; that said, many common languages are much slower than this for functional algorithms because their memory allocators are even slower and less tuned for this use.
About a quarter of the time is spent doing extended-precision calculations (which time will increase disproportionately to the range as the numbers get larger), but over two thirds of the time is spent just handling memory allocations/deallocations.
Functional Non-Duplicates Version Using Log Estimations
In order to show the time expended in multi-precision integer calculations, the following code implements the same algorithm as above but uses logarithmic estimations rather than multi-precision integer arithmetic to compute each instance of the Hamming number sequence, only converting to `BigInt` for the results:
require "big"
# Unlike some languages like Kotlin, Crystal doesn't have a Lazy module,
# but it has closures, so it is easy to implement a LazyList class;
# Memoizes the results of the thunk so only executed once...
class LazyList(T)
getter head
@tail : LazyList(T)? = nil
def initialize(@head : T, @thnk : Proc(LazyList(T)))
end
def initialize(@head : T, @thnk : Proc(Nil))
end
def initialize(@head : T, @thnk : Nil)
end
def tail # not thread safe without a lock/mutex...
if thnk = @thnk
@tail = thnk.call; @thnk = nil
end
@tail
end
end
class LogRep
private LOG2_2 = 1.0_f64
private LOG2_3 = Math.log2 3.0_f64
private LOG2_5 = Math.log2 5.0_f64
def initialize(@logrep : Float64, @x2 : Int32, @x3 : Int32, @x5 : Int32)
end
def self.mult2(x : LogRep)
LogRep.new(x.@logrep + LOG2_2, x.@x2 + 1, x.@x3, x.@x5)
end
def self.mult3(x : LogRep)
LogRep.new(x.@logrep + LOG2_3, x.@x2, x.@x3 + 1, x.@x5)
end
def self.mult5(x : LogRep)
LogRep.new(x.@logrep + LOG2_5, x.@x2, x.@x3, x.@x5 + 1)
end
def <(other : LogRep)
self.@logrep < other.@logrep
end
def toBigInt
expnd = -> (x : Int32, mlt : Int32) do
rslt = BigInt.new(1); m = BigInt.new(mlt)
while x > 0
rslt *= m if (x & 1) > 0; m *= m; x >>= 1
end
rslt
end
expnd.call(@x2, 2) * expnd.call(@x3, 3) * expnd.call(@x5, 5)
end
end
class HammingsLogRep
include Iterator(LogRep)
private BASES = [ -> (x : LogRep) { LogRep.mult5 x },
-> (x : LogRep) { LogRep.mult3 x },
-> (x : LogRep) { LogRep.mult2 x } ]
private EMPTY = nil.as(LazyList(LogRep)?)
private ONE = LogRep.new(0.0, 0, 0, 0)
@ll : LazyList(LogRep)
def initialize
rst = uninitialized LazyList(LogRep)
BASES.each.accumulate(EMPTY) { |u, n| HammingsLogRep.unify(u, n) }
.skip(1).each { |ll| rst = ll.not_nil! }
@ll = LazyList.new(ONE, ->{ rst } )
end
protected def self.unify(s : LazyList(LogRep)?, n : LogRep -> LogRep)
r = uninitialized LazyList(LogRep)?
if ss = s
r = merge(ss, mults(n, LazyList.new(ONE, -> { r.not_nil! })))
else
r = mults(n, LazyList.new(ONE, -> { r.not_nil! }))
end
r
end
private def self.mults(m : LogRep -> LogRep, lls : LazyList(LogRep))
mlts = uninitialized Proc(LazyList(LogRep), LazyList(LogRep))
mlts = -> (ill : LazyList(LogRep)) {
LazyList.new(m.call(ill.head), -> { mlts.call(ill.tail.not_nil!) }) }
mlts.call(lls)
end
private def self.merge(x : LazyList(LogRep), y : LazyList(LogRep))
xhd = x.head; yhd = y.head
if xhd < yhd
LazyList.new(xhd, -> { merge(x.tail.not_nil!, y) })
else
LazyList.new(yhd, -> { merge(x, y.tail.not_nil!) })
end
end
def next
rslt = @ll.head; @ll = @ll.tail.not_nil!; rslt
end
end
print "The first 20 Hamming numbers are: "
HammingsLogRep.new.first(20).each { |h| print(" ", h.toBigInt) }
print ".\r\nThe 1691st Hamming number is "
HammingsLogRep.new.skip(1690).first(1).each { |h| print h.toBigInt }
print ".\r\nThe millionth Hamming number is "
start_time = Time.monotonic
HammingsLogRep.new.skip(999_999).first(1).each { |h| print h.toBigInt }
elpsd = (Time.monotonic - start_time).total_milliseconds
printf(".\r\nThis last took %f milliseconds.\r\n", elpsd)
- Output:
The first 20 Hamming numbers are: 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36. The 1691st Hamming number is 2125764000. The millionth Hamming number is 519312780448388736089589843750000000000000000000000000000000000000000000000000000000. This last took 131.661941 milliseconds.
As can be seen by comparing with the above results using the same Intel Skylake i5-6500 CPU, this is about 20 percent faster due to less time spent doing the increasingly long multi-precision `BigInt` calculations. Note that using a `struct` rather than a `class` would make this code about twice as slow due to the larger memory copies required when copying values rather than reference pointers.
Functional Non-Duplicates Version Using Log Estimations and Imperative Code
To show that the majority of the time for the above implementations is used in memory allocations/deallocations for the functional lazy-list form of code, the following code implements the algorithm imperatively using home-grown "growable" arrays; these were hand implemented using pointer allocations to avoid the automatic bounds checking done for conventional Arrays. Note that `LogRep` is now a `struct` rather than a `class`, as there are now few value copies and this avoids the quite large amount of time that would be required to allocate/deallocate memory if a `class` were used:
require "big"
struct LogRep
private LOG2_2 = 1.0_f64
private LOG2_3 = Math.log2 3.0_f64
private LOG2_5 = Math.log2 5.0_f64
def initialize(@logrep : Float64, @x2 : Int32, @x3 : Int32, @x5 : Int32)
end
def mult2
LogRep.new(@logrep + LOG2_2, @x2 + 1, @x3, @x5)
end
def mult3
LogRep.new(@logrep + LOG2_3, @x2, @x3 + 1, @x5)
end
def mult5
LogRep.new(@logrep + LOG2_5, @x2, @x3, @x5 + 1)
end
def <(other : LogRep)
self.@logrep < other.@logrep
end
def toBigInt
expnd = -> (x : Int32, mlt : Int32) do
rslt = BigInt.new(1); m = BigInt.new(mlt)
while x > 0
rslt *= m if (x & 1) > 0; m *= m; x >>= 1
end
rslt
end
expnd.call(@x2, 2) * expnd.call(@x3, 3) * expnd.call(@x5, 5)
end
end
class HammingsImpLogRep
include Iterator(LogRep)
private ONE = LogRep.new(0.0, 0, 0, 0)
# use pointers to avoid bounds checking...
@s2 = Pointer(LogRep).malloc 1024; @s3 = Pointer(LogRep).malloc 1024
@s5 : LogRep = ONE.mult5; @mrg : LogRep = ONE.mult3
@s2sz = 1024; @s3sz = 1024
@s2hdi = 0; @s2tli = 0; @s3hdi = 0; @s3tli = 0
def initialize
@s2[0] = ONE; @s3[0] = ONE.mult3
end
def next
@s2tli += 1
if @s2hdi + @s2hdi >= @s2sz # unused is half of used
@s2.move_from(@s2 + @s2hdi, @s2tli - @s2hdi)
@s2tli -= @s2hdi; @s2hdi = 0
end
if @s2tli >= @s2sz # grow array, copying former contents
@s2sz += @s2sz; ns2 = Pointer(LogRep).malloc @s2sz
ns2.move_from(@s2, @s2tli); @s2 = ns2
end
rsltp = @s2 + @s2hdi;
if rsltp.value < @mrg
@s2[@s2tli] = rsltp.value.mult2; @s2hdi += 1
else
@s3tli += 1
if @s3hdi + @s3hdi >= @s3sz # unused is half of used
@s3.move_from(@s3 + @s3hdi, @s3tli - @s3hdi)
@s3tli -= @s3hdi; @s3hdi = 0
end
if @s3tli >= @s3sz # grow array, copying former contents
@s3sz += @s3sz; ns3 = Pointer(LogRep).malloc @s3sz
ns3.move_from(@s3, @s3tli); @s3 = ns3
end
@s2[@s2tli] = @mrg.mult2; @s3[@s3tli] = @mrg.mult3
@s3hdi += 1; ns3hdp = @s3 + @s3hdi
rslt = @mrg; rsltp = pointerof(rslt)
if ns3hdp.value < @s5
@mrg = ns3hdp.value
else
@mrg = @s5; @s5 = @s5.mult5; @s3hdi -= 1
end
end
rsltp.value
end
end
print "The first 20 Hamming numbers are: "
HammingsImpLogRep.new.first(20).each { |h| print(" ", h.toBigInt) }
print ".\r\nThe 1691st Hamming number is "
HammingsImpLogRep.new.skip(1690).first(1).each { |h| print h.toBigInt }
print ".\r\nThe millionth Hamming number is "
start_time = Time.monotonic
HammingsImpLogRep.new.skip(999_999).first(1).each { |h| print h.toBigInt }
elpsd = (Time.monotonic - start_time).total_milliseconds
printf(".\r\nThis last took %f milliseconds.\r\n", elpsd)
- Output:
The first 20 Hamming numbers are: 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36. The 1691st Hamming number is 2125764000. The millionth Hamming number is 519312780448388736089589843750000000000000000000000000000000000000000000000000000000. This last took 7.330211 milliseconds.
As can be seen by comparing with the above results using the same Intel Skylake i5-6500 CPU, this is about eighteen times faster than the functional version also using logarithmic representations due to less time spent doing memory allocations/deallocations by using the imperative form of code. This version can find the billionth Hamming number in about 7.6 seconds on this machine.
D
Basic Version
This version keeps all numbers in memory, computing all the Hamming numbers up to the needed one. Performs constant number of operations per Hamming number produced.
import std.stdio, std.bigint, std.algorithm, std.range, core.memory;
auto hamming(in uint n) pure nothrow /*@safe*/ {
immutable BigInt two = 2, three = 3, five = 5;
auto h = new BigInt[n];
h[0] = 1;
BigInt x2 = 2, x3 = 3, x5 = 5;
size_t i, j, k;
foreach (ref el; h.dropOne) {
el = min(x2, x3, x5);
if (el == x2) x2 = two * h[++i];
if (el == x3) x3 = three * h[++j];
if (el == x5) x5 = five * h[++k];
}
return h.back;
}
void main() {
GC.disable;
iota(1, 21).map!hamming.writeln;
1_691.hamming.writeln;
1_000_000.hamming.writeln;
}
- Output:
[1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36] 2125764000 519312780448388736089589843750000000000000000000000000000000000000000000000000000000
Runtime is about 1.6 seconds with LDC2.
Alternative Version 1
This keeps numbers in memory, but over-computes the sequence, calculating extra multiples past the target as well. It incurs an extra logarithmic factor of operations per number produced (reinserting its multiples into a tree). It doesn't stop when the target number is reached, instead continuing until it is no longer needed:
import std.stdio, std.bigint, std.container, std.algorithm, std.range,
core.memory;
BigInt hamming(in int n)
in {
assert(n > 0);
} body {
auto frontier = redBlackTree(2.BigInt, 3.BigInt, 5.BigInt);
auto lowest = 1.BigInt;
foreach (immutable _; 1 .. n) {
lowest = frontier.front;
frontier.removeFront;
frontier.insert(lowest * 2);
frontier.insert(lowest * 3);
frontier.insert(lowest * 5);
}
return lowest;
}
void main() {
GC.disable;
writeln("First 20 Hamming numbers: ", iota(1, 21).map!hamming);
writeln("hamming(1691) = ", 1691.hamming);
writeln("hamming(1_000_000) = ", 1_000_000.hamming);
}
- Output:
First 20 Hamming numbers: [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36] hamming(1691) = 2125764000 hamming(1_000_000) = 519312780448388736089589843750000000000000000000000000000000000000000000000000000000
About 3.2 seconds run time with LDC2.
Alternative Version 2
Does exactly what the first version does, creating an array and filling it with Hamming numbers, keeping the three back pointers into the sequence for the next-multiple calculations, except that it represents the numbers as triples of their factor exponents together with their logarithm values (for comparisons), thus saving on BigInt calculations.
import std.stdio: writefln;
import std.bigint: BigInt;
import std.conv: text;
import std.numeric: gcd;
import std.algorithm: copy, map;
import std.array: array;
import core.stdc.stdlib: calloc;
import std.math: log; // ^^
// Number of factors.
enum NK = 3;
enum MAX_HAM = 10_000_000;
static assert(gcd(NK, MAX_HAM) == 1);
enum int[NK] factors = [2, 3, 5];
/// K-smooth numbers (stored as their exponents of each factor).
struct Hamming {
double v; // Log of the number, for convenience.
ushort[NK] e; // Exponents of each factor.
public static __gshared immutable double[factors.length] inc =
factors[].map!log.array;
bool opEquals(in ref Hamming y) const pure nothrow @nogc {
//return this.e == y.e; // Too much slow.
foreach (immutable i; 0 .. this.e.length)
if (this.e[i] != y.e[i])
return false;
return true;
}
void update() pure nothrow @nogc {
//this.v = dotProduct(inc, this.e); // Too much slow.
this.v = 0.0;
foreach (immutable i; 0 .. this.e.length)
this.v += inc[i] * this.e[i];
}
string toString() const {
BigInt result = 1;
foreach (immutable i, immutable f; factors)
result *= f.BigInt ^^ this.e[i];
return result.text;
}
}
// Global variables.
__gshared Hamming[] hams;
__gshared Hamming[NK] values;
nothrow @nogc static this() {
// Slower than calloc if you don't use all the MAX_HAM items.
//hams = new Hamming[MAX_HAM];
auto ptr = cast(Hamming*)calloc(MAX_HAM, Hamming.sizeof);
static const err = new Error("Not enough memory.");
if (!ptr)
throw err;
hams = ptr[0 .. MAX_HAM];
foreach (immutable i, ref v; values) {
v.e[i] = 1;
v.v = Hamming.inc[i];
}
}
ref Hamming getHam(in size_t n) nothrow @nogc
in {
assert(n <= MAX_HAM);
} body {
// Most of the time v can be just incremented, but eventually
// floating point precision will bite us, so better recalculate.
__gshared static size_t[NK] idx;
__gshared static int n_hams;
for (; n_hams < n; n_hams++) {
{
// Find the index of the minimum v.
size_t ni = 0;
foreach (immutable i; 1 .. NK)
if (values[i].v < values[ni].v)
ni = i;
hams[n_hams] = values[ni];
hams[n_hams].update;
}
foreach (immutable i; 0 .. NK)
if (values[i] == hams[n_hams]) {
values[i] = hams[idx[i]];
idx[i]++;
values[i].e[i]++;
values[i].update;
}
}
return hams[n - 2];
}
void main() {
foreach (immutable n; [1691, 10 ^^ 6, MAX_HAM])
writefln("%8d: %s", n, n.getHam);
}
The output is similar to the second C version. Runtime is about 0.11 seconds if MAX_HAM = 1_000_000 (as the task requires), and 0.90 seconds if MAX_HAM = 10_000_000.
Alternative Version 3
This version is similar to the preceding one, but frees unused values. It's a little slower than the preceding version, but it uses much less RAM, so it allows computing the result for larger n.
import std.stdio: writefln;
import std.bigint: BigInt;
import std.conv: text;
import std.algorithm: map;
import std.array: array;
import core.stdc.stdlib: malloc, calloc, free;
import std.math: log; // ^^
// Number of factors.
enum NK = 3;
__gshared immutable int[NK] primes = [2, 3, 5];
__gshared immutable double[NK] lnPrimes = primes[].map!log.array;
/// K-smooth numbers (stored as their exponents of each factor).
struct Hamming {
double ln; // Log of the number.
ushort[NK] e; // Exponents of each factor.
Hamming* next;
size_t n;
// Recompute the logarithm from the exponents.
void recalculate() pure nothrow @safe @nogc {
this.ln = 0.0;
foreach (immutable i, immutable ei; this.e)
this.ln += lnPrimes[i] * ei;
}
string toString() const {
BigInt result = 1;
foreach (immutable i, immutable f; primes)
result *= f.BigInt ^^ this.e[i];
return result.text;
}
}
Hamming getHam(in size_t n) nothrow @nogc
in {
assert(n && n != size_t.max);
} body {
static struct Candidate {
typeof(Hamming.ln) ln;
typeof(Hamming.e) e;
void increment(in size_t n) pure nothrow @safe @nogc {
e[n] += 1;
ln += lnPrimes[n];
}
bool opEquals(T)(in ref T y) const pure nothrow @safe @nogc {
// return this.e == y.e; // Slow.
return !((this.e[0] ^ y.e[0]) |
(this.e[1] ^ y.e[1]) |
(this.e[2] ^ y.e[2]));
}
int opCmp(T)(in ref T y) const pure nothrow @safe @nogc {
return (ln > y.ln) ? 1 : (ln < y.ln ? -1 : 0);
}
}
static struct HammingIterator { // Not a Range.
Candidate cand;
Hamming* base;
size_t primeIdx;
this(in size_t i, Hamming* b) pure nothrow @safe @nogc {
primeIdx = i;
base = b;
cand.e = base.e;
cand.ln = base.ln;
cand.increment(primeIdx);
}
void next() pure nothrow @safe @nogc {
base = base.next;
cand.e = base.e;
cand.ln = base.ln;
cand.increment(primeIdx);
}
}
HammingIterator[NK] its;
Hamming* head = cast(Hamming*)calloc(Hamming.sizeof, 1);
Hamming* freeList, cur = head;
Candidate next;
foreach (immutable i, ref it; its)
it = HammingIterator(i, cur);
for (size_t i = cur.n = 1; i < n; ) {
auto leastReferenced = size_t.max;
next.ln = double.max;
foreach (ref it; its) {
if (it.cand == *cur)
it.next;
if (it.base.n < leastReferenced)
leastReferenced = it.base.n;
if (it.cand < next)
next = it.cand;
}
// Collect unreferenced numbers.
while (head.n < leastReferenced) {
auto tmp = head;
head = head.next;
tmp.next = freeList;
freeList = tmp;
}
if (!freeList) {
cur.next = cast(Hamming*)malloc(Hamming.sizeof);
} else {
cur.next = freeList;
freeList = freeList.next;
}
cur = cur.next;
version (fastmath) {
cur.ln = next.ln;
cur.e = next.e;
} else {
cur.e = next.e;
cur.recalculate; // Prevent FP error accumulation.
}
cur.n = i++;
cur.next = null;
}
auto result = *cur;
version (leak) {}
else {
while (head) {
auto tmp = head;
head = head.next;
tmp.free;
}
while (freeList) {
auto tmp = freeList;
freeList = freeList.next;
tmp.free;
}
}
return result;
}
void main() {
foreach (immutable n; [1691, 10 ^^ 6, 10_000_000])
writefln("%8d: %s", n, n.getHam);
}
The output is the same as the second alternative version.
Dart
In order to produce reasonable ranges of Hamming numbers, one needs the BigInt type, but processing many BigInt's while generating a sequence slows the code; for that reason the following code records the determined values as a combination of an approximation of the log base two value and the triple of the powers of two, three and five, only generating the final output values as BigInt's as required:
import 'dart:math';
final lb2of2 = 1.0;
final lb2of3 = log(3.0) / log(2.0);
final lb2of5 = log(5.0) / log(2.0);
class Trival {
final double log2;
final int twos;
final int threes;
final int fives;
Trival mul2() {
return Trival(this.log2 + lb2of2, this.twos + 1, this.threes, this.fives);
}
Trival mul3() {
return Trival(this.log2 + lb2of3, this.twos, this.threes + 1, this.fives);
}
Trival mul5() {
return Trival(this.log2 + lb2of5, this.twos, this.threes, this.fives + 1);
}
@override String toString() {
return this.log2.toString() + " "
+ this.twos.toString() + " "
+ this.threes.toString() + " "
+ this.fives.toString();
}
const Trival(this.log2, this.twos, this.threes, this.fives);
}
Iterable<Trival> makeHammings() sync* {
var one = Trival(0.0, 0, 0, 0);
yield(one);
var s532 = one.mul2();
var mrg = one.mul3();
var s53 = one.mul3().mul3(); // equivalent to 9 for advance step
var s5 = one.mul5();
var i = -1; var j = -1;
List<Trival> h = [];
List<Trival> m = [];
Trival rslt;
while (true) {
if (s532.log2 < mrg.log2) {
rslt = s532; h.add(s532); ++i; s532 = h[i].mul2();
} else {
rslt = mrg; h.add(mrg);
if (s53.log2 < s5.log2) {
mrg = s53; m.add(s53); ++j; s53 = m[j].mul3();
} else {
mrg = s5; m.add(s5); s5 = s5.mul5();
}
if (j > (m.length >> 1)) {m.removeRange(0, j); j = 0; }
}
if (i > (h.length >> 1)) {h.removeRange(0, i); i = 0; }
yield(rslt);
}
}
BigInt trival2BigInt(Trival tv) {
return BigInt.from(2).pow(tv.twos)
* BigInt.from(3).pow(tv.threes)
* BigInt.from(5).pow(tv.fives);
}
void main() {
final numhams = 1000000;
var hamseqstr = "The first 20 Hamming numbers are: ( ";
makeHammings().take(20)
.forEach((h) => hamseqstr += trival2BigInt(h).toString() + " ");
print(hamseqstr + ")");
final strt = DateTime.now().millisecondsSinceEpoch;
final answr = makeHammings().skip(999999).first;
final elpsd = DateTime.now().millisecondsSinceEpoch - strt;
print("The ${numhams}th Hamming number is: $answr");
print("in full as: ${trival2BigInt(answr)}");
print("This test took $elpsd milliseconds.");
}
- Output:
The first 20 Hamming numbers are: ( 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 ) The 1000000th Hamming number is: 278.096635606686 55 47 64 in full as: 519312780448388736089589843750000000000000000000000000000000000000000000000000000000 This test took 311 milliseconds.
Due to using a mutable extendable List (Array) and mutation, the above generator is reasonably fast, and it also has the feature that List memory is recovered as soon as it is no longer required, with a considerable saving in both execution time and memory use.
Alternate extremely fast version using an "error band"
Although not a Hamming sequence generator, the following code uses the known characteristics of the distribution of Hamming numbers to just scan through and find all possibilities in a relatively narrow "error band", which can then be sorted based on the log base two approximation and the nth element determined inside that band; it has the huge advantage that memory requirements drop to O(n^(1/3)) and asymptotic execution complexity drops from O(n) to O(n^(2/3)) for an extremely fast execution speed (thanks to WillNess for the start of this algorithm as referenced in the Haskell section):
import 'dart:math';
final lb2of2 = 1.0;
final lb2of3 = log(3.0) / log(2.0);
final lb2of5 = log(5.0) / log(2.0);
class Trival {
final double log2;
final int twos;
final int threes;
final int fives;
Trival mul2() {
return Trival(this.log2 + lb2of2, this.twos + 1, this.threes, this.fives);
}
Trival mul3() {
return Trival(this.log2 + lb2of3, this.twos, this.threes + 1, this.fives);
}
Trival mul5() {
return Trival(this.log2 + lb2of5, this.twos, this.threes, this.fives + 1);
}
@override String toString() {
return this.log2.toString() + " "
+ this.twos.toString() + " "
+ this.threes.toString() + " "
+ this.fives.toString();
}
const Trival(this.log2, this.twos, this.threes, this.fives);
}
BigInt trival2BigInt(Trival tv) {
return BigInt.from(2).pow(tv.twos)
* BigInt.from(3).pow(tv.threes)
* BigInt.from(5).pow(tv.fives);
}
Trival nthHamming(int n) {
if (n < 1) throw Exception("nthHamming: argument must be higher than 0!!!");
if (n < 7) {
if (n & (n - 1) == 0) {
final bts = n.bitLength - 1;
return Trival(bts.toDouble(), bts, 0, 0);
}
switch (n) {
case 3: return Trival(lb2of3, 0, 1, 0);
case 5: return Trival(lb2of5, 0, 0, 1);
case 6: return Trival(lb2of2 + lb2of3, 1, 1, 0);
}
}
final fctr = 6.0 * lb2of3 * lb2of5;
final crctn = log(sqrt(30.0)) / log(2.0);
final lb2est = pow(fctr * n.toDouble(), 1.0/3.0) - crctn;
final lb2rng = 2.0/lb2est;
final lb2hi = lb2est + 1.0/lb2est;
List<Trival> ebnd = [];
var cnt = 0;
for (var k = 0; k < (lb2hi / lb2of5).ceil(); ++k) {
final lb2p = lb2hi - k * lb2of5;
for (var j = 0; j < (lb2p / lb2of3).ceil(); ++j) {
final lb2q = lb2p - j * lb2of3;
final i = lb2q.floor(); final lb2frac = lb2q - i;
cnt += i + 1;
if (lb2frac <= lb2rng) {
final lb2v = i * lb2of2 + j * lb2of3 + k * lb2of5;
ebnd.add(Trival(lb2v, i, j, k));
}
}
}
ebnd.sort((a, b) => b.log2.compareTo(a.log2)); // descending order
final ndx = cnt - n;
if (ndx < 0) throw Exception("nthHamming: not enough triples generated!!!");
if (ndx >= ebnd.length) throw Exception("nthHamming: error band is too narrow!!!");
return ebnd[ndx];
}
void main() {
final numhams = 1000000;
var nthhamseqstr = "The first 20 Hamming numbers are: ( ";
for (var i = 1; i <= 20; ++i) {
nthhamseqstr += trival2BigInt(nthHamming(i)).toString() + " ";
}
print(nthhamseqstr + ")");
final strt = DateTime.now().millisecondsSinceEpoch;
final answr = nthHamming(numhams);
final elpsd = DateTime.now().millisecondsSinceEpoch - strt;
print("The ${numhams}th Hamming number is: $answr");
print("in full as: ${trival2BigInt(answr)}");
print("This test took $elpsd milliseconds.");
}
The output from the above code is the same as for the previous version, but it is so fast that the time to find the millionth Hamming number is too small to be measured other than the Dart VM JIT time. It can find the billionth Hamming number in a fraction of a second and the trillionth in seconds.
Increasing the range above 1e13 by using a BigInt log base two representation
For arguments higher than about 1e13, the precision of the Double log base two approximations used above is not adequate to do an accurate sort, but the algorithm continues to work (although perhaps slightly slower) by changing the code to use BigInt log base two representations as follows:
import 'dart:math';
final lb2of2 = 1.0; // double log approximations are still needed for the scan limits
final lb2of3 = log(3.0) / log(2.0);
final lb2of5 = log(5.0) / log(2.0);
final biglb2of2 = BigInt.from(1) << 100; // 100 bit representations...
final biglb2of3 = (BigInt.from(1784509131911002) << 50) + BigInt.from(134114660393120);
final biglb2of5 = (BigInt.from(2614258625728952) << 50) + BigInt.from(773584997695443);
class BigTrival {
final BigInt log2;
final int twos;
final int threes;
final int fives;
@override String toString() {
return this.log2.toString() + " "
+ this.twos.toString() + " "
+ this.threes.toString() + " "
+ this.fives.toString();
}
const BigTrival(this.log2, this.twos, this.threes, this.fives);
}
BigInt bigtrival2BigInt(BigTrival tv) {
return BigInt.from(2).pow(tv.twos)
* BigInt.from(3).pow(tv.threes)
* BigInt.from(5).pow(tv.fives);
}
BigTrival nthHamming(int n) {
if (n < 1) throw Exception("nthHamming: argument must be higher than 0!!!");
if (n < 7) {
if (n & (n - 1) == 0) {
final bts = n.bitLength - 1;
return BigTrival(BigInt.from(bts) << 100, bts, 0, 0);
}
switch (n) {
case 3: return BigTrival(biglb2of3, 0, 1, 0);
case 5: return BigTrival(biglb2of5, 0, 0, 1);
case 6: return BigTrival(biglb2of2 + biglb2of3, 1, 1, 0);
}
}
final fctr = lb2of3 * lb2of5 * 6;
final crctn = log(sqrt(30.0)) / log(2.0);
final lb2est = pow(fctr * n.toDouble(), 1.0/3.0) - crctn;
final lb2rng = 2.0/lb2est;
final lb2hi = lb2est + 1.0/lb2est;
List<BigTrival> ebnd = [];
var cnt = 0;
for (var k = 0; k < (lb2hi / lb2of5).ceil(); ++k) {
final lb2p = lb2hi - k * lb2of5;
for (var j = 0; j < (lb2p / lb2of3).ceil(); ++j) {
final lb2q = lb2p - j * lb2of3;
final i = lb2q.floor(); final lb2frac = lb2q - i;
cnt += i + 1;
if (lb2frac <= lb2rng) {
// final lb2v = i * lb2of2 + j * lb2of3 + k * lb2of5;
// ebnd.add(Trival(lb2v, i, j, k));
final lb2v = BigInt.from(i) * biglb2of2
+ BigInt.from(j) * biglb2of3
+ BigInt.from(k) * biglb2of5;
ebnd.add(BigTrival(lb2v, i, j, k));
}
}
}
ebnd.sort((a, b) => b.log2.compareTo(a.log2)); // descending order
final ndx = cnt - n;
if (ndx < 0) throw Exception("nthHamming: not enough triples generated!!!");
if (ndx >= ebnd.length) throw Exception("nthHamming: error band is too narrow!!!");
return ebnd[ndx];
}
void main() {
final numhams = 1000000000;
var nthhamseqstr = "The first 20 Hamming numbers are: ( ";
for (var i = 1; i <= 20; ++i) {
nthhamseqstr += bigtrival2BigInt(nthHamming(i)).toString() + " ";
}
print(nthhamseqstr + ")");
final strt = DateTime.now().millisecondsSinceEpoch;
final answr = nthHamming(numhams);
final elpsd = DateTime.now().millisecondsSinceEpoch - strt;
print("The ${numhams}th Hamming number is: $answr");
print("in full as: ${bigtrival2BigInt(answr)}");
print("This test took $elpsd milliseconds.");
}
With these changes, the algorithm can find the 1e19'th Hamming number in on the order of days, depending on the CPU used.
DCL
$ limit = p1
$
$ n = 0
$ h_'n = 1
$ x2 = 2
$ x3 = 3
$ x5 = 5
$ i = 0
$ j = 0
$ k = 0
$
$ n = 1
$ loop:
$ x = x2
$ if x3 .lt. x then $ x = x3
$ if x5 .lt. x then $ x = x5
$ h_'n = x
$ if x2 .eq. h_'n
$ then
$ i = i + 1
$ x2 = 2 * h_'i
$ endif
$ if x3 .eq. h_'n
$ then
$ j = j + 1
$ x3 = 3 * h_'j
$ endif
$ if x5 .eq. h_'n
$ then
$ k = k + 1
$ x5 = 5 * h_'k
$ endif
$ n = n + 1
$ if n .le. limit then $ goto loop
$
$ i = 0
$ loop2:
$ write sys$output h_'i
$ i = i + 1
$ if i .lt. 20 then $ goto loop2
$
$ n = limit - 1
$ write sys$output h_'n
- Output:
Here's the output; $ @hamming 1691 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 2125764000
Delphi
See Pascal.
EasyLang
func hamming lim .
len h[] lim
h[1] = 1
x2 = 2 ; x3 = 3 ; x5 = 5
i = 1 ; j = 1 ; k = 1
for n = 2 to lim
h[n] = lower x2 lower x3 x5
if x2 = h[n]
i += 1
x2 = 2 * h[i]
.
if x3 = h[n]
j += 1
x3 = 3 * h[j]
.
if x5 = h[n]
k += 1
x5 = 5 * h[k]
.
.
return h[lim]
.
for nr = 1 to 20
write hamming nr & " "
.
print ""
print hamming 1691
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 2125764000
Eiffel
note
description : "Initial part, in order, of the sequence of Hamming numbers"
math : "[
Hamming numbers, also known as regular numbers and 5-smooth numbers, are natural integers
that have 2, 3 and 5 as their only prime factors.
]"
computer_arithmetic :
"[
This version avoids integer overflow and stops at the last representable number in the sequence.
]"
output : "[
Per requirements of the RosettaCode example, execution will produce items of indexes 1 to 20 and 1691.
The algorithm (procedure `hamming') is more general and will produce the first `n' Hamming numbers
for any `n'.
]"
source : "This problem was posed in Edsger W. Dijkstra, A Discipline of Programming, Prentice Hall, 1978"
date : "8 August 2012"
authors : "Bertrand Meyer", "Emmanuel Stapf"
revision : "1.0"
libraries : "Relies on SORTED_TWO_WAY_LIST from EiffelBase"
implementation : "[
Using SORTED_TWO_WAY_LIST provides an elegant illustration of how to implement
a lazy scheme in Eiffel through the use of object-oriented data structures.
]"
warning : "[
The formatting (<syntaxhighlight lang="text">) specifications for Eiffel in RosettaCode are slightly obsolete:
`note' and other newer keywords not supported, red color for manifest strings.
This should be fixed soon.
]"
class
APPLICATION
create
make
feature {NONE} -- Initialization
make
-- Print first 20 Hamming numbers, in order, and the 1691-st one.
local
Hammings: like hamming
-- List of Hamming numbers, up to 1691-st one.
do
Hammings := hamming (1691)
across 1 |..| 20 as i loop
io.put_natural (Hammings.i_th (i.item)); io.put_string (" ")
end
io.put_new_line; io.put_natural (Hammings.i_th (1691)); io.put_new_line
end
feature -- Basic operations
hamming (n: INTEGER): ARRAYED_LIST [NATURAL]
-- First `n' elements (in order) of the Hamming sequence,
-- or as many of them as will not produce overflow.
local
sl: SORTED_TWO_WAY_LIST [NATURAL]
overflow: BOOLEAN
first, next: NATURAL
do
create Result.make (n); create sl.make
sl.extend (1); sl.start
across 1 |..| n as i invariant
-- "The numbers output so far are the first `i' - 1 Hamming numbers, in order".
-- "Result.first is the `i'-th Hamming number."
until sl.is_empty loop
first := sl.first; sl.start
Result.extend (first); sl.remove
across << 2, 3, 5 >> as multiplier loop
next := multiplier.item * first
overflow := overflow or next <= first
if not overflow and then not sl.has (next) then sl.extend (next) end
end
end
end
end
- Output:
1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 2125764000
Elixir
defmodule Hamming do
def generater do
queues = [{2, queue}, {3, queue}, {5, queue}]
Stream.unfold({1, queues}, fn {n, q} -> next(n, q) end)
end
defp next(n, queues) do
queues = Enum.map(queues, fn {m, queue} -> {m, push(queue, m*n)} end)
min = Enum.map(queues, fn {_, queue} -> top(queue) end) |> Enum.min
queues = Enum.map(queues, fn {m, queue} ->
{m, (if min==top(queue), do: erase_top(queue), else: queue)}
end)
{n, {min, queues}}
end
defp queue, do: {[], []}
defp push({input, output}, term), do: {[term | input], output}
defp top({input, []}), do: List.last(input)
defp top({_, [h|_]}), do: h
defp erase_top({input, []}), do: erase_top({[], Enum.reverse(input)})
defp erase_top({input, [_|t]}), do: {input, t}
end
IO.puts "first twenty Hamming numbers:"
IO.inspect Hamming.generater |> Enum.take(20)
IO.puts "1691st Hamming number:"
IO.puts Hamming.generater |> Enum.take(1691) |> List.last
IO.puts "one millionth Hamming number:"
IO.puts Hamming.generater |> Enum.take(1_000_000) |> List.last
- Output:
first twenty Hamming numbers: [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36] 1691st Hamming number: 2125764000 one millionth Hamming number: 519312780448388736089589843750000000000000000000000000000000000000000000000000000000
Elm
The Elm language has many restrictions that make implementing the Hamming number sequence algorithms difficult. The classic Edsger Dijkstra algorithm as written in Haskell Hamming_numbers#The_classic_version cannot be written in Elm because current Elm forbids cyclic value references (the value "hamming" is back-referenced three times), and the implementation would not be efficient even if it could be, since Elm version 0.19.x has removed the "Lazy" package that would defer and memoize the result of a computation, as is necessary for implementing Haskell-style lazy lists. Thus, one has to implement memoization using a different data structure than a lazy list; however, all current Elm data structures are persistent (they forbid mutation) and can only offer some form of Copy On Write (COW), so there is no linear array: the "Array" module is a tree-based structure (with some concessions to data blocks for slightly better performance) whose operations have logarithmic complexity once the size grows beyond a minimum. In fact, all the Elm data structures that could be used here (Dict, Set, Array) have a logarithmic response. The implementation of List is not lazy, so new elements can't be added to the "tail" but need to be added to the "head" for efficiency, which means that in order to append larger elements to a list in increasing order one needs to (COW) reverse the List (twice)!
The solution here uses a pure functional implementation of a Min Heap (Binary Heap) Priority Queue so that the minimum element can be viewed in O(1) time, although inserting or replacing elements still takes O(log n) time, where "n" is the number of elements in the queue. As written, no queue needs to be maintained for the multiples of five alone; instead, two queues are maintained, one for the merge of the multiples of five and three, and a larger one for the merge of all the multiples of five, three, and two. To minimize redundant computation, the implementation carries the "next" comparison values as part of the recursive loop state, which can change on every iteration.
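For orientation only, the priority-queue idea can be sketched far more simply; the following Python illustration (mine, not the Elm entry's duplicate-free two-queue scheme) pops the minimum, pushes its 2x, 3x and 5x multiples, and filters duplicates with a set:
import heapq

def hamming_heap():
    heap, seen = [1], {1}
    while True:
        h = heapq.heappop(heap)          # smallest value not yet yielded
        yield h
        for m in (2 * h, 3 * h, 5 * h):
            if m not in seen:            # avoid inserting duplicates
                seen.add(m)
                heapq.heappush(heap, m)

gen = hamming_heap()
print([next(gen) for _ in range(20)])    # 1 2 3 4 5 6 8 9 10 12 ... 36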
To express the sequence, a Co-Inductive Stream (CIS) is used as a deferred-execution (lazy) stream; it does not memoize computations (as discussed above), but that isn't necessary for this application, where the sequence is only traversed once and consumed as it is traversed.
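The CIS idea itself can be shown in a few lines; this Python illustration (not taken from the Elm entry) represents a stream as a head value paired with a thunk producing the rest, with no memoization of the tail:
def cis_from(n):
    return (n, lambda: cis_from(n + 1))   # (head, thunk for the rest of the stream)

def take(k, cis):
    out = []
    while k > 0:
        head, tail = cis
        out.append(head)
        cis, k = tail(), k - 1            # forcing the thunk recomputes, never caches
    return out

print(take(5, cis_from(1)))   # [1, 2, 3, 4, 5]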
In addition, in order to reduce the "BigInt" computation time, the calculations are done on the basis of a "Float" logarithmic approximation while maintaining a "Trival" triple of the powers of two, three, and five; these exponents are multiplied out only when the actual value represented by the logarithmic approximation is needed. The working code is as follows:
module Main exposing ( main )
import Bitwise exposing (..)
import BigInt
import Task exposing ( Task, succeed, perform, andThen )
import Html exposing ( div, text )
import Browser exposing ( element )
import Time exposing ( now, posixToMillis )
cLIMIT : Int
cLIMIT = 1000000
-- an infinite non-empty non-memoizing Co-Inductive Stream (CIS)...
type CIS a = CIS a (() -> CIS a)
takeCIS2List : Int -> CIS a -> List a
takeCIS2List n cis =
let loop i (CIS hd tl) lst =
if i < 1 then List.reverse lst
else loop (i - 1) (tl()) (hd :: lst)
in loop n cis []
nthCIS : Int -> CIS a -> a
nthCIS n (CIS hd tl) =
if n <= 1 then hd else nthCIS (n - 1) (tl())
type PriorityQ comparable v =
Mt
| Br comparable v (PriorityQ comparable v)
(PriorityQ comparable v)
emptyPQ : PriorityQ comparable v
emptyPQ = Mt
peekMinPQ : PriorityQ comparable v -> Maybe (comparable, v)
peekMinPQ pq = case pq of
(Br k v _ _) -> Just (k, v)
Mt -> Nothing
pushPQ : comparable -> v -> PriorityQ comparable v
-> PriorityQ comparable v
pushPQ wk wv pq =
case pq of
Mt -> Br wk wv Mt Mt
(Br vk vv pl pr) ->
if wk <= vk then Br wk wv (pushPQ vk vv pr) pl
else Br vk vv (pushPQ wk wv pr) pl
siftdown : comparable -> v -> PriorityQ comparable v
-> PriorityQ comparable v -> PriorityQ comparable v
siftdown wk wv pql pqr =
case pql of
Mt -> Br wk wv Mt Mt
(Br vkl vvl pll prl) ->
case pqr of
Mt -> if wk <= vkl then Br wk wv pql Mt
else Br vkl vvl (Br wk wv Mt Mt) Mt
(Br vkr vvr plr prr) ->
if wk <= vkl && wk <= vkr then Br wk wv pql pqr
else if vkl <= vkr then Br vkl vvl (siftdown wk wv pll prl) pqr
else Br vkr vvr pql (siftdown wk wv plr prr)
replaceMinPQ : comparable -> v -> PriorityQ comparable v
-> PriorityQ comparable v
replaceMinPQ wk wv pq = case pq of
Mt -> Mt
(Br _ _ pl pr) -> siftdown wk wv pl pr
type alias Trival = (Int, Int, Int)
showTrival : Trival -> String
showTrival tv =
let (x2, x3, x5) = tv
xpnd x m r =
if x <= 0 then r
else xpnd (shiftRightBy 1 x) (BigInt.mul m m)
(if (and 1 x) /= 0 then BigInt.mul m r else r)
in BigInt.fromInt 1 |> xpnd x2 (BigInt.fromInt 2)
|> xpnd x3 (BigInt.fromInt 3) |> xpnd x5 (BigInt.fromInt 5)
|> BigInt.toString
type alias LogRep = { lr: Float, trv: Trival }
ltLogRep : LogRep -> LogRep -> Bool
ltLogRep lra lrb = lra.lr < lrb.lr
oneLogRep : LogRep
oneLogRep = { lr = 0.0, trv = (0, 0, 0) }
lg2_2 : Float
lg2_2 = 1.0 -- log base two of two
lg2_3 : Float
lg2_3 = logBase 2.0 3.0
lg2_5 : Float
lg2_5 = logBase 2.0 5.0
multLR2 : LogRep -> LogRep
multLR2 lr = let (x2, x3, x5) = lr.trv
in LogRep (lr.lr + lg2_2) (x2 + 1, x3, x5)
multLR3 : LogRep -> LogRep
multLR3 lr = let (x2, x3, x5) = lr.trv
in LogRep (lr.lr + lg2_3) (x2, x3 + 1, x5)
multLR5 : LogRep -> LogRep
multLR5 lr = let (x2, x3, x5) = lr.trv
in LogRep (lr.lr + lg2_5) (x2, x3, x5 + 1)
hammingsLog : () -> CIS Trival
hammingsLog() =
let im235 = multLR2 oneLogRep
im35 = multLR3 oneLogRep
imrg = im35
im5 = multLR5 oneLogRep
next bpq mpq m235 mrg m35 m5 =
if ltLogRep m235 mrg then
let omin = case peekMinPQ bpq of
Just (lr, trv) -> LogRep lr trv
Nothing -> m235 -- at the beginning!
nm235 = multLR2 omin
nbpq = replaceMinPQ m235.lr m235.trv bpq
in CIS m235.trv <| \ () ->
next nbpq mpq nm235 mrg m35 m5
else
if ltLogRep mrg m5 then
let omin = case peekMinPQ mpq of
Just (lr, trv) -> LogRep lr trv
Nothing -> mrg -- at the beginning!
nm35 = multLR3 omin
nmrg = if ltLogRep nm35 m5 then nm35 else m5
nmpq = replaceMinPQ mrg.lr mrg.trv mpq
nbpq = pushPQ mrg.lr mrg.trv bpq
in CIS mrg.trv <| \ () ->
next nbpq nmpq m235 nmrg nm35 m5
else
let nm5 = multLR5 m5
nmrg = if ltLogRep m35 nm5 then m35 else nm5
nmpq = pushPQ m5.lr m5.trv mpq
nbpq = pushPQ m5.lr m5.trv bpq
in CIS m5.trv <| \ () ->
next nbpq nmpq m235 nmrg m35 nm5
in CIS (0, 0, 0) <| \ () ->
next emptyPQ emptyPQ im235 imrg im35 im5
timemillis : () -> Task Never Int -- a side effect function
timemillis() = now |> andThen (\ t -> succeed (posixToMillis t))
test : Int -> Cmd Msg -- side effect function chain (includes "perform")...
test lmt =
let msg1 = "The first 20 Hamming numbers are: " ++
(hammingsLog() |> takeCIS2List 20
|> List.map showTrival
|> String.join ", ") ++ "."
msg2 = "The 1691st Hamming number is " ++
(hammingsLog() |> nthCIS 1691
|> showTrival) ++ "."
msg3 = "The " ++ String.fromInt cLIMIT ++ "th Hamming number is:"
in timemillis()
|> andThen (\ strt ->
let rsltstr = hammingsLog() |> nthCIS lmt
|> showTrival in
timemillis()
|> andThen (\ stop ->
succeed [msg1, msg2, msg3, rsltstr ++ " in "
++ String.fromInt (stop - strt)
++ " milliseconds."]))
|> perform Done
-- following code has to do with outputting to a web page using MUV/TEA...
type alias Model = List String
type Msg = Done Model
main : Program () Model Msg
main = -- starts with empty list of strings; views model of filled list...
element { init = \ _ -> ( [], test cLIMIT )
, update = \ (Done mdl) _ -> ( mdl , Cmd.none )
, subscriptions = \ _ -> Sub.none
, view = div [] << List.map (div [] << List.singleton << text) }
- Output:
The first 20 Hamming numbers are: 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36. The 1691st Hamming number is 2125764000. The 1000000th Hamming number is: 519312780448388736089589843750000000000000000000000000000000000000000000000000000000 in 767 milliseconds.
Do note that, due to the logarithmic response of the Min Heap Priority Queue, the execution time grows with an extra logarithmic factor in the number of elements evaluated rather than linearly, so if it takes 0.7 seconds to find the millionth Hamming number, it takes something like 10 seconds to find the ten millionth value instead of about 7 seconds. Considering that the generated "native" code is just JavaScript, it is reasonably fast and somewhat competitive with easier implementations in other languages such as F#.
Erlang
For relatively small values of n we can use elegant code:
list(N) -> array:to_list(element(1, array(N, [2, 3, 5]))).
nth(N) -> array:get(N-1, element(1, array(N, [2, 3, 5]))).
array(N, Primes) -> array(array:new(), N, 1, [{P, 1, P} || P <- Primes]).
array(Array, Max, Max, Candidates) -> {Array, Candidates};
array(Array, Max, I, Candidates) ->
Smallest = smallest(Candidates),
N_array = array:set(I, Smallest, Array),
array(N_array, Max, I+1, update(Smallest, N_array, Candidates)).
update(Val, Array, Candidates) -> [update_(Val, C, Array) || C <- Candidates].
update_(Val, {Val, Ind, Mul}, Array) ->
{Mul*array:get(Ind, Array), Ind+1, Mul};
update_(_, X, _) -> X.
smallest(L) -> lists:min([element(1, V) || V <- L]).
However, when n becomes large (say, above 5e7) the memory needed grows very large, since all the values are stored. Fortunately, the algorithm only uses a small fraction of the end of the array, so the beginning of the array can be dropped when it is no longer needed.
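The same prefix-dropping idea can be sketched against the classic three-back-pointer algorithm; this Python illustration (mine, not a translation of the Erlang code; drain_every is an arbitrary threshold) keeps only the still-needed tail of the table:
def nth_hamming(n, drain_every=10_000):
    h, base = [1], 0              # h holds Hamming numbers base .. base+len(h)-1 (0-based)
    i = j = k = 0                 # global indices of the next *2, *3, *5 candidates
    while base + len(h) < n:
        c2, c3, c5 = 2 * h[i - base], 3 * h[j - base], 5 * h[k - base]
        nxt = min(c2, c3, c5)
        h.append(nxt)
        if nxt == c2: i += 1
        if nxt == c3: j += 1
        if nxt == c5: k += 1
        done = min(i, j, k) - base            # entries before this point are never read again
        if done > drain_every:
            h, base = h[done:], base + done   # drop the consumed prefix
    return h[-1]

print(nth_hamming(1691))   # 2125764000
The Erlang version below performs the equivalent compaction between batches of generated entries (its Batch/Lim parameter).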
nth(N, Batch) ->
array:get(N-1, element(1, compact_array(N, Batch, [2, 3, 5]))).
compact_array(Goal, Lim, Primes) ->
{Array, Candidates} = array(Lim, Primes),
compact_array(Goal, Lim, Lim, Array, Candidates).
compact_array(Goal, _, Index, Array, Candidates) when Index > Goal ->
{Array, Candidates};
compact_array(Goal, Lim, Index, Array, Candidates) ->
{N_array, N_candidates} =
array(compact(Array, Candidates), Index + Lim, Index, Candidates),
compact_array(Goal, Lim, Index+Lim, N_array, N_candidates).
compact(Array, L) ->
Index = lists:min([element(2, V) || V <- L]),
Keep = [E || E <- array:sparse_to_orddict(Array), element(1, E) >= Index],
array:from_orddict(Keep).
With this approach memory is no longer an issue:
- Output:
timer:tc(task_hamming_numbers, nth, [100_000_000, 1_000_000]). {232894309, 18140143309611363532953342430693354584669635033709097929462505366714035156593135818380467866054222964635144914854949550271375442721368122191972041094311075107507067573147191502194201568268202614781694681859513649083616294200541611489469967999559505365172812095568020073934100699850397033005903158113691518456912149989919601385875227049401605594538145621585911726469930727034807205200195312500}
So a bit less than 4 minutes to get the 100 000 000th regular number. The complexity is slightly worse than linear, which is not a surprise given that all the regular numbers are computed.
ERRE
For bigger numbers, you have to use an external program, like MULPREC.R
PROGRAM HAMMING
!$DOUBLE
DIM H[2000]
PROCEDURE HAMMING(L%->RES)
LOCAL I%,J%,K%,N%,M,X2,X3,X5
H[0]=1
X2=2 X3=3 X5=5
FOR N%=1 TO L%-1 DO
M=X2
IF M>X3 THEN M=X3 END IF
IF M>X5 THEN M=X5 END IF
H[N%]=M
IF M=X2 THEN I%+=1 X2=2*H[I%] END IF
IF M=X3 THEN J%+=1 X3=3*H[J%] END IF
IF M=X5 THEN K%+=1 X5=5*H[K%] END IF
END FOR
RES=H[L%-1]
END PROCEDURE
BEGIN
FOR H%=1 TO 20 DO
HAMMING(H%->RES)
PRINT("H(";H%;")=";RES)
END FOR
HAMMING(1691->RES)
PRINT("H(1691)=";RES)
END PROGRAM
- Output:
H( 1 )= 1 H( 2 )= 2 H( 3 )= 3 H( 4 )= 4 H( 5 )= 5 H( 6 )= 6 H( 7 )= 8 H( 8 )= 9 H( 9 )= 10 H( 10 )= 12 H( 11 )= 15 H( 12 )= 16 H( 13 )= 18 H( 14 )= 20 H( 15 )= 24 H( 16 )= 25 H( 17 )= 27 H( 18 )= 30 H( 19 )= 32 H( 20 )= 36 H(1691)= 2125764000
F#
This version implements Dijkstra's merge solution, so is closely related to the Haskell classic version.
type LazyList<'a> = Cons of 'a * Lazy<LazyList<'a>>
let rec hammings =
let rec (-|-) (Cons(x, nxf) as xs) (Cons(y, nyf) as ys) =
if x < y then Cons(x, lazy(nxf.Value -|- ys))
elif x > y then Cons(y, lazy(xs -|- nyf.Value))
else Cons(x, lazy(nxf.Value -|- nyf.Value))
let rec inf_map f (Cons(x, nxf)) =
Cons(f x, lazy(inf_map f nxf.Value))
Cons(1I, lazy(let x = inf_map ((*) 2I) hammings
let y = inf_map ((*) 3I) hammings
let z = inf_map ((*) 5I) hammings
x -|- y -|- z))
// testing...
[<EntryPoint>]
let main args =
let rec iterLazyListFor f n (Cons(v, rf)) =
if n > 0 then f v; iterLazyListFor f (n - 1) rf.Value
let rec nthLazyList n ((Cons(v, rf)) as ll) =
if n <= 1 then v else nthLazyList (n - 1) rf.Value
printf "( "; iterLazyListFor (printf "%A ") 20 (hammings()); printfn ")"
printfn "%A" (hammings() |> nthLazyList 1691)
printfn "%A" (hammings() |> nthLazyList 1000000)
0
The above code's memory residency is quite high, as it holds the entire lazy sequence in memory: the top-level reference prevents garbage collection as the sequence is consumed.
The following code reduces that high memory residency by making the routine a function and using internal local stream references for the intermediate streams, so that they can be collected as the stream is consumed, as long as no reference is held to the main results stream (which is not the case in the sample test code); it also avoids duplication of factors by successively building up the streams, and further reduces memory use by ordering the streams so that the least dense are processed first:
let cNUMVALS = 1000000
type LazyList<'a> = Cons of 'a * Lazy<LazyList<'a>>
let hammings() =
let rec merge (Cons(x, f) as xs) (Cons(y, g) as ys) =
if x < y then Cons(x, lazy(merge (f.Force()) ys))
else Cons(y, lazy(merge xs (g.Force())))
let rec smult m (Cons(x, rxs)) =
Cons(m * x, lazy(smult m (rxs.Force())))
let rec first = smult 5I (Cons(1I, lazy first))
let u s n =
let rec r = merge s (smult n (Cons(1I, lazy r))) in r
Seq.unfold (fun (Cons(hd, rst)) -> Some (hd, rst.Value))
(Cons(1I, lazy(Seq.fold u first [| 3I; 2I |])))
[<EntryPoint>]
let main argv =
printf "( "; hammings() |> Seq.take 20 |> Seq.iter (printf "%A "); printfn ")"
printfn "%A" (hammings() |> Seq.item (1691 - 1))
let strt = System.DateTime.Now.Ticks
let rslt = (hammings()) |> Seq.item (cNUMVALS - 1)
let stop = System.DateTime.Now.Ticks
printfn "%A" rslt
printfn "Found this last up to %d in %d milliseconds." cNUMVALS ((stop - strt) / 10000L)
0 // return an integer exit code
Both codes output the same results as follows but the second is over three times faster:
- Output:
( 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 ) 2125764000 519312780448388736089589843750000000000000000000000000000000000000000000000000000000 Found this last up to 1000000 in 1302 milliseconds.
Both codes are over 10 times slower than Haskell (or Kotlin, Scala, or Clojure) when all are written in exactly the same style, perhaps due in some small degree to the BigInteger implementation being much slower for these operations than GMP and the JVM's implementation of BigInteger. Much of the difference is because the DotNet runtime does not allocate from a memory pool as the Haskell and JVM runtimes do, which is much slower for these functional algorithms, where many small allocations and de-allocations are necessary.
Fast somewhat imperative sequence version using logarithms
Since the above pure functional approach isn't very efficient, a more imperative approach is used in the following code: "growable" arrays that are "drained" of no-longer-needed older values in blocks once the back-pointer indices have advanced past them. The code also implements an algorithm that avoids duplicate calculations, so it performs the same number of operations as the above code but runs faster by using integer and floating-point operations rather than BigInteger ones. Due to the "draining", the memory use is the same as above to within a constant factor. Note that the implementation of IEnumerable using sequences in F# is also not very efficient, and a "roll-your-own" IEnumerable implementation would likely be somewhat faster.
F# enumeration through the `Seq` type is particularly slow (although easy to use), so in order to bypass that, the following code still uses imperative `ResizeArray`s but also exposes a closure "next" function that can be used directly, avoiding the generation of a `Seq` sequence where maximum speed is desired:
let cCOUNT = 1000000
type LogRep = struct val lr: double; val x2: uint32; val x3: uint32; val x5: uint32
new(lr, x2, x3, x5) = {lr = lr; x2 = x2; x3 = x3; x5 = x5 } end
let one: LogRep = LogRep(0.0, 0u, 0u, 0u)
let lg2_2: double = 1.0
let lg3_2: double = log 3.0 / log 2.0
let lg5_2: double = log 5.0 / log 2.0
let inline mul2 (lr: LogRep): LogRep = LogRep(lr.lr + lg2_2, lr.x2 + 1u, lr.x3, lr.x5)
let inline mul3 (lr: LogRep): LogRep = LogRep(lr.lr + lg3_2, lr.x2, lr.x3 + 1u, lr.x5)
let inline mul5 (lr: LogRep): LogRep = LogRep(lr.lr + lg5_2, lr.x2, lr.x3, lr.x5 + 1u)
let hammingsLog() = // imperative arrays, eliminates the BigInteger operations...
let s2 = ResizeArray<_>() in let s3 = ResizeArray<_>()
s2.Add(one); s3.Add(mul3 one)
let mutable s5 = mul5 one in let mutable mrg = mul3 one
let mutable s2hdi = 0 in let mutable s3hdi = 0
let next() = // imperative next function to advance value
if s2hdi + s2hdi >= s2.Count then s2.RemoveRange(0, s2hdi); s2hdi <- 0
let mutable rslt: LogRep = s2.[s2hdi]
if rslt.lr < mrg.lr then s2.Add(mul2 rslt); s2hdi <- s2hdi + 1
else
if s3hdi + s3hdi >= s3.Count then s3.RemoveRange(0, s3hdi); s3hdi <- 0
rslt <- mrg; s2.Add(mul2 rslt); s3.Add(mul3 rslt); s3hdi <- s3hdi + 1
let chkv: LogRep = s3.[s3hdi]
if chkv.lr < s5.lr then mrg <- chkv
else mrg <- s5; s5 <- mul5 s5; s3hdi <- s3hdi - 1
rslt
next
let hl2Seq f = Seq.unfold (fun v -> Some(v, f())) (f())
let nthLogHamming n f =
let rec nxt i = if i >= n then f() else f() |> ignore; nxt (i + 1) in nxt 0
let lr2BigInt (lr: LogRep) = // convert trival to BigInteger
let rec xpnd n mlt rslt =
if n <= 0u then rslt
else xpnd (n - 1u) mlt (mlt * rslt)
xpnd lr.x2 2I 1I |> xpnd lr.x3 3I |> xpnd lr.x5 5I
[<EntryPoint>]
let main argv =
printf "( "; hammingsLog() |> hl2Seq |> Seq.take 20
|> Seq.iter (printf "%A " << lr2BigInt); printfn ")"
printfn "%A" (hammingsLog() |> hl2Seq |> Seq.item (1691 - 1) |> lr2BigInt)
let strt = System.DateTime.Now.Ticks
// slow way using Seq:
// let rslt = (hammingsLog()) |> hl2Seq |> Seq.item (1000000 - 1)
// fast way using closure directly:
let rslt = (hammingsLog()) |> nthLogHamming (1000000 - 1)
let stop = System.DateTime.Now.Ticks
printfn "%A" (rslt |> lr2BigInt)
printfn "Found this last up to %d in %d milliseconds." cCOUNT ((stop - strt) / 10000L)
printfn ""
0 // return an integer exit code
- Output:
( 1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36 ) 2125764000 519312780448388736089589843750000000000000000000000000000000000000000000000000000000 Found this last up to 1000000 in 57 milliseconds.
The above code can find the billionth Hamming number in about 60 seconds on the same Intel i5-6500 at 3.6 GHz (single-threaded, boosted). If the "fast way" line is commented out and the "slow way" line uncommented, the code is about twice as slow.
Extremely fast non-enumerating version sorting values in error band
If one is willing to forego sequences and just calculate the nth Hamming number, then some reading on the relationship between the size of the numbers and their position in the sequence is helpful (Wikipedia: regular number). One finds that there is a very distinct relationship, and that for larger ranges it quickly reduces to quite a small error band proportional to the log of the output value. Thus, the following code just scans logarithmic representations, inserting those that fall in this top error band into a list, and extracts the correct nth representation from that band. It reduces the time complexity to O(n^(2/3)) from O(n) for the sequence versions, and, even more remarkably, reduces the memory requirement to O(n^(1/3)) from O(n^(2/3)), which makes it possible to calculate very large values in the sequence on common personal computers. The code is as follows:
let nthHamming n =
if n < 1UL then failwith "nthHamming; argument must be > 0!"
if n < 2UL then 0u, 0u, 0u else // trivial case for first value of one
let lb3 = 1.5849625007211561814537389439478 // Math.Log(3) / Math.Log(2);
let lb5 = 2.3219280948873623478703194294894 // Math.Log(5) / Math.Log(2);
let fctr = 6.0 * lb3 * lb5
let crctn = 2.4534452978042592646620291867186 // Math.Log(Math.sqrt(30.0)) / Math.Log(2.0)
let lbest = (fctr * double n) ** (1.0/3.0) - crctn // from WP formula
let lbhi = lbest + 1.0 / lbest
let lblo = 2.0 * lbest - lbhi // upper and lower bound of upper "band"
let klmt = uint32 (lbhi / lb5)
let rec loopk k kcnt kbnd =
if k > klmt then kcnt, kbnd else
let p = lbhi - double k * lb5
let jlmt = uint32 (p / lb3)
let rec loopj