Parallel calculations
Many programming languages allow you to specify computations to be run in parallel. While Concurrent computing is focused on concurrency, the purpose of this task is to distribute time-consuming calculations on as many CPUs as possible.
You are encouraged to solve this task according to the task description, using any language you may know.
Assume we have a collection of numbers, and want to find the one with the largest minimal prime factor (that is, the one that contains relatively large factors). To speed up the search, the factorization should be done in parallel using separate threads or processes, to take advantage of multi-core CPUs.
Show how this can be formulated in your language. Parallelize the factorization of those numbers, then search the returned list of numbers and factors for the largest minimal factor, and return that number and its prime factors.
For the prime number decomposition you may use the solution of the Prime decomposition task.
Ada
I took the version from Prime decomposition and adjusted it to use tasks.
prime_numbers.ads:
generic
type Number is private;
Zero : Number;
One : Number;
Two : Number;
with function Image (X : Number) return String is <>;
with function "+" (X, Y : Number) return Number is <>;
with function "/" (X, Y : Number) return Number is <>;
with function "mod" (X, Y : Number) return Number is <>;
with function ">=" (X, Y : Number) return Boolean is <>;
package Prime_Numbers is
type Number_List is array (Positive range <>) of Number;
procedure Put (List : Number_List);
task type Calculate_Factors is
entry Start (The_Number : in Number);
entry Get_Size (Size : out Natural);
entry Get_Result (List : out Number_List);
end Calculate_Factors;
end Prime_Numbers;
prime_numbers.adb:
with Ada.Text_IO;
package body Prime_Numbers is
procedure Put (List : Number_List) is
begin
for Index in List'Range loop
Ada.Text_IO.Put (Image (List (Index)));
end loop;
end Put;
task body Calculate_Factors is
Size : Natural := 0;
N : Number;
M : Number;
K : Number := Two;
begin
accept Start (The_Number : in Number) do
N := The_Number;
M := N;
end Start;
-- Estimation of the result length from above
while M >= Two loop
M := (M + One) / Two;
Size := Size + 1;
end loop;
M := N;
-- Filling the result with prime numbers
declare
Result : Number_List (1 .. Size);
Index : Positive := 1;
begin
while N >= K loop -- Divisors loop
while Zero = (M mod K) loop -- While divides
Result (Index) := K;
Index := Index + 1;
M := M / K;
end loop;
K := K + One;
end loop;
Index := Index - 1;
accept Get_Size (Size : out Natural) do
Size := Index;
end Get_Size;
accept Get_Result (List : out Number_List) do
List (1 .. Index) := Result (1 .. Index);
end Get_Result;
end;
end Calculate_Factors;
end Prime_Numbers;
Example usage:
parallel.adb:
with Ada.Text_IO;
with Prime_Numbers;
procedure Parallel is
package Integer_Primes is new Prime_Numbers (
Number => Integer, -- use Large_Integer for longer numbers
Zero => 0,
One => 1,
Two => 2,
Image => Integer'Image);
My_List : Integer_Primes.Number_List :=
( 12757923,
12878611,
12757923,
15808973,
15780709,
197622519);
Decomposers : array (My_List'Range) of Integer_Primes.Calculate_Factors;
Lengths : array (My_List'Range) of Natural;
Max_Length : Natural := 0;
begin
for I in My_List'Range loop
-- starts the tasks
Decomposers (I).Start (My_List (I));
end loop;
for I in My_List'Range loop
-- wait until task has reached Get_Size entry
Decomposers (I).Get_Size (Lengths (I));
if Lengths (I) > Max_Length then
Max_Length := Lengths (I);
end if;
end loop;
declare
Results :
array (My_List'Range) of Integer_Primes.Number_List (1 .. Max_Length);
Largest_Minimal_Factor : Integer := 0;
Winning_Index : Positive;
begin
for I in My_List'Range loop
-- after Get_Result, the tasks terminate
Decomposers (I).Get_Result (Results (I));
if Results (I) (1) > Largest_Minimal_Factor then
Largest_Minimal_Factor := Results (I) (1);
Winning_Index := I;
end if;
end loop;
Ada.Text_IO.Put_Line
("Number" & Integer'Image (My_List (Winning_Index)) &
" has largest minimal factor:");
Integer_Primes.Put (Results (Winning_Index) (1 .. Lengths (Winning_Index)));
Ada.Text_IO.New_Line;
end;
end Parallel;
- Output:
Number 12878611 has largest minimal factor: 47 101 2713
C
C code using OpenMP. Compiled with gcc -Wall -std=c99 -fopenmp
,
where GCC 4.2 or later is required.
Note that the code finds the largest first prime factor,
but does not return the factor list:
it's just a matter of repeating the prime factor test,
which adds clutter but does not make the code any more interesting.
For that matter, the code uses the dumbest prime factoring method,
and doesn't even test if the numbers can be divided by 2.
#include <stdio.h>
#include <omp.h>
int main()
{
int data[] = {12757923, 12878611, 12878893, 12757923, 15808973, 15780709, 197622519};
int largest, largest_factor = 0;
omp_set_num_threads(4);
/* "omp parallel for" turns the for loop multithreaded by making each thread
* iterating only a part of the loop variable, in this case i; variables declared
* as "shared" will be implicitly locked on access
*/
#pragma omp parallel for shared(largest_factor, largest)
for (int i = 0; i < 7; i++) {
int p, n = data[i];
for (p = 3; p * p <= n && n % p; p += 2);
if (p * p > n) p = n;
if (p > largest_factor) {
largest_factor = p;
largest = n;
printf("thread %d: found larger: %d of %d\n",
omp_get_thread_num(), p, n);
} else {
printf("thread %d: not larger: %d of %d\n",
omp_get_thread_num(), p, n);
}
}
printf("Largest factor: %d of %d\n", largest_factor, largest);
return 0;
}
- Output:
(YMMV regarding the order of output)
thread 1: found larger: 47 of 12878893 thread 0: not larger: 3 of 12757923 thread 0: not larger: 47 of 12878611 thread 3: not larger: 3 of 197622519 thread 2: not larger: 29 of 15808973 thread 2: not larger: 7 of 15780709 thread 1: not larger: 3 of 12757923 Largest factor: 47 of 12878893
C#
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
public static List<int> PrimeFactors(int number)
{
var primes = new List<int>();
for (int div = 2; div <= number; div++)
{
while (number % div == 0)
{
primes.Add(div);
number = number / div;
}
}
return primes;
}
static void Main(string[] args)
{
int[] n = { 12757923, 12878611, 12757923, 15808973, 15780709, 197622519 };
// Calculate each of those numbers' prime factors, in parallel
var factors = n.AsParallel().Select(PrimeFactors).ToList();
// Make a new list showing the smallest factor for each
var smallestFactors = factors.Select(thisNumbersFactors => thisNumbersFactors.Min()).ToList();
// Find the index that corresponds with the largest of those factors
int biggestFactor = smallestFactors.Max();
int whatIndexIsThat = smallestFactors.IndexOf(biggestFactor);
Console.WriteLine("{0} has the largest minimum prime factor: {1}", n[whatIndexIsThat], biggestFactor);
Console.WriteLine(string.Join(" ", factors[whatIndexIsThat]));
}
}
- Output:
12878611 has the largest minimum prime factor: 47 47 101 2713
Another version, using Parallel.For:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
private static void Main(string[] args)
{
int j = 0, m = 0;
decimal[] n = {12757923, 12878611, 12757923, 15808973, 15780709, 197622519};
var l = new List<int>[n.Length];
Parallel.For(0, n.Length, i => { l[i] = getPrimes(n[i]); });
for (int i = 0; i<n.Length; i++)
if (l[i].Min()>m)
{
m = l[i].Min();
j = i;
}
Console.WriteLine("Number {0} has largest minimal factor:", n[j]);
foreach (int list in l[j])
Console.Write(" "+list);
}
- Output:
Number 12878611 has largest minimal factor: 47 101 2713
C++
This uses C++11 features including lambda functions.
#include <iostream>
#include <iterator>
#include <vector>
#include <ppl.h> // MSVC++
#include <concurrent_vector.h> // MSVC++
struct Factors
{
int number;
std::vector<int> primes;
};
const int data[] =
{
12757923, 12878611, 12878893, 12757923, 15808973, 15780709, 197622519
};
int main()
{
// concurrency-safe container replaces std::vector<>
Concurrency::concurrent_vector<Factors> results;
// parallel algorithm replaces std::for_each()
Concurrency::parallel_for_each(std::begin(data), std::end(data), [&](int n)
{
Factors factors;
factors.number = n;
for (int f = 2; n > 1; ++f)
{
while (n % f == 0)
{
factors.primes.push_back(f);
n /= f;
}
}
results.push_back(factors); // add factorization to results
});
// end of parallel calculations
// find largest minimal prime factor in results
auto max = std::max_element(results.begin(), results.end(), [](const Factors &a, const Factors &b)
{
return a.primes.front() < b.primes.front();
});
// print number(s) and factorization
std::for_each(results.begin(), results.end(), [&](const Factors &f)
{
if (f.primes.front() == max->primes.front())
{
std::cout << f.number << " = [ ";
std::copy(f.primes.begin(), f.primes.end(), std::ostream_iterator<int>(std::cout, " "));
std::cout << "]\n";
}
});
return 0;
}
- Output:
12878611 = [ 47 101 2713 ] 12878893 = [ 47 274019 ]
Clojure
(use '[clojure.contrib.lazy-seqs :only [primes]])
(defn lpf [n]
[n (or (last
(for [p (take-while #(<= (* % %) n) primes)
:when (zero? (rem n p))]
p))
1)])
(->> (range 2 100000)
(pmap lpf)
(apply max-key second)
println
time)
- Output:
[99847 313] "Elapsed time: 2547.53566 msecs"
Common Lisp
Depends on quicklisp.
(ql:quickload '(lparallel))
(setf lparallel:*kernel* (lparallel:make-kernel 4)) ;; Configure for your system.
(defun factor (n &optional (acc '()))
(when (> n 1)
(loop with max-d = (isqrt n)
for d = 2 then (if (evenp d) (1+ d) (+ d 2)) do
(cond ((> d max-d) (return (cons (list n 1) acc)))
((zerop (rem n d))
(return (factor (truncate n d)
(if (eq d (caar acc))
(cons
(list (caar acc) (1+ (cadar acc)))
(cdr acc))
(cons (list d 1) acc)))))))))
(defun max-minimum-factor (numbers)
(lparallel:pmap-reduce
(lambda (n) (cons n (apply #'min (mapcar #'car (factor n)))))
(lambda (a b) (if (> (cdr a) (cdr b)) a b))
numbers))
(defun print-max-factor (pair)
(format t "~a has the largest minimum factor ~a~%" (car pair) (cdr pair)))
CL-USER> (print-max-factor (max-minimum-factor '(12757923 12878611 12878893 12757923 15808973 15780709 197622519)))
12878893 has the largest minimum factor 47
D
Using Eager Parallel Map
ulong[] decompose(ulong n) pure nothrow {
typeof(return) result;
for (ulong i = 2; n >= i * i; i++)
for (; n % i == 0; n /= i)
result ~= i;
if (n != 1)
result ~= n;
return result;
}
void main() {
import std.stdio, std.algorithm, std.parallelism, std.typecons;
immutable ulong[] data = [
2UL^^59-1, 2UL^^59-1, 2UL^^59-1, 112_272_537_195_293UL,
115_284_584_522_153, 115_280_098_190_773,
115_797_840_077_099, 112_582_718_962_171,
112_272_537_095_293, 1_099_726_829_285_419];
//auto factors = taskPool.amap!(n => tuple(decompose(n), n))(data);
//static enum genPair = (ulong n) pure => tuple(decompose(n), n);
static genPair(ulong n) pure { return tuple(decompose(n), n); }
auto factors = taskPool.amap!genPair(data);
auto pairs = factors.map!(p => tuple(p[0].reduce!min, p[1]));
writeln("N. with largest min factor: ", pairs.reduce!max[1]);
}
- Output:
N. with largest min factor: 7310027617718202995
Using Threads
import std.stdio, std.math, std.algorithm, std.typecons,
core.thread, core.stdc.time;
final class MinFactor: Thread {
private immutable ulong num;
private ulong[] fac;
private ulong minFac;
this(in ulong n) /*pure nothrow*/ {
super(&run);
num = n;
fac = new ulong[0];
}
@property ulong number() const pure nothrow {
return num;
}
@property const(ulong[]) factors() const pure nothrow {
return fac;
}
@property ulong minFactor() const pure nothrow {
return minFac;
}
private void run() {
immutable clock_t begin = clock;
switch (num) {
case 0: fac = [];
break;
case 1: fac = [1];
break;
default:
uint limit = cast(uint)(1 + double(num).sqrt);
ulong n = num;
for (ulong divi = 3; divi < limit; divi += 2) {
if (n == 1)
break;
if ((n % divi) == 0) {
while ((n > 1) && ((n % divi) == 0)) {
fac ~= divi;
n /= divi;
}
limit = cast(uint)(1 + double(n).sqrt);
}
}
if (n > 1)
fac ~= n;
}
minFac = fac.reduce!min;
immutable clock_t end = clock;
writefln("num: %20d --> min. factor: %20d ticks(%7d -> %7d)",
num, minFac, begin, end);
}
}
void main() {
immutable ulong[] numbers = [
2UL^^59-1, 2UL^^59-1, 2UL^^59-1, 112_272_537_195_293UL,
115_284_584_522_153, 115_280_098_190_773,
115_797_840_077_099, 112_582_718_962_171,
112_272_537_095_293, 1_099_726_829_285_419];
auto tGroup = new ThreadGroup;
foreach (const n; numbers)
tGroup.add(new MinFactor(n));
writeln("Minimum factors for respective numbers are:");
foreach (t; tGroup)
t.start;
tGroup.joinAll;
auto maxMin = tuple(0UL, [0UL], 0UL);
foreach (thread; tGroup) {
auto s = cast(MinFactor)thread;
if (s !is null && maxMin[2] < s.minFactor)
maxMin = tuple(s.number, s.factors.dup, s.minFactor);
}
writefln("Number with largest min. factor is %16d," ~
" with factors:\n\t%s", maxMin.tupleof);
}
- Output:
(1 core CPU, edited to fit page width)
Minimum factors for respective numbers are: num: 576460752303423487 --> min. factor: 179951 ticks( 16 -> 78) num: 576460752303423487 --> min. factor: 179951 ticks( 78 -> 125) num: 576460752303423487 --> min. factor: 179951 ticks( 141 -> 203) num: 112272537195293 --> min. factor: 173 ticks( 203 -> 203) num: 115284584522153 --> min. factor: 513937 ticks( 203 -> 219) num: 115280098190773 --> min. factor: 513917 ticks( 219 -> 250) num: 115797840077099 --> min. factor: 544651 ticks( 250 -> 266) num: 112582718962171 --> min. factor: 3121 ticks( 266 -> 266) num: 112272537095293 --> min. factor: 131 ticks( 266 -> 266) num: 1099726829285419 --> min. factor: 271 ticks( 266 -> 266) Number with largest min. factor is 115797840077099, with factors: [544651, 212609249]
Delphi
program Parallel_calculations;
{$APPTYPE CONSOLE}
uses
System.SysUtils,
System.Threading,
Velthuis.BigIntegers;
function IsPrime(n: BigInteger): Boolean;
var
i: BigInteger;
begin
if n <= 1 then
exit(False);
i := 2;
while i < BigInteger.Sqrt(n) do
begin
if n mod i = 0 then
exit(False);
inc(i);
end;
Result := True;
end;
function GetPrimes(n: BigInteger): TArray<BigInteger>;
var
divisor, next, rest: BigInteger;
begin
divisor := 2;
next := 3;
rest := n;
while (rest <> 1) do
begin
while (rest mod divisor = 0) do
begin
SetLength(Result, Length(Result) + 1);
Result[High(Result)] := divisor;
rest := rest div divisor;
end;
divisor := next;
next := next + 2;
end;
end;
function Min(l: TArray<BigInteger>): BigInteger;
begin
if Length(l) = 0 then
exit(0);
Result := l[0];
for var v in l do
if v < result then
Result := v;
end;
const
n: array of Uint64 = [12757923, 12878611, 12757923, 15808973, 15780709, 197622519];
var
m: BigInteger;
len, j, i: Uint64;
l: TArray<TArray<BigInteger>>;
begin
j := 0;
m := 0;
len := length(n);
SetLength(l, len);
TParallel.for (0, len - 1,
procedure(i: Integer)
begin
l[i] := getPrimes(n[i]);
end);
for i := 0 to len - 1 do
begin
var _min := Min(l[i]);
if _min > m then
begin
m := _min;
j := i;
end;
end;
writeln('Number ', n[j].ToString, ' has largest minimal factor:');
for var v in l[j] do
write(' ', v.ToString);
readln;
end.
Erlang
Perhaps it is of interest that the code will handle exceptions correctly. If the function (in this case factors/1) throws an exception, then the task will get it. I had to copy factors/1 from Prime_decomposition since it is only a fragment, not a complete example.
-module( parallel_calculations ).
-export( [fun_results/2, task/0] ).
fun_results( Fun, Datas ) ->
My_pid = erlang:self(),
Pids = [fun_spawn( Fun, X, My_pid ) || X <- Datas],
[fun_receive(X) || X <- Pids].
task() ->
Numbers = [12757923, 12878611, 12757923, 15808973, 15780709, 197622519],
Results = fun_results( fun factors/1, Numbers ),
Min_results = [lists:min(X) || X <- Results],
{_Max_min_factor, Number} = lists:max( lists:zip(Min_results, Numbers) ),
{Number, Factors} = lists:keyfind( Number, 1, lists:zip(Numbers, Results) ),
io:fwrite( "~p has largest minimal factor among its prime factors ~p~n", [Number, Factors] ).
factors(N) -> factors(N,2,[]).
factors(1,_,Acc) -> Acc;
factors(N,K,Acc) when N rem K == 0 -> factors(N div K,K, [K|Acc]);
factors(N,K,Acc) -> factors(N,K+1,Acc).
fun_receive( Pid ) ->
receive
{ok, Result, Pid} -> Result;
{Type, Error, Pid} -> erlang:Type( Error )
end.
fun_spawn( Fun, Data, My_pid ) ->
erlang:spawn( fun() ->
Result = try
{ok, Fun(Data), erlang:self()}
catch
Type:Error -> {Type, Error, erlang:self()}
end,
My_pid ! Result
end ).
- Output:
8> parallel_calculations:task(). 12878611 has largest minimal factor among its prime factors [2713,101,47]
F#
open System
open PrimeDecomp // Has the decompose function from the Prime decomposition task
let data = [112272537195293L; 112582718962171L; 112272537095293L; 115280098190773L; 115797840077099L; 1099726829285419L]
let decomp num = decompose num 2L
let largestMinPrimeFactor (numbers: int64 list) =
let decompDetails = Async.Parallel [ for n in numbers -> async { return n, decomp n } ] // Compute the number and its prime decomposition list
|> Async.RunSynchronously // Start and wait for all parallel computations to complete.
|> Array.sortBy (snd >> List.min >> (~-)) // Sort in descending order, based on the min prime decomp number.
decompDetails.[0]
let showLargestMinPrimeFactor numbers =
let number, primeList = largestMinPrimeFactor numbers
printf "Number %d has largest minimal factor:\n " number
List.iter (printf "%d ") primeList
showLargestMinPrimeFactor data
- Output:
Number 115797840077099 has largest minimal factor: 544651 212609249
Factor
Manual Thread Management
USING: io kernel fry locals sequences arrays math.primes.factors math.parser channels threads prettyprint ;
IN: <filename>
:: map-parallel ( seq quot -- newseq )
<channel> :> ch
seq [ '[ _ quot call ch to ] "factors" spawn ] { } map-as
dup length [ ch from ] replicate nip ;
{ 576460752303423487 576460752303423487
576460752303423487 112272537195293
115284584522153 115280098190773
115797840077099 112582718962171
112272537095293 1099726829285419 }
dup [ factors ] map-parallel
dup [ infimum ] map dup supremum
swap index swap dupd nth -rot swap nth
"Number with largest min. factor is " swap number>string append
", with factors: " append write .
- Output:
Number with largest min. factor is 576460752303423487, with factors: { 544651 212609249 }
With Concurency Module
USING: kernel io prettyprint sequences arrays math.primes.factors math.parser concurrency.combinators ;
{ 576460752303423487 576460752303423487 576460752303423487 112272537195293
115284584522153 115280098190773 115797840077099 112582718962171 }
dup [ factors ] parallel-map dup [ infimum ] map dup supremum
swap index swap dupd nth -rot swap nth
"Number with largest min. factor is " swap number>string append
", with factors: " append write .
- Output:
Number with largest min. factor is 115797840077099, with factors: { 544651 212609249 }
Fortran
Using OpenMP (compile with -fopenmp)
program Primes
use ISO_FORTRAN_ENV
implicit none
integer(int64), dimension(7) :: data = (/2099726827, 15780709, 1122725370, 15808973, 576460741, 12878611, 12757923/)
integer(int64), dimension(100) :: outprimes
integer(int64) :: largest_factor = 0, largest = 0, minim = 0, val = 0
integer(int16) :: count = 0, OMP_GET_THREAD_NUM
call omp_set_num_threads(4);
!$omp parallel do private(val,outprimes,count) shared(data,largest_factor,largest)
do val = 1, 7
outprimes = 0
call find_factors(data(val), outprimes, count)
minim = minval(outprimes(1:count))
if (minim > largest_factor) then
largest_factor = minim
largest = data(val)
end if
write(*, fmt = '(A7,i0,A2,i12,100i12)') 'Thread ', OMP_GET_THREAD_NUM(), ': ', data(val), outprimes(1:count)
end do
!$omp end parallel do
write(*, fmt = '(i0,A26,i0)') largest, ' have the Largest factor: ', largest_factor
return
contains
subroutine find_factors(n, d, count)
integer(int64), intent(in) :: n
integer(int64), dimension(:), intent(out) :: d
integer(int16), intent(out) :: count
integer(int16) :: i
integer(int64) :: div, next, rest
i = 1
div = 2; next = 3; rest = n
do while (rest /= 1)
do while (mod(rest, div) == 0)
d(i) = div
i = i + 1
rest = rest / div
end do
div = next
next = next + 2
end do
count = i - 1
end subroutine find_factors
end program Primes
- Output:
Thread 3: 12757923 3 3 283 5009 Thread 1: 1122725370 2 3 5 13 2878783 Thread 1: 15808973 29 347 1571 Thread 2: 576460741 19 30340039 Thread 2: 12878611 47 101 2713 Thread 0: 2099726827 11 190884257 Thread 0: 15780709 7 17 132611 12878611 have the Largest factor: 47
FreeBASIC
FreeBASIC does not have native support for parallel or multithreaded programming. However, you can use external C libraries that provide multithreading functionality, such as the POSIX threading library (pthreads) or the Windows threading library.
Here's a basic example of how you could use the pthreads library in FreeBASIC:
#ifdef __FB_WIN32__
' ... instructions only for Win ...
#Include "windows.bi"
Function ThreadFunc As Dword Cdecl Alias "ThreadFunc"(param As Any Ptr) Export
Print "Thread running"
Function = 0
End Function
Dim As HANDLE thread
Dim As Dword threadId
thread = CreateThread(NULL, 0, @ThreadFunc, NULL, 0, @threadId)
If thread = NULL Then
Print "Error creating thread"
Sleep
End 1
End If
WaitForSingleObject(thread, INFINITE)
#endif
#ifdef __FB_LINUX__
' ... instructions only for Linux ...
#Include "crt/pthread.bi"
Function ThreadFunc As Any Ptr Cdecl Alias "ThreadFunc"(param As Any Ptr) Export
Print "Thread running"
Function = 0
End Function
Dim As pthread_t thread
If pthread_create(@thread, NULL, @ThreadFunc, NULL) <> 0 Then
Print "Error creating thread"
Sleep
End 1
End If
pthread_join(thread, NULL)
#endif
Print "Thread finished"
Sleep
Go
package main
import (
"fmt"
"math/big"
)
// collection of numbers. A slice is used for the collection.
// The elements are big integers, since that's what the function Primes
// uses (as was specified by the Prime decomposition task.)
var numbers = []*big.Int{
big.NewInt(12757923),
big.NewInt(12878611),
big.NewInt(12878893),
big.NewInt(12757923),
big.NewInt(15808973),
big.NewInt(15780709),
}
// main just calls the function specified by the task description and
// prints results. note it allows for multiple numbers with the largest
// minimal factor. the task didn't specify to handle this, but obviously
// it's possible.
func main() {
rs := lmf(numbers)
fmt.Println("largest minimal factor:", rs[0].decomp[0])
for _, r := range rs {
fmt.Println(r.number, "->", r.decomp)
}
}
// this type associates a number with it's prime decomposition.
// the type is neccessary so that they can be sent together over
// a Go channel, but it turns out to be convenient as well for
// the return type of lmf.
type result struct {
number *big.Int
decomp []*big.Int
}
// the function specified by the task description, "largest minimal factor."
func lmf([]*big.Int) []result {
// construct result channel and start a goroutine to decompose each number.
// goroutines run in parallel as CPU cores are available.
rCh := make(chan result)
for _, n := range numbers {
go decomp(n, rCh)
}
// collect results. <-rCh returns a single result from the result channel.
// we know how many results to expect so code here collects exactly that
// many results, and accumulates a list of those with the largest
// minimal factor.
rs := []result{<-rCh}
for i := 1; i < len(numbers); i++ {
switch r := <-rCh; r.decomp[0].Cmp(rs[0].decomp[0]) {
case 1:
rs = rs[:1]
rs[0] = r
case 0:
rs = append(rs, r)
}
}
return rs
}
// decomp is the function run as a goroutine. multiple instances of this
// function will run concurrently, one for each number being decomposed.
// it acts as a driver for Primes, calling Primes as needed, packaging
// the result, and sending the packaged result on the channel.
// "as needed" turns out to mean sending Primes a copy of n, as Primes
// as written is destructive on its argument.
func decomp(n *big.Int, rCh chan result) {
rCh <- result{n, Primes(new(big.Int).Set(n))}
}
// code below copied from Prime decomposition task
var (
ZERO = big.NewInt(0)
ONE = big.NewInt(1)
)
func Primes(n *big.Int) []*big.Int {
res := []*big.Int{}
mod, div := new(big.Int), new(big.Int)
for i := big.NewInt(2); i.Cmp(n) != 1; {
div.DivMod(n, i, mod)
for mod.Cmp(ZERO) == 0 {
res = append(res, new(big.Int).Set(i))
n.Set(div)
div.DivMod(n, i, mod)
}
i.Add(i, ONE)
}
return res
}
- Output:
largest minimal factor: 47 12878611 -> [47 101 2713] 12878893 -> [47 274019]
Haskell
import Control.Parallel.Strategies (parMap, rdeepseq)
import Control.DeepSeq (NFData)
import Data.List (maximumBy)
import Data.Function (on)
nums :: [Integer]
nums =
[ 112272537195293
, 112582718962171
, 112272537095293
, 115280098190773
, 115797840077099
, 1099726829285419
]
lowestFactor
:: Integral a
=> a -> a -> a
lowestFactor s n
| even n = 2
| otherwise = head y
where
y =
[ x
| x <- [s .. ceiling . sqrt $ fromIntegral n] ++ [n]
, n `rem` x == 0
, odd x ]
primeFactors
:: Integral a
=> a -> a -> [a]
primeFactors l n = f n l []
where
f n l xs =
if n > 1
then f (n `div` l) (lowestFactor (max l 3) (n `div` l)) (l : xs)
else xs
minPrimes
:: (Control.DeepSeq.NFData a, Integral a)
=> [a] -> (a, [a])
minPrimes ns =
(\(x, y) -> (x, primeFactors y x)) $
maximumBy (compare `on` snd) $ zip ns (parMap rdeepseq (lowestFactor 3) ns)
main :: IO ()
main = print $ minPrimes nums
- Output:
(115797840077099,[212609249,544651])
Icon and Unicon
The following only works in Unicon.
procedure main(A)
threads := []
L := list(*A)
every i := 1 to *A do put(threads, thread L[i] := primedecomp(A[i]))
every wait(!threads)
maxminF := L[maxminI := 1][1]
every i := 2 to *L do if maxminF <:= L[i][1] then maxminI := i
every writes((A[maxminI]||": ")|(!L[maxminI]||" ")|"\n")
end
procedure primedecomp(n) #: return a list of factors
every put(F := [], genfactors(n))
return F
end
link factors
Sample run:
->pc 57646075230343 112272537195 115284584525 57646075230343: 8731 6602459653 ->
J
The code at [1] implements parallel computation. With it, we can write
numbers =. 12757923 12878611 12878893 12757923 15808973 15780709 197622519
factors =. q:&.> parallelize 2 numbers NB. q: is parallelized here
ind =. (i. >./) <./@> factors
ind { numbers ;"_1 factors
┌────────┬───────────┐
│12878611│47 101 2713│
└────────┴───────────┘
Java
import static java.lang.System.out;
import static java.util.Arrays.stream;
import static java.util.Comparator.comparing;
public interface ParallelCalculations {
public static final long[] NUMBERS = {
12757923,
12878611,
12878893,
12757923,
15808973,
15780709,
197622519
};
public static void main(String... arguments) {
stream(NUMBERS)
.unordered()
.parallel()
.mapToObj(ParallelCalculations::minimalPrimeFactor)
.max(comparing(a -> a[0]))
.ifPresent(res -> out.printf(
"%d has the largest minimum prime factor: %d%n",
res[1],
res[0]
));
}
public static long[] minimalPrimeFactor(long n) {
for (long i = 2; n >= i * i; i++) {
if (n % i == 0) {
return new long[]{i, n};
}
}
return new long[]{n, n};
}
}
12878611 has the largest minimum prime factor: 47
JavaScript
This code demonstrates Web Workers. This should work on current versions of Firefox, Safari, Chrome and Opera.
This first portion should be placed in a file called "parallel_worker.js". This file contains the logic used by every worker created.
var onmessage = function(event) {
postMessage({"n" : event.data.n,
"factors" : factor(event.data.n),
"id" : event.data.id});
};
function factor(n) {
var factors = [];
for(p = 2; p <= n; p++) {
if((n % p) == 0) {
factors[factors.length] = p;
n /= p;
}
}
return factors;
}
For each number a worker is spawned. Once the final worker completes its task (worker_count is reduced to 0), the reduce function is called to determine which number is the answer.
var numbers = [12757923, 12878611, 12757923, 15808973, 15780709, 197622519];
var workers = [];
var worker_count = 0;
var results = [];
for(var i = 0; i < numbers.length; i++) {
worker_count++;
workers[i] = new Worker("parallel_worker.js");
workers[i].onmessage = accumulate;
workers[i].postMessage({n: numbers[i], id: i});
}
function accumulate(event) {
n = event.data.n;
factors = event.data.factors;
id = event.data.id;
console.log(n + " : " + factors);
results[id] = {n:n, factors:factors};
// Cleanup - kill the worker and countdown until all work is done
workers[id].terminate();
worker_count--;
if(worker_count == 0)
reduce();
}
function reduce() {
answer = 0;
for(i = 1; i < results.length; i++) {
min = results[i].factors[0];
largest_min = results[answer].factors[0];
if(min > largest_min)
answer = i;
}
n = results[answer].n;
factors = results[answer].factors;
console.log("The number with the relatively largest factors is: " + n + " : " + factors);
}
Julia
using Primes
factortodict(d, n) = (d[minimum(collect(keys(factor(n))))] = n)
# Numbers are from from the Raku example.
numbers = [64921987050997300559, 70251412046988563035, 71774104902986066597,
83448083465633593921, 84209429893632345702, 87001033462961102237,
87762379890959854011, 89538854889623608177, 98421229882942378967,
259826672618677756753, 262872058330672763871, 267440136898665274575,
278352769033314050117, 281398154745309057242, 292057004737291582187]
mins = Dict()
Base.@sync(
Threads.@threads for n in numbers
factortodict(mins, n)
end
)
answer = maximum(keys(mins))
println("The number that has the largest minimum prime factor is $(mins[answer]), with a smallest factor of $answer")
- Output:
The number that has the largest minimum prime factor is 98421229882942378967, with a smallest factor of 736717
Kotlin
// version 1.1.51
import java.util.stream.Collectors
/* returns the number itself, its smallest prime factor and all its prime factors */
fun primeFactorInfo(n: Int): Triple<Int, Int, List<Int>> {
if (n <= 1) throw IllegalArgumentException("Number must be more than one")
if (isPrime(n)) return Triple(n, n, listOf(n))
val factors = mutableListOf<Int>()
var factor = 2
var nn = n
while (true) {
if (nn % factor == 0) {
factors.add(factor)
nn /= factor
if (nn == 1) return Triple(n, factors.min()!!, factors)
if (isPrime(nn)) factor = nn
}
else if (factor >= 3) factor += 2
else factor = 3
}
}
fun isPrime(n: Int) : Boolean {
if (n < 2) return false
if (n % 2 == 0) return n == 2
if (n % 3 == 0) return n == 3
var d = 5
while (d * d <= n) {
if (n % d == 0) return false
d += 2
if (n % d == 0) return false
d += 4
}
return true
}
fun main(args: Array<String>) {
val numbers = listOf(
12757923, 12878611, 12878893, 12757923, 15808973, 15780709, 197622519
)
val info = numbers.stream()
.parallel()
.map { primeFactorInfo(it) }
.collect(Collectors.toList())
val maxFactor = info.maxBy { it.second }!!.second
val results = info.filter { it.second == maxFactor }
println("The following number(s) have the largest minimal prime factor of $maxFactor:")
for (result in results) {
println(" ${result.first} whose prime factors are ${result.third}")
}
}
- Output:
The following number(s) have the largest minimal prime factor of 47: 12878611 whose prime factors are [47, 101, 2713] 12878893 whose prime factors are [47, 274019]
Mathematica /Wolfram Language
hasSmallestFactor[data_List]:=Sort[Transpose[{ParallelTable[FactorInteger[x][[1, 1]], {x, data}],data}]][[1, 2]]
Nim
Using threads
We use one thread per number to process. To find the lowest prime factor, we use a simple algorithm as performance is not important here.
Note that the program must be compiled with option --threads:on
.
import strformat, strutils, threadpool
const Numbers = [576460752303423487,
576460752303423487,
576460752303423487,
112272537195293,
115284584522153,
115280098190773,
115797840077099,
112582718962171,
299866111963290359]
proc lowestFactor(n: int64): int64 =
if n mod 2 == 0: return 2
if n mod 3 == 0: return 3
var p = 5
var delta = 2
while p * p < n:
if n mod p == 0: return p
inc p, delta
delta = 6 - delta
result = n
proc factors(n, lowest: int64): seq[int64] =
var n = n
var lowest = lowest
while true:
result.add lowest
n = n div lowest
if n == 1: break
lowest = lowestFactor(n)
# Launch a thread for each number to process.
var responses: array[Numbers.len, FlowVar[int64]]
for i, n in Numbers:
responses[i] = spawn lowestFactor(n)
# Read the results and find the largest minimum prime factor.
var maxMinfact = 0i64
var maxIdx: int
for i in 0..responses.high:
let minfact = ^responses[i] # Blocking read.
echo &"For n = {Numbers[i]}, the lowest factor is {minfact}."
if minfact > maxMinfact:
maxMinfact = minfact
maxIdx = i
let result = Numbers[maxIdx]
echo ""
echo "The first number with the largest minimum prime factor is: ", result
echo "Its factors are: ", result.factors(maxMinfact).join(", ")
- Output:
For n = 576460752303423487, the lowest factor is 179951. For n = 576460752303423487, the lowest factor is 179951. For n = 576460752303423487, the lowest factor is 179951. For n = 112272537195293, the lowest factor is 173. For n = 115284584522153, the lowest factor is 513937. For n = 115280098190773, the lowest factor is 513917. For n = 115797840077099, the lowest factor is 544651. For n = 112582718962171, the lowest factor is 3121. For n = 299866111963290359, the lowest factor is 544651. The first number with the largest minimum prime factor is: 115797840077099 Its factors are: 544651, 212609249
Using parallel statement
The parallel statement is an experimental feature. It uses threads but may simplify slightly the code in comparison to manual management.
Note that program must be compiled with option --threads:on
.
import sequtils, strutils, threadpool
{.experimental: "parallel".}
const Numbers = [576460752303423487,
576460752303423487,
576460752303423487,
112272537195293,
115284584522153,
115280098190773,
115797840077099,
112582718962171,
299866111963290359]
proc lowestFactor(n: int64): int64 =
if n mod 2 == 0: return 2
if n mod 3 == 0: return 3
var p = 5
var delta = 2
while p * p < n:
if n mod p == 0: return p
inc p, delta
delta = 6 - delta
result = n
proc factors(n, lowest: int64): seq[int64] =
var n = n
var lowest = lowest
while true:
result.add lowest
n = n div lowest
if n == 1: break
lowest = lowestFactor(n)
# Launch the threads.
var results: array[Numbers.len, int64] # To store the results.
parallel:
for i, n in Numbers:
results[i] = spawn lowestFactor(n)
# Find the minimum prime factor and the first number with this minimum factor.
let maxIdx = results.maxIndex()
let maxMinfact = results[maxIdx]
let result = Numbers[maxIdx]
echo ""
echo "The first number with the largest minimum prime factor is: ", result
echo "Its factors are: ", result.factors(maxMinfact).join(", ")
- Output:
Output is the same as with manual management of threads.
Oforth
Version used for #factors is Prime decomposition.
mapParallel runs parallel computations : each element of the list is computed into a separate task.
Default is to use as many workers as number of cores.
Numbers of workers to use can be adjusted using --W command line option.
import: parallel
: largeMinFactor dup mapParallel(#factors) zip maxFor(#[ second first ]) ;
- Output:
[ 12757923, 12878611, 12757923, 15808973, 15780709, 197622519 ] largeMinFactor println [12878611, [47, 101, 2713]]
ooRexx
This program calls the programs shown under REXX (modified for ooRexx and slightly expanded).
/* Concurrency in ooRexx. Example of early reply */
object1 = .example~new
object2 = .example~new
say object1~primes(1,11111111111,11111111114)
say object2~primes(2,11111111111,11111111114)
say "Main ended at" time()
exit
::class example
::method primes
use arg which,bot,top
reply "Start primes"which':' time()
Select
When which=1 Then Call pd1 bot top
When which=2 Then Call pd2 bot top
End
- Output:
Start primes1: 09:25:25 Start primes2: 09:25:25 11111111111 (2) prime factors: 21649 513239 Main ended at 09:25:25 11111111112 (5) prime factors: 2 2 2 3 462962963 11111111111 (2) prime factors: 21649 513239 11111111112 (5) prime factors: 2 2 2 3 462962963 11111111113 (1) prime factors: [prime] 11111111113 11111111113 (1) prime factors: [prime] 11111111113 11111111114 (2) prime factors: 2 5555555557 1 primes found. PD1 took 1.203000 seconds 11111111114 (2) prime factors: 2 5555555557 1 primes found. PD2 took 1.109000 seconds
/*PD1 REXX pgm does prime decomposition of a range of positive integers (with a prime count)*/
Call Time 'R'
numeric digits 1000 /*handle thousand digits for the powers*/
parse arg bot top step base add /*get optional arguments from the C.L. */
if bot=='' then do; bot=1; top=100; end /*no BOT given? Then use the default.*/
if top=='' then top=bot /* " TOP? " " " " " */
if step=='' then step= 1 /* " STEP? " " " " " */
if add =='' then add= -1 /* " ADD? " " " " " */
tell= top>0; top=abs(top) /*if TOP is negative, suppress displays*/
w=length(top) /*get maximum width for aligned display*/
if base\=='' then w=length(base**top) /*will be testing powers of two later? */
commat.=left('', 7); commat.0="{unity}"; commat.1='[prime]' /*some literals: pad; prime (or not).*/
numeric digits max(9, w+1) /*maybe increase the digits precision. */
hash=0 /*hash: is the number of primes found. */
do n=bot to top by step /*process a single number or a range.*/
?=n; if base\=='' then ?=base**n + add /*should we perform a "Mercenne" test? */
pf=factr(?); f=words(pf) /*get prime factors; number of factors.*/
if f==1 then hash=hash+1 /*Is N prime? Then bump prime counter.*/
if tell then say right(?,w) right('('f")",9) 'prime factors: ' commat.f pf
end /*n*/
say
ps= 'primes'; if p==1 then ps= "prime" /*setup for proper English in sentence.*/
say right(hash, w+9+1) ps 'found.' /*display the number of primes found. */
Say 'PD1 took' time('E') 'seconds'
exit /*stick a fork in it, we're all done. */
/*--------------------------------------------------------------------------------------*/
factr: procedure; parse arg x 1 d,dollar /*set X, D to argument 1; dollar to null.*/
if x==1 then return '' /*handle the special case of X = 1. */
do while x//2==0; dollar=dollar 2; x=x%2; end /*append all the 2 factors of new X.*/
do while x//3==0; dollar=dollar 3; x=x%3; end /* " " " 3 " " " " */
do while x//5==0; dollar=dollar 5; x=x%5; end /* " " " 5 " " " " */
do while x//7==0; dollar=dollar 7; x=x%7; end /* " " " 7 " " " " */
/* ___*/
q=1; do while q<=x; q=q*4; end /*these two lines compute integer v X */
r=0; do while q>1; q=q%4; _=d-r-q; r=r%2; if _>=0 then do; d=_; r=r+q; end; end
do j=11 by 6 to r /*insure that J isn't divisible by 3.*/
parse var j '' -1 _ /*obtain the last decimal digit of J. */
if _\==5 then do while x//j==0; dollar=dollar j; x=x%j; end /*maybe reduce by J. */
if _ ==3 then iterate /*Is next Y is divisible by 5? Skip.*/
y=j+2; do while x//y==0; dollar=dollar y; x=x%y; end /*maybe reduce by J. */
end /*j*/
/* [?] The dollar list has a leading blank.*/
if x==1 then return dollar /*Is residual=unity? Then don't append.*/
return dollar x /*return dollar with appended residual. */
/*PD2 REXX pgm does prime decomposition of a range of positive integers (with a prime count)*/
Call time 'R'
numeric digits 1000 /*handle thousand digits for the powers*/
parse arg bot top step base add /*get optional arguments from the C.L. */
if bot=='' then do; bot=1; top=100; end /*no BOT given? Then use the default.*/
if top=='' then top=bot /* " TOP? " " " " " */
if step=='' then step= 1 /* " STEP? " " " " " */
if add =='' then add= -1 /* " ADD? " " " " " */
tell= top>0; top=abs(top) /*if TOP is negative, suppress displays*/
w=length(top) /*get maximum width for aligned display*/
if base\=='' then w=length(base**top) /*will be testing powers of two later? */
commat.=left('', 7); commat.0="{unity}"; commat.1='[prime]' /*some literals: pad; prime (or not).*/
numeric digits max(9, w+1) /*maybe increase the digits precision. */
hash=0 /*hash: is the number of primes found. */
do n=bot to top by step /*process a single number or a range.*/
?=n; if base\=='' then ?=base**n + add /*should we perform a "Mercenne" test? */
pf=factr(?); f=words(pf) /*get prime factors; number of factors.*/
if f==1 then hash=hash+1 /*Is N prime? Then bump prime counter.*/
if tell then say right(?,w) right('('f")",9) 'prime factors: ' commat.f pf
end /*n*/
say
ps= 'primes'; if p==1 then ps= "prime" /*setup for proper English in sentence.*/
say right(hash, w+9+1) ps 'found.' /*display the number of primes found. */
Say 'PD2 took' time('E') 'seconds'
exit /*stick a fork in it, we're all done. */
/*--------------------------------------------------------------------------------------*/
factr: procedure; parse arg x 1 d,dollar /*set X, D to argument 1; dollar to null.*/
if x==1 then return '' /*handle the special case of X = 1. */
do while x// 2==0; dollar=dollar 2; x=x%2; end /*append all the 2 factors of new X.*/
do while x// 3==0; dollar=dollar 3; x=x%3; end /* " " " 3 " " " " */
do while x// 5==0; dollar=dollar 5; x=x%5; end /* " " " 5 " " " " */
do while x// 7==0; dollar=dollar 7; x=x%7; end /* " " " 7 " " " " */
do while x//11==0; dollar=dollar 11; x=x%11; end /* " " " 11 " " " " */ /* ?¦¦¦¦ added.*/
do while x//13==0; dollar=dollar 13; x=x%13; end /* " " " 13 " " " " */ /* ?¦¦¦¦ added.*/
do while x//17==0; dollar=dollar 17; x=x%17; end /* " " " 17 " " " " */ /* ?¦¦¦¦ added.*/
do while x//19==0; dollar=dollar 19; x=x%19; end /* " " " 19 " " " " */ /* ?¦¦¦¦ added.*/
do while x//23==0; dollar=dollar 23; x=x%23; end /* " " " 23 " " " " */ /* ?¦¦¦¦ added.*/
/* ___*/
q=1; do while q<=x; q=q*4; end /*these two lines compute integer v X */
r=0; do while q>1; q=q%4; _=d-r-q; r=r%2; if _>=0 then do; d=_; r=r+q; end; end
do j=29 by 6 to r /*insure that J isn't divisible by 3.*/ /* ?¦¦¦¦ changed.*/
parse var j '' -1 _ /*obtain the last decimal digit of J. */
if _\==5 then do while x//j==0; dollar=dollar j; x=x%j; end /*maybe reduce by J. */
if _ ==3 then iterate /*Is next Y is divisible by 5? Skip.*/
y=j+2; do while x//y==0; dollar=dollar y; x=x%y; end /*maybe reduce by J. */
end /*j*/
/* [?] The dollar list has a leading blank.*/
if x==1 then return dollar /*Is residual=unity? Then don't append.*/
return dollar x /*return dollar with appended residual. */
OxygenBasic
'CONFIGURATION
'=============
% max 8192 'Maximum amount of Prime Numbers (must be 2^n) (excluding 1 and 2)
% cores 4 'CPU cores available (limited to 4 here)
% share 2048 'Amount of numbers allocated to each core
'SETUP
'=====
'SOURCE DATA BUFFERS
sys primes[max]
sys numbers[max]
'RESULT BUFFER
double pp[max] 'main thread
'MULTITHREADING AND TIMING API
'=============================
extern lib "kernel32.dll"
'
void QueryPerformanceCounter(quad*c)
void QueryPerformanceFrequency(quad*freq)
sys CreateThread (sys lpThreadAttributes, dwStackSize, lpStartAddress, lpParameter, dwCreationFlags, *lpThreadId)
dword WaitForMultipleObjects(sys nCount,*lpHandles, bWaitAll, dwMilliseconds)
bool CloseHandle(sys hObject)
void Sleep(sys dwMilliSeconds)
'
quad freq,t1,t2
QueryPerformanceFrequency freq
'MACROS AND FUNCTIONS
'====================
macro FindPrimes(p)
'==================
finit
sys n=1
sys c,k
do
n+=2
if c>=max then exit do
'
'IS IT DIVISIBLE BE ANY PREVIOUS PRIME
'
for k=1 to c
if frac(n/p[k])=0 then exit for
next
'
if k>c then
c++
p[c]=n 'STORE PRIME
end if
end do
end macro
macro ProcessNumbers(p,bb)
'=========================
finit
sys i,b,e
b=bb*share
e=b+share
sys v,w
for i=b+1 to e
v=numbers(i)
for j=max to 1 step -1
w=primes(j)
if w<v then
if frac(v/w)=0 then
p(i)=primes(j) 'store highest factor
exit for 'process next number
end if
end if
next
next
end macro
'THREAD FUNCTIONS
function threadA(sys v) as sys
ProcessNumbers(pp,v)
end function
function threadB(sys v) as sys
ProcessNumbers(pp,v)
end function
function threadC(sys v) as sys
ProcessNumbers(pp,v)
end function
end extern
function mainThread(sys b)
'===========================
ProcessNumbers(pp,b)
end function
'SOURCE DATA GENERATION
sys seed = 0x12345678
function Rnd() as sys
'====================
'
mov eax,seed
rol eax,7
imul eax,eax,13
mov seed,eax
return eax
end function
function GenerateNumbers()
'=========================
sys i,v,mask
mask=max * 8 -1 'as bit mask
for i=1 to max
v=rnd()
v and=mask
numbers(i)=v
next
end function
FindPrimes(primes)
GenerateNumbers()
% threads Cores-1
% INFINITE 0xFFFFFFFF 'Infinite timeout
sys Funs[threads]={@threadA,@threadB,@threadC} '3 additional threads
sys hThread[threads], id[threads], i
'
'START TIMER
'
QueryPerformanceCounter t1
'
for i=1 to threads
hThread(i) = CreateThread 0,0,funs(i),i,0,id(i)
next
MainThread(0) 'process numbers in main thread (bottom share)
if threads>0 then
WaitForMultipleObjects Threads, hThread, 1, INFINITE
end if
for i=1 to Threads
CloseHandle hThread(i)
next
'CAPTURE NUMBER WITH HIGHEST PRIME FACTOR
sys n,f
for i=1 to max
if pp(i)>f then f=pp(i) : n=i
next
'STOP TIMER
QueryPerformanceCounter t2
print str((t2-t1)/freq,3) " secs " numbers(n) " " f 'number with highest prime factor
PARI/GP
See Bill Allombert's slides on parallel programming in GP. This can be configured to use either MPI (good for many networked computers) or pthreads (good for a single machine).
v=pareval(vector(1000,i,()->factor(2^i+1)[1,1]));
vecmin(v)
Pascal
Free Pascal
program FactorialPrimes;
{$mode ObjFPC}{$H+}
uses
{$ifdef unix} cthreads, {$endif} SysUtils, Math;
type
TArr = array of UInt32;
const
Numbers: array[0..5] of UInt32 = (12757923, 197622519, 12878611, 12757923, 15808973, 15780709);
var
Finished: UInt32;
PrimeFactors: array[0..5] of TArr;
function GetPrimes(n: UInt32): TArr;
var
Divisor, Next, Rest: UInt32;
begin
Divisor := 2;
Next := 3;
Rest := n;
while (Rest <> 1) do
begin
while (Rest mod Divisor = 0) do
begin
SetLength(Result, Length(Result) + 1);
Result[High(Result)] := Divisor;
Rest := Rest div Divisor;
end;
Divisor := Next;
Next := Next + 2;
end;
end;
function FindPrimeFactors(p: Pointer): PtrInt; {threaded function}
var
Index: UInt32;
begin
Index := UInt32(p);
PrimeFactors[Index] := GetPrimes(Numbers[Index]);
InterLockedIncrement(Finished);
Result := 0;
end;
function LowestFactor(factors: TArr): UInt32;
var
Factor: UInt32;
begin
Result := factors[0];
for Factor in factors do
Result := Min(Result, Factor);
end;
var
ThreadCount, i, MinFactorIndex, MaxMinFactor, Factor: UInt32;
begin
Finished := 0;
ThreadCount := Length(Numbers);
for i := 0 to High(Numbers) do
BeginThread(@FindPrimeFactors, Pointer(i));
while Finished < ThreadCount do;
MaxMinFactor := 0;
MinFactorIndex := 0;
for i := 0 to High(Numbers) do
begin
Factor := LowestFactor(PrimeFactors[i]);
if Factor > MaxMinFactor then
begin
MaxMinFactor := Factor;
MinFactorIndex := i;
end;
end;
Writeln('Number ', Numbers[MinFactorIndex], ' has the largest minimal factor:');
for Factor in PrimeFactors[MinFactorIndex] do
Write(' ', Factor);
end.
- Output:
Number 12878611 has the largest minimal factor: 47 101 2713
Perl
use ntheory qw/factor vecmax/;
use threads;
use threads::shared;
my @results :shared;
my $tnum = 0;
$_->join() for
map { threads->create('tfactor', $tnum++, $_) }
(qw/576460752303423487 576460752303423487 576460752303423487 112272537195293
115284584522153 115280098190773 115797840077099 112582718962171 299866111963290359/);
my $lmf = vecmax( map { $_->[1] } @results );
print "Largest minimal factor of $lmf found in:\n";
print " $_->[0] = [@$_[1..$#$_]]\n" for grep { $_->[1] == $lmf } @results;
sub tfactor {
my($tnum, $n) = @_;
push @results, shared_clone([$n, factor($n)]);
}
- Output:
Largest minimal factor of 544651 found in: 115797840077099 = [544651 212609249] 299866111963290359 = [544651 550565613509]
Phix
-- -- demo\rosetta\ParallelCalculations.exw -- ===================================== -- -- Proof that more threads can make things faster... -- without js -- (threads) include mpfr.e sequence res constant res_cs = init_cs() -- critical section procedure athread() mpz z = mpz_init() while true do integer found = 0 enter_cs(res_cs) for i=1 to length(res) do if integer(res[i]) and res[i]>0 then found = i res[i] = 0 exit end if end for leave_cs(res_cs) if not found then exit end if mpz_ui_pow_ui(z,2,found) mpz_add_ui(z,z,1) object r = mpz_prime_factors(z, 1_000_000) enter_cs(res_cs) res[found] = r r = 0 leave_cs(res_cs) end while exit_thread(0) end procedure for nthreads=1 to 5 do progress("testing %d threads...",{nthreads}) atom t0 = time() res = tagset(100) sequence threads = {} for i=1 to nthreads do threads = append(threads,create_thread(routine_id("athread"),{})) end for wait_thread(threads) integer k = largest(res,true) string e = elapsed(time()-t0) printf(1,"largest is 2^%d+1 with smallest factor of %d (%d threads, %s)\n", {k,res[k][1][1],nthreads,e}) end for delete_cs(res_cs)
- Output:
largest is 2^64+1 with smallest factor of 274177 (1 threads, 9.5s) largest is 2^64+1 with smallest factor of 274177 (2 threads, 7.1s) largest is 2^64+1 with smallest factor of 274177 (3 threads, 5.8s) largest is 2^64+1 with smallest factor of 274177 (4 threads, 4.5s) largest is 2^64+1 with smallest factor of 274177 (5 threads, 4.5s)
IE: Checking the first 1 million primes as factors of 2^(1..100)+1 takes 1 core 9.5s and less when spread over multiple cores.
Note however that I added quite a bit of locking to mpz_prime_factors(), specifically around get_prime() and mpz_probable_prime(),
[happy to leave it in since the effect on the single-thread case was neglible] so it is really only the mpz_divisible_ui_p() calls
and some control-flow scaffolding that is getting parallelised. I only got 4 cores so the 5th thread was not expected to help.
PicoLisp
The 'later' function is used in PicoLisp to start parallel computations. The following solution calls 'later' on the 'factor' function from Prime decomposition#PicoLisp, and then 'wait's until all results are available:
(let Lst
(mapcan
'((N)
(later (cons) # When done,
(cons N (factor N)) ) ) # return the number and its factors
(quote
188573867500151328137405845301 # Process a collection of 12 numbers
3326500147448018653351160281
979950537738920439376739947
2297143294659738998811251
136725986940237175592672413
3922278474227311428906119
839038954347805828784081
42834604813424961061749793
2651919914968647665159621
967022047408233232418982157
2532817738450130259664889
122811709478644363796375689 ) )
(wait NIL (full Lst)) # Wait until all computations are done
(maxi '((L) (apply min L)) Lst) ) # Result: Number in CAR, factors in CDR
- Output:
-> (2532817738450130259664889 6531761 146889539 2639871491)
Prolog
This piece needs prime_decomp definition from the Prime decomposition#Prolog example, it worked on my swipl, but I don't know how other Dialects thread.
threaded_decomp(Number,ID):-
thread_create(
(prime_decomp(Number,Y),
thread_exit((Number,Y)))
,ID,[]).
threaded_decomp_list(List,Erg):-
maplist(threaded_decomp,List,IDs),
maplist(thread_join,IDs,Results),
maplist(pack_exit_out,Results,Smallest_Factors_List),
largest_min_factor(Smallest_Factors_List,Erg).
pack_exit_out(exited(X),X).
%Note that here some error handling should happen.
largest_min_factor([(N,Facs)|A],(N2,Fs2)):-
min_list(Facs,MF),
largest_min_factor(A,(N,MF,Facs),(N2,_,Fs2)).
largest_min_factor([],Acc,Acc).
largest_min_factor([(N1,Facs1)|Rest],(N2,MF2,Facs2),Goal):-
min_list(Facs1, MF1),
(MF1 > MF2->
largest_min_factor(Rest,(N1,MF1,Facs1),Goal);
largest_min_factor(Rest,(N2,MF2,Facs2),Goal)).
format_it(List):-
threaded_decomp_list(List,(Number,Factors)),
format('Number with largest minimal Factor is ~w\nFactors are ~w\n',
[Number,Factors]).
Example (Numbers Same as in Ada Example):
?- ['prime_decomp.prolog', parallel]. % prime_decomp.prolog compiled 0.00 sec, 3,392 bytes % parallel compiled 0.00 sec, 4,672 bytes true. format_it([12757923, 12878611, 12757923, 15808973, 15780709, 197622519]). Number with largest minimal factor is 12878611 Factors are [2713, 101, 47] true.
PureBasic
Structure IO_block
ThreadID.i
StartSeamaphore.i
Value.q
MinimumFactor.i
List Factors.i()
EndStructure
;\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
Declare Factorize(*IO.IO_block)
Declare main()
;\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
Main()
End
;\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
Procedure Main()
Protected AvailableCpu, MainSemaphore
Protected i, j, qData.q, Title$, Message$
NewList T.IO_block()
;
AvailableCpu = Val(GetEnvironmentVariable("NUMBER_OF_PROCESSORS"))
If AvailableCpu<1: AvailableCpu=1: EndIf
MainSemaphore = CreateSemaphore(AvailableCpu)
;
Restore Start_of_data
For i=1 To (?end_of_data-?Start_of_data) / SizeOf(Quad)
; Start all threads at ones, they will then be let to
; self-oganize according to the availiable Cores.
AddElement(T())
Read.q qData
T()\Value = qData
T()\StartSeamaphore = MainSemaphore
T()\ThreadID = CreateThread(@Factorize(), @T())
Next
;
ForEach T()
; Wait for all threads to complete their work and
; find the smallest factor from eact task.
WaitThread(T()\ThreadID)
Next
;
i = OffsetOf(IO_block\MinimumFactor)
SortStructuredList(T(), #PB_Sort_Integer, i, #PB_Sort_Descending)
FirstElement(T())
Title$="Info"
Message$="Number "+Str(T()\Value)+" has largest minimal factor:"+#CRLF$
ForEach T()\Factors()
Message$ + Str(T()\Factors())+" "
Next
MessageRequester(Title$, Message$)
EndProcedure
ProcedureDLL Factorize(*IO.IO_block) ; Fill list Factors() with the factor parts of Number
;Based on http://rosettacode.org/wiki/Prime_decomposition#PureBasic
With *IO
Protected Value.q=\Value
WaitSemaphore(\StartSeamaphore)
Protected I = 3
ClearList(\Factors())
While Value % 2 = 0
AddElement(\Factors())
\Factors() = 2
Value / 2
Wend
Protected Max = Value
While I <= Max And Value > 1
While Value % I = 0
AddElement(\Factors())
\Factors() = I
Value / I
Wend
I + 2
Wend
SortList(\Factors(), #PB_Sort_Ascending)
FirstElement(\Factors())
\MinimumFactor=\Factors()
SignalSemaphore(\StartSeamaphore)
EndWith ;*IO
EndProcedure
DataSection
Start_of_data: ; Same numbers as Ada
Data.q 12757923, 12878611, 12757923, 15808973, 15780709, 197622519
end_of_data:
EndDataSection
Python
Python3 - concurrent
Python 3.2 has a new concurrent.futures module that allows for the easy specification of thread-parallel or process-parallel processes. The following is modified from their example and will run, by default, with as many processes as there are available cores on your machine.
Note that there is no need to calculate all prime factors of all NUMBERS
when only the prime factors of the number with the lowest overall prime factor is needed.
from concurrent import futures
from math import floor, sqrt
NUMBERS = [
112272537195293,
112582718962171,
112272537095293,
115280098190773,
115797840077099,
1099726829285419]
# NUMBERS = [33, 44, 55, 275]
def lowest_factor(n, _start=3):
if n % 2 == 0:
return 2
search_max = int(floor(sqrt(n))) + 1
for i in range(_start, search_max, 2):
if n % i == 0:
return i
return n
def prime_factors(n, lowest):
pf = []
while n > 1:
pf.append(lowest)
n //= lowest
lowest = lowest_factor(n, max(lowest, 3))
return pf
def prime_factors_of_number_with_lowest_prime_factor(NUMBERS):
with futures.ProcessPoolExecutor() as executor:
low_factor, number = max( (l, f) for l, f in zip(executor.map(lowest_factor, NUMBERS), NUMBERS) )
all_factors = prime_factors(number, low_factor)
return number, all_factors
def main():
print('For these numbers:')
print('\n '.join(str(p) for p in NUMBERS))
number, all_factors = prime_factors_of_number_with_lowest_prime_factor(NUMBERS)
print(' The one with the largest minimum prime factor is {}:'.format(number))
print(' All its prime factors in order are: {}'.format(all_factors))
if __name__ == '__main__':
main()
- Output:
For these numbers: 112272537195293 112582718962171 112272537095293 115280098190773 115797840077099 1099726829285419 The one with the largest minimum prime factor is 115797840077099: All its prime factors in order are: [544651, 212609249]
Python General - multiprocessing
This method works for both Python2 and Python3 using the standard library module multiprocessing. The result of the following code is the same as the previous example only the different package is used.
import multiprocessing
# ========== #Python3 - concurrent
from math import floor, sqrt
numbers = [
112272537195293,
112582718962171,
112272537095293,
115280098190773,
115797840077099,
1099726829285419]
# numbers = [33, 44, 55, 275]
def lowest_factor(n, _start=3):
if n % 2 == 0:
return 2
search_max = int(floor(sqrt(n))) + 1
for i in range(_start, search_max, 2):
if n % i == 0:
return i
return n
def prime_factors(n, lowest):
pf = []
while n > 1:
pf.append(lowest)
n //= lowest
lowest = lowest_factor(n, max(lowest, 3))
return pf
# ========== #Python3 - concurrent
def prime_factors_of_number_with_lowest_prime_factor(numbers):
pool = multiprocessing.Pool(processes=5)
factors = pool.map(lowest_factor,numbers)
low_factor,number = max((l,f) for l,f in zip(factors,numbers))
all_factors = prime_factors(number,low_factor)
return number,all_factors
if __name__ == '__main__':
print('For these numbers:')
print('\n '.join(str(p) for p in numbers))
number, all_factors = prime_factors_of_number_with_lowest_prime_factor(numbers)
print(' The one with the largest minimum prime factor is {}:'.format(number))
print(' All its prime factors in order are: {}'.format(all_factors))
Racket
#lang racket
(require math)
(provide main)
(define (smallest-factor n)
(list (first (first (factorize n))) n))
(define numbers
'(112272537195293 112582718962171 112272537095293
115280098190773 115797840077099 1099726829285419))
(define (main)
; create as many instances of Racket as
; there are numbers:
(define ps
(for/list ([_ numbers])
(place ch
(place-channel-put
ch
(smallest-factor
(place-channel-get ch))))))
; send the numbers to the instances:
(map place-channel-put ps numbers)
; get the results and find the maximum:
(argmax first (map place-channel-get ps)))
Session:
> (main)
'(544651 115797840077099)
Raku
(formerly Perl 6)
Takes the list of numbers and converts them to a HyperSeq that is stored in a variable and evaluated concurrently. HyperSeqs overload map and grep to convert and pick values in worker threads. The runtime will pick the number of OS-level threads and assign worker threads to them while avoiding stalling in any part of the program. A HyperSeq is lazy, so the computation of values will happen in chunks as they are requested.
The hyper (and race) method can take two parameters that will tweak how the parallelization occurs: :degree and :batch. :degree is the number of worker threads to allocate to the job. By default it is set to the number of physical cores available. If you have a hyper threading processor, and the tasks are not cpu bound, it may be useful to raise that number but it is a reasonable default. :batch is how many sub-tasks are parceled out at a time to each worker thread. Default is 64. For small numbers of cpu intensive tasks a lower number will likely be better, but too low may make the dispatch overhead cancel out the benefit of threading. Conversely, too high will over-burden some threads and starve others. Over long-running processes with many hundreds / thousands of sub-tasks, the scheduler will automatically adjust the batch size up or down to try to keep the pipeline filled.
On my system, under the load I was running, I found a batch size of 3 to be optimal for this task. May be different for different systems and different loads.
As a relative comparison, perform the same factoring task on the same set of 100 numbers as found in the SequenceL example, using varying numbers of threads. The absolute speed numbers are not very significant, they will vary greatly between systems, this is more intended as a comparison of relative throughput. On a Core i7-4770 @ 3.40GHz with 4 cores and hyper-threading under Linux, there is a distinct pattern where more threads on physical cores give reliable increases in throughput. Adding hyperthreads may (and, in this case, does seem to) give some additional marginal benefit.
Using the prime-factors routine as defined in the prime decomposition task.
my @nums = 64921987050997300559, 70251412046988563035, 71774104902986066597,
83448083465633593921, 84209429893632345702, 87001033462961102237,
87762379890959854011, 89538854889623608177, 98421229882942378967,
259826672618677756753, 262872058330672763871, 267440136898665274575,
278352769033314050117, 281398154745309057242, 292057004737291582187;
my @factories = @nums.hyper(:3batch).map: &prime-factors;
printf "%21d factors: %s\n", |$_ for @nums Z @factories;
my $gmf = {}.append(@factories»[0] »=>« @nums).max: +*.key;
say "\nGreatest minimum factor: ", $gmf.key;
say "from: { $gmf.value }\n";
say 'Run time: ', now - INIT now;
say '-' x 80;
# For amusements sake and for relative comparison, using the same 100
# numbers as in the SequenceL example, testing with different numbers of threads.
@nums = <625070029 413238785 815577134 738415913 400125878 967798656 830022841
774153795 114250661 259366941 571026384 522503284 757673286 509866901 6303092
516535622 177377611 520078930 996973832 148686385 33604768 384564659 95268916
659700539 149740384 320999438 822361007 701572051 897604940 2091927 206462079
290027015 307100080 904465970 689995756 203175746 802376955 220768968 433644101
892007533 244830058 36338487 870509730 350043612 282189614 262732002 66723331
908238109 635738243 335338769 461336039 225527523 256718333 277834108 430753136
151142121 602303689 847642943 538451532 683561566 724473614 422235315 921779758
766603317 364366380 60185500 333804616 988528614 933855820 168694202 219881490
703969452 308390898 567869022 719881996 577182004 462330772 770409840 203075270
666478446 351859802 660783778 503851023 789751915 224633442 347265052 782142901
43731988 246754498 736887493 875621732 594506110 854991694 829661614 377470268
984990763 275192380 39848200 892766084 76503760>».Int;
for 1..8 -> $degree {
my $start = now;
my \factories = @nums.hyper(:degree($degree), :3batch).map: &prime-factors;
my $gmf = {}.append(factories»[0] »=>« @nums).max: +*.key;
say "\nFactoring {+@nums} numbers, greatest minimum factor: {$gmf.key}";
say "Using: $degree thread{ $degree > 1 ?? 's' !! ''}";
my $end = now;
say 'Run time: ', $end - $start, ' seconds.';
}
# Prime factoring routines from the Prime decomposition task
sub prime-factors ( Int $n where * > 0 ) {
return $n if $n.is-prime;
return [] if $n == 1;
my $factor = find-factor( $n );
sort flat prime-factors( $factor ), prime-factors( $n div $factor );
}
sub find-factor ( Int $n, $constant = 1 ) {
return 2 unless $n +& 1;
if (my $gcd = $n gcd 6541380665835015) > 1 {
return $gcd if $gcd != $n
}
my $x = 2;
my $rho = 1;
my $factor = 1;
while $factor == 1 {
$rho *= 2;
my $fixed = $x;
for ^$rho {
$x = ( $x * $x + $constant ) % $n;
$factor = ( $x - $fixed ) gcd $n;
last if 1 < $factor;
}
}
$factor = find-factor( $n, $constant + 1 ) if $n == $factor;
$factor;
}
- Typical output:
64921987050997300559 factors: 736717 88123373087627 70251412046988563035 factors: 5 43 349 936248577956801 71774104902986066597 factors: 736717 97424255043641 83448083465633593921 factors: 736717 113270202079813 84209429893632345702 factors: 2 3 3 3 41 107880821 352564733 87001033462961102237 factors: 736717 118092881612561 87762379890959854011 factors: 3 3 3 3 331 3273372119315201 89538854889623608177 factors: 736717 121537652707381 98421229882942378967 factors: 736717 133594351539251 259826672618677756753 factors: 7 37118096088382536679 262872058330672763871 factors: 3 47 1864340839224629531 267440136898665274575 factors: 3 5 5 71 50223499887073291 278352769033314050117 factors: 7 39764681290473435731 281398154745309057242 factors: 2 809 28571 46061 132155099 292057004737291582187 factors: 7 151 373 2339 111323 2844911 Greatest minimum factor: 736717 from: 64921987050997300559 71774104902986066597 83448083465633593921 87001033462961102237 89538854889623608177 98421229882942378967 Run time: 0.2968644 -------------------------------------------------------------------------------- Factoring 100 numbers, greatest minimum factor: 782142901 Using: 1 thread Run time: 0.3438752 seconds. Factoring 100 numbers, greatest minimum factor: 782142901 Using: 2 threads Run time: 0.2035372 seconds. Factoring 100 numbers, greatest minimum factor: 782142901 Using: 3 threads Run time: 0.14177834 seconds. Factoring 100 numbers, greatest minimum factor: 782142901 Using: 4 threads Run time: 0.110738 seconds. Factoring 100 numbers, greatest minimum factor: 782142901 Using: 5 threads Run time: 0.10142434 seconds. Factoring 100 numbers, greatest minimum factor: 782142901 Using: 6 threads Run time: 0.10954304 seconds. Factoring 100 numbers, greatest minimum factor: 782142901 Using: 7 threads Run time: 0.097886 seconds. Factoring 100 numbers, greatest minimum factor: 782142901 Using: 8 threads Run time: 0.0927695 seconds.
Beside HyperSeq and its (allowed to be) out-of-order equivalent RaceSeq, Rakudo supports primitive threads, locks and highlevel promises. Using channels and supplies values can be move thread-safely from one thread to another. A react-block can be used as a central hub for message passing.
In Raku most errors are bottled up Exceptions inside Failure objects that remember where they are created and thrown when used. This is useful to pass errors from one thread to another without losing file and line number of the source file that caused the error.
In the future hyper operators, junctions and feeds will be candidates for autothreading.
Rust
//! This solution uses [rayon](https://github.com/rayon-rs/rayon), a data-parallelism library.
//! Since Rust guarantees that a program has no data races, adding parallelism to a sequential
//! computation is as easy as importing the rayon traits and calling the `par_iter()` method.
extern crate rayon;
extern crate prime_decomposition;
use rayon::prelude::*;
/// Returns the largest minimal factor of the numbers in a slice
pub fn largest_min_factor(numbers: &[usize]) -> usize {
numbers
.par_iter()
.map(|n| {
// `factor` returns a sorted vector, so we just take the first element.
prime_decomposition::factor(*n)[0]
})
.max()
.unwrap()
}
fn main() {
let numbers = &[
1_122_725, 1_125_827, 1_122_725, 1_152_800, 1_157_978, 1_099_726,
];
let max = largest_min_factor(numbers);
println!("The largest minimal factor is {}", max);
}
- Output:
The largest minimal factor is 23
SequenceL
SequenceL compiles to parallel C++ without any input from the user regarding explicit parallelization. The number of threads to be executed on can be specified at runtime, or by default, the runtime detects the maximum number of logical cores and uses that many threads.
import <Utilities/Conversion.sl>;
import <Utilities/Math.sl>;
import <Utilities/Sequence.sl>;
main(args(2)) :=
let
inputs := stringToInt(args);
factored := primeFactorization(inputs);
minFactors := vectorMin(factored);
indexOfMax := firstIndexOf(minFactors, vectorMax(minFactors));
in
"Number " ++ intToString(inputs[indexOfMax]) ++ " has largest minimal factor:\n" ++ delimit(intToString(factored[indexOfMax]), ' ');
Using the Trial Division version of primeFactorization here: [2]
The primary source of parallelization in the above code is from the line:
factored := primeFactorization(inputs);
Since primeFactorization is defined on scalar integers and inputs is a sequence of integers, this call results in a Normalize Transpose. The value of factored will be the sequence of results of applying primeFactorization to each element of inputs.
Normalize Transpose is one of the semantics which allows SequenceL to automatically generate parallel code.
To test the performance, I ran the program on 100 randomly generated (generated using random.org) numbers between 1 and 1,000,000,000. The system used to test had an i7-4702MQ@2.2GHz
Input:
625070029 413238785 815577134 738415913 400125878 967798656 830022841 774153795 114250661 259366941 571026384 522503284 757673286 509866901 6303092 516535622 177377611 520078930 996973832 148686385 33604768 384564659 95268916 659700539 149740384 320999438 822361007 701572051 897604940 2091927 206462079 290027015 307100080 904465970 689995756 203175746 802376955 220768968 433644101 892007533 244830058 36338487 870509730 350043612 282189614 262732002 66723331 908238109 635738243 335338769 461336039 225527523 256718333 277834108 430753136 151142121 602303689 847642943 538451532 683561566 724473614 422235315 921779758 766603317 364366380 60185500 333804616 988528614 933855820 168694202 219881490 703969452 308390898 567869022 719881996 577182004 462330772 770409840 203075270 666478446 351859802 660783778 503851023 789751915 224633442 347265052 782142901 43731988 246754498 736887493 875621732 594506110 854991694 829661614 377470268 984990763 275192380 39848200 892766084 76503760
- Output:
cmd:> main.exe --sl_threads 1 --sl_timer 625070029 413238785 815577134 738415913 400125878 967798656 830022841 774153795 114250661 259366941 571026384 522503284 757673286 509866901 6303092 516535622 177377611 520078930 996973832 148686385 33604768 384564659 95268916 659700539 149740384 320999438 822361007 701572051 897604940 2091927 206462079 290027015 307100080 904465970 689995756 203175746 802376955 220768968 433644101 892007533 244830058 36338487 870509730 350043612 282189614 262732002 66723331 908238109 635738243 335338769 461336039 225527523 256718333 277834108 430753136 151142121 602303689 847642943 538451532 683561566 724473614 422235315 921779758 766603317 364366380 60185500 333804616 988528614 933855820 168694202 219881490 703969452 308390898 567869022 719881996 577182004 462330772 770409840 203075270 666478446 351859802 660783778 503851023 789751915 224633442 347265052 782142901 43731988 246754498 736887493 875621732 594506110 854991694 829661614 377470268 984990763 275192380 39848200 892766084 76503760 "Number 782142901 has largest minimal factor: 782142901" Total Time:8.83028 cmd:> main.exe --sl_threads 2 --sl_timer 625070029 413238785 815577134 738415913 400125878 967798656 830022841 774153795 114250661 259366941 571026384 522503284 757673286 509866901 6303092 516535622 177377611 520078930 996973832 148686385 33604768 384564659 95268916 659700539 149740384 320999438 822361007 701572051 897604940 2091927 206462079 290027015 307100080 904465970 689995756 203175746 802376955 220768968 433644101 892007533 244830058 36338487 870509730 350043612 282189614 262732002 66723331 908238109 635738243 335338769 461336039 225527523 256718333 277834108 430753136 151142121 602303689 847642943 538451532 683561566 724473614 422235315 921779758 766603317 364366380 60185500 333804616 988528614 933855820 168694202 219881490 703969452 308390898 567869022 719881996 577182004 462330772 770409840 203075270 666478446 351859802 660783778 503851023 789751915 224633442 347265052 782142901 43731988 246754498 736887493 875621732 594506110 854991694 829661614 377470268 984990763 275192380 39848200 892766084 76503760 "Number 782142901 has largest minimal factor: 782142901" Total Time:5.67931 cmd:> main.exe --sl_threads 3 --sl_timer 625070029 413238785 815577134 738415913 400125878 967798656 830022841 774153795 114250661 259366941 571026384 522503284 757673286 509866901 6303092 516535622 177377611 520078930 996973832 148686385 33604768 384564659 95268916 659700539 149740384 320999438 822361007 701572051 897604940 2091927 206462079 290027015 307100080 904465970 689995756 203175746 802376955 220768968 433644101 892007533 244830058 36338487 870509730 350043612 282189614 262732002 66723331 908238109 635738243 335338769 461336039 225527523 256718333 277834108 430753136 151142121 602303689 847642943 538451532 683561566 724473614 422235315 921779758 766603317 364366380 60185500 333804616 988528614 933855820 168694202 219881490 703969452 308390898 567869022 719881996 577182004 462330772 770409840 203075270 666478446 351859802 660783778 503851023 789751915 224633442 347265052 782142901 43731988 246754498 736887493 875621732 594506110 854991694 829661614 377470268 984990763 275192380 39848200 892766084 76503760 "Number 782142901 has largest minimal factor: 782142901" Total Time:3.57379 cmd:> main.exe --sl_threads 4 --sl_timer 625070029 413238785 815577134 738415913 400125878 967798656 830022841 774153795 114250661 259366941 571026384 522503284 757673286 509866901 6303092 516535622 177377611 520078930 996973832 148686385 33604768 384564659 95268916 659700539 149740384 320999438 822361007 701572051 897604940 2091927 206462079 290027015 307100080 904465970 689995756 203175746 802376955 220768968 433644101 892007533 244830058 36338487 870509730 350043612 282189614 262732002 66723331 908238109 635738243 335338769 461336039 225527523 256718333 277834108 430753136 151142121 602303689 847642943 538451532 683561566 724473614 422235315 921779758 766603317 364366380 60185500 333804616 988528614 933855820 168694202 219881490 703969452 308390898 567869022 719881996 577182004 462330772 770409840 203075270 666478446 351859802 660783778 503851023 789751915 224633442 347265052 782142901 43731988 246754498 736887493 875621732 594506110 854991694 829661614 377470268 984990763 275192380 39848200 892766084 76503760 "Number 782142901 has largest minimal factor: 782142901" Total Time:2.86046 cmd:> main.exe --sl_threads 0 --sl_timer 625070029 413238785 815577134 738415913 400125878 967798656 830022841 774153795 114250661 259366941 571026384 522503284 757673286 509866901 6303092 516535622 177377611 520078930 996973832 148686385 33604768 384564659 95268916 659700539 149740384 320999438 822361007 701572051 897604940 2091927 206462079 290027015 307100080 904465970 689995756 203175746 802376955 220768968 433644101 892007533 244830058 36338487 870509730 350043612 282189614 262732002 66723331 908238109 635738243 335338769 461336039 225527523 256718333 277834108 430753136 151142121 602303689 847642943 538451532 683561566 724473614 422235315 921779758 766603317 364366380 60185500 333804616 988528614 933855820 168694202 219881490 703969452 308390898 567869022 719881996 577182004 462330772 770409840 203075270 666478446 351859802 660783778 503851023 789751915 224633442 347265052 782142901 43731988 246754498 736887493 875621732 594506110 854991694 829661614 377470268 984990763 275192380 39848200 892766084 76503760 "Number 782142901 has largest minimal factor: 782142901" Total Time:3.01593
The i7 has 4 physical cores with hyperthreading. You can see that nearly linear speedup is gained, automatically, while only using the physical cores. Once the hyperthreaded cores are used the performance suffers slightly.
Sidef
The code uses the prime_factors() function defined in the "Prime decomposition" task.
var nums = [1275792312878611, 12345678915808973,
1578070919762253, 14700694496703910,];
var factors = nums.map {|n| prime_factors.ffork(n) }.map { .wait }
say ((nums ~Z factors)->max_by {|m| m[1][0] })
- Output:
$ time sidef parallel.sf [1275792312878611, [11, 7369, 15739058129]] sidef parallel.sf 24.46s user 0.02s system 158% cpu 15.436 total
Standard ML
Parallel code is from the 'concurrent computing task'. Works with PolyML. Function -factor- is a deformatted version of the one from the prime decomposition page.
structure TTd = Thread.Thread ;
structure TTm = Thread.Mutex ;
val threadedBigPrime = fn input:IntInf.int list =>
let
(* --------------------- code from prime decomposition page ------------------- *)
val factor = fn n :IntInf.int =>
let
val unfactored = fn (u,_,_) => u; val factors = fn (_,f,_) => f; val try = fn (_,_,i) => i; fun getresult t = unfactored t::(factors t);
fun until done change x = if done x then getresult x else until done change (change x); (* iteration *)
fun lastprime t = unfactored t < (try t)*(try t)
fun trymore t = if unfactored t mod (try t) = 0 then (unfactored t div (try t) , try t::(factors t) , try t) else (unfactored t, factors t , try t + 1)
in until lastprime trymore (n,[],2) end;
(* --------------------- end of code from prime decomposition page ------------ *)
val mx = TTm.mutex () ;
val results : IntInf.int list list ref = ref [ ] ;
val tasks : IntInf.int list list ref = ref [ ] ;
val divideup = fn cores => fn inp : IntInf.int list =>
let
val np = (List.length inp) div cores + (cores +1) div cores (* assume length > cores to reduce code *)
val rec divd = fn ([], outp) => ([],outp )
| (inp,outp) => divd ( List.drop (inp,np) , (List.take (inp,np))::outp ) handle Subscript => ([],inp :: outp)
in
#2 ( divd (inp, [ ] ))
end;
val doTask = fn () =>
let
val mytask : IntInf.int list ref = ref [];
val myres : IntInf.int list list ref = ref [];
in
( TTm.lock mx ; mytask := hd ( !tasks ) ; tasks:= tl (!tasks) ; TTm.unlock mx ;
myres := List.map factor ( !mytask ) ;
TTm.lock mx ; results := !myres @ ( !results ) ; TTm.unlock mx ;
TTd.exit ()
)
end;
val cores = TTd.numProcessors ();
val tmp = tasks := divideup cores input ;
val processes = List.tabulate ( cores , fn i => TTd.fork (doTask , []) ) ;
val maxim = ( while ( List.exists TTd.isActive processes ) do (Posix.Process.sleep (Time.fromReal 1.0 ));
List.foldr IntInf.max 1 ( List.map (fn i => List.last i ) (!results) ) ) (* maximal lowest prime *)
in
List.filter (fn lst => List.last lst = maxim ) (!results)
end ;
call and output - interpreter
> threadedBigPrime [ 62478923478923409323, 69478923478923409313, 79234790234098402349,
33498023480920234793, 92834098234098023409, 31908234098234098243,
92873400002348028833, 73498200234098200239, 4349023423478999243,
13480234982340982343, 62478923478925971503, 5340823480234982007,
134802349691098498233, 81780923490092302251, 802487292348792949 ] ;
val it = [[1463103844669601, 42703], [1463103844669541, 42703]]:
(* numbers *)
> List.map (List.foldr IntInf.* 1 ) it ;
val it = [62478923478925971503, 62478923478923409323]: IntInf.int list
Swift
import BigInt
import Foundation
extension BinaryInteger {
@inlinable
public func primeDecomposition() -> [Self] {
guard self > 1 else { return [] }
func step(_ x: Self) -> Self {
return 1 + (x << 2) - ((x >> 1) << 1)
}
let maxQ = Self(Double(self).squareRoot())
var d: Self = 1
var q: Self = self & 1 == 0 ? 2 : 3
while q <= maxQ && self % q != 0 {
q = step(d)
d += 1
}
return q <= maxQ ? [q] + (self / q).primeDecomposition() : [self]
}
}
let numbers = [
112272537195293,
112582718962171,
112272537095293,
115280098190773,
115797840077099,
1099726829285419,
1275792312878611,
BigInt("64921987050997300559")
]
func findLargestMinFactor<T: BinaryInteger>(for nums: [T], then: @escaping ((n: T, factors: [T])) -> ()) {
let waiter = DispatchSemaphore(value: 0)
let lock = DispatchSemaphore(value: 1)
var factors = [(n: T, factors: [T])]()
DispatchQueue.concurrentPerform(iterations: nums.count) {i in
let n = nums[i]
print("Factoring \(n)")
let nFacs = n.primeDecomposition().sorted()
print("Factored \(n)")
lock.wait()
factors.append((n, nFacs))
if factors.count == nums.count {
waiter.signal()
}
lock.signal()
}
waiter.wait()
then(factors.sorted(by: { $0.factors.first! > $1.factors.first! }).first!)
}
findLargestMinFactor(for: numbers) {res in
let (n, factors) = res
print("Number with largest min prime factor: \(n); factors: \(factors)")
exit(0)
}
dispatchMain()
- Output:
$ time ./.build/x86_64-apple-macosx/release/Runner Factoring 112272537195293 Factoring 64921987050997300559 Factoring 112582718962171 Factoring 112272537095293 Factoring 115797840077099 Factoring 115280098190773 Factoring 1099726829285419 Factoring 1275792312878611 Factored 1099726829285419 Factored 112582718962171 Factored 112272537095293 Factored 1275792312878611 Factored 112272537195293 Factored 115280098190773 Factored 115797840077099 Factored 64921987050997300559 Number with largest min prime factor: 64921987050997300559; factors: [736717, 88123373087627] real 0m2.983s user 0m3.570s sys 0m0.012s
Tcl
With Tcl, it is necessary to explicitly perform computations in other threads because each thread is strongly isolated from the others (except for inter-thread messaging). However, it is entirely practical to wrap up the communications so that only a small part of the code needs to know very much about it, and in fact most of the complexity is managed by a thread pool; each value to process becomes a work item to be handled. It is easier to transfer the results by direct messaging instead of collecting the thread pool results, since we can leverage Tcl's vwait
command nicely.
package require Tcl 8.6
package require Thread
# Pooled computation engine; runs event loop internally
namespace eval pooled {
variable poolSize 3; # Needs to be tuned to system size
proc computation {computationDefinition entryPoint values} {
variable result
variable poolSize
# Add communication shim
append computationDefinition [subst -nocommands {
proc poolcompute {value target} {
set outcome [$entryPoint \$value]
set msg [list set ::pooled::result(\$value) \$outcome]
thread::send -async \$target \$msg
}
}]
# Set up the pool
set pool [tpool::create -initcmd $computationDefinition \
-maxworkers $poolSize]
# Prepare to receive results
unset -nocomplain result
array set result {}
# Dispatch the computations
foreach value $values {
tpool::post $pool [list poolcompute $value [thread::id]]
}
# Wait for results
while {[array size result] < [llength $values]} {vwait pooled::result}
# Dispose of the pool
tpool::release $pool
# Return the results
return [array get result]
}
}
This is the definition of the prime factorization engine (a somewhat stripped-down version of the Tcl Prime decomposition solution:
# Code for computing the prime factors of a number
set computationCode {
namespace eval prime {
variable primes [list 2 3 5 7 11]
proc restart {} {
variable index -1
variable primes
variable current [lindex $primes end]
}
proc get_next_prime {} {
variable primes
variable index
if {$index < [llength $primes]-1} {
return [lindex $primes [incr index]]
}
variable current
while 1 {
incr current 2
set p 1
foreach prime $primes {
if {$current % $prime} {} else {
set p 0
break
}
}
if {$p} {
return [lindex [lappend primes $current] [incr index]]
}
}
}
proc factors {num} {
restart
set factors [dict create]
for {set i [get_next_prime]} {$i <= $num} {} {
if {$num % $i == 0} {
dict incr factors $i
set num [expr {$num / $i}]
continue
} elseif {$i*$i > $num} {
dict incr factors $num
break
} else {
set i [get_next_prime]
}
}
return $factors
}
}
}
# The values to be factored
set values {
188573867500151328137405845301
3326500147448018653351160281
979950537738920439376739947
2297143294659738998811251
136725986940237175592672413
3922278474227311428906119
839038954347805828784081
42834604813424961061749793
2651919914968647665159621
967022047408233232418982157
2532817738450130259664889
122811709478644363796375689
}
Putting everything together:
# Do the computation, getting back a dictionary that maps
# values to its results (itself an ordered dictionary)
set results [pooled::computation $computationCode prime::factors $values]
# Find the maximum minimum factor with sorting magic
set best [lindex [lsort -integer -stride 2 -index {1 0} $results] end-1]
# Print in human-readable form
proc renderFactors {factorDict} {
dict for {factor times} $factorDict {
lappend v {*}[lrepeat $times $factor]
}
return [join $v "*"]
}
puts "$best = [renderFactors [dict get $results $best]]"
Wren
Although all Wren code runs within the context of a fiber (of which there can be thousands) only one fiber can run at a time and so the language's virtual machine (VM) is effectively single threaded.
However, it's possible for a suitable host on a suitable machine to run multiple VMs in parallel as the following example, with a C host, shows when run on a machine with four cores. Four VMs are used each of which runs on its own thread.
/* Parallel_calculations.wren */
import "./math" for Int
class C {
static minPrimeFactor(n) { Int.primeFactors(n)[0] }
static allPrimeFactors(n) { Int.primeFactors(n) }
}
We now embed this Wren script in the following C program, compile and run it.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <omp.h>
#include "wren.h"
#define NUM_VMS 4
WrenVM* vms[NUM_VMS]; // array of VMs
void doParallelCalcs() {
int data[] = {12757923, 12878611, 12878893, 12757923, 15808973, 15780709, 197622519};
int i, count, largest, largest_factor = 0;
omp_set_num_threads(4);
// we can share the same call and class handles amongst VMs
WrenHandle* callHandle = wrenMakeCallHandle(vms[0], "minPrimeFactor(_)");
WrenHandle* callHandle2 = wrenMakeCallHandle(vms[0], "allPrimeFactors(_)");
wrenEnsureSlots(vms[0], 1);
wrenGetVariable(vms[0], "main", "C", 0);
WrenHandle* classHandle = wrenGetSlotHandle(vms[0], 0);
#pragma omp parallel for shared(largest_factor, largest)
for (i = 0; i < 7; ++i) {
int n = data[i];
int vi = omp_get_thread_num(); // assign a VM (via its array index) for this number
wrenEnsureSlots(vms[vi], 2);
wrenSetSlotHandle(vms[vi], 0, classHandle);
wrenSetSlotDouble(vms[vi], 1, (double)n);
wrenCall(vms[vi], callHandle);
int p = (int)wrenGetSlotDouble(vms[vi], 0);
if (p > largest_factor) {
largest_factor = p;
largest = n;
printf("Thread %d: found larger: %d of %d\n", vi, p, n);
} else {
printf("Thread %d: not larger: %d of %d\n", vi, p, n);
}
}
printf("\nLargest minimal prime factor: %d of %d\n", largest_factor, largest);
printf("All prime factors for this number: ");
wrenEnsureSlots(vms[0], 2);
wrenSetSlotHandle(vms[0], 0, classHandle);
wrenSetSlotDouble(vms[0], 1, (double)largest);
wrenCall(vms[0], callHandle2);
count = wrenGetListCount(vms[0], 0);
for (i = 0; i < count; ++i) {
wrenGetListElement(vms[0], 0, i, 1);
printf("%d ", (int)wrenGetSlotDouble(vms[0], 1));
}
printf("\n");
wrenReleaseHandle(vms[0], callHandle);
wrenReleaseHandle(vms[0], callHandle2);
wrenReleaseHandle(vms[0], classHandle);
}
static void writeFn(WrenVM* vm, const char* text) {
printf("%s", text);
}
void errorFn(WrenVM* vm, WrenErrorType errorType, const char* module, const int line, const char* msg) {
switch (errorType) {
case WREN_ERROR_COMPILE:
printf("[%s line %d] [Error] %s\n", module, line, msg);
break;
case WREN_ERROR_STACK_TRACE:
printf("[%s line %d] in %s\n", module, line, msg);
break;
case WREN_ERROR_RUNTIME:
printf("[Runtime Error] %s\n", msg);
break;
}
}
char *readFile(const char *fileName) {
FILE *f = fopen(fileName, "r");
fseek(f, 0, SEEK_END);
long fsize = ftell(f);
rewind(f);
char *script = malloc(fsize + 1);
fread(script, 1, fsize, f);
fclose(f);
script[fsize] = 0;
return script;
}
static void loadModuleComplete(WrenVM* vm, const char* module, WrenLoadModuleResult result) {
if( result.source) free((void*)result.source);
}
WrenLoadModuleResult loadModule(WrenVM* vm, const char* name) {
WrenLoadModuleResult result = {0};
if (strcmp(name, "random") != 0 && strcmp(name, "meta") != 0) {
result.onComplete = loadModuleComplete;
char fullName[strlen(name) + 6];
strcpy(fullName, name);
strcat(fullName, ".wren");
result.source = readFile(fullName);
}
return result;
}
int main(int argc, char **argv) {
WrenConfiguration config;
wrenInitConfiguration(&config);
config.writeFn = &writeFn;
config.errorFn = &errorFn;
config.loadModuleFn = &loadModule;
const char* module = "main";
const char* fileName = "Parallel_calculations.wren";
char *script = readFile(fileName);
// config the VMs and interpret the script
int i;
for (i = 0; i < NUM_VMS; ++i) {
vms[i] = wrenNewVM(&config);
wrenInterpret(vms[i], module, script);
}
doParallelCalcs();
for (i = 0; i < NUM_VMS; ++i) wrenFreeVM(vms[i]);
free(script);
return 0;
}
- Output:
Sample output as this will obviously vary depending on which threads return first and which of the two numbers with a minimal prime factor of 47 is found first.
Thread 0: found larger: 3 of 12757923 Thread 2: found larger: 29 of 15808973 Thread 0: found larger: 47 of 12878611 Thread 2: not larger: 7 of 15780709 Thread 1: not larger: 47 of 12878893 Thread 1: not larger: 3 of 12757923 Thread 3: not larger: 3 of 197622519 Largest minimal prime factor: 47 of 12878611 All prime factors for this number: 47 101 2713
- Output:
Sample output when the other number is found first.
Thread 0: found larger: 3 of 12757923 Thread 2: found larger: 29 of 15808973 Thread 1: found larger: 47 of 12878893 Thread 0: not larger: 47 of 12878611 Thread 2: not larger: 7 of 15780709 Thread 1: not larger: 3 of 12757923 Thread 3: not larger: 3 of 197622519 Largest minimal prime factor: 47 of 12878893 All prime factors for this number: 47 274019
zkl
Using 64 bit ints and "green"/co-op threads. Using native threads is bad in this case because spawning a bunch of threads (ie way more than there are cpus) really clogs the system and actually slows things down. Strands (zkl speak for green threads) queues up computations for a limited pool of threads, and, as each thread finishes a job, it reads from the queue for the next computation to perform. Strands, as they start up, kick back a future, which can be forced (ie waited on until a result is available) by evaluating it, in this case, by doing a future.noop().
fcn factorize(x,y,z,etc){
xyzs:=vm.arglist;
fs:=xyzs.apply(factors.strand) // queue up factorizing for x,y,...
.apply("noop") // wait for all threads to finish factoring
.apply(fcn{ (0).min(vm.arglist) }); // find minimum factor for x,y...
[0..].zip(fs).filter(fcn([(n,x)],M){ x==M }.fp1((0).max(fs))) // find max of mins
.apply('wrap([(n,_)]){ xyzs[n] }) // and pluck src from arglist
}
factorize(12757923,12878611,12757923,15808973,15780709,197622519).println();
// do a bunch so I can watch the system monitor
factorize( (0).pump(5000,List,fcn{(1000).random() }).xplode() ).println();
- Output:
L(12878611) L(4177950757)
fcn factors(n){ // Return a list of factors of n
acc:=fcn(n,k,acc,maxD){ // k is 2,3,5,7,9,... not optimum
if(n==1 or k>maxD) acc.close();
else{
q,r:=n.divr(k); // divr-->(quotient,remainder)
if(r==0) return(self.fcn(q,k,acc.write(k),q.toFloat().sqrt()));
return(self.fcn(n,k+1+k.isOdd,acc,maxD))
}
}(n,2,Sink(List),n.toFloat().sqrt());
m:=acc.reduce('*,1); // mulitply factors
if(n!=m) acc.append(n/m); // opps, missed last factor
else acc;
}