Cyclotomic polynomial: Difference between revisions

m
J: remove unnecessary implementation of FFT -- defer to the FFT page for that.
m (J: mention fftw addon)
m (J: remove unnecessary implementation of FFT -- defer to the FFT page for that.)
Line 2,147:
As noted in the [http://jsoftware.com/pipermail/programming/2022-March/060209.html J programming forum], we can improve the big-O character of this algorithm by using the [[j:Essays/FFT|fast fourier transform]] for polynomial multiplication and division.
 
<lang J>cyclotomic000=:NB. {{ assert.0<yinstall'math/fftw'
require'math/fftw'
 
cyclotomic000=: {{ assert.0<y
if. y = 1 do. _1 1 return. end.
'q p'=. __ q: y
Line 2,157 ⟶ 2,160:
,(y%*/q) {."0 cyclotomic */q
else.
lgl=. {:$ ctlist=. cyclotomic "0 }:*/@>,{1,each q NB. ctlist is 2-d table of polynomial divisors
lgd=. # dividend=. _1,(-y){.1 NB. (x^n) - 1, and its size
lg=. >.&.(2&^.) lgl >. lgd NB. required lengths of all polynomials for fft transforms
NB. really, "divisor" is the fft of the divisor!
divisor=. */ fftfftw"1 lg{."1 ctlist NB. FFT article doesn't deal with lists of multiplicands
unpad roundreal ifftifftw"1 divisor %~ fftfftw lg{.dividend NB. similar to article's multiplication
end.
}}
 
roundreal =: [: <. 0.5 + 9&o.</lang>
NB. discard high order zero coefficients in representation of polynomial
unpad=: {.~ 1+0 i:~0=]
 
This approachvariation for polynomial division is only valid when there's no remainder to be concerned with (which is the case, here).
NB. Fast Fourier Transform:
cube =: ($~ q:@#) :. ,
rou =: {{(,j.) r,(%:0j1),j.+|.}.r=.^o.j.(i.@%&8 % -:) y}} NB. roots of unity
floop =: {{for_r. i.#$x do. (y=.{."1 y) ] x=.(+/x) ,&,:"r (-/x)*y end.}}
fft =: ( ] floop&.cube rou@#) f. :. ifft
ifft =: (# %~ ] floop&.cube +@rou@#) f. :. fft
roundreal =: [: <. 0.5 + 9&o.</lang>
 
This approach gave slightly over a 2x16x speedup for <tt>taskorder 10</tt>, from a 2 element cache, with an approximately 50% increased memory footprint. (Remember, of course, that benchmarks and benchmark ratios have dependencies on computer architecture and language implementation, and the host environment.)
This approach for polynomial division is only valid when there's no remainder to be concerned with (which is the case, here).
 
This approach gave slightly over a 2x speedup for <tt>taskorder 10</tt>, from a 2 element cache, with an approximately 50% increased memory footprint. (Remember, of course, that benchmarks and benchmark ratios have dependencies on computer architecture and language implementation, and the host environment.)
 
A further roughly 8x speedup can be obtained here, by using the fftw addon.
 
<lang J>NB. install'math/fftw'
require'math/fftw'
fft=: fftw
ifft=: ifftw</lang>
 
=={{header|Java}}==
6,962

edits