Talk:Square form factorization

Since most SquFoF solutions follow the fancy C code on Wp, I will expose how it might fail to factor a composite number and why it is better to stick to the original algorithm.

Below 50,000,000 these N do not split:

The pattern suggests that 1 in every 1,351,351 N won't factor, so more than 2^41 misses below 2^62.

Take for example:

4558849 was not factored.

Now look under the hood; m is the multiplier, forms P# are in the principal and A# in an ambiguous cycle.
Each time, after just a few steps, the multiplier or some trivial factor is found and the principal cycle exited:

N = 4558849
m = 1
P1 = ( 1, 4270,-624)
P54 = (-53, 4238, 36^2)
A1 = ( 36, 4238,-1908)
A35 = ( 2, 4270,-312)
m = 3
P1 = ( 1, 7396,-1343)
P12 = (-911, 6028, 71^2)
A1 = ( 71, 7306,-4678)
A7 = ( 3, 7392,-5377)
m = 5
P1 = ( 1, 9548,-3169)
P8 = (-4681, 4382, 62^2)
A1 = ( 62, 9466,-6338)
A5 = ( 2, 9546,-6358)
m = 7
P1 = ( 1, 11298,-742)
P6 = (-4206, 8966, 53^2)
A1 = ( 53, 11298,-14)
A2 = (-14, 11298, 53)
m = 11
P1 = ( 1, 14162,-6778)
P16 = (-3850, 8866, 89^2)
A1 = ( 89, 14028,-10687)
A9 = ( 22, 14146,-5455)

And so on. Not showing m = 15 through 385, the last multiplier gives:

m = 1155
P1 = ( 1, 145126,-81626)
P16 = (-77195, 105680, 179^2)
A1 = ( 179, 145060,-27205)
A5 = ( 11, 145112,-99769)
4558849 was not factored.

For such a small N, this odd behaviour should raise suspicion.

The obvious remedy is to try and use more multipliers. Indeed, augmenting the list with primes 13 through 37 all of the above numbers eventually yield. But this is just a patch: there are many more N beyond 50,000,000... At least it's clear that a larger set of multipliers should be used, although I prefer a search method where the problem doesn't exist.

This brings us to Shanks's approach. The last square form is saved and if a trivial factor is found we continue searching the principal cycle:

N = 4558849
m = 1
P1 = ( 1, 4270,-624)
P54 = (-53, 4238, 36^2)
A1 = ( 36, 4238,-1908)
A35 = ( 2, 4270,-312)
P68 = (-1240, 4006, 21^2)
A1 = ( 21, 4258,-1248)
A39 = ( 1, 4270,-624)
P114 = (-997, 2652, 53^2)
A1 = ( 53, 4242,-1136)
A54 = (-1, 4270, 624)
P182 = (-781, 2964, 55^2)
A1 = ( 55, 4174,-3696)
A87 = ( 1, 4270,-624)
P248 = (-88, 4186, 45^2)
A1 = ( 45, 4186,-3960)
A122 = (-1, 4270, 624)

No luck: iteration bound 260 reached through a series of bootless A-jumps. (However P276 = (-2037, 3442, 28²) would've been proper). Using the first multiplier:

m = 3
P1 = ( 1, 7396,-1343)
P12 = (-911, 6028, 71^2)
A1 = ( 71, 7306,-4678)
A7 = ( 3, 7392,-5377)

and back in the principal cycle where a proper square is but two steps away:

P14 = (-1898, 7334, 11^2)
A1 = ( 11, 7378,-6166)
A6 = (-766, 6894, 2343)
f = 383  N/f = 11903

Factor found, consuming only 2 instead of max. 11 bits multiplier space. (This task can do without bigint's; for the reduction steps even 32 bits suffice).

Note these two statements from the example C code:

L = 2 * sqrtl( 2*s );
B = 3 * L;

The ghost of Shanks's Queue lingers on in this spectral variable L, referenced once and never to be seen again: it's the Q-entry bound.

Cutting the queue, the editor produced code that is both short & deceptively fast, but didn't realize that absent a queue, SquFoF heuristic demands a return to the principal cycle if a square turns out to be improper — a roundabout but dependable route.

To complete the picture, this is the same factorization applying a queue:

N = 4558849
m = 1
P1 = ( 1, 4270,-624)

All improper squares are filtered out, hence no futile backward jumps.

m = 3
P1 = ( 1, 7396,-1343)
P14 = (-1898, 7334, 11^2)
A1 = ( 11, 7378,-6166)
A6 = (-766, 6894, 2343)
f = 383  N/f = 11903

— Udo e.

Can't say I really followed that, nevermind, a translation of the 2nd C entry fixed my problems. --Pete Lomax (talk) 19:23, 20 March 2021 (UTC)

To prove things were working as they should, I wrote a little ditty <lang Phix>integer prev = 3 sequence res = {}, r for n=3 to 100_000 do

   integer p = get_prime(n)
   for N=prev+2 to p-2 by 2 do
       atom f = squfof(N)
       if f=1 then
           sequence pf = prime_factors(N)
           if length(pf)=1 and N=power(pf[1],3) then
               r = sprintf("%d (%d cubed)",{N,pf[1]})
           else
               r = sprintf("%d (factors: %v)",{N,pf})
           end if
           res = append(res, r)
       end if
   end for
   prev = p

end for puts(1,join_by(res,3,10))</lang> and got this, the only non-prime odd numbers that fail below 1,000,000 are indeed cubes of primes, just as advertised:

24389 (29 cubed)   103823 (47 cubed)   226981 (61 cubed)   389017 (73 cubed)   704969 (89 cubed)   1092727 (103 cubed)
68921 (41 cubed)   148877 (53 cubed)   300763 (67 cubed)   493039 (79 cubed)   912673 (97 cubed)   1225043 (107 cubed)
79507 (43 cubed)   205379 (59 cubed)   357911 (71 cubed)   571787 (83 cubed)   1030301 (101 cubed)   1295029 (109 cubed)

Plus, the above mentioned 37 problem cases also now work fine.

Raku (ping Hkdtam)

There is something very wrong with (the performance of) the Raku example. I know Raku is the slowest computer language ever written (don't worry I'm just kidding) but... Seriously, just copy (say) the 2nd C example into (say) repl.it and run it. --Pete Lomax (talk) 23:39, 25 March 2021 (UTC)

Hi Pete, thank you for bringing this up. I was just trying to be playful on making it look as close to the algorithm and pseudocode as possible, for example, the floor and sqrt routines were replaced with operators which add unnecessary layers of operation. Also many recalculation of values like √𝑁, √(𝑘*𝑁) and √Q, among others, can be avoided by using temporary variables. Above all that I didn't apply any optimization like the other entries (TBH I am too slack to understand them and fit them in :-P). Anyway, thanks again and have a nice day. --Hkdtam (talk) 17:37, 26 March 2021 (UTC)

No worries, if it was 20 or 100 times slower I wouldn't have said anything, but two and a half million times slower... --Pete Lomax (talk) 19:48, 26 March 2021 (UTC)