Talk:Square form factorization: Difference between revisions

From Rosetta Code
Content added Content deleted
mNo edit summary
(Clearing some points)
Line 1: Line 1:
Since most SquFoF solutions follow the example C code on Wp,
Since most SquFoF solutions follow the fancy C code on Wp,
I will expose how it might fail to factor a composite number.
I will expose how it might fail to factor a composite number
and why it is better to stick to the original algorithm.


Below 50,000,000 these N do not split:
Below 50,000,000 these N do not split:
Line 41: Line 42:
49531597}</nowiki>
49531597}</nowiki>


The regularity suggests that 1 in every 1,351,351 N
The pattern suggests that 1 in every 1,351,351 N
won't factor, so more than 2^41 misses below 2^62.
won't factor, so more than 2^41 misses below 2^62.


Line 49: Line 50:


Now look under the hood; m is the multiplier,
Now look under the hood; m is the multiplier,
forms P# are on the principal and A# on an ambiguous cycle.<br/>
forms P# are in the principal and A# in an ambiguous cycle.<br/>
Each time, after just a few steps, the multiplier or some
Each time, after just a few steps, the multiplier or some
trivial factor is returned and the principal cycle exited:
trivial factor is found and the principal cycle exited:


<nowiki>N = 4558849
<nowiki>N = 4558849
Line 80: Line 81:
A9 = ( 22, 14146,-5455)</nowiki>
A9 = ( 22, 14146,-5455)</nowiki>


The pattern continues. Not showing m = 15 through 385,
And so on. Not showing m = 15 through 385,
the last multiplier gives:
the last multiplier gives:


Line 99: Line 100:
where the problem doesn't exist.
where the problem doesn't exist.


This brings us to Shanks's approach.
This is what happens if the last square form is saved and we
The last square form is saved and if a trivial factor is found we
[[Square_Form_Factorization#Classical_heuristic|continue searching the principal cycle]]:
[[Square_Form_Factorization#Classical_heuristic|''continue searching the principal cycle'']]:


<nowiki>N = 4558849
<nowiki>N = 4558849
Line 131: Line 133:
A7 = ( 3, 7392,-5377)</nowiki>
A7 = ( 3, 7392,-5377)</nowiki>


and back on the principal cycle where a proper square is
and back in the principal cycle where a proper square
but two steps away:
is but two steps away:


<nowiki>P14 = (-1898, 7334, 11^2)
<nowiki>P14 = (-1898, 7334, 11^2)

Revision as of 16:03, 23 March 2021

Since most SquFoF solutions follow the fancy C code on Wp, I will expose how it might fail to factor a composite number and why it is better to stick to the original algorithm.

Below 50,000,000 these N do not split:

{ 988193,
 2265133,
 2681279,
 3636679,
 4558849,
 5214317,
 5812727,
 6109459,
10144837,
10848581,
13093541,
15513347,
19029173,
19839509,
20834839,
22393891,
23143969,
23181601,
23515517,
23530531,
25275583,
26715247,
30525329,
31089529,
31513079,
33409583,
33817367,
38444453,
38581751,
40653427,
42205501,
43125547,
43616197,
44524219,
45391447,
49469239,
49531597}

The pattern suggests that 1 in every 1,351,351 N won't factor, so more than 2^41 misses below 2^62.

Take for example:

4558849 was not factored.

Now look under the hood; m is the multiplier, forms P# are in the principal and A# in an ambiguous cycle.
Each time, after just a few steps, the multiplier or some trivial factor is found and the principal cycle exited:

N = 4558849
m = 1
P1 = ( 1, 4270,-624)
P54 = (-53, 4238, 36^2)
A1 = ( 36, 4238,-1908)
A35 = ( 2, 4270,-312)
m = 3
P1 = ( 1, 7396,-1343)
P12 = (-911, 6028, 71^2)
A1 = ( 71, 7306,-4678)
A7 = ( 3, 7392,-5377)
m = 5
P1 = ( 1, 9548,-3169)
P8 = (-4681, 4382, 62^2)
A1 = ( 62, 9466,-6338)
A5 = ( 2, 9546,-6358)
m = 7
P1 = ( 1, 11298,-742)
P6 = (-4206, 8966, 53^2)
A1 = ( 53, 11298,-14)
A2 = (-14, 11298, 53)
m = 11
P1 = ( 1, 14162,-6778)
P16 = (-3850, 8866, 89^2)
A1 = ( 89, 14028,-10687)
A9 = ( 22, 14146,-5455)

And so on. Not showing m = 15 through 385, the last multiplier gives:

m = 1155
P1 = ( 1, 145126,-81626)
P16 = (-77195, 105680, 179^2)
A1 = ( 179, 145060,-27205)
A5 = ( 11, 145112,-99769)
4558849 was not factored.

For such a small N, this odd behaviour should raise suspicion.

The obvious remedy is to try and use more multipliers. Indeed, augmenting the list with primes 13 through 37 all of the above numbers eventually yield. But this is just a patch: there are many more N beyond 50,000,000... At least it's clear that a larger set of multipliers should be used, although I prefer a search method where the problem doesn't exist.

This brings us to Shanks's approach. The last square form is saved and if a trivial factor is found we continue searching the principal cycle:

N = 4558849
m = 1
P1 = ( 1, 4270,-624)
P54 = (-53, 4238, 36^2)
A1 = ( 36, 4238,-1908)
A35 = ( 2, 4270,-312)
P68 = (-1240, 4006, 21^2)
A1 = ( 21, 4258,-1248)
A39 = ( 1, 4270,-624)
P114 = (-997, 2652, 53^2)
A1 = ( 53, 4242,-1136)
A54 = (-1, 4270, 624)
P182 = (-781, 2964, 55^2)
A1 = ( 55, 4174,-3696)
A87 = ( 1, 4270,-624)
P248 = (-88, 4186, 45^2)
A1 = ( 45, 4186,-3960)
A122 = (-1, 4270, 624)

No luck: iteration bound 260 reached through a series of bootless A-jumps. (However P276 = (-2037, 3442, 282) would've been proper). Using the first multiplier:

m = 3
P1 = ( 1, 7396,-1343)
P12 = (-911, 6028, 71^2)
A1 = ( 71, 7306,-4678)
A7 = ( 3, 7392,-5377)

and back in the principal cycle where a proper square is but two steps away:

P14 = (-1898, 7334, 11^2)
A1 = ( 11, 7378,-6166)
A6 = (-766, 6894, 2343)
f = 383  N/f = 11903

Factor found, consuming only 2 instead of max. 11 bits multiplier space. (This task can do without bigint's; for the reduction steps even 32 bits suffice).

Note these two statements from the example C code:

L = 2 * sqrtl( 2*s );
B = 3 * L;

The ghost of Shanks's Queue lingers on in this spectral variable L, referenced once and never to be seen again: it's the Q-entry bound.

Cutting the queue, the editor produced code that is both short & deceptively fast, but didn't realize that absent a queue, SquFoF heuristic demands a return to the principal cycle if a square turns out to be improper — a roundabout but dependable route.

To complete the picture, this is the same factorization applying a queue:

N = 4558849
m = 1
P1 = ( 1, 4270,-624)

All improper squares are filtered out, hence no futile backward jumps.

m = 3
P1 = ( 1, 7396,-1343)
P14 = (-1898, 7334, 11^2)
A1 = ( 11, 7378,-6166)
A6 = (-766, 6894, 2343)
f = 383  N/f = 11903

Udo e.


Can't say I really followed that, nevermind, a translation of the 2nd C entry fixed my problems. --Pete Lomax (talk) 19:23, 20 March 2021 (UTC)

To prove things were working as they should, I wrote a little ditty <lang Phix>integer prev = 3 sequence res = {}, r for n=3 to 100_000 do

   integer p = get_prime(n)
   for N=prev+2 to p-2 by 2 do
       atom f = squfof(N)
       if f=1 then
           sequence pf = prime_factors(N)
           if length(pf)=1 and N=power(pf[1],3) then
               r = sprintf("%d (%d cubed)",{N,pf[1]})
           else
               r = sprintf("%d (factors: %v)",{N,pf})
           end if
           res = append(res, r)
       end if
   end for
   prev = p

end for puts(1,join_by(res,3,10))</lang> and got this, the only non-prime odd numbers that fail below 1,000,000 are indeed cubes of primes, just as advertised:

24389 (29 cubed)   103823 (47 cubed)   226981 (61 cubed)   389017 (73 cubed)   704969 (89 cubed)   1092727 (103 cubed)
68921 (41 cubed)   148877 (53 cubed)   300763 (67 cubed)   493039 (79 cubed)   912673 (97 cubed)   1225043 (107 cubed)
79507 (43 cubed)   205379 (59 cubed)   357911 (71 cubed)   571787 (83 cubed)   1030301 (101 cubed)   1295029 (109 cubed)