Pathological floating point problems
 
=={{header|Fortran}}==
Problems arise because the floating-point arithmetic as performed by digital computers has only an oblique relationship to the arithmetic of '''Real''' numbers: many axioms are violated, even if only by a little, and in certain situations. Most seriously, only a limited precision is available even if the floating-point variables are declared via such words as "REAL". Actions such as subtraction (of nearly-equal values) can in one step destroy many or all the digits of accuracy of the value being developed.
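
By way of a small illustration (not part of the task), a subtraction of two nearly-equal values in each precision; the program and its names are merely for demonstration:
<lang Fortran>      PROGRAM CANCEL !Subtract nearly-equal values.
      REAL*4 X4      !About seven decimal digits.
      REAL*8 X8      !About sixteen decimal digits.
      X4 = 1 + 1E-5  !Add a small part to one,
      X8 = 1 + 1D-5  !Then take the one away again.
      WRITE (6,*) "True value  ",1E-5
      WRITE (6,*) "Single gives",X4 - 1 !Few good digits remain.
      WRITE (6,*) "Double gives",X8 - 1 !Many more survive here.
      END</lang>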
 
Fortran's only "built-in" assistance in this is provided via the ability to declare variables DOUBLE PRECISION, and on some systems, QUADRUPLE PRECISION is available. Earlier systems such as the IBM1620 supported decimal arithmetic of up to a hundredninety-nine decimal digits (via hardware!), specified via control cards, but this sort of ability has been abandoned. Special "bignumber" arithmetic routines can be written supporting floating-point (or integer, or rational) arithmetic of hundreds or thousands or more words of storage per number, but this is not a standard arrangement.
 
Otherwise, a troublesome calculation might be recast into a different form that avoids a catastrophic loss of precision, probably after a lot of careful and difficult analysis and exploration and ingenuity.
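
A standard textbook example of such a recasting (nothing to do with this task's formulae): one minus the cosine of a small angle loses most of its digits when computed directly, but an algebraically equal form involving no subtraction does not.
<lang Fortran>      PROGRAM RECAST !Dodge a subtraction of nearly-equal values.
      REAL*4 X
      X = 0.001      !COS(X) is then very close to one.
      WRITE (6,*) "Direct 1 - COS(X)   ",1 - COS(X)
      WRITE (6,*) "Recast 2*SIN(X/2)**2",2*SIN(X/2)**2
      END</lang>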
 
Here, no such attempt is made. In the spirit of Formula Translation, this is a direct translation of the specified formulae into Fortran, with single and double precision results on display. There is no REAL*16 option, nor the REAL*10 that some systems allow to correspond to the eighty-bit floating-point format supported by the floating-point processor. The various integer constants cause no difficulty and I'm not bothering with writing them as <integer>.0 - the compiler can deal with this. The constants with fractional parts happen to be exactly represented in binary so there is no fuss over 333.75 and 333.75D0 whereas by contrast 0.1 and 0.1D0 are not equal. Similarly, there is no attempt to rearrange the formulae, for instance to have <code>A**2 * B**2</code> replaced by <code>(A*B)**2</code>, nor worry over <code>B**8</code> where 33096**8 = 1.439E36 and the largest possible single-precision number is 3.4028235E+38, in part because arithmetic within an expression can be conducted with a greater dynamic range. Most of all, no attention has been given to the subtractions...
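
The point about the constants can be checked directly (a side demonstration only):
<lang Fortran>      PROGRAM BITS !333.75 is exact in binary; 0.1 is not.
      WRITE (6,*) "333.75 = 333.75D0?",333.75 .EQ. 333.75D0 !T
      WRITE (6,*) "   0.1 = 0.1D0?   ",0.1 .EQ. 0.1D0 !F
      END</lang>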
 
This would be F77 style Fortran, except for certain conveniences offered by F90, especially the availability of generic functions such as EXP whose type is determined by the type of its parameter, rather than having to use EXP and DEXP for single and double precision respectively, or else... The END statement for subroutines and functions names the routine being ended, a useful matter to have checked. <lang Fortran> SUBROUTINE MULLER
REAL*4 VN,VNL1,VNL2 !The exact precision and dynamic range
REAL*8 WN,WNL1,WNL2 !Depends on the format's precise usage of bits.
INTEGER I !A stepper.
WRITE (6,1) !A heading.
1 FORMAT ("Muller's sequence should converge to six...",/
1         " N Single Double")
VNL1 = 2; VN = -4 !Initialise for N = 2.
WNL1 = 2; WN = -4 !No fractional parts yet.
Line 361:
1 FORMAT (///"The Chaotic Bank Society in action..."/"Year")
WRITE (6,2) 0,V,W !Show the initial deposit.
2 FORMAT (I3,F16.7,F28.16)
DO YEAR = 1,25 !Step through some years.
V = V*YEAR - 1 !The specified procedure.
W = W*YEAR - 1 !The same, in double precision.
WRITE (6,2) YEAR,V,W !Show the year's balances.
END DO !On to the following year.
END SUBROUTINE CBS !Madness!
 
REAL*4 FUNCTION SR4(A,B) !Siegfried Rump's example function of 1988.
REAL*4 A,B
SR4 = 333.75*B**6
1 + A**2*(11*A**2*B**2 - B**6 - 121*B**4 - 2)
2 + 5.5*B**8 + A/(2*B)
END FUNCTION SR4
REAL*8 FUNCTION SR8(A,B) !Siegfried Rump's example function.
REAL*8 A,B
SR8 = 333.75*B**6 !.75 is exactly represented in binary.
1 + A**2*(11*A**2*B**2 - B**6 - 121*B**4 - 2)
2 + 5.5*B**8 + A/(2*B)!.5 is exactly represented in binary.
END FUNCTION SR8
 
PROGRAM POKE
REAL*4 V !Some example variables.
REAL*8 W !Whose type goes to the inquiry function.
WRITE (6,1) RADIX(V),DIGITS(V),"single",DIGITS(W),"double"
1 FORMAT ("Floating-point arithmetic is conducted in base ",I0,/
1 2(I3," digits for ",A," precision",/))
WRITE (6,*) "Single precision limit",HUGE(V)
WRITE (6,*) "Double precision limit",HUGE(W)
WRITE (6,*)
 
CALL MULLER
 
CALL CBS
 
WRITE (6,10)
10 FORMAT (///"Evaluation of Siegfried Rump's function of 1988",
1 " where F(77617,33096) = -0.827396059946821")
WRITE (6,*) "Single precision:",SR4(77617.0,33096.0)
WRITE (6,*) "Double precision:",SR8(77617.0D0,33096.0D0) !Must match the types.
END</lang>
 
{{Out}}
Floating-point numbers in single and double precision use the "implicit leading one" binary format on this system: there have been ''many'' variations across different computers over the years. One can write strange routines that will test the workings of arithmetic (and other matters) so as to determine the situation on the computer of the moment, but F90 introduced special "inquiry" routines that reveal certain details as standard. This information could be used to make choices amongst calculation paths and ploys appropriate for different systems, at, of course, a large expenditure of thought to produce a compound scheme that will (should?) work correctly on a variety of computers. No such effort has been made here!
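
A few more of those inquiry functions, purely by way of demonstration (this is not part of the task's source):
<lang Fortran>      PROGRAM PROBE !Ask the F90 inquiry functions.
      REAL*4 V !One of each precision,
      REAL*8 W !As in the main source.
      WRITE (6,*) "Decimal digits ",PRECISION(V),PRECISION(W)
      WRITE (6,*) "Machine epsilon",EPSILON(V),EPSILON(W)
      WRITE (6,*) "Tiniest normal ",TINY(V),TINY(W)
      END</lang>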
 
Fifty-three binary digits correspond to about 15·95 decimal digits (53×Log10(2) ≈ 15·95): there is no simple conversion, so the usual ploy is to show additional decimal digits, knowing that the lower-order digits will be fuzz due to the binary/decimal conversion. The "Muller" sequence has for its fourth term 9.3783783783783790 - note that this is a recurring decimal, and its precision is ''less'' than the displayed sixteen decimal digits (seventeen digits of "precision" are on show) - when trying for maximum accuracy, converting a binary value to decimal adds confusion.
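
The fuzz can be seen by asking for more decimal digits than the binary form really carries (again, merely a demonstration):
<lang Fortran>      PROGRAM FUZZ !Show more digits than are really there.
      REAL*8 X
      X = 1D0/3D0  !A recurring fraction in binary as well as decimal.
      WRITE (6,"(F22.20)") X !Beyond about 16 digits comes fuzz.
      END</lang>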
 
<pre>
Floating-point arithmetic is conducted in base 2
24 digits for single precision
53 digits for double precision
 
Single precision limit 3.4028235E+38
Double precision limit 1.797693134862316E+308
 
Muller's sequence should converge to six...
Line 448 ⟶ 483:
24**************** 48072289.9364179447293282
25**************** 1201807247.4104485511779785
 
 
 
Evaluation of Siegfried Rump's function of 1988 where F(77617,33096) = -0.827396059946821
Single precision: -1.1805916E+21
Double precision: -1.1805916E+21
</pre>
 
None of the results are remotely correct! In the absence of a Fortran compiler supporting still higher precision (such as quadruple, or REAL*16) only two options remain: either devise multi-word high-precision arithmetic routines and try again with even more brute force, or analyse the calculation with a view to finding a way to avoid the loss of accuracy with calculations conducted in the available precision.
 
Alternatively, do not present various intermediate results such as might give rise to doubts, nor yet entertain any doubts, just declare the answer to be what appears, and move on. In a letter from F.S. Acton, "A former student of mine now hands out millions of dollars for computation ... and he dismally estimates that 70% of the "answers" are worthless because of poor analysis and poor programming."
 
=={{header|Go}}==