Kahan summation: Difference between revisions

→‎{{header|Fortran}}: Inspect the arithmetic
(→‎REXX tweaked version: add output for ooRexx)
(→‎{{header|Fortran}}: Inspect the arithmetic)
Line 277:
 
=={{header|Fortran}}==
===When the computer works in decimal===
As the person who provided the worked example for the Wikipædia article, I am hoist by my own petard! So, consider the IBM1620, a decimal computer, where the user could specify the number of digits to be used in arithmetic: it shall be six decimal digits instead of the default eight. The following source file is Second Fortran (of 1958). Variables were not usually declared (there are no INTEGER or REAL statements), so those whose name starts with I,...,N were integers (known as "fixed-point"), otherwise floating-point. Array bounds were not checked, and a subprogram receiving an array as a parameter needed to know merely that it was an array with the bound specified for one-dimensional arrays unimportant. Some compilers produce code that always executes a DO-loop once (because their check is at the end of the loop) so that zero-loops could not be relied on until after F77, thus S is ''not'' initialised to the first array element and the loop started with the second element, just in case N = 1.
 
Line 378 ⟶ 379:
CSum = 10005.9
</pre>
Which may be more by good luck than by good management. If the TRUNC6ROUND6 attempt to restrict the precision to something like six decimal digits worth is abandoned and the calculations are preformed with ordinary binary floating-point arithmetic (as per the first source file), then both sums come out as 10005·9. This too may be more by good luck.
 
The code is not special, and should be acceptable to F77 onwards. DO-statements now may execute zero times, variables can be declared with types, generic function names are recognised, and in-line comments can be strewn about. F90 offers interesting functions related to the floating-point number scheme, such as FRACTION(x) which returns the mantissa that for binary floating point will be in the range [0·5,1) so FRACTION(3·0) = 0·75, and EXPONENT(x), which will be for powers of two in a binary scheme. RADIX(x) will reveal the base and DIGITS(x) the number of digits of precision.
 
The conversion to a six-digit decimal style of floating-point arithmetic shows a side-effect of the base on precision. A normalised mantissa has the range [100000,999999] (in the above scheme, the fractional point is after the mantissa: more usual is to have it at the start, either as 0.100000 or 1.00000) and evidently, the resolution is one step in the lowest place. Thus the relative precision differs by a factor of ten over the possible range of values. With binary, it would differ only by a factor of two, but in base sixteen, by a factor of sixteen. In this regard, small bases are better.
 
===Third phase: what is being computed?===
This involves inspecting some code, as produced on an AMD FX 6300 six-core cpu by the Cpmpaq "Visual" Fortran compiler, version 6.6 for F90/95 on a Windows XP system. No optimisation activities were requested from the compiler.
 
The expression <code>S = A(1) + A(2) + A(3)</code> was rendered as four operation codes in a row: FLD, FADD, FADD, FSTP - omitting the addressing details. This means that the arithmetic was conducted in the cpu's floating-point unit using eighty-bit floating-point arithmetic (or, REAL*10), however the FSTP that stored the result to S, stored to a 32-bit recipient so that the result was in single precision and would have been rounded to fit.
 
If instead a simple DO-loop is used to calculate the sum, each iteration involved a FLD, FADD, FSTP. This means that at each step, the value of S (single precision) is loaded into the 80-bit register, a (single precision) value is added using 80-bit arithmetic, then the result is stored back to the single-precision variable, S. The third option is to use the supplied facility, <code>S = SUM(A)</code>, but alas, this accumulates the result in the same stepwise manner, using a temporary storage area of the same type as the array: single precision. It makes no attempt to recognise that the summation could be accumulated in a register, still less to do so with 80-bit precision.
 
For the test data, all methods, including compensated summation, returned exactly the same result, even to the last bit, so one wonders just what is going on. Using decimal numbers when concerned with the exact details of binary arithmetic is troublesome because the conversions are generally inexact, so the plan is to reveal the numbers in base two. Besides F format, later Fortran version offers further options such as B O, and Z formats for binary, octal and hexadecimal (the code H being spoken for another use) however these present the bit-patterns as packed into the variable, they do not reveal its ''numerical'' value in binary, etc. So, for example, 3.14159 in decimal
1000000010010010000111111010000 is what is shown with B-format.
11.001001000011111101 is that value in binary.
11.001001000011111101101010100010001000010110100011000010001101001100010... is more accurate.
 
=={{header|Go}}==
1,220

edits