Floating point: Difference between revisions

Latest revision as of 10:40, 24 March 2012

Floating point is a numeric system for approximating real numbers. Each floating-point number stores some digits and an exponent (plus a sign, which is either 1 or -1), taking the form

value = sign × digits × RADIX^exponent

This design uses a constant RADIX and limits the maximum number of digits. Calculations are fast but inexact, because the limit on digits causes round-off errors. With an appropriate exponent, a floating-point number can still represent a substantial range of integers exactly (though a smaller range than a “pure” integer occupying the same space could hold).
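The form above, the round-off errors, and the exact integer range can be illustrated with a minimal Python sketch. It assumes the platform's float has RADIX 2 and 53 binary digits (as on typical machines); the helper name decompose is hypothetical and exists only for this illustration.

```python
import math
import sys

def decompose(x: float) -> tuple[int, int, int]:
    """Return (sign, digits, exponent) such that
    x == sign * digits * RADIX**exponent, for a finite float x."""
    assert sys.float_info.radix == 2          # assumption: binary floats
    precision = sys.float_info.mant_dig       # 53 for a typical double
    sign = -1 if math.copysign(1.0, x) < 0 else 1
    fraction, exp = math.frexp(abs(x))        # abs(x) == fraction * 2**exp
    digits = int(math.ldexp(fraction, precision))  # scale fraction to an integer
    return sign, digits, exp - precision

sign, digits, exponent = decompose(-0.15625)
assert sign * math.ldexp(digits, exponent) == -0.15625

# Round-off: 0.1, 0.2 and 0.3 have no finite binary expansion.
assert 0.1 + 0.2 != 0.3

# Integers are exact up to 2**53 (assuming 53 digits); beyond that, gaps appear.
assert float(2**53) == 2**53
assert float(2**53 + 1) == float(2**53)
```

A 64-bit “pure” integer, by comparison, holds every integer up to 2**63 - 1 exactly, which is the trade-off the paragraph above describes.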

The most common floating-point formats in modern practice are those based on the IEEE 754 standard. In these formats the RADIX is 2, and the digits and exponent each occupy a fixed number of binary digits that fit (together with the sign) in a piece of memory of size 32 bits (4 bytes, float) or 64 bits (8 bytes, double).
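As a rough sketch of how the sign, exponent and digits share those 32 or 64 bits, the following Python snippet splits a value into its three bit fields. The field widths (1 + 8 + 23 and 1 + 11 + 52) are those of the IEEE 754 binary32 and binary64 formats, not something stated in the text above, and the function name fields is hypothetical. Note that the stored exponent is biased (by 127 and 1023 respectively).

```python
import struct

def fields(x: float, fmt: str = "d") -> tuple[int, int, int]:
    """Split x into (sign bit, biased exponent, fraction bits).

    fmt is "f" for the 32-bit format or "d" for the 64-bit format.
    """
    exp_bits, frac_bits = {"f": (8, 23), "d": (11, 52)}[fmt]
    int_code = {"f": "I", "d": "Q"}[fmt]      # unsigned int of the same size
    # Reinterpret the float's bytes as an unsigned integer (big-endian).
    (bits,) = struct.unpack(">" + int_code, struct.pack(">" + fmt, x))
    sign = bits >> (exp_bits + frac_bits)
    exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = bits & ((1 << frac_bits) - 1)
    return sign, exponent, fraction

# The two formats occupy 4 and 8 bytes respectively.
assert struct.calcsize(">f") == 4
assert struct.calcsize(">d") == 8

print(fields(1.0, "d"))    # sign 0, biased exponent 1023, fraction 0
print(fields(-2.0, "f"))   # sign 1, biased exponent 128, fraction 0
```

Here the sign is stored as a single bit (0 or 1) rather than as 1 or -1, and the leading binary digit of a normal number is implicit, so the fraction field holds only the digits after it.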

@@ Line 1: / Line 1: @@
 [[Category:Encyclopedia]]
-[[wp:Floating point|Floating point]] is a numeric system for approximating real numbers. Each floating-point number stores some ''digits'' and an ''exponent'', taking the form
+[[wp:Floating point|Floating point]] is a numeric system for approximating real numbers. Each floating-point number stores some ''digits'' and an ''exponent'' (plus a ''sign'', which is either ''1'' or ''-1'') taking the form
-: ''value = digits &times; RADIX<sup>exponent</sup>''
+: ''value = sign &times; digits &times; RADIX<sup>exponent</sup>''
 This design uses a constant ''RADIX'' and limits the maximum number of digits. Calculations are fast but inexact, because the limit on digits causes round-off errors.
+It should be noted that, with an appropriate ''exponent'', a floating point number can represent a substantial range of integers exactly (though less than the range that could fit in the same space with a “pure” integer).
+The most common floating-point formats in modern practice are those based on the [[wp:IEEE 754|IEEE 754]] standard, in particular with the ''RADIX'' being ''2'', and the ''digits'' and ''exponent'' being a fixed number of binary digits that fit (together with the ''sign'') in a piece of memory of size 32 bits (4 bytes, <code>float</code>) or 64 bits (8 bytes, <code>double</code>).