
32-bit floating point


Single-precision floating-point format is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point.
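As a concrete illustration of the format, a 32-bit value can be split into its three fields. This sketch (not part of the article) assumes the standard binary32 layout of 1 sign bit, 8 exponent bits, and 23 fraction bits, and uses Python's struct module to reach the raw bit pattern:

```python
import struct

def decode_binary32(x):
    """Split a binary32 value into its sign, exponent, and fraction fields.
    Assumes the standard IEEE 754 binary32 layout: 1 sign bit,
    8 exponent bits (biased by 127), 23 fraction bits."""
    # Round x through the 32-bit format, then reinterpret as an unsigned int.
    bits, = struct.unpack('<I', struct.pack('<f', x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # biased exponent
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

# 1.0 is stored as sign 0, biased exponent 127, fraction 0
print(decode_binary32(1.0))   # (0, 127, 0)
```

The biased exponent means the stored field 127 corresponds to a true exponent of 0, so 1.0 is encoded as (−1)^0 × 1.0 × 2^0.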

Floating point is used to represent fractional values, or when a wider range than fixed point (of the same bit width) is needed, at the cost of precision. A signed 32-bit integer can have a maximum value of 2^31 − 1 = 2,147,483,647, whereas the maximum representable IEEE 754 32-bit base-2 floating-point value is (2 − 2^−23) × 2^127 ≈ 3.402823 × 10^38. All integers with 6 or fewer significant decimal digits can be converted to an IEEE 754 32-bit floating-point value without loss of precision, and so can any number of the form 2^n, where n is a whole number from −126 to 127.
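These limits can be checked by rounding values through the binary32 format and seeing whether they survive the round trip. The following sketch uses Python's struct module; the helper name to_binary32 is an illustrative choice, not from the article:

```python
import struct

def to_binary32(x):
    """Round a Python float (64-bit) through the IEEE 754 binary32 format."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

# Largest finite binary32 value: (2 - 2**-23) * 2**127
max32 = (2 - 2**-23) * 2.0**127
assert to_binary32(max32) == max32            # exactly representable
print(f"{max32:.6e}")                          # prints 3.402823e+38

# Every integer with 6 or fewer decimal digits survives the round trip
assert all(to_binary32(float(n)) == n for n in (7, 123456, 999999))

# Powers of two with exponents from -126 to 127 are also exact
assert to_binary32(2.0**-126) == 2.0**-126
assert to_binary32(2.0**127) == 2.0**127
```

By contrast, an integer such as 2^24 + 1 = 16,777,217 does not survive the round trip, since the 23-bit fraction field (plus the implicit leading bit) can hold only 24 significant binary digits.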

In the IEEE 754-2008 standard, the 32-bit base-2 format is officially referred to as binary32; it was called single in IEEE 754-1985. IEEE 754 specifies additional floating-point types, such as 64-bit base-2 double precision and, more recently, base-10 representations.

One of the first programming languages to provide single- and double-precision floating-point data types was Fortran. Before the widespread adoption of IEEE 754-1985, the representation and properties of floating-point data types depended on the computer manufacturer and computer model, and on decisions made by programming-language implementers. For example, GW-BASIC's single-precision data type was the 32-bit MBF floating-point format.
