Binary radian


Binary scaling is a computer programming technique, used mainly by embedded C, DSP, and assembler programmers, for performing pseudo floating point calculations using integer arithmetic.

Binary scaling can be both faster and more accurate than directly using floating point instructions, particularly on processors without floating point hardware; however, care must be taken not to cause an arithmetic overflow.

A position for the virtual 'binary point' is chosen, and each subsequent arithmetic operation then determines where the 'binary point' lies in its result.

Binary points obey the mathematical laws of exponentiation.
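
As a worked illustration of this rule (the symbols x, y, m and n are introduced here for the sketch and do not appear in the original text), multiplying or dividing two scaled quantities adds or subtracts the exponents of their scale factors:

```latex
\[
(x \cdot 2^{m})(y \cdot 2^{n}) = xy \cdot 2^{m+n},
\qquad
\frac{x \cdot 2^{m}}{y \cdot 2^{n}} = \frac{x}{y} \cdot 2^{m-n}
\]
```

Addition and subtraction, by contrast, are only meaningful when both operands already share the same scale.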

To give an example, a common way to use integer arithmetic to simulate floating point is to multiply the coefficients by 65536.

Using binary scientific notation, this will place the binary point at 1B16.

For instance, to represent the real numbers 1.2 and 5.6 as 1B16, one multiplies them by 2^16, giving 78643 and 367001 (after truncation to integers).

Multiplying these together gives 28862059643.

To convert it back to 1B16, divide it by 2^16.

This gives 440400B16, which when converted back to a floating point number (by dividing again by 2^16, but holding the result as floating point) gives approximately 6.71997. The correct floating point result is 6.72.
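
The worked example above can be sketched in C roughly as follows. This is a minimal illustration rather than a reference implementation: the names B16_SCALE, b16_from_double, b16_mul and b16_to_double are hypothetical, and the sketch assumes a compiler that provides a 64-bit integer type for the intermediate product.

```c
#include <stdint.h>

#define B16_SCALE 65536  /* scale factor 2^16 (assumed name) */

/* Convert a real value to its scaled integer form, truncating toward
   zero as the worked example does (1.2 -> 78643, 5.6 -> 367001). */
static int32_t b16_from_double(double x)
{
    return (int32_t)(x * B16_SCALE);
}

/* Multiply two scaled values. The raw product carries a scale of 2^32,
   so it is divided by 2^16 to return to the original scale. A 64-bit
   intermediate keeps the raw product from overflowing. */
static int32_t b16_mul(int32_t a, int32_t b)
{
    int64_t raw = (int64_t)a * (int64_t)b;
    return (int32_t)(raw / B16_SCALE);
}

/* Convert back to floating point, for display or checking only. */
static double b16_to_double(int32_t v)
{
    return (double)v / B16_SCALE;
}
```

Calling b16_mul(b16_from_double(1.2), b16_from_double(5.6)) yields 440400, and b16_to_double(440400) is approximately 6.71997, matching the figures in the text.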

The scaling range here covers any number between 65535.9999 and −65536.0, with 16 bits holding the fractional part (assuming, of course, that a 64-bit result register is used). Note that some computer architectures restrict arithmetic to 32-bit results; in that case, extreme care must be taken not to overflow the 32-bit register. For other number ranges, the binary scale can be adjusted for optimum accuracy.
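
One possible way to stay within 32-bit results, sketched here as an assumption and not taken from the original text, is to reduce each operand's scale before multiplying so that the product lands directly on the 2^16 scale. The function name b16_mul_32bit_only is hypothetical, and the approach trades accuracy for range.

```c
#include <stdint.h>

/* Hypothetical helper: multiply two values scaled by 2^16 using only
   32-bit arithmetic. Each operand is first reduced to a scale of 2^8,
   so the product comes out at 2^8 * 2^8 = 2^16 with no final division.
   The low 8 bits of each operand are discarded, which costs accuracy.
   The caller must still ensure the result fits in 32 bits, i.e. that
   the real-valued product has magnitude below 32768. */
static int32_t b16_mul_32bit_only(int32_t a, int32_t b)
{
    int32_t a8 = a / 256;  /* scale 2^16 -> 2^8, truncating toward zero */
    int32_t b8 = b / 256;
    return a8 * b8;        /* scale 2^8 * 2^8 = 2^16 */
}
```

With the operands 78643 and 367001 from the example, this returns 439931, which corresponds to about 6.7128 instead of 6.71997. The loss shows why the choice of scale has to be balanced against the available word size.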

The B16 multiplication above is a simplified example. Re-scaling depends on both the B scale value and the word size. B16 is often used in 32-bit systems because re-scaling then amounts to multiplying or dividing by 65536 (or shifting by 16 bits).
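
The equivalence between shifting by 16 bits and multiplying or dividing by 65536 can be sketched as below. The helper names are hypothetical, and the sketch deliberately uses unsigned values: in C, left-shifting a negative signed value is undefined and right-shifting one is implementation-defined, so signed code would typically use multiplication and division instead.

```c
#include <stdint.h>

/* Scale a whole number up by 2^16; n must be below 65536 to fit. */
static uint32_t b16_from_uint(uint32_t n)
{
    return n << 16;        /* same as n * 65536 */
}

/* Integer part of a non-negative value scaled by 2^16. */
static uint32_t b16_int_part(uint32_t v)
{
    return v >> 16;        /* same as v / 65536 */
}

/* Fractional part, as a raw count of 1/65536 units. */
static uint32_t b16_frac_part(uint32_t v)
{
    return v & 0xFFFFu;    /* same as v % 65536 */
}
```

For the value 440400 from the multiplication example, b16_int_part returns 6 and b16_frac_part returns 47184, that is, 6 + 47184/65536.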

Consider the binary point in a signed 32-bit word thus:

 S X X X X X X X  X X X X X X X X  X X X X X X X X  X X X X X X X X

where S is the sign bit and X are the other bits.

Placing the binary point at


...
Wikipedia

...