*** Welcome to piglix ***

Elias omega coding


Elias omega coding is a universal code encoding the positive integers developed by Peter Elias. Like Elias gamma coding and Elias delta coding, it works by prefixing the integer with a representation of its order of magnitude in a universal code. Unlike those other two codes, however, Elias omega recursively encodes that prefix; thus, they are sometimes known as recursive Elias codes.

Omega coding is used in applications where the largest encoded value is not known ahead of time, or to compress data in which small values are much more frequent than large values.

To code a number N:

To decode an Elias omega-coded integer:

Omega codes can be thought of as a number of "groups". A group is either a single 0 bit, which terminates the code, or two or more bits beginning with 1, which is followed by another group.

The first few codes are shown below. Included is the so-called implied distribution, describing the distribution of values for which this coding yields a minimum-size code; see Relationship of universal codes to practical compression for details.

The encoding for 1 googol, 10100, is 11 1000 101001100 (15 bits of length header) followed by the 333-bit binary representation of 1 googol, which is 10010 01001001 10101101 00100101 10010100 11000011 01111100 11101011 00001011 00100111 10000100 11000100 11001110 00001011 11110011 10001010 11001110 01000000 10001110 00100001 00011010 01111100 10101010 10110010 01000011 00001000 10101000 00101110 10001111 00010000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 and a trailing 0, for a total of 349 bits.

A googol to the hudredth power (1010000) is a 33220-bit binary number. Its omega encoding is 33243 bits long: 11 1111 1000000111000100 (22 bits), followed by 33220 bits of the value, and a trailing 0. Under Elias delta coding, the same number is 33250 bits long: 000000000000000 1000000111000100 (31 bits) followed by 33219 bits of the value. As log2(1010000) = 33219.28, so in this instance, omega and delta coding are, respectively, only 0.07% and 0.09% longer than optimal.


...
Wikipedia

...