
Numerals in Unicode


Numerals (often called numbers in Unicode) are characters or sequences of characters that denote a number. The same Arabic-Indic numerals are used in writing systems throughout the world, and all share the same semantics for denoting numbers. However, the graphemes representing these numerals differ widely from one writing system to another. To support these differences, Unicode encodes the numerals within many of the script blocks. The decimal digits are repeated in 23 separate blocks, including twice in Arabic. Six additional blocks contain the digits again as rich text, primarily to serve as a palette of graphemes for specialized mathematical use. In addition to many forms of the Arabic-Indic numerals, Unicode also includes several less common numerals, such as Aegean numerals, Roman numerals, counting rod numerals, cuneiform numerals and ancient Greek numerals.
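As a minimal sketch of what "same semantics, different graphemes" means in practice, the Python snippet below reads the number 990 written with digits from several blocks. It relies on the standard library's unicodedata.decimal() lookup; the names SAMPLES and parse_decimal are hypothetical and chosen only for illustration.

```python
import unicodedata

# The same number, 990, spelled with decimal digits from different blocks.
# \N{...} escapes are used so the exact code points are unambiguous.
SAMPLES = {
    "ASCII": "990",
    "Arabic-Indic": "\N{ARABIC-INDIC DIGIT NINE}" * 2 + "\N{ARABIC-INDIC DIGIT ZERO}",
    "Devanagari": "\N{DEVANAGARI DIGIT NINE}" * 2 + "\N{DEVANAGARI DIGIT ZERO}",
    "Mathematical bold": "\N{MATHEMATICAL BOLD DIGIT NINE}" * 2
    + "\N{MATHEMATICAL BOLD DIGIT ZERO}",
}


def parse_decimal(text):
    """Read a string of decimal digits from any script as a base-10 integer."""
    value = 0
    for ch in text:
        # unicodedata.decimal() raises ValueError for non-decimal characters.
        value = value * 10 + unicodedata.decimal(ch)
    return value


for script, text in SAMPLES.items():
    print(script, text, parse_decimal(text))
```

Python's built-in int() also accepts decimal digits from these scripts, so parse_decimal is spelled out here only to make the per-character lookup visible.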

Numerals invariably involve composition of glyphs, as a limited number of characters are combined to make other numerals. For example, the sequence 9–9–0 in Arabic-Indic numerals composes the numeral for nine hundred ninety (990). In Roman numerals, the same number is expressed by the composed numeral Ⅹↀ or ⅩⅯ. Each of these is a distinct numeral representing the same abstract number. The numerals differ in particular in the semantics of their composition: the Arabic-Indic decimal digits are positional-value compositions, while Roman numerals are sign-value compositions whose signs are added or subtracted depending on their order.
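The difference between the two kinds of composition can be sketched in a few lines of Python. The functions positional_value and sign_value are hypothetical names; sign_value assumes the common simplified rule that a sign standing before a larger sign is subtracted, and it does not validate that the input is a well-formed Roman numeral.

```python
import unicodedata


def positional_value(digits):
    """Positional composition: each digit is weighted by its place."""
    total = 0
    for ch in digits:
        total = total * 10 + unicodedata.decimal(ch)
    return total


def sign_value(numeral):
    """Sign-value composition: each sign has a fixed value that is added,
    or subtracted when it stands directly before a larger sign."""
    values = [int(unicodedata.numeric(ch)) for ch in numeral]
    total = 0
    for i, value in enumerate(values):
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


TEN = "\N{ROMAN NUMERAL TEN}"
THOUSAND = "\N{ROMAN NUMERAL ONE THOUSAND}"
THOUSAND_CD = "\N{ROMAN NUMERAL ONE THOUSAND C D}"

print(positional_value("990"))        # 9*100 + 9*10 + 0
print(sign_value(TEN + THOUSAND))     # 1000 - 10
print(sign_value(TEN + THOUSAND_CD))  # 1000 - 10
```

All three calls yield 990, but only the first depends on the position of each character within the sequence.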

Grouped by their numerical property as used in text, Unicode characters have one of four values for Numeric Type. First there is the "not a number" type (None). Then there are decimal-radix digits, as commonly used in Western-style decimals such as plain 0-9 (Decimal); numbers that are not part of a decimal system, such as Roman numbers (Numeric); and decimal digits in a typographic context, such as encircled numbers (Digit). A numbering like "A. B. C." for chapter numbering is not covered.
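Python's unicodedata module does not expose the Numeric Type property directly, but the type can be approximated from three lookups it does provide. The sketch below assumes that decimal() succeeds only for Decimal characters, digit() for Decimal and Digit characters, and numeric() for any character with a numeric value; the function name numeric_type is hypothetical.

```python
import unicodedata


def numeric_type(ch):
    """Infer a character's Numeric Type from the narrowest lookup that succeeds."""
    if unicodedata.decimal(ch, None) is not None:
        return "Decimal"
    if unicodedata.digit(ch, None) is not None:
        return "Digit"
    if unicodedata.numeric(ch, None) is not None:
        return "Numeric"
    return "None"


for ch in ["7", "\N{CIRCLED DIGIT ONE}", "\N{ROMAN NUMERAL TWELVE}", "A"]:
    print(unicodedata.name(ch), numeric_type(ch))
```

The letter "A" comes out as "None" even when it is used to number a chapter, which matches the remark that such numbering is not covered by the property.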

Hexadecimal digits are not encoded as separate characters in Unicode; existing letters and digits are used. These characters are marked with the character property Hex_Digit=Yes, and with ASCII_Hex_Digit=Yes when appropriate.
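Python's standard library does not expose these two properties, so the sketch below hard-codes the relevant ranges. The assumption, which should be checked against the PropList.txt file of the Unicode version in use, is that Hex_Digit covers the ASCII characters 0-9, A-F and a-f plus their fullwidth counterparts, and that ASCII_Hex_Digit covers only the ASCII ones. The constant and function names are hypothetical.

```python
# Hand-transcribed ranges (an assumption to verify against PropList.txt).
ASCII_HEX_DIGIT = frozenset("0123456789ABCDEFabcdef")

_FULLWIDTH_RANGES = (
    (0xFF10, 0xFF19),  # fullwidth digits zero to nine
    (0xFF21, 0xFF26),  # fullwidth capital letters A to F
    (0xFF41, 0xFF46),  # fullwidth small letters a to f
)
HEX_DIGIT = ASCII_HEX_DIGIT | frozenset(
    chr(cp) for start, end in _FULLWIDTH_RANGES for cp in range(start, end + 1)
)


def is_hex_digit(ch):
    """True if ch would carry Hex_Digit=Yes under the assumption above."""
    return ch in HEX_DIGIT


def is_ascii_hex_digit(ch):
    """True if ch would carry ASCII_Hex_Digit=Yes under the assumption above."""
    return ch in ASCII_HEX_DIGIT


fullwidth_f = "\N{FULLWIDTH LATIN CAPITAL LETTER F}"
print(is_hex_digit("F"), is_ascii_hex_digit("F"))
print(is_hex_digit(fullwidth_f), is_ascii_hex_digit(fullwidth_f))
print(is_hex_digit("G"))
```

The fullwidth letter illustrates the "when appropriate" qualification: it counts as a hexadecimal digit but not as an ASCII one.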


...
Wikipedia

...