*** Welcome to piglix ***

Wide character


A wide character is a computer character datatype that generally has a size greater than the traditional 8-bit character. The increased datatype size allows for the use of larger coded character sets.

During the 1960s, mainframe and mini-computer manufacturers began to standardize around the 8-bit byte as their smallest datatype. The 7-bit ASCII character set became the industry standard method for encoding alphanumeric characters for teletype machines and computer terminals. The extra bit was used for parity, to ensure the integrity of data storage and transmission. As a result, the 8-bit byte became the de facto datatype for computer systems storing ASCII characters in memory.

Later, computer manufacturers began to make use of the spare bit to extend the ASCII character set beyond its limited set of English alphabet characters. 8-bit extensions such as IBM code page 37, PETSCII and ISO 8859 became commonplace, offering terminal support for Greek, Cyrillic, and many others. However, such extensions were still limited in that they were region specific and often could not be used in tandem. Special conversion routines had to be used to convert from one character set to another, often resulting in destructive translation when no equivalent character existed in the target set.

In 1989, the International Organization for Standardization began work on the Universal Character Set (UCS), a multilingual character set that could be encoded using either a 16-bit (2-byte) or 32-bit (4-byte) value. These larger values required the use of a datatype larger than 8-bits to store the new character values in memory. Thus the term wide character was used to differentiate them from traditional 8-bit character datatypes.


...
Wikipedia

...