
Unicode control characters


Many Unicode control characters are used to control the interpretation or display of text, but these characters themselves have no visual or spatial representation. For example, the null character (U+0000 <control-0000>) is used in C programs to mark the end of a string of characters. Such programs need only a single starting memory address for a string (as opposed to a starting address and a length), because the string ends at the first null character the program reads.
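The null-terminated convention can be illustrated with a short Python sketch. The helper c_strlen and the sample bytes are hypothetical names chosen for this illustration; the standard ctypes module is used only to show that a C-style view of a buffer stops at the first null byte.

```python
import ctypes

def c_strlen(memory, start=0):
    """Count bytes from `start` up to, but not including, the first NUL byte.

    Mimics how a C routine walks a string given only its starting address.
    Raises IndexError if no NUL byte follows `start`.
    """
    length = 0
    while memory[start + length] != 0:
        length += 1
    return length

# Two strings stored back to back, each ended by a NUL (U+0000) byte.
memory = b"cat\x00dog\x00"

first_len = c_strlen(memory)            # scan starts at offset 0
second_len = c_strlen(memory, start=4)  # scan starts just after the first NUL

# ctypes exposes the same convention: .value stops at the first NUL,
# while .raw returns every byte in the buffer.
buf = ctypes.create_string_buffer(memory)
c_view = buf.value
```

No length is stored anywhere in this sketch: each scan is given only a starting offset, and the null byte alone decides where the string ends.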

The control characters U+0000–U+001F and U+007F come from ASCII. Additionally, U+0080–U+009F were used in conjunction with ISO 8859 character sets (among others). These two groups are specified in ISO 6429 and are often referred to as the C0 and C1 control codes, respectively.
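As a rough check of these ranges, the following Python sketch labels a code point by the group it falls in and compares the result with the general category Cc reported by the standard unicodedata module. The function name control_code_set and its labels are assumptions made for this illustration; U+007F is labelled separately here because it lies outside the contiguous U+0000–U+001F block.

```python
import sys
import unicodedata

def control_code_set(code_point):
    """Return a label for the control-code group of `code_point`, or None."""
    if 0x0000 <= code_point <= 0x001F:
        return "C0"
    if code_point == 0x007F:
        return "DEL"
    if 0x0080 <= code_point <= 0x009F:
        return "C1"
    return None

# Every code point whose general category is Cc ("Other, control").
all_cc = [
    cp for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Cc"
]

# Every code point the ranges above label as a control code.
labelled = [cp for cp in range(0x100) if control_code_set(cp) is not None]
```

With the Unicode data bundled in Python, both lists are expected to hold the same 65 code points: 32 from U+0000–U+001F, U+007F, and 32 from U+0080–U+009F.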

Most of these characters play no explicit role in Unicode text handling. The characters U+0000 <control-0000> (NULL), U+0009 <control-0009> (HT), U+000A <control-000A> (LF), U+000D <control-000D> (CR), and U+0085 <control-0085> (NEL) are commonly used in text processing as formatting characters.
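One common use of these characters in text processing is line-ending normalization. The sketch below is only an illustration under stated assumptions: the name normalize_newlines, the choice of LF as the target, and the sample string are all hypothetical. It maps CR, CR LF, and NEL to LF and leaves HT and NULL untouched.

```python
import re

# CR LF must come first so that the pair is replaced as a single line break.
LEGACY_NEWLINES = re.compile("\r\n|\r|\x85")

def normalize_newlines(text):
    """Replace CR LF, CR, and NEL (U+0085) line endings with LF (U+000A)."""
    return LEGACY_NEWLINES.sub("\n", text)

# A sample mixing CR LF, CR, NEL, HT, and a trailing NULL.
sample = "one\r\ntwo\rthree\x85four\tend\x00"

normalized = normalize_newlines(sample)
lines = normalized.split("\n")
```

After normalization the sample splits into four lines; the tab and the null character remain inside the last one because the pattern does not match them.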

In an attempt to simplify the several newline characters used in legacy text, UCS introduces its own newline characters to separate either lines or paragraphs: U+2028 line separator (HTML &#8232; · LSEP) and U+2029 paragraph separator (HTML &#8233; · PSEP). These are text-formatting characters only, not <control> characters.
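The distinction between the two separators and the <control> characters can be checked with Python's standard unicodedata module. In this minimal sketch, the constant names and the sample text are assumptions made for illustration.

```python
import unicodedata

LINE_SEPARATOR = "\u2028"       # LSEP, HTML &#8232;
PARAGRAPH_SEPARATOR = "\u2029"  # PSEP, HTML &#8233;

# General categories: Cc is "control", Zl is "line separator",
# Zp is "paragraph separator".
categories = {
    "U+000A": unicodedata.category("\n"),
    "U+2028": unicodedata.category(LINE_SEPARATOR),
    "U+2029": unicodedata.category(PARAGRAPH_SEPARATOR),
}

# Split text into paragraphs first, then each paragraph into lines.
text = (
    "first line" + LINE_SEPARATOR + "second line"
    + PARAGRAPH_SEPARATOR + "new paragraph"
)
paragraphs = [p.split(LINE_SEPARATOR) for p in text.split(PARAGRAPH_SEPARATOR)]
```

Here LF reports category Cc, while the two separators report Zl and Zp, so a test for control characters based on category Cc would not match them. The sample splits into two paragraphs, the first of which contains two lines.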


Wikipedia
