In SGML, HTML and XML documents, the logical constructs known as character data and attribute values consist of sequences of characters. Each character can appear directly (representing itself) or can be represented by a series of characters called a character reference, of which there are two types: a numeric character reference and a character entity reference. This article lists the character entity references that are valid in HTML and XML documents.
A character entity reference refers to the content of a named entity. An entity declaration is created by using the <!ENTITY name "value"> syntax in a Document Type Definition (DTD).
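As a minimal sketch of how a declared entity is expanded, the following Python snippet parses a small XML document whose internal DTD subset declares one entity. The entity name greeting, its value, and the helper expand_entities are hypothetical and chosen only for illustration; the parser is the standard library's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity name "greeting" and its value are
# made up for this example.
DOC = """<!DOCTYPE note [
  <!ENTITY greeting "Hello, world">
]>
<note>&greeting;</note>"""


def expand_entities(xml_text: str) -> str:
    """Parse the document and return the root element's text,
    with entity references already replaced by the parser."""
    return ET.fromstring(xml_text).text


print(expand_entities(DOC))  # Hello, world
```

The parser substitutes the declared replacement text wherever &greeting; occurs, so application code never sees the reference itself.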
A numeric character reference refers to a character by its Universal Character Set/Unicode code point, and uses the format:
&#nnnn;
or
&#xhhhh;
where nnnn is the code point in decimal form, and hhhh is the code point in hexadecimal form. The x must be lowercase in XML documents. The nnnn or hhhh may be any number of digits and may include leading zeros. The hhhh may mix uppercase and lowercase, though uppercase is the usual style.
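The rules above can be tried out with a short Python sketch. The helper names to_decimal_ref, to_hex_ref, decode_html and parses_as_xml are hypothetical; html.unescape follows HTML rules, while xml.etree.ElementTree is used here as one example of an XML parser that rejects an uppercase X.

```python
import html
import xml.etree.ElementTree as ET


def to_decimal_ref(ch: str) -> str:
    """Build the decimal form &#nnnn; for a single character."""
    return f"&#{ord(ch)};"


def to_hex_ref(ch: str) -> str:
    """Build the hexadecimal form &#xhhhh; (lowercase x, uppercase digits)."""
    return f"&#x{ord(ch):X};"


def decode_html(text: str) -> str:
    """Decode character references using HTML rules."""
    return html.unescape(text)


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is accepted inside an XML element."""
    try:
        ET.fromstring(f"<r>{fragment}</r>")
        return True
    except ET.ParseError:
        return False


euro = "\u20ac"  # EURO SIGN, code point U+20AC (8364 in decimal)
print(to_decimal_ref(euro))  # &#8364;
print(to_hex_ref(euro))      # &#x20AC;

# Leading zeros and mixed-case hex digits decode to the same character.
print(decode_html("&#0008364;") == euro)  # True
print(decode_html("&#x20aC;") == euro)    # True

# In XML the x must be lowercase.
print(parses_as_xml("&#x20AC;"))  # True
print(parses_as_xml("&#X20AC;"))  # False
```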
In contrast, a character entity reference refers to a character by the name of an entity which has the desired character as its replacement text. The entity must either be predefined (built into the markup language) or explicitly declared in a Document Type Definition (DTD). The format is the same as for any entity reference:
&name;
where name is the case-sensitive name of the entity. The semicolon is required.
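The following Python sketch illustrates case sensitivity and the difference between predefined and undeclared entities. It assumes the standard library's html module for HTML names and xml.etree.ElementTree for XML; the helper names html_char and xml_text are hypothetical.

```python
import html
from html.entities import name2codepoint
import xml.etree.ElementTree as ET


def html_char(name: str) -> str:
    """Resolve an HTML entity name (without & and ;) to its character."""
    return chr(name2codepoint[name])


def xml_text(fragment: str):
    """Return the parsed text of an XML fragment, or None if the
    parser rejects it (for example, because of an undeclared entity)."""
    try:
        return ET.fromstring(f"<r>{fragment}</r>").text
    except ET.ParseError:
        return None


# Entity names are case-sensitive: Eacute and eacute are different entities.
print(html_char("Eacute"), html_char("eacute"))  # É é
print(html.unescape("&Eacute;&eacute;"))         # Éé

# The five entities predefined in XML need no declaration.
print(xml_text("&lt;&amp;&gt;&apos;&quot;"))  # <&>'"

# An HTML entity name is not predefined in XML, so without a DTD
# declaring it the reference is rejected.
print(xml_text("&eacute;"))  # None
```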
ISO Entity Sets: SGML supplied a comprehensive set of entity declarations for characters widely used in Western technical and reference publishing, for Latin, Greek and Cyrillic scripts. The American Mathematical Society also contributed entities for mathematical characters.