*** Welcome to piglix ***

HTML tags


An HTML element is an individual component of an HTML document or web page, once this has been parsed into the Document Object Model. HTML is composed of a tree of HTML nodes, such as text nodes. Each node can have HTML attributes specified. Nodes can also have content, including other nodes and text. Many HTML nodes represent semantics, or meaning. For example, the title node represents the title of the document.

HTML documents are delivered as "documents". These are then parsed, which turns them into the Document Object Model (DOM) internal representation, within the web browser.

Presentation by the web browser, such as screen rendering or access by JavaScript, is then performed on this internal model, not the original document.

Early HTML documents, and to a lesser extent today, were largely invalid HTML and riddled with syntax errors. The parsing process was also required to "fix-up" these errors, as best it could. The resultant model was often not correct (i.e. it did not represent what a careless coder had originally intended), but it would at least be valid, according to the HTML standard. A valid model was produced, no matter how bad the "tag soup" supplied had been. Only in the rarest cases would the parser abandon parsing altogether.

"Elements" and "tags" are terms that are widely confused. HTML documents contain tags, but do not contain the elements. The elements are only generated after the parsing step, from these tags.

As is generally understood, the position of an element is indicated as spanning from a start tag, possibly including some child content, and is terminated by an end tag. This is the case for many, but not all, elements within an HTML document.

As HTML is based on SGML, its parsing also depends on the use of a DTD, specifically an HTML DTD such as that for HTML 4.01. The DTD specifies which element types are possible (i.e. it defines the set of element types that go to make up HTML) and it also specifies the valid combinations in which they may appear in a document. It is part of general SGML behaviour that where only one valid structure is possible (per the DTD), it is not generally a requirement that the document explicitly states that structure. As a simple example, the <p> start tag indicating the start of a paragraph element should be closed by a </p> end tag, indicating the end of the element. Also the DTD states that paragraph elements cannot be nested. The HTML document fragment:


...
Wikipedia

...