In mathematics, in the areas of combinatorics and computer science, a Lyndon word is a nonempty string that is strictly smaller in lexicographic order than all of its rotations. Lyndon words are named after mathematician Roger Lyndon, who investigated them in 1954, calling them standard lexicographic sequences. Anatoly Shirshov introduced Lyndon words in 1953 calling them regular words.
Several equivalent definitions are possible.
A k-ary Lyndon word of length n > 0 is an n-character string over an alphabet of size k that is the unique minimum element in the lexicographic ordering of all its rotations. Being the unique smallest rotation implies that a Lyndon word differs from each of its non-trivial rotations, and is therefore aperiodic.
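The rotation-based definition translates directly into a brute-force check. The following is a minimal sketch, not an efficient algorithm: the function name `is_lyndon` is chosen here for illustration, and the code assumes the alphabet's order coincides with Python's built-in string comparison (as it does for the digits 0 and 1).

```python
def is_lyndon(w: str) -> bool:
    """Return True if w is nonempty and strictly smaller than
    every one of its non-trivial rotations."""
    if not w:
        return False
    # Rotation i moves the first i characters to the end of the string.
    # Strict inequality also rejects periodic strings, because some
    # non-trivial rotation of a periodic string equals the string itself.
    return all(w < w[i:] + w[:i] for i in range(1, len(w)))


print(is_lyndon("01"))    # True
print(is_lyndon("10"))    # False: the rotation "01" is smaller
print(is_lyndon("0101"))  # False: periodic, equal to one of its rotations
```

This check compares the string against each of its n - 1 non-trivial rotations, so it takes quadratic time in the worst case.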
Alternatively, a Lyndon word can be characterized as a nonempty string such that, whenever it is split into two nonempty substrings, the left substring is lexicographically less than the right substring. That is, if w is a Lyndon word and w = uv is any factorization into two nonempty substrings u and v, then u < v. This definition implies that a string w of length ≥ 2 is a Lyndon word if and only if there exist Lyndon words u and v such that u < v and w = uv. Although there may be more than one choice of u and v with this property, there is a particular choice, called the standard factorization, in which v is as long as possible.
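The splitting characterization and the standard factorization can be sketched in the same brute-force style. The names `is_lyndon_by_splits` and `standard_factorization` are hypothetical helpers introduced for this illustration, and the code again assumes that Python's string comparison matches the intended order on the alphabet.

```python
def is_lyndon_by_splits(w: str) -> bool:
    """Return True if w is nonempty and u < v for every split
    w = uv into two nonempty substrings."""
    return bool(w) and all(w[:i] < w[i:] for i in range(1, len(w)))


def standard_factorization(w: str) -> tuple[str, str]:
    """Return (u, v) with w = uv, u and v Lyndon words, u < v,
    and v as long as possible. Requires a Lyndon word of length >= 2."""
    if len(w) < 2 or not is_lyndon_by_splits(w):
        raise ValueError("expected a Lyndon word of length at least 2")
    # Trying the shortest u first yields the longest possible v.
    for i in range(1, len(w)):
        u, v = w[:i], w[i:]
        if is_lyndon_by_splits(u) and is_lyndon_by_splits(v) and u < v:
            return u, v
    raise ValueError("no factorization found")


print(standard_factorization("0011"))   # ('0', '011')
print(standard_factorization("00101"))  # ('001', '01')
```

For "00101", the split into "0" and "0101" is rejected because "0101" is periodic, and the split into "00" and "101" is rejected because neither part is a Lyndon word, so the first valid split is "001" followed by "01".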
The Lyndon words over the two-symbol binary alphabet {0,1}, sorted by length and then lexicographically within each length class, form an infinite sequence that begins
0, 1, 01, 001, 011, 0001, 0011, 0111, 00001, 00011, 00101, 00111, 01011, 01111, ...
The first string that does not belong to this sequence, "00", is omitted because it is periodic (it consists of two repetitions of the substring "0"); the second omitted string, "10", is aperiodic but is not minimal in its permutation class as it can be cyclically permuted to the smaller string "01".
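The start of this sequence can be reproduced by exhaustive enumeration. The sketch below is a deliberately simple approach that tests every string of each length against the rotation definition; it is not one of the efficient generation algorithms. The function name `lyndon_words` and its parameters are assumptions made for this example.

```python
from itertools import product


def lyndon_words(alphabet: str, max_len: int) -> list[str]:
    """List all Lyndon words over `alphabet` of length 1..max_len,
    sorted by length and then lexicographically within each length."""
    words = []
    for n in range(1, max_len + 1):
        # product over a sorted alphabet yields strings in lexicographic order.
        for letters in product(sorted(alphabet), repeat=n):
            w = "".join(letters)
            # Keep w only if it is strictly smaller than every non-trivial rotation.
            if all(w < w[i:] + w[:i] for i in range(1, n)):
                words.append(w)
    return words


print(lyndon_words("01", 4))
# ['0', '1', '01', '001', '011', '0001', '0011', '0111']
```

The enumeration examines all k^n strings of each length n, so it is only practical for short words.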