In mathematics, the Chvátal–Sankoff constants are mathematical constants that describe the lengths of longest common subsequences of random strings. Although the existence of these constants has been proven, their exact values are unknown. They are named after Václav Chvátal and David Sankoff, who began investigating them in the mid-1970s.
There is one Chvátal–Sankoff constant for each positive integer k, where k is the number of characters in the alphabet from which the random strings are drawn. The sequence of these numbers grows inversely proportionally to the square root of k. However, some authors write "the Chvátal–Sankoff constant" to refer to , the constant defined in this way for the binary alphabet.
A common subsequence of two strings S and T is a string whose characters appear in the same order (not necessarily consecutively) both in S and in T. The problem of computing a longest common subsequence has been well studied in computer science. It can be solved in polynomial time by dynamic programming; this basic algorithm has additional speedups for small alphabets (the Method of Four Russians), for strings with few differences, for strings with few matching pairs of characters, etc. This problem and its generalizations to more complex forms of edit distance have important applications in areas that include bioinformatics (in the comparison of DNA and protein sequences and the reconstruction of evolutionary trees), geology (in stratigraphy), and computer science (in data comparison and revision control).