LCP array | ||
---|---|---|
Type | Array | |
Invented by | Manber & Myers (1990) | |
Average | Worst case | |
Space | ||
Construction |
In computer science, the longest common prefix array (LCP array) is an auxiliary data structure to the suffix array. It stores the lengths of the longest common prefixes (LCPs) between all pairs of consecutive suffixes in a sorted suffix array.
For example, if A := [aab, ab, abaab, b, baab] is a suffix array, the longest common prefix between A[1] = aab and A[2] = ab is a which has length 1, so H[2] = 1 in the LCP array H. Likewise, the LCP of A[2] = ab and A[3] = abaab is ab, so H[3] = 2.
Augmenting the suffix array with the LCP array allows one to efficiently simulate top-down and bottom-up traversals of the suffix tree, speeds up pattern matching on the suffix array and is a prerequisite for compressed suffix trees.
The LCP array was introduced in 1993, by Udi Manber and Gene Myers alongside the suffix array in order to improve the running time of their string search algorithm.Gene Myers later became the vice president of Informatics Research at Celera Genomics, and Udi Manber the vice president of engineering at Google.
Let be the suffix array of the string and let denote the length of the longest common prefix between two strings and . Let further denote the substring of ranging from to .