Warren Richard Gish | |
---|---|
Nationality | American |
Fields | Bioinformatics |
Institutions | National Center for Biotechnology Information; Washington University in St. Louis; Advanced Biocomputing LLC; University of California, Berkeley |
Alma mater | University of California, Berkeley |
Thesis | I. SV40 mutants isolated from transformed human cells. II. Methods for sequence analysis (1988) |
Doctoral advisor | Michael Botchan |
Known for | BLAST |
Warren Richard Gish is the owner of Advanced Biocomputing LLC. He joined Washington University in St. Louis as a junior faculty member in 1994, and was a Research Associate Professor of Genetics from 2002 to 2007.
After initially studying physics, Gish obtained an A.B. degree in Biochemistry from the University of California, Berkeley, and completed his Ph.D. in Molecular Biology at the same institution in 1988.
Gish is primarily known for his contributions to NCBI BLAST; his creation of the BLAST Network Service and the nr (non-redundant) databases; his 1996 release of the original gapped BLAST (WU-BLAST 2.0); and, most recently, his development and support of AB-BLAST. At Washington University in St. Louis, Gish also led the genome analysis group, which annotated all finished human, mouse and rat genome data produced by the University's Genome Sequencing Center from 1995 through 2002.
As a graduate student, Gish applied the Quine-McCluskey algorithm to the analysis of splice site recognition sequences. In 1985, with a view toward rapid identification of restriction enzyme recognition sites in DNA, Gish developed a deterministic finite automaton (DFA) function library in the C language. The idea of applying a finite-state machine to this problem had been suggested by fellow graduate student and BSD UNIX developer Mike Karels. Gish implemented the DFA as a Mealy machine, which is more compact than an equivalent Moore machine and hence faster. Construction of the DFA took O(n) time, where n is the sum of the lengths of the query sequences. The DFA could then scan subject sequences in a single pass, with no backtracking, in O(m) time, where m is the total length of the subject sequences. The method of DFA construction was later recognized as a consolidation of two algorithms, Algorithms 3 and 4, described by Alfred V. Aho and Margaret J. Corasick.
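The general technique can be illustrated with a minimal Python sketch in the style of Aho and Corasick. It is not Gish's C library, whose source and interface are not described here. The names `build_dfa` and `scan`, the four-letter `ALPHABET`, and the table layout are assumptions made for illustration. The sketch builds a keyword trie, then computes failure links and deterministic transitions together in one breadth-first pass, and attaches the outputs to the transitions in Mealy fashion.

```python
from collections import deque

ALPHABET = "ACGT"  # assumed DNA alphabet; other symbols reset the scan


def build_dfa(patterns, alphabet=ALPHABET):
    """Build a Mealy-style table: table[state][symbol] -> (next_state, outputs).

    Hypothetical helper for illustration. For a fixed alphabet the work is
    linear in the total length of the patterns.
    """
    # Phase 1: keyword trie (state 0 is the start state).
    goto, emit = [{}], [[]]
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                emit.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        emit[state].append(pat)

    # Phase 2: one breadth-first pass that computes failure links and
    # fills in every deterministic transition at the same time.
    fail = [0] * len(goto)
    delta = [{} for _ in goto]
    queue = deque()
    for ch in alphabet:
        nxt = goto[0].get(ch, 0)
        delta[0][ch] = nxt
        if nxt:
            queue.append(nxt)
    while queue:
        state = queue.popleft()
        for ch in alphabet:
            if ch in goto[state]:
                child = goto[state][ch]
                fail[child] = delta[fail[state]][ch]
                # Inherit matches that end at the failure state.
                emit[child] = emit[child] + emit[fail[child]]
                delta[state][ch] = child
                queue.append(child)
            else:
                # No trie edge: reuse the transition of the failure state.
                delta[state][ch] = delta[fail[state]][ch]

    # Mealy form: outputs are attached to transitions, not to states.
    return [
        {ch: (nxt, tuple(emit[nxt])) for ch, nxt in row.items()}
        for row in delta
    ]


def scan(table, text):
    """Scan text once, left to right, returning (start_index, pattern) hits."""
    hits, state = [], 0
    for pos, ch in enumerate(text):
        row = table[state]
        if ch not in row:  # symbol outside the alphabet, e.g. 'N'
            state = 0
            continue
        state, outputs = row[ch]
        for pat in outputs:
            hits.append((pos - len(pat) + 1, pat))
    return hits


# Recognition sites of EcoRI (GAATTC) and BamHI (GGATCC).
sites = build_dfa(["GAATTC", "GGATCC"])
print(scan(sites, "TTGAATTCGGATCCA"))  # [(2, 'GAATTC'), (8, 'GGATCC')]
```

Because every (state, symbol) pair has a precomputed successor, the scan reads each subject character exactly once and never backs up, which is where the O(m) scanning time comes from.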