Phred is a base-calling computer program: it identifies a base (nucleobase) sequence from the fluorescence "trace" data generated by an automated DNA sequencer that uses electrophoresis and the four-fluorescent-dye method. When originally developed, Phred produced significantly fewer errors in the data sets examined than other methods, averaging 40–50% fewer errors. Phred quality scores have become widely accepted as a way to characterize the quality of DNA sequences, and can be used to compare the efficacy of different sequencing methods.
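As a rough illustration of what a Phred quality score expresses, the sketch below uses the commonly cited definition Q = −10 log10(P), where P is the estimated probability that a base call is wrong. The formula is not stated in the passage above and is given here as an assumption about the standard convention; the function names are hypothetical and are not part of Phred itself.

```python
import math


def error_prob_to_phred(p_error: float) -> float:
    """Convert an estimated base-call error probability to a Phred-style score.

    Assumes the conventional definition Q = -10 * log10(P).
    """
    if not 0.0 < p_error <= 1.0:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10.0 * math.log10(p_error)


def phred_to_error_prob(q: float) -> float:
    """Invert the conversion: P = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)
```

Under this convention, an error probability of 1 in 1,000 corresponds to a score of about 30, and a score of 20 corresponds to an error probability of about 1 in 100.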
Fluorescent-dye DNA sequencing is a molecular biology technique in which single-stranded DNA fragments of varying length are labeled with four fluorescent dyes (one for each of the four bases used in DNA) and then separated by slab-gel or capillary electrophoresis (see DNA Sequencing). The electrophoresis run is monitored by a charge-coupled device (CCD) on the DNA sequencer, which produces time-series "trace" data (a "chromatogram") of the fluorescent "peaks" that pass the CCD detection point. The order of the individual bases (nucleobases) in the DNA can be determined by examining the fluorescence peaks in the trace data. However, because the intensity, shape, and location of a fluorescence peak are not always consistent or unambiguous, determining (or "calling") the correct base for each peak by hand can be difficult and time-consuming.
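The idea of reading bases off the trace can be sketched with a deliberately naive peak picker. This is a toy illustration only, not Phred's actual procedure: it assumes a clean, already-aligned four-channel trace stored as a dictionary of equal-length intensity lists, and the name `call_bases_naive` and the `min_height` threshold are hypothetical. It ignores exactly the complications described above (uneven peak spacing, overlapping peaks, variable intensity).

```python
def call_bases_naive(trace, min_height=0.0):
    """Call one base per local maximum in the strongest channel.

    trace: dict mapping "A", "C", "G", "T" to equal-length lists of
           fluorescence intensities sampled over time (an assumed layout).
    Returns a list of (sample_index, base) tuples in time order.
    """
    channels = "ACGT"
    n = len(trace["A"])
    calls = []
    for i in range(1, n - 1):
        # Pick the channel with the highest intensity at this time point.
        base = max(channels, key=lambda b: trace[b][i])
        signal = trace[base]
        # Accept the point only if it is a local maximum above the threshold.
        if signal[i] > min_height and signal[i - 1] < signal[i] >= signal[i + 1]:
            calls.append((i, base))
    return calls


# Toy trace with three well-separated peaks (invented values).
toy_trace = {
    "A": [0, 5, 0, 0, 0, 0, 0],
    "C": [0, 0, 0, 6, 0, 0, 0],
    "G": [0, 0, 0, 0, 0, 4, 0],
    "T": [0, 0, 0, 0, 0, 0, 0],
}
sequence = "".join(base for _, base in call_bases_naive(toy_trace))
```

On this idealized input the sketch reads the sequence `ACG`. Real traces rarely have such clean, evenly spaced peaks, which is why a simple rule like this one is error-prone.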
Automated DNA sequencing techniques have revolutionized the field of molecular biology by generating vast amounts of DNA sequence data. However, trace data is produced at a significantly higher rate than it can be manually interpreted into sequence data, which creates a bottleneck. Removing the bottleneck requires both automated software that speeds up processing with improved accuracy and a reliable measure of that accuracy. Many software programs have been developed to meet this need; Phred is one of them.