DNA barcoding is a taxonomic method that uses a short genetic marker in an organism's DNA to identify it as belonging to a particular species. It differs from molecular phylogeny in that the main goal is not to determine patterns of relationship but to identify an unknown sample in terms of a preexisting classification. Although barcodes are sometimes used in an effort to identify unknown species or to assess whether species should be combined or separated, the utility of DNA barcoding for these purposes is subject to debate. The most commonly used barcode region for animals and protists is a segment of approximately 600 base pairs of the mitochondrial gene cytochrome c oxidase I (COI). Other groups use different markers: in fungi, part of Internal Transcribed Spacer 2 (ITS2) between rRNA genes is used, and in plants, a dual barcode of rbcL and matK is used.
Applications include identifying plant leaves even when flowers or fruit are not available; identifying insect larvae, which may have fewer diagnostic characters than adults and are frequently less well known; identifying the diet of an animal from its stomach contents or faeces; and identifying products in commerce (for example, herbal supplements, wood, or skins and other animal parts).
A desirable locus for DNA barcoding should be standardized (so that large databases of sequences for that locus can be developed), present in most of the taxa of interest, sequenceable without species-specific PCR primers, and short enough to be easily sequenced with current technology. It should also show large variation between species yet relatively little variation within a species.
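The last criterion, large variation between species but little within a species, can be illustrated with a small calculation. The following is a minimal sketch, not a standard barcoding pipeline: the function names `p_distance` and `barcode_gap` are hypothetical, the sequences are made-up toy data, and the distance used is the simple proportion of differing sites between sequences that are assumed to be already aligned.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records):
    """Return (largest within-species distance, smallest between-species distance).

    `records` is a list of (species_label, aligned_sequence) pairs and must
    contain at least one within-species pair and one between-species pair.
    """
    within, between = [], []
    for (sp1, seq1), (sp2, seq2) in combinations(records, 2):
        target = within if sp1 == sp2 else between
        target.append(p_distance(seq1, seq2))
    return max(within), min(between)


# Toy data (invented for illustration only, far shorter than a real barcode).
records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACCTAC"),
    ("species_B", "ACGAACCTAC"),
]

max_within, min_between = barcode_gap(records)
print(max_within, min_between)  # 0.1 0.2
```

In this toy case the smallest distance between species (0.2) exceeds the largest distance within a species (0.1), which is the pattern a good barcode locus is expected to show. Real analyses typically use longer sequences, a proper alignment step, and often a substitution model rather than a raw proportion of differences.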
Although several loci have been suggested, a common set of standardized regions was selected by the respective committees: COI for animals, part of ITS2 for fungi, and the combination of rbcL and matK for plants.
DNA barcoding is based on a relatively simple concept. Most eukaryote cells contain mitochondria, and animal mitochondrial DNA (mtDNA) has a relatively fast mutation rate, which generates diversity within and between populations over relatively short evolutionary timescales (thousands of generations). Typically, in animals, a single mtDNA genome is transmitted to offspring by each breeding female, so the genetic effective population size is proportional to the number of breeding females. This contrasts with the nuclear genome, which is around 100 000 times larger, and for which males and females each contribute two full genomes to the gene pool, so that effective size is proportional to twice the total population size. The smaller effective population size of mtDNA leads to more rapid sorting of mtDNA gene lineages within and among populations through time, due to variance in fecundity among individuals (the principle of coalescence). The combined effect of higher mutation rates and more rapid sorting of variation usually results in divergence of mtDNA sequences among species and comparatively small variance within species. A 658-bp region (the Folmer region) of the mitochondrial cytochrome c oxidase subunit I (COI) gene was proposed as a potential 'barcode'.
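The difference in effective size can be made concrete with a short calculation. This is a minimal sketch under simplifying assumptions that are not stated in the text: an idealized population, an equal sex ratio, and gene-copy counts used as a stand-in for effective size. The function name `mtdna_to_nuclear_ratio` is hypothetical.

```python
from fractions import Fraction


def mtdna_to_nuclear_ratio(breeding_females, total_population):
    """Ratio of transmitted mtDNA copies to nuclear gene copies.

    Assumes one mtDNA genome per breeding female and two nuclear
    genomes per individual, as described in the paragraph above.
    """
    mtdna_copies = Fraction(breeding_females)
    nuclear_copies = Fraction(2 * total_population)
    return mtdna_copies / nuclear_copies


# Assumed example: 1000 individuals, half of them breeding females.
ratio = mtdna_to_nuclear_ratio(breeding_females=500, total_population=1000)
print(ratio)  # 1/4
```

Under these assumptions the mtDNA gene pool is about one quarter the size of the nuclear one, which is why mtDNA lineages are expected to sort more quickly. Skewed sex ratios or high variance in reproductive success would change the ratio.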