Orphan genes (also called ORFans, especially in microbial literature) are genes without detectable homologues in other lineages. Orphans are a subset of taxonomically-restricted genes (TRGs), which are unique to a specific taxonomic level (e.g. plant-specific). In contrast to non-orphan TRGs, orphans are usually considered unique to a very narrow taxon, generally a species.
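The relationship between orphans and other TRGs can be illustrated with a minimal sketch. It assumes a homology search has already produced, for one gene, the list of other species in which a homologue was detected. The function names, the toy `LINEAGES` table, and the strict species-level cut-off for "orphan" are all illustrative assumptions rather than a standard method.

```python
# Toy lineages, listed from narrowest to broadest taxon (illustrative data only).
LINEAGES = {
    "Arabidopsis thaliana": ["Arabidopsis thaliana", "Viridiplantae", "Eukaryota"],
    "Oryza sativa": ["Oryza sativa", "Viridiplantae", "Eukaryota"],
    "Saccharomyces cerevisiae": ["Saccharomyces cerevisiae", "Fungi", "Eukaryota"],
}


def narrowest_shared_taxon(focal, hits, lineages):
    """Return the narrowest taxon of the focal species that contains
    every species in which a homologue was detected."""
    species = set(hits) | {focal}
    for taxon in lineages[focal]:
        if all(taxon in lineages[s] for s in species):
            return taxon
    return None


def classify(focal, hits, lineages):
    """Label a gene by how widely its detectable homologues are distributed.

    Assumption: 'orphan' is taken here to mean restricted to a single
    species, which is the usual but not the only usage.
    """
    taxon = narrowest_shared_taxon(focal, hits, lineages)
    focal_lineage = lineages[focal]
    if taxon == focal_lineage[0]:
        return "orphan"  # an orphan is also a TRG, at the species level
    if taxon is not None and taxon != focal_lineage[-1]:
        return f"TRG ({taxon})"
    return "widely conserved"


if __name__ == "__main__":
    print(classify("Arabidopsis thaliana", [], LINEAGES))
    print(classify("Arabidopsis thaliana", ["Oryza sativa"], LINEAGES))
```

In this sketch a gene with no hits outside the focal species is labelled an orphan, while a gene found only in other plants is labelled a plant-specific TRG. The outcome depends entirely on which genomes were searched and on the sensitivity of the homology search.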
The classic model of evolution is based on duplication, rearrangement, and mutation of genes with the idea of common descent. Orphan genes differ in that they are lineage-specific with no known history of shared duplication and rearrangement outside of their specific species or clade. Orphan genes may arise through a variety of mechanisms, such as horizontal gene transfer, duplication and rapid divergence, and de novo origination. These mechanisms may act at different rates in insects, primates, and plants. Despite their relatively recent origin, orphan genes may encode functionally important proteins.
Orphan genes were first discovered when the yeast genome-sequencing project began in 1996. Orphan genes accounted for an estimated 26% of the yeast genome, but it was believed that these genes could be classified with homologues when more genomes were sequenced. At the time, gene duplication was considered the only serious model of gene evolution and there were few sequenced genomes for comparison, so a lack of detectable homologues was thought most likely to reflect a lack of sequencing data rather than a true lack of homology. However, orphan genes persisted as the number of sequenced genomes grew, eventually leading to the conclusion that orphan genes are ubiquitous across genomes. Estimates of the percentage of genes that are orphans vary enormously between species and between studies; 10–30% is a commonly cited figure.
The study of orphan genes emerged largely after the turn of the century. In 2003, a study of Caenorhabditis briggsae and related species compared over 2000 genes. The authors proposed that these genes must be evolving too quickly for their homologues to be detected and are consequently sites of very rapid evolution. In 2005, Wilson examined 122 bacterial species to assess whether the large number of orphan genes in many species was legitimate. The study found that the orphan genes were legitimate and played a role in bacterial adaptation. The term taxonomically-restricted genes was introduced into the literature to make orphan genes seem less "mysterious."
In 2008, a yeast protein of established functionality, BSC4, was found to have evolved de novo from non-coding sequences whose homology was still detectable in sister species.