Disease gene identification is a process by which scientists identify the mutant genotypes responsible for an inherited genetic disorder. Mutations in these genes can include single nucleotide substitutions, single nucleotide additions/deletions, deletion of the entire gene, and other genetic abnormalities.
Knowledge of which genes (when non-functional) cause which disorders will simplify diagnosis of patients and provide insights into the functional characteristics of the mutation. The advent of modern-day high-throughput sequencing technologies combined with insights provided from the growing field of genomics is resulting in more rapid disease gene identification, thus allowing scientists to identify more complex mutations.
Disease gene identification techniques often follow the same overall procedure. DNA is first collected from several patients who are believed to have the same genetic disease. Then, their DNA samples are analyzed and screened to determine probable regions where the mutation could potentially reside. These techniques are mentioned below. These probable regions are then lined-up with one another and the overlapping region should contain the mutant gene. If enough of the genome sequence is known, that region is searched for candidate genes. Coding regions of these genes are then sequenced until a mutation is discovered or another patient is discovered, in which case the analysis can be repeated, potentially narrowing down the region of interest.
The differences between most disease gene identification procedures are in the second step (where DNA samples are analyzed and screened to determine regions in which the mutation could reside).
Without the aid of the whole-genome sequences, pre-genomics investigations looked at select regions of the genome, often with only minimal knowledge of the gene sequences they were looking at. Genetic techniques capable of providing this sort of information include Restriction Fragment Length Polymorphism (RFLP) analysis and microsatellite analysis.
Loss of heterozygosity (LOH) is a technique that can only be used to compare two samples from the same individual. LOH analysis is often used when identifying cancer-causing oncogenes in that one sample consists of (mutant) tumor DNA and the other (control) sample consists of genomic DNA from non-cancerous cells from the same individual. RFLPs and microsatellite markers provide patterns of DNA polymorphisms, which can be interpreted as residing in a heterozygous region or a homozygous region of the genome. Provided that all individuals are affected with the same disease resulting from a manifestation of a deletion of a single copy of the same gene, all individuals will contain one region where their control sample is heterozygous but the mutant sample is homozygous - this region will contain the disease gene.