The Cancer Genome Atlas (TCGA) is a project, begun in 2005, to catalogue genetic mutations responsible for cancer, using genome sequencing and bioinformatics. TCGA applies high-throughput genome analysis techniques to improve our ability to diagnose, treat, and prevent cancer through a better understanding of the genetic basis of this disease.
TCGA is supervised by the National Cancer Institute's Center for Cancer Genomics and the National Human Genome Research Institute, both of which are funded by the US government. A three-year pilot project, begun in 2006, focused on characterizing three types of human cancer: glioblastoma multiforme, lung cancer, and ovarian cancer. In 2009, the project expanded into phase II, which planned to complete the genomic characterization and sequence analysis of 20 to 25 different tumor types by 2014. TCGA surpassed that goal, characterizing 33 cancer types, including 10 rare cancers. Funding is split between genome characterization centers (GCCs), which perform the sequencing, and genome data analysis centers (GDACs), which perform the bioinformatic analyses.
The project scheduled 500 patient samples, more than most genomics studies, and analyzed them with several techniques: gene expression profiling, copy number variation profiling, SNP genotyping, genome-wide DNA methylation profiling, microRNA profiling, and exon sequencing of at least 1,200 genes. TCGA also sequenced the entire genomes of some tumors and performed targeted sequencing of at least 6,000 candidate genes and microRNA sequences. This targeted sequencing was performed by all three sequencing centers using hybrid-capture technology. In phase II, TCGA performed whole-exome sequencing on 80% of the cases and whole-genome sequencing on 80% of the cases used in the project.