A conserved non-coding sequence (CNS) is a noncoding DNA sequence that is evolutionarily conserved. These sequences are of interest for their potential to regulate gene expression.
CNSs in plants and animals are strongly associated with transcription factor binding sites and other cis-acting regulatory elements. Conserved non-coding sequences can be important sites of evolutionary divergence, as mutations in these regions may alter the regulation of conserved genes, producing species-specific patterns of gene expression. These features have made them an invaluable resource in comparative genomics.
Because their evolution is constrained, all CNSs are likely to perform some function, but they can be distinguished by where in the genome they are found and how they arose.
Introns are stretches of sequence, found mostly in eukaryotic organisms, that interrupt the coding regions of genes; their lengths in base pairs vary across three orders of magnitude. Intron sequences may be conserved, often because they contain expression-regulating elements that place functional constraints on their evolution. Patterns of conserved introns between species of different kingdoms have been used to make inferences about intron density at different points in evolutionary history. This makes them an important resource for understanding the dynamics of intron gain and loss in eukaryotes (1,28).
Some of the most highly conserved noncoding regions are found in the untranslated regions (UTRs) at the 3’ end of mature RNA transcripts, rather than in the introns. This suggests an important function operating at the post-transcriptional level. If these regions perform an important regulatory function, the increase in 3’-UTR length over evolutionary time suggests that conserved UTRs contribute to organism complexity. Regulatory motifs in UTRs, which are often conserved among genes belonging to the same metabolic family, could potentially be used to develop highly specific medicines that target RNA transcripts.