Cis-natural antisense transcript

Natural antisense transcripts (NATs) are a group of RNAs encoded within a cell that have sequence complementarity to other RNA transcripts. They have been identified in multiple eukaryotes, including humans, mice, yeast and Arabidopsis thaliana. This class of RNAs includes both protein-coding and non-coding RNAs. Current evidence suggests a variety of regulatory roles for NATs, such as RNA interference (RNAi), alternative splicing, genomic imprinting, and X-chromosome inactivation. NATs are broadly grouped into two categories based on whether they act in cis or in trans. Trans-NATs are transcribed from a different location than their targets and usually have complementarity to multiple transcripts, with some mismatches. MicroRNAs (miRNAs) are an example of trans-NATs that can target multiple transcripts with a few mismatches. Cis-natural antisense transcripts (cis-NATs), on the other hand, are transcribed from the same genomic locus as their target but from the opposite DNA strand, and so pair perfectly with it over the overlapping region.
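The difference between perfect (cis) and imperfect (trans) pairing can be illustrated with a minimal Python sketch. The function names are hypothetical, the sequences are made up, and the check is deliberately simplified: it compares two equal-length RNA strings under strict Watson-Crick rules, so G-U wobble pairs and gapped alignments are counted as mismatches or not handled at all.

```python
# Strict Watson-Crick complement table for RNA (assumption: no wobble pairing).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(sense: str, antisense: str) -> int:
    """Count positions where `antisense` fails to pair with `sense`.

    Both sequences are written 5'->3' and must have the same length;
    a perfectly complementary antisense equals reverse_complement(sense).
    """
    if len(sense) != len(antisense):
        raise ValueError("sequences must have the same length")
    expected = reverse_complement(sense)
    return sum(1 for x, y in zip(expected, antisense.upper()) if x != y)


if __name__ == "__main__":
    sense = "AUGGCUAC"                           # toy sequence, not real data
    print(count_mismatches(sense, "GUAGCCAU"))   # cis-like: perfect pairing
    print(count_mismatches(sense, "GUAGCCAA"))   # trans-like: one mismatch
```

Under these assumptions, a cis-NAT overlap would score zero mismatches, while a trans-NAT such as a miRNA would typically score a small non-zero count against each of its targets.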

Cis-NATs have a variety of orientations and differing lengths of overlap between pairs. Five orientations have been identified to date: head-to-head, tail-to-tail, completely overlapping, nearby head-to-head, and nearby tail-to-tail. Head-to-head, in which the 5' ends of the two transcripts overlap, is described as the most common orientation, although some studies have suggested that tail-to-tail pairs, in which the 3' ends overlap, are the most common. The head-to-head orientation would result in the greatest knockdown of gene expression if transcriptional collision is the reason for transcript inhibition. In completely overlapping pairs, one gene lies entirely within the span of the other on the opposite strand. Nearby head-to-head and nearby tail-to-tail pairs do not physically overlap but are located very close to each other. Current evidence suggests that NAT pairs are overrepresented among genes that have catalytic activity. There may be something about these genes in particular that makes them more prone to this type of regulation.
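The five orientations can be expressed as a small classification rule over genomic coordinates. The following Python sketch is illustrative only: the `Transcript` type, the `classify_pair` function, and the `max_gap` threshold of 1000 bases for "nearby" pairs are assumptions made for this example, since the text does not define how close "very close" is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    start: int   # leftmost genomic coordinate (1-based, inclusive)
    end: int     # rightmost genomic coordinate (inclusive)
    strand: str  # "+" or "-"


def classify_pair(a: Transcript, b: Transcript, max_gap: int = 1000) -> str:
    """Classify two transcripts at one locus into a cis-NAT orientation.

    `max_gap` is an arbitrary illustrative cutoff for "nearby" pairs.
    """
    if a.strand == b.strand:
        return "not antisense"
    plus, minus = (a, b) if a.strand == "+" else (b, a)

    # Positive value: number of shared bases; zero or negative: a gap.
    overlap = min(plus.end, minus.end) - max(plus.start, minus.start) + 1

    if overlap > 0:
        plus_contains = plus.start <= minus.start and minus.end <= plus.end
        minus_contains = minus.start <= plus.start and plus.end <= minus.end
        if plus_contains or minus_contains:
            return "completely overlapping"
        # The 3' end of a "+" transcript is its right edge, and the 3' end
        # of a "-" transcript is its left edge, so "+" lying to the left
        # means the two 3' ends overlap.
        return "tail-to-tail" if plus.start < minus.start else "head-to-head"

    if -overlap > max_gap:
        return "unpaired"
    return "nearby tail-to-tail" if plus.end < minus.start else "nearby head-to-head"


if __name__ == "__main__":
    # Toy coordinates, not real genes.
    print(classify_pair(Transcript(100, 500, "+"), Transcript(50, 200, "-")))
    print(classify_pair(Transcript(100, 500, "+"), Transcript(400, 800, "-")))
```

With these toy coordinates the first pair is classified as head-to-head and the second as tail-to-tail. Real annotations usually carry several isoforms per gene, which this sketch does not attempt to handle.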

Identification of NATs in whole genomes is possible due to the large collection of sequence data available from multiple organisms. In silico methods for detecting NATs suffer from several shortcomings depending on the source of sequence information. Studies that use mRNA have sequences whose orientations are known, but the amount of mRNA sequence information available is small. Predicted gene models, produced by algorithms trained to look for genes, give increased coverage of the genome at the cost of confidence in the identified genes. Another resource is the extensive expressed sequence tag (EST) libraries, but these short sequences must first be assigned an orientation before useful information can be extracted from them. Some studies have used specific sequence features in the ESTs, such as the poly(A) signal, the poly(A) tail, and splice sites, both to filter the ESTs and to give them the correct transcriptional orientation. Combinations of the different sequence sources attempt to maximize coverage while maintaining the integrity of the data.
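As a rough illustration of how poly(A) features can orient an EST, the Python sketch below looks for a trailing poly(A) tail (read in the sense direction) or a leading poly(T) run (read as the reverse complement), and then checks for the canonical AATAAA polyadenylation signal near it. The function name and the `min_tail` and `signal_window` values are assumptions chosen for the example, not thresholds taken from any particular study; splice-site evidence is not modeled here.

```python
import re
from typing import Optional, Tuple


def infer_est_strand(
    seq: str, min_tail: int = 10, signal_window: int = 40
) -> Tuple[Optional[str], bool]:
    """Guess the transcriptional orientation of an EST (DNA alphabet).

    Returns (strand, has_signal). `strand` is "+" when the read ends in a
    poly(A) tail, "-" when it starts with a poly(T) run, and None when the
    evidence is absent or ambiguous. `has_signal` reports whether AATAAA
    (or its reverse complement TTTATT) lies within `signal_window` bases
    of the tail. Both thresholds are illustrative assumptions.
    """
    s = seq.upper()
    tail = re.search(r"A{%d,}$" % min_tail, s)  # poly(A) at the 3' end
    head = re.match(r"T{%d,}" % min_tail, s)    # poly(T) at the 5' end

    if tail and not head:
        upstream = s[max(0, tail.start() - signal_window):tail.start()]
        return "+", "AATAAA" in upstream
    if head and not tail:
        downstream = s[head.end():head.end() + signal_window]
        return "-", "TTTATT" in downstream
    return None, False  # ESTs like this would be filtered out


if __name__ == "__main__":
    # Toy read, not real data.
    est = "GATTACAGGC" + "AATAAA" + "GCTGCTGCTGCTGCTG" + "A" * 15
    print(infer_est_strand(est))
```

ESTs for which the function returns `None` carry no usable orientation evidence under these assumptions and would be dropped, which mirrors the filtering role that the poly(A) features play in the studies described above.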

...
Wikipedia