Solenoid protein domains are a highly modular type of protein domain. They consist of a chain of nearly identical folds, often simply called "repeats". They are extremely common among all types of proteins, though exact figures are unknown.
In proteins, a "repeat" is any sequence block that occurs more than once in the sequence, in either identical or highly similar form. Repetitiveness by itself says nothing about the structure of the protein. As a rule of thumb, short repetitive sequences (e.g. those shorter than 10 amino acids) may be intrinsically disordered and not part of any folded protein domain. Repeats that are at least 30 to 40 amino acids long are far more likely to be folded as part of a domain. Such long repeats frequently indicate the presence of a solenoid domain in the protein.
Examples of disordered repetitive sequences include the 7-residue peptide repeats found in the RPB1 subunit of RNA polymerase II, and the tandem beta-catenin or axin binding linear motifs in adenomatous polyposis coli (APC). Examples of short repeats exhibiting ordered structures include the three-residue collagen repeat and the five-residue pentapeptide repeat that forms a beta helix structure.
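The length-based rule of thumb can be written down as a small Python sketch. The function name, the returned labels, and the exact cut-offs (below 10 residues, and 30 residues as the lower end of the "30 to 40" range) are assumptions made for illustration only; the rule is a heuristic and does not predict structure.

```python
def classify_repeat_length(length: int) -> str:
    """Apply the length rule of thumb to a single repeat unit.

    Thresholds are illustrative assumptions taken from the text:
    below 10 residues -> may be disordered; 30 or more -> likely folded.
    """
    if length < 1:
        raise ValueError("repeat length must be a positive integer")
    if length < 10:
        return "possibly disordered"
    if length >= 30:
        return "likely folded"
    # The text gives no guidance for intermediate lengths.
    return "indeterminate"


# Repeat lengths mentioned in the text, plus one hypothetical long repeat.
for name, length in [
    ("RPB1 7-residue repeat", 7),
    ("collagen repeat", 3),
    ("pentapeptide repeat", 5),
    ("hypothetical 35-residue repeat", 35),
]:
    print(f"{name}: {classify_repeat_length(length)}")
```

The collagen and pentapeptide repeats are labelled "possibly disordered" by this sketch even though they form ordered structures, which shows why the length criterion is only a first guess and not a structural prediction.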
Because their building blocks are nearly identical, solenoid domains can assume only a limited number of shapes. Two main topologies are possible: linear (or open, generally with some degree of helical curvature) and circular (or closed).
If the two terminal repeats of a solenoid do not physically interact, the result is an open, or linear, structure. Members of this group are frequently rod- or crescent-shaped. The number of individual repeats can range from 2 to over 50. A clear advantage of this topology is that both the N- and C-terminal ends are free to gain new repeats and folds, or to lose existing ones, during evolution without any gross impact on the structural stability of the entire domain. This type of domain is extremely common among extracellular segments of receptors and cell adhesion molecules. A non-exhaustive list of examples includes EGF repeats, cadherin repeats, leucine-rich repeats, HEAT repeats, ankyrin repeats, armadillo repeats and tetratricopeptide repeats. Whenever a linear solenoid domain participates in protein-protein interactions, the ligand-binding site is frequently formed by at least 3 repetitive subunits. Thus, while individual repeats might have a (limited) ability to fold on their own, they usually cannot perform the functions of the entire domain alone.