Abstract
BACKGROUND: The ankyrin repeat is one of the most frequently observed structural motifs in proteins across all kingdoms of life. Ankyrin repeat proteins are involved in a diverse set of cellular functions, acting as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers; consequently, defects in these proteins have been found in a number of human diseases. Identification of these structural repeats at the sequence level is difficult because of low conservation between the repeat copies. Thus, analysis at the structure level is desirable.
RESULTS: In this study, we propose a graph-based approach for the identification and analysis of ankyrin repeats. The 3-dimensional topology of protein structures has been shown to be well captured by protein contact graphs. The connectivity information of these networks is represented in the adjacency matrix, and here we propose analysis of the eigenspectra of the adjacency matrix for the identification of structural repeats. A clear two-peak pattern corresponding to the helix-turn-helix region of the ankyrin motif is observed in the principal eigenvector of the adjacency matrix. The length distribution of this repetitive pattern, along with the organization of the secondary structure elements, is used to design an algorithm to identify ankyrin structural motifs. The analysis has been carried out on a non-redundant set of 51 proteins annotated in the UniProt database, and very good agreement with the annotations is observed. Analysis of all the proteins in the alpha and alpha+beta classes of the SCOP database has also been performed, and a number of novel repeats not annotated in the database have been identified. The approach is then applied to other structural repeats such as the tetratricopeptide repeat (TPR), Annexin, HEAT and ARM.
CONCLUSIONS: The graph-based analysis of protein structures, combined with domain information such as the organization of the secondary structure elements, provides a computationally efficient approach for the identification of structural repeats.
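As an illustration of the quantities named in the RESULTS, the following minimal sketch builds a residue contact graph from per-residue coordinates and computes the principal eigenvector of its adjacency matrix. It is not the authors' implementation: the function names, the use of one coordinate per residue, the 7.0 angstrom distance cutoff and the power-iteration solver are all assumptions made for the example.

```python
import math


def contact_adjacency(coords, cutoff=7.0):
    """Binary adjacency matrix of a contact graph.

    coords: one (x, y, z) tuple per residue (e.g. C-alpha atoms; an assumption).
    cutoff: contact distance in angstroms (illustrative value, not from the paper).
    """
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = adj[j][i] = 1
    return adj


def principal_eigenvector(adj, iterations=500, tol=1e-10):
    """Eigenvector of the largest eigenvalue, by power iteration.

    The iteration multiplies by (A + I) rather than A. The shift leaves the
    eigenvectors unchanged but avoids oscillation on bipartite graphs.
    """
    n = len(adj)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        nxt = [vec[i] + sum(adj[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        nxt = [x / norm for x in nxt]
        if max(abs(a - b) for a, b in zip(nxt, vec)) < tol:
            return nxt
        vec = nxt
    return vec


# Toy input: three residues on a line, 5 angstroms apart (hypothetical data).
toy_coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
toy_adj = contact_adjacency(toy_coords)
toy_vec = principal_eigenvector(toy_adj)
```

In this toy case the middle residue has the most contacts and therefore the largest eigenvector component. For a real structure, the components would be plotted against residue index to look for the repeating peak pattern described above.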