Abstract
Internal repetition within proteins is a commonly observed phenomenon and confers multiple binding and structural roles on proteins. The ankyrin repeat, one of the most widespread protein motifs in nature, forms a helix-turn-helix motif and functions exclusively to mediate protein-protein interactions, some of which are directly involved in the development of human cancer and other diseases. It has been observed that the ankyrin repeat motif is defined by its fold rather than by its function, and because no specific sequence underlies the fold, identifying it at the structural level is desirable. We propose a graph-based approach to the identification and analysis of these repeats. The topological details of protein structures have been shown to be effectively captured by protein graphs. In this study we analyzed about twelve graph measures for the identification of repeated structural motifs. Of these, degree, Katz status, PageRank, closeness vitality and the eigenvector corresponding to the principal eigenvalue of the adjacency matrix were found to be promising. We observe that by considering secondary structure information together with the analysis of these graph measures, it is possible to accurately identify the boundaries of individual repeats. This approach is computationally much more efficient than structure-structure self-comparison. Design and engineering of repeat proteins may help to elucidate their structural and biophysical properties, such as the dependence of stability and folding on the number of repeats and the importance of key intra- and inter-repeat interactions, and to identify novel binding molecules suitable for biotechnological or medical applications.
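As a rough illustration of two of the measures named above, the sketch below builds a residue contact graph and computes per-residue degree and the principal eigenvector of the adjacency matrix. The abstract does not say how the protein graph is constructed, so the C-alpha distance criterion, the 7.0 Å cutoff, and all function names here are assumptions made for illustration only, not the method used in the study.

```python
import math

def contact_graph(coords, cutoff=7.0):
    """Build adjacency sets for a residue contact graph.

    Assumption: residues i and j are connected when the distance between
    their coordinates (e.g. C-alpha atoms) is at most `cutoff` angstroms.
    """
    n = len(coords)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def degree(adj):
    """Number of contacts per residue."""
    return [len(neighbours) for neighbours in adj]

def principal_eigenvector(adj, iterations=500, tol=1e-10):
    """Approximate the eigenvector of the largest adjacency eigenvalue.

    Power iteration is applied to (A + I), which has the same eigenvectors
    as A; the shift avoids oscillation on bipartite graphs. The result is
    only well defined for a connected graph.
    """
    n = len(adj)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [v[i] + sum(v[j] for j in adj[i]) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        if max(abs(a - b) for a, b in zip(v, w)) < tol:
            return w
        v = w
    return v

# Toy input: six points on a straight line, 3.8 angstroms apart, so only
# consecutive points fall within the cutoff (a simple chain).
toy_coords = [(3.8 * i, 0.0, 0.0) for i in range(6)]
toy_graph = contact_graph(toy_coords)
print(degree(toy_graph))
print(principal_eigenvector(toy_graph))
```

Plotting such per-residue profiles along the sequence is one way a repeating pattern could be looked for; the remaining measures (Katz status, PageRank, closeness vitality) would be computed on the same graph.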