Abstract
A detailed molecular-level understanding of the physicochemical interactions responsible for the overall architecture and dynamics of functional RNA molecules remains a major challenge for the scientific community. RNA 3D structure is currently analyzed in terms of strongly hydrogen-bonded secondary structural elements that interact primarily through noncanonical tertiary base-pairing interactions. Identifying recurring combinations of such elements as motifs, which are important architecturally or as anchors for association, enhances our understanding of RNA tertiary structure. Current approaches to motif mining are constrained by their dependence on the identification of strongly hydrogen-bonded base pairs. In this work, we have attempted a bottom-up approach to identifying and characterizing the structural features responsible for the complex architecture and dynamics of functional RNA molecules. At the lower end, we have studied noncovalent interactions such as inter-nucleotide hydrogen bonds and water- and ion-mediated interactions; at the higher end, we have tried to capture recurrent structural motifs comprising several base pairs and higher-order interactions. To study all the important weak and 'other than base-base' interactions within a non-redundant dataset of functional RNA molecules, we have developed a Python-based automated tool, Inter-Nucleotide Contact Annotator in RNA (INCAR). INCAR efficiently manages space and time complexity to capture, categorize, and suitably annotate all significant noncovalent interactions, such as hydrogen-bonded and ion-mediated interactions. Its results show high specificity, without compromising sensitivity, relative to currently available benchmarks such as BPFIND and FR3D. Key features of our analysis include a) classification of bifurcated geometries; b) identification of various structural elements such as platforms, bulges, loops, and stems; c) assessment of the significance of weak hydrogen bonds; d) classification of higher-order interactions such as triplets and quartets; and e) a new automated centroid-based approach, which does not require prior knowledge of standard base-pair geometries or of the atoms involved in hydrogen bonding, for detecting the glycosidic bond orientation of various standard and non-standard base-pairing geometries (bifurcated, single hydrogen bonded, deformed). We have identified a new structurally important A-A di-loop and an interesting tertiary interaction in a ribozyme. In addition, INCAR has been applied in several areas, including comparative analysis of the structural conservation of a class of sequence variants, investigation of motifs in terms of 'other than base-base' contacts, and analysis of molecular dynamics simulation trajectories of RNA molecules.