Abstract
Combining diverse compounds to facilitate drug discovery is becoming prevalent, especially for treating complex diseases such as cancers and HIV. A drug is a chemical compound, and any sub-structure of a chemical compound is designated as a fragment. A chemical compound or a fragment can be modeled as a graph structure. Given a set of chemical compounds and their corresponding large set of fragments, both modeled as graph structures, we address the problem of identifying potential combinations of diverse chemical compounds that cover a certain percentage of the set of fragments. In this regard, the key contributions of this work are three-fold. First, we introduce the notion of Graph Transactional Coverage Patterns (GTCPs) for any given graph transactional dataset. Second, we propose an efficient model and framework for extracting GTCPs from a given graph transactional dataset. Third, we conduct an extensive performance study using three real datasets to demonstrate that it is indeed feasible to efficiently extract GTCPs using our proposed GTCP-extraction framework. We also demonstrate the effectiveness of the GTCP-extraction framework through a case study in computer-aided drug design.

Index Terms: Graph mining, Graph transactions, Coverage patterns, Drug discovery