Abstract
Pattern mining is an important data mining task that involves extracting interesting associations from large databases. It is typically carried out on huge databases that are updated repeatedly. Consequently, as a given database is updated, some of the discovered patterns may become invalid, while some new patterns may emerge. This has motivated significant research efforts in the area of incremental mining. The goal of incremental mining is to mine patterns efficiently and incrementally when a database is updated, as opposed to mining all of the patterns from scratch from the complete database. Research efforts are being made to develop incremental pattern mining algorithms for extracting different kinds of patterns, such as frequent patterns, sequential patterns and utility patterns. However, none of the existing works addresses incremental mining in the context of coverage patterns, which have important applications in areas such as banner advertising, search engine advertising and graph mining. In this regard, the main contributions of this work are three-fold. First, we introduce the problem of incremental mining in the context of coverage patterns. Second, we propose the IncCMine algorithm for efficiently extracting the knowledge of coverage patterns when an incremental database is added to the existing database. Third, we perform extensive experiments using two real-world click stream datasets and one synthetic dataset. The results of our performance evaluation demonstrate that the proposed IncCMine algorithm significantly improves performance over the existing CMine algorithm.
Index Terms—Data mining, Coverage patterns, Incremental mining, Knowledge discovery