Abstract
Given a retail transactional database, the objective of high-utility pattern mining is to discover high-utility itemsets (HUIs), i.e., itemsets whose utility meets or exceeds a user-specified threshold. In retail applications, when purchasing a set of items (i.e., an itemset), consumers seek to replace one item with another to suit their individual preferences (e.g., Coke with Pepsi, tea with coffee). In practice, retailers also require substitutes to address operational issues such as stockouts, expiration, and other supply chain constraints. The implication is that items that are purchased interchangeably, i.e., substitute goods, are critical to ensuring both user satisfaction and sustained retailer profits. In this regard, this work presents (i) an efficient model to identify HUIs containing substitute goods in place of items that require substitution, (ii) the SubstiTution-based Itemset indeX (STIX) to retrieve HUIs containing substitutes, and (iii) an experimental study to demonstrate the benefits of the proposed approach w.r.t. a baseline method.