Constraint-Based Rule Mining in Large, Dense Databases |
| |
Authors: | Roberto J. Bayardo Jr., Rakesh Agrawal, Dimitrios Gunopulos |
| |
Affiliation: | (1) IBM Almaden Research Center, San Jose, CA 95120, USA; (2) IBM Almaden Research Center, San Jose, CA 95120, USA; (3) IBM Almaden Research Center, San Jose, CA 95120, USA |
| |
Abstract: | Constraint-based rule miners find all rules in a given data set meeting user-specified constraints such as minimum support
and confidence. We describe a new algorithm that directly exploits all user-specified constraints including minimum support,
minimum confidence, and a new constraint that ensures every mined rule offers a predictive advantage over any of its simplifications.
Our algorithm maintains efficiency even at low supports on data that is dense (e.g. relational tables). Previous approaches
such as Apriori and its variants exploit only the minimum support constraint, and as a result are ineffective on dense data
due to a combinatorial explosion of “frequent itemsets”. |
| |
Keywords: | data mining; association rules; rule induction |
This article is indexed in SpringerLink and other databases. |
|