Pudi, Vikram and Haritsa, Jayant R (2003) Reducing rule covers with deterministic error bounds. In: Advances In Knowledge Discovery And Data Mining, 2637, pp. 313-324.
The output of boolean association rule mining algorithms is often too large for manual examination. For dense datasets, it is often impractical to even generate all frequent itemsets. The closed itemset approach handles this information overload by pruning "uninteresting" rules following the observation that most rules can be derived from other rules. In this paper, we propose a new framework, namely, the generalized closed (or g-closed) itemset framework. By allowing for a small tolerance in the accuracy of itemset supports, we show that the number of such redundant rules is far more than what was previously estimated. Our scheme can be integrated into both levelwise algorithms (Apriori) and two-pass algorithms (ARMOR). We evaluate its performance by measuring the reduction in output size as well as in response time. Our experiments show that incorporating g-closed itemsets provides significant performance improvements on a variety of databases.
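The idea of pruning itemsets whose supports can be recovered from a superset, exactly or within a small tolerance, can be illustrated with a minimal sketch. This is not the paper's g-closed algorithm and does not reproduce its deterministic error bounds. The function names (`frequent_itemsets`, `tolerant_closed`), the brute-force enumeration, and the redundancy rule used here (an itemset is dropped if some frequent proper superset has a support count within `tolerance` times its own count) are all assumptions made for illustration. With `tolerance = 0` the rule reduces to the usual closed itemset condition.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of frequent itemsets and their support counts."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                result[cset] = count
                found_any = True
        if not found_any:
            break  # no frequent itemset of this size, so none of a larger size
    return result


def tolerant_closed(frequent, tolerance):
    """Keep itemsets that have no frequent proper superset with nearly equal support.

    Illustrative rule (an assumption, not the paper's definition): an itemset
    is redundant if a proper superset's count differs from its own count by
    at most tolerance * (its own count).
    """
    kept = {}
    for itemset, count in frequent.items():
        redundant = any(
            itemset < other and count - other_count <= tolerance * count
            for other, other_count in frequent.items()
        )
        if not redundant:
            kept[itemset] = count
    return kept


# Hypothetical toy database: four copies of {a, b, c}, one {a, b}, one {c}.
transactions = [frozenset("abc")] * 4 + [frozenset("ab"), frozenset("c")]

frequent = frequent_itemsets(transactions, min_support=2)
exact = tolerant_closed(frequent, tolerance=0.0)    # ordinary closed itemsets
approx = tolerant_closed(frequent, tolerance=0.25)  # supports may be off by up to 25%

print(len(frequent), len(exact), len(approx))
```

On this toy database all seven non-empty itemsets are frequent, the exact rule keeps three of them, and the tolerant rule keeps only `{a, b, c}`, which shows how a small tolerance in supports can shrink the output further than exact closure does.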
Item Type: Journal Article
Additional Information: Copyright of this article belongs to Springer.
Department/Centre: Division of Information Sciences > Supercomputer Education & Research Centre
Date Deposited: 31 Oct 2009 07:53
Last Modified: 19 Sep 2010 04:56