
RIDE
1997
IEEE

Evaluation of Sampling for Data Mining of Association Rules

Discovery of association rules is a prototypical problem in data mining. The current algorithms proposed for data mining of association rules make repeated passes over the database to determine the commonly occurring itemsets (or sets of items). For large databases, the I/O overhead in scanning the database can be extremely high. In this paper we show that random sampling of transactions in the database is an effective method for finding association rules. Sampling can speed up the mining process by more than an order of magnitude by reducing I/O costs and drastically shrinking the number of transactions to be considered. We may also be able to make the sampled database resident in main memory. Furthermore, we show that sampling can accurately represent the data patterns in the database with high confidence. We experimentally evaluate the effectiveness of sampling on different databases, and study the relationship between the performance, and the accuracy and confidence of the chosen sample...
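The idea in the abstract, mining a random sample of transactions instead of the full database, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `frequent_itemsets` and `sample_transactions`, the brute-force counting (used here in place of a multi-pass algorithm such as Apriori), the sampling without replacement, and the `max_size` cap are all assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size items.

    Hypothetical helper: brute-force counting in a single pass, an
    illustrative stand-in for a real association-rule miner.
    Support is the fraction of transactions containing the itemset.
    """
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # de-duplicate and fix a canonical order
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def sample_transactions(transactions, fraction, seed=0):
    """Draw a simple random sample of transactions without replacement.

    Assumption: the sample size is a fixed fraction of the database,
    with at least one transaction.
    """
    rng = random.Random(seed)
    size = max(1, int(len(transactions) * fraction))
    return rng.sample(transactions, size)


if __name__ == "__main__":
    # Synthetic database (made-up data, for illustration only).
    rng = random.Random(42)
    db = [[i for i in "abcde" if rng.random() < 0.4] for _ in range(10_000)]

    full = frequent_itemsets(db, min_support=0.1)
    sampled = frequent_itemsets(sample_transactions(db, 0.1), min_support=0.1)

    # Compare the support estimated from the sample with the true support.
    for itemset in sorted(full):
        print(itemset, round(full[itemset], 3), round(sampled.get(itemset, 0.0), 3))
```

Mining the sample touches only a tenth of the transactions in this setup, at the cost of support values that are estimates. Itemsets whose true support lies close to the threshold may be missed or wrongly included, which is the trade-off between performance and accuracy that the abstract describes.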
Added 06 Aug 2010
Updated 06 Aug 2010
Type Conference
Year 1997
Where RIDE
Authors Mohammed Javeed Zaki, Srinivasan Parthasarathy, Wei Li, Mitsunori Ogihara