INAP
2001
Springer

Discovering Frequent Itemsets in the Presence of Highly Frequent Items

This paper presents new techniques for focusing the discovery of frequent itemsets within large, dense datasets containing highly frequent items. The existence of highly frequent items adds significantly to the cost of computing the complete set of frequent itemsets. Our approach allows for the exclusion of such items during the candidate generation phase of the Apriori algorithm. Afterwards, the highly frequent items can be reintroduced, via an inferencing framework, providing for a capability to generate frequent itemsets without counting their frequency. We demonstrate the use of these new techniques within the well-studied framework of the Apriori algorithm. Furthermore, we provide empirical results using our techniques on both synthetic and real datasets - both relevant since the real datasets exhibit statistical characteristics different from the probabilistic assumptions behind the synthetic data. The source we used for real data was the U.S. Census.
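The abstract does not spell out the inferencing framework, so the following is only a rough sketch of the general idea under stated assumptions, not the authors' method. It runs a plain level-wise Apriori pass with a set of highly frequent items excluded from candidate generation, then reintroduces one such item using the standard lower bound supp(X ∪ {h}) ≥ supp(X) + supp(h) − N, which can certify some itemsets as frequent without counting them. The function names `apriori` and `infer_with_frequent_item`, the absolute-count threshold `min_count`, and the toy data are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_count, excluded=frozenset()):
    """Level-wise Apriori that never generates candidates containing
    an item from `excluded` (assumed here to be the highly frequent items)."""
    # Strip excluded items once, so they cannot enter any candidate.
    rows = [frozenset(t) - excluded for t in transactions]

    # Count 1-itemsets.
    counts = {}
    for row in rows:
        for item in row:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(level)

    k = 2
    while level:
        # Join step plus Apriori pruning: keep a k-candidate only if
        # every (k-1)-subset was frequent at the previous level.
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Counting step: one scan of the reduced rows per candidate.
        level = {}
        for cand in candidates:
            count = sum(1 for row in rows if cand <= row)
            if count >= min_count:
                level[cand] = count
        frequent.update(level)
        k += 1
    return frequent


def infer_with_frequent_item(frequent, item, item_count, n_transactions, min_count):
    """Reintroduce `item` without counting: X | {item} is guaranteed
    frequent whenever supp(X) + supp(item) - N >= min_count.
    The stored value is a lower bound on the support, not the exact count."""
    inferred = {}
    for itemset, count in frequent.items():
        lower_bound = count + item_count - n_transactions
        if lower_bound >= min_count:
            inferred[itemset | {item}] = lower_bound
    return inferred


# Toy data (hypothetical): "h" plays the role of a highly frequent item.
transactions = [
    {"h", "a", "b"},
    {"h", "a", "b"},
    {"h", "a"},
    {"h", "b"},
    {"a", "b"},
]
h_count = sum(1 for t in transactions if "h" in t)
frequent = apriori(transactions, min_count=2, excluded=frozenset({"h"}))
inferred = infer_with_frequent_item(frequent, "h", h_count, len(transactions), 2)
```

The bound is one-sided: an itemset it fails to certify may still be frequent and would need an actual count, so this sketch trades completeness for fewer counting passes.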
Dennis P. Groth, Edward L. Robertson
Added 30 Jul 2010
Updated 30 Jul 2010
Type Conference
Year 2001
Where INAP