JIIS
2008

Maintaining frequent closed itemsets over a sliding window

In this paper, we study the incremental update of Frequent Closed Itemsets (FCIs) over a sliding window in a high-speed data stream. We propose the notion of semi-FCIs, which progressively raises the minimum support threshold for an itemset the longer it is retained in the window, thereby drastically reducing the number of itemsets that need to be maintained and processed. We explore the properties of semi-FCIs and observe that a majority of the subsets of a semi-FCI are not themselves semi-FCIs and need not be updated. This finding allows us to devise an efficient algorithm, IncMine, that incrementally updates the set of semi-FCIs over a sliding window. We also develop an inverted index to facilitate the update process. Our empirical results show that IncMine achieves significantly higher throughput and consumes less memory than the state-of-the-art streaming algorithms for mining FCIs and frequent itemsets (FIs). IncMine also attains high accuracy, with 100% precision and over 93% recall.
James Cheng, Yiping Ke, Wilfred Ng
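
The ideas named in the abstract (a support threshold that rises with retention time, closed itemsets, and an inverted index for superset lookups) can be illustrated with a small Python sketch. This is not IncMine itself: the abstract does not give the threshold schedule or the index layout, so the linear schedule in `relaxed_minsup`, the parameters `sigma`, `r`, and `batch_size`, and all function names are assumptions made for illustration. The brute-force subset enumeration is only practical for toy inputs.

```python
from itertools import combinations
from math import ceil


def relaxed_minsup(k, window_size, sigma, r, batch_size):
    """Hypothetical threshold for an itemset retained for k batches.

    ASSUMPTION: the relaxation rate grows linearly from r (k = 1)
    to 1.0 (k = window_size); the abstract only says it increases.
    """
    if window_size == 1:
        rate = 1.0
    else:
        rate = r + (1.0 - r) * (k - 1) / (window_size - 1)
    return ceil(rate * sigma * k * batch_size)


def support_counts(transactions):
    """Count every non-empty subset of every transaction (toy sizes only)."""
    counts = {}
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return counts


def build_inverted_index(itemsets):
    """Map each item to the set of itemsets that contain it."""
    index = {}
    for itemset in itemsets:
        for item in itemset:
            index.setdefault(item, set()).add(itemset)
    return index


def supersets_of(itemset, index):
    """Proper supersets of `itemset`, found by intersecting posting lists."""
    postings = [index.get(item, set()) for item in itemset]
    return set.intersection(*postings) - {itemset}


def closed_itemsets(counts, min_count):
    """Frequent itemsets with no proper superset of equal support."""
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    index = build_inverted_index(frequent)
    return {
        s: c
        for s, c in frequent.items()
        if all(frequent[t] != c for t in supersets_of(s, index))
    }


# Toy batch of four transactions.
batch = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
closed = closed_itemsets(support_counts(batch), min_count=2)

# Thresholds for an itemset retained for 1, 2, and 3 batches of 4 transactions.
thresholds = [relaxed_minsup(k, 3, sigma=0.5, r=0.5, batch_size=4) for k in (1, 2, 3)]
```

In this toy batch, `{b}` and `{c}` are frequent but not closed, because `{a, b}` and `{a, c}` have the same support. Under the assumed schedule, an itemset that has stayed in the window longer must meet a higher count to be kept, which is the pruning effect the abstract describes.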
Added 13 Dec 2010
Updated 13 Dec 2010
Type Journal
Year 2008
Where JIIS