Extracting redundancy-aware top-k patterns

Many applications need to extract a small set of frequent patterns that have not only high significance but also low redundancy. Significance is usually defined by the application context. Previous studies have concentrated on either computing the top-k significant patterns or removing redundancy among patterns, treating the two tasks separately. Little work addresses finding top-k patterns that demonstrate high significance and low redundancy simultaneously. In this paper, we study the problem of extracting redundancy-aware top-k patterns from a large collection of frequent patterns. We first examine the evaluation functions for measuring the combined significance of a pattern set and propose MMS (Maximal Marginal Significance) as the problem formulation. The problem is NP-hard. We further present a greedy algorithm that approximates the optimal solution with performance bound O(log k) (with conditions on redundancy), where k is the numb...
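The greedy idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything below is an assumption made for illustration: the marginal gain of a candidate is taken to be its significance minus its maximum redundancy with any already-selected pattern, and redundancy is taken to be the Jaccard overlap of the supporting transaction sets scaled by the smaller of the two significances. The names `greedy_top_k`, `significance`, `redundancy`, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Callable, Hashable, List, Sequence


def greedy_top_k(
    patterns: Sequence[Hashable],
    significance: Callable[[Hashable], float],
    redundancy: Callable[[Hashable, Hashable], float],
    k: int,
) -> List[Hashable]:
    """Greedily pick up to k patterns by marginal gain.

    Assumed gain (not confirmed by the abstract):
        gain(p) = significance(p) - max(redundancy(p, q) for q already selected)
    """
    remaining = list(patterns)
    selected: List[Hashable] = []

    def gain(p: Hashable) -> float:
        if not selected:
            # Nothing chosen yet, so there is no redundancy penalty.
            return significance(p)
        return significance(p) - max(redundancy(p, q) for q in selected)

    while remaining and len(selected) < k:
        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): pattern -> (significance, supporting transaction ids).
DATA = {
    "ab": (0.90, {1, 2, 3, 4}),
    "abc": (0.85, {1, 2, 3}),
    "de": (0.50, {5, 6}),
}


def sig(p: str) -> float:
    return DATA[p][0]


def red(p: str, q: str) -> float:
    # Assumed redundancy: Jaccard overlap scaled by the smaller significance.
    tp, tq = DATA[p][1], DATA[q][1]
    jaccard = len(tp & tq) / len(tp | tq)
    return jaccard * min(sig(p), sig(q))


top2 = greedy_top_k(list(DATA), sig, red, k=2)
print(top2)
```

In this toy run, "abc" is nearly as significant as "ab" but overlaps heavily with it, so the redundancy penalty makes the less significant but unrelated "de" the second pick. A plain top-k by significance would have returned the two near-duplicates instead.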
Dong Xin, Hong Cheng, Xifeng Yan, Jiawei Han
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2006
Where KDD