Sciweavers

KDD
2007
ACM

The minimum consistent subset cover problem and its applications in data mining

In this paper, we introduce and study the Minimum Consistent Subset Cover (MCSC) problem. Given a finite ground set X and a constraint t, find the minimum number of consistent subsets that cover X, where a subset of X is consistent if it satisfies t. The MCSC problem generalizes the traditional set covering problem and has Minimum Clique Partition, a dual problem of graph coloring, as an instance. Many practical data mining problems in the areas of rule learning, clustering, and frequent pattern mining can be formulated as MCSC instances. In particular, we discuss the Minimum Rule Set problem that minimizes model complexity of decision rules as well as some converse k-clustering problems that minimize the number of clusters satisfying certain distance constraints. We also show how the MCSC problem can find applications in frequent pattern summarization. For any of these MCSC formulations, our proposed novel graph-based generic algorithm CAG is directly applicable. CAG starts by con...
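To make the problem statement above concrete, here is a minimal brute-force sketch of MCSC on a tiny instance. It is not the CAG algorithm from the paper. It simply enumerates consistent subsets and searches for the smallest collection that covers X, which is only feasible for very small ground sets. The function names and the example constraint (a diameter bound on 1-D points, in the spirit of the converse k-clustering formulation) are assumptions made for illustration.

```python
from itertools import combinations


def consistent_subsets(ground, is_consistent):
    """Enumerate every non-empty subset of `ground` that satisfies the constraint."""
    items = list(ground)
    found = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            subset = frozenset(combo)
            if is_consistent(subset):
                found.append(subset)
    return found


def min_consistent_subset_cover(X, is_consistent):
    """Return a smallest list of consistent subsets whose union is X.

    Exhaustive search (exponential time), intended only to illustrate
    the definition. Returns None if no consistent cover exists.
    """
    ground = frozenset(X)
    if not ground:
        return []
    candidates = consistent_subsets(ground, is_consistent)
    # If any cover exists, a minimum one never needs more than |X| subsets.
    for k in range(1, len(ground) + 1):
        for combo in combinations(candidates, k):
            if frozenset().union(*combo) == ground:
                return list(combo)
    return None


# Hypothetical constraint t: a subset is consistent if its diameter is at most 2.
def diameter_at_most_2(subset):
    return max(subset) - min(subset) <= 2


points = [1, 2, 3, 10, 11, 20]
cover = min_consistent_subset_cover(points, diameter_at_most_2)
print(len(cover))  # 3
```

With this constraint the minimum cover groups the points into {1, 2, 3}, {10, 11}, and {20}, so the answer corresponds to the smallest number of clusters that respect the distance bound.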
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2007
Where KDD
Authors Byron J. Gao, Martin Ester, Jin-yi Cai, Oliver Schulte, Hui Xiong