Sciweavers

PVLDB
2008

Mining non-redundant high order correlations in binary data

Many approaches have been proposed to find correlations in binary data. Usually, these methods focus on pair-wise correlations. In biology applications, it is important to find correlations that involve more than just two features. Moreover, a set of strongly correlated features should be non-redundant in the sense that the correlation is strong only when all the interacting features are considered together. Removing any feature will greatly reduce the correlation. In this paper, we explore the problem of finding non-redundant high order correlations in binary data. The high order correlations are formalized using multi-information, a generalization of pairwise mutual information. To reduce the redundancy, we require any subset of a strongly correlated feature subset to be weakly correlated. Such feature subsets are referred to as Non-redundant Interacting Feature Subsets (NIFS). Finding all NIFSs is computationally challenging, because in addition to enumerating feature combinations,...
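As a rough illustration of the quantities the abstract names, the sketch below computes empirical multi-information (the sum of the marginal entropies minus the joint entropy) on binary rows and checks the non-redundancy condition by brute force. The function names and the `strong` and `weak` thresholds are assumptions made for this example. The abstract is truncated, so this is not the paper's algorithm, and exhaustive subset enumeration is only practical for very small feature sets.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(rows, cols):
    """Empirical joint entropy, in bits, of the features `cols` over `rows`."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    n = len(rows)
    return -sum((k / n) * log2(k / n) for k in counts.values())


def multi_information(rows, cols):
    """Sum of marginal entropies minus the joint entropy of `cols`."""
    return sum(entropy(rows, [c]) for c in cols) - entropy(rows, cols)


def is_non_redundant(rows, cols, strong, weak):
    """True if `cols` is strongly correlated while every proper subset of
    size >= 2 is only weakly correlated.

    `strong` and `weak` are hypothetical thresholds chosen for this sketch.
    """
    if multi_information(rows, cols) < strong:
        return False
    for size in range(2, len(cols)):
        for subset in combinations(cols, size):
            if multi_information(rows, subset) > weak:
                return False
    return True


# Feature 2 is the XOR of features 0 and 1: no pair is correlated,
# but the three features together are.
xor_rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

# Feature 2 merely copies feature 0, so the pair (0, 2) already
# carries the whole correlation and the triple is redundant.
copy_rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]

xor_result = is_non_redundant(xor_rows, (0, 1, 2), strong=0.9, weak=0.1)
copy_result = is_non_redundant(copy_rows, (0, 1, 2), strong=0.9, weak=0.1)
```

The XOR case shows why pairwise methods miss such patterns: every pair of features looks independent, and the correlation appears only when all three are considered together.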
Added 28 Dec 2010
Updated 28 Dec 2010
Type Journal
Year 2008
Where PVLDB
Authors Xiang Zhang, Feng Pan, Wei Wang 0010, Andrew B. Nobel