Sciweavers

KDD
2009
ACM

Mining discrete patterns via binary matrix factorization

Mining discrete patterns in binary data is important for subsampling, compression, and clustering. We consider rank-one binary matrix approximations that identify the dominant patterns of the data, while preserving its discrete property. A best approximation on such data has a minimum set of inconsistent entries, i.e., mismatches between the given binary data and the approximate matrix. Due to the hardness of the problem, previous accounts of such problems employ heuristics and the resulting approximation may be far away from the optimal one. In this paper, we show that the rank-one binary matrix approximation can be reformulated as a 0-1 integer linear program (ILP). However, the ILP formulation is computationally expensive even for small-size matrices. We propose a linear program (LP) relaxation, which is shown to achieve a guaranteed approximation error bound. We further extend the proposed formulations using the regularization technique, which is commonly employed to address overfi...
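To make the objective in the abstract concrete, the following is a minimal sketch of rank-one binary approximation as a mismatch-counting problem. It is not the ILP or LP relaxation from the paper: it is an exhaustive search, assumed here only for illustration, that enumerates every binary row-indicator vector u and, for each one, picks the best binary column vector v column by column. The function names (mismatches, best_v_for, rank_one_exact) are hypothetical, and the approach is only practical for matrices with a handful of rows, which is consistent with the abstract's remark that exact formulations are expensive.

```python
from itertools import product


def mismatches(A, u, v):
    """Count entries where A differs from the rank-one binary matrix u v^T."""
    return sum(
        a != (ui & vj)
        for row, ui in zip(A, u)
        for a, vj in zip(row, v)
    )


def best_v_for(A, u):
    """For a fixed u, choose each v_j independently to minimize mismatches."""
    v = []
    for j in range(len(A[0])):
        # Only rows with u_i = 1 are affected by the choice of v_j.
        ones = sum(A[i][j] for i in range(len(A)) if u[i])
        zeros = sum(1 - A[i][j] for i in range(len(A)) if u[i])
        # Setting v_j = 1 fixes `ones` entries and breaks `zeros` entries.
        v.append(1 if ones > zeros else 0)
    return v


def rank_one_exact(A):
    """Exhaustive search over u in {0,1}^m; returns (error, u, v).

    Illustrative assumption: exponential in the number of rows,
    so this is usable only for very small matrices.
    """
    best = None
    for u in product((0, 1), repeat=len(A)):
        v = best_v_for(A, u)
        err = mismatches(A, u, v)
        if best is None or err < best[0]:
            best = (err, list(u), v)
    return best


if __name__ == "__main__":
    A = [
        [1, 1, 0],
        [1, 1, 0],
        [0, 1, 1],
    ]
    err, u, v = rank_one_exact(A)
    print("mismatches:", err, "u:", u, "v:", v)
```

The search is exact because, once u is fixed, the columns decouple and each v_j can be chosen by a simple majority vote over the rows selected by u. The cost grows as 2^m in the number of rows m, which is the kind of blow-up that motivates heuristics or relaxations for larger data.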
Bao-Hong Shen, Shuiwang Ji, Jieping Ye
Added 25 Nov 2009
Updated 25 Nov 2009
Type Conference
Year 2009
Where KDD
Authors Bao-Hong Shen, Shuiwang Ji, Jieping Ye