Topics in 0-1 data

Large 0-1 datasets arise in various applications, such as market basket analysis and information retrieval. We concentrate on the study of topic models, aiming at results which indicate why certain methods succeed or fail. We describe simple algorithms for finding topic models from 0-1 data. We give theoretical results showing that the algorithms can discover the epsilon-separable topic models of Papadimitriou et al. We present empirical results showing that the algorithms find natural topics in real-world data sets. We also briefly discuss the connections to matrix approaches, including nonnegative matrix factorization and independent component analysis.
Categories and Subject Descriptors G.3 [Probability and Statistics]: Contingency table analysis; H.2.8 [Database Management]: Database Applications--Data mining; I.5.1 [Pattern Recognition]: Models--Structural
General Terms Algorithms, Theory
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2002
Where KDD
Authors Ella Bingham, Heikki Mannila, Jouni K. Seppänen