Noisy-OR Component Analysis and its Application to Link Analysis

We develop a new component analysis framework, the Noisy-Or Component Analyzer (NOCA), that targets high-dimensional binary data. NOCA is a probabilistic latent variable model that assumes the expression of observed high-dimensional binary data is driven by a small number of hidden binary sources combined via noisy-or units. The component analysis procedure is equivalent to learning the NOCA parameters. Since the classical EM formulation of the NOCA learning problem is intractable, we develop a variational approximation to it. We test the NOCA framework on two problems: (1) a synthetic image-decomposition problem and (2) a co-citation data analysis problem for thousands of CiteSeer documents. We demonstrate good performance of the new model on both problems. In addition, we contrast the model with two mixture-based latent-factor models: probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA). Differing assumptions underlying these models cause them to discover...
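To make the phrase "hidden binary sources combined via noisy-or units" concrete, here is a minimal Python sketch of the generative direction only. It assumes the textbook noisy-OR parametrization with a leak term, which may differ in detail from the paper's. The names `noisy_or_prob`, `sample_observation`, `priors`, `weight_matrix` and `leaks` are hypothetical, and the sketch does not attempt the variational learning procedure.

```python
import random


def noisy_or_prob(sources, weights, leak):
    """P(x = 1 | sources) for a single noisy-OR unit.

    sources: list of 0/1 hidden source states
    weights: weights[i] is the probability that active source i alone turns x on
    leak:    probability that x turns on when no source is active
    (Assumed textbook parametrization, not taken from the paper.)
    """
    p_off = 1.0 - leak
    for s, w in zip(sources, weights):
        if s:
            # Each active source independently fails to turn x on with prob 1 - w.
            p_off *= 1.0 - w
    return 1.0 - p_off


def sample_observation(priors, weight_matrix, leaks, rng):
    """Draw the hidden sources, then each observed bit from its noisy-OR unit."""
    sources = [1 if rng.random() < p else 0 for p in priors]
    observed = [
        1 if rng.random() < noisy_or_prob(sources, weight_matrix[j], leaks[j]) else 0
        for j in range(len(leaks))
    ]
    return sources, observed


if __name__ == "__main__":
    rng = random.Random(0)
    priors = [0.3, 0.3]                      # two hidden binary sources
    weight_matrix = [[0.9, 0.0],             # observed bit 0 listens to source 0
                     [0.0, 0.9],             # observed bit 1 listens to source 1
                     [0.8, 0.8]]             # observed bit 2 listens to both
    leaks = [0.01, 0.01, 0.01]
    print(sample_observation(priors, weight_matrix, leaks, rng))
```

In this sketch an observed bit stays off only if the leak and every active source all fail to switch it on, which is what distinguishes a noisy-OR combination from the mixture-style combination used in PLSA and LDA.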
Tomás Singliar, Milos Hauskrecht
Added 13 Dec 2010
Updated 13 Dec 2010
Type Journal
Year 2006
Where JMLR
Authors Tomás Singliar, Milos Hauskrecht