Dirichlet Enhanced Latent Semantic Analysis

This paper describes nonparametric Bayesian treatments for analyzing records containing occurrences of items. The introduced model retains the strength of previous approaches that explore the latent factors of each record (e.g. topics of documents), and further uncovers the clustering structure of records, which reflects the statistical dependencies of the latent factors. The nonparametric model induced by a Dirichlet process (DP) flexibly adapts model complexity to reveal the clustering structure of the data. To avoid the problems of dealing with infinite dimensions, we further replace the DP prior by a simpler alternative, namely Dirichlet-multinomial allocation (DMA), which maintains the main modelling properties of the DP. Instead of relying on Markov chain Monte Carlo (MCMC) for inference, this paper applies efficient variational inference based on DMA. The proposed approach yields encouraging empirical results on both a toy problem and text data. The results show that the propos...
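To make the finite DMA prior concrete, here is a minimal sketch of its generative side only: mixing weights are drawn from a symmetric Dirichlet over N components, and each record is then assigned to a cluster. The abstract does not state the parameterization, so the Dirichlet(alpha0/N, ..., alpha0/N) form used here is an assumption (it is the usual finite approximation to a DP). The function names, `alpha0`, and `n_components` are hypothetical, and this is not the paper's variational inference procedure.

```python
import random


def sample_dma_weights(alpha0, n_components, rng):
    """Draw mixing weights from a symmetric Dirichlet(alpha0/N, ..., alpha0/N).

    Assumption: this parameterization is the common finite stand-in for a
    DP prior; it is not taken from the abstract.
    """
    shape = alpha0 / n_components
    # A Dirichlet draw is a vector of independent Gamma draws, normalized.
    gammas = [rng.gammavariate(shape, 1.0) for _ in range(n_components)]
    total = sum(gammas)
    return [g / total for g in gammas]


def sample_cluster_assignments(weights, n_records, rng):
    """Assign each record to one cluster, with probability given by weights."""
    return rng.choices(range(len(weights)), weights=weights, k=n_records)


rng = random.Random(0)
weights = sample_dma_weights(alpha0=1.0, n_components=50, rng=rng)
assignments = sample_cluster_assignments(weights, n_records=200, rng=rng)

# With a small alpha0, most of the mass tends to fall on a few components,
# so the number of occupied clusters is typically far below n_components.
n_used = len(set(assignments))
print(f"{n_used} of {len(weights)} clusters are occupied")
```

Under these assumptions, the number of occupied clusters is determined by the data and `alpha0` rather than fixed in advance, while the model itself stays finite-dimensional.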
Kai Yu, Shipeng Yu, Volker Tresp
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2004
Where LWA