Sciweavers

NIPS
2007

Distributed Inference for Latent Dirichlet Allocation

We investigate the problem of learning a widely-used latent-variable model, the Latent Dirichlet Allocation (LDA) or “topic” model, using distributed computation, where each of P processors only sees 1/P of the total data set. We propose two distributed inference schemes that are motivated from different perspectives. The first scheme uses local Gibbs sampling on each processor with periodic updates; it is simple to implement and can be viewed as an approximation to a single-processor implementation of Gibbs sampling. The second scheme relies on a hierarchical Bayesian extension of the standard LDA model to directly account for the fact that data are distributed across P processors; it has a theoretical guarantee of convergence but is more complex to implement than the approximate method. Using five real-world text corpora we show that distributed learning works very well for LDA models, i.e., perplexity and precision-recall scores for distributed learning are i...
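
The first scheme (local Gibbs sampling with periodic updates) can be illustrated with a minimal sketch. The abstract does not spell out the details, so everything below is an assumption made for illustration: the function names, the hyperparameters alpha and beta, the one-sweep-per-update schedule, and the merge rule that adds each processor's count changes back into the shared topic-word counts. The "processors" are simulated sequentially in a single process.

```python
import random

def local_gibbs_sweep(docs, z, doc_topic, topic_word, topic_total, alpha, beta, V, rng):
    """One collapsed Gibbs sweep over one processor's documents,
    using that processor's local copy of the topic-word counts."""
    K = len(topic_total)
    for d, words in enumerate(docs):
        for i, w in enumerate(words):
            k = z[d][i]
            # Remove the current assignment of this token from the counts.
            doc_topic[d][k] -= 1
            topic_word[k][w] -= 1
            topic_total[k] -= 1
            # Unnormalised conditional probability of each topic for this token.
            weights = [
                (doc_topic[d][t] + alpha)
                * (topic_word[t][w] + beta) / (topic_total[t] + V * beta)
                for t in range(K)
            ]
            k = rng.choices(range(K), weights=weights)[0]
            z[d][i] = k
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

def approximate_distributed_lda(partitions, K, V, alpha=0.1, beta=0.01, iters=20, seed=0):
    """partitions[p] is the list of documents held by (simulated) processor p;
    each document is a list of word ids in range(V)."""
    rng = random.Random(seed)
    # Random initial topic assignment for every token.
    z = [[[rng.randrange(K) for _ in doc] for doc in part] for part in partitions]
    doc_topic = [[[0] * K for _ in part] for part in partitions]
    global_tw = [[0] * V for _ in range(K)]
    for p, part in enumerate(partitions):
        for d, doc in enumerate(part):
            for i, w in enumerate(doc):
                doc_topic[p][d][z[p][d][i]] += 1
                global_tw[z[p][d][i]][w] += 1
    for _ in range(iters):
        local_copies = []
        for p, part in enumerate(partitions):  # would run in parallel in practice
            local_tw = [row[:] for row in global_tw]
            local_total = [sum(row) for row in local_tw]
            local_gibbs_sweep(part, z[p], doc_topic[p], local_tw, local_total,
                              alpha, beta, V, rng)
            local_copies.append(local_tw)
        # Periodic update: fold every processor's count changes into the global counts.
        merged = [row[:] for row in global_tw]
        for local_tw in local_copies:
            for k in range(K):
                for w in range(V):
                    merged[k][w] += local_tw[k][w] - global_tw[k][w]
        global_tw = merged
    return global_tw, doc_topic

# Toy usage: 2 simulated processors, 2 topics, vocabulary of 4 word ids.
parts = [[[0, 1, 2, 0], [1, 1, 3]], [[2, 3, 3], [0, 2]]]
topic_word, doc_topic = approximate_distributed_lda(parts, K=2, V=4, iters=5, seed=1)
```

During a sweep each processor samples against a slightly stale view of the other processors' counts, which is why a scheme of this kind is an approximation to single-processor Gibbs sampling. After each merge the global counts again match the current topic assignments of all tokens.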
David Newman, Arthur Asuncion, Padhraic Smyth, Max Welling
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2007
Where NIPS
Authors David Newman, Arthur Asuncion, Padhraic Smyth, Max Welling