Distributed Inference for Latent Dirichlet Allocation


Download www.datalab.uci.edu

We investigate the problem of learning a widely-used latent-variable model, the Latent Dirichlet Allocation (LDA) or "topic" model, using distributed computation, where each of P processors only sees 1/P of the total data set. We propose two distributed inference schemes that are motivated from different perspectives. The first scheme uses local Gibbs sampling on each processor with periodic updates; it is simple to implement and can be viewed as an approximation to a single-processor implementation of Gibbs sampling. The second scheme relies on a hierarchical Bayesian extension of the standard LDA model to directly account for the fact that data are distributed across P processors; it has a theoretical guarantee of convergence but is more complex to implement than the approximate method. Using five real-world text corpora we show that distributed learning works very well for LDA models, i.e., perplexity and precision-recall scores for distributed learning are i...
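The first scheme (local Gibbs sampling with periodic updates) can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not spell out the update rule, so the merge step here (adding each processor's count changes back into a shared topic-word table after every sweep) is an assumption, and the names `gibbs_sweep`, `distributed_lda`, `alpha`, `beta`, and the round-robin sharding are hypothetical choices made for illustration. The P "processors" are simulated by a sequential loop.

```python
import random


def gibbs_sweep(docs, z, n_dk, n_kw, n_k, K, V, alpha, beta, rng):
    """One collapsed Gibbs sweep over `docs`, updating all counts in place."""
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            # Remove the token's current assignment from the counts.
            n_dk[d][k] -= 1
            n_kw[k][w] -= 1
            n_k[k] -= 1
            # Conditional topic weights (unnormalised) for this token.
            weights = [
                (n_dk[d][t] + alpha) * (n_kw[t][w] + beta) / (n_k[t] + V * beta)
                for t in range(K)
            ]
            k = rng.choices(range(K), weights=weights)[0]
            # Record the new assignment.
            z[d][i] = k
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1


def distributed_lda(docs, K, V, P, iters=20, alpha=0.1, beta=0.01, seed=0):
    """Simulate P processors, each sampling its shard against a local copy
    of the topic-word counts, followed by a merge (assumed update rule)."""
    rng = random.Random(seed)
    # Assumption: round-robin sharding, so each shard holds about 1/P of the docs.
    shards = [docs[p::P] for p in range(P)]
    z = [[[rng.randrange(K) for _ in doc] for doc in shard] for shard in shards]
    n_dk = [[[0] * K for _ in shard] for shard in shards]
    global_kw = [[0] * V for _ in range(K)]
    for p, shard in enumerate(shards):
        for d, doc in enumerate(shard):
            for i, w in enumerate(doc):
                k = z[p][d][i]
                n_dk[p][d][k] += 1
                global_kw[k][w] += 1

    for _ in range(iters):
        local_tables = []
        for p, shard in enumerate(shards):  # would run in parallel in practice
            local_kw = [row[:] for row in global_kw]  # local copy of shared counts
            local_k = [sum(row) for row in local_kw]
            gibbs_sweep(shard, z[p], n_dk[p], local_kw, local_k,
                        K, V, alpha, beta, rng)
            local_tables.append(local_kw)
        # Periodic update: fold every processor's count changes into the
        # shared table.
        merged = [row[:] for row in global_kw]
        for local_kw in local_tables:
            for k in range(K):
                for w in range(V):
                    merged[k][w] += local_kw[k][w] - global_kw[k][w]
        global_kw = merged
    return global_kw, z, shards


if __name__ == "__main__":
    # Toy corpus: each document is a list of word ids in range(V).
    toy_docs = [[0, 1, 2, 0], [3, 4, 3], [0, 0, 1], [4, 4, 2, 3]]
    counts, _, _ = distributed_lda(toy_docs, K=2, V=5, P=2, iters=10)
    for k, row in enumerate(counts):
        print("topic", k, row)
```

Because each processor samples against counts that are stale with respect to the other shards during a sweep, the procedure only approximates single-processor Gibbs sampling, which matches the description above. After the merge, the shared table again equals the exact counts implied by all current topic assignments.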

David Newman, Arthur Asuncion, Padhraic Smyth, Max Welling


Distributed Computation | Distributed Learning | Information Technology | LDA Models | NIPS 2007


» Incorporating domain knowledge into topic modeling via Dirichlet Forest priors

» 2LDA Segmentation for Recognition

» Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation

» Topic models with power-law using Pitman-Yor process

» Latent Dirichlet Allocation

» PLDA: Parallel Latent Dirichlet Allocation for Large-Scale Applications

» Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allo...

» A Topic-Based Measure of Resource Description Quality for Distributed Information Retrieval

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2007
Where	NIPS
Authors	David Newman, Arthur Asuncion, Padhraic Smyth, Max Welling
