Asynchronous Distributed Learning of Topic Models

Distributed learning is a problem of fundamental interest in machine learning and cognitive science. In this paper, we present asynchronous distributed learning algorithms for two well-known unsupervised learning frameworks: Latent Dirichlet Allocation (LDA) and Hierarchical Dirichlet Processes (HDP). In the proposed approach, the data are distributed across P processors; each processor independently performs Gibbs sampling on its local data and communicates its information with other processors in a local, asynchronous manner. We demonstrate that our asynchronous algorithms learn global topic models that are statistically as accurate as those learned by the standard LDA and HDP samplers, but with significant improvements in computation time and memory. We show speedup results on a 730-million-word text corpus using 32 processors, and we provide perplexity results for up to 1500 virtual processors. As a stepping stone in the development of asynchronous HDP, a parallel HDP...
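
The abstract's scheme lends itself to a compact simulation. Below is a minimal Python sketch of the idea as described there: each simulated processor holds a shard of documents, runs collapsed Gibbs sampling against its own word-topic counts plus a cached belief about the other processors' counts, and occasionally gossips counts with one randomly chosen peer. Everything here is illustrative rather than taken from the paper: the `Processor` class, the `receive` method, the hyperparameter values, the synthetic corpus, and the serial loop that stands in for a real network; the cached-count bookkeeping is a simplification of the full asynchronous LDA update.

```python
import numpy as np

# Illustrative sizes and hyperparameters (alpha, beta as in standard LDA);
# none of these values come from the paper.
K, V, P = 4, 50, 3          # topics, vocabulary size, processors
alpha, beta = 0.1, 0.01

rng = np.random.default_rng(0)

# Synthetic corpus: 30 documents of 30 random word ids each, dealt to P shards.
docs = [rng.integers(0, V, size=30).tolist() for _ in range(30)]
shards = [docs[p::P] for p in range(P)]

class Processor:
    """One simulated processor holding a shard of documents (hypothetical class)."""

    def __init__(self, pid, shard):
        self.pid, self.shard = pid, shard
        self.z = [[int(rng.integers(0, K)) for _ in doc] for doc in shard]
        self.ndk = np.zeros((len(shard), K))   # document-topic counts
        self.local_nwk = np.zeros((V, K))      # word-topic counts from this shard
        self.global_nwk = np.zeros((V, K))     # belief about all other shards' counts
        self.cache = {}                        # last counts received from each peer
        for d, doc in enumerate(shard):
            for i, w in enumerate(doc):
                k = self.z[d][i]
                self.ndk[d, k] += 1
                self.local_nwk[w, k] += 1

    def gibbs_sweep(self):
        """One collapsed Gibbs sweep using local counts plus the global belief."""
        nwk = self.local_nwk + self.global_nwk
        nk = nwk.sum(axis=0)
        for d, doc in enumerate(self.shard):
            for i, w in enumerate(doc):
                k = self.z[d][i]
                # Remove the current assignment from all counts.
                self.ndk[d, k] -= 1; self.local_nwk[w, k] -= 1
                nwk[w, k] -= 1; nk[k] -= 1
                # Collapsed conditional: p(k) proportional to
                # (n_dk + alpha) * (n_wk + beta) / (n_k + V*beta).
                p = (self.ndk[d] + alpha) * (nwk[w] + beta) / (nk + V * beta)
                k = int(rng.choice(K, p=p / p.sum()))
                self.z[d][i] = k
                self.ndk[d, k] += 1; self.local_nwk[w, k] += 1
                nwk[w, k] += 1; nk[k] += 1

    def receive(self, peer):
        """Fold a peer's latest local counts into the global belief, replacing
        whatever this processor previously cached from that peer."""
        old = self.cache.get(peer.pid, 0.0)
        self.global_nwk += peer.local_nwk - old
        self.cache[peer.pid] = peer.local_nwk.copy()

procs = [Processor(p, s) for p, s in enumerate(shards)]

# Asynchronous schedule: a random processor sweeps its shard, then gossips
# counts with one random peer; there is no global synchronization barrier.
for step in range(300):
    i, j = int(rng.integers(0, P)), int(rng.integers(0, P))
    procs[i].gibbs_sweep()
    if i != j:
        procs[i].receive(procs[j])
        procs[j].receive(procs[i])

# After mixing, each processor's combined counts approximate corpus-wide statistics.
total = procs[0].local_nwk + procs[0].global_nwk
print("tokens accounted for by processor 0:", int(total.sum()))
```

In the actual algorithm the exchange happens over a network with no global locking, and the asynchronous HDP variant additionally has to share newly created topics between processors; the serial loop above only simulates that schedule.
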
Arthur Asuncion, Padhraic Smyth, Max Welling
Type: Conference
Year: 2008
Where: NIPS