We investigate the problem of learning a widely-used latent-variable model – the Latent Dirichlet Allocation (LDA) or “topic” model – using distributed computation, where ...
David Newman, Arthur Asuncion, Padhraic Smyth, Max...
Abstract—We present an empirical study to statistically analyze the equivalence of several traceability recovery methods based on Information Retrieval (IR) techniques. The analy...
Users of topic modeling methods often have knowledge about the composition of words that should have high or low probability in various topics. We incorporate such domain knowledg...
In this paper we propose a domainindependent text segmentation method, which consists of three components. Latent Dirichlet allocation (LDA) is employed to compute words semantic ...
Abstract. This article presents a unified theory for analysis of components in discrete data, and compares the methods with techniques such as independent component analysis, non-...