Probabilistic latent semantic indexing (PLSI) represents documents of a collection as mixture proportions of latent topics, which are learned from the collection by an expectation...
Alexander Hinneburg, Hans-Henning Gabriel, Andr&eg...
This paper introduces LDA-G, a scalable Bayesian approach to finding latent group structures in large real-world graph data. Existing Bayesian approaches for group discovery (suc...
Matrix factorization is a fundamental technique in machine learning that is applicable to collaborative filtering, information retrieval and many other areas. In collaborative fil...
Given a movie comment, does it contain a spoiler? A spoiler is a comment that, when disclosed, would ruin a surprise or reveal an important plot detail. We study automatic methods...
In this paper we propose a probabilistic model for online document clustering. We use non-parametric Dirichlet process prior to model the growing number of clusters, and use a pri...