Most approaches to topic modeling assume an independence between documents that is frequently violated. We present an topic model that makes use of one or more user-specified grap...
The four-level pachinko allocation model (PAM) (Li & McCallum, 2006) represents correlations among topics using a DAG structure. It does not, however, represent a nested hiera...
This paper addresses the problem of topic distillation on the World Wide Web, namely, given a typical user query to find quality documents related to the query topic. Connectivity...
In this paper, we study the distribution of relevant documents in aggregates, formed by grouping the retrieved documents according to their domain. For each aggregate, we take int...
Latent Dirichlet allocation is a fully generative statistical language model that has been proven to be successful in capturing both the content and the topics of a corpus of docum...