This paper presents a statistical model for discovering topical clusters of words in unstructured text. The model uses a hierarchical Bayesian structure and it is also able to iden...
We describe a joint probabilistic model for modeling the contents and inter-connectivity of document collections such as sets of web pages or research paper archives. The model is...
The bag-of-words representation has attracted a lot of attention recently in the field of object recognition. Based on the bag-of-words representation, topic models such as Probab...
In this paper, we proposed a novel probabilistic generative model to deal with explicit multiple-topic documents: Parametric Dirichlet Mixture Model(PDMM). PDMM is an expansion of...
Abstract. Database systems are islands of structure in a sea of unstructured data sources. Several real-world applications now need to create bridges for smooth integration of semi...