We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...
In spite of the popularity of probabilistic mixture models for latent structure discovery from data, mixture models do not have a natural mechanism for handling sparsity, where ea...
A relational probability tree (RPT) is a type of decision tree that can be used for probabilistic classification of instances with a relational structure. Each leaf of an RPT cont...
Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an informationtheoretic ...
Arindam Banerjee, Inderjit S. Dhillon, Joydeep Gho...