We propose a new unsupervised learning technique for extracting information about authors and topics from large text collections. We model documents as if they were generated by a...
Michal Rosen-Zvi, Chaitanya Chemudugunta, Thomas L...
In many practical domains, misclassification costs can differ greatly and may be represented by class ratios, however, most learning algorithms struggle with skewed class distrib...
William Klement, Peter A. Flach, Nathalie Japkowic...
Multinomial distributions over words are frequently used to model topics in text collections. A common, major challenge in applying all such topic models to any text mining proble...
The explosion of Web opinion data has made essential the need for automatic tools to analyze and understand people’s sentiments toward different topics. In most sentiment analy...
Clustering data in high dimensions is believed to be a hard problem in general. A number of efficient clustering algorithms developed in recent years address this problem by proje...
Kamalika Chaudhuri, Sham M. Kakade, Karen Livescu,...