Probabilistic topic models have become popular as methods for dimensionality reduction in collections of text documents or images. These models are usually treated as generative m...
Sentiment classification is the task of labeling a review document according to the polarity of its prevailing opinion (favorable or unfavorable). In approaching this problem, a m...
Philip Beineke, Trevor Hastie, Shivakumar Vaithyan...
When dealing with information overload from the Internet, such as the classification of Web pages and the filtering of email spam, a new technique called co-training has been shown...
The paper presents an approach to the task of automatic document categorization in the field of economics. Since the documents can be annotated with multiple keywords (labels), we ...
The recent proliferation of XML-based standards and technologies for managing data on the Web demonstrates the need for effective and efficient management of tree-structured data....