Cross-language Text Categorization is the task of assigning semantic classes to documents written in a target language (e.g. English) while the system is trained using labeled doc...
This paper presents a study of whether and how automatically extracted keywords can be used to improve text categorization. In summary, we show that a higher performance -- as measured ...
When text categorization is applied to complex tasks, it is tedious and expensive to hand-label the large amounts of training data necessary for good performance. In this paper, we...
This paper presents a cluster-based text categorization system which uses class distributional clustering of words. We propose a new clustering model which considers the global in...
Automatic text categorization is the problem of automatically assigning text documents to predefined categories. In order to classify text documents, we must extract good features f...