Term-based representations of documents have found widespread use in information retrieval. However, one of the main shortcomings of such methods is that they largely disregard le...
– This paper describes a text categorization approach that is based on a combination of a newly designed text representation with a kNN classifier. The new text document represen...
In many real-world domains, supervised learning requires a large number of training examples. In this paper, we describe an active learning method that uses a committee of learner...
Abstract. Most common feature selection techniques for document categorization are supervised and require lots of training data in order to accurately capture the descriptive and d...
Author identification is a text categorization task with applications in intelligence, criminal law, computer forensics, etc. Usually, in such cases there is shortage of training t...