Instance selection and feature selection are two orthogonal methods for reducing the amount and complexity of data. Feature selection aims at the reduction of redundant features i...
Clustering for better representation of the diversity of text or image search results has been studied extensively. In this paper, we extend this methodology to the novel domain o...
This paper presents an empirical study for improving the performance of text chunking. We focus on two issues: the problem of selecting feature spaces, and the problem of alleviat...
This paper is a comparative study of feature selection methods in statistical learning of text categorization. The focus is on aggressive dimensionality reduction. Five methods we...
Supervised text categorization is a machine learning task where a predefined category label is automatically assigned to a previously unlabelled document based upon characteristic...