Integrating feature and instance selection for text classification

Instance selection and feature selection are two orthogonal methods for reducing the amount and complexity of data. Feature selection aims at the reduction of redundant features in a dataset, whereas instance selection aims at the reduction of the number of instances. So far, these two methods have mostly been considered in isolation. In this paper, we present a new algorithm, which we call FIS (Feature and Instance Selection), that targets both problems simultaneously in the context of text classification. Our experiments on the Reuters and 20-Newsgroups datasets show that FIS considerably reduces both the number of features and the number of instances. The accuracy of a range of classifiers, including Naïve Bayes, TAN and LB, considerably improves when using the FIS preprocessed datasets, matching and exceeding that of Support Vector Machines, which is currently considered to be one of the best text classification methods. In all cases the results are much better compared to Mutual Infor...

Dimitris Fragoudis, Dimitris Meretakis, Spiros Likothanassis

Data Mining | Feature Selection | FIS Preprocessed Datasets | Instance Selection | KDD 2002 |

» Generalized LARS as an effective feature selection tool for text classification with SVMs

» An Instance Selection Approach to Multiple Instance Learning

» Selection of Training Instances for Music Genre Classification

» An Improved Method of Feature Selection Based on Concept Attributes in Text Classification

» Feature Selection for Text Classification Based on Gini Coefficient of Inequality

» Feature selection methods for text classification

» Selecting Features for Ordinal Text Classification

» Feature Selection Using Improved Mutual Information for Text Classification

Post Info

Added	30 Nov 2009
Updated	30 Nov 2009
Type	Conference
Year	2002
Where	KDD
Authors	Dimitris Fragoudis, Dimitris Meretakis, Spiros Likothanassis
