Sciweavers

SDM
2008
SIAM

Exploration and Reduction of the Feature Space by Hierarchical Clustering

In this paper we propose and test the use of hierarchical clustering for feature selection. The clustering method is Ward's, with a distance measure based on Goodman-Kruskal tau. We motivate the choice of this measure and compare it with other measures. Our hierarchical clustering is applied to over 40 data sets from the UCI archive. The proposed approach is useful in two ways. First, it produces a dendrogram of feature subsets, which serves as a valuable tool for studying relevance relationships among features. Second, the dendrogram is used in a feature selection algorithm that selects the best features by a wrapper method. Experiments were run with three different families of classifiers: Naive Bayes, decision trees and k nearest neighbours. With our method, all three classifiers generally outperform their counterparts trained without feature selection. We compare our feature selection with other state-of-the-art methods, obtaining on average a better classification accu...
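To make the distance measure mentioned in the abstract more concrete, here is a minimal Python sketch of Goodman-Kruskal tau between two categorical features, together with a pairwise distance matrix that could be passed to a Ward-style hierarchical clustering routine. The function names (gk_tau, tau_distance, distance_matrix) are hypothetical, and the symmetrised distance 1 - (tau(x→y) + tau(y→x)) / 2 is an assumption made for illustration; the abstract does not state how the paper derives its distance from tau.

```python
from collections import Counter


def gk_tau(x, y):
    """Goodman-Kruskal tau: proportional reduction in the error of
    predicting y when x is known (asymmetric, in [0, 1])."""
    n = len(x)
    joint = Counter(zip(x, y))
    count_x = Counter(x)
    count_y = Counter(y)
    # Error of predicting y from its marginal distribution alone.
    err_y = 1.0 - sum((c / n) ** 2 for c in count_y.values())
    if err_y == 0.0:
        # y is constant, so there is no error to reduce.
        # Returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    # Expected error of predicting y from its distribution conditional on x.
    err_y_given_x = 1.0 - sum(
        c * c / (n * count_x[a]) for (a, _), c in joint.items()
    )
    return (err_y - err_y_given_x) / err_y


def tau_distance(x, y):
    """ASSUMPTION: symmetrise tau and turn association into a distance.
    This is not necessarily the definition used in the paper."""
    return 1.0 - (gk_tau(x, y) + gk_tau(y, x)) / 2.0


def distance_matrix(features):
    """Pairwise distances between features, each given as a list of
    categorical values of equal length."""
    k = len(features)
    return [
        [0.0 if i == j else tau_distance(features[i], features[j])
         for j in range(k)]
        for i in range(k)
    ]
```

Because tau is asymmetric (knowing x may predict y better than y predicts x), some symmetrisation is needed before the values can serve as clustering distances; averaging the two directions is one simple choice among several.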
Dino Ienco, Rosa Meo
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2008
Where SDM
Authors Dino Ienco, Rosa Meo