Sciweavers

SAC
2006
ACM

Exploiting partial decision trees for feature subset selection in e-mail categorization

In this paper we propose PARTfs, which adopts a supervised machine learning algorithm, namely partial decision trees, as a method for feature subset selection. In particular, it is shown that PARTfs achieves an aggressive reduction of the feature space while still yielding classification results comparable to those of conventional feature selection metrics. The approach is empirically verified using two different document representations and four different text classification algorithms, applied to a document collection consisting of personal e-mail messages. The results show that the feature space can be reduced by about an order of magnitude without loss of classification accuracy.

Categories and Subject Descriptors: H.3.3 [Information Systems]: Information Search and Retrieval; H.3.1 [Information Systems]: Content Analysis and Indexing

General Terms: Indexing methods, information filtering, feature selection

Keywords: Text categorization, machine l...
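The abstract does not spell out how the partial decision trees yield a feature subset. One plausible reading, offered here only as an assumption, is that the terms tested by the learned rules form the selected subset and all other terms are discarded. The sketch below illustrates that idea with hand-written, hypothetical rules; it does not run a PART learner, and the names `Rule`, `select_features`, and `project` are invented for this illustration.

```python
from typing import Dict, List, Tuple

# A rule is a list of (term, operator, threshold) tests plus a class label.
# This representation is an assumption made for illustration only.
Rule = Tuple[List[Tuple[str, str, float]], str]


def select_features(rules: List[Rule]) -> List[str]:
    """Return the sorted set of terms tested by at least one rule."""
    return sorted({term for tests, _label in rules for term, _op, _thr in tests})


def project(doc: Dict[str, float], selected: List[str]) -> Dict[str, float]:
    """Keep only the selected terms of a bag-of-words document vector."""
    return {term: doc.get(term, 0.0) for term in selected}


# Hypothetical rules, written by hand rather than learned from data.
rules: List[Rule] = [
    ([("meeting", ">", 0.0), ("agenda", ">", 0.0)], "work"),
    ([("invoice", ">", 0.0)], "finance"),
    ([("meeting", "<=", 0.0)], "private"),
]

# A hypothetical e-mail represented as term frequencies.
doc = {"meeting": 2.0, "lunch": 1.0, "invoice": 0.0, "friday": 1.0}

selected = select_features(rules)
reduced = project(doc, selected)
```

In this toy case the four-term document is reduced to the three terms that appear in the rules; the reduced vectors would then be passed to whichever text classifier is being evaluated.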
Helmut Berger, Dieter Merkl, Michael Dittenbach
Added 14 Jun 2010
Updated 14 Jun 2010
Type Conference
Year 2006
Where SAC