Sciweavers

AUSDM
2006
Springer

A Study of Local and Global Thresholding Techniques in Text Categorization

Feature filtering is a widely used approach to dimensionality reduction in text categorization. In this approach, feature scoring methods evaluate each feature, and thresholding is then applied to select the highest-scoring features either locally or globally. In this paper, we investigate several local and global feature selection methods. We suggest using Standard Deviation (STD) and Maximum Deviation (MD) as globalization schemes. This work provides a comparative study of fourteen thresholding techniques using different scoring methods and benchmark datasets of diverse nature, including an investigation of normalizing feature scores before combining them in the global pool. The results suggest that normalized MD outperforms other methods in thresholding Document Frequency (DF) scores on even and moderately diverse datasets. Furthermore, the results indicate that normalizing feature scores improves the performance of rare categories ...
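The difference between local and global thresholding can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy document-frequency scores, and the choice of min-max normalization are all assumptions made for illustration, since the abstract does not say how scores are normalized. Maximum (`max`) is used as a familiar globalization scheme, and the population standard deviation of a feature's per-category scores stands in for one possible reading of STD. MD is not defined in the abstract, so it is left out.

```python
from statistics import pstdev

# Hypothetical document-frequency scores: category -> feature -> score.
# "sports" is a frequent category, "politics" a rare one.
scores = {
    "sports":   {"goal": 40, "match": 35, "vote": 1, "senate": 0},
    "politics": {"goal": 1,  "match": 0,  "vote": 4, "senate": 3},
}

def top_k(feature_scores, k):
    """Return the k highest-scoring features (ties broken alphabetically)."""
    ranked = sorted(feature_scores, key=lambda f: (-feature_scores[f], f))
    return set(ranked[:k])

def local_threshold(scores, k):
    """Local thresholding: keep the top k features of each category."""
    selected = set()
    for feature_scores in scores.values():
        selected |= top_k(feature_scores, k)
    return selected

def min_max_normalize(feature_scores):
    """Rescale one category's scores to [0, 1] (assumed normalization)."""
    lo, hi = min(feature_scores.values()), max(feature_scores.values())
    span = hi - lo
    return {f: (s - lo) / span if span else 0.0
            for f, s in feature_scores.items()}

def global_threshold(scores, k, combine=max, normalize=False):
    """Global thresholding: pool per-category scores, then keep the top k."""
    if normalize:
        scores = {c: min_max_normalize(fs) for c, fs in scores.items()}
    features = {f for fs in scores.values() for f in fs}
    pooled = {f: combine([fs.get(f, 0.0) for fs in scores.values()])
              for f in features}
    return top_k(pooled, k)

print(local_threshold(scores, 1))                        # one feature per category
print(global_threshold(scores, 2))                       # max pooling, raw scores
print(global_threshold(scores, 2, normalize=True))       # max pooling, normalized
print(global_threshold(scores, 2, combine=pstdev))       # std-dev pooling, raw scores
```

With these toy numbers, pooling raw scores globally selects only features from the frequent category, while normalizing each category's scores before pooling lets a feature from the rare category into the selection. This is consistent with the abstract's remark about rare categories, although the real effect depends on the scoring method and dataset.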
Added 20 Aug 2010
Updated 20 Aug 2010
Type Conference
Year 2006
Where AUSDM
Authors Nayer M. Wanas, Dina A. Said, Nevin M. Darwish, Nadia Hegazy