Sciweavers

ADVIS
2004
Springer

Multiple Sets of Rules for Text Categorization

An important issue in text mining is how to make use of multiple pieces of discovered knowledge to improve future decisions. In this paper, we propose a new approach to combining multiple sets of rules for text categorization using Dempster’s rule of combination. We develop a boosting-like technique, based on rough set theory, for generating multiple sets of rules, and we model the classification decisions from those sets of rules as pieces of evidence that can be combined by Dempster’s rule of combination. We apply these methods, individually and in combination, to 10 of the 20-newsgroups, a benchmark data collection (Baker and McCallum 1998). Our experimental results show that the performance of the best combination of the multiple sets of rules on the 10 groups of the benchmark data is statistically significantly better than that of the best single set of rules. The comparative analysis between the Dempster-Shafer and the majority voting (MV) methods along with an overfitting study con...
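To make the combination step concrete, the following is a minimal sketch of Dempster's rule of combination applied to two classifier outputs. It is not the authors' implementation: the function name `dempster_combine`, the three category labels, and all mass values are hypothetical and chosen only for illustration. Each rule set's decision is assumed to be expressed as a mass function that assigns belief to single categories and leaves the remainder on the whole frame (ignorance).

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of categories
    (a focal element) to its mass; masses are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence: the product of masses goes to the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint focal elements contribute to the conflict mass K.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    norm = 1.0 - conflict  # renormalise by 1 - K
    return {s: w / norm for s, w in combined.items()}


# Hypothetical frame of discernment (illustrative labels only).
FRAME = frozenset({"sci.space", "rec.autos", "comp.graphics"})

# Hypothetical evidence from two rule sets for one document.
m_a = {frozenset({"sci.space"}): 0.6, FRAME: 0.4}
m_b = {
    frozenset({"sci.space"}): 0.5,
    frozenset({"rec.autos"}): 0.3,
    FRAME: 0.2,
}

fused = dempster_combine(m_a, m_b)

# Pick the single category with the highest combined mass.
best = max((s for s in fused if len(s) == 1), key=fused.get)
print(sorted(best), round(fused[best], 4))
```

Unlike majority voting, which counts only each rule set's top choice, this scheme lets a rule set express partial confidence and residual ignorance, and the conflicting mass is discarded before renormalisation.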
Yaxin Bi, Terry J. Anderson, Sally I. McClean
Added 30 Jun 2010
Updated 30 Jun 2010
Type Conference
Year 2004
Where ADVIS