PRICAI
2000
Springer

A Comparative Study on Chinese Text Categorization Methods

Abstract. This paper reports our comparative evaluation of three machine learning methods on Chinese text categorization. Whereas a wide range of methods has been applied to English text categorization, relatively few studies have addressed Chinese text categorization. Using a People's Daily news corpus, a series of controlled experiments evaluates three machine learning methods, namely the k Nearest Neighbor (kNN) algorithm, Support Vector Machines (SVM), and the Adaptive Resonance Associative Map (ARAM), in terms of their ability to mine categorization knowledge from high-dimensional, sparse, and relatively noisy document feature vectors. The experiments reveal that all three methods perform satisfactorily on the test corpus, while ARAM exhibits marginally better generalization, especially from relatively small and noisy training sets.
Ji He, Ah-Hwee Tan, Chew Lim Tan
Added 25 Aug 2010
Updated 25 Aug 2010
Type Conference
Year 2000
Where PRICAI