Sciweavers

ERSHOV
2006
Springer

On the Importance of Parameter Tuning in Text Categorization

Abstract. Text Categorization algorithms have a large number of parameters that determine their behaviour. The effect of these parameters is not easily predicted, objectively or intuitively, and may well depend on the corpus or on the document representation. Parameter values are usually taken over from previously published results, which may lead to less than optimal accuracy when experimenting on particular corpora. In this paper we investigate the effect of parameter tuning on the accuracy of two Text Categorization algorithms: the well-known Rocchio algorithm and the lesser-known Winnow. We show that the optimal parameter values for a specific corpus are sometimes very different from those found in the literature. We show that the effect of individual parameters is corpus-dependent, and that parameter tuning can greatly improve the accuracy of both Winnow and Rocchio. We argue that the dependence of the categorization algorithms on experimentally established parameter values makes it hard to compare the ou...
Cornelis H. A. Koster, Jean Beney
Added 22 Aug 2010
Updated 22 Aug 2010
Type: Conference
Year: 2006
Where: ERSHOV
Authors: Cornelis H. A. Koster, Jean Beney