Using Error-Correcting Codes for Text Classification

This paper explores in detail the use of Error-Correcting Output Coding (ECOC) for learning text classifiers. We show that the accuracy of a Naive Bayes classifier on text classification tasks can be significantly improved by taking advantage of the error-correcting properties of the code. We also explore the use of different kinds of codes, namely error-correcting codes, random codes, and domain- and data-specific codes, and give experimental results for each of them. The ECOC method scales well to large data sets with a large number of classes. Experiments on a real-world data set show a reduction in classification error of up to 66% over the traditional Naive Bayes classifier. We also compare our empirical results to semi-theoretical results and find that the two agree closely.
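The error-correcting idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class labels and the 7-bit code matrix below are hypothetical, and the outputs of the binary classifiers (Naive Bayes in the paper) are simulated by hand rather than learned from text. Each class gets a codeword, one binary classifier would be trained per bit position, and a document is assigned to the class whose codeword is nearest in Hamming distance to the predicted bit string.

```python
# Hypothetical code matrix: one 7-bit codeword per class.
# In ECOC, each column defines one binary learning problem.
CODE_MATRIX = {
    "class_a": (1, 1, 1, 1, 1, 1, 1),
    "class_b": (0, 0, 0, 0, 1, 1, 1),
    "class_c": (0, 0, 1, 1, 0, 0, 1),
    "class_d": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(code):
    """Smallest pairwise Hamming distance between codewords."""
    words = list(code.values())
    return min(
        hamming(words[i], words[j])
        for i in range(len(words))
        for j in range(i + 1, len(words))
    )


def decode(bits, code=CODE_MATRIX):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(code, key=lambda label: hamming(code[label], bits))


# Simulate the seven binary classifiers predicting class_c's codeword
# with one of them making a mistake (bit 2 flipped).
predicted = list(CODE_MATRIX["class_c"])
predicted[2] ^= 1
recovered = decode(tuple(predicted))
```

With a minimum distance of d between codewords, nearest-codeword decoding tolerates up to floor((d - 1) / 2) wrong binary predictions, which is why a single faulty bit in the simulated prediction above still decodes to the intended class.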
Rayid Ghani
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2000
Where ICML
Authors Rayid Ghani