
CIKM
2008
Springer

Error-driven generalist+experts (edge): a multi-stage ensemble framework for text categorization

We introduce a multi-stage ensemble framework, Error-Driven Generalist+Experts (Edge), for improved classification on large-scale text categorization problems. Edge first trains a generalist, capable of classifying under all classes, to deliver a reasonably accurate initial category ranking for a given instance. Edge then computes a confusion graph for the generalist and allocates learning resources to train experts on relatively small groups of classes that the generalist tends to systematically confuse with one another. The experts' votes, when invoked on a given instance, yield a reranking of the classes, thereby correcting the errors of the generalist. Our evaluations show improved classification and ranking performance on several large-scale text categorization datasets. Edge is particularly efficient when the underlying learners are efficient. Our study of confusion graphs is also of independent interest. Categories and Subject Descriptors H.3.3 [Information ...
Jian Huang 0002, Omid Madani, C. Lee Giles
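The pipeline described in the abstract (generalist ranking, confusion graph, experts on confused groups, reranking by expert votes) can be illustrated with a small Python sketch. It is not the authors' implementation. The function names, the grouping rule (connected components over confusion edges above a count threshold), the additive vote combination, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def confusion_graph(true_labels, predicted_labels):
    """Count how often each unordered pair of distinct classes is confused."""
    edges = Counter()
    for t, p in zip(true_labels, predicted_labels):
        if t != p:
            edges[frozenset((t, p))] += 1
    return edges


def confused_groups(edges, min_count=2):
    """Group classes by connected components over edges seen at least
    min_count times (an assumed grouping rule, not taken from the paper)."""
    adj = defaultdict(set)
    for pair, count in edges.items():
        if count >= min_count:
            a, b = tuple(pair)
            adj[a].add(b)
            adj[b].add(a)
    groups, seen = [], set()
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        groups.append(sorted(comp))
    return groups


def rerank(generalist_scores, experts, instance, top_k=2, weight=1.0):
    """Invoke experts whose group overlaps the generalist's top-k classes and
    add their votes to the generalist scores (assumed additive combination)."""
    ranked = sorted(generalist_scores, key=generalist_scores.get, reverse=True)
    top = set(ranked[:top_k])
    scores = dict(generalist_scores)
    for group, expert in experts:
        if top & set(group):
            for cls, vote in expert(instance).items():
                scores[cls] = scores.get(cls, 0.0) + weight * vote
    return sorted(scores, key=scores.get, reverse=True)


# Toy held-out predictions from a hypothetical generalist.
true = ["cat", "cat", "dog", "dog", "car", "bus", "bus", "cat"]
pred = ["dog", "dog", "cat", "dog", "car", "car", "bus", "cat"]

edges = confusion_graph(true, pred)
groups = confused_groups(edges, min_count=2)


def cat_dog_expert(instance):
    # Hypothetical expert: a stand-in for a classifier trained only on the group.
    return {"cat": 1.0, "dog": 0.0}


experts = [(groups[0], cat_dog_expert)]
final_ranking = rerank({"dog": 0.5, "cat": 0.4, "car": 0.1}, experts, "some document")
print(groups, final_ranking)
```

In this toy run the generalist ranks "dog" first, the expert for the confused group is invoked because that group overlaps the top-ranked classes, and its vote moves "cat" to the top. A real system would replace the stand-in expert with a classifier trained on the group's instances.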
Added 12 Oct 2010
Updated 12 Oct 2010
Type Conference
Year 2008
Where CIKM
Authors Jian Huang 0002, Omid Madani, C. Lee Giles