KDD
2002
ACM

Exploiting unlabeled data in ensemble methods

An adaptive semi-supervised ensemble method, ASSEMBLE, is proposed that constructs classification ensembles based on both labeled and unlabeled data. ASSEMBLE alternates between assigning "pseudo-classes" to the unlabeled data using the existing ensemble and constructing the next base classifier using both the labeled and pseudo-labeled data. Mathematically, this intuitive algorithm corresponds to maximizing the classification margin in hypothesis space as measured on both the labeled and unlabeled data. Unlike alternative approaches, ASSEMBLE does not require a semi-supervised learning method for the base classifier. ASSEMBLE can be used in conjunction with any cost-sensitive classification algorithm for both two-class and multi-class problems. ASSEMBLE using decision trees won the NIPS 2001 Unlabeled Data Competition. In addition, strong results on several benchmark datasets using both decision trees and neural networks support the proposed method.
Keywords Boosting, Semi-S...
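The alternation described in the abstract can be illustrated with a small two-class sketch. This is not the authors' implementation: the abstract does not specify the base learner, the weighting scheme, or how the first pseudo-classes are chosen. The sketch assumes an AdaBoost-style exponential reweighting, weighted decision stumps as the cost-sensitive base learner, and a 1-nearest-neighbour rule for the initial pseudo-classes. The names `assemble_sketch`, `fit_stump`, `predict`, and the parameter `beta` (the share of total weight given to labeled points) are hypothetical.

```python
import math

def fit_stump(X, y, w):
    """Weighted decision stump: returns (weighted error, feature, threshold, sign)."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                # Sum the weights of the points this stump misclassifies.
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[j] >= thr else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign

def score(ensemble, row):
    """Real-valued ensemble output F(x); its sign is the predicted class."""
    return sum(alpha * stump_predict(s, row) for alpha, s in ensemble)

def predict(ensemble, row):
    return 1 if score(ensemble, row) >= 0 else -1

def assemble_sketch(X_lab, y_lab, X_unl, rounds=10, beta=0.5):
    """Labels are -1/+1; both X_lab and X_unl are assumed non-empty."""
    # Assumption: initial pseudo-classes come from the nearest labeled point.
    def nearest_label(row):
        d = [sum((a - b) ** 2 for a, b in zip(row, r)) for r in X_lab]
        return y_lab[d.index(min(d))]

    pseudo = [nearest_label(r) for r in X_unl]
    X = list(X_lab) + list(X_unl)
    n_l, n_u = len(X_lab), len(X_unl)
    # Assumption: labeled points share weight beta, unlabeled share 1 - beta.
    base = [beta / n_l] * n_l + [(1 - beta) / n_u] * n_u
    w = list(base)
    ensemble = []
    for _ in range(rounds):
        y = list(y_lab) + pseudo
        # Step 1: build the next base classifier on labeled + pseudo-labeled data.
        stump = fit_stump(X, y, w)
        err = stump[0]
        if err >= 0.5:          # no better than chance: stop
            break
        perfect = err < 1e-10
        err = max(err, 1e-10)   # avoid log(0) / division by zero
        ensemble.append((0.5 * math.log((1 - err) / err), stump))
        if perfect:             # nothing left to reweight
            break
        # Step 2: reassign pseudo-classes using the updated ensemble.
        scores = [score(ensemble, r) for r in X]
        pseudo = [1 if s >= 0 else -1 for s in scores[n_l:]]
        y = list(y_lab) + pseudo
        # Small-margin points (labeled or pseudo-labeled) get more weight.
        w = [b * math.exp(-yi * s) for b, yi, s in zip(base, y, scores)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

ens = assemble_sketch([[0.0], [1.0]], [-1, 1],
                      [[0.1], [0.2], [0.8], [0.9]])
print(predict(ens, [0.15]), predict(ens, [0.85]))
```

Because each pseudo-class is the sign of the current ensemble output, the quantity `yi * s` for an unlabeled point equals the absolute value of that output, which is one way to read the abstract's statement that the margin is measured on both labeled and unlabeled data. Any learner that accepts per-example weights could stand in for the stump here.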
Kristin P. Bennett, Ayhan Demiriz, Richard Maclin
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2002
Where KDD
Authors Kristin P. Bennett, Ayhan Demiriz, Richard Maclin