Leveraging the margin more carefully

Boosting is a popular approach for building accurate classifiers. Contrary to initial popular belief, boosting algorithms do overfit and are sensitive to label noise. Part of this sensitivity to outliers and noise can be attributed to the unboundedness of the margin-based loss functions that boosting algorithms employ. In this paper we describe two leveraging algorithms that build on boosting techniques and employ a bounded loss function of the margin. The first algorithm interleaves the expectation maximization (EM) algorithm with boosting steps. The second algorithm decomposes a non-convex loss into a difference of two convex losses. We prove that both algorithms converge to a stationary point. We also analyze the generalization properties of the algorithms using Rademacher complexity. We describe experiments with both synthetic and natural data (OCR and text) that demonstrate the merits of our framework, in particular its robustness to outliers.
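To make the contrast between unbounded and bounded margin losses concrete, here is a minimal sketch. It assumes the ramp loss as the bounded loss, chosen only because it has a well-known decomposition into a difference of two convex hinge terms; the abstract does not say which bounded loss the paper uses, so this is an illustrative stand-in and not the authors' method. The function names are hypothetical.

```python
import math


def exp_loss(margin: float) -> float:
    """Exponential loss used by AdaBoost-style methods; unbounded as margin -> -inf."""
    return math.exp(-margin)


def hinge(margin: float, shift: float) -> float:
    """Convex hinge term max(0, shift - margin)."""
    return max(0.0, shift - margin)


def ramp_loss(margin: float) -> float:
    """Bounded, non-convex ramp loss written as a difference of two convex hinges.

    Assumption: the ramp loss is only an example of a bounded margin loss,
    not necessarily the loss analyzed in the paper.
    """
    return hinge(margin, 1.0) - hinge(margin, 0.0)


if __name__ == "__main__":
    # A badly misclassified outlier (margin -10) dominates the exponential
    # loss, while its ramp loss is capped at 1.
    for m in (-10.0, -1.0, 0.0, 0.5, 2.0):
        print(f"margin={m:6.1f}  exp={exp_loss(m):12.3f}  ramp={ramp_loss(m):.3f}")
```

Under this assumption, a single mislabeled example with a large negative margin contributes at most 1 to the bounded objective, whereas its contribution to the exponential loss grows without limit.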
Nir Krause, Yoram Singer
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2004
Where ICML