Avoiding Boosting Overfitting by Removing Confusing Samples


Boosting methods are known to exhibit noticeable overfitting on some datasets while being immune to overfitting on others. In this paper we show that standard boosting algorithms are not appropriate in the case of overlapping classes. This inadequacy is likely to be the major source of boosting overfitting when working with real-world data. To verify our conclusion we use the fact that any task with overlapping classes can be reduced to a deterministic task with the same Bayesian separating surface. This can be done by removing “confusing samples”, that is, samples that are misclassified by a “perfect” Bayesian classifier. We propose an algorithm for removing confusing samples and experimentally study the behavior of AdaBoost trained on the resulting data sets. Experiments confirm that removing confusing samples helps boosting to reduce the generalization error and to avoid overfitting on both synthetic and real-world data. The process of removing confusing samples also provides an accurate e...
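The reduction described in the abstract can be illustrated on a toy problem where the Bayes-optimal rule is known in closed form. The sketch below is not the authors' algorithm (the paper has to estimate confusing samples from data, and the abstract does not give that procedure); it assumes two 1-D unit-variance Gaussian classes with equal priors centred at -1 and +1, so the Bayes rule is simply a threshold at 0. All function names here are hypothetical and chosen for illustration only.

```python
import random


def make_overlapping_data(n, seed=0):
    """Draw n labelled points from two overlapping 1-D Gaussians.

    Assumed toy setup: class -1 ~ N(-1, 1), class +1 ~ N(+1, 1), equal priors.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        x = rng.gauss(float(y), 1.0)
        data.append((x, y))
    return data


def bayes_predict(x):
    """Bayes-optimal rule for the assumed setup: threshold at x = 0."""
    return 1 if x >= 0 else -1


def remove_confusing(data):
    """Split data into samples the Bayes rule classifies correctly and the rest."""
    kept = [(x, y) for x, y in data if bayes_predict(x) == y]
    removed = [(x, y) for x, y in data if bayes_predict(x) != y]
    return kept, removed


data = make_overlapping_data(2000)
kept, removed = remove_confusing(data)
confusing_fraction = len(removed) / len(data)

# After removal the labels are a deterministic function of x,
# and the separating surface (x = 0) is unchanged.
print(f"kept {len(kept)} samples, removed {len(removed)} confusing samples")
print(f"fraction removed: {confusing_fraction:.3f}")
```

In this assumed setup the fraction of removed samples approximates the Bayes error of the original task, and the remaining set is perfectly separable at the same threshold, which is the sense in which the overlapping-classes task becomes deterministic.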

Alexander Vezhnevets, Olga Barinova


Boosting | Confusing Samples | ECML 2007 | Machine Learning | Standard Boosting Algorithms

posted by Vezhnick

Added	07 Jun 2010
Updated	30 Aug 2010
Type	Conference
Year	2007
Where	ECML
Authors	Alexander Vezhnevets, Olga Barinova

	Towards Weakly Supervised Semantic Segmentation by Means of Multiple Instance and Multitask Learning. 1135 views
	Avoiding Boosting Overfitting by Removing Confusing Samples 263 views
	Agnostic Domain Adaptation 355 views
	Weakly supervised structured output learning for semantic segmentation 278 views
