Sciweavers

KAIS
2010

Boosting support vector machines for imbalanced data sets

Real-world data mining applications must address the issue of learning from imbalanced data sets. The problem occurs when the instances of one class greatly outnumber the instances of the other class. Such data sets often cause a default classifier to be built due to skewed vector spaces or lack of information. Common approaches for dealing with the class imbalance problem involve modifying the data distribution or modifying the classifier. In this work, we choose to use a combination of both approaches. We use support vector machines with soft margins as the base classifier to solve the skewed vector spaces problem. Then we use a boosting algorithm to get an ensemble classifier that has lower error than a single classifier. We found that this ensemble of SVMs makes an impressive improvement in prediction performance, not only for the majority class, but also for the minority class.
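The combination described in the abstract (soft-margin SVMs as base classifiers, combined by boosting) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the boosting algorithm, so discrete AdaBoost is assumed here, and a linear SVM fitted by plain subgradient descent stands in for whatever solver and kernel the paper uses. The function names, the hyperparameters (`lam`, `lr`, `epochs`, `rounds`) and the toy data are all hypothetical.

```python
import math
import random


def svm_score(w, b, x):
    """Raw linear SVM score w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_weighted_svm(X, y, sample_w, lam=0.01, lr=0.1, epochs=300):
    """Soft-margin linear SVM fitted by subgradient descent on
    lam/2 * ||w||^2 + sum_i sample_w[i] * max(0, 1 - y_i * (w.x_i + b)).
    (Assumed stand-in for the paper's base classifier.)"""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi, si in zip(X, y, sample_w):
            if yi * svm_score(w, b, xi) < 1.0:  # margin violated: hinge term is active
                for j, xj in enumerate(xi):
                    grad_w[j] -= si * yi * xj
                grad_b -= si * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def boost_svms(X, y, rounds=5):
    """Discrete AdaBoost over weighted linear SVMs; labels must be +1 or -1."""
    n = len(X)
    sample_w = [1.0 / n] * n  # start from a uniform distribution over instances
    ensemble = []             # list of (alpha, w, b)
    for _ in range(rounds):
        w, b = train_weighted_svm(X, y, sample_w)
        preds = [1 if svm_score(w, b, xi) >= 0 else -1 for xi in X]
        err = sum(si for si, p, yi in zip(sample_w, preds, y) if p != yi)
        if err >= 0.5:  # no better than chance on the weighted sample
            break
        alpha = 0.5 * math.log((1.0 - err) / max(err, 1e-10))
        ensemble.append((alpha, w, b))
        if err == 0:  # nothing left to reweight
            break
        # Raise the weight of misclassified instances, lower the rest, renormalise.
        sample_w = [si * math.exp(-alpha * yi * p)
                    for si, yi, p in zip(sample_w, y, preds)]
        total = sum(sample_w)
        sample_w = [si / total for si in sample_w]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of the base SVMs."""
    vote = sum(alpha * (1 if svm_score(w, b, x) >= 0 else -1)
               for alpha, w, b in ensemble)
    return 1 if vote >= 0 else -1


def make_toy_data(seed=0, n_major=90, n_minor=10):
    """Hypothetical 9:1 imbalanced two-class data; not from the paper."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n_major)]
    X += [[rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)] for _ in range(n_minor)]
    y = [-1] * n_major + [1] * n_minor
    return X, y


X, y = make_toy_data()
ensemble = boost_svms(X, y)
preds = [predict(ensemble, xi) for xi in X]
# Report recall per class, since overall accuracy hides minority-class errors.
minority_recall = sum(p == 1 for p, yi in zip(preds, y) if yi == 1) / y.count(1)
majority_recall = sum(p == -1 for p, yi in zip(preds, y) if yi == -1) / y.count(-1)
print("base classifiers:", len(ensemble))
print("minority recall:", minority_recall, "majority recall:", majority_recall)
```

Per-class recall is printed rather than plain accuracy because, on a 9:1 split, a default classifier that always predicts the majority class already reaches 90% accuracy while missing every minority instance. The numbers this toy run produces say nothing about the results reported in the paper.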
Benjamin X. Wang, Nathalie Japkowicz
Added 29 Jan 2011
Updated 29 Jan 2011
Type Journal