Sciweavers

IFIP12
2008

A Study with Class Imbalance and Random Sampling for a Decision Tree Learning System

Sampling methods are a direct approach to tackling the problem of class imbalance. These methods sample a data set in order to alter its class distribution, usually to obtain a more balanced one. An open question about sampling methods is which distribution, if any, provides the best results. In this work we conduct a broad empirical study aiming to provide more insight into this question. Our results suggest that altering the class distribution can improve classification performance when AUC is used as the performance metric. Furthermore, as a general recommendation, random over-sampling to balance the class distribution is a good starting point for dealing with class imbalance.
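To make the recommendation concrete, here is a minimal sketch of random over-sampling to a balanced distribution. It is not the authors' implementation: this page does not describe their exact procedure, and the function name `random_oversample` and the toy data are assumptions made for illustration. The sketch duplicates randomly chosen examples of each smaller class until every class has as many examples as the largest one.

```python
import random
from collections import Counter, defaultdict


def random_oversample(X, y, seed=0):
    """Hypothetical helper: balance classes by duplicating random examples.

    Every class smaller than the largest one is padded with examples drawn
    at random, with replacement, from that same class.
    """
    rng = random.Random(seed)

    # Group the feature rows by class label.
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)

    # Every class is grown to the size of the largest class.
    target = max(len(rows) for rows in by_class.values())

    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


# Toy data (assumed): 8 examples of class 0 and 2 examples of class 1.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))  # Counter({0: 8, 1: 8})
```

When a sketch like this is used in an evaluation, only the training split should be over-sampled, so that duplicated examples do not leak into the data used to measure AUC.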
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where IFIP12
Authors Ronaldo C. Prati, Gustavo E. A. P. A. Batista, Maria Carolina Monard