
ESEM
2007
ACM

The Effects of Over and Under Sampling on Fault-prone Module Detection

This paper aims to improve the prediction performance of fault-prone module prediction models (fault-proneness models) by applying over- and under-sampling methods, which are preprocessing procedures for a fit dataset. Sampling methods are expected to improve prediction performance when the fit dataset is imbalanced, i.e., when the number of fault-prone modules differs greatly from the number of not-fault-prone modules. So far, no research has reported the effects of applying sampling methods to fault-proneness models. In this paper, we experimentally evaluated the effects of four sampling methods (random over sampling, synthetic minority over sampling, random under sampling, and one-sided selection) applied to four fault-proneness models (linear discriminant analysis, logistic regression analysis, neural network, and classification tree) using two module sets of industry legacy software. All four sampling methods improved the prediction performance of the li...
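To make the two random sampling methods named in the abstract concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy module metrics, and the choice to balance the classes to a 1:1 ratio are all hypothetical, and the excerpt does not say which target ratio the paper used. Synthetic minority over sampling and one-sided selection are not shown.

```python
import random
from collections import Counter


def split_indices(labels, minority):
    """Return (minority indices, majority indices) for a binary label list."""
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    return minority_idx, majority_idx


def random_over_sample(rows, labels, minority, seed=0):
    """Duplicate randomly chosen minority rows until both classes have equal size.

    Assumes the minority class is non-empty and no larger than the majority.
    """
    rng = random.Random(seed)
    minority_idx, majority_idx = split_indices(labels, minority)
    shortfall = len(majority_idx) - len(minority_idx)
    extra = [rng.choice(minority_idx) for _ in range(shortfall)]
    keep = majority_idx + minority_idx + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


def random_under_sample(rows, labels, minority, seed=0):
    """Discard randomly chosen majority rows until both classes have equal size.

    Assumes the minority class is no larger than the majority.
    """
    rng = random.Random(seed)
    minority_idx, majority_idx = split_indices(labels, minority)
    kept_majority = rng.sample(majority_idx, len(minority_idx))
    keep = kept_majority + minority_idx
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Hypothetical fit dataset: (lines of code, cyclomatic complexity) per module.
# Label 1 = fault-prone, 0 = not fault-prone; 2 of the 10 modules are fault-prone.
rows = [(120, 4), (80, 2), (300, 9), (45, 1), (210, 6),
        (95, 3), (150, 5), (60, 2), (900, 31), (750, 27)]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

over_rows, over_labels = random_over_sample(rows, labels, minority=1)
under_rows, under_labels = random_under_sample(rows, labels, minority=1)

print(Counter(labels))        # class counts before sampling
print(Counter(over_labels))   # class counts after random over sampling
print(Counter(under_labels))  # class counts after random under sampling
```

Over sampling keeps every original module and grows the dataset, while under sampling shrinks it and discards information from the majority class. Either way, only the fit dataset would be resampled; the data used to evaluate the model keeps its original class distribution.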
Added: 16 Aug 2010
Updated: 16 Aug 2010
Type: Conference
Year: 2007
Where: ESEM
Authors: Yasutaka Kamei, Akito Monden, Shinsuke Matsumoto, Takeshi Kakimoto, Ken-ichi Matsumoto