The behaviour of random forest permutation-based variable importance measures under predictor correlation

Background: Random forests (RF) are increasingly used in applications such as genome-wide association and microarray studies, where predictor correlation is frequently observed. Recent work on permutation-based variable importance measures (VIMs) in RF has reached apparently contradictory conclusions. We present an extended simulation study to synthesize these results.

Results: When predictors were both correlated and associated with the outcome (HA), the unconditional RF VIM attributed a higher share of importance to correlated predictors; under the null hypothesis that no predictors are associated with the outcome (H0), the unconditional RF VIM was unbiased. Conditional VIMs showed lower values for correlated predictors than the unconditional VIMs under HA and were unbiased under H0. Scaled VIMs were clearly biased under both HA and H0.

Conclusions: Unconditional unscaled VIMs are a computationally tractable choice for large data...
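
As an illustration of what an unconditional, unscaled permutation-based VIM computes, here is a minimal Python sketch. It is not the authors' simulation code. The function names, the toy data, and the fixed `predict` function are all assumptions made for this example, and a hand-written stand-in predictor replaces a fitted random forest so that the sketch needs only the standard library. In an actual random forest the measure is typically computed per tree on out-of-bag samples and then averaged over trees.

```python
import random

def mse(predict, X, y):
    """Mean squared error of `predict` over the rows of X."""
    return sum((predict(row) - target) ** 2 for row, target in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Unscaled, unconditional permutation importance for each column.

    For column j, shuffle its values across rows (breaking its link to the
    outcome and to the other predictors) and record the mean increase in MSE.
    """
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(predict, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Toy data (illustrative only): x1 is correlated with x0, x2 is pure noise.
data_rng = random.Random(42)
X, y = [], []
for _ in range(200):
    x0 = data_rng.gauss(0, 1)
    x1 = x0 + data_rng.gauss(0, 0.3)
    x2 = data_rng.gauss(0, 1)
    X.append([x0, x1, x2])
    y.append(x0 + data_rng.gauss(0, 0.5))

# Hypothetical stand-in for a fitted model; a real study would fit a forest.
def predict(row):
    return 0.5 * row[0] + 0.5 * row[1]

vim = permutation_importance(predict, X, y)
```

Because the stand-in predictor ignores the third column, permuting it leaves every prediction unchanged and its importance is exactly zero, while both correlated columns receive positive importance. The sketch shows only the mechanics of the measure and makes no attempt to reproduce the bias patterns reported in the abstract.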
Added 08 Dec 2010
Updated 08 Dec 2010
Type Journal
Year 2010
Where BMCBI
Authors Kristin K. Nicodemus, James D. Malley, Carolin Strobl, Andreas Ziegler