Bias in error estimation when using cross-validation for model selection

Background: Cross-validation (CV) is an effective method for estimating the prediction error of a classifier. Some recent articles have proposed methods for optimizing classifiers by choosing classifier parameter values that minimize the CV error estimate. We have evaluated the validity of using the CV error estimate of the optimized classifier as an estimate of the true error expected on independent data. Results: We used CV to optimize the classification parameters for two kinds of classifiers: Shrunken Centroids and Support Vector Machines (SVM). Random training datasets were created, with no difference in the distribution of the features between the two classes. Using these "null" datasets, we selected classifier parameter values that minimized the CV error estimate. 10-fold CV was used for Shrunken Centroids, while Leave-One-Out CV (LOOCV) was used for the SVM. Independent test data were created to estimate the true error. With "null" and "non-null" (w...
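
A minimal sketch of this setup, assuming scikit-learn's SVC, GridSearchCV, and LeaveOneOut are available (this is not the authors' code; the sample sizes, feature count, and parameter grid below are illustrative choices): on a "null" training set with no real class signal, the SVM cost parameter is chosen to minimize the LOOCV error estimate, and that minimized estimate is then compared with the error measured on independent "null" test data, which should stay near 50%.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, LeaveOneOut

rng = np.random.default_rng(0)
n_train, n_test, n_features = 40, 2000, 500  # illustrative sizes, not the paper's

# "Null" training set: features drawn from the same distribution for both classes.
X_train = rng.normal(size=(n_train, n_features))
y_train = np.repeat([0, 1], n_train // 2)

# Choose the SVM cost parameter C that minimizes the LOOCV error estimate
# (hypothetical grid; the paper's actual parameter settings may differ).
search = GridSearchCV(
    SVC(kernel="linear"),
    param_grid={"C": [0.01, 0.1, 1, 10, 100]},
    scoring="accuracy",
    cv=LeaveOneOut(),
)
search.fit(X_train, y_train)

# Independent "null" test set to estimate the true error of the optimized classifier.
X_test = rng.normal(size=(n_test, n_features))
y_test = rng.integers(0, 2, size=n_test)

cv_error = 1.0 - search.best_score_              # minimized CV error estimate
test_error = 1.0 - search.score(X_test, y_test)  # error on independent data (~0.5)
print(f"CV error of the optimized classifier: {cv_error:.2f}")
print(f"Error on independent test data:       {test_error:.2f}")
```

Because the same CV estimate is used both to pick the parameter value and to report performance, it tends to understate the true error; an honest estimate requires either independent test data, as above, or keeping the parameter selection inside an outer (nested) cross-validation loop.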
Sudhir Varma, Richard Simon
Type: Journal
Year: 2006
Where: BMCBI (BMC Bioinformatics)
Authors: Sudhir Varma, Richard Simon