On the Dangers of Cross-Validation. An Experimental Evaluation

Cross-validation allows models to be tested on the full training set by means of repeated resampling, thus maximizing the total number of points used for testing and potentially helping to protect against overfitting. Improvements in computational power, recent reductions in the (computational) cost of classification algorithms, and the development of closed-form solutions (for performing cross-validation in certain classes of learning algorithms) make it possible to test thousands or millions of variants of learning models on the data. Thus, it is now possible to compute cross-validation performance for a much larger number of tuned models than would have been feasible before. However, we show empirically that with such a large number of models the risk of overfitting increases, and the performance estimated by cross-validation is no longer an effective estimate of generalization; hence, this paper provides an empirical reminder of the dangers of cross-validation. We use a clo...
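The effect the abstract describes can be demonstrated with a small, self-contained sketch (not the paper's actual experiment): on data whose labels carry no signal at all, scoring many model variants by k-fold cross-validation and keeping the best one yields an inflated CV estimate, while the selected model performs at chance on fresh data. The dataset sizes, the nearest-centroid classifier, and the random-feature-subset "variants" below are all illustrative assumptions.

```python
import random

random.seed(0)

N, D, K = 60, 50, 5  # samples, features, CV folds (illustrative sizes)
# Labels are independent of the features, so true accuracy is 0.5.
X = [[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]
y = [random.choice([-1, 1]) for _ in range(N)]

def fit_centroids(rows, labels, feats):
    # "Training": per-class mean over the chosen feature subset
    cents = {}
    for cls in (-1, 1):
        pts = [r for r, l in zip(rows, labels) if l == cls]
        cents[cls] = ([sum(p[f] for p in pts) / len(pts) for f in feats]
                      if pts else [0.0] * len(feats))
    return cents

def predict(cents, row, feats):
    def dist(c):
        return sum((row[f] - cf) ** 2 for f, cf in zip(feats, c))
    return -1 if dist(cents[-1]) < dist(cents[1]) else 1

def cv_accuracy(feats):
    # Standard k-fold CV: each point is scored by a model trained
    # on the folds that exclude it.
    correct = 0
    for fold in range(K):
        held = set(range(fold, N, K))
        tr_rows = [X[i] for i in range(N) if i not in held]
        tr_lab = [y[i] for i in range(N) if i not in held]
        cents = fit_centroids(tr_rows, tr_lab, feats)
        correct += sum(predict(cents, X[i], feats) == y[i] for i in held)
    return correct / N

# "Tune" over many model variants: 500 random 5-feature subsets.
variants = [random.sample(range(D), 5) for _ in range(500)]
scores = [cv_accuracy(f) for f in variants]
best_cv = max(scores)
best_feats = variants[scores.index(best_cv)]

# Honest check: refit the winner on all data, score on fresh
# data drawn from the same (signal-free) source.
cents = fit_centroids(X, y, best_feats)
X_new = [[random.gauss(0, 1) for _ in range(D)] for _ in range(2000)]
y_new = [random.choice([-1, 1]) for _ in range(2000)]
test_acc = sum(predict(cents, X_new[i], best_feats) == y_new[i]
               for i in range(2000)) / 2000

print("best CV accuracy over 500 variants:", best_cv)
print("fresh-data accuracy of selected model:", test_acc)
```

Each individual CV score is an unbiased estimate; it is the max over many variants that is biased upward, which is exactly the selection effect the paper warns about.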
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2008
Where SDM
Authors R. Bharat Rao, Glenn Fung