
ICML
2000
IEEE

Complete Cross-Validation for Nearest Neighbor Classifiers

Cross-validation is an established technique for estimating the accuracy of a classifier. It is normally performed either on a number of random test/train partitions of the data or by k-fold cross-validation. We present a technique for calculating the complete cross-validation for nearest-neighbor classifiers, i.e., averaging over all desired test/train partitions of the data. The technique is applied to several common classifier variants, including K-nearest-neighbor classifiers, stratified data partitioning, and arbitrary loss functions. We demonstrate, with complexity analysis and experimental timing results, that the technique can be performed in time comparable to k-fold cross-validation, even though in effect it averages over an exponential number of trials. We show that the results of complete cross-validation are biased to the same degree as those of subsampling and k-fold cross-validation, with some reduction in variance. The algorithm thus offers significant benefits in terms of both time and accuracy.
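To illustrate how an average over exponentially many partitions can be computed without enumerating them, here is a minimal sketch for the 1-nearest-neighbor case with 0/1 loss. It is not taken from the paper: it relies on a standard counting argument and may differ from the authors' algorithm. For a held-out point, the neighbor at rank r (0 = closest among the other n-1 points) is its nearest training point exactly when that neighbor is in the training set and the r closer points are not, which happens in C(n-2-r, k-1) of the C(n-1, k) training sets of size k. The function names, the one-dimensional points, and the absolute-difference distance are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def complete_cv_error_1nn(points, labels, train_size):
    """Average 1-NN error over ALL train/test splits with the given
    training-set size, computed by counting instead of enumeration."""
    n = len(points)
    total_sets = comb(n - 1, train_size)  # training sets that exclude point i
    error = 0.0
    for i in range(n):
        # Other points, ordered from closest to farthest from point i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: abs(points[j] - points[i]))
        for rank, j in enumerate(others):
            if labels[j] != labels[i]:
                # j is the nearest training point iff j is chosen and the
                # `rank` closer points are not; the remaining train_size - 1
                # members come from the n - 2 - rank farther points.
                error += comb(n - 2 - rank, train_size - 1) / total_sets
    return error / n


def brute_force_cv_error_1nn(points, labels, train_size):
    """Reference implementation: enumerate every split explicitly."""
    n = len(points)
    errors = 0
    trials = 0
    for train in combinations(range(n), train_size):
        train_set = set(train)
        for i in range(n):
            if i in train_set:
                continue
            nearest = min(train, key=lambda j: abs(points[j] - points[i]))
            errors += labels[nearest] != labels[i]
            trials += 1
    return errors / trials


# Hypothetical toy data, chosen only to exercise the code.
points = [0.0, 1.0, 2.5, 4.5, 7.0, 7.5]
labels = [0, 0, 1, 1, 0, 1]
fast = complete_cv_error_1nn(points, labels, train_size=3)
slow = brute_force_cv_error_1nn(points, labels, train_size=3)
print(fast, slow)
```

On this toy data the two functions should agree up to floating-point rounding. The counting version sorts the neighbors of each point once, so its cost grows polynomially in n, whereas the brute-force version enumerates every training set of the requested size.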
Matthew D. Mullin, Rahul Sukthankar
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2000
Where ICML
Authors Matthew D. Mullin, Rahul Sukthankar