Abstract
Most machine learning researchers perform quantitative experiments to estimate generalization error and compare the performance of different algorithms (in particular, their proposed algorithm). In order to draw statistically convincing conclusions, it is important to estimate the uncertainty of such estimates. This paper studies the very commonly used K-fold cross-validation estimator of generalization performance. The main theorem shows that there exists no universal (valid under all distributions) unbiased estimator of the variance of K-fold cross-validation, based on a single computation of the K-fold cross-validation estimator. The analysis that accompanies this result is based on the eigen-decomposition of the covariance matrix of errors, which has only three distinct eigenvalues, corresponding to three degrees of freedom of the matrix and three components of the total variance. This analysis helps to better understand the nature of the problem and how it can make naive estimators (those that do not take into account the error correlations due to the overlap between training and test sets) grossly underestimate variance. This is confirmed by numerical experiments in which the three components of the variance are compared as the difficulty of the learning problem and the number of folds are varied.
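The three-valued covariance structure described above can be illustrated numerically. The sketch below is not the chapter's own experiment: it assumes a toy learning problem (estimating the mean of a standard Gaussian under squared-error loss, with the sample mean of the training folds as the "model"), and the names `sigma2`, `omega`, and `gamma` are labels chosen here for the three covariance values — the variance of a single error, the covariance of two errors in the same test fold, and the covariance of two errors in different folds. Over many independent replications, the empirical covariance matrix of the per-example cross-validation errors is averaged within each of those three groups of entries.

```python
# Monte-Carlo sketch (toy setup, assumed here, not taken from the chapter):
# estimate the mean of a standard Gaussian with n = 10 points and K = 5 folds,
# and estimate the covariance matrix of the n per-example CV errors across
# many replications. Its entries cluster into three values: the diagonal
# (variance), within-fold covariances, and between-fold covariances.
import numpy as np

rng = np.random.default_rng(0)
n, K, reps = 10, 5, 100_000
m = n // K                                    # examples per fold
fold_of = np.repeat(np.arange(K), m)          # fold index of each example

X = rng.normal(size=(reps, n))                # independent replications of the dataset
# "Learning algorithm": predict with the mean of the K-1 training folds.
fold_sums = X.reshape(reps, K, m).sum(axis=2)                  # (reps, K)
preds = (X.sum(axis=1, keepdims=True) - fold_sums) / (n - m)   # (reps, K)
errors = (X - preds[:, fold_of]) ** 2         # per-example squared CV errors

C = np.cov(errors, rowvar=False)              # empirical n x n error covariance
diag = np.eye(n, dtype=bool)
same_fold = fold_of[:, None] == fold_of[None, :]

sigma2 = C[diag].mean()                       # variance of a single error
omega = C[same_fold & ~diag].mean()           # covariance within a fold
gamma = C[~same_fold].mean()                  # covariance between folds
print(f"sigma^2 ~ {sigma2:.4f}, omega ~ {omega:.4f}, gamma ~ {gamma:.4f}")
```

In this toy setting both off-diagonal averages come out positive: errors in the same fold share one fitted model, and errors in different folds share most of their training data. A "naive" variance estimate of the CV mean that divides the sample variance of the n errors by n ignores these positive `omega` and `gamma` terms, which is the mechanism by which such estimators understate the true variance.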
Copyright information
© 2005 Springer Science+Business Media, Inc.
Cite this chapter
Bengio, Y., Grandvalet, Y. (2005). Bias in Estimating the Variance of K-Fold Cross-Validation. In: Duchesne, P., Rémillard, B. (eds) Statistical Modeling and Analysis for Complex Data Problems. Springer, Boston, MA. https://doi.org/10.1007/0-387-24555-3_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-24554-6
Online ISBN: 978-0-387-24555-3
eBook Packages: Mathematics and Statistics (R0)