Bias in Estimating the Variance of K-Fold Cross-Validation

Abstract

Most machine learning researchers perform quantitative experiments to estimate generalization error and compare the performance of different algorithms (in particular, their proposed algorithm). In order to be able to draw statistically convincing conclusions, it is important to estimate the uncertainty of such estimates. This paper studies the very commonly used K-fold cross-validation estimator of generalization performance. The main theorem shows that there exists no universal (valid under all distributions) unbiased estimator of the variance of K-fold cross-validation, based on a single computation of the K-fold cross-validation estimator. The analysis that accompanies this result is based on the eigen-decomposition of the covariance matrix of errors, which has only three distinct eigenvalues, corresponding to three degrees of freedom of the matrix and three components of the total variance. This analysis helps to better understand the nature of the problem and how it can make naive estimators (those that do not take into account the error correlations due to the overlap between training sets) grossly underestimate variance. This is confirmed by numerical experiments in which the three components of the variance are compared as the difficulty of the learning problem and the number of folds are varied.
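To make the "three components" concrete: as derived in the paper's analysis, with n examples split into K test blocks of size m = n/K, the covariance matrix of the per-example test errors $e_i$ takes only three distinct values, and the variance of the cross-validation estimate $\hat{\mu} = \frac{1}{n}\sum_{i=1}^n e_i$ decomposes accordingly:

$$
\operatorname{Cov}(e_i, e_j) =
\begin{cases}
\sigma^2 & \text{if } i = j,\\
\omega & \text{if } i \neq j \text{ lie in the same test block},\\
\gamma & \text{if } i, j \text{ lie in different test blocks},
\end{cases}
\qquad
\operatorname{Var}(\hat{\mu}) = \frac{1}{n}\,\sigma^2 + \frac{m-1}{n}\,\omega + \frac{n-m}{n}\,\gamma.
$$

A naive variance estimator keeps only the $\sigma^2/n$ term. As a quick empirical check, here is a minimal Monte Carlo sketch in Python with NumPy; the training-mean predictor, sample size, and fold count are illustrative choices of ours, not the paper's experimental setup. It compares the naive estimate against a ground truth obtained by rerunning the whole K-fold procedure on many fresh datasets:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative choices, not taken from the paper.
n, K = 120, 6            # total sample size, number of folds
m = n // K               # examples per test block
n_repeats = 2000         # fresh datasets for the Monte Carlo ground truth

def kfold_cv_errors(y):
    """Per-example squared errors from K-fold CV of a training-mean predictor."""
    errors = np.empty(n)
    folds = np.arange(n).reshape(K, m)  # contiguous blocks; data is i.i.d.
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        y_hat = y[train_mask].mean()    # toy "learning algorithm"
        errors[test_idx] = (y[test_idx] - y_hat) ** 2
    return errors

cv_estimates = np.empty(n_repeats)
naive_var = np.empty(n_repeats)
for r in range(n_repeats):
    y = rng.normal(size=n)              # i.i.d. Gaussian targets
    e = kfold_cv_errors(y)
    cv_estimates[r] = e.mean()
    # Naive variance estimate: treats the n errors as independent,
    # i.e. drops the omega and gamma covariance terms entirely.
    naive_var[r] = e.var(ddof=1) / n

print(f"Monte Carlo Var(CV): {cv_estimates.var(ddof=1):.3e}")
print(f"mean naive estimate: {naive_var.mean():.3e}")
```

In this deliberately stable toy setting the gap may be modest; the paper's experiments indicate that the neglected ω and γ terms, and hence the naive estimator's underestimation, grow with the difficulty of the learning problem and the number of folds.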




Copyright information

© 2005 Springer Science+Business Media, Inc.

Cite this chapter

Bengio, Y., Grandvalet, Y. (2005). Bias in Estimating the Variance of K-Fold Cross-Validation. In: Duchesne, P., Rémillard, B. (eds) Statistical Modeling and Analysis for Complex Data Problems. Springer, Boston, MA. https://doi.org/10.1007/0-387-24555-3_5
