Abstract
Bagging constructs an estimator by averaging predictors trained on bootstrap samples. Bagged estimates almost always improve on the original predictor. It is thus important to understand the reasons for this success, and also for the occasional failures. Bagging is widely believed to be effective thanks to the variance reduction that stems from averaging predictors. However, seven years after its introduction, bagging is still not fully understood. This paper provides experimental evidence supporting the hypothesis that bagging stabilizes prediction by equalizing the influence of training examples. This effect is detailed in two different frameworks: estimation on the real line and regression. Bagging's improvements and deteriorations are explained by the goodness or badness of highly influential examples, in situations where the usual variance reduction argument is at best questionable. Finally, reasons for the equalization effect are advanced. They suggest that other resampling strategies, such as half-sampling, should produce qualitatively identical effects while being computationally less demanding than bootstrap sampling.
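To make the averaging step concrete, the following is a minimal sketch of bagged prediction in Python. It is not the paper's experimental code: train_fn, resample, and seed are illustrative names, and resample="half" mimics the half-sampling strategy mentioned in the abstract.

    import numpy as np

    def bagged_predict(train_fn, X, y, X_test, n_estimators=50,
                       resample="bootstrap", seed=None):
        """Average the predictions of models fitted on resampled training sets.

        train_fn(X, y) is assumed to return a fitted model exposing a
        .predict(X_test) method.
        """
        rng = np.random.default_rng(seed)
        n = len(y)
        preds = []
        for _ in range(n_estimators):
            if resample == "bootstrap":
                # classical bagging: draw n examples with replacement
                idx = rng.integers(0, n, size=n)
            else:
                # half-sampling: draw n/2 examples without replacement
                idx = rng.choice(n, size=n // 2, replace=False)
            model = train_fn(X[idx], y[idx])
            preds.append(model.predict(X_test))
        # aggregate by averaging, as in bagging for regression
        return np.mean(preds, axis=0)

For instance, with scikit-learn installed, bagged_predict(lambda X, y: DecisionTreeRegressor().fit(X, y), X_train, y_train, X_test) bags a regression tree; passing resample="half" trades bootstrap sampling for the cheaper half-sampling scheme, which the paper argues should yield qualitatively identical effects.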
Cite this article
Grandvalet, Y. Bagging Equalizes Influence. Mach Learn 55, 251–270 (2004). https://doi.org/10.1023/B:MACH.0000027783.34431.42