Abstract
This study investigates a method for avoiding overfitting and overtraining in Artificial Neural Networks (ANNs) that combines two techniques: Early Stopping and Ensemble averaging (ESE). We show that ESE improves the prediction ability of ANNs trained with the Cascade Correlation Algorithm. A simple algorithm for estimating the generalization ability of the method by the Leave-One-Out technique is proposed and discussed. The accompanying paper considers the optimal selection of training cases for accelerated learning with the ESE method.
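The general idea of combining early stopping with ensemble averaging can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a fixed one-hidden-layer tanh network trained by plain gradient descent in place of Cascade Correlation, a random half-and-half split into learning and validation sets for each ensemble member, and illustrative hyperparameters. The function names (train_early_stopped, predict, ese_fit, ese_predict) are hypothetical.

```python
import numpy as np

def predict(params, X):
    """Output of a one-hidden-layer tanh network."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2

def train_early_stopped(X_tr, y_tr, X_val, y_val, rng,
                        hidden=8, lr=0.05, epochs=300):
    """Gradient descent on the learning set; return the weights from the
    epoch with the lowest validation error (early stopping)."""
    W1 = rng.normal(0.0, 0.5, (X_tr.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    n = len(y_tr)
    best_err, best = np.inf, None
    for _ in range(epochs):
        H = np.tanh(X_tr @ W1 + b1)
        err = H @ W2 + b2 - y_tr                 # residuals on the learning set
        # Gradients of 0.5 * mean squared error
        dH = np.outer(err, W2) * (1.0 - H ** 2)
        gW1, gb1 = X_tr.T @ dH / n, dH.mean(axis=0)
        gW2, gb2 = H.T @ err / n, err.mean()
        W1 = W1 - lr * gW1
        b1 = b1 - lr * gb1
        W2 = W2 - lr * gW2
        b2 = b2 - lr * gb2
        val_err = np.mean((predict((W1, b1, W2, b2), X_val) - y_val) ** 2)
        if val_err < best_err:                   # remember the best point so far
            best_err, best = val_err, (W1, b1, W2, b2)
    return best

def ese_fit(X, y, n_members=5, val_fraction=0.5, seed=0):
    """Train an ensemble; each member gets its own random
    learning/validation partition (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    n_val = int(len(y) * val_fraction)
    members = []
    for _ in range(n_members):
        idx = rng.permutation(len(y))
        val, tr = idx[:n_val], idx[n_val:]
        members.append(train_early_stopped(X[tr], y[tr], X[val], y[val], rng))
    return members

def ese_predict(members, X):
    """Ensemble prediction: the plain average of the member outputs."""
    return np.mean([predict(m, X) for m in members], axis=0)

# Toy usage on synthetic data: a noisy sine curve
data_rng = np.random.default_rng(1)
X = data_rng.uniform(-3.0, 3.0, (40, 1))
y = np.sin(X[:, 0]) + data_rng.normal(0.0, 0.2, 40)
members = ese_fit(X, y)
y_hat = ese_predict(members, X)
```

Because squared error is convex, the mean squared error of the averaged prediction on any data set cannot exceed the average of the individual members' mean squared errors, which is one reason averaging early-stopped networks is attractive.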
Cite this article
Tetko, I.V., Villa, A.E. An Enhancement of Generalization Ability in Cascade Correlation Algorithm by Avoidance of Overfitting/Overtraining Problem. Neural Processing Letters 6, 43–50 (1997). https://doi.org/10.1023/A:1009610808553