Accelerating the convergence of the back-propagation method

Abstract

The utility of the back-propagation method in establishing suitable weights in a distributed adaptive network has been demonstrated repeatedly. Unfortunately, in many applications, the number of iterations required before convergence can be large. Modifications to the back-propagation algorithm described by Rumelhart et al. (1986) can greatly accelerate convergence. The modifications consist of three changes: (1) instead of updating the network weights after each pattern is presented, the network is updated only after the entire repertoire of patterns to be learned has been presented, at which time the algebraic sums of all the weight changes are applied; (2) instead of keeping η, the “learning rate” (i.e., the multiplier on the step size), constant, it is varied dynamically so that the algorithm utilizes a near-optimum η, as determined by the local optimization topography; and (3) the momentum factor α is set to zero when, as signified by a failure of a step to reduce the total error, the information inherent in prior steps is more likely to be misleading than beneficial. Only after the network takes a useful step, i.e., one that reduces the total error, does α again assume a non-zero value. Considering the selection of weights in neural nets as a problem in classical nonlinear optimization theory, the rationale for algorithms seeking only those weights that produce the globally minimum error is reviewed and rejected.
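For concreteness, the sketch below shows one way the three modifications described above might be combined, using a small one-hidden-layer network trained on the XOR patterns. It is not the authors' code: the network size, the initial values of η and α, and the adaptation factors eta_up and eta_down are illustrative assumptions rather than values taken from the paper.

```python
# Minimal sketch of batch updating, a dynamically varied learning rate,
# and momentum reset on failed steps, applied to XOR. All constants are
# illustrative assumptions, not values from Vogl et al. (1988).
import numpy as np

rng = np.random.default_rng(0)

# XOR problem: the entire "repertoire" of four patterns is presented each epoch.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weight matrices with biases folded in as an extra constant input of 1.
W1 = rng.normal(scale=0.5, size=(3, 4))   # input + bias -> 4 hidden units
W2 = rng.normal(scale=0.5, size=(5, 1))   # hidden + bias -> 1 output unit

def total_error(W1, W2):
    """Total sum-squared error over the whole pattern set."""
    h = sigmoid(np.hstack([X, np.ones((4, 1))]) @ W1)
    y = sigmoid(np.hstack([h, np.ones((4, 1))]) @ W2)
    return 0.5 * np.sum((T - y) ** 2)

def gradients(W1, W2):
    """Back-propagation gradients summed over all patterns (modification 1)."""
    Xb = np.hstack([X, np.ones((4, 1))])
    h = sigmoid(Xb @ W1)
    hb = np.hstack([h, np.ones((4, 1))])
    y = sigmoid(hb @ W2)
    d2 = (y - T) * y * (1 - y)               # output-layer deltas
    d1 = (d2 @ W2[:-1].T) * h * (1 - h)      # hidden-layer deltas
    return Xb.T @ d1, hb.T @ d2

eta, alpha0 = 0.5, 0.9         # assumed initial learning rate and momentum
eta_up, eta_down = 1.05, 0.7   # assumed adaptation factors for eta
alpha = alpha0
dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
E = total_error(W1, W2)

for epoch in range(5000):
    g1, g2 = gradients(W1, W2)
    # Candidate batch step with momentum, applied once per epoch (modification 1).
    dW1 = -eta * g1 + alpha * dW1
    dW2 = -eta * g2 + alpha * dW2
    E_new = total_error(W1 + dW1, W2 + dW2)
    if E_new < E:
        # Useful step: accept it, grow eta, and restore the momentum factor.
        W1, W2, E = W1 + dW1, W2 + dW2, E_new
        eta *= eta_up
        alpha = alpha0
    else:
        # Failed step: reject it, shrink eta, and zero the momentum term,
        # since the history of prior steps is now likely misleading.
        eta *= eta_down
        alpha = 0.0
        dW1[:], dW2[:] = 0.0, 0.0

print("final total error:", total_error(W1, W2))
```

The key point is that the candidate batch step is evaluated against the total error before it is accepted; a failed step shrinks η and discards the momentum history until a useful step is taken again.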

References

  • Akaike H (1959) On a successive transformation of probability distribution and its application to the analysis of the optimum gradient method. Ann Inst Statist Math 11:1–17

  • Alkon DL (1983) Learning in a marine snail. Scientific American (July 1983), pp 70–84

  • Alkon DL (1984) Calcium-mediated reduction in ionic currents: a biophysical memory trace. Science 226:1037–1045

  • Alkon DL (1985) Conditioning-induced changes of Hermissenda channels: relevance to mammalian brain function. In: Weinberger NM, McGaugh JL, Lynch G (eds) Memory systems of the brain. The Guilford Press, New York

  • Levy AV, Gomez S (1985) The tunneling method applied to global optimization. In: Boggs PT, Byrd RH, Schnabel RB (eds) Numerical optimization 1984. SIAM, Philadelphia, pp 213–244

  • Luenberger DG (1984) Linear and nonlinear programming, 2nd ed. Addison-Wesley, Reading, Mass

  • Lundy M, Mees A (1986) Convergence of an annealing algorithm. Math Prog 34:111–124

  • Muneer T (1988) Comparison of optimization methods for nonlinear least squares minimization. Int J Math Educ Sci Tech 19:192–197

  • Pardalos PM, Rosen JB (1986) Methods for global concave minimization: a bibliographic survey. SIAM Rev 28:367–379

  • Parker DB (1987) Optimal algorithms for adaptive networks: second order back propagation, second order direct propagation, and second order Hebbian learning. In: Caudill M, Butler C (eds) Proceedings of the 1st International Conference on Neural Networks, San Diego, Calif., June 1987. IEEE Cat. #87TH0191-7, pp II-593-II-600

  • Pegis RJ, Grey DS, Vogl TP, Rigler AK (1966) The generalized orthonormal optimization program and its applications. In: Lavi A, Vogl TP (eds) Recent advances in optimization techniques, Wiley, New York, pp 47–60

  • Pineda FJ (1987) Generalization of back propagation to recurrent and higher order neural networks. Proceedings of the IEEE Conference on Neural Information Processing Systems, Denver, Colorado, 1987 (to be published)

  • Rinnooy-Kan AHG, Timmer GT (1985) A stochastic approach to global optimization. In: Boggs PT, Byrd RH, Schnabel RB (eds) Numerical optimization 1984. SIAM, Philadelphia, pp 245–262

  • Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal representations by error propagation. In: Rumelhart DE, McClelland JL and the PDP Research Group (eds) Parallel distributed processing, vol 1, chap 8. MIT Press, Cambridge, Mass

  • Walster GW, Hansen ER, Sengupta S (1985) Test results for a global optimization algorithm. In: Boggs PT, Byrd RH, Schnabel RB (eds) Numerical optimization 1984. SIAM, Philadelphia, pp 272–287

  • Watson LT (1986) Numerical linear algebra aspects of globally convergent homotopy methods. SIAM Rev 28:529–545

  • Whitson GM (1988) An introduction to the parallel distributed processing model of cognition and some examples of how it is changing the teaching of artificial intelligence. In: Dreshem HL (ed) Proceedings of the 19th Annual Technical Symposium on Comp Sci Education. ACM, New York, pp 59–62

  • Whitson GM, Kulkarni A (1988) A testbed for sensory PDP models. Proceedings of the 16th Annual Comp Sci Conf. ACM, New York, pp 467–468

Cite this article

Vogl, T.P., Mangis, J.K., Rigler, A.K. et al. Accelerating the convergence of the back-propagation method. Biol. Cybern. 59, 257–263 (1988). https://doi.org/10.1007/BF00332914
