Keeping Neural Networks Simple

Hinton, Geoffrey E.; van Camp, Drew

doi:10.1007/978-1-4471-2063-6_2

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Supervised neural networks generalize well if there is much less information in the weights than there is in the output vectors of the training cases. So during learning, it is important to keep the weights simple by penalizing the amount of information they contain. The amount of information in a weight can be controlled by adding Gaussian noise and the noise level can be adapted during learning to optimize the trade-off between the expected squared error and the information in the weights. We describe a method of computing the derivatives of the expected squared error and of the amount of information in the noisy weights in a network that contains a layer of non-linear hidden units. Provided the output units are linear, the exact derivatives can be computed efficiently without time-consuming Monte Carlo simulations.
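The trade-off described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: the network is reduced to a single linear output unit with no hidden layer, each weight has independent Gaussian noise, the information in a weight is measured as the Kullback-Leibler divergence from a zero-mean Gaussian prior, and the two costs are combined with a hypothetical `error_weight` factor. All function and parameter names are illustrative. Under these assumptions the expected squared error has a closed form, so no sampling is needed; the Monte Carlo routine is included only as a cross-check.

```python
import math
import random


def kl_gaussian(mu, sigma, prior_sigma):
    """KL( N(mu, sigma^2) || N(0, prior_sigma^2) ) in nats for one noisy weight.

    Assumption: a zero-mean Gaussian prior; the abstract does not fix the prior.
    """
    return (math.log(prior_sigma / sigma)
            + (sigma ** 2 + mu ** 2) / (2.0 * prior_sigma ** 2)
            - 0.5)


def expected_squared_error(mus, sigmas, x, t):
    """Exact E[(y - t)^2] for a linear unit y = sum_i w_i * x_i,
    where w_i ~ N(mu_i, sigma_i^2) independently.

    The expectation splits into squared bias of the mean output
    plus the variance that the weight noise induces in the output.
    """
    mean_y = sum(m * xi for m, xi in zip(mus, x))
    var_y = sum((s * xi) ** 2 for s, xi in zip(sigmas, x))
    return (mean_y - t) ** 2 + var_y


def monte_carlo_squared_error(mus, sigmas, x, t, n_samples=20000, seed=0):
    """Sampling estimate of the same quantity, for comparison only."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        y = sum(rng.gauss(m, s) * xi for m, s, xi in zip(mus, sigmas, x))
        total += (y - t) ** 2
    return total / n_samples


def description_length(mus, sigmas, x, t, prior_sigma=1.0, error_weight=1.0):
    """Hypothetical combined cost: weighted expected error plus weight information."""
    error_cost = error_weight * expected_squared_error(mus, sigmas, x, t)
    weight_cost = sum(kl_gaussian(m, s, prior_sigma) for m, s in zip(mus, sigmas))
    return error_cost + weight_cost


if __name__ == "__main__":
    mus, sigmas, x, t = [1.0, 2.0], [0.5, 0.1], [1.0, -1.0], 0.0
    print("exact expected error:", expected_squared_error(mus, sigmas, x, t))
    print("sampled expected error:", monte_carlo_squared_error(mus, sigmas, x, t))
    print("combined cost:", description_length(mus, sigmas, x, t))
```

In this sketch, raising a weight's noise level lowers its information cost but adds variance to the output, which is the tension that adapting the noise level is meant to balance. The paper's setting with a layer of non-linear hidden units is more involved and is not reproduced here.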

Author information

Authors and Affiliations

Department of Computer Science, University of Toronto, 10 King’s College Road, Toronto, M5S 1A4, Canada
Geoffrey E. Hinton & Drew van Camp

Editor information

Editors and Affiliations

Dutch Foundation for Neural Networks, University of Nijmegen, Geert Grooteplein 21, 6525 EZ, Nijmegen, The Netherlands
Stan Gielen & Bert Kappen

About this paper

Cite this paper

Hinton, G.E., van Camp, D. (1993). Keeping Neural Networks Simple. In: Gielen, S., Kappen, B. (eds) ICANN ’93. ICANN 1993. Springer, London. https://doi.org/10.1007/978-1-4471-2063-6_2

DOI: https://doi.org/10.1007/978-1-4471-2063-6_2
Published: 10 April 2012
Publisher Name: Springer, London
Print ISBN: 978-3-540-19839-0
Online ISBN: 978-1-4471-2063-6
eBook Packages: Springer Book Archive
