Abstract
Before backpropagation training, it is common to randomly initialize a neural network so that the mean and variance of activity are uniform across neurons. Classically, these statistics were defined over an ensemble of random networks. Alternatively, they can be defined over a random sample of inputs to the network. We show analytically and numerically that these two formulations of the principle of mean-variance preservation differ substantially in deep networks with the rectified linear unit (ReLU) nonlinearity. We numerically investigate training speed after data-dependent initialization of networks to preserve sample mean and variance.
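To make the distinction concrete, the sketch below contrasts an ensemble-based initialization (He-style scaling, which preserves variance only on average over random networks) with a data-dependent one that rescales each layer so the empirical variance of its activations over a sample batch of inputs is one. This is a minimal NumPy illustration under assumed layer sizes and a simple per-layer rescaling rule, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_ensemble(sizes):
    """He-style initialization: variance is preserved on average over
    an ensemble of random ReLU networks, not for any single network."""
    return [rng.normal(0.0, np.sqrt(2.0 / m), size=(n, m))
            for m, n in zip(sizes[:-1], sizes[1:])]

def init_sample(sizes, X):
    """Data-dependent initialization: rescale each layer so that the
    sample standard deviation of its pre-activations over the batch X
    is 1, then propagate the ReLU activations to the next layer."""
    weights = init_ensemble(sizes)           # start from an ensemble init
    h = X
    for W in weights:
        z = h @ W.T                           # pre-activations for the batch
        W *= 1.0 / z.std()                    # rescale to unit sample std
        h = np.maximum(0.0, h @ W.T)          # ReLU activations fed forward
    return weights

# Example: propagate a batch of 256 inputs through a 20-layer ReLU net
sizes = [100] + [100] * 20
X = rng.normal(size=(256, sizes[0]))

for name, ws in [("ensemble", init_ensemble(sizes)),
                 ("sample", init_sample(sizes, X))]:
    h = X
    for W in ws:
        h = np.maximum(0.0, h @ W.T)
    print(f"{name:8s} final-layer activation std: {h.std():.3f}")
```

In a deep ReLU network the ensemble-initialized activations typically drift away from unit scale for any particular weight draw, whereas the sample-rescaled network keeps the batch statistics near the target by construction.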
Received 30 January 2020; accepted 23 June 2020
DOI: https://doi.org/10.1103/PhysRevResearch.2.033135
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.