Phase transitions in the generalization behaviour of multilayer neural networks

Published under licence by IOP Publishing Ltd
Citation: B Schottky 1995 J. Phys. A: Math. Gen. 28 4515. DOI 10.1088/0305-4470/28/16/010


Abstract

We study the generalization ability of multilayer neural networks with tree architecture for the case of random training sets. The first layer consists of K spherical perceptrons with binary output. A Boolean function B computes the final output from the K values produced by the first layer. We first calculate the learning behaviour of Gibbs learning for learnable rules, where teacher and student have the same architecture and the Boolean function B is permutation symmetric with respect to the hidden units. In the asymptotic limit of high loading, alpha to infinity (where alpha, as usual, denotes the loading parameter), we find that the generalization error vanishes in the same way for all B, because two effects cancel each other. In the opposite limit of small alpha we find qualitatively different behaviour: some networks undergo a phase transition. We show how these differences depend on certain characteristics of the Boolean function B. We then study the Bayes algorithm, which generalizes according to the majority decision of a certain ensemble of machines, and find that the parity function plays a special role: if the teacher is a parity machine, there always exists a single student parity machine that generalizes as well as the Bayes algorithm.
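To make the setup concrete, the following minimal sketch (not part of the paper) illustrates a tree-architecture machine with K hidden perceptrons whose outputs are combined by a permutation-symmetric Boolean function B, together with the majority-vote rule that the Bayes algorithm applies to an ensemble of students. All function names, sizes, and the random ensemble are illustrative assumptions; in particular, the Bayes algorithm in the paper votes over machines drawn from the version space, not over arbitrary random students.

```python
import numpy as np

def tree_machine_output(weights, x, boolean_fn):
    """Output of a tree-architecture machine.

    weights    : array of shape (K, n), one weight vector per hidden perceptron,
                 each acting on its own disjoint block of n inputs (tree architecture).
    x          : input of shape (K, n), split into the K non-overlapping receptive fields.
    boolean_fn : Boolean function B mapping the K hidden signs to a single +/-1 output.
    """
    hidden = np.sign(np.sum(weights * x, axis=1))  # K binary hidden-unit outputs
    return boolean_fn(hidden)

# Two permutation-symmetric choices of B:
def parity(hidden):
    return np.prod(hidden)           # parity machine: product of hidden signs

def majority(hidden):
    return np.sign(np.sum(hidden))   # committee machine: majority of hidden signs (K odd)

def ensemble_vote(ensemble, x, boolean_fn):
    """Majority decision over an ensemble of student machines, in the spirit of
    the Bayes algorithm (here the ensemble is just a list of weight arrays)."""
    votes = [tree_machine_output(w, x, boolean_fn) for w in ensemble]
    return np.sign(np.sum(votes))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, n = 3, 50                                    # hypothetical sizes
    x = rng.standard_normal((K, n))
    teacher = rng.standard_normal((K, n))
    students = [rng.standard_normal((K, n)) for _ in range(11)]
    print("teacher (parity):", tree_machine_output(teacher, x, parity))
    print("ensemble vote   :", ensemble_vote(students, x, parity))
```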

