Slow stochastic learning with global inhibition: a biological solution to the binary perceptron problem
Introduction
The strength of biological synapses can only vary within a limited range, and there is accumulating evidence that some synapses can only preserve a restricted number of states (some seem to have only two [4]). These constraints have dramatic effects on networks performing as classifiers or as associative memories. Networks of neurons connected by bounded synapses which cannot be changed by an arbitrarily small amount share the palimpsest property (see e.g. [2]): new patterns overwrite the oldest ones, and only a limited number of patterns can be remembered. The more synapses are changed on each stimulus presentation, the faster the forgetting. Moreover, learning to separate two classes of patterns with discrete synaptic weights is a combinatorially hard problem (the ‘binary perceptron problem’, see [1]). Fast forgetting can be avoided by changing only a small fraction of synapses, chosen randomly at each presentation. Stochastic selection permits the classification and memorization of an extensive number of random patterns, even if the number of synaptic states is reduced to two [2]. However, additional mechanisms must be introduced to store more realistic patterns with correlated components. The solution we study here is based on the perceptron learning rule: the synapses are changed with some probability only when the response of the postsynaptic cell is not the desired one. This ‘stop-learning’ property might be the expression of some regulatory synaptic mechanism or the effect of a reward signal. Together with global inhibition, a small synaptic transition probability and a small neuronal threshold are sufficient to learn and memorize any linearly separable set of patterns.
Section snippets
The model
Neuron model: We consider a single postsynaptic neuron which receives excitatory inputs from N presynaptic neurons, and an inhibitory input which is proportional to the total activity of the N excitatory neurons. The postsynaptic neuron is either active or inactive, depending on whether the total postsynaptic current h is above or below a threshold θ0. The total current is the weighted sum of the synaptic inputs ξj, h = Σj Jj ξj, where the Jj are the binary synaptic weights and ξj can take on any value from …
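To make the model concrete, the following minimal Python sketch computes the postsynaptic response; the 1/N normalization, the inhibitory gain g_I and the numerical values of θ0 are illustrative assumptions, not the paper's exact parametrization.

import numpy as np

def postsynaptic_response(xi, J, theta0=0.01, g_I=0.5):
    """Binary response of the postsynaptic neuron.

    xi     : presynaptic activities (length-N array)
    J      : binary excitatory weights, J[j] in {0, 1}
    theta0 : small neuronal threshold
    g_I    : gain of the global inhibition (assumed proportional to the
             total presynaptic activity)
    """
    N = len(xi)
    excitation = np.dot(J, xi) / N      # weighted sum of the synaptic inputs
    inhibition = g_I * np.sum(xi) / N   # global inhibition ~ total activity
    h = excitation - inhibition         # total postsynaptic current
    return int(h >= theta0)             # active iff the current exceeds the threshold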
Results
Given any two sets C± of linearly separable patterns, a neuron endowed with global inhibition and the stochastic learning rule described above will always learn to correctly classify the patterns in a finite number of presentations. The tighter the separation between the two classes C±, the smaller the neuronal threshold θ0, the learning margin δ0, and the learning rate q must be (for simplicity we assume q+=q−=q). More precisely, we assume that there is a separation vector S of length ||S||=N
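The update rule itself can be sketched as follows; this is one reading of the rule described in the Introduction (synapses change with small probability q only on erroneous responses), not the paper's exact algorithm. The handling of the learning margin δ0 and the inhibitory gain g_I are assumptions.

import numpy as np

def stochastic_update(xi, J, target, theta0=0.01, delta0=0.005, q=0.01,
                      g_I=0.5, rng=None):
    """One pattern presentation with desired output `target` (0 or 1).

    Synapses on active inputs are switched with small probability q, and only
    when the response is not the desired one (or the current lies within
    delta0 of the threshold) -- the 'stop-learning' property.  As in the text,
    q+ = q- = q is assumed.
    """
    if rng is None:
        rng = np.random.default_rng()
    N = len(xi)
    h = (np.dot(J, xi) - g_I * np.sum(xi)) / N   # same current as in the sketch above
    if target == 1 and h < theta0 + delta0:
        # potentiate: silent synapses on active inputs flip 0 -> 1 with probability q
        candidates = (J == 0) & (xi > 0)
        J = np.where(candidates & (rng.random(N) < q), 1, J)
    elif target == 0 and h >= theta0 - delta0:
        # depress: potentiated synapses on active inputs flip 1 -> 0 with probability q
        candidates = (J == 1) & (xi > 0)
        J = np.where(candidates & (rng.random(N) < q), 0, J)
    return J

Repeatedly presenting the patterns of C± with their desired outputs and applying this update is the kind of procedure to which the convergence statement above refers.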
Conclusions
We have shown that stochastic learning allows a perceptron with binary excitatory weights to converge in a finite number of updates for any separable set of patterns, provided that there is some global inhibition, a small neuronal threshold, and slow learning. These ingredients rescue binary synapses from fast forgetting due to saturation of the potentiation probabilities. They also allow the storage of as many patterns as in a network with analogue unbounded synapses (proportional to N^α, with α from
Acknowledgements
This work was supported by the EU Grant IST-2001-38099 ALAVLSI and the SNF Grant 3152-065234.01. We thank J. Brader for useful remarks.
References (4)
- M.R. Garey, D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (1999)
- S. Fusi, Hebbian spike-driven synaptic plasticity for learning patterns of mean firing rates, Biol. Cybern. (2002)
Cited by (5)
Generalization of finite size Boolean perceptrons with genetic algorithms
2008, Neurocomputing
Citation Excerpt: Using statistical mechanics techniques, Gardner and Derrida [7] introduced the study of Boolean perceptrons with discrete weights without correlation among input patterns and the corresponding outputs, the so-called random map. For finite-size networks, learning in the binary perceptron has been investigated with different techniques, such as the slow stochastic process devised by Senn et al. [19,20] and clipping of continuous-weight perceptrons [5,13,14,17]. Random input–output associations were investigated by Baldassi et al. [2], who showed that on-line supervised algorithms provide fast learning of random input–output associations, up to close to the theoretical capacity [12].
Are binary synapses superior to graded weight representations in stochastic attractor networks?
2009, Cognitive Neurodynamics

Multiple views of the response of an ensemble of spectro-temporal features support concurrent classification of utterance, prosody, sex and speaker identity
2005, Network: Computation in Neural Systems

Convergence of stochastic learning in perceptrons with binary synapses
2005, Physical Review E - Statistical, Nonlinear, and Soft Matter Physics