Emergent Dynamical Properties of the BCM Learning Rule

The Bienenstock–Cooper–Munro (BCM) learning rule provides a simple setup for synaptic modification that combines a Hebbian product rule with a homeostatic mechanism that keeps the weights bounded. The homeostatic part of the learning rule depends on the time average of the post-synaptic activity and provides a sliding threshold that distinguishes between increasing or decreasing weights. There are, thus, two essential time scales in the BCM rule: a homeostatic time scale, and a synaptic modification time scale. When the dynamics of the stimulus is rapid enough, it is possible to reduce the BCM rule to a simple averaged set of differential equations. In previous analyses of this model, the time scale of the sliding threshold is usually faster than that of the synaptic modification. In this paper, we study the dynamical properties of these averaged equations when the homeostatic time scale is close to the synaptic modification time scale. We show that instabilities arise leading to oscillations and in some cases chaos and other complex dynamics. We consider three cases: one neuron with two weights and two stimuli, one neuron with two weights and three stimuli, and finally a weakly interacting network of neurons.


Introduction
For several decades now, the topic of synaptic plasticity has remained relevant. A pioneering theory on this topic is the Hebbian theory of synaptic modification [1,2], in which Donald Hebb proposed that when neuron A repeatedly participates in firing neuron B, the strength of the action of A onto B increases. This implies that changes in synaptic strengths in a neural network is a function of the pre-and post-synaptic neural activities. A few decades later, Nass and Cooper [3] developed a Hebbian synaptic modification theory for the synapses of the visual cortex, which was later extended to a threshold dependent setup by Cooper et al. [4]. In this setup, the sign of a weight modification is based on whether the post-synaptic response is below or above a static threshold. A response above the threshold is meant to strengthen the active synapse, and a response below the threshold should lead to a weakening of the active synapse.
One of the widely used models of synaptic plasticity is the Bienenstock-Cooper-Munro (BCM) learning rule with which Bienenstock et al. [5]-by incorporating a dynamic threshold that is a function of the average post-synaptic activity over time-captured the development of stimulus selectivity in the primary visual cortex of higher vertebrates. In corroborating the BCM theory, it has been shown that a BCM network develops orientation selectivity and ocular dominance in natural scene environments [6,7]. Although the BCM rule was developed to model selectivity of visual cortical neurons, it has been successfully applied to other types of neurons. For instance, it has been used to explain experience-dependent plasticity in the mature somatosensory cortex [8]. Furthermore the BCM rule has been reformulated and adapted to suit various interaction environments of neural networks, including laterally interacting neurons [9,10] and stimuli generalizing neurons [11]. The BCM rule has also been in the center of the discussion as regards the relationship between rate-based plasticity and spike-time dependent plasticity (STDP); it has been shown that the applicability of the BCM formulation is not limited to rate-based neurons but under certain conditions extends to STDP-based neurons [12][13][14].
Based on the BCM learning rule, a few data mining applications of neuronal selectivity have emerged. It has been shown that a BCM neural network can perform projection pursuit [7,15,16], i.e. it can find projections in which a data set departs from statistical normality. This is an important finding that highlights the feature detecting property of a BCM neural model. As a result, the BCM neural network has been successfully applied to some specific pattern recognition tasks. For example Bachman et al. [17] incorporated the BCM learning rule in their algorithm for classifying radar data. Intrator et al. developed an algorithm for recognizing 3D objects from 2D view by combining existing statistical feature extraction models with the BCM model [18,19]. There has been a preliminary simulation on how the BCM learning rule has the potential to identify alpha numeric letters [20].
Mathematically speaking, the BCM learning rule is a system of differential equations involving the synaptic weights, the stimulus coming into the neuron, the activity response of the neuron to the stimulus, and the threshold for the activity. Unlike its predecessors, which use static thresholds to modulate neuronal activity, the BCM learning rule allows the threshold to be dynamic. This dynamic threshold provides stability to the learning rule, and from a biological perspective, provides homeostasis to the system. Treating the BCM learning rule as a dynamical system, this paper explores the stability properties and shows that the dynamic nature of the threshold guarantees stability only in a certain regime of homeostatic time scale. This paper also explores the stability properties as a function of the relationship between homeostasis time scale and the weight time scale. Indeed, there is no biological reason why the homeostatic time scale should be dramatically shorter than the synaptic modification time scale [21], so in this paper, we relax those restrictions. In Sect. 3, we illustrate a stochastic simulation in the simplest case of a single neuron with two weights and two different competing stimuli. We derive the averaged mean field equations and show that there are changes in the stability as the homeostatic time constant changes. In Sect. 4, we continue the study of a single neuron, but now assume that there are more inputs than weights. Here, we find rich dynamics including multiple period-doubling cascades and chaotic dynamics. Finally, in Sect. 5, we study small linearly coupled networks and prove stability results while uncovering more rich dynamics.

Methods
The underlying BCM theory expresses the changes in synaptic weights as a product of the input stimulus pattern vector, x, and a function, φ. Here, φ is a nonlinear function of the post-synaptic neuronal activity, v, and a dynamic threshold, θ , of the activity (see Fig. 1A).
If at any time, the neuron receives a stimulus x from a stimulus set, say {x (1) , x (2) , . . . , x (n) }, the weight vectors, w, evolve according to the BCM rule as θ is sometimes referred to as the "sliding threshold" because, as can be seen from Eq. (1), it changes with time, and this change depends on the output v, the sum of the weighted input to the neuron, v = w · x. φ has the following property: for low values of the post-synaptic activity (v < θ ), φ is negative; for v > θ, φ is positive. In the results presented by Bienenstock et al. [5], is a running temporal average of v and the learning rule is stable for p > 1. Later formulations of the learning rule (for instance by [7]) have shown that a spatial average can be used in lieu of a temporal average, and that with p = 2, E[v p ] is an excellent approximation of E p [v]. We can also replace the moving temporal average of v with first order low-pass filter. Thus a differential form of the learning rule is where τ w and τ θ are time-scale factors, which in simulated environments, can be used to adjust how fast the system is changing with respect to time. We point out that this When τ θ /τ w = 0.25, response converges to a steady state and neuron selects stimulus x (1) . (Here, the stimuli are x (1) = (cos α, sin α) and x (2) = (sin α, cos α) with α = 0.3926, the stimuli switch randomly at a rate 5, and τ w = 25.) (C) When τ θ /τ w = 1.7, responses oscillate but the neuron still selects stimulus x (1) . (D) When τ θ /τ w = 2.5, neuron is no longer selective is the version of the model that is found in Dayan and Abbott [22]. We point out that the vector input, x is changing rapidly compared to θ and w, so that Eq. (2) is actually a stochastic equation. The stimuli, x are generally taken from a finite set of patterns, x (k) and are randomly selected and presented to the model.

Results I: One Neuron, Two Weights, Two Stimuli
For a single linear neuron that receives a stimulus pattern x = (x 1 , . . . , x n ) with synaptic weights w = (w 1 , . . . , w n ), the neuronal response is v = w · x. The results we present in this section are specific to when n = 2 and when there are two pat-terns. In this case, the neuronal response is v = w 1 x 1 + w 2 x 2 . In the next section, we explore a more general setting.

Stochastic Experiment
A good starting point in studying the dynamical properties of the BCM neuron is to explore the steady states of v for different time-scale factors of θ . This is equivalent to varying the ratio τ θ /τ w in Eq. (2). We start with a BCM neuron that receives a stimulus input x stochastically from a set {x (1) , x (2) } with equal probabilities, that is, We create a simple hybrid stochastic system where the value of x switches between the pair {x (1) , x (2) } at a rate λ as a two state Markov process. At steady state, the neuron is said to be selective if it yields a high response to one stimulus and a low (≈ 0) response to the other.
Figures 1B-D plot the neuronal response v as a function of time. In each case, the initial conditions of w 1 , w 2 and θ lie in the interval (0, 0.3). The stimuli are x (1) = (cos α, sin α) and x (2) (1) is the response of the neuron to the stimulus x (1) and v 2 = w · x (2) is the response of the neuron to the stimulus x (2) . In each simulation, the presentation of stimulus is a Markov process with rate λ = 5 presentations per second. When τ θ /τ w = 0.25, Fig. 1B shows a stable selective steady state of the neuron. At this state, v 1 ≈ 2 while v 2 ≈ 0, implying that the neuron selects x (1) . This scenario is equivalent to one of the selective steady states demonstrated by Bienenstock et al. [5].
When the threshold, θ changes slower than the weights, w, the dynamics of the BCM neuron take on a different kind of behavior. In Fig. 1C, τ θ /τ w = 1.7. As can be seen, there is a difference between this figure and Fig. 1B. Here, the steady state of the system loses stability and a noisy oscillation appears to emerge. The neuron is still selective since there is a large enough empty intersection between these ranges of oscillation.
Setting the time-scale factor of θ to be a little more than twice that of w reveals a different kind of oscillation from the one seen in Fig. 1C. In Fig. 1D where τ θ /τ w = 2.5, the oscillation has very sharp maxima and flat minima and can be described as an alternating combination of spikes and rest states. As can be seen, the neuron is not selective.

Mean Field Model
The dynamics of the BCM neuron (Eq. (2)) is stochastic in nature, since at each time step, the neuron randomly receives one out of a set of stimuli. One way to gain more insight into the nature of these dynamics is to study a mean field deterministic approximation of the learning rule. If the rate of change of the stimuli is rapid compared to that of the weights and threshold, then we can average over the fast time scale to get a mean field or averaged model and then study this through the usual methods of dynamical systems. Consider a BCM neuron that receives a stimulus input x, stochastically from the set {x (1) = (x 11 , x 12 ), x (2) = (x 21 , x 22 )} such that P r[x(t) = x (1) ] = ρ and P r[x(t) = x (2) ] = 1 − ρ. A mean field equation for the synaptic weights iṡ Now let the responses to the two stimuli be v 1 = w · x (1) and v 2 = w · x (2) . With this, changes in the responses can be written aṡ So a mean field equation in terms of the responses is This equation is our starting point for the analysis of the effects of changing the timescale factor of θ , τ θ . Thus all that matters with regard to the time scales is the ratio, τ = τ θ /τ w . We note that we could also write down the averaged equations in terms of the weights, but the form of the equations is much more cumbersome. We now look for equilibria and the stability of these fixed points. We note that if the two stimuli are not collinear and ρ ∈ (0, 1), thenv 1,2 = 0 if and only if v j (v j − θ) = 0. Using the fact that at equilibrium, θ = ρv 1 2 which gives the fixed points The fixed points ( 1 ρ , 0, 1 ρ ) and (0, 1 1−ρ , 1 1−ρ ) are stable (as we will see) for small enough τ and selective, while (0, 0, 0) and (1, 1, 1) are neither stable nor selective. Bienenstock et al. [5] discussed the stability of these fixed points as they pertain to the original formulation. Castellani et al. [9] and Intrator and Cooper [7] gave a similar treatment to the objective formulation. In Sect. 3.4, it will be shown that the stability of ( 1 ρ , 0, 1 ρ ) and (0, 1 1−ρ , 1 1−ρ ) depends on the angle between the stimuli, the amplitude of the stimuli, ρ, and the ratio of τ θ to τ w .

Oscillatory Properties: Simulations
As seen in the preceding section, the fixed points to the mean field BCM equation are invariant (with regards to stimuli and synaptic weights) and depend only on the probabilities with which the stimuli are presented. The stability of the selective fixed points, however, depends on the time-scale parameters, the angular relationship between the stimuli, and the amplitudes of the stimuli. To get a preliminary understanding of this property of the system, consider the following simulations of Eq. (4); each with different stimulus set characteristics. We remark that because Eq. (4) depends only on  (1), sin(1)), τ θ /τ w = 1.5; (D) ρ = 0.5, x (2) = 1.5(cos(1), sin(1)), τ θ /τ w = 0.8 the inner product of stimuli, equal rotation of both has no effect on the equations. What matters is the magnitude, angle between them, and frequency.
Simulation A: orthogonal, equal magnitudes, equal probabilities Let ρ = 0.5, x (1) = (1, 0), x (2) = (0, 1). In this case, the two stimuli have equal magnitudes, are perpendicular to each other, and are presented with equal probabilities. Figure 2(A) shows the evolution of v 1 and v 2 in the last 100 time-steps of a 400 time step simulation. The dashed line shows the unstable non-zero equilibrium point. For τ ≡ τ θ /τ w = 1.1, there is a stable limit cycle oscillation of v 1 . Since the stimuli are orthogonal, v 2 (t) = 0 is an invariant set.
Simulation B: non-orthogonal, equal magnitudes, equal probabilities Let ρ = 0.5, (1), sin(1)), τ = 1.5. In this case, the two stimuli have equal magnitudes, are not perpendicular to each and are presented with equal probabilities. Figure 2(B) shows an oscillation, but now v 2 oscillates as well since the stimuli are not orthogonal.
These four examples demonstrate that there are oscillations of various shapes and frequencies that arise pretty generically no matter what the specifics of the mean field model are; they can occur in symmetric cases (e.g. simulation A) or with more general parameters as in B-D. We also note that to get oscillatory behavior in the BCM rule, we do not even need τ θ > τ w as seen in example D. We will see shortly that the oscillations arise from a Hopf bifurcation as the parameter, τ increases beyond a critical value. To find this value, we perform a stability analysis of the equilibria for Eq. (4).

Stability Analysis
We begin with a very general stability theorem that will allow us to compute stability for an arbitrary pair of vectors and arbitrary probabilities of presentation. Looking at Eq. (4), we see that by rescaling time, we can assume that x (1) · x (1) = 1 without loss of generality. To simplify the calculations, we (2) , and c = ρ/(1 − ρ). Note that a > b 2 by the Schwartz inequality and that c ∈ (0, ∞) with c = 1 being the case of equal probability.
For completeness, we first dispatch with the two non-selective equilibria, (1, 1, 1) and (0, 0, 0). At (1, 1, 1), it is easy to see that the characteristic polynomial has a constant coefficient that is Linearization about (0, 0, 0) yields a matrix that has double zero eigenvalue and a negative eigenvalue, −1/τ . Since the only linear term in Eq. (4) is −θ/τ , the center manifold is parameterized by (v 1 , v 2 ) and first terms in a center manifold calculation for θ are θ = ρv 2 1 This term only contributes cubic terms to the v 1 , v 2 right-hand sides so that to quadratic order: Hence, We claim that there exists a solution to this equation of the form, v 2 = Kv 1 for a constant K > 0. Plugging in this assumption we see that K satisfies If b ≥ 0, then clearly v 1 (t) goes away from the origin, which implies that (0, 0, 0) is unstable. If b < 0, the singularity occurs when K 2 = −cb/a and the root to H (K) = 1/K is less than −cb/a. This yields and, again, using the fact that b 2 < a, we see that v 1 leaves the origin. Thus, we have proven that (0, 0, 0) is unstable. We now have to look at the stability of the selective equilibria: 2 , since the latter has different stability properties if the parameter a > 1. The Jacobian matrix for the right-hand sides of Eq. (4) around z 1 is From this we get the characteristic polynomial: Equilibria are stable if these three coefficients are positive and from the Routh-Hurwitz criterion, A 11 A 12 − A 10 := R 1 > 0. We note that A 10 > 0 since c > 0 (unless ρ = 0) and a > b 2 . This means that no branches of fixed points can bifurcate from the equilibrium point; that is there are no zero eigenvalues. For τ small R 1 ∼ (1 + ac)/τ 2 > 0 and the other coefficients are positive, so the rest state is asymptotically stable. A Hopf bifurcation will occur if R 1 = 0 and A 10 > 0 and A 12 > 0. Setting R 1 = 0 yields the quadratic equation: In the "standard" case (e.g. as in Fig. 2B), we have a = c = 1 and Q 1 A similar calculation can be done for the fixed point z 2 . In this case, the coefficients of the characteristic polynomial are As with the equilibrium z 1 , there can be no zero eigenvalue and A 20 is positive except at the extreme cases where c = 0 or a = b 2 . The Routh-Hurwitz quantity, We note that when a = c = 1, we recover Eq. (8). For τ sufficiently small, z 2 is asymptotically stable. We now use Eqs. (7) and (9) to explore the stability of the two solutions as a function of τ . We have already eliminated the possibility of losing stability through a zero eigenvalue since both A 10 , A 20 are positive. Thus, the only way to lose stability is through a Hopf bifurcation which occurs when either of Q 1,2 R (τ ) vanishes. We can use the quadratic formula to solve for τ for each of these two cases, but one has to be careful since the coefficient of τ 2 vanishes when c = a or c = 1/a. Figure 3 shows stability curves as different parameters vary. In panel A, we use the standard setup ( Fig. 2B) where ρ = 0.5, the stimuli are unit vectors ((1, 0) and (cos α, sin α)), and α denotes the angle between the vectors. The curve is explicitly obtained from Eq. (8), with b = cos α. For any τ above τ c , either of the two selective equilibria is unstable. In Fig. 3B, we show the dependence of τ c on ρ, the frequency of a given stimulus. All values of τ c are greater than or equal to 1, so that in order to get instability the time-scale factor, τ θ , of homeostasis must be more than or equal to that of the weights, τ w . In panel C, we show the dependence on the amplitude, A, where x (2) = A(cos α, sin α). This figure shows two curves: the red curve give τ c for the equilibrium, (v 1 , v 2 , θ) = (2, 0, 2) while the black curve is for (v 1 , v 2 , θ) = (0, 2, 2). The latter equilibrium can lose stability at arbitrarily low values of τ if the amplitude is large enough. Indeed, τ c ∼ 1/A 2 as A → ∞.
We summarize the results in this section with the following theorem.
Theorem 3.1 Assume that the two stimuli are not collinear and that ρ ∈ (0, 1). Then there are exactly four equilibria to Eq. (4): • z 2 is linearly asymptotically stable if and only if • If a = 1 (that is, the stimuli have equal amplitude), then z 1,2 are linearly asymptotically stable if and only if

Bifurcation Analysis
The previous section shows that as the ratio τ increases, the two selective equilibria lose stability via a Hopf bifurcation. We now use numerical methods to study the behavior as τ increases. As the stability theorem shows, if the amplitude of the two stimuli are the same, then the stability is exactly the same for both, no matter what the other parameters. We will fix ρ = 0.5, and x (1) (1), sin(1)) and let τ vary. In Fig. 4A, we show the case A = 1 so that both stimuli have the same magnitudes. As τ increases, each of the selective equilibria loses stability at the same value of τ , here given by τ c = 1/(1 − cos(1) 2 ) = 1.412 (cf. Eq. (8)). At this point a stable limit cycle bifurcates and exists up until τ ≡ τ H C ≈ 3.2 where the orbit appears to be homoclinic to the nonlinear saddle at the origin. (Note that near the homoclinic, there are some numerical issues with the stability; we believe that the branch is stable all the way up to the homoclinic.) We remark that the dynamics for τ slightly larger than τ H C is difficult to analyze; while the origin is unstable, it has stable directions and it appears that all initial data eventually converge to it. For τ large enough, we have found that solutions blow up in finite time.
If the amplitude of x (2) is different from that of x (1) , then the theorem shows that the two selective equilibria have different stability properties. Figure 4C shows the bifurcation diagram for A = 1.5. When we follow the stability of z 1 = (2, 0, 2) (shown as the lower curve labeled 1), there is a Hopf bifurcation at τ ≈ 1.52 and a stable branch of periodic orbits bifurcates from it that persists up until τ ≈ 1.94 where it bends around (LP), becomes unstable, and terminates on the symmetric unstable equilibrium, (v 1 , v 2 , θ) = (1, 1, 1) at a Hopf bifurcation (τ ≈ 0.79) for this equilibrium, labeled Hs. Figure 4D shows the small amplitude periodic orbit at τ = 1.7 projected in the (v 1 , v 2 ) plane where it is centered around (v 1 , v 2 ) = (2, 0). The up-per curve in panel C (labeled 2) shows the stability of z 2 = (0, 2, 2) as τ varies. Here, there is a Hopf bifurcation at τ ≈ 0, 5 and a stable branch of periodic orbits bifurcates from the equilibrium. The branch terminates at a homoclinic orbit at τ ≈ 1.35. Figure 4D shows an orbit for τ = 0.7 that surrounds (v 1 , v 2 ) = (2, 0).
In sum, in this section we have analyzed a very simple BCM model where there are two stimuli, two weights, and one neuron. We have shown that if the time-scale factor (τ θ ) of the homeostatic threshold, θ is too slow relative to the time-scale factor of the weights, then, the selective equilibria lose stability via a Hopf bifurcation and limit cycles emerge. For very large ratios, τ = τ θ /τ w , solutions become unbounded and intermediate values of τ , the origin becomes an attractor even though it is unstable. In the next section, we consider the case when there are more stimuli than there are weights and, in the subsequent section, we consider small coupled networks.

Results II: One Neuron, n Weights, m Stimuli
We next consider the general scenario where a single neuron receives an ndimensional input selected from m different possibilities with probability p k , k = 1, . . . , m. We will label the stimuli x kj with j running from 1, . . . , n, and k as above. The weights are w 1 , . . . , w n and the response of a neuron to stimulus k is If the weights and the threshold change slowly compared to the change in the stimulus presentation, then the differential equations for the BCM rule can be averaged over the inputs: where v k is given in Eq. (10). We note that, classically, what is of interest is not the evolution of the weights, but rather the evolution of the responses. Using Eq. (10), we see that where x k is the vector whose entries are (x k1 , . . . , x kn ). It is very clear that using this formulation, the equations are very simple. Let X denote the matrix whose entries are x kj ; it is an m × n matrix. If n = m, then X is square, and if it is invertible, then the two formulations with respect to the weights and the responses are equivalent. That is, v(t) = X w(t). However, if n = m, then there will be some degeneracy with respect to the two formulations. Most typically, the dimension of the stimulus space will be larger than the dimension of the weight space (m > n) and in this case there will be degeneracy with respect to the responses. As should be clear from the two formulations, the equations are much simpler in the response space, so that this is the preferred set of ODEs and thus there will be redundancy in the equations. That is, there will be m − n linearly independent vectors, q i such that q T i X = 0. This implies that here will be m − n constants of motion in the response space: Thus, in the case when m > n, we still need only study the n + 1-dimensional dynamical system consisting of n choices of the v k along with the m − n linear constraints (12).

Example: n = 2, m = 3
As an example of the kinds of dynamics that is possible, we will consider m = 3 and n = 2 where the three stimuli are (1, 0), (cos α, sin α), and (cos β, sin β) and these are distributed with equal probability. In this case, the equations for v k , θ are with c ll = 1, c lk = c kl , c 12 = cos α, c 13 = cos β, and c 23 = cos(α − β). Since there are two weights and three stimuli, we can reduce the dimension by 1 with the constraint: where e 1 = cos α cos(α − β), e 2 = − sin α sin β and e 3 = sin(α) 2 . As long as one of these is non-zero (which will happen if the vectors are not all collinear), we can solve for one of the v k and reduce the dimension by 1.
In the example that we analyze here, we fix α = 0.92 and β = 2.5 and eliminate v 3 . This leaves two parameters, τ ≡ τ θ /τ w and C, the constant of integration. Equilibria are independent of τ but the existence of limit cycles and other complex dynamics obviously depends on τ . Figure 5 shows the dynamics as C is varied for different values of the ratio τ .  Thin black lines are unstable equilibria, red are stable equilibria, green and blue circles are stable and unstable limit cycles. PD is for period-doubling bifurcation; CH for chaos, HOM for homoclinic. Arrows in C correspond to chaotic behavior shown in Fig. 6 all τ > 1.293, the upper branch has two Hopf bifurcations (labeled a, b) so that we can expect the possibility of periodic behavior. The curve of the Hopf bifurcations is more complicated for the isola. We first note that the upper part of the isola always has one real positive eigenvalue, so that it is unstable for all τ . The lower part of the isola has a negative real eigenvalue and its stability depends on τ . Returning to the Hopf bifurcations on the isola of equilibria (shown in red in panel A), we see that there can be 1, 2 or 3 Hopf bifurcations as C changes. We label these c, d, e. Since there are generally two Hopf bifurcations on the main branch of equilibria, there can be up to five Hopf bifurcations for a given value of τ as C increases. We start with τ = 1.6 (panel B). For this value of τ , we see it is below the minimum for which there are Hopf bifurcations on the isola, so all the bifurcations appear on the main branch. Both bifurcations are supercritical and lead to small amplitude stable oscillations that grow in amplitude. The branches of periodic orbits arising from the two Hopf bifurcations are joined and thus represent a single continuous branch. However, the branch starting at a loses stability via a period-doubling bifurcation (PD in panel B) at C ≈ 0.177. There does not appear to be any chaotic behavior that we have been able to find. For τ = 1.8, shown in panel C, we see that the branch of periodic orbits that bifurcated from the main branch (at points a, b), has split into two separate branches that terminate on Hopf bifurcations of the upper branch of the isola (points c, d). The left branch that joins a and c also undergoes a period-doubling bifurcation (PD) and for a limited range of C, there appears to be chaos in the dynamics; specifically around C = 0.18. Two arrows delimit the range of parameters that are shown in Fig. 6. For τ = 2.3, 2.54, there are 5 Hopf bifurations and as with τ = 1.8 the periodic orbits arising from point a join with those on point c and those arising from b join with the branch arising from d. The branch of stable periodic orbits arising from the Hopf bifurcation at e is lost at a homoclinic labeled Hom in panel E. There is a small regime of chaotic behavior for τ = 2.3 shown in panel D, but we find no chaos when τ = 2.54, For larger values of τ , there are three Hopf bifurcations (a, b, d). The bifurcations c,e merge and disappear so that all the equilibria on the isola are unstable. The branch of periodic orbits arising from d, becomes disconnected from the branch arising from b while the branch of orbits arising ftom b joins the branch arising from a. Other than the unique stable equilibrium when C is large or small, there is only a principal branch of stable periodic orbits between the Hopf bifurcations a and b. There are other complex structures, but none of them are stable. Figure 6 shows some probable chaos for τ = 1.8 and C ∈ [0, 0.25]. Panel A shows a trajectory projected in the v 1 − θ plane for C = 0.18. Panel B shows the evolution of the attracting dynamics as C varies. We take a Poincaré section at v 2 = 2 and plot the successive values of θ after removing transients and letting C vary between 0 and 0.25. As C increases, there is a periodic orbit that undergoes multiple period-doubling bifurcations before becoming chaotic. There are several regions showing period three orbits (C ≈ 0.1, C ≈ 0.175, C ≈ 0.21) as well as many regions with complex behavior. The chaos and periodic dynamics terminates near C = 0.237, which is the value of C at which the lower stable branch of equilibria in the isola begins. Chaos and similar complex dynamics occurs for other values of τ .
In this section, we have shown that the degeneracy that occurs when there are more stimulus patterns than weights can be resolved by finding some simple constants of motion. The resulting reduced system will always be three-dimensional. In the simplest case of three patterns and two weights, we have found rich dynamics when τ = τ θ /τ w is larger than 1.

Results III: Small Coupled Network
To implement a network of coupled BCM neurons receiving stimulus patterns from a common set, it is important to incorporate a mechanism for competitive selectivity within the network. A mechanism of this sort, found in visual processes [23] (and also in tactile [24], auditory [25], and olfactory processing [26]) is called lateral inhibition, during which an excited neuron reduces the activity of its neighbors by disabling the spreading of action potentials to neighboring neurons in the lateral direction. This creates a contrast in stimulation that allows increased sensory perception.
Consider a network of N mutually inhibiting neurons. At any time, be the net activities of the neurons. Let {s i } N i=1 be the partial activities induced by a stimulus x i.e. s i = w i · x where w i is the synaptic weight vector for neuron i. At any given time, the activity, v i of the linear neuron i (where i ∈ {1, 2, . . . , N}) follows the differential equation: where I i is the external input to the neuron [27]. Since neuron i is inhibited by its neighbors where γ is the inhibition parameter, measuring the amount of inhibition that i gets. Therefore At a steady state, this equation becomes Thus, the overall activity of the network can be expressed as Then we can write Eq. (15) as .
Linearizing around the steady state solution of Eq. (14), we obtain the Jacobian Notice that (1, 1, . . . , 1) T is an eigenvector of M with corresponding eigenvalue −(1 + Nγ ). This eigenvalue is negative when γ > −1/N . Also notice that M can be written as  (1) then Thus u is an eigenvector of M corresponding to the eigenvalue γ − 1. This eigenvalue is negative when 0 < γ < 1. Thus whenever G is invertible, the system is also stable. Now consider two neurons a and b who mutually inhibit each other and, at any instant, receive the same stimulus pattern x, with synaptic weight vectors w a (for neuron a) and w b (for neuron b). Let their responses to x be s a and s b , and their net responses (after accounting for inhibition) be v a and v b . Finally, let the dynamic threshold to v a and v b be θ a and θ b , respectively. The BCM learning rule of these two neurons is given by where s a = w a · x and s b = w b · x and thus
The rate of change of v a is given bẏ A similar expression exists for v b . Assume that x is from the set {x (1) = (x 11 , x 12 ), Then in terms of the responses, the mean field version of the BCM rule for two mutually inhibiting neurons a and b is derived as follows: Observing that each of ρ, x (1) , and x (2) is non-zero, and setting the right-hand side of Eq. (22) to zero yields in these fixed points correspond to the symmetric variants of the last equilibrium, for example swapping the (1, 1, 1) and the ( 1 ρ , 0, 1 ρ ) or swapping the latter triplet for (0, 1 1−ρ , 1 1−ρ ). Castellani et al. [9] and Intrator and Cooper [7] give a detailed analysis on the stability of most of these fixed points in the limit of τ θ → 0. They showed that (0, 0, 0, 0, 0, 0) and (1, 1, 1, 1, 1, 1) are unstable and the fully selective fixed points are stable. This leaves the fixed points of the form (1, 1, 1, 1 ρ , 0, 1 ρ ). We address these below for our particular choice of stimuli.

Stability of the Selective Equilibria
We now consider the stability of these equilibria in the simplified case of the previous paragraph (ρ = 0.5, x (1) , x (2) are unit vectors with x (1) · x (2) = β = cos(α) where α is the angle between them). We exploit the symmetry of the resulting equations to factor the characteristic polynomial into the product of two cubic polynomials and then use the Routh-Hurwitz criteria. We have made use of the symbolic capabilities of Maple. Again, let β = cos(α), g = 1/(1 − γ 2 ) and h = γ /(1 − γ 2 ). The linearization of the symmetric equilibrium (v a1 , v a2 , θ a , v b1 , v b2 , θ b ) = (2, 0, 2, 2, 0, 2) is This matrix is clearly block symmetric with 3 × 3 blocks G, H . The the stability is thus found by studying M 1 = G + H and M 2 = G − H . Let a 1 = g − h and a 2 = g + h. Then the blocks have the form The characteristic polynomial of M j is Clearly the λ 2 coefficient and the constant coefficient are positive. The λ-coefficient could become negative if τ > 2/(a j (1 − β 2 )) ≡ τ s j 1 . The Routh-Hurwitz criterion also requires This quantity becomes negative for τ > 1/(a j (1 − β 2 )) ≡ τ s jH . Clearly τ s jH < τ s j 1 , so as τ increases there will be a Hopf bifurcation. Since 0 < a 1 < a 2 , we see that the symmetric equilibrium will be stable if and only if where we have used the definitions of g, h, β. The critical value of τ is a linear function of γ and this critical value can be arbitrarily small as γ → 1. We also remark that the critical instability is due to a 2 , which is associated with G − H . Thus, we expect a symmetry-breaking Hopf bifurcation to out-of-phase oscillations. We will numerically confirm this result in the subsequent numerical analysis. We can do a similar calculation for the antisymmetric equilibrium. Here, we just state the final result; the approach and calculations are similar. The characteristic polynomial factors. Each factor has the form The constant coefficient and the quadratic coefficient are always positive. The linear coefficient is positive as long as The additional Routh-Hurwitz condition is positive if and only if which is clearly less than τ 2± . Applying the definitions of g, h, β, yields Clearly τ A − < τ a + , so that the antisymmetric solution is stable as long as τ < τ a − ≡ τ a H . We summarize the stability results in the following theorem.  The symmetric equilibrium, (v a1 , v a2 , θ a , v b1 , v b2 , θ b ) = (2, 0, 2, 2, 0, 2) is linearly asymptotically stable if and only if Furthermore the unstable direction is antisymmentric.
We remark that, for acute angles where cos(α) > 0, the symmetric equilibrium loses stability at lower values of τ than does the antisymmetric equilibrium and for obtuse angles (cos α < 0) it is vice versa.
Partially selective equilibria. Using the same notation as for the selective equilibria, we consider the partially selective fixed points for ρ = 1/2 (e.g. (1, 1, 1, 2, 0, 2) etc.). An elementary evaluation of the constant coefficient of the characteristic polynomial yields a value: which is clearly negative. Since the product of the eigenvalues is a 0 this implies that the eigenvalues have mixed signs and these equilibria are saddle points.

Numerical Results
In this section, we study the numerical behavior of Eq. (17) for ρ = 0.5, x 1,2 unit vectors with angle α = 0.7709 as τ and γ vary. We will generally set γ = 0.25. The choice for α is somewhat arbitrary but was found to yield rich dynamics. We first study the behavior of the symmetric equilibrium (v a1 , v a2 , θ a , v b1 , v b2 , θ b ) = (2, 0, 2, 2, 0, 2) as τ increases. In Fig. 7A, we set γ = 0.2. For τ small enough, the selective symmetric equilibrium is stable, with increasing τ loses stability and we have a Hopf bifurcation (HB). A branch of periodic solutions emerges where v * 1 > v * 2 for each neuron, * = {a, b}. At a critical value of τ there is a branch point (or pitchfork) bifurcation (BP) where this selective periodic solution intersects a non-selective periodic solution. The selective periodic solutions have either v * 1 > v * 2 (1 > 2, top branch) or (2 > 1, lower branch). The non-selective branch (with a # on it) loses stability at a torus bifurcation (TR). Beyond the torus, there are, at first, stable non-selective quasi-periodic solutions, and then some possible chaos. We will look at chaotic solutions when we describe the antisymmetric behavior. Figures 7B1, 2 show the V 's for the selective and non-selective stable oscillations at values of τ denoted by the , and the (τ = 1.55, τ = 2.29, respectively). In Fig. 7C, we set γ = 0.4 and see a behavior similar to panel A, but the selective branches lose stability at a torus bifurcation at values of τ less than the branch point and this gives rise to attracting quasi-periodic selective behavior, and then, for τ a bit larger, selective chaos. For all γ , when τ is larger than about 3, the solutions produce a "spike" and then return to 0. We know that the origin is unstable, but there are some stable directions and all solutions appear 1 to approach this stable direction when τ is large enough. There are seven regions in the diagram. In R1, the equilibrium is stable; this region is delineated by the Hopf bifurcation (HB) line that we computed in Theorem 5.1, τ S H B = (1 − γ )/(1 − cos 2 α). Region R2 corresponds to a non-symmetric periodic orbit such as shown in Fig. 7B1. If γ is small (roughly, less than 0.26), then, as τ increases, there is a reverse pitchfork bifurcation (BP) to a non-selective periodic orbit (R3) such as shown in Fig. 7B2. This orbit loses stability at the non-selective torus bifurcation (NSTR) as we enter R4. In R4, there is quasi-periodic and chaotic behavior, but the dynamics lies on the four-dimensional space V a1 = V a2 , V b1 = V b2 . Passing from R2 to R5 also appears to lead to quasi-periodic and chaotic behaviors. Region R6, delineated below by the curve of limit points (LP) above, by an apparent homoclinic orbit (HC) consists of large amplitude stable periodic orbits where V a1 = V a2 and V b1 = V b2 . This branch of solutions (seen in the one-parameter diagram, Fig. 7A at the top right) does not connect to the other branches until γ is close to zero (not shown). As τ increases, the period of these orbits appears to go to infinity and they spend more and more time near the origin. We find that in R7, the origin is  Fig. 7 that divides (τ, γ ) into different regions. Labels as in Fig. 7, with NSTR corresponding to a non-selective torus bifurcation and STR, the selective one. Small letters, a, b, c, d, e correspond to the letters in Fig. 7. See text for details a global attractor, even though it is unstable. Figures 9A,B show simulations when τ is in R6 (panel A) and in R7 (panel B). Initial conditions were chosen with no special symmetry. In Fig. 9A, we see a brief transient, followed by a gap and then, eventually long period activity. In Fig. 9B, we only show the first five time units, but after t = 20,000, we still saw no return to activity.
We next turn to the behavior of the antisymmetric equilibrium, (v a1 , v a2 , θ a , v b1 , v b2 , θ b ) = (2, 0, 2, 0, 2, 2) as τ increases. Figure 10A shows the fate of this branch of solutions as τ increases for γ = 0.25 and α = 0.7709. As described in Theorem 5.1, there is a Hopf bifurcation at τ = (1 − γ cos α)/(1 − cos 2 α) and this gives rise to a branch of periodic orbits (labeled i). Figure 10B(i) shows a time series of the v a1,b2,a2,b1 which maintains the symmetry of v a1 = v b1 and v a2 = v b1 . At a critical value of τ there is symmetry-breaking bifurcation and a new branch of solutions emerges where all the v's are different. This is shown in Fig. 10B(ii). Further increases in τ lead to a pair of period-doubling bifurcations, PD1, PD2. The branch emerging from PD1 leads to a stable periodic branch, an example of which is in Fig. 10B(iii). The second branch, PD2, leads to an unstable branch of solutions and re-stabilizes the branch labeled ii. This branch and the branch labeled iii then lose stability at torus bifurcations, labeled TR1 and TR2, respectively. Below, we will explore what happens after the bifurcation at TR2. Once τ gets large enough, the dynamics appears to become symmetric with v a1 = v a2 and v b1 = v b2 where it is as in Fig. 9. Figure 11 shows the behavior of the antisymmetric branch as τ and γ change. For a fixed value of γ , as τ increases, the selective state (R 1 ) loses stability at the The branch of periodic orbits such as seen in Fig. 10B(ii) is found in R 2 and loses stability at a pitchfork bifurcation (BP). The HB and BP curves appear sequentially for all γ < 1 in contrast to Fig. 8. In the region R 3 , solutions have lost the symmetry of R 2 and resemble the solutions shown in Fig. 10B(ii). Further increases in τ lead to a periodic doubling and solutions in the region R 4 look like Fig. 10B(iii). Region R 4 is bounded by PD1 and the torus bifurcation TR2 for γ < 0.375. For γ > 0.375, instead of a torus bifurcation, there is a period-doubling cascade to chaos (not shown). Beyond the torus bifurcation, there seems to be quasi-periodic motion that persists until PD2 where the branch labeled ii stabilizes again to form a new region R 3 . This branch loses stability at a torus bifurcation TR1. For τ beyond TR1, there seems to be chaos, quasi-periodic behavior, and complex periodic orbits. Eventually, for τ large enough, the dynamics of Fig. 9 is all that remains.

Discussion
We have explored the BCM rule as a dynamical system. Although the literature does not suggest a homeostatic time-scale range that ensures stability of a biological system, we have shown that the selective fixed points of the BCM rule are generally stable when the homeostatic time scale is faster than synaptic modification time scale, and that some complex dynamics emerges as the homeostatic time scale varies. The nature of this complex dynamics also depends on the angular and amplitudinal relationships between stimuli in the stimulus set. In our analysis, the neuron is presented with stimuli that switch rapidly, so it was possible to reduce the learning rule to a simple averaged set of differential equations. We studied the dynamics and bifurcation structures of these averaged equations when the homeostatic time scale is close to the synaptic modification time scale, and found that instabilities arise, leading to oscillations and in some cases chaos and other complex dynamics. Similar results would hold if the quadratic term v 2 in the second line of Eq. (2) were replaced with v p , p > 2, since the original formulation by Bienenstock et al. [5] suggests that the fixed point structures are preserved for any positive value of p. Since the onset of the bifurcations (such as the Hopf bifurcation) depends mainly on the symmetry of these fixed points, we expect that the main results will be the same and only the particular values of parameters would change. While this paper has focused on how small changes in the time scale of a homeostatic threshold can lead to complex dynamics, there are many other kinds of homeostases [28] which present many time scales and similar opportunities for analysis.
The model neuron we used in this paper has been assumed to have linear response properties, which may be seen as oversimplified, and hence a potential problem in translating our conclusions to actual biological systems. It is well known that plasticity goes beyond synapses, and it is sometimes even a neuron-wide phenomenon [29], and that there is no unique route to regulating the sliding threshold of the BCM rule [10,30]. Thus in addition to synaptic activities, intrinsic neuronal properties may also play a role in the evolution of responses and linearity may not be able to capture this scenario. The introduction of a nonlinear transfer function to the BCM learning rule has been addressed by Intrator and Cooper [7]. In their formulation, the learning rule is derived as a gradient descent rule on an objective function that is cubic in the nonlinear response. Our decision to use linear units is motivated by the accessibility to formal analysis. Biologically, linearity can be justified if we assume that the underlying biochemical mechanisms are governed by membrane voltage rather than firing rate; see, for example Clopath and Gerstner [31].
The theoretical contributions of this paper are based on an analysis that we did using a mean field model of the BCM learning rule. Similar mean field models have been made, but in terms of synaptic weights; see, for example Yger and Gilson [32]. With this approach to the mean field, it is difficult to arrive at the fact that the fixed points-not their stability-of the learning rule depend only on the probabilities with which each stimulus is presented. In this paper, we have given a derivation of the mean field model of the BCM learning rule as a rate of change of the activity response v, with time. The derived model considers the amplitudes of the stimuli presented, the pairwise angular relationships between the stimuli, and the probabilities with which the stimuli are presented. The appeal of this derivation is that it easily highlights the fact that the fixed points depend on these probabilities. Additionally, the derivation is important because the dynamics of the BCM learning rule is driven by the activity response (not the synaptic weights), and many analyses in the literature rely on this fact; see, for example Castellani et al. [9]. Our analyses considered three cases: one neuron with two weights and two stimuli, one neuron with two weights and three stimuli, and lastly a weakly interacting small network of neurons.
In exploring the dynamics of a single neuron, we used Fig. 3 to show the dependence of critical value τ c of τ = τ θ /τ w -which leads to a Hopf bifurcation-on the angle, α between the stimuli, the amplitudinal factor, A between the stimuli, and the probability distribution, ρ of the stimuli. The role of τ as a bifurcation parameter has been seen in several recent works including Zenke et al. [33], Toyoizumi et al. [34], and Yger and Gilson [32]. A possible future work, which is beyond the scope of this paper, is to investigate the dependence of the selectivity of the neuron on τ . For a single neuron presented with a set of stimuli S, Bienenstock et al. [5] defined the selectivity of the neuron as a function of the area under the tuning curve of the neuronal responses to S. This definition, however, assumes that the learning rule converges to a stable steady state. To analyze the selectivity of a neuron as τ varies, one would need a measure of selectivity that addresses an oscillatory steady state. Thus, it might be more appropriate to talk about relative selectivity (RS) in this case. If the neuron receives stimulus inputs from the set S = {x (1) = (x 11 , x 12 ), x (2) = (x 21 , x 22 )} with synaptic weights w = (w 1 , w 2 ), then at any point in time, v 1 (t) = w 1 (t)x 11 + w 2 (t)x 12 and v 2 (t) = w 1 (t)x 21 + w 2 (t)x 22 . If for given τ , we let t o be the point in time at which the dynamics of the neuron achieves a stable steady state or an oscillatory steady state, and d(τ ) = min t≥t o |v 1 (t) − v 2 (t)|, then we can define RS as follows: .
Note that 0 ≤ RS ≤ 1, since it is defined as a fraction of the maximum selectivity. For this reason it tends to have the same maximum value and shape for all values of α ∈ (−π/4, π/4). Preliminary analysis of this formulation allows us to conjecture that RS stays pretty much at its maximum for τ ∈ (0, τ c ) decays to 0 as τ increases on (τ c , ∞).
In our analysis of a small network (see Sect. 5) we have made the simplifying assumption that the lateral inhibitory weight is constant in time. The incorporation of an inhibitory plasticity rule (as in Moldakarimov et al. [35]) would necessitate a third time-scale parameter, and possibly a fourth if the inhibitory rule were to include a dynamic modification threshold. This is beyond the scope of the paper and reserved for future work. Another related possible future direction is to perform an analysis of a large network of BCM neurons, by observing what happens to the network dynamics at different time-scale parametric regimes. A good starting point is to explore the dynamics for a fully connected network with equal inhibition, that is, each neuron is coupled with every other neuron in the network and inhibits each of them equally. The next step would be to let the amount of inhibition vary according to how far away the inhibiting neuron is. It may also be interesting to examine how the architecture of the network is affected. We know, for instance, that spike-time dependent plasticity (STDP) has the ability to yield a feedforward network out of a fully connected network. The analysis that Kozloski and Cecchi [36] used to demonstrate this finding centers around the synaptic weights. Thus it will be useful to pay closer attention to the synaptic weights in future work. Moreover, the oscillatory and chaotic properties we observed in the small coupled network will also be observed had our mean field been derived in terms of the weight and the analyses been done with the synaptic weights.
The debate about synaptic homeostatic time scales in neurobiology remains vibrant. A review of the literature seems to reveal a varied, and somewhat paradoxical set of findings among experimentalists and theoreticians. While homeostasis of synapses found in experiments is slow [12,37], homeostasis of synapses in most theoretical models needs to be rapid and sometimes even instantaneous to achieve stability [33,38,39]. There are, however, ongoing efforts to shed more lights on the debate. It has been suggested that both fast and slow homeostatic mechanisms exist. Zenke and Gerstner [39] suggest that learning and memory use an interplay of both forms of homeostasis; while fast homeostatic control mechanisms help maintain the stability of synaptic plasticity, slower ones are important for fine-tuning neural circuits. In addition to the present work contributing to the debate by demonstrating the relevance of fast homeostasis to synaptic stability, it also furthers the discussion as regards the link between STDP and the BCM rule: Zenke et al. [33] found that homeostasis needs to have a faster rate of change for spike-timing dependent plasticity to achieve stability. Furthermore it is well known that, under certain conditions, the BCM learning rule follows directly from STDP [13,14].