Co-existence of synaptic plasticity and metastable dynamics in a spiking model of cortical circuits

Evidence for metastable dynamics and its role in brain function is emerging at a fast pace and is changing our understanding of neural coding by putting an emphasis on hidden states of transient activity. Clustered networks of spiking neurons have enhanced synaptic connections among groups of neurons forming structures called cell assemblies; such networks are capable of producing metastable dynamics that is in agreement with many experimental results. However, it is unclear how a clustered network structure producing metastable dynamics may emerge from a fully local plasticity rule, i.e., a plasticity rule where each synapse has access only to the activity of the neurons it connects (as opposed to the activity of other neurons or other synapses). Here, we propose a local plasticity rule producing ongoing metastable dynamics in a deterministic, recurrent network of spiking neurons. The metastable dynamics co-exists with ongoing plasticity and is the consequence of a self-tuning mechanism that keeps the synaptic weights close to the instability line where memories are spontaneously reactivated. In turn, the synaptic structure is stable to ongoing dynamics and random perturbations, yet it remains sufficiently plastic to remap sensory representations to encode new sets of stimuli. Both the plasticity rule and the metastable dynamics scale well with network size, with synaptic stability increasing with the number of neurons. Overall, our results show that it is possible to generate metastable dynamics over meaningful hidden states using a simple but biologically plausible plasticity rule which co-exists with ongoing neural dynamics.


Introduction
Cortical circuits express ongoing neural dynamics that is often found to be metastable. By this we mean that ensembles of neurons keep approximately constant firing rates for transient periods of time, until a subset of the neurons switches to different firing rates. A stable vector of firing rates across simultaneously recorded neurons may be thought of as a 'hidden state', of which the spike trains emitted by the neurons are noisy observations. In the cortex of rodents and non-human primates, metastable states last between a few hundred ms and a few seconds, and their identity and dynamics have been linked to sensory processes, attention, expectation, navigation, decision making and behavioral accuracy (see (La Camera et al., 2019; Brinkman et al., 2022) for reviews). One way to model this type of metastable activity is to organize a neural network in clusters, or cell assemblies (Litwin-Kumar and Doiron, 2012; Mazzucato et al., 2015). The neurons in each cluster are connected by synaptic weights whose average value is larger than the average weight between neurons of different clusters. A key question is then to understand how such a structure subserving metastable dynamics can emerge from, for example, experience-dependent plasticity. This problem is very similar to the problem of the stable formation of Hebbian assemblies leading to persistent activity reflecting the memory of learned stimuli (Del Giudice et al., 2003; Amit and Mongillo, 2003; Litwin-Kumar and Doiron, 2014; Zenke et al., 2015). In both cases, one desires to obtain stable clusters of strongly connected neurons; the difference is that here we are interested in neural activity that, as a consequence of learning, is metastable.
The problem of the formation of stable cell assemblies has been of interest for a long time (see (Dayan and Abbott, 2005; Gerstner et al., 2014) for textbook reviews). The most recent efforts in this direction have combined spike-timing-dependent plasticity (STDP) (Markram et al., 1997; Bi and Poo, 1998) with a number of homeostatic mechanisms (Watt and Desai, 2010; Turrigiano, 2017) to keep the neural activity bounded during learning. However, while most efforts so far have focused on the formation of stable neural clusters for the purpose of representing retrievable memories or developing receptive fields, here we focus on metastable dynamics. In other words, instead of focusing on stable neural dynamics following the presentation and removal of a stimulus, the aim of this study is to obtain neural dynamics that continuously switches among a set of hidden states, which have been stored in the network structure by training.
In pursuing this effort, we require the synaptic plasticity rule to be biologically plausible and to lead to the formation of neural clusters that are stable against random perturbations. Moreover, we require that the neural clusters generate metastable dynamics that coexists with ongoing plasticity (i.e., in the absence of external stimuli). We further aim to obtain a model where new information can be accommodated, so that training with a new set of stimuli will lead to cluster rearrangement producing metastable dynamics among the new states. The requirement of biological plausibility mostly means that the plasticity rule must be local and must depend only on presynaptic and postsynaptic activity in a way that is accessible to the synapse; in particular, it must depend only on presynaptic spikes and postsynaptic variables related, e.g., to membrane voltage or calcium transients (Fusi et al., 2000; Shouval et al., 2002; Clopath et al., 2010; Graupner and Brunel, 2012). We will not consider directly the role of calcium in our model. Rather, our learning rule belongs to the class of voltage-based STDP models (Fusi et al., 2000; Clopath et al., 2010). We will also allow neurons to respond to multiple stimuli, as observed in experiments (Miyashita and Chang, 1988; Miyashita, 1988; Curti et al., 2004; Jezzini et al., 2013; Fusi et al., 2016).
Here, we present a plasticity rule that fulfills all of the above requirements in a network of exponential integrate-and-fire neurons (Fourcaud-Trocmé et al., 2003). The plasticity rule is simple and depends only on presynaptic spike arrival times, postsynaptic voltage, and postsynaptic spikes. Long-term potentiation (LTP) and long-term depression (LTD) are obtained by comparing a voltage-sensitive internal variable to an adapting threshold, reminiscent of the BCM rule (Bienenstock et al., 1982). The threshold adapts so as to produce transient LTP among co-active neurons, followed by LTD after prolonged activation. This leads to the stable formation of neural clusters while promoting dynamics that is automatically metastable for a large range of network sizes. Moreover, additional homeostatic mechanisms, such as inhibitory plasticity or synaptic scaling (Tetzlaff et al., 2011; Zenke and Gerstner, 2017), are not required. Our plasticity rule provides a possible explanation for the emergence of the metastable dynamics observed in many brain areas (Brinkman et al., 2022) during both ongoing and evoked neural activity.

The synaptic plasticity rule
We endowed a recurrent network of excitatory and inhibitory exponential integrate-and-fire (EIF) neurons (see Methods) with the following plasticity rule for synapses connecting excitatory neurons. Given a synapse with efficacy w_ij from excitatory neuron j to excitatory neuron i, a change in synaptic efficacy was triggered by the arrival of a presynaptic spike, while its polarity and strength depended on the activity of the postsynaptic neuron:

ẇ_ij = s_j(t) ( e^(−β w_ij²) [ṽ_i − θ_LTP]_+ − [θ_LTD − ṽ_i]_+ ),    (1)

where s_j(t) = Σ_k δ(t − t_j^(k)) is the presynaptic spike train and δ(t) is the Dirac delta function.
[x]_+ := max(x, 0) is the rectified linear function, and ṽ_i is a local postsynaptic variable which dictates the polarity of plasticity: the synapse undergoes LTP when ṽ_i > θ_LTP, and it undergoes LTD when ṽ_i < θ_LTD, with θ_LTD ≤ θ_LTP. The variable ṽ_i could represent a calcium variable or the running average of the membrane potential. In our case, ṽ_i was the low-pass filter of the exponential term ∆_T e^((V−V_T)/∆_T), where V is the membrane potential of the EIF neuron (see Methods), and therefore it was driven most substantially during the emission of a spike. The strength of LTP was further modulated by an attenuation factor e^(−β w_ij²) that constrains the ability of strong synapses to grow excessively.
Like similar learning rules of this kind (Dayan and Abbott, 2005), this rule is unstable if θ_LTP and θ_LTD are constant thresholds and the attenuation factor is missing. In such a case, stimulating a subset of neurons would result in higher firing rates and larger ṽ_i, which in turn would lead to higher firing rates, and so on. One way to prevent this problem is to use activity-dependent thresholds as in the BCM rule (Bienenstock et al., 1982). This idea requires the LTP and LTD thresholds to be a non-linear function of the postsynaptic activity ṽ, with faster dynamics than the dynamics of the synaptic weights. While the BCM rule uses a supralinear function of ṽ, we chose a hyperbolic tangent function of both ṽ and the postsynaptic spiking activity, s̃_i:

τ_θ θ̇_i = g tanh( (ṽ_i + γ s̃_i − θ_i) / θ_a ).    (2)

In this equation, s̃_i is the low-pass filtered postsynaptic spike train (see Methods), θ_a is a constant in units of θ_i, g is a gain factor and γ is a constant. For simplicity, we used the same threshold θ_i(t) for both θ_LTP and θ_LTD, but in principle they could be different.
The hyperbolic tangent in Eq. 2 automatically adjusts the dynamics of the threshold θ_i(t) for different postsynaptic activities. The motivation behind this specific model is that it can lead naturally to switching dynamics of neural clusters. This can be understood from the dynamics of Eq. 2, which adapts to the size of its argument ∆ ≡ ṽ_i + γs̃_i (Fig. 1a; see Methods for details). For small arguments, the dynamics is fastest and θ_i closely follows ṽ_i, resulting in no average change in the synaptic weights (Fig. 1b). This occurs when the postsynaptic membrane potential is characterized by subthreshold fluctuations. When the postsynaptic neuron fires a spike, its membrane potential rises significantly and θ_i is attracted to a new value with slower dynamics. As a result, θ_i will temporarily lag behind ṽ_i + γs̃_i, producing a short temporal window for LTP, followed by a longer window for LTD (Fig. 1c). An occasional spike during ongoing activity will not produce a meaningful synaptic change; however, repeated activation of the same postsynaptic neuron will produce a longer window for LTP (Fig. 1d, blue shaded area). This will occur when the postsynaptic neuron engages in recurrent excitation, and will promote the formation of clusters of co-active neurons. Prolonged activation of the same cluster, however, will turn LTP into LTD. This is due to the term γs̃_i: the threshold θ_i will finally approach a value θ_i ≈ ṽ_i + γs̃_i > ṽ_i, causing LTD (Fig. 1d, red shaded area). In summary, the θ_i dynamics can help to form neural clusters via transient LTP, while impeding prolonged activation of the same clusters, naturally producing a switching dynamics among co-active groups of neurons.
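The transition from a brief LTP window to a longer LTD window can be illustrated with a few lines of code. The sketch below (a minimal illustration with arbitrary, illustrative parameter values and spike times, not the values of Table 1) integrates Eq. 2 with Euler steps during a burst of postsynaptic spikes and reports how often the LTP condition ṽ_i > θ_i holds.

```python
import numpy as np

# Minimal sketch of the adaptive-threshold dynamics of Eq. 2
# (illustrative parameters, not the values in Table 1).
dt, T = 0.1, 2000.0                    # ms
tau_v, tau_s, tau_theta = 50.0, 1000.0, 200.0
g, theta_a, gamma = 1.0, 1.0, 5.0

t = np.arange(0.0, T, dt)
spikes = np.zeros_like(t)
# a 100 Hz burst of postsynaptic spikes between 500 and 900 ms
spikes[(t > 500) & (t < 900) & (np.arange(t.size) % 100 == 0)] = 1.0 / dt

v_tilde = np.zeros_like(t)             # low-pass filtered exponential voltage term
s_tilde = np.zeros_like(t)             # low-pass filtered postsynaptic spike train
theta = np.zeros_like(t)               # adaptive threshold
v_spike = 25.0                         # proxy for the voltage surge at each spike

for k in range(1, t.size):
    drive = v_spike * (spikes[k-1] > 0)          # voltage surge only at spike times
    v_tilde[k] = v_tilde[k-1] + dt/tau_v * (-v_tilde[k-1] + drive)
    s_tilde[k] = s_tilde[k-1] + dt/tau_s * (-s_tilde[k-1] + spikes[k-1])
    delta = v_tilde[k-1] + gamma * s_tilde[k-1]  # argument of the tanh in Eq. 2
    theta[k] = theta[k-1] + dt/tau_theta * g * np.tanh((delta - theta[k-1])/theta_a)

ltp_window = v_tilde > theta           # transient LTP right after spiking
ltd_window = v_tilde < theta           # longer LTD window as theta catches up
print("fraction of time in LTP window:", ltp_window.mean())
```

Because γs̃_i decays slowly, θ_i ends up above ṽ_i after the burst, reproducing the LTP-to-LTD transition of Fig. 1d.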
We note that although it might seem superfluous to add the term γs̃_i to the equation for θ_i, it is in fact required to ensure LTD when the neuron is constantly active (i.e., ṽ_i ≫ 0). This can be achieved either by adding the offset term γs̃_i to ṽ_i, or by using a supralinear function of ṽ_i, such as ṽ_i^p with p > 1, as in the BCM rule. However, the additive solution ṽ_i + γs̃_i has the advantage of preventing synaptic change when the postsynaptic neuron is inactive, as explained above (see also the next section). We will further comment on the comparison with the classical BCM rule in the Discussion.

Formation of stable clusters with metastable dynamics
During training, Q = 10 sensory stimuli (see Methods) were presented to the network in random order, each targeting a fixed but randomly chosen subpopulation of excitatory neurons, which we call a cluster. Although each cluster was associated with a separate stimulus, each cluster contained a randomly selected fraction f = 10% of the excitatory neurons, so that the same neuron could respond to more than one stimulus, as typically observed in experiments (Miyashita and Chang, 1988; Miyashita, 1988; Curti et al., 2004; Jezzini et al., 2013). Thus, the mean number of neurons in each cluster was 80, but each neuron had a probability of about 0.26 of responding to at least two stimuli, and clusters had a probability of 0.61 of sharing neurons with other clusters (see Methods for details).
Stimulus presentations occurred every 2000 ms and each lasted 500 ms (however, using random interstimulus intervals did not alter the results).This training procedure lasted for 10 minutes and on average each sensory stimulus was presented 30 times.
As expected, after training the network exhibited metastable ongoing dynamics, which was still present after 2 and 4 hours (Fig. 2a). In this figure, spikes from neurons in the same cluster have the same color, and neurons belonging to multiple clusters are duplicated. The black curves superimposed on each cluster's spikes measure the 'overlap' of the whole network's activity with the stimulus associated to that cluster (see Methods). For a given stimulus, the overlap varies between zero and one, approaching 1 when all active excitatory neurons of the network belong to the associated cluster while the remaining neurons have negligible firing rates. The raster plots in Fig. 2 show that the neural activity after training switches between network states characterized by specific cluster activations. These states can therefore be interpreted as memories of the stimuli, with the ongoing dynamics being akin to a random walk among these memory states. The distribution of state durations was approximately exponential with mean around 250-300 ms (Fig. 2c), reminiscent of the discrete Markov processes with fast state transitions found to describe ensembles of cortical spike trains (Seidemann et al., 1996; Jones et al., 2007; Mazzucato et al., 2015; Maboudi et al., 2018; Benozzo et al., 2021; Lang et al., 2023). The metastable dynamics observed after training is the consequence of potentiated synapses inside clusters and the emergence of a block structure in the synaptic matrix (Fig. 2b), a structure known to potentially produce metastable dynamics in spiking networks (Litwin-Kumar and Doiron, 2012; Deco and Hugues, 2012; Mazzucato et al., 2015). In our case, metastable dynamics co-exists with synaptic plasticity, but it remains present even if synaptic plasticity is turned off after training (not shown).
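For readers who wish to reproduce the state-duration analysis, a possible sketch is given below. It assumes that overlap traces m_q(t) have already been computed (see Methods) and simply thresholds them at 0.5, our criterion for an active memory; the function name and the fake traces are ours, for illustration only.

```python
import numpy as np

def state_durations(overlaps: np.ndarray, dt: float, thresh: float = 0.5):
    """Extract durations of active memory states from overlap traces.

    overlaps: array of shape (Q, T) with m_q(t) for each of Q clusters.
    dt: time step of the traces in ms.
    Returns a list of state durations in ms.
    """
    durations = []
    for m_q in overlaps:                       # loop over clusters
        active = m_q > thresh                  # state q is active when m_q > 0.5
        edges = np.diff(active.astype(int))    # rising (+1) and falling (-1) edges
        starts = np.flatnonzero(edges == 1) + 1
        stops = np.flatnonzero(edges == -1) + 1
        if active[0]:                          # trace starts inside an active state
            starts = np.r_[0, starts]
        if active[-1]:                         # trace ends inside an active state
            stops = np.r_[stops, active.size]
        durations.extend((stops - starts) * dt)
    return durations

# Example with fake traces; in the real model the durations are ~exponential.
rng = np.random.default_rng(0)
fake = rng.random((10, 5000)) * 0.6            # placeholder overlap traces
print(np.mean(state_durations(fake, dt=1.0)))
```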
We investigated the role of the attenuation factor e^(−β w_ij²) in Eq. 1 and of the spiking term γs̃_i in Eq. 2. Without attenuation, training is successful, but clusters are unstable and disappear within one hour after training (Fig. S1). Without the term γs̃_i in Eq. 2, training produces uneven clusters that are unable to sustain cluster activation within one hour after training (Fig. S2). To quantify the amount of learning, we measured the average synaptic weight among synapses connecting neurons sharing at least one sensory stimulus (dubbed 'w_1'), and the average weight among neurons that did not share any sensory stimuli ('w_0'):

w_C = (1/N_C) Σ_{ij∈S_C} w_ij,    (3)

where C ∈ {0, 1}, N_C is the number of synapses of type C, and S_C is the set of ij indices of synapses of type C. Training significantly increased w_1 compared to w_0, thus successfully mapping the association between sensory inputs and the corresponding neural clusters activated by the stimuli. After training, the excitatory synapses were continuously modified by the plasticity rule during ongoing metastable activity, leading to a decrease of w_1 and an increase of w_0 (Fig. 2d). These synaptic changes stabilize after 4 hours; after this time, plasticity is stable and coexists with metastable dynamics unfolding as a random walk among states associated with the training stimuli.
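As a concrete illustration of Eq. 3, the following snippet (our own minimal sketch; the membership matrix, connectivity and weights are random stand-ins for the trained network) computes w_1 and w_0 from a weight matrix and the stimulus-membership matrix η defined in Methods.

```python
import numpy as np

rng = np.random.default_rng(1)
N_E, Q, f = 800, 10, 0.1
eta = rng.random((Q, N_E)) < f            # eta[q, i] = True if neuron i in cluster q
W = rng.random((N_E, N_E)) * 0.01         # stand-in excitatory weight matrix
conn = rng.random((N_E, N_E)) < 0.2       # E-to-E connectivity mask (p = 0.2)
np.fill_diagonal(conn, False)

# share[i, j] is True when neurons i and j share at least one stimulus
share = (eta.T.astype(int) @ eta.astype(int)) > 0
w1 = W[conn & share].mean()               # mean weight among same-cluster pairs
w0 = W[conn & ~share].mean()              # mean weight among other pairs
print(f"w1 = {w1:.4f}, w0 = {w0:.4f}")    # roughly equal before training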

Learning stability vs. network size
Here we show that spontaneous memory decay (Fig. 2d) slows down as a function of network size. We show this by estimating the rate of change of w_0 and w_1 as a function of N, the number of neurons in the network. Scaling up the network size can be done in several ways (Renart et al., 2007; Doiron and Litwin-Kumar, 2014); we used two different scaling procedures, one in which Q ∝ N and one in which Q ∝ √N. In the first scenario, we scaled up the number of clusters Q proportionally to N_E = 0.8N while keeping the mean cluster size N_E/Q = 80 constant (this corresponds to a coding level f = 1/Q ∝ 1/N, where f is the probability that a neuron is targeted by a stimulus; see Methods). As shown in Fig. 3, synaptic decay is slower in larger networks (Fig. 3a), while ongoing metastable dynamics co-exists with synaptic plasticity (Fig. 3b). In particular, nearly constant values of the mean synaptic weights are observed in networks of 20,000 neurons (rightmost panel of Fig. 3a).

Analytical arguments (see Methods) imply a synaptic decay rate

∆w_C / ∆t ∝ 1/N,    (4)

where ∆w_C is the average change in strength over the interval ∆t for synapses of type C (defined analogously to Eq. 3; see Methods). To confirm this prediction, we plotted N∆w_0 (Fig. 3c) and N∆w_1 (Fig. 3d) as a function of time and for increasing network size (from N = 5,000 to N = 20,000). The plots show that, after entering the ongoing metastable regime, both |N∆w_0| and |N∆w_1| increase linearly with time, as predicted by Eq. 4. Moreover, the slopes of N∆w_0 and N∆w_1 approach a constant value that is independent of network size when N is large enough (Fig. 3c-d, right panels). The non-linear transients visible in the left panels of Fig. 3c-d are due either to a small N or to not having yet reached the stable regime of synaptic decay (which takes about 4 h in small networks, see Fig. 2d). For large N and 3 h post training, synaptic decay is stable and N∆w_C is linear as predicted (Fig. 3c-d). These results confirm that the synaptic decay rate is ∝ N⁻¹, implying slower memory decay and more stable synaptic efficacies in larger networks, despite ongoing plasticity. Moreover, the dynamics produced by the networks of Fig. 3 shows near-exponential distributions of the state durations (Fig. S3), with mean durations approaching stability 4 hours post-training and for N > 15,000.
The slowdown of the synaptic decay rate with N was also found in an alternative scaling scenario in which Q ∝ √N. In this case there is a smaller number of clusters (∝ √N), but each cluster grows in size with N (also ∝ √N; see Methods). We trained several networks under this scaling and found results analogous to those of Fig. 3: training is successful, the synapses become more stable in larger networks, and metastable dynamics is reliable across network sizes (see Fig. S4). Compared with the scaling of Fig. 3, however, both w_1 and the excitatory firing rates decreased with network size in a manner that suggests a synaptic decay rate between 1/N and 1/√N (Fig. S4c-d).
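To make the two scaling procedures concrete, the simple bookkeeping below (directly transcribing the formulas given here and in Methods) prints the number of clusters and the mean cluster size as N grows.

```python
import math

for N in (1_000, 5_000, 10_000, 20_000):
    N_E = int(0.8 * N)
    # Scaling 1 (Fig. 3): Q proportional to N, fixed cluster size
    Q1 = N // 100
    size1 = N_E // Q1                            # = 80 neurons for all N
    # Scaling 2 (Fig. S4): Q and cluster size both proportional to sqrt(N)
    Q2 = round(math.sqrt(N / 10))
    size2 = round(0.8 * math.sqrt(10 * N))       # = 0.8 * sqrt(10 N)
    print(f"N={N:6d}: scaling-1 Q={Q1:4d}, size={size1:3d} | "
          f"scaling-2 Q={Q2:3d}, size={size2:4d}")
```

Both procedures reduce to the basic network (Q = 10 clusters of 80 neurons) for N = 1,000, as stated in Methods.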

Robustness to perturbations and remapping
In the previous section we have shown that the synaptic weights are stable against metastable spontaneous activity. We show next that this is also true in the presence of random stimuli. To show this, we probed the network with external sensory inputs with physical features (i.e., magnitude and coding level) similar to those of the training stimuli. The occurrence schedule of these external stimuli was modeled as a Poisson process with a mean inter-event interval of 10 s. We considered three different scenarios. In the first scenario, the external stimuli were randomly sampled and used only once (i.e., the subsets of target neurons were uniformly and independently sampled at each occurrence; Fig. 4a). This scenario mimics a purely noisy environment where there is no temporal correlation in the sensory inputs. In the second scenario, only half of the sensory inputs were random, while the other half were sampled from a finite set of 10 sensory stimuli always targeting the same neurons (Fig. 4b). This scenario mimics a combination of meaningful stimuli occurring amid some random sensory background. Finally, in the third scenario, the external stimuli were all sampled from a pre-defined set of 10 sensory stimuli, mimicking a situation where the network is being retrained with new stimuli (Fig. 4c). In all cases, the network kept adjusting its synaptic weights according to our plasticity rule.
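The three scenarios can be summarized by the following scheduling sketch (a hypothetical re-implementation; names and structure are ours): stimulus onsets are drawn from a Poisson process with a 10 s mean inter-event interval, and each occurrence is either a fresh one-off random stimulus or one resampled from a fixed set of 10.

```python
import numpy as np

rng = np.random.default_rng(2)
N_E, f, mean_iei = 800, 0.1, 10_000.0      # ms; mean inter-event interval = 10 s

def make_schedule(T, scenario, fixed_set):
    """Return a list of (onset_time, target_neurons) events for one scenario."""
    events, t = [], 0.0
    while True:
        t += rng.exponential(mean_iei)      # Poisson occurrence times
        if t > T:
            break
        if scenario == "random" or (scenario == "mixed" and rng.random() < 0.5):
            targets = np.flatnonzero(rng.random(N_E) < f)   # one-off random stimulus
        else:                               # "retrain", or the meaningful half of "mixed"
            targets = fixed_set[rng.integers(len(fixed_set))]
        events.append((t, targets))
    return events

fixed = [np.flatnonzero(rng.random(N_E) < f) for _ in range(10)]  # 10 new stimuli
for scen in ("random", "mixed", "retrain"):
    print(scen, len(make_schedule(3_600_000.0, scen, fixed)), "events in 1 h")
```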
In the presence of random stimulation, the network maintained a significant memory of the learned stimuli for several hours, similar to what we found in the absence of sensory stimulation (Fig. 4a, top and middle panels). In the third scenario, the network quickly learned the new sensory inputs (Fig. 4c, bottom) while gradually forgetting the previously learned ones (Fig. 4c, middle). Finally, in the 'mixed' stimulation scenario, the network could still learn the new sensory stimuli, but at a much slower rate (compare the dashed lines in Figures 4b-top and 4c-top). A raster plot of the neural activity shows the metastable activation of clusters of neurons representing stimuli in both the first and the second set (Fig. 4b, middle and bottom panels, respectively), showing that the network could accommodate the learning of new stimuli (Fig. 4b, bottom) while maintaining a trace of the previously learned ones. Note how, in between stimulus presentations, the network dynamics was metastable in all cases.
In conclusion, our plasticity rule can successfully structure a network of spiking neurons into an extensive number of clusters (cell assemblies), in a manner that spontaneously generates and maintains metastable ongoing activity. Both the metastable dynamics and the learned structure are stable against random stimulus perturbations, yet flexible enough to be reshaped by new stimuli.

Model features and main results
Metastable neural dynamics, defined as the repeated occupancy of a set of discrete neuronal states occurring at seemingly random transition times (Brinkman et al., 2022), has recently come to the fore as a potential mechanism mediating sensory coding, expectation, attention, navigation, decision making and behavioral accuracy (see (La Camera et al., 2019; Brinkman et al., 2022) for reviews). At the same time, model variations over a basic network of spiking neurons with clustered architecture (Miller and Katz, 2010; Deco and Hugues, 2012; Litwin-Kumar and Doiron, 2012; Mazzucato et al., 2015) have accounted for a wealth of data concerning this metastable dynamics (Mazzucato et al., 2015; Mazzucato et al., 2016; Mazzucato et al., 2019; Lang et al., 2023). One is then led to the following question: how can a cortical circuit be shaped by internal dynamics and externally-driven events so as to converge to the clustered architecture producing metastable dynamics?
The model of synaptic plasticity introduced in this work answers this question by offering a simple yet biologically plausible mechanism capable of cluster formation and metastable dynamics. The plasticity rule builds clusters of neurons with strengthened synaptic connections; after training, the neural activity switches among a number of 'states' that can be interpreted as neural representations of the stimuli used for training. Notably, the metastable dynamics generated by learning co-exists, after training, with ongoing synaptic plasticity.
The plasticity rule presented here has many desirable features. One is biological plausibility, in at least two ways: i) the plasticity rule is local, i.e., it depends only on quantities that are available to the synapse, and ii) it works in networks of spiking neurons (rather than artificial neural networks or firing rate models). Another desirable feature of our rule is that it leads to metastable dynamics with properties similar to those observed in experimental data (Abeles et al., 1995; Jones et al., 2007; Ponce-Alvarez et al., 2012; Mazzucato et al., 2015; Maboudi et al., 2018; Benozzo et al., 2021; Recanatesi et al., 2022; Lang et al., 2023), including an exponential distribution of cluster activation durations with means of a few hundred ms (Mazzucato et al., 2015). Our model also has a minimal number of mechanisms and parameters within a class of similar models that might produce metastable dynamics - for example, a model wherein the presynaptic spike train is replaced by a low-pass filter of it, or a model that imposes an upper bound on the synaptic weights. Those mechanisms are not needed in our plasticity rule. In addition to an adaptive threshold, our model requires only two mechanisms: an attenuation term for LTP (the e^(−β w_ij²) term in Eq. 1) and a spiking term γs̃_i in the equation for the adaptive threshold (Eq. 2). These mechanisms require the introduction of the parameters β and γ, which have a clear biological correlate. β controls the amount of LTP. Mechanisms reducing LTP but not LTD have been described in vitro by (Delgado et al., 2010), who showed how an increase in total excitatory and inhibitory activity (for example, due to cluster activation) can rapidly reduce the amplitude of LTP but not LTD. This mechanism could be mediated through changes in calcium dynamics in dendritic spines and has been proposed as an example of a 'rapid compensatory process' by (Zenke et al., 2017). The parameter γ controls the spiking contribution to the threshold's dynamics. This term could emerge from cellular mechanisms similar to those responsible for afterhyperpolarization currents (Sah, 1996; Andrade et al., 2012), which are often used in models of firing rate adaptation (Wang, 1998; Ermentrout, 1998; La Camera et al., 2004). We found no need to invoke additional compensatory mechanisms such as inhibitory plasticity or synaptic normalization, often used to keep firing rates and synaptic strengths in check during training or during ongoing activity (Litwin-Kumar and Doiron, 2014; Zenke and Gerstner, 2017).
Another notable feature of our model is its behavior under network scaling. Cortical circuits are variable in size, but they typically comprise large numbers of neurons (Shepherd and Grillner, 2010). It is therefore important to test whether models of plasticity maintain their properties as networks are scaled up in size. Scaling up a neural circuit is a necessary operation in several neuroscience theories (e.g., the theory of balanced networks (van Vreeswijk and Sompolinsky, 1998; Renart et al., 2007)) and it can be done in several ways. We have chosen here two ways, one that is most intuitive, with order N clusters but fixed cluster size, and one that is most commonly used, wherein both the size and the number of clusters grow as √N (see (Doiron and Litwin-Kumar, 2014) for a discussion of the different scaling regimes). In both cases, we have found that training leads to the formation of stable clusters generating metastable dynamics, at least for networks of up to 20,000 neurons (larger networks become computationally expensive to train and simulate). Importantly, both the properties of the dynamics (Fig. S3) and the learned synaptic weights (Fig. 3 and Fig. S4) tend to stabilize as the number of neurons in the circuit grows. In the first scaling regime, we have found that the synaptic decay rate decreases very fast with N, i.e., ∝ 1/N. Finally, our model also potentially solves a conceptual problem arising in deterministic network models that try to combine attractor networks with balanced networks (Renart et al., 2007; Doiron and Litwin-Kumar, 2014). These models produce metastable dynamics due to finite size effects, so that in very large networks stable (rather than metastable) cluster activations would prevail (Doiron and Litwin-Kumar, 2014; Huang and Doiron, 2017; La Camera et al., 2019; Berlemont and Mongillo, 2022). It is not clear how to obtain metastable dynamics in these models in the limit of very large network size. In the presence of our plasticity mechanism, however, persistent activity can only be transient, because a long-lasting cluster activation always induces LTD, which will eventually inactivate the cluster. This mechanism does not depend on network size, as shown in Fig. 3, and ensures the stable coexistence of plasticity and metastable dynamics for arbitrary network size.

Comparison with previous work
Modeling the self-organization of neural circuits via synaptic plasticity has long been an important topic of research (Dayan and Abbott, 2005; Gerstner et al., 2014) and has recently been studied in a number of works bearing various similarities to ours. However, the first successful simulations of spiking network dynamics with ongoing synaptic plasticity appeared only about 20 years ago (Del Giudice and Mattia, 2001; Del Giudice et al., 2003; Amit and Mongillo, 2003). Previous models closest to ours are models of voltage-based STDP rules wherein an internal variable (often interpreted as postsynaptic depolarization) is compared to different thresholds for the induction of LTP and LTD (Fusi et al., 2000; Clopath et al., 2010; Litwin-Kumar and Doiron, 2014; Zenke et al., 2015). Part of the motivation for some of these models was to reproduce experimental results on both STDP (Markram et al., 1997; Zhang et al., 1998) and the dependence of synaptic plasticity on postsynaptic depolarization (Artola et al., 1990; Ngezahayo et al., 2000; Sjöström et al., 2001). When applied to the self-organization of cortical circuits, most previous work in this context has focused on the formation of clusters (often called cell assemblies) supporting stable neural activity that represents memories. The goal of these works was therefore to obtain stable attractor dynamics after learning. In contrast, the goal of our work was to obtain a stable synaptic matrix that co-exists with a rich, ongoing metastable dynamics.
A notable exception among previous efforts is the model by (Litwin-Kumar and Doiron, 2014). In their model, the authors combined the voltage-based STDP rule of (Clopath et al., 2010) with inhibitory plasticity (Vogels et al., 2011) and synaptic renormalization to obtain metastable dynamics in a network of adaptive EIF neurons. Inhibitory plasticity was used to maintain a target firing rate in the excitatory neurons and to prevent the formation of winner-take-all clusters during training. This problem arises from the fact that, during training, different stimuli target different numbers of neurons, creating inhomogeneities in cluster size (Doiron and Litwin-Kumar, 2014) (in our case, the initial inhomogeneity was 10%). We found that in our model neither inhibitory plasticity nor synaptic normalization was required, presumably thanks to the presence of a dynamic threshold which keeps the synaptic weights within boundaries, as in the original BCM rule (Bienenstock et al., 1982). In our model, the dynamic threshold also produces LTD after prolonged cluster activation. This, together with LTP attenuation (the e^(−β w_ij²) term in Eq. 1; see Fig. S1), helps to prevent the formation of large clusters during training, while also keeping the firing rates within an acceptable range. This confers advantages, as it frees up inhibitory plasticity (and possibly other network mechanisms) to accomplish other tasks.
More recently, (Wang and Aljadeff, 2022) have proposed a plasticity rule that produces switching dynamics resembling ours. In their model, rather than comparing a postsynaptic variable to a threshold for LTP/LTD, a Hebbian term is compared to each threshold. The Hebbian term is the low-pass filter of the product of the low-pass filters of the pre- and postsynaptic spike trains, and was interpreted as a calcium signal responsible for synaptic plasticity (Inglebert et al., 2020). The main goal of (Wang and Aljadeff, 2022) was to present a learning rule resulting in stable and unimodal weight distributions after learning. The authors show that their learning rule has intrinsic homeostatic properties without having to impose an additional homeostatic mechanism on timescales much shorter than observed experimentally (Zenke et al., 2017). Perhaps due to these homeostatic properties, they reported metastable dynamics with two or three stimuli after training. However, as the aim of their work was not to investigate the potential for metastable dynamics, training was only accomplished with a few stimuli, and no scaling of the network size was attempted.
We have shown that our plasticity rule allows the network to learn new stimuli. A similar result was obtained with the plasticity rule of (Litwin-Kumar and Doiron, 2014), which however requires inhibitory plasticity and synaptic renormalization. More recently, (Manz and Memmesheimer, 2022) have proposed a pure STDP plasticity rule that, similarly to our rule, does not require inhibitory plasticity or synaptic renormalization. However, (Manz and Memmesheimer, 2022) focus on the generation of stable clusters producing persistent activity rather than metastable dynamics. Their persistent activity has the signature of fast Poisson noise, due to the stochastic nature of their model neurons, but it lacks the slow fluctuations characteristic of ongoing metastable dynamics (Litwin-Kumar and Doiron, 2012). We also note that our results on the remapping of sensory stimuli are different from the reorganization of clusters described in (Manz and Memmesheimer, 2022), which has been related to representational drift (Ziv et al., 2013; DeNardo et al., 2019; Rule et al., 2019; Masset et al., 2022). In our model, new clusters form due to stimulation by a new set of stimuli, rather than through drifting of the representations due to the noisy dynamics of plasticity, as occurs in (Manz and Memmesheimer, 2022).
Our model is reminiscent of the BCM rule (Bienenstock et al., 1982), and in fact it could be understood as a BCM rule for spiking neurons in which the non-linearity of the threshold equation is given by a hyperbolic tangent. More common implementations of BCM utilize a power function (often a quadratic function; see e.g. (Intrator and Cooper, 1992; Dayan and Abbott, 2005)). Another difference with our model is the presence, in the latter, of the additional term γs̃_i in the threshold dynamics, Eq. 2. In our plasticity rule, this term is required to enforce LTD when the neuron is constantly active (i.e., ṽ_i ≫ 0). One is naturally led to consider whether this term could be removed by replacing the hyperbolic tangent with a power of ṽ_i, e.g.,

τ_θ θ̇_i = −θ_i + a ṽ_i^p,

where a is a constant and p > 1. To best match our model, p should be small, e.g., 1.1-1.2. We found that this learning rule does indeed work; however, its performance during remapping is somewhat inferior (not shown). Another problem of this learning rule is that it is very sensitive to the parameter values: for example, while p = 1.1, a = 1.5 gives good results, synaptic weights decay fast when p = 1.2, a = 1.4. This problem is somewhat ameliorated when using a squashing function of −θ_i + aṽ_i^p, for example, a hyperbolic tangent again:

τ_θ θ̇_i = g tanh( (−θ_i + a ṽ_i^p) / θ_a ).

In conclusion, although alternative versions of the adaptive threshold may work in principle, we found that the threshold dynamics of Eq. 2 is both more robust and better performing.
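For completeness, the three threshold dynamics discussed in this paragraph can be written side by side as follows (a schematic comparison with toy parameter values; only the functional forms are taken from the text, and the returned derivatives are to be scaled by dt/τ_θ in an actual simulation).

```python
import numpy as np

def dtheta_tanh(theta, v, s, g=1.0, gamma=5.0, theta_a=1.0):
    """Threshold used in this work (Eq. 2): tanh of v + gamma*s - theta."""
    return g * np.tanh((v + gamma * s - theta) / theta_a)

def dtheta_power(theta, v, a=1.5, p=1.1):
    """BCM-like alternative: linear relaxation to a supralinear function of v."""
    return -theta + a * v**p

def dtheta_power_squashed(theta, v, a=1.5, p=1.1, g=1.0, theta_a=1.0):
    """Squashed variant: tanh applied to -theta + a*v^p."""
    return g * np.tanh((-theta + a * v**p) / theta_a)

# All three push theta above v for sustained activity (large v), inducing LTD;
# they differ in robustness to the choice of (a, p), as discussed in the text.
for v in (0.1, 1.0, 5.0):
    print(v, dtheta_tanh(0.0, v, 0.1), dtheta_power(0.0, v),
          dtheta_power_squashed(0.0, v))
```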

Limitations and possible extensions
Our model has a minimal number of ingredients and does not use widespread properties of cortical neurons, such as external noise, firing rate adaptation or short-term plasticity. One reason is model economy, i.e., the desire to use only minimal, essential ingredients. Another reason is that external noise, firing rate adaptation and short-term plasticity all facilitate metastable dynamics (Moreno-Bote et al., 2007; Giugliano et al., 2008; Deco and Hugues, 2012; Cao et al., 2016; Jercog et al., 2017; Setareh et al., 2017; Ballintyn et al., 2019), leading to a possible confound on the role of long-term modifications in generating and maintaining ongoing metastable dynamics.
Although synaptic plasticity depends on pre- and postsynaptic calcium (Sjöström and Nelson, 2002; Shouval et al., 2002; Graupner and Brunel, 2012), we do not consider directly the role of calcium in our model. Rather, our learning rule belongs to the class of voltage-based STDP models (Clopath et al., 2010). These models aim to capture the dependence of LTP and LTD on the membrane potential (Ngezahayo et al., 2000; Sjöström et al., 2001; Wang and Maffei, 2014) while producing STDP curves as an emergent phenomenon. Although voltage-based STDP can capture in vitro results where the membrane potential at the electrode is well defined, in a real neuron with an extended geometry the membrane potential is not generally uniform along the dendritic tree. At the time of a postsynaptic spike, one might assume that a back-propagating action potential imposes a uniform voltage on the dendritic tree, at least in proximal dendrites (Stuart and Sakmann, 1994; Nuriya et al., 2006), although faithful propagation depends on many factors, including the order of the somatic action potential within a train and the location of the spines (Spruston et al., 1995; Waters et al., 2005). At the moment of a presynaptic spike, however, the voltage at different locations will be different and will probably reflect random subthreshold fluctuations. The plasticity rule in such a case would be at the whim of random fluctuations in membrane potential, unrelated to the timing correlation between pre- and postsynaptic spikes. We argue that this does not pose a problem in our model because, when our postsynaptic variable ṽ_i follows random subthreshold fluctuations, our plasticity rule produces no net average synaptic change (Fig. 1b). On the other hand, when the postsynaptic neuron emits a spike, ṽ_i follows closely the transient exponential increase in membrane potential (Fig. 1c), modeling the more uniform effect of the backpropagating action potential along the dendrites. It is even tempting to speculate that, during a prolonged period of active firing, the build-up of γs̃_i could reflect the facilitation of dendritic calcium spikes by trains of backpropagating action potentials (Larkum et al., 1999).
Our model did not include inhibitory plasticity (Maffei, 2011; Vogels et al., 2013). This was because adding inhibitory plasticity to our plasticity rule did not qualitatively change our results (not shown), and it was therefore deemed unnecessary. A benefit of including inhibitory plasticity in our model could be to autonomously set the level of inhibitory activity best suited for metastable dynamics (Zenke et al., 2015). This useful function of inhibitory plasticity contrasts with its homeostatic role of keeping the postsynaptic firing rates close to a desired target value (Vogels et al., 2011; Litwin-Kumar and Doiron, 2014), a role that in our model is fulfilled by the adaptive threshold for LTP and LTD.

Conclusion
We have proposed a biologically plausible model of synaptic plasticity that produces stable cell assemblies in a deterministic spiking network of excitatory and inhibitory neurons.The ensuing clustered architecture of the network produces ongoing metastable dynamics that co-exists with synaptic plasticity.The metastable dynamics supports seemingly random switching among hidden states representing the stimuli used for training, and has several characteristic traits of the metastable dynamics observed in brain regions of rodents and primates.Our model could also provide an explanation for how a clustered network of spiking neurons can generate metastable dynamics for large network sizes in the absence of external noise.

Methods
Spiking network model. Here we define the 'basic' network, from which all other networks were obtained by scaling up the number of neurons N and the number of stimuli Q (see 'Network scaling' below). The basic network comprised N_E = 800 excitatory and N_I = 200 inhibitory exponential integrate-and-fire (EIF) neurons. The ratio of excitatory to inhibitory neurons was N_E/N_I = 4, in agreement with previous studies and experimental observations (Braitenberg and Schüz, 1991; Amit and Brunel, 1997; Shepherd and Grillner, 2010). Before training, neurons were randomly connected with fixed probability (0.2 among excitatory neurons and 0.5 in all other cases) and constant synaptic efficacies (w_EE = 0.005, w_EI = −0.34, w_IE = 0.54 and w_II = −0.46 mV). Only the synapses between connected excitatory neurons were plastic.
We modeled the impact of Q = 10 random stimuli as follows. Each neuron had a probability ('coding level') f = 0.1 of being targeted by a stimulus, meaning that the neuron would receive external input during the presentation of that stimulus (we call such neurons 'responsive'). This resulted in a typical number of f N_E ± √(f(1−f)N_E) = 80 ± 8.4 neurons targeted by each stimulus and a mean fraction (1−f)^Q ≈ 0.35 of non-responsive neurons in the basic network (in keeping with similar numbers found in the experimental literature, see e.g. (Perin et al., 2011)). We call a 'cluster' the subpopulation of neurons targeted by a given stimulus during training. A mean fraction 1 − (1−f)^Q ≈ 0.65 of the neurons belonged to (overlapping) clusters. A randomly picked neuron had a probability of about 0.26 of responding to at least two stimuli, which corresponds to a probability P_ov = 1 − (1−f)^(Q−1) ≈ 0.61 for a responsive neuron to respond to multiple stimuli (and therefore belong to multiple clusters). Parameter values are summarized in Table 1.
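These overlap statistics follow from simple binomial reasoning and can be checked in a few lines; the snippet below reproduces the quoted numbers for f = 0.1 and Q = 10.

```python
f, Q = 0.1, 10

p_nonresponsive = (1 - f) ** Q                                    # ~0.35
p_at_least_two = 1 - (1 - f) ** Q - Q * f * (1 - f) ** (Q - 1)    # ~0.26
p_multi_given_responsive = 1 - (1 - f) ** (Q - 1)                 # ~0.61

print(f"P(non-responsive)     = {p_nonresponsive:.2f}")
print(f"P(>= 2 stimuli)       = {p_at_least_two:.2f}")
print(f"P(multi | responsive) = {p_multi_given_responsive:.2f}")
```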
Network scaling. For the analysis of Fig. 3, N, N_E and Q were increased with N_E = 0.8N, Q = N/100 and coding level f = 1/Q, resulting in a constant mean cluster size of N_Q = f N_E = 80 neurons in all cases. For the analysis of Fig. S4, we used f = 1/Q with Q = √(N/10) stimuli, resulting in N_Q = 0.8√(10N) neurons in each cluster (compatible with Q N_Q = 0.8N = N_E excitatory neurons in total). The factors in the second scaling were chosen so as to agree with the 'basic' network for N = 1,000.

Single neuron dynamics. The membrane potential of the i-th EIF neuron followed the dynamical equation

τ_m dV_i/dt = −(V_i − V_L) + ∆_T e^((V_i − V_T)/∆_T) + h_i^st + h_i^ext + h_iE^syn(t) + h_iI^syn(t),    (5)

where V_L = 0 mV is the 'leak' potential (practically equal to the resting potential in this model), ∆_T = 1 mV is a 'sharpness' parameter related to spike width and upstroke velocity, and V_T = 20 mV is a reference potential somewhat related to the spike threshold (it would be the spike threshold in the limit ∆_T → 0). A spike was said to be emitted when V_i ≥ V_peak = 25 mV, after which the membrane potential was reset to V_r = 0 mV. The membrane time constant τ_m was 15 ms for excitatory neurons and 10 ms for inhibitory neurons. The term h_i^st represents a constant stimulus input current which was present only during stimulus presentation. h_i^ext represents a constant external current, presumably coming from more distant regions or other brain areas. The term h_iα^syn(t), where α ∈ {E, I}, is the recurrent synaptic input coming from excitatory and inhibitory neurons of the network, respectively. The synaptic input to neuron i obeyed the equation

τ_syn,α dh_iα^syn/dt = −h_iα^syn + τ_syn,α Σ_j w_ij Σ_k δ(t − t_j^(k)),    (6)

where t_j^(k) denotes the arrival time of the k-th spike of neuron j, w_ij is the synaptic weight from neuron j to neuron i, and τ_syn,α is the synaptic time constant of the presynaptic inputs, equal to τ_syn,E = 3 ms for excitatory inputs and τ_syn,I = 2 ms for inhibitory inputs (see Table 1).
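A minimal Euler integration of the single-neuron dynamics, Eq. 5, might look as follows (our sketch; the total input is a constant placeholder standing in for h^st + h^ext + h^syn rather than the full recurrent model).

```python
import numpy as np

# EIF parameters from the text
V_L, V_T, V_peak, V_r = 0.0, 20.0, 25.0, 0.0   # mV
Delta_T, tau_m = 1.0, 15.0                     # mV, ms (excitatory neuron)
dt, T = 0.05, 500.0                            # ms

V, spikes = V_L, []
h_input = 21.0                                 # placeholder total input (mV)
for step in range(int(T / dt)):
    dV = (-(V - V_L) + Delta_T * np.exp((V - V_T) / Delta_T) + h_input) / tau_m
    V += dt * dV
    if V >= V_peak:                            # spike emission and reset
        spikes.append(step * dt)
        V = V_r

print(f"{len(spikes)} spikes in {T:.0f} ms")
```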
Plasticity rule and training protocol. In our model, only the synapses between excitatory neurons were plastic, and they were subject to the following plasticity rule:

ẇ_ij = s_j(t) ( e^(−β w_ij²) [ṽ_i − θ_LTP]_+ − [θ_LTD − ṽ_i]_+ ),    (7)

where s_j(t) = Σ_k δ(t − t_j^(k)) is the presynaptic spike train and [x]_+ := max(x, 0) is the rectified linear function. The dynamical threshold θ_i(t) evolved according to Eq. 2 of the main text,

τ_θ θ̇_i = g tanh( (ṽ_i + γ s̃_i − θ_i) / θ_a ),    (8)

where the notation x̃ indicates a variable obeying the equation τ_x dx̃/dt = −x̃ + x(t), i.e., x̃ is a low-pass filter of the variable x(t). Specifically, ṽ_i is the exponential voltage term ∆_T e^((V_i − V_T)/∆_T) low-pass filtered with time constant τ_v = 50 ms, while s̃_i is the postsynaptic spike train low-pass filtered with time constant τ_s = 1 s. τ_θ, θ_a and γ were constant (see Table 1).
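Putting the pieces together, one possible vectorized implementation of a single update step of Eqs. 7-8 is sketched below (our reconstruction; the learning rate and parameter values are placeholders, since the actual values are those of Table 1).

```python
import numpy as np

def plasticity_step(W, v_tilde, s_tilde, theta, pre_spiked, is_exc,
                    beta=10.0, dt=0.1, tau_theta=200.0, g=1.0,
                    theta_a=1.0, gamma=5.0, lr=1e-4):
    """One Euler step of the plasticity rule (Eq. 7) and threshold (Eq. 8).

    W:          weight matrix, W[i, j] from neuron j to neuron i
    v_tilde:    low-pass filtered exponential voltage term, per neuron
    s_tilde:    low-pass filtered postsynaptic spike train, per neuron
    theta:      per-neuron adaptive threshold (theta_LTP = theta_LTD = theta)
    pre_spiked: boolean array, True for neurons that spiked this step
    is_exc:     boolean array, True for excitatory neurons (E-to-E plasticity only)
    """
    # threshold dynamics (Eq. 8)
    theta += dt / tau_theta * g * np.tanh((v_tilde + gamma * s_tilde - theta) / theta_a)

    # weight update triggered by presynaptic spikes (Eq. 7)
    ltp = np.exp(-beta * W**2) * np.maximum(v_tilde - theta, 0.0)[:, None]
    ltd = np.maximum(theta - v_tilde, 0.0)[:, None]
    dW = lr * (ltp - ltd) * pre_spiked[None, :]   # only columns with a pre spike
    dW[~is_exc, :] = 0.0                          # postsynaptic neuron must be E
    dW[:, ~is_exc] = 0.0                          # presynaptic neuron must be E
    W += dW
    return W, theta
```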
In the basic network, initial training was performed by randomly presenting one stimulus out of Q every 2000 ms for a duration of 10 minutes (presentation at random times did not affect the results). Each stimulus lasted 500 ms and was presented an average of 30 times during the training period.
The training time was scaled linearly in larger networks (i.e., 20 minutes for a network with 2Q clusters, and so on), resulting in the same mean number of presentations per stimulus in all networks. Perturbing stimuli after training (Fig. 4), whether reoccurring or not, were presented for 200 ms at random times characterized by an exponential inter-event distribution with a mean of 10 s.
Quantification of neural and synaptic activity. A cluster is defined as a subpopulation of excitatory neurons targeted by a given stimulus during training. An active cluster (see below) is interpreted as an internal representation of the corresponding stimulus, and therefore as a memory of that stimulus in the absence of external stimulation (active clusters could also represent decision variables and other abstract variables, see e.g. (La Camera et al., 2019)). Cluster activity was measured by the normalized firing rate of its neurons; specifically, for the q-th cluster,

m_q(t) = Σ_i η_q,i ν_i(t) / Σ_i ν_i(t),

where ν_i(t) is the firing rate of neuron i at time t and η_q,i = 1 if neuron i is a target of stimulus q, and zero otherwise. We call m_q the 'overlap' between the network activity and stimulus q. By definition, 0 ≤ m_q ≤ 1: m_q = 1 implies that only neurons responding to stimulus q have non-zero firing rates; m_q = 0 means that the neurons targeted by stimulus q are all silent. A memory for stimulus q was said to be active if m_q > 0.5. Since Σ_q m_q ≈ 1, this guarantees that at any given time there is at most one active memory state (Σ_q m_q is not exactly 1 because neurons can be targeted by multiple stimuli).
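The overlap can be computed directly from instantaneous firing-rate estimates. A minimal sketch consistent with the definition above is given below (the helper name and the random stand-in rates are ours).

```python
import numpy as np

def overlaps(rates, eta):
    """Compute m_q(t) = sum_i eta[q, i] * nu_i(t) / sum_i nu_i(t).

    rates: array (N_E, T) of instantaneous firing rates nu_i(t)
    eta:   boolean array (Q, N_E), eta[q, i] = True if neuron i is targeted by q
    Returns array (Q, T) of overlaps, each in [0, 1].
    """
    total = rates.sum(axis=0, keepdims=True)   # summed rate per time bin
    total[total == 0.0] = 1.0                  # avoid division by zero
    return (eta.astype(float) @ rates) / total

rng = np.random.default_rng(3)
eta = rng.random((10, 800)) < 0.1
rates = rng.random((800, 100))                 # stand-in rate traces
m = overlaps(rates, eta)
active = m > 0.5                               # active memory states
print(m.shape, active.sum())
```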
To quantify the amount of learning, we defined the average synaptic weight among synapses connecting neurons sharing at least one sensory stimulus (dubbed 'w_1'), and the average weight among neurons that did not share any sensory stimuli ('w_0'), according to Eq. 3 of the main text, repeated here:

w_C = (1/N_C) Σ_{ij∈S_C} w_ij,    (9)

where C ∈ {0, 1}, N_C is the number of synapses of type C, and S_C is the set of ij indices of synapses of type C. The mean synaptic change for synapses of type C, ∆w_C, was defined analogously, i.e., by replacing w_ij with ∆w_ij in Eq. 9:

∆w_C = (1/N_C) Σ_{ij∈S_C} ∆w_ij.    (10)

Scaling laws for synaptic decay rate. Here we show that, in the limit of large N, the synaptic decay rate is proportional to the excitatory firing rates and the coding level f. Under the scaling used in Fig. 3, i.e., Q ∝ N_E ∝ N and f ∼ 1/Q as N → ∞, this leads to a synaptic decay rate proportional to 1/N.
According to the plasticity rule, Eq. 7, the rate of change of the synaptic weight w_ij is proportional to the product of a postsynaptic function of the membrane potential, ϕ_i, and the presynaptic spike train s_j(t) = Σ_k δ(t − t_j^(k)). The total synaptic weight change for synapses of a given class C (see Eq. 9) during an interval ∆t is therefore

Σ_{ij∈S_C} ∆w_ij = Σ_{ij∈S_C} ∫_t^{t+∆t} ϕ_i(t′) s_j(t′) dt′,    (11)

where the indices i, j on the right hand side are such that ij ∈ S_C. The postsynaptic term is the sum of two contributions, one coming from active neurons, ϕ_i^(a), and one coming from inactive neurons, ϕ_i^(na), where active neurons are those firing in an active cluster. Of these two contributions, only ϕ_i^(a) has non-zero mean during learning, because occasional pre- and postsynaptic spikes in inactive neurons produce no average synaptic change, as shown in Fig. 1c.
We further assume that we can separate the pre- and postsynaptic terms in the integral (on account of the latter term being much slower than the former), obtaining

Σ_{ij∈S_C} ∆w_ij ≈ Σ_{ij∈S_C} ⟨ϕ_i^(a)⟩ ∫_t^{t+∆t} s_j(t′) dt′.    (12)

Assuming that only a finite number of clusters is active at any given time, the mean number of active postsynaptic neurons connected by a synapse of type C is proportional to f N_E ∝ f N (the mean number of neurons in each cluster), giving

Σ_i ⟨ϕ_i^(a)⟩ ∝ f N.    (13)

Assuming Poisson spike trains of average rate ν_C, the presynaptic term scales as

Σ_j ∫_t^{t+∆t} s_j(t′) dt′ ∝ p_C N ν_C ∆t,    (14)

where p_C is the probability that a synapse is of type C. Putting all together, we get

Σ_{ij∈S_C} ∆w_ij ∝ f N · p_C N · ν_C ∆t.    (15)

From this equation and Eq. 10 with N_C ≈ ⟨N_C⟩ = p_C N_E² ∼ p_C N², the rate of change of w_C in an interval ∆t is then

∆w_C / ∆t ∝ f ν_C ∝ 1/N.    (16)

Figure 1: Illustration of the plasticity rule. (a) Time scale of τ_θ θ̇ = tanh(−θ + ∆) as a function of ∆, where ∆ ≡ ṽ_i + γs̃_i (see Eq. 2). T_half is defined as the time it takes for θ(0) = 0 to reach half of ∆ (in units of τ_θ); for ∆ ≳ 3 it increases linearly with ∆ (see Methods). (b) The learning rule ignores the subthreshold fluctuations of the membrane potential when the neuron does not fire spikes. (c) After the postsynaptic neuron fires a single spike, the synapse undergoes a short window for LTP (blue shaded area) followed by a longer window for LTD (red shaded area). (d) Repeated activation of the postsynaptic neuron will cause a transition from LTP to LTD.

Table 1: Model parameters. See the text for details.