A Self-Organizing Short-Term Dynamical Memory Network

Working memory requires information about external stimuli to be represented in the brain even after those stimuli go away. This information is encoded in the activities of neurons, and neural activities change over timescales of tens of milliseconds. Information in working memory, however, is retained for tens of seconds, raising the question of how time-varying neural activities maintain stable representations. Prior work shows that, if the neural dynamics are in the 'null space' of the representation, so that changes to neural activity do not affect the downstream read-out of stimulus information, then information can be retained for periods much longer than the timescale of individual neuronal activities. That prior work, however, requires precisely constructed synaptic connectivity matrices, without explaining how these would arise in a biological neural network. To identify mechanisms through which biological networks can self-organize to learn memory function, we derived biologically plausible synaptic plasticity rules that dynamically modify the connectivity matrix to enable information storage. Networks implementing this plasticity rule can successfully learn to form memory representations even if only 10% of the synapses are plastic; they are robust to synaptic noise, and they can represent information about multiple stimuli.


INTRODUCTION
Working memory is a key cognitive function, and it relies on us retaining representations of external stimuli even after they go away. Stimulus-specific elevated firing rates have been observed in the prefrontal cortex during the delay period of working memory tasks, and are the main neural correlates of working memory (Funahashi et al., 1993; Fuster & Alexander, 1971). Perturbations to the delay-period neural activities cause changes in the animal's subsequent report of the remembered stimulus representation (Li et al., 2016; Wimmer et al., 2014). These elevated delay-period firing rates are not static but have time-varying dynamics, with activities changing over timescales of tens of milliseconds (Brody et al., 2003a; Romo et al., 1999), yet information can be retained for tens of seconds (Fig. 1A). This raises the question of how time-varying neural activities keep representing the same information.
Prior work from Druckmann & Chklovskii (2012) shows that, if the neural dynamics are in the "null space" of the representation, so that changes to neural activity do not affect the downstream read-out of stimulus information, then information can be retained for periods much longer than the timescale of individual neuronal activities (called the FEVER model; Fig. 1B). That model has a severe fine-tuning problem, discussed below. We identified a synaptic plasticity mechanism that overcomes this fine-tuning problem, enabling neural networks to learn to form stable representations.
While the dynamics of neurons in the FEVER model match that which is observed in the monkey prefrontal cortex during a working memory task (Murray et al., 2017), the model itself requires that the network connectivity matrix have one or more eigenvalues very near to unity. According to the Gershgorin Circle Theorem, this will almost surely not happen in randomly-connected networks: fine-tuned connectivity is needed (Zylberberg & Strowbridge, 2017). Druckmann & Chklovskii (2012) suggest a mechanism of Hebbian learning by which this connectivity can be learned. That mechanism requires the read-out weights to form a 'tight frame', which will not necessarily be true in biological circuits. Thus, the prior work leaves it unknown how synaptic plasticity can form and/or maintain functional working memory networks. Here, we identify biologically plausible synaptic plasticity rules that can solve this fine-tuning problem without making strong assumptions like 'tight frame' representations. Our plasticity rules dynamically re-tune the connectivity matrix to enable persistent representations of stimulus information. We discuss possible biological mechanisms for networks to implement our plasticity rules. We specifically address parametric working memory, where the requirement is to remember continuous values describing several different variables or dimensions such as spatial locations or sound frequencies.
For example, in the oculomotor delayed response task, subjects must remember the location of a target during a delay period (Funahashi et al., 1993). We perform experiments to demonstrate that networks using these plasticity rules are able to store information about multiple stimuli, work even if only a fraction of the synapses are tuned, and are robust to synaptic noise. We also show that these networks improve over time with multiple presented stimuli, and that the learning rules work within densely or sparsely connected networks.
Figure 1: (A) When presented with an external stimulus, $s$, neural activity patterns initially encode an internal representation of that stimulus, $\hat{s}(t=0)$. These neural activity patterns change over timescales of tens of milliseconds, and yet somehow the same information is stored for up to tens of seconds. (B) While the firing rates, $r_i(t)$, change over time, information about the stimulus, $\hat{s}(t)$, can remain unchanged as long as the projection of the firing rates onto the "read-out" vector, $d$, remains constant (Druckmann & Chklovskii, 2012).

THE RATE-BASED NETWORK MODEL
We use a rate-based network like that of the FEVER model (Druckmann & Chklovskii, 2012) but with positive rectifying activation functions (ReLU, rectified linear units) (Carandini, 2004). We use standard linear dynamics in the network model:
$$\tau \frac{da_i}{dt} = -a_i(t) + \sum_j L_{ij}\, r_j(t), \qquad (1)$$
where $a_i(t)$ is the internal state (membrane potential) of the $i$th neuron at time $t$, $r_j(t)$ is the output (firing rate) of the $j$th neuron, $\tau$ is the time constant, and $L_{ij}$ represents the strength of the synapse from neuron $j$ to neuron $i$. Firing rates, $r_j(t)$, are given by a positive rectifying function of the internal states: $r_j(t) = [a_j(t)]_+$. After an external stimulus, $s$, is presented to the network, the network's representation of the stimulus, $\hat{s}(t)$, is obtained from a weighted combination of firing rates:
$$\hat{s}(t) = \sum_i d_i\, r_i(t), \qquad (2)$$
where $d_i$ is the weight of the contribution of neuron $i$ to the stimulus (the "read-out weight"). The initial stimulus value, $\hat{s}(t=0)$, is a weighted combination of a random initialization of the firing rates $r(t=0)$. We first consider a single scalar stimulus value, and study the encoding of multiple stimuli in Sec. 3.5.
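The model description above can be made concrete with a short numerical sketch. It assumes dynamics of the form $\tau\, da_i/dt = -a_i + \sum_j L_{ij} r_j$ with ReLU rates and forward-Euler integration. The function names (`euler_step`, `read_out`), the step size, and the example values are illustrative choices and are not taken from the authors' code.

```python
def relu(x):
    """Positive rectification [x]_+ used for the firing rates."""
    return x if x > 0.0 else 0.0


def euler_step(a, L, tau, dt):
    """Advance the internal states a by one forward-Euler step of size dt.

    Assumed dynamics: tau * da_i/dt = -a_i + sum_j L[i][j] * r_j.
    """
    r = [relu(x) for x in a]
    new_a = []
    for i, a_i in enumerate(a):
        recurrent = sum(L[i][j] * r[j] for j in range(len(a)))
        new_a.append(a_i + dt * (-a_i + recurrent) / tau)
    return new_a


def read_out(a, d):
    """Stimulus estimate s_hat = sum_i d_i * r_i (weighted sum of rates)."""
    return sum(d_i * relu(a_i) for d_i, a_i in zip(d, a))


if __name__ == "__main__":
    a = [1.0, 0.5, -0.2]                      # hypothetical initial states
    L = [[0.0, 0.4, 0.1],                     # zero diagonal: no autapses
         [0.3, 0.0, 0.2],
         [0.5, 0.1, 0.0]]
    d = [0.2, 0.7, 0.4]                       # positive read-out weights
    for _ in range(5):
        a = euler_step(a, L, tau=10.0, dt=1.0)
        print(round(read_out(a, d), 4))
```

With all synapses set to zero, each state simply relaxes toward zero at rate $1/\tau$, which gives a quick sanity check on the integrator.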

PLASTICITY RULES
To organize the network, we update the synaptic weights, $L_{ij}$, so as to minimize changes in the stimulus representation. To do this, we differentiated Equation 2 with respect to time to calculate $d\hat{s}/dt$. We used gradient descent with respect to $L_{ij}$ on the loss function $(d\hat{s}/dt)^2$ to calculate the update rule:
$$\Delta L_{ij} = -\eta \frac{\partial}{\partial L_{ij}} \left(\frac{d\hat{s}}{dt}\right)^2 = -\frac{2\eta}{\tau}\, \frac{d\hat{s}}{dt}\, d_i\, \frac{dr_i}{da_i}(t)\, r_j(t), \qquad (3)$$
where $\eta$ is the learning rate of the network and $\frac{dr_i}{da_i}(t)$ is the slope of the (ReLU) activation function, which is 1 for all positive values of $a_i$, and 0 otherwise. In section 3.6, we discuss potential biological mechanisms that allow a network to implement these plasticity rules. We chose the elements of $d$ to be positive based on candidate sources of the global error signal discussed in section 3.6. The source code and scripts for reproducing our experiments are available at https://github.com/cfederer/SOMN.
To evaluate the performance of our working memory networks, we asked how well the networks could store information about a scalar stimulus value. We quantified the remembered value relative to initial, $\hat{s}(t)/\hat{s}(t=0)$, over 3 seconds, where 1.0 indicates perfect memory, and values above or below indicate failure to remember the initial value. We evaluate over 3 seconds because this is the duration of heightened activity observed during working memory tasks (Funahashi et al., 1993). The networks were all-to-all connected, with autapses excluded (so the diagonal elements of $L_{ij}$ are all zero), and contained 100 neurons. Partial connectivity is discussed in Sec. 3.3. We evaluated how well random networks without plastic synapses store information, by initializing each network with random neural activities, $a(t=0)$, random read-out weights, $d$, and random connection weight matrices, $L$, and simulating the dynamical evolution of the representation. We then compared these "constant random synapse" networks to ones with identical random initial conditions, but in which our plasticity rule (Eq. 3) dynamically updated the synapses (Figs. 2B, C). Finally, we compared both of these randomly-initialized networks to ones that had the fine-tuned connectivity specified by the FEVER model.
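The gradient-descent step described above can be checked numerically. The sketch below assumes the rate dynamics $\tau\, da_i/dt = -a_i + \sum_j L_{ij} r_j$ and the loss $(d\hat{s}/dt)^2$; under those assumptions the gradient with respect to $L_{ij}$ is $(2/\tau)(d\hat{s}/dt)\, d_i\, r'_i\, r_j$. The helper names and all numerical values are hypothetical.

```python
def s_dot(a, L, d, tau):
    """d s_hat/dt = sum_i d_i * r'_i * da_i/dt under the assumed dynamics."""
    n = len(a)
    r = [max(x, 0.0) for x in a]
    total = 0.0
    for i in range(n):
        slope = 1.0 if a[i] > 0.0 else 0.0           # ReLU slope dr_i/da_i
        da_dt = (-a[i] + sum(L[i][j] * r[j] for j in range(n))) / tau
        total += d[i] * slope * da_dt
    return total


def plasticity_update(a, L, d, tau, eta):
    """Return Delta L = -eta * d/dL (d s_hat/dt)^2, keeping autapses at zero."""
    n = len(a)
    r = [max(x, 0.0) for x in a]
    err = s_dot(a, L, d, tau)                         # global error signal
    dL = [[0.0] * n for _ in range(n)]
    for i in range(n):
        slope = 1.0 if a[i] > 0.0 else 0.0
        for j in range(n):
            if i != j:
                dL[i][j] = -2.0 * eta * err * d[i] * slope * r[j] / tau
    return dL


# Hypothetical three-neuron example.
a = [0.8, 0.3, 0.5]
d = [0.2, 0.6, 0.4]
L = [[0.0, 0.1, 0.3], [0.2, 0.0, 0.1], [0.4, 0.2, 0.0]]
loss_before = s_dot(a, L, d, tau=10.0) ** 2
dL = plasticity_update(a, L, d, tau=10.0, eta=1.0)
L_new = [[L[i][j] + dL[i][j] for j in range(3)] for i in range(3)]
loss_after = s_dot(a, L_new, d, tau=10.0) ** 2
print(loss_before, loss_after)    # the loss at the current state shrinks
```

Because the loss is quadratic in $L$ for a fixed activity pattern, a sufficiently small learning rate always reduces it at the current state. Whether the representation stays stable over time depends on the full simulation.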
The stimulus value in the randomly-initialized plastic network initially decreased slightly, but the plasticity rules quickly reorganized the connectivity, and the representation remained constant after the first ∼50 ms. The synapses change slightly in the initial steps and then stop changing once the network has learned (Fig. A1). Initially, in the plastic random synapse model, the firing rates are highly dynamic, while the stimulus value remains mostly constant (Fig. 2A). In the networks with fixed random synaptic weights, the representation and firing rates quickly decay to 0 (Figs. 2A, B). Thus, our plasticity rule enables initially random networks to quickly become effective working memory systems.
To ensure that the success of our plasticity rule at forming an effective memory network was not limited to a fortuitous random initialization, we quantified the fraction of stimulus retained over 10 different networks, each with a different initial connectivity matrix, read-out vector, and initial activity vector a(t = 0). In the FEVER networks, stimulus retention is perfect across all networks. The models with plastic random synapses perform almost as well as the FEVER models, but require some time to self-organize before the representations remain constant (Figs. 2B, C). In the random constant synapse networks, information is quickly lost (Fig. 2B).
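The comparison between constant and plastic random synapses can be sketched as a toy simulation that records the remembered value relative to initial, $\hat{s}(t)/\hat{s}(t=0)$. This is a minimal sketch, not the authors' experiment. The network size, weight scale, learning rate, and forward-Euler integration are all assumptions made for illustration, and no particular outcome is claimed for these parameter choices.

```python
import random


def simulate(n=20, steps=300, tau=10.0, dt=1.0, eta=0.0, seed=0):
    """Return s_hat(t) / s_hat(0) at every step; eta=0 keeps synapses fixed."""
    rng = random.Random(seed)
    a = [rng.uniform(0.1, 1.0) for _ in range(n)]     # random initial states
    d = [rng.uniform(0.1, 1.0) for _ in range(n)]     # positive read-out weights
    L = [[0.0 if i == j else rng.gauss(0.0, 1.0 / n) for j in range(n)]
         for i in range(n)]                           # random synapses, no autapses

    def s_hat(state):
        return sum(d[i] * max(state[i], 0.0) for i in range(n))

    s0 = s_hat(a)
    trace = [1.0]
    for _ in range(steps):
        r = [max(x, 0.0) for x in a]
        slope = [1.0 if x > 0.0 else 0.0 for x in a]
        da = [(-a[i] + sum(L[i][j] * r[j] for j in range(n))) / tau
              for i in range(n)]
        err = sum(d[i] * slope[i] * da[i] for i in range(n))   # d s_hat/dt
        if eta > 0.0:
            for i in range(n):
                for j in range(n):
                    if i != j:
                        # gradient step on (d s_hat/dt)^2
                        L[i][j] -= 2.0 * eta * err * d[i] * slope[i] * r[j] / tau
        a = [a[i] + dt * da[i] for i in range(n)]
        trace.append(s_hat(a) / s0)
    return trace


fixed = simulate(eta=0.0)
plastic = simulate(eta=0.5)
print("constant random synapses:", round(fixed[-1], 3))
print("plastic random synapses :", round(plastic[-1], 3))
```

Averaging such traces over several seeds, as done in the text over 10 initializations, would give curves comparable in form to those described.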

NOISY SYNAPSES
Working memory must be robust to noise and imprecise components (Brody et al., 2003b). We added Gaussian noise with mean 0 and standard deviation 0.00001 to demonstrate that FEVER networks with initially tuned constant synapses forget the stimulus representation, whereas networks with randomly initialized plastic synapses are robust to added noise (Fig. 3A). To determine how robust these plastic random networks are to noise, we simulated networks with various levels of noise added to the synaptic updates. We added Gaussian noise with mean 0 and standard deviation α times the update to the synapse, $\Delta L_{ij}$, where α was varied from 0 to 1. (Fig. 3C shows a histogram for 1000 randomly sampled synaptic updates.) We quantified the average remembered value relative to initial for networks with various noise levels and found that the noise did not have a noticeable effect on the network performance for α ≤ 1, whereas α = 10 exceeds the synaptic noise that networks can handle while still storing information (Fig. 3B). The remembered value relative to initial is almost equivalent for all values of α ≤ 1, with minor differences due to random initial conditions. This should not be surprising considering that multiplying error signals by random synaptic weights does not hinder learning, so long as the network is still being pushed down the loss gradient (Lillicrap et al., 2016).
Figure 3: Network robustness to noise. (A) The average remembered value relative to initial retained for 10 random initializations for a FEVER network (black) and plastic random network (blue) with Gaussian noise of mean 0 and standard deviation 0.00001. (B) The average remembered value relative to initial for 10 random initializations of networks with differing levels of synaptic update noise (α ≤ 1). (C) A histogram of 1000 randomly sampled synaptic weight updates during the first step of training for one initialization of a plastic random network, presented on a log scale. Shaded areas in A and B represent ± standard error of the mean over the 10 different random initializations.
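The update-noise manipulation described above (Gaussian noise with standard deviation α times each update) can be sketched as follows. The function name `add_update_noise` and the example update values are hypothetical. The sketch only shows how such noise could be applied to a matrix of updates $\Delta L_{ij}$.

```python
import random


def add_update_noise(dL, alpha, rng):
    """Perturb each update with zero-mean Gaussian noise of s.d. alpha * |update|."""
    return [[u + rng.gauss(0.0, alpha * abs(u)) for u in row] for row in dL]


rng = random.Random(0)
# Hypothetical synaptic updates; zeros on the diagonal (no autapses).
dL = [[0.0, 2e-4, -1e-4],
      [3e-4, 0.0, 5e-5],
      [-2e-4, 1e-4, 0.0]]
noisy = add_update_noise(dL, alpha=0.5, rng=rng)
for row in noisy:
    print(row)
```

Because the noise scales with the size of each update, entries with zero update (such as the excluded autapses) stay exactly zero, and α = 0 recovers the noiseless rule.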

PARTIAL TUNING
While synapses are plastic, it is not known if all synapses change. To determine how well the network performs if only some synapses are updated, we simulated networks in which different fractions of the synapses were updated using Eq. 3: the other synapses were held constant. We quantified the remembered value relative to initial for these networks (Figs. 4A, B). Even with just 10% of the synapses being tuned, the networks learn to store information about the stimulus. In hindsight, this makes sense. To store n stimulus values, n constraints must be satisfied by the connectivity matrix: it must have n eigenvalues near 1 (we chose n = 1 for Figs. 4A, B). Because the connectivity matrix has many more than n elements, many configurations can satisfy the constraint, so it is possible to satisfy the constraint without updating every synapse.
Figure 4: Network robustness to partial plasticity. (A) The average remembered value relative to initial for 10 random initializations. Different lines are for different fractions of plastic synapses. (B) A zoomed-in look at the average remembered value relative to initial for 10% − 90% plasticity. Shaded areas in A and B represent ± standard error of the mean over the 10 different random initializations.
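One simple way to realize partial plasticity in a simulation is a fixed boolean mask over the synapses, as sketched below. How the authors selected the plastic subset is not stated here, so the uniform random choice and the helper names (`plastic_mask`, `apply_masked_update`) are assumptions.

```python
import random


def plastic_mask(n, fraction, rng):
    """Mark a random `fraction` of the off-diagonal synapses as plastic."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    chosen = set(rng.sample(pairs, round(fraction * len(pairs))))
    return [[(i, j) in chosen for j in range(n)] for i in range(n)]


def apply_masked_update(L, dL, mask):
    """Add dL to L only where mask is True; all other synapses stay constant."""
    n = len(L)
    return [[L[i][j] + dL[i][j] if mask[i][j] else L[i][j] for j in range(n)]
            for i in range(n)]


rng = random.Random(0)
mask = plastic_mask(10, 0.1, rng)             # 10% of the 90 synapses
print(sum(sum(row) for row in mask))          # 9 plastic synapses
```

The same masking idea extends to partial connectivity by zeroing the masked-out weights instead of freezing them.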

Figure 5: Network robustness to partial connectivity. (A) The average remembered value relative to initial for 10 random initializations. Different lines are for different connection probabilities. (B) A zoomed-in look at the average remembered value relative to initial for 10% − 90% connection probability. Shaded areas in A and B represent ± standard error of the mean over the 10 different random initializations.
In the previous results, the connectivity was all-to-all (100%). Real neural circuits are not 100% connected. In visual cortex, for example, connection probabilities range from 50% to 80% for adjacent neurons (Hellwig, 2000). To ask if our synaptic update rule could self-organize partially connected networks, we simulated networks with different connection probabilities and with synapses updated using Eq. 3. We quantified the average remembered value relative to initial for 10 random initializations. Performance declines somewhat as connectivity decreases, but even networks with 10% connection probabilities can learn to store stimulus information (Figs. 5A, B). Experiments show that working memory performance declines with age, which correlates with a reduction in number of synapses (Peters et al., 2008).

PRE-TRAINING THE NETWORK
In the previous examples, each network is initialized with random connection weights. In reality, the working memory network will be continuously learning and will not start over with random connection weights when each new stimulus is presented. Consequently, we speculated that, once the network had learned to store one stimulus, it should be able to remember subsequently presented stimuli, even with minimal re-training. Relatedly, experimental work shows that performance in working memory tasks in children and young adults can be increased not only for trained tasks but for new tasks not part of the training: this coincides with strengthening of connectivity in the prefrontal cortex (Constantinidis & Klingberg, 2016).
To determine if our synaptic update rule enables the network to store new stimuli without further training, we first trained the networks (Eq. 3) to remember 1, 5 or 10 individual stimuli, one at a time: each new stimulus corresponded to another random initialization of the firing rates $r(t=0)$. We quantified the networks' abilities to represent these training stimuli, and found that the networks performed better on each subsequent stimulus: training improved performance (Fig. 6A). Next, we asked if, after training on 1, 5, or 10 prior stimuli, the network could store information about a new stimulus without any more synaptic updates. We found that a network was able to store information about a new stimulus after being trained on at least 1 previous stimulus, and did better after training on 5 or 10 stimuli (Fig. 6B). Once the connectivity weight matrix ($L$) has obtained one or more eigenvalues near unity, it is able to stably store novel stimuli without additional training.
Figure 6: Training improves performance. (A) The average remembered value relative to initial over 10 random initializations for a plastic random synapse network that has seen 0 previous stimuli, 1 previous stimulus, 5 previous stimuli or 10 previous stimuli. (B) The average remembered value relative to initial over 10 random initializations for a constant synapse network that has previously been trained on 0, 1, 5, or 10 previous stimuli, but with no training during the simulation period shown. Shaded areas in A and B represent ± standard error of the mean over the 10 different random initializations.
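The eigenvalue condition mentioned above can be illustrated with a two-neuron toy example. It assumes dynamics of the form $\tau\, da/dt = -a + L r$ with read-out $\hat{s} = d \cdot r$. In the regime where all $a_i > 0$, the read-out is stationary whenever $d^T L = d^T$, that is, when $d$ is a left eigenvector of $L$ with eigenvalue 1. The matrices below are hand-picked for illustration and are not from the paper.

```python
def readout_drift(a, L, d, tau=10.0):
    """d s_hat/dt for ReLU rate dynamics tau * da/dt = -a + L r (assumed form)."""
    n = len(a)
    r = [max(x, 0.0) for x in a]
    drift = 0.0
    for i in range(n):
        slope = 1.0 if a[i] > 0.0 else 0.0
        drift += d[i] * slope * (-a[i] + sum(L[i][j] * r[j] for j in range(n))) / tau
    return drift


def left_product(d, L):
    """Row vector d^T L, to be compared against d^T."""
    n = len(d)
    return [sum(d[i] * L[i][j] for i in range(n)) for j in range(n)]


d = [1.0, 1.0]
tuned = [[0.0, 1.0], [1.0, 0.0]]       # d^T L = d^T: left eigenvalue 1
detuned = [[0.0, 0.5], [0.5, 0.0]]     # left eigenvalue 0.5
a = [0.3, 0.7]                          # any state with all a_i > 0

print(left_product(d, tuned))           # [1.0, 1.0]
print(readout_drift(a, tuned, d))       # zero drift: the read-out is held
print(readout_drift(a, detuned, d))     # negative drift: the read-out decays
```

Because the stationarity condition does not depend on the particular state $a$, a tuned matrix holds novel stimuli as well, which is consistent with the observation that pre-trained networks need no further updates.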

REMEMBERING MULTIPLE STIMULI
The previous section shows how networks can self-organize to store information about one stimulus value, but working memory capacity in adult humans is typically 3-5 items (Cowan, 2010). To incorporate this working memory capacity into our models, we adapted the representation such that there were multiple read-out vectors, one for each stimulus value. We then derived plasticity rules via gradient descent on the squared and summed time derivatives of these representations: the loss was $\sum_k \left(\frac{d\hat{s}_k}{dt}\right)^2$. This led to the plasticity rule
$$\Delta L_{ij} = -\frac{2\eta}{\tau}\, \frac{dr_i}{da_i}(t)\, r_j(t) \sum_{k=1}^{n} d_i^k\, \frac{d\hat{s}_k}{dt}, \qquad (4)$$
where n is the number of stimuli to be remembered. We chose n = 4 for our experiments, and we quantified how well the networks remember these stimuli (Stims 1-4 in Fig. 7). In our experiments, each neuron in the network contributed to the representation of every stimulus value: in vivo, most neurons are sensitive to multiple aspects of stimuli (Miller & Fusi, 2013). This is not a requirement: the models successfully represent multiple stimuli even when subsets of the neurons participate in each representation.
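The multi-stimulus rule can be sketched in the same way as the single-stimulus one. It assumes the rate dynamics $\tau\, da_i/dt = -a_i + \sum_j L_{ij} r_j$ and the loss $\sum_k (d\hat{s}_k/dt)^2$. The read-out matrix `D` (one row per stimulus) and all numbers are hypothetical, and n = 2 is used only to keep the example small.

```python
def s_dots(a, L, D, tau):
    """Error signals d s_hat_k/dt, one per read-out vector D[k]."""
    n = len(a)
    r = [max(x, 0.0) for x in a]
    slope = [1.0 if x > 0.0 else 0.0 for x in a]
    da = [(-a[i] + sum(L[i][j] * r[j] for j in range(n))) / tau for i in range(n)]
    return [sum(d_k[i] * slope[i] * da[i] for i in range(n)) for d_k in D]


def multi_update(a, L, D, tau, eta):
    """Delta L = -eta * d/dL sum_k (d s_hat_k/dt)^2, autapses excluded."""
    n = len(a)
    r = [max(x, 0.0) for x in a]
    errs = s_dots(a, L, D, tau)
    dL = [[0.0] * n for _ in range(n)]
    for i in range(n):
        slope = 1.0 if a[i] > 0.0 else 0.0
        # Feedback reaching neuron i: sum_k d_i^k * (d s_hat_k/dt).
        feedback = sum(D[k][i] * errs[k] for k in range(len(D)))
        for j in range(n):
            if i != j:
                dL[i][j] = -2.0 * eta * feedback * slope * r[j] / tau
    return dL


def loss(a, L, D, tau):
    return sum(e * e for e in s_dots(a, L, D, tau))


a = [0.8, 0.3, 0.5]
L = [[0.0, 0.1, 0.3], [0.2, 0.0, 0.1], [0.4, 0.2, 0.0]]
D = [[0.2, 0.6, 0.4], [0.5, 0.1, 0.3]]      # two read-out vectors (n = 2)
dL = multi_update(a, L, D, tau=10.0, eta=1.0)
L_new = [[L[i][j] + dL[i][j] for j in range(3)] for i in range(3)]
print(loss(a, L, D, 10.0), loss(a, L_new, D, 10.0))
```

With a single read-out vector the `feedback` term reduces to $d_i\, d\hat{s}/dt$, recovering the single-stimulus update.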

POTENTIAL BIOLOGICAL MECHANISMS
Synaptic updates are thought to rely on synaptically local information, like the activities of the pre- and post-synaptic neurons. Our plasticity rules (Eqs. 3, 4) involve this information, in addition to "global" error value(s) $\{d\hat{s}_k/dt\}$. Thus, we obtained three-factor rules: synaptic changes depend on the pre- and post-synaptic neurons' activities, and a global error signal (Lillicrap et al., 2016). There are two facets of the networks presented above that are unclear in their biological interpretations: the source (and precision) of the global error signal, and the symmetry of the feedback signals that convey the error information to the synapses. We discuss these issues in more detail below, and show that they are not firm requirements: our networks can learn to form memory representations even with more realistic asymmetric feedback, and with very coarse (even binary) error signals. Consequently, we demonstrate that synaptic plasticity mechanisms can successfully form self-organizing memory networks with minimal constraints.

SOURCES OF THE GLOBAL ERROR SIGNAL
Below, we propose two sources for the global error signal(s), $\{d\hat{s}_k/dt\}$: feedback to the neurons' apical dendrites (Fig. 8A), and neuromodulatory chemicals that modulate the plasticity (Fig. 8B). These are not mutually exclusive, and in either case, continuous training is not required for functioning working memory (Fig. 6): the feedback signals need not be constantly available. In both implementations, a downstream "read-out" system estimates $\hat{s}$, and sends that information (or information about its time derivative) back to the memory network. This means that, at first glance, the read-out weights $d_i$ must be accessed in two separate places: at the synapse, $L_{ij}$, to calculate the update, and at the read-out layer to calculate the remembered stimulus value (Eq. 2). There are no known mechanisms that would allow this same $d_i$ value to be accessed at distantly located regions of the brain (Lillicrap et al., 2016), posing a challenge to the biological plausibility of our networks. To address this issue, and show that known feedback and neuromodulatory mechanisms could implement our memory circuits, we later consider random (asymmetric) feedback weights, where the read-out weight used to estimate $\hat{s}$ differs from the one used to update the synaptic weights (Fig. 8C). We also consider that error signals, either via segregated dendrites or neuromodulators, would likely be on a slower time scale than activity in the network. We show that the networks can self-organize with delayed plasticity (Fig. A2). Networks with longer delays between plasticity updates require some pre-training before effectively storing information (Fig. A2).
Calculating the Global Error Signal Locally Using Segregated Dendrites: The global error signal could be calculated locally by each neuron, by exploiting the fact that, in pyramidal cells, feedback arrives at the apical dendrites, and modulates synaptic plasticity at the basal dendrites (where information comes in from other cells in the memory network) (Lillicrap et al., 2016; Guerguiev et al., 2017) (Fig. 8A). Here, a read-out layer provides feedback to the apical dendrites that specifies the represented stimulus value $\hat{s}(t)$: the weight of the feedback synapse to neuron $i$ from read-out neuron $k$ is $d_i^k$, and so the apical dendrite receives a signal $\sum_k d_i^k \hat{s}_k(t)$. The apical dendrites track the changes in these feedback signals, sending that information ($\sum_k d_i^k\, d\hat{s}_k/dt$) to basal dendrites via the soma. Correlating those signals with the pre- and post-synaptic activity at each of the synapses on the basal dendrites, the synaptic updates specified by Eq. 4 are obtained. Thus, the neurons locally compute the synaptic updates.
Signalling the Global Error Signal with Neuromodulators: Alternatively, the global error signal(s) could be communicated throughout the network by neuromodulators, like dopamine, acetylcholine, serotonin, or norepinephrine. These have all been shown to be important in synaptic plasticity in the prefrontal cortex and in working memory (Meunier et al., 2017). This is reward learning, with the reward values coming from the neuromodulator concentrations. Experimental work shows that synapses have activity-dependent "eligibility traces" that are converted into changes in synaptic strength by reward-linked neuromodulators (He et al., 2015). In this scenario, the concentrations of the different modulators track the error signals, $d\hat{s}_k/dt$, and the densities of the receptors to the modulators at each synapse are $d_i^k$. Thus, at each synapse, the modulators bring information $\sum_k d_i^k\, d\hat{s}_k/dt$ that, when correlated with the pre- and post-synaptic activities, yields the updates from Eq. 4 (Fig. 8B).

Figure 8 (panel labels: read-out neuron, neuromodulator). (B) Cartoon depicting error signals being communicated by neuromodulators. Neurons in the modulatory system calculate the remembered stimulus values via Eq. 5, where $q_i$ is the weight of the synapse from cell $i$ in the memory circuit to the read-out neuron. That cell releases an amount of neuromodulator that tracks the changes in the represented stimulus value. The modulatory chemical affects synaptic plasticity by an amount that depends on the receptor density, $d_i$, at the synapses. Both implementations work with asymmetric feedback: the weights to the read-out neuron from cell $i$ ($q_i$) will not necessarily match the weights, $d_i$, with which cell $i$ receives the feedback signals. (C) The average remembered value relative to initial over 10 random initializations of a network with plastic random synapses with random feedback and read-out weights ($q \neq d$) (blue), plastic random synapses with random feedback and read-out weights and binary error signals ($d\hat{s}_k/dt \in \{\pm 1\}$) (green), and constant synapses (red). The shaded areas represent ± standard error of the mean over the 10 different random initializations.

LEARNING WITH ASYMMETRIC RANDOM FEEDBACK WEIGHTS AND BINARY ERROR SIGNALS
The discussion above shows that it is critical to implement our memory networks with asymmetric feedback weights. To do this, we let the top-down feedback impinge on the neurons at synapses with weights $d_i$, and the read-out layer calculate the remembered stimulus as:
$$\hat{s}(t) = \sum_i q_i\, r_i(t), \qquad (5)$$
where $d \neq q$ (Lillicrap et al., 2016). Here, the values of $q_i$ are randomly drawn, independently from $d_i$, and so we refer to this as asymmetric random feedback.
To test whether networks with this asymmetric feedback could learn to store stimulus information, we simulated such networks and quantified the average remembered value relative to initial. The results (Fig. 8C, upper curve) show that even with asymmetric feedback, networks can learn to store stimulus information. This makes sense because both $q$ and $d$ contain only positive elements, so the feedback update signal to each synapse has the same sign as the update calculated from gradient descent. Thus, the synaptic updates with asymmetric feedback are generally in the same direction as those obtained from gradient descent (i.e., the angle between the update vectors and those from true gradient descent is less than 90°), which suffices for learning (Lillicrap et al., 2016).
Next, we wondered whether our networks require precise error signals $d\hat{s}_k/dt$ to learn to form memory representations, or whether coarser feedback would suffice: if coarser signals suffice, this removes any fine-tuning requirement "hidden" in the precision of the feedback. To answer this question, we repeated the simulations with our asymmetric random feedback networks, but binarized the error signals used in the synaptic plasticity, via the sign function: $\mathrm{sign}(d\hat{s}_k/dt) \in \{\pm 1\}$. The updates are now reduced to either Hebbian or anti-Hebbian learning rules, depending on the sign of the error signal. The results (Fig. 8C, middle curve) show that with asymmetric and binary feedback, the networks can still learn to form memory representations, albeit not quite as well as in the case of highly precise feedback signals (Fig. 8C, upper curve). We found that networks with asymmetric and binary feedback were also robust to partial plasticity and partial connectivity (Fig. A3).
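The alignment argument for asymmetric and binary feedback can be checked on a small example. The sketch below assumes the same rate dynamics as before, reads out the error through weights `q`, and applies it through independent positive feedback weights `d`. Function names and numbers are hypothetical. A positive inner product between two update matrices means the angle between them is below 90°.

```python
def sign(x):
    """Sign of x as -1, 0, or +1 (0 only when the error is exactly zero)."""
    return (x > 0.0) - (x < 0.0)


def feedback_update(a, L, q, d, tau, eta, binary=False):
    """Update using read-out weights q and (possibly different) feedback weights d."""
    n = len(a)
    r = [max(x, 0.0) for x in a]
    slope = [1.0 if x > 0.0 else 0.0 for x in a]
    da = [(-a[i] + sum(L[i][j] * r[j] for j in range(n))) / tau for i in range(n)]
    err = sum(q[i] * slope[i] * da[i] for i in range(n))    # d s_hat/dt via q
    if binary:
        err = float(sign(err))                               # coarse error signal
    return [[0.0 if i == j else -2.0 * eta * err * d[i] * slope[i] * r[j] / tau
             for j in range(n)] for i in range(n)]


def dot(A, B):
    """Inner product of two matrices, treated as flat vectors."""
    return sum(x * y for row_a, row_b in zip(A, B) for x, y in zip(row_a, row_b))


a = [0.8, 0.3, 0.5]
L = [[0.0, 0.1, 0.3], [0.2, 0.0, 0.1], [0.4, 0.2, 0.0]]
q = [0.2, 0.6, 0.4]        # read-out weights
d = [0.5, 0.1, 0.3]        # independently chosen positive feedback weights

grad = feedback_update(a, L, q, q, tau=10.0, eta=1.0)        # symmetric case, d = q
asym = feedback_update(a, L, q, d, tau=10.0, eta=1.0)
coarse = feedback_update(a, L, q, d, tau=10.0, eta=1.0, binary=True)
print(dot(grad, asym) > 0.0, dot(grad, coarse) > 0.0)        # True True
```

With `binary=True` the update no longer depends on the magnitude of the error, only on its sign, which corresponds to the Hebbian or anti-Hebbian switch described in the text.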

DISCUSSION
We derived biologically plausible synaptic plasticity rules through which networks self-organize to store information in working memory. Networks implementing these plasticity rules are robust to synaptic noise, to having only some of the synapses updated, and to partial connectivity. These networks can store multiple stimuli and have increased performance after previous training. We suggest two candidate sources for the global error signal necessary for the plasticity rule, and demonstrate that our networks can learn to store stimulus information while satisfying the added requirements imposed by these biological mechanisms. This flexibility suggests that other types of synaptic plasticity updates may also be able to organize working memory circuits.
The results presented here were obtained for networks of 100 neurons, as opposed to larger networks, to speed up the simulations. Tests on networks with 10,000 neurons show that the update rule works in larger networks. The optimal learning rate, η, decreases as the network size increases. Aside from network size, a potential caveat in using a rate-based network model is losing information about spike-timing dependency. We tested our plasticity rules in leaky integrate-and-fire models and found that spiking networks implementing these update rules learn to store information (Fig. A4).
Along with understanding how information is stored in working memory, this work may have implications in training recurrent neural networks (RNNs). Machine learning algorithms are generally unrealistic from a biological perspective: most rely on non-local synaptic updates or symmetric synapses. We show that recurrent networks can learn to store information using biologically plausible synaptic plasticity rules which require local information plus a global error signal (or signals) that can be calculated on the apical dendrite or via neuromodulators. This same setup could be utilized in RNNs to make them more biologically realistic. This would let us better understand how the brain learns, and could lead to novel biomimetic technologies: prior work on biologically realistic machine learning algorithms has led to hardware devices that use on-chip learning (Knag et al., 2015; Zylberberg et al., 2011). Synaptically local updates do not have to be coordinated over all parts of the chip, enabling simpler and more efficient hardware implementations.
Figure A1: A histogram of 1000 randomly sampled synaptic weights from the 10000 in the network before training (t=0, red) and after training (t=3000, blue).

A.2. DELAYED FEEDBACK DURING PLASTICITY
Based on the potential sources for the global error signal(s) (Figs. 8A, B), there would likely be a delay period before the feedback signal would reach the working memory network. To test whether our network could still learn to store information even if the error signals were delayed, we simulated networks with initially anti-Hebbian learning rules (error signal is -1) until some delay period (10-50 ms) had elapsed, after which the network receives the error signal from time t minus the delay period. Because this is a difficult task, we lowered the learning rate, η, and allowed the network to train on 5 individual previous stimuli, with the delayed plasticity, and then evaluated its ability to store information (while still training) on a novel stimulus. We found that networks with delays in feedback could still store information (Fig. A2). Even with delays of 40-50 ms the networks could still retain some information (Fig. A2).
Figure A2: Delayed plasticity. The average remembered value relative to initial over 10 random initializations of a plastic random network with delayed plasticity that has been trained on 5 previous stimuli. The network initially has an anti-Hebbian learning rule, where the error signal is -1 until feedback arrives after the delay period (10, 20, 40 or 50 ms). The network then receives the error signal from time t minus the delay period. The shaded areas represent ± standard error of the mean over the 10 different random initializations.
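The delayed-feedback protocol described above can be sketched with a simple buffer: the error signal is -1 until the delay has elapsed, and afterwards the network receives the error from time t minus the delay. The class name `DelayedError` and the use of integration steps as the unit of delay are assumptions made for illustration.

```python
from collections import deque


class DelayedError:
    """Deliver the error signal `delay` steps late; emit -1 until it arrives."""

    def __init__(self, delay):
        self.delay = delay
        self.buffer = deque()

    def step(self, err):
        """Store the current error and return the one from `delay` steps ago."""
        self.buffer.append(err)
        if len(self.buffer) <= self.delay:
            return -1.0            # anti-Hebbian placeholder before feedback arrives
        return self.buffer.popleft()


line = DelayedError(delay=2)
received = [line.step(e) for e in [0.1, 0.2, 0.3, 0.4]]
print(received)                    # [-1.0, -1.0, 0.1, 0.2]
```

In a full simulation the returned value would replace $d\hat{s}/dt$ in the plasticity rule at each step.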

A.3. ROBUSTNESS TESTS IN NETWORKS WITH RANDOM FEEDBACK AND BINARY ERROR SIGNALS.
Figure A3: Partial plasticity and connectivity in networks with random feedback and binary error signals. (A) The average remembered value relative to initial for 10 random initializations of plastic random synapse networks with random feedback and read-out weights and binary error signals. Different lines are for different percentages of plasticity. (B) The average remembered value relative to initial for 10 random initializations of plastic random synapse networks with random feedback and read-out weights and binary error signals. Different lines are for different connection probabilities. Shaded areas in A and B represent ± standard error of the mean over the 10 different random initializations.
We demonstrated that with asymmetric and binary feedback, the networks can still learn to form memory representations (Fig. 8C, middle curve). We then tested whether networks with random and imprecise feedback are still robust to partial plasticity and partial connectivity. We followed the same protocol as 3.2 and 3.3 but in networks with asymmetric feedback, and found that networks were still able to store information even with just 10% plasticity or 10% connectivity, albeit not as well as in the case of highly precise feedback signals (Figs. A3A, A3B).
Figure A4: Stimulus retention in leaky integrate-and-fire (LIF) spiking self-organizing memory networks which enforce Dale's Law (neurons are either excitatory or inhibitory). (A) The average remembered value relative to the initial stimulus value over 10 random initializations for the plastic random synapse model (blue) and the constant random synapse model (red), where 1.0 indicates perfect memory. Shaded areas represent ± standard error of the mean over the 10 different random initializations. (B) Raster plot of spiking activity of the excitatory neurons which encode the remembered stimulus value over 50 ms of one random initialization.

To test whether our update rule can organize a spiking network, we implemented a leaky integrate-and-fire (LIF) model (Ledoux & Brunel, 2011). We enforced Dale's law with separate E and I populations composed of $N_E = 100$ and $N_I = 20$ LIF neurons. Each excitatory (E) neuron i was described by its internal activity (membrane potential) $a_i(t)$ which obeys: where $\tau_E$ is the membrane time constant of E neurons (20 ms), $a_{rest} = -70$ mV is the resting membrane potential, L the synaptic strength, and $m_{ij}(t)$ are individual synaptic currents modeled with an additional synaptic variable $x_{ij}$: where τ is the time constant, and the sum over k represents a sum over all spikes of pre-synaptic neuron j, which occur at times $t_k^j$. Similarly, each inhibitory (I) neuron i was described by its internal activity (membrane potential) $a_i(t)$ which obeys: where $\tau_I$ is the membrane time constant of I neurons (10 ms), and the rest of the variables are defined as above for Eq. 4. In this model, action potentials occurred when the voltage crossed $V_{thr} = -50$ mV, and at the time of the action potential the voltage was reset to $V_{reset} = -60$ mV, with no refractory period for simplicity (Ledoux & Brunel, 2011). To calculate the updates to the synapses, we took a smoothed rolling estimate of the firing rates $r_{E,I}(t)$ which evolved via: where $\delta_i(t)$ is a binary variable indicating whether neuron i spikes. To organize the network to store information, we updated the synapses of the excitatory network, $L_{EE}$, with the same derived update as Eq. 3 except the loss was calculated as: where $\hat{s}(t)$ decays exponentially to 0 without spikes $\delta_i(t)$. To evaluate the performance of our spiking working memory networks, we asked how well the networks could store information about a scalar stimulus value, as in 3.1. The networks were all-to-all connected, with autapses excluded (so the diagonal elements of $L_{ij}$ are all zero), and contained 100 excitatory neurons and 20 inhibitory neurons.
We evaluated how well random networks without plastic synapses store information, by initializing each network with random neural activities, $a(t=0)$, random read-out weights, $d$, and random connection weight matrices, $L$, and simulating the dynamical evolution of the representation. The initial stimulus value to be remembered is a weighted combination of the spikes from the excitatory population, $\hat{s}(t=0) = \sum_i \delta_i(t=0)\, d_i$. We found that LIF networks with plastic synapses updated via Eq. 3 are able to store information about the stimulus (Fig. A4A). The network learns to maintain information with periodic bursts of spikes (Fig. A4B).
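The threshold-and-reset mechanism of the spiking model can be illustrated with a single generic LIF neuron. The sketch uses the parameter values quoted above (resting potential −70 mV, threshold −50 mV, reset −60 mV, membrane time constant 20 ms, no refractory period). The membrane equation $\tau\, dV/dt = -(V - V_{rest}) + I$ with a constant drive $I$ is a textbook form assumed here. It omits the synaptic-current variables of the full model and is not the authors' network.

```python
def simulate_lif(drive, steps, dt=1.0, tau=20.0,
                 v_rest=-70.0, v_thr=-50.0, v_reset=-60.0):
    """Euler-integrate one LIF neuron under a constant drive (in mV).

    Assumed form: tau * dV/dt = -(V - v_rest) + drive, no refractory period.
    Returns (spikes, trace): a 0/1 spike indicator and the voltage per step.
    """
    v = v_rest
    spikes, trace = [], []
    for _ in range(steps):
        v += dt * (-(v - v_rest) + drive) / tau
        fired = v >= v_thr
        if fired:
            v = v_reset            # reset immediately after the spike
        spikes.append(1 if fired else 0)
        trace.append(v)
    return spikes, trace


quiet_spikes, _ = simulate_lif(drive=0.0, steps=100)
active_spikes, active_trace = simulate_lif(drive=30.0, steps=100)
print(sum(quiet_spikes), sum(active_spikes))
```

The binary `spikes` sequence plays the role of $\delta_i(t)$ in the text. A smoothed firing-rate estimate for the plasticity rule would be obtained by filtering it, with a filter whose form is not specified here.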