Spiking neural networks compensate for weight drift in organic neuromorphic device networks

Organic neuromorphic devices can accelerate neural networks and integrate with biological systems. Devices based on the biocompatible and conductive polymer PEDOT:PSS are fast, require little energy and perform well in crossbar simulations. However, parasitic electrochemical reactions lead to self-discharge and the fading of the learned conductance states over time. This limits a neural network’s operating time and requires complex compensation mechanisms. Spiking neural networks (SNNs) take inspiration from biology to implement local and always-on learning. We show that these SNNs can function on organic neuromorphic hardware and compensate for self-discharge by continuously relearning and reinforcing forgotten states. In this work, we use a high-resolution charge transport model to describe the behavior of organic neuromorphic devices and create a computationally efficient surrogate model. By integrating the surrogate model into a Brian 2 simulation, we can describe the behavior of SNNs on organic neuromorphic hardware. A biologically plausible two-layer network for recognizing 28×28 pixel MNIST images is trained and observed during self-discharge. The network achieves, for its size, competitive recognition results of up to 82.5%. Building the network with forgetful devices yields a superior training accuracy of 84.5%, compared to ideal devices. However, trained networks without active spike-timing-dependent plasticity quickly lose their predictive performance. We show that online learning can keep the performance at a steady level close to the initial accuracy, even for idle rates of up to 90%. This performance is maintained when the output neurons’ labels are not revalidated for up to 24 h. These findings reconfirm the potential of organic neuromorphic devices for brain-inspired computing. Their biocompatibility and the demonstrated adaptability to SNNs open the path towards close integration with multi-electrode arrays, drug-delivery devices, and other bio-interfacing systems as either fully organic or hybrid organic-inorganic systems.


Introduction
Organic neuromorphic devices are promising candidates to implement low-powered accelerators for AI-at-the-edge applications [1,2]. Devices with mixed ionic-electronic conductance allow close integration with electrolyte-based material systems, such as cell cultures, microfluidics, or the human body. Moreover, organic neuromorphic devices have also been used to facilitate learning and sensorimotor integration in robots [3].
These advances put organic neuromorphic device networks for sensing and information processing within reach. However, self-discharge in organic artificial synapses prevents them from maintaining stable conductance states over long periods. This effect can be partially mitigated by careful device design [16] but gets worse for small, scaled-down devices [7] that are required for many applications. Therefore, the weights of such an organic neural network degrade, and information and trained behavior are lost in as little as a few minutes [17]. This effect is hard to eliminate and requires careful algorithm-hardware co-design to keep the neural network operational over longer time periods. In a previous effort [17], a compensation mechanism was introduced to allow rate-based artificial neural networks (ANNs) to operate on forgetful organic neuromorphic devices for extended periods of time. However, an ideal system would reduce additional circuitry and complexity as much as possible, i.e. it would not require active compensation for device non-idealities and would rely on biosimilar mechanisms wherever possible.
This work shows that spiking neural networks (SNNs) can compensate for moderate device weight drift and profit from running on forgetful neuromorphic hardware. SNNs implement local learning through spike-timing-dependent plasticity (STDP) inspired by biological neurons. Connections between neurons that fire together strengthen over time, while uncorrelated neurons are disconnected. These learning rules can always be active and blur the line between training and inference. In spiking neuromorphic systems with organic artificial synapses, the learning rule can be directly implemented in hardware. In this way, we aim to implement automatic and constant reinforcement of forgotten weights.

Methods
Below, we apply a proven SNN architecture [18][19][20] shown in figure 1(A) to the task of MNIST image recognition and integrate simulated organic neuromorphic devices as plastic synaptic weights to understand the behavior of SNNs on forgetful hardware. The SNN is simulated with the Brian 2 simulator [21], while the self-discharge in the non-ideal plastic synapses is described by a surrogate model derived from detailed charge transport simulations.

Organic device model
Organic neuromorphic devices consist of a PEDOT:PSS channel, an electrolyte and a PEDOT:PSS gate electrode. To write conductance states, a potential between −0.5 V and 0.5 V is applied between the gate and drain electrodes. The resulting cation extraction or injection in the channel leads to additional doping or dedoping of PEDOT. At the same time, a redox shuttle between the channel and gate layers leads to a leakage current. When the gate is disconnected, i.e. the device is in open-circuit mode, a slow but steady self-discharge is observed. Figure 1(B) visualizes the device and the charge transport processes.
To describe the full device behavior, the device model combines modified Nernst-Planck-Poisson equations [22] for solute transport in the liquid phase, a drift-diffusion model in the electron-conductive PEDOT phase, and the capacitive coupling of both phases as proposed by Tybrandt et al [23]. Electrochemical self-discharge is described through a Butler-Volmer-type equation. The fully coupled system is solved with the custom simulation framework CP-EnPEn [24,25] based on PETSc [26] and is described in detail in [24]. The model is validated against experimental data that include cyclic voltammetry, device charging and self-discharge [24].
To guarantee a stable baseline, two devices are combined into a differential synapse [17,27] (figure 1(C)) with the weight w defined as

$$ w = \frac{G_+ - G_-}{G_{0,\mathrm{ref}}}. $$

Here, G_± are the device conductances, and G_0,ref is a reference conductance that scales the weight to ±1. Only the weight range from 0 to 1 is used for the SNNs. Simulations with many concurrently simulated synapses require a simpler model representation to achieve good simulation performance. Therefore, to be able to simulate larger networks, we built a surrogate model and subsequently validated it against the 'full model'. Because the system was observed to be reaction-limited rather than diffusion-limited [24], an equation with the form of a Butler-Volmer kinetic equation [28] is fitted to the device model,
$$ \frac{\mathrm{d}w}{\mathrm{d}t} = -b\left(e^{a w} - e^{-a w}\right), $$

where a = 8.3 and b = 2 × 10⁻⁷ are determined by fitting a simulated artificial synapse's behavior to the numerical solution of the differential equation. Figure 2(A) compares the surrogate model to the predictions of the full model for the first 20 min of self-discharge, and figure 2(B) shows the behavior over a time span of 10 h. Weights close to 1 have a very high rate of change, while for weights below 0.5, the electrochemical driving force and the weight drift are greatly reduced. This coincides with experimental observations [16,24].
To avoid costly, clock-driven differential equations in the Brian 2 implementation, this behavior is transformed into a function that maps the current weight value to the drift accumulated within 0.5 s. Each image is shown to the network for 0.35 s and is followed by a relaxation time of 0.15 s that allows the network to return to a steady state, giving a total of 0.5 s per image. The accumulated weight drift can then be applied through interpolation after each image is shown instead of continuously integrating the differential equation during the image presentation. Given the small changes in weight value over 0.5 s, this simplification does not affect the simulation results. The resulting function is drawn in figure 2(C).
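As an illustration of this interpolation step, the following minimal Python sketch precomputes the drift accumulated over one 0.5 s image cycle on a grid of weight values and applies it after each image. It assumes the Butler-Volmer-type surrogate form reconstructed above with the fitted constants a and b; the function names and the grid resolution are illustrative, not taken from the original implementation.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.interpolate import interp1d

# Assumed Butler-Volmer-type surrogate for self-discharge (see the
# reconstructed equation above): dw/dt = -b * (exp(a*w) - exp(-a*w)).
A, B = 8.3, 2e-7

def drift_rhs(t, w):
    return -B * (np.exp(A * w) - np.exp(-A * w))

# Pre-compute the weight change accumulated over one 0.5 s image cycle
# for a grid of initial weights, then build an interpolation table.
w_grid = np.linspace(0.0, 1.0, 201)
w_after = np.array([solve_ivp(drift_rhs, (0.0, 0.5), [w0]).y[0, -1]
                    for w0 in w_grid])
drift_per_image = interp1d(w_grid, w_after - w_grid)

def apply_drift(weights):
    """Apply 0.5 s of accumulated self-discharge to all plastic weights."""
    return np.clip(weights + drift_per_image(weights), 0.0, 1.0)
```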

Network architecture
The network implements a two-layer structure with an input and a processing layer, and closely follows the designs proposed by Diehl et al [18] and Querlioz et al [19]. Image intensity values are encoded as 784 Poisson generators with spike rates between 0-63.75 Hz in the input layer. The Poisson generators are connected with plastic connections one-to-all to 100 excitatory neurons in the second layer. The excitatory neurons are connected one-to-one with 100 inhibitory neurons by a fixed, strong synaptic weight so that the excitatory neuron's firing always triggers the inhibitory neuron. The inhibitory neurons then connect back to all other excitatory neurons (one-to-all-except-one) to suppress their firing going forward. This structure is drawn in figure 1(A).
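The following Brian 2 sketch outlines this two-layer structure. The neuron and synapse constants (reversal potentials, time constants, thresholds, and fixed synaptic strengths) are illustrative placeholders rather than the values used in this work; the adaptive threshold and the STDP synapses are added in the sketches below.

```python
from brian2 import PoissonGroup, NeuronGroup, Synapses, Hz, mV, ms

# Input layer: one Poisson generator per pixel; rates are set per image
# in proportion to pixel intensity (0-63.75 Hz).
n_input, n_exc = 784, 100
input_layer = PoissonGroup(n_input, rates=0 * Hz)

# Simplified conductance-based LIF equations (placeholder constants).
lif_eqs = '''
dv/dt = ((-65*mV - v) + g_e*(0*mV - v) + g_i*(-100*mV - v)) / (100*ms) : volt
dg_e/dt = -g_e / (1*ms) : 1
dg_i/dt = -g_i / (2*ms) : 1
'''
exc = NeuronGroup(n_exc, lif_eqs, threshold='v > -52*mV',
                  reset='v = -65*mV', method='euler')
inh = NeuronGroup(n_exc, lif_eqs, threshold='v > -40*mV',
                  reset='v = -45*mV', method='euler')
exc.v = -65 * mV
inh.v = -60 * mV

# Plastic one-to-all connections from input to excitatory neurons
# (replaced by the STDP synapses defined in the Learning section).
s_in_exc = Synapses(input_layer, exc, model='w : 1', on_pre='g_e_post += w')
s_in_exc.connect()

# Fixed, strong one-to-one excitatory -> inhibitory connections.
s_exc_inh = Synapses(exc, inh, on_pre='g_e_post += 10')
s_exc_inh.connect(j='i')

# Inhibitory neurons suppress all excitatory neurons except their partner.
s_inh_exc = Synapses(inh, exc, on_pre='g_i_post += 17')
s_inh_exc.connect(condition='i != j')
```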
Neurons are described with a leaky integrate-and-fire model [18] with the membrane potential V_m,

$$ \tau \frac{\mathrm{d}V_\mathrm{m}}{\mathrm{d}t} = (V_\mathrm{rest} - V_\mathrm{m}) - g_e V_\mathrm{m} + g_i (E_{\mathrm{L},i} - V_\mathrm{m}), $$

where E_{L,i} is the inhibitory synapses' reversal potential, g_e and g_i are the excitatory and inhibitory synapses' conductances, V_rest is the resting potential and τ is the neuron's time constant. The excitatory synapses' reversal potential is 0 mV and is, therefore, omitted. Synapses increase their conductance g instantaneously by the weight w when a presynaptic spike arrives at the synapse; the conductance then decays exponentially.
When the neuron's membrane potential crosses its threshold potential V_thresh + θ, the neuron fires and the membrane potential is reset to V_reset. While V_thresh is a constant, θ adapts the membrane threshold to obtain more equal firing rates across the different excitatory neurons. Its value increases every time the neuron fires and decays exponentially.
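A minimal way to express this adaptive threshold in Brian 2 is sketched below. It extends the excitatory neuron equations from the architecture sketch; in a complete script this definition would replace the simpler excitatory group before the synapses are created. The increment and decay constants are illustrative assumptions.

```python
# Adaptive threshold: each spike raises theta, which then decays very slowly,
# so highly active neurons become harder to excite (illustrative constants).
adaptive_eqs = lif_eqs + '''
dtheta/dt = -theta / (1e7*ms) : volt
'''
reset_exc = '''
v = -65*mV
theta += 0.05*mV
'''
exc = NeuronGroup(n_exc, adaptive_eqs,
                  threshold='v > (-52*mV + theta)',
                  reset=reset_exc, method='euler')
exc.v = -65 * mV
```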

Learning
While the weights between excitatory and inhibitory (1:1) and inhibitory and excitatory (1:all-except-one) neurons are fixed, the weights between input and excitatory neurons are learned through STDP. At each synapse, a trace of the pre-synaptic activity x_pre ∈ [0, 1] is recorded, and the weight w ∈ [0, 1] is updated by a local learning rule on each post-synaptic spike. Updating weights only on post-synaptic spikes enhances computational efficiency while still leading to competitive classification accuracy [18]. Here, a power-law-based learning rule is used,

$$ \Delta w = \eta\,(x_\mathrm{pre} - x_\mathrm{tar})\,(w_\mathrm{max} - w)^{\mu}, $$

where x_tar = 0.25 is the target value of the trace at the moment of a post-synaptic spike, w_max = 1 is the maximal weight value, η = 0.001 is the learning rate and μ = 0.2 is an exponent that controls the dependence of the update on the previous weight. x_pre is set to 1 on every pre-synaptic spike and decays exponentially with a time constant of 20 ms. During each update, w is clipped to [0, 1]. This leads to a penalty proportional to x_tar if the post-synaptic neuron's firing is not immediately caused by pre-synaptic spikes and, therefore, to a biomimetic disconnection of irrelevant synapses. A detailed plot of the learning rule's STDP behavior can be found in the supporting information and a schematic one in figure 1(D).

The SNN model assumes an ideal hardware implementation of the learning rule. The hardware can be represented by a capacitor for the presynaptic trace that discharges through a resistor. The target potential is then subtracted from the capacitor's potential, and the result is amplified and added to the organic synapse's open-circuit potential to obtain the programming potential. To protect the synapses and constrain the conductance range, the programming potential is clipped to ±0.5 V before it is applied to the synapse gate for a fixed time.
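In Brian 2, this post-synaptic-only power-law rule can be expressed as event-driven synaptic code, as in the sketch below. It replaces the simple plastic synapses from the architecture sketch; the update implements the reconstructed rule with the parameter values given above, while the variable names are illustrative.

```python
from brian2 import Synapses, ms

# Plastic input -> excitatory synapses with the power-law STDP rule:
# on each post-synaptic spike, w += eta * (x_pre - x_tar) * (w_max - w)**mu,
# clipped to [0, 1]. x_pre is set to 1 on pre-synaptic spikes and decays
# with a 20 ms time constant.
stdp_model = '''
w : 1
dx_pre/dt = -x_pre / (20*ms) : 1 (event-driven)
'''
stdp_on_pre = '''
g_e_post += w
x_pre = 1
'''
stdp_on_post = '''
w = clip(w + 0.001 * (x_pre - 0.25) * (1 - w)**0.2, 0, 1)
'''
s_in_exc = Synapses(input_layer, exc, model=stdp_model,
                    on_pre=stdp_on_pre, on_post=stdp_on_post)
s_in_exc.connect()
s_in_exc.w = 'rand()'   # random initial weights
```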

Training
The networks are trained on 40 × 10³ images from the MNIST data set (Keras [29] MNIST training set). Images are presented for 0.35 s. If fewer than seven spikes are produced during this time, the maximum input intensity is increased in steps of 63.75 Hz until seven spikes have been produced in total. After each image presentation, the network is rested for 0.15 s. During ideal training, devices with no self-discharge are considered, and no further steps are taken. During forgetful training, the accumulated weight drift is applied to the STDP synapses after each image to account for self-discharge in non-ideal devices.
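A training loop following this procedure could look like the sketch below. It reuses the network objects and the apply_drift function from the earlier sketches; the MNIST loading via Keras and the rate-boosting loop are illustrative assumptions about the implementation.

```python
import numpy as np
from brian2 import Network, SpikeMonitor, Hz, second
from tensorflow.keras.datasets import mnist

(train_images, train_labels), _ = mnist.load_data()

spike_mon = SpikeMonitor(exc)
net = Network(input_layer, exc, inh, s_in_exc, s_exc_inh, s_inh_exc, spike_mon)

def present_image(image, base_rate=63.75):
    """Show one image for 0.35 s plus 0.15 s rest; raise the input intensity
    in steps of 63.75 Hz until at least seven excitatory spikes occurred."""
    before = spike_mon.num_spikes
    rate_scale = base_rate
    while True:
        input_layer.rates = image.flatten() / 255.0 * rate_scale * Hz
        net.run(0.35 * second)
        input_layer.rates = 0 * Hz
        net.run(0.15 * second)              # relaxation between images
        if spike_mon.num_spikes - before >= 7:
            return
        rate_scale += base_rate             # increase intensity stepwise

for image in train_images[:40000]:
    present_image(image)
    # Forgetful training: apply the accumulated self-discharge for one
    # 0.5 s image cycle (apply_drift from the device-model sketch above).
    s_in_exc.w = apply_drift(np.asarray(s_in_exc.w))
```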

Inference and accuracy determination
Predictions are obtained from the network by counting spikes in the output neurons. After training is completed, labels (0-9) are assigned to each excitatory neuron by evaluating 1000 images from the MNIST training set [30], counting each neuron's spikes and assigning the digit with the highest overall activity.
During the evaluation, spikes at all excitatory neurons are counted and aggregated by their label. The label with the highest sum of spikes normalized to the number of neurons assigned to that label is taken as the prediction of the network.
To determine a network's accuracy at a single point in time, weights and adaptive thresholds are frozen for the entire label assignment and evaluation steps.
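The label assignment and prediction steps described above can be summarized in the following sketch, which operates on per-image, per-neuron spike counts recorded with frozen weights; the function names and array layout are illustrative.

```python
import numpy as np

def assign_labels(spike_counts, true_labels, n_classes=10):
    """Assign to each excitatory neuron the digit for which it spiked most.

    spike_counts: (n_images, n_exc) spike counts during label assignment.
    true_labels:  (n_images,) ground-truth digits of those images.
    """
    per_class = np.zeros((n_classes, spike_counts.shape[1]))
    for c in range(n_classes):
        per_class[c] = spike_counts[true_labels == c].sum(axis=0)
    return per_class.argmax(axis=0)            # one label per neuron

def predict(spike_counts, neuron_labels, n_classes=10):
    """Predict the digit whose assigned neurons spiked most, normalized
    to the number of neurons carrying that label."""
    sums = np.array([spike_counts[:, neuron_labels == c].sum(axis=1)
                     for c in range(n_classes)])      # (n_classes, n_images)
    counts = np.array([(neuron_labels == c).sum() for c in range(n_classes)])
    normalized = sums / np.maximum(counts, 1)[:, None]
    return normalized.argmax(axis=0)                  # (n_images,) predictions
```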

Training on forgetful devices
Two networks have been trained to assess the effect of weight drift during SNN training: one with ideal weights, the other with full weight drift. Figure 3(A) compares their receptive fields for the first 20 × 10³ training examples. In both cases, digits emerge from the initial static (yellow) at a similar speed, clearly showing the handwritten number patterns. In the ideal case, weights gravitate to either 0 (white) or 1 (black). In the forgetful case, the large weights are more differentiated and lie in the range between 0.7 and 1, rendering the number pattern dark red.
Both networks achieve an accuracy that is, for their size, comparable with literature values [18], as drawn in figure 3(B). The initial accuracy of 32% is not unexpected, as even with fully random weights, the network behaves similarly to a reservoir with a trained output layer. Interestingly, in the final state, the accuracy of the network with drifting weights even surpasses that of the ideal network by 2%, with 84.5% versus 82.5%. This could be caused by the higher mobility of each weight, which prevents its value from getting stuck at 1. Figures 3(C) and (D) compare weight traces for a random sample of 256 synapses from the ideal and forgetful networks. They confirm that weights in the ideal network quickly converge to either 0 or 1, while the forgetful network's weights fluctuate in value and cross between the high and low states much more frequently.
Thus, training on forgetful devices is not only possible but could potentially be advantageous.

Long-term stability
After forgetful training, the long-term stability of the resulting networks is evaluated. As a baseline, a network with drifting weights is evaluated periodically over 1 × 10⁵ s without STDP. Figure 4(A) shows that the network's receptive fields fade slowly, down to a maximum weight value of 0.15 at 1 × 10⁵ s. At the same time, the accuracy of the baseline network drops from 84.5% to 0%, much like a biological neuronal network loses its function when it receives no input for long time spans. A network with active STDP that sees one example every second, i.e. that is idle 50% of the time, shows only a small change in the average weight value from 0.153 to 0.166. Here, the idle rate denotes the fraction of time during which no images are presented, i.e. one minus the ratio of the image presentation frequency to the presentation frequency during training. At the same time, the accuracy fluctuates close to the initial value (84.5%), between 76.5% and 86%. Figure 4(B) shows that these values barely change even without label reassignment. The network remains stable for more than a day without signs of degradation.
Figures 4(C) and (D) compare the development of 256 random weights in the networks over time. For the baseline case with disabled STDP, a quick decline of weights over the entire range towards zero is observed. With active STDP, the weights fluctuate around the high state (0.6-0.95) and the low state (0-0.2). While a small number of weight values cross between the two states, they seem to belong to output neurons with a low impact on accuracy. Finally, figure 5 shows the effect of different idle rates on network accuracy. The accuracy range observed for 50% remains almost unchanged up to idle rates of 90%. Only for higher idle rates are significant drops in accuracy observed, and even at 99%, the network's accuracy rises back to 61% after temporarily dropping to 41%.

Discussion
Neural networks on organic neuromorphic hardware face significant challenges from self-discharge and the resulting weight drift. This work shows that the dynamics of SNNs and of drifting organic devices complement and compensate for each other, resulting in stable networks with improved training results. The final training accuracy of the studied network increased from 82.5% to 84.5% when the synapse model was switched from ideal devices to forgetful hardware. This behavior can be attributed to the increased weight mobility, which prevents the weights from getting stuck at 1, and a comparable effect has been observed for phase-change material devices [31] with a similar SNN architecture.
More importantly, long-term stability simulations over 24 h show that SNNs stabilize on drifting neuromorphic hardware. While fluctuations in the prediction accuracy are observed, their magnitude is small and in a range similar to that seen during continued training on both ideal and forgetful hardware: increases of 1.5% occur, as well as temporary drops of 8%. The studied network with 100 excitatory neurons was chosen as the minimal viable example for simulation performance reasons. Networks with up to 6400 excitatory neurons have been shown to reach a much higher training accuracy of up to 95% [18]. Therefore, based on the results presented, we are confident that competitive SNNs on forgetful neuromorphic hardware are possible.
The networks are stable for significant idle rates of up to 90%. Additional compensation mechanisms should be considered if higher idle rates are expected for long periods. One such mechanism [17] is based on reminder pulses that are computed from each device's current state and the duration of self-discharge using a device model. They were shown to restore a rate-based ANN's state and accuracy. These pulses can also stabilize an SNN's weight values at the cost of additional write time, added complexity, and disabled STDP. Since STDP is sufficient to stabilize SNNs up to a 90% idle rate without additional cost, pure STDP should be chosen below that level, while reminder pulses are advantageous for higher idle rates.
Besides self-discharge, device mismatch introduced during fabrication and permanent or temporary changes in device behavior introduced during operation can reduce an organic neural network's performance. These include changes in temperature or device deformation in the case of wearable and stretchable arrays. The presented model can be easily adapted to include such disturbances and support the implementation of future wearable devices.
We conclude that SNNs perform favorably on forgetful organic neuromorphic hardware even with high idle rates. This opens the path towards building organic neuromorphic device networks that integrate with organic, event-driven sensor platforms and even actuators for ultra-low-powered bio-interfacing devices. In particular, biomonitoring applications that harness the synergies between forgetful devices and spiking networks should be the focus of future research.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://git.rwth-aachen.de/avt.cvt/public/organic-snn. Data will be available from 6 September 2023.