Brain-inspired computing with resistive switching memory (RRAM): Devices, synapses and neural networks

The human brain can perform advanced computing tasks, such as learning, recognition, and cognition, with extremely low power consumption and a low frequency of neuronal spiking. This is attributed to the highly parallel and event-driven scheme of computation, where energy is used only when and where it is needed for processing the information. To mimic the human brain, the fundamental challenges are the replication of the time-dependent plasticity of synapses and the achievement of the high connectivity of biological neural networks, where the ratio between synapses and neurons is around 10^4. This combination of high computing capability and density scalability can be obtained with nanodevice technology, notably with resistive-switching memory (RRAM) devices. In this work, the recent advances in RRAM device technology for memory and synaptic applications are reviewed. First, RRAM devices with improved window and reliability, thanks to a SiOx dielectric layer, are discussed. Then, the application of RRAM in neuromorphic computing is addressed, presenting hybrid synapses capable of spike-timing dependent plasticity (STDP). Brain-inspired hardware featuring learning and recognition of input patterns is finally presented. © 2018 The Author. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Moore's law of transistor scaling is approaching its ultimate limit, mainly due to the excessive power consumption caused by both static and dynamic leakage processes [1]. The power consumption issue may in principle be attenuated by novel devices with lower supply voltage and steep subthreshold slope, including finFET, trigate FET [2], tunnel FET [3], negative capacitance FET [4], and alternative non-charge-based switch concepts [5]. On the other hand, novel computing architectures are proposed to solve the von Neumann bottleneck, where the physical separation between the data processing and the memory units in conventional computers poses increasing limitations of latency and power consumption, especially for data-centric computation [6]. The von Neumann bottleneck can be solved either by creating a 3D structure for co-integration of computing and memory elements [7], or by introducing whole new architecture concepts of in-memory computing, such as in-memory logic [8][9][10][11][12][13][14][15][16][17][18][19] and neuromorphic computing [20][21][22][23][24][25][26][27][28][29]. These research efforts generally require novel switches, such as resistive switching memory (RRAM) or phase change memory (PCM), which can serve as memory and computing elements at the same time. For instance, in-memory computing circuits take advantage of the PCM ability to add multiple applied pulses in its crystalline fraction, thus serving as a logic gate [9,14] or an algebraic counter [30,31].
Similarly, RRAM logic gates rely on the conditional switching of one or more output RRAMs, depending on the applied voltage amplitudes and the states of the input RRAM devices [8,12,15,16]. On the other hand, RRAM and PCM can be used as synaptic elements in neural networks, for their ability to tune the device resistance [32][33][34][35][36][37][38][39]. Most recently, neuron circuits based on RRAM [40] and PCM devices [41] have also been reported to allow for the area downscaling of the integrate-and-fire circuit. The use of RRAM devices as synaptic or neuron elements, however, requires optimization of certain properties, such as multilevel operation, high on/off ratio, linear change of the resistance upon set and reset, and good reliability. In addition, brain-inspired circuits require synaptic plasticity according to spike timing, which also poses critical challenges from the device and circuit viewpoints.
This work reviews the recent advances in the development of brain-inspired neuromorphic computing circuits based on RRAM. First, materials-engineered RRAM devices addressing the emerging needs of neuromorphic circuits are discussed. Then, the development of synaptic circuits capable of spike-timing dependent plasticity (STDP), similar to the human brain, is reviewed. Finally, spiking neural networks with plastic synapses capable of learning, recognition, signal reconstruction, and error correction are reviewed and discussed.

RRAM is attractive for neuromorphic applications thanks to the simple structure and scalable size of a single memory device within a crossbar array. At the same time, there are significant performance and reliability requirements that should be satisfied for RRAM implementation in neuromorphic circuits. One of the most studied applications of RRAM is the synaptic element in a deep learning network, consisting of a feedforward multilayer perceptron (MLP) where the learning process takes place via backpropagation [42]. Fig. 1 shows an example of an MLP with 4 neuron layers, including one input, one output, and 2 hidden layers, where the synaptic connections are made of RRAM devices. In an MLP, learning relies on the supervised backpropagation algorithm [23,26,27,42], where the error signal ε, namely the difference between the teacher signal and the real output, linearly controls the update of each synaptic weight in the network.
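To make the linear role of the error signal concrete, the output-layer update of backpropagation can be sketched for a single linear neuron as follows. This is an illustrative sketch with invented names (lr, x, w) and a toy learning rate, not code from the cited works:

```python
# Sketch of one backpropagation step for a single linear output neuron.
# The weight change is proportional to the error eps = teacher - output,
# which is why synaptic devices need a linearly tunable conductance.
# All names and values here are illustrative placeholders.

def output_layer_update(w, x, teacher, lr=0.1):
    """Return updated weights and the error signal for one training step."""
    y = sum(wi * xi for wi, xi in zip(w, x))          # neuron output
    eps = teacher - y                                 # error signal
    w = [wi + lr * eps * xi for wi, xi in zip(w, x)]  # update proportional to eps
    return w, eps

w = [0.0] * 4                  # four input synapses, initially depressed
x = [1.0, 0.0, 1.0, 0.0]       # one input pattern
for _ in range(50):
    w, eps = output_layer_update(w, x, teacher=1.0)
# the output drifts toward the teacher signal as eps shrinks
```

Repeating the step shrinks eps geometrically, so each weight receives progressively smaller updates; a device with a non-linear conductance response distorts exactly this proportionality.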

Linearity and symmetry of weight update
Fast and efficient learning requires that each synapse is potentiated or depressed by blind pulses, independent of the initial resistive state. This requires (i) a high linearity of the RRAM characteristics, where a fixed applied pulse results in a known potentiation/depression by an additive or multiplicative term, and (ii) a high symmetry of update, where similar update characteristics are obtained under positive and negative bias, for potentiation and depression, respectively. Symmetric characteristics are generally difficult to achieve with conventional metal-oxide RRAM devices, due to the different set and reset transitions. Fig. 2a shows a schematic illustration of an RRAM device with a TiN/HfO2/TiN stack, where an oxygen-deficient layer was formed at the bottom electrode to enable injection and migration of defects (oxygen vacancies and metallic impurities) during the forming and set transitions [43]. Fig. 2b shows the corresponding current-voltage (I-V) characteristics of the device, indicating an abrupt set transition and a gradual reset transition. While the reset transition can in principle be used to carefully tune the synaptic weight [44], the abrupt set transition does not allow for a voltage-controlled tuning of the resistance. Note that this is similar to the case of PCM synapses, where the set (crystallizing) transition is sufficiently gradual thanks to nucleation and growth mechanisms, whereas the reset (amorphizing) transition is abrupt. The strongly asymmetric potentiation/depression is generally solved by adopting a 2-resistor (2R) synapse, where the 2 devices play the roles of excitatory and inhibitory synapses to control the weight update in both directions [23,24].
While synaptic applications require a linear weight update, real RRAM devices generally show a non-linear characteristic, as shown in Fig. 3 [45], comparing three RRAM devices based on (a) a TaOx/TiO2 bilayer stack [46], (b) polycrystalline Pr1-xCaxMnO3 (PCMO) [47,48], and (c) amorphous Si with an Ag top electrode [32]. The figure shows the measured conductance of the RRAM device as a function of the number of potentiating/depressing pulses of equal amplitude. All RRAM technologies shown in the figure display large non-linearity in both potentiation and depression characteristics, with non-linearity coefficients ranging between 1 and 6, thus potentially resulting in inaccuracies of pattern recognition after learning for image and speech applications [23]. It has been shown that the adoption of voltage pulses with linearly increasing amplitude allows for an improvement of the linearity of the weight update, however at the expense of additional circuit complexity and a corresponding loss of speed [48]. Alternative one-transistor/2-resistor (1T2R) structures have been proposed to improve linearity, although resulting in a slightly larger area occupation in the synaptic array [49].
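The non-linearity of the weight update discussed above is often captured by a simple saturating-exponential model. The sketch below uses such a phenomenological form with an invented non-linearity parameter nu (nu -> 0 recovers the ideal linear synapse); it is not the exact fitting model of refs. [45-48]:

```python
import math

# Phenomenological conductance-vs-pulse-number model for potentiation under
# identical pulses: the update saturates exponentially, so early pulses
# change the weight far more than late ones. Parameter values are
# illustrative, not fitted to any device in the text.

def conductance_after_pulses(n, n_max=100, g_min=1e-6, g_max=1e-4, nu=3.0):
    """Conductance after n identical potentiating pulses (0 <= n <= n_max)."""
    if nu == 0:
        return g_min + (g_max - g_min) * n / n_max    # ideal linear synapse
    norm = 1.0 - math.exp(-nu)
    return g_min + (g_max - g_min) * (1.0 - math.exp(-nu * n / n_max)) / norm

# With nu = 3, the first pulse moves the weight roughly 20x more than the last:
step_first = conductance_after_pulses(1) - conductance_after_pulses(0)
step_last = conductance_after_pulses(100) - conductance_after_pulses(99)
```

Identical pulses thus produce strongly state-dependent updates, which is exactly the behavior that complicates blind-pulse learning schemes.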

Stability of conductance state
RRAM synapses must also display ideal characteristics from the viewpoint of reliability, including retention at high temperature, good endurance, and immunity to noise. The latter has been carefully studied in RRAM devices for memory applications, especially for the high resistance state (HRS), where the random network of conductive channels is more impacted by defect-related fluctuations [50]. The study of the HRS indicates a complex noise structure which includes both random telegraph noise (RTN) and random walk (RW), the latter consisting of abrupt changes of resistance with purely random distributions of amplitude and time tRW. This is shown in Fig. 4a, which compares the time evolution of the measured resistance for 3 individual cells in a memory array [50]. The figure shows that the resistance can randomly change from time to time, resulting in an overall increase, decrease, or stability of the cell resistance. A careful study of RW fluctuations indicates that RW has a time-dependent activity, where the probability g(t) of RW fluctuating within a given time step Δt decreases with time according to g(t) ~ t^-1 [50]. The time-dependent fluctuation of RW was attributed to the distributed energy barrier for structural relaxation of newly generated defects in the dielectric layer. On the other hand, RTN is generally attributed to individual unstable defects affecting the conductance of the localized path in the memory device [51,52]. The conductive path might consist of the conductive filament in the low resistance state (LRS), or the percolation path of Poole-Frenkel hopping in the HRS. The localized nature of RTN is evidenced in Fig. 4b, showing the resistance amplitude ΔR of RTN divided by the average resistance R, as a function of R [52]. Data are collected from several RRAM materials, including HfOx [52], NiOx [51], Cu-based RRAM [53], and Cu nanometallic bridges [54]. Despite the wide range of materials considered, a general trend can be seen in Fig. 4b, where ΔR/R increases linearly with the resistance, which can be attributed to a defect-induced depletion within the conductive path [51]. In the HRS, the path size becomes comparable to the defect itself, thus ΔR/R saturates at a value of the order of 1, which is also evidenced by the large RTN fluctuations in Fig. 4a.
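The decaying random-walk activity g(t) ~ t^-1 described above can be illustrated with a small Monte Carlo sketch. This is our own toy model with invented jump size and prefactor, not the analysis of [50]:

```python
import random

# Toy random-walk model of post-programming resistance instability: at each
# time step t the resistance takes an abrupt +/-10% jump with probability
# g0 / t, so fluctuation activity concentrates shortly after programming
# and dies out as defects relax. All parameters are illustrative.

def random_walk_trace(r0=1e5, steps=1000, g0=0.5, jump=0.1, seed=1):
    rng = random.Random(seed)
    r, trace = r0, [r0]
    for t in range(1, steps + 1):
        if rng.random() < g0 / t:                     # activity g(t) = g0 / t
            r *= 1.0 + rng.choice([-1, 1]) * jump     # abrupt resistance jump
        trace.append(r)
    return trace

trace = random_walk_trace()
# On average, jumps cluster in the early part of the trace, mimicking the
# distribution broadening observed shortly after programming.
```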
Data in Fig. 4a show that, once an RRAM device is programmed in the HRS, its resistance can fluctuate significantly, leading to a distribution broadening within the memory array. This is a clear problem for synaptic applications, since the synaptic weight should remain constant with time, e.g., to enable classification of patterns in the neural network of Fig. 1. To stabilize the synaptic state with time, the RRAM materials should be engineered to minimize fluctuations, or to increase the on/off ratio.

[Fig. 2 caption (partial): simulation results from an analytical model for RRAM are also shown. Reprinted with permission from [43]. Copyright (2014) IEEE. Fig. 3 caption (partial): RRAM devices based on TaOx/TiO2 [46] (a), PCMO [47,48] (b), and amorphous Si with an Ag electrode [32] (c). The characteristics show the conductance measured after application of a fixed-voltage pulse during either potentiation (weight increase, increasing pulse number) or depression (weight decrease, decreasing pulse number). Reprinted with permission from [45]. Copyright (2015) IEEE.]

RRAM devices with improved on-off ratio
A large on/off ratio of the memory states is generally shown by conductive bridge memory (CBRAM) [55][56][57][58][59], namely a particular type of RRAM where the resistance change is due to the migration of cations from one or both electrodes in the memory stack, rather than of ionized oxygen or the respective vacancies as usually considered for metal-oxide RRAM [60][61][62]. To enhance the switching speed at low voltage, metals with relatively high mobility, such as Ag or Cu, are used as electrode materials in CBRAM, resulting in very low operation voltages (a few hundred mV) and currents, even lower than 1 nA [57]. However, the retention time in Ag- and Cu-based RRAMs is generally degraded due to mobility-induced spontaneous disconnection of the conductive filament [63][64][65][66][67].
A better tradeoff between switching speed and volatile behavior can be obtained by engineering the metal in the active electrode, i.e., the one serving as a reservoir for cation migration in the switching process. This was recently shown for the Ti/SiOx/C RRAM stack shown in Fig. 5a [64]. Fig. 5b shows the measured I-V curves after forming for the Ti/SiOx RRAM, indicating set and reset transitions with a resistance window between HRS and LRS of 4 orders of magnitude. The switching process is attributed to Ti cation migration, as in CBRAM devices, although the lower mobility of Ti requires voltages larger than 1 V to induce the set and reset transitions. The migration of relatively low-mobility metals, such as Ti, Ta, and Hf, was postulated to contribute to the switching process in metal-oxide RRAM [61] and was also experimentally evidenced at the nanoscale [68]. Thanks to the relatively low mobility, the Ti-based conductive filament is much more stable at both room and elevated temperatures, resulting in a data retention of 1 h at 260 °C [69]. The extremely large band gap and good insulating properties of SiOx allow high resistance to be reached in the HRS, resulting in a large on-off ratio. The latter is also improved by the adoption of a highly inert C bottom electrode, which prevents dielectric breakdown even at the relatively large negative voltage, in the range of -5 V, which is needed to reach a deep HRS with high resistance. The stable bottom electrode material is also beneficial for cycling endurance, which is around 10^8 cycles for the Ti/SiOx RRAM [64]. It should be noted that the large on-off ratio, combined with the good stability of HRS and LRS, allows for larger read margins to counteract the defect-related fluctuations and noise of Fig. 4, thus contributing to stable resistive states in memory and synaptic applications.
The high resistance window in the Ti/SiOx device also enables multilevel operation, which is required for high-performance synaptic elements. Fig. 6a shows the measured I-V curves for Ti/SiOx RRAM devices at variable Vstop, namely the maximum negative voltage along the reset sweep. As |Vstop| increases, a larger resistance level of the HRS is obtained, as a result of the voltage-controlled gradual change of resistance in the reset process [43,63,70,71]. Note that the voltage Vset increases with |Vstop|, which can be attributed to the increasing gap length along the disconnected filament in the HRS [62]. Fig. 6b shows the corresponding HRS resistance as a function of |Vstop|, evidencing the good control of R in the HRS at increasing |Vstop|. The LRS is instead controlled by the compliance current IC during the set transition, which was constant (IC = 50 μA) in the figure. Multilevel operation can also be achieved by varying IC, which results in LRS levels with various resistances [69,72]. These results support the multilevel capability of Ti/SiOx RRAM devices, which can potentially be exploited for synaptic applications in neural networks for pattern learning and recognition functions.

Synapse and network circuits for brain-inspired computing
Developing brain-inspired computing systems does not only rely on device engineering, but necessarily requires a detailed design of circuit blocks displaying certain aspects of neuromorphic function, such as plasticity and learning. Each individual synapse, for instance, must serve as an electrical connection between 2 neurons, as well as change its weight according to specific brain-inspired learning rules. The latter might be significantly different from backpropagation or, more generally, gradient-descent algorithms, which provide the computational basis for deep learning [42]. On the other hand, synaptic plasticity in the brain has been shown to strongly depend on time. For instance, spike-timing dependent plasticity (STDP) is a weight update mechanism observed in the human brain, where the time delay between pre-synaptic and post-synaptic spikes dictates the magnitude and sign of the weight change, i.e., potentiation for the pre-synaptic spike preceding the post-synaptic spike, and depression for the post-synaptic spike preceding the pre-synaptic spike [73]. Other more complicated synaptic weight update rules also rely on time computation between a pair or a triplet of spikes [74][75][76]. These plasticity rules generally require the implementation of complex circuits involving CMOS devices [77][78][79] or nanoscale devices, e.g., PCM [34,38,80] or RRAM [32,33,35,37,39]. Although examples of time-sensitive synapses have been reported [65,81], plastic synapses based on PCM and RRAM generally include one or more transistors in a hybrid configuration, to allow for a controllable computation of time within the circuit block. Fig. 7a shows an example of a hybrid RRAM-CMOS synapse, featuring 2 transistors and one RRAM element in a 2T1R configuration [37]. The pre-synaptic neuron drives the gate of one transistor, called the communication transistor, and the top electrode of the RRAM device. Fig.
7b shows the voltage VCG applied to the gate of the communication transistor and the voltage VTE applied to the top electrode, both of which are applied by the pre-synaptic neuron in the spike event. The applied voltage spikes in the figure induce a spiking synaptic current proportional to the conductance of the RRAM, which thus serves as the storage element of the synaptic weight. The synaptic current flows through the synaptic circuit and is fed into the input terminal of the post-synaptic neuron, where integration and fire take place as shown in the schematic circuit of Fig. 7c. Note that the input node of the post-synaptic neuron is a virtual ground, thus ensuring a zero potential at the bottom electrode of the 2T1R synapse, and serving as a summing input for a virtually unlimited number of synaptic channels.
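The summation and integration described above can be captured by a minimal software model of the post-synaptic neuron. This is a sketch with invented threshold and conductance values, not the circuit of [37]:

```python
# Minimal integrate-and-fire model of the post-synaptic neuron: synaptic
# currents, each proportional to an RRAM conductance, sum at the
# virtual-ground input node and accumulate until a threshold triggers a
# fire event. Threshold and conductances below are illustrative.

class PostNeuron:
    def __init__(self, threshold):
        self.threshold = threshold
        self.integral = 0.0

    def step(self, spikes, conductances, v_spike=1.0):
        """One time step; spikes is a 0/1 list per input channel.

        The virtual ground keeps channels independent, so the total
        current is just the sum of the spiking channels' currents.
        """
        i_syn = sum(s * g * v_spike for s, g in zip(spikes, conductances))
        self.integral += i_syn
        if self.integral >= self.threshold:
            self.integral = 0.0          # reset after fire
            return True                  # fire: spike downstream + feedback
        return False

neuron = PostNeuron(threshold=3e-4)
g = [1e-4, 1e-4, 1e-5, 1e-5]             # synaptic weights (conductances, S)
fires = [neuron.step([1, 1, 0, 0], g) for _ in range(5)]
# with these values, two potentiated channels spiking together
# fire the neuron every other step
```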

Hybrid STDP synapses
[Fig. 7 caption: Hybrid RRAM-CMOS synapse with 2T1R configuration (a), voltage waveforms for VCG and VTE applied by the pre-synaptic neuron in the spike event (b), and overall circuit sketch including the synapse and the pre- and post-synaptic neurons [37]. The overlap between VCG and VTE pulses causes a negative current proportional to the synaptic weight, which is integrated by the post-synaptic neuron and eventually contributes to fire.]

As the integrated current exceeds a certain threshold, the post-synaptic neuron fires, sending a spike to the following neurons in the network, as well as applying a feedback spike to the fire gate. This is shown in Fig. 8, for the 2 cases of long-term potentiation (LTP) with spike delay Δt > 0 (a) and long-term depression (LTD) with spike delay Δt < 0 (b). When the pre-synaptic spike VTE precedes the post-synaptic spike VFG (Fig. 8a), the overlap between the positive pulse in VTE and VFG causes a set transition, hence potentiation, by an amount which is controlled by VFG [37]. Due to the shape of VFG, the compliance current IC at the overlap point decreases with Δt, which results in a decreasing LTP, in agreement with the biological STDP characteristics [73]. This is confirmed in Fig. 8c, showing that the measured change of conductance R0/R, where R0 and R are the RRAM resistance values before and after the application of pre- and post-synaptic spikes, decreases with Δt. The initial state was the HRS, i.e., minimum conductance, for the purpose of characterizing the potentiation characteristics. On the other hand, when the pre-synaptic spike VTE follows the post-synaptic spike VFG (Fig. 8b), the overlap between VTE and VFG causes a reset transition, hence depression, by an amount which is controlled by VTE, due to the voltage-controlled reset. As VTE decreases with time, LTD decreases for increasingly negative Δt, which is confirmed by the conductance change of Fig. 8d, measured with respect to an initial LRS. Fig.
9a shows the conductance change for positive and negative Δt, for various initial LRS states, which were programmed at variable IC to modulate R0. Data indicate time-dependent LTP and LTD for Δt > 0 and Δt < 0, respectively. Simulation results by an analytical model of RRAM [43] account for the experimental data, thus supporting a solid understanding of the STDP operation in the 2T1R synapse. Also, note that depression can take place at relatively large Δt > 0, as a result of the negative VTE overcoming the effects of the set transition caused by the positive VTE in Fig. 8a. A similar LTD for positive delay was also observed in some experiments on biological samples [74], which supports the bio-realistic behavior of the synapse. Fig. 9b shows the probability distribution in a color map for a conductance change R0/R in correspondence of a delay Δt, for random initial states of the synapses. The shape of the STDP characteristics evidences maximum probability for LTP at Δt > 0 and LTD at Δt < 0, although the exact amount of conductance change depends on the delay Δt and the initial resistance R0.
[Fig. 8 caption: Voltage waveforms for VTE (pre-synaptic spike) and VFG (post-synaptic spike) for Δt > 0 (a) and Δt < 0 (b), and conductance change R0/R measured after application of spike pairs as a function of Δt, starting from the HRS (c) and the LRS (d) [37]. The STDP characteristics of R0/R for LTP at Δt > 0 and LTD at Δt < 0 can be seen in (c) and (d), respectively. Fig. 9 caption: Conductance change R0/R as a function of Δt for initial LRS obtained at variable IC from 25 μA to 170 μA (a), and color map of the probability distribution for a conductance change R0/R at a given delay Δt, for random initial states (b) [37].]

The 2T1R structure of the synapse allows for a detailed control of both potentiation, via the current-controlled set process, and depression, via the voltage-controlled reset process, thus enabling analog STDP with a relatively simple synapse structure. Also, the 2 transistors make it possible to discriminate between the 2 functions of the synapse, namely the
transmission of spikes, during normal information processing in the neural network, and the synaptic plasticity, during the learning process. Similar 2T1R synapses were proposed for PCM devices with analog STDP characteristics [38]. To simplify the synaptic layout and reduce its circuit area, 1T1R synapses with RRAM devices were also proposed, although at the expense of digitized STDP characteristics, featuring only full potentiation or full depression [25,39].
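The resulting behaviour approximates the exponential STDP window known from biology [73]; a compact sketch of such a window is given below. The amplitudes and time constant are illustrative placeholders, not values fitted to the data of [37]:

```python
import math

# Exponential STDP window: potentiation (LTP) for dt > 0 (pre-synaptic spike
# precedes the post-synaptic one), depression (LTD) for dt < 0, both decaying
# with |dt|. Parameters a_plus, a_minus, and tau_ms are invented placeholders.

def stdp_weight_change(dt_ms, a_plus=1.0, a_minus=0.5, tau_ms=20.0):
    """Relative weight change for a spike delay dt = t_post - t_pre (ms)."""
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_ms)     # LTP, shrinks with delay
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_ms)    # LTD, shrinks with |delay|
    return 0.0
```

An analog synapse such as the 2T1R cell reproduces the continuous shape of this window, whereas the 1T1R digitalized version collapses it to a sign-only update.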

Learning with RRAM STDP synapses
Demonstrating STDP in individual synapses cannot conclusively provide a proof of learning, which instead requires experiments and simulations at the higher level of synaptic/neural networks. Fig. 10 shows a simple example of a feedforward neural network, called a perceptron [25,82,83]. The network consists of 2 layers, namely a pre-synaptic layer where the neural spikes are submitted to the synaptic channels, and a second layer with just a single post-synaptic neuron to integrate the current spikes and fire. The network is fully connected, namely, each pre-synaptic neuron has a connection to the post-synaptic neuron, the connection being the hybrid CMOS-RRAM synapse discussed above. At each fire event, the post-synaptic neuron sends a feedback spike to each synapse to enable LTP/LTD, depending on the delay Δt between pre- and post-synaptic spikes. As a result, submitted patterns tend to be learnt by the network, in that the synapses corresponding to the pattern channels are potentiated, whereas all other synapses, also referred to as the background synapses, tend to be depressed, thus enabling on-line learning of submitted patterns, e.g., images, sounds, or speech [37,39,80]. Fig. 11 shows an experimental demonstration of pattern learning in a hardware neural network, using hybrid 1T1R synapses [25,39]. The network consists of a circuit board hosting a 4 × 4 synaptic array connected to a microcontroller, which handles the spike submission by the pre-synaptic neurons and the integrate/fire operation by the post-synaptic neuron. The operation of the hardware network was in real time, i.e., spikes went through the synapses and gave rise to fire, eventually causing LTP/LTD of the synapses during the same experiment, without any interruption for interaction with a computer or any other supervisor machine. After initializing the synaptic weights in the LRS, pattern #1 shown in Fig. 11a was submitted for 300 epochs, followed by pattern #2 (Fig.
11b) presented for 300 epochs and pattern #3 (Fig. 11c) for 400 epochs. Every pattern was submitted several times for the duration of a spike (1 epoch), randomly alternated with noise images such as the one shown in Fig. 11d. Each epoch lasted 10 ms in the experiment. The submission sequence was fully random, with equal probabilities for the appearance of either noise or the pattern at each epoch.
After each submission, the synaptic weights were measured, thus making it possible to monitor the evolution of the synapses in real time. Fig. 11e shows the initial synaptic weights in a color map, and the final states after submitting pattern #1 (f), pattern #2 (g), and pattern #3 (h). Note that, in all cases, the final synaptic weights closely match the submitted pattern, demonstrating highly accurate and fast pattern learning. Also, the network is capable of updating the states of all synapses, either potentiating the pattern synapses or depressing the background synapses whenever needed. Potentiation is due to the submitted pattern inducing fire in the post-synaptic neuron, resulting in the coexistence of pre- and post-synaptic spikes with Δt > 0 in pattern synapses, thus causing LTP. Fire is then most likely followed by the presentation of noise, resulting in the coexistence of pre- and post-synaptic spikes with Δt < 0 in background synapses, thus causing LTD. To partially inhibit depression of pattern synapses, a refractory time of 1 epoch was introduced in all neuron channels, i.e., a pre-synaptic neuron cannot fire in a time step and also in the following one. Note that the depression of pattern synapses is still possible, e.g., when noise induces fire, followed by the submission of the pattern. To avoid massive instability of learning due to pattern depression, the noise density was limited to the activity of only a few channels, e.g., only 2 spikes over 16 channels in Fig. 11d. Fig. 11i shows the submitted pattern in the three phases of learning, while Fig. 11j shows the measured synaptic conductance as a function of time, showing the convergence of all pattern weights to the LRS, and the convergence of all background weights to the HRS. In general, the fastest learning process is potentiation, as pattern spikes are all presented simultaneously by the PRE layer.
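The learning dynamics described above (pattern synapses potentiated on fire, background synapses depressed) can be reproduced in a minimal software sketch. The threshold, weight steps, and 8-channel pattern below are invented for illustration and do not reproduce the 4 × 4 hardware of [25,39]:

```python
import random

# Toy STDP perceptron: at each epoch either the pattern or a sparse noise
# image is presented. When the post-synaptic neuron fires, spiking channels
# are potentiated (dt > 0) and silent channels are depressed (dt < 0).
# All parameter values are invented placeholders.

def train(pattern, epochs=300, noise_spikes=2, threshold=1.5, seed=0):
    rng = random.Random(seed)
    n = len(pattern)
    w = [0.5] * n                          # intermediate initial weights
    for _ in range(epochs):
        if rng.random() < 0.5:             # pattern and noise alternate randomly
            spikes = list(pattern)
        else:                              # sparse noise on a few channels
            spikes = [0] * n
            for i in rng.sample(range(n), noise_spikes):
                spikes[i] = 1
        if sum(s * wi for s, wi in zip(spikes, w)) > threshold:   # fire
            for i in range(n):
                if spikes[i]:
                    w[i] = min(1.0, w[i] + 0.1)    # dt > 0: LTP
                else:
                    w[i] = max(0.0, w[i] - 0.05)   # dt < 0: LTD
    return w

pattern = [1, 1, 1, 1, 0, 0, 0, 0]         # 8-channel toy "image"
w = train(pattern)                          # pattern weights -> high, rest -> 0
```

In this toy model the sparse noise never fires the neuron on its own, so background weights are only ever depressed, while pattern presentations repeatedly potentiate the pattern channels, mirroring the convergence of Fig. 11j.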
Also, note that the synaptic array relies on a binary coding of information, i.e., each synapse can be either in the LRS or the HRS, which makes the neural network sufficiently robust against the variability of LRS and HRS resistances. It was also shown that it is possible to increase the number of levels for gray-scale pattern learning, by adopting multilevel operation of the synapses with current-controlled potentiation in the 1T1R synapse [25]. Similar STDP-based perceptron networks have been presented [84][85][86][87][88], although only at the level of software, or with mixed software/hardware approaches. The concept of a feedforward network for pattern learning was also extended to multiple patterns and to dynamic (instead of static) patterns, where the ability to track a moving object by online learning was supported by hardware data [25]. More recently, the STDP concept was extended to spike-rate dependent plasticity (SRDP) [89] and to Hopfield-type recurrent networks [90], which provide an efficient capability of associative memory, thus serving as tools for signal restoration and error correction.

Conclusions
This work reviews the recent progress in the development of brain-inspired hardware with RRAM devices. First, the optimization of RRAM devices for neural networks is reviewed, covering the new materials and device stacks, and their respective performance for improved stability and accuracy of conductance tuning. Then, hybrid synapses combining CMOS and RRAM technologies are shown to display STDP, which is a fundamental algorithm for unsupervised learning. Finally, learning at the level of the neural network is shown with reference to full-hardware demonstrations of spiking networks with RRAM synapses. The main challenges for further development of CMOS/RRAM networks in the near future are the co-integration of these technologies and a better understanding of biological neural networks, to broaden the range of applications in the area of neuromorphic engineering with nanoelectronic devices.

Acknowledgments
The author would like to thank V. Milo and G. Pedretti for critical reading of the manuscript. This article has received funding from the European Research Council (ERC) under the European Union's Horizon