HfO2-based resistive switching memory devices for neuromorphic computing

HfO2-based resistive switching memory (RRAM) combines several outstanding properties, such as high scalability, fast switching speed, low power and compatibility with complementary metal-oxide-semiconductor technology, with possible high-density or three-dimensional integration. As a result, HfO2 RRAMs have attracted strong interest for applications in neuromorphic engineering, in particular for the development of artificial synapses in neural networks. This review provides an overview of the structure, the properties and the applications of HfO2-based RRAM in neuromorphic computing. Both widely investigated applications of nonvolatile devices and pioneering works about volatile devices are reviewed. The RRAM device is first introduced, describing the switching mechanisms associated with filamentary paths of HfO2 defects such as oxygen vacancies. The RRAM programming algorithms are described for high-precision multilevel operation, for analog weight update in synaptic applications and for exploiting the resistance dynamics of volatile devices. Finally, the neuromorphic applications are presented, illustrating artificial neural networks with supervised training and with multilevel, binary or stochastic weights. Spiking neural networks are then presented for applications ranging from unsupervised training to spatio-temporal recognition. From this overview, HfO2-based RRAM appears as a mature technology for a broad range of neuromorphic computing systems.


Introduction
A major challenge in neuromorphic engineering is the design and development of novel devices which mimic the behavior of biological elements of a neural network, such as spiking neurons and learning synapses [1][2][3]. In this regard, the class of resistive (or memristive) devices, such as the resistive switching random access memory (RRAM), has attracted a good deal of interest for its simple structure, low-power operation and easy integration with the complementary metal-oxide semiconductor (CMOS) process flow [4][5][6][7]. The ability to control the device conductance by electrical stimuli, similar to the neuronal spikes causing potentiation and depression of a biological synapse, has spurred the development of artificial synapses based on RRAM devices [8][9][10].
Neuromorphic research has focused on two main directions, namely (i) the development of artificial neural networks (ANNs) aiming at highly accurate recognition of image, video and audio data [11], and (ii) the engineering of spiking neural networks (SNNs) to closely mimic the adaptability and high energy efficiency of the human brain [12].
The good scaling behavior of RRAM devices in terms of both device area [13] and 3D integration [14] enables the implementation of the high density of synapses needed in both deep learning architectures and brain-inspired circuits with high connectivity between neurons and synapses. Just as biological synapses weight the communication among neurons, the resistance states of nonvolatile RRAMs can modulate the connection among artificial neurons. Furthermore, RRAM devices can play the role of both memory and
in an increase of the energy consumption per programming operation. Moreover, the use of the 1T1R configuration (especially using an integrated transistor during device fabrication) provides a better control of the current compliance during forming and set operations, due to the limitation of parasitic capacitance effects [32], thus allowing low-power operation with respect to the use of the 1R configuration only. In addition, the 1T1R configuration is beneficial for large array integration density since the transistor provides a selector limiting sneak path problems. Therefore, the use of the 1T1R configuration for HfO2 RRAM has led to the best device performance. Recently, nonvolatile HfO2 RRAMs have shown large endurance [33], retention up to 10 years [34,35], multilevel operation [26,36], excellent scalability down to 10 nm or less [13] demonstrated at single device level, integration in 1T1R large arrays and in combination with scaled CMOS technology nodes [26,35,37,38], and the possibility of integration in 3D arrays [7,39].
HfO2 RRAM devices have also been optimized through material engineering of the stack with the aim of improving electrical performance and, more recently, of achieving analog switching. The basic material stack of HfO2-based RRAM devices usually consists of a HfO2 layer as switching medium (with thickness in the 5-30 nm range), and bottom and top electrodes formed by metals or nitrides, such as Pt, Au, W, Ni, TiN, TaN. One of the most used optimized stacks includes a Ti (or Hf) scavenging layer between the HfO2 layer grown by atomic layer deposition (ALD) and the top TiN electrode [13,19,33,40,41]. As shown in figure 2 in the case of Ti, the metal layer is easily oxidized by oxygen exchange between Ti and HfO2, possibly also promoted by an additional post-deposition annealing. This phenomenon leads to the formation of a TiOx layer which serves as oxygen exchange layer during the following switching operation, as well as to the formation of a sub-stoichiometric HfOx layer close to the top electrode and an asymmetrical oxygen vacancy profile in the oxide layer. Various scavenging metal layers have been studied in combination with HfO2 grown by ALD. Ti or Hf scavenging layers and the resulting asymmetric oxygen vacancy profile in the oxide layer are beneficial in terms of reduction of the forming voltage [13,42,43], as shown in figure 2(c). Independently of the RRAM cell size (1 μm or 40 nm), the increase of the Hf cap thickness leads to a decrease of the forming voltage.
Moreover, the insertion of Ti and Hf interlayers is also beneficial for the subsequent switching properties. In particular, the use of TiN/Ti/HfO2/TiN or TiN/Hf/HfO2/TiN (top electrode/scavenging metal/oxide/bottom electrode) stacks leads to the best results in terms of device endurance and retention. For instance, Chen et al [33] reported an endurance of 10^10 pulses in TiN/10 nm Hf/5 nm HfO2/TiN 1T1R RRAM devices. The energetics of oxygen vacancy creation and migration can be modified by doping HfO2 with a trivalent metal [45,46]. For instance, Al doping has been reported to influence uniformity, retention and resistance fluctuations [34,47,48]. All these effects may help in implementing the high-precision multilevel storage useful for some neuromorphic computing schemes, as we will discuss in sections 3 and 4 [49][50][51].
Overall, many materials science solutions have been attempted, but a clear framework to optimize RRAM devices toward a specific application has not been devised yet. We can notice that, recently, the use of multilayers has empirically proven successful for improving the analog modulation of the conductance in several works [11,52,53,54]. In particular, some examples of successful realization of multilevel or analog RRAMs make use of multilayer stacks like Pt/HfOx/TiOx/HfOx/TiOx/TiN [52], Al/AlOx/HfO2/Ti/TiN [53], TiN/HfAlOx/TaOx/TiN [11] or TiN/TaOx/HfOx/TiN [54].

Volatile devices
Another class of RRAM devices, also named volatile memristors or diffusive memristors, is the one for which the retention of the LRS can span various time scales, from ultra-short times (ns) up to tens of ms or seconds [18,55]. These devices are based on cation migration [28] and the device stacks are usually based on one active electrode (Ag or Cu), a solid electrolyte as switching medium (HfO2, TiO2, TaOx, SiOx) and an inert electrode (Pt, Au, W, TiN, Pd, carbon) [18,56,57,58,59]. The use of symmetrical stacks like Ag/electrolyte/Ag [60,61] or Pt/Ag-doped material/Pt [62] has also been proposed. Even if HfO2 is not the only material of choice for volatile devices, HfO2-based volatile RRAM devices have been recently demonstrated by various groups and proposed for neuromorphic and computing applications [18,55,57,61,63].
Regarding the device operation, the cell is initially in the pristine HRS and the application of a voltage to the active electrode leads to the formation of a Ag (or Cu) conductive filament connecting the two electrodes (LRS), thanks to the injection of cations from the active electrode material into the solid electrolyte. The filament self-dissolves once the applied voltage falls below a hold value [64,65]. After the initial formation of a filament, which may be associated with a forming process, the volatile switching is achieved by applying a voltage above a threshold value (usually lower than the forming voltage) for filament formation. Figure 3(a) shows a transmission electron microscopy section image of a Pt/Ag nanodots (top electrode)/HfO2/Pt (bottom electrode) stack, while figure 3(b) reports the corresponding measured current-vs-voltage (IV) curves. Another example of volatile switching is reported in figure 3(c) for the Ag (top electrode)/HfO2/Pt (bottom electrode) device. The IV curves show an abrupt and volatile switching, and the final LRS value can be controlled by the imposed current compliance. Usually, symmetric stack structures (Ag/switching medium/Ag) show bi-directional volatile switching, while non-symmetric stack structures (Ag/switching medium/Pt) show uni-directional volatile switching. On the other hand, bi-directional switching has also been observed for non-symmetric structures [55,56,66] and ascribed to a residual Ag filament close to the inert electrode after the forming operation. An example of the latter case is indeed reported in figure 3(b). Finally, it is possible to observe both volatile and nonvolatile switching in Ag- or Cu-based RRAMs, especially if the device is operated at large compliance currents leading to the formation of a large and stable filament [57].

Programming schemes for nonvolatile and volatile RRAMs
The main usage of nonvolatile RRAMs in neuromorphic hardware is as synaptic weights. Weights can be stored in multiple conductance levels, described by non-overlapping distributions of values. In this case, we speak about multilevel operation. We refer to analog devices, instead, when their conductance can be modulated through a continuum of values without identifying levels and corresponding distinct distributions. The programming methodology of nonvolatile synaptic weights differs depending on whether the training is performed on-line or off-line. Indeed, when the training is performed off-line, weights have to be uploaded to the synaptic array as conductance values with high accuracy. In particular, for off-line training, it is possible to take advantage of program-verify schemes, especially for 1T1R structures. On the contrary, in the case of on-line learning, program-verify schemes are hard to implement. Furthermore, most of the training protocols and learning rules do not indicate absolute weight values (i.e. conductance values) but instead prescribe weight changes relative to the current value. Therefore, for on-line training, programming schemes devised to apply relative conductance changes instead of absolute conductance values are needed, and they will be referred to as weight or conductance updates in the following. Both multilevel and analog programming can be implemented as weight updates. Although multilevel or analog conductance modulations are generally considered the best options for highly accurate ANN or SNN implementations, neural networks with binary weights demonstrate great computing potential and ease of implementation, as discussed in detail in the next section. For on-line training, binary weights can also be programmed in a stochastic manner.
As far as volatile devices are concerned, the research about their implementation in neuromorphic systems is still in its infancy. In the following, some possible uses are described which take advantage of the short-term internal or resistance evolution of the devices to emulate dynamical features of neural networks or dynamical elements present in the human brain.

Multilevel programming
For ANNs with off-line training, accurate program-verify techniques are needed to store high-precision weights for running the network. In fact, the application of a single programming pulse typically results in a relatively large variation of conductance [9,68,69]. In addition, the conductance can also change after the programming pulse due to random telegraph noise (RTN) [34,47,70,71], 1/f noise [72] and random walk effects [73], all contributing to the broadening of the conductance distribution. Finally, the weights stored in the array might be affected by device-to-device variation, due to the physical differences in the device structures and geometries [74]. As a result, single-pulse programming operations are not suitable for synaptic weight storage in RRAM arrays. In general, 1R devices can also be programmed in a multilevel fashion, for instance by modulating the voltage applied either during the RESET or during the SET operation in the case of devices that do not require current limitation [10,75,76]. However, such a programming method results in a higher variability than programming through the current compliance provided by an integrated transistor. Therefore, the 1T1R configuration is usually employed for efficient multilevel programming, as discussed in the following. Figure 4 shows two program-verify techniques for RRAM devices with 1T1R structures [67]. In the incremental step program-verify algorithm (ISPVA, figure 4(a)), voltage pulses with incremental amplitude are applied at the top electrode terminal of the 1T1R structure, while the gate voltage is kept fixed to control the compliance current [77]. As a result, the device can be gradually set to the desired level, the latter being selected via the gate voltage.
On the other hand, the top electrode voltage is kept at a constant value in the incremental gate-voltage verify algorithm (IGVVA), while the gate voltage, hence the compliance current, is increased until the desired conductance level is reached [26]. Figure 4(c) shows the measured standard deviation σ_G of conductance as a function of the average conductance, G. The programming variation σ_G is significantly decreased by the IGVVA technique with a relatively small incremental voltage ΔV_G = 10 mV, namely the IGVVA10 technique, which results in a σ_G between 2 μS and 5 μS for LRS. Further mitigation of the cycle-to-cycle and device-to-device variations of conductance can be achieved by redundancy and bit slicing techniques in multiple arrays, with an overhead in terms of memory area and associated power consumption and latency [78].
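The IGVVA loop described above can be sketched in a few lines of Python. This is only an illustrative sketch: the device model `toy_set_conductance`, its 0.8 V turn-on, the 200 μS/V slope and the 2 μS programming noise are hypothetical placeholders, not measured HfO2 parameters. Only the 10 mV gate-voltage step is taken from the text.

```python
import random

def toy_set_conductance(v_gate, rng, sigma=2e-6):
    """Hypothetical 1T1R set model (not from the source): the mean conductance
    in siemens grows linearly with gate voltage above 0.8 V, plus Gaussian noise."""
    mean = max(0.0, (v_gate - 0.8) * 200e-6)
    return max(0.0, rng.gauss(mean, sigma))

def igvva(target, v_gate_start=0.8, dv_gate=0.01, v_gate_max=2.0, seed=0):
    """Incremental gate-voltage verify: the top electrode pulse is fixed,
    the gate voltage (compliance current) is raised in small steps,
    and the conductance is read (verified) after every pulse."""
    rng = random.Random(seed)
    v_gate = v_gate_start
    pulses = 0
    g = 0.0
    while v_gate <= v_gate_max:
        g = toy_set_conductance(v_gate, rng)  # set pulse at fixed top electrode voltage
        pulses += 1
        if g >= target:                       # verify step: stop once the target is reached
            break
        v_gate += dv_gate                     # otherwise raise the compliance and retry
    return g, v_gate, pulses

g_final, v_gate_final, n_pulses = igvva(target=50e-6)
```

An ISPVA loop would have the same structure, with the gate voltage held fixed and the top electrode pulse amplitude incremented instead.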
The previous programming algorithms refer to the set operation, which is conventionally controlled through a current compliance by a transistor. Recently, the use of the compliance current has also been investigated during the reset operation [79]. Usually, the reset operation is performed by applying a transistor gate voltage that is larger than those applied during the set, so that the transistor does not act as a current limiter and the voltage drop across the transistor is minimized. On the contrary, if a gate voltage smaller than that used for set is applied for reset, the voltage divider allows programming intermediate states, even though with limited reliability. Dalgaty et al [79] report this programming strategy for TiN/Ti/HfO2/TiN 1T1R devices, as shown in figure 5(a), and apply it to a Bayesian neural network as described in the next section.
The use of the transistor current limitation allows setting an absolute resistance value with some precision. To turn such a programming scheme into a weight update, i.e. applying relative resistance changes, or steps, dedicated protocols must be elaborated. Payvand et al [80] propose a programming algorithm and circuits to exploit current compliance control to set HfO2 1T1R devices in several resistance states (figure 5(b)). The working principle consists of a continuous reset of the device to the highest resistance state and a following set into the desired intermediate state. Furthermore, before applying the reset and set operations, the resistance of the device is read and the new current compliance value and transistor gate voltage are evaluated on the basis of the desired resistance change.

Analog weight update
The programming scheme described in this subsection is a rather unconventional one investigated in recent years. It consists of the stimulation of the devices through trains of identical weak pulses to get an analog weight update, sometimes called gradual programming. Figures 6(a), (b) and (d) report the analog weight update in the TiN/Ti/HfO2/TiN device stimulated by trains of negative (set) and positive (reset) pulses, respectively. In figures 6(a) and (b), we can observe that very short pulse widths at equal voltage produce no conductance change. On the contrary, a few long pulses are sufficient to drive a large conductance change.
In general, the conductance change of a RRAM device over time is fast or slow depending on the strength of the programming conditions. Strong (weak) programming conditions are achieved by a combination of relatively high (low) voltages and/or long (short) pulses. A train of weak pulses, each of which induces a slow switching change, is able to produce a gradual conductance transition useful for analog weight update operations. Although a clear understanding of the factors allowing the implementation of an analog weight update is still lacking, several works evidenced that interlayers at the metal/oxide interfaces and interface switching enable such an analog weight update [81,82]. In other works, the doping of HfO2 has been engineered to obtain analog conductance updates [50]. Materials engineering has been oriented toward two aspects of the conductance update. First of all, a linear evolution of conductance as a function of the number of pulses is highly desirable for a proper implementation of the training algorithms based on the back-propagation of the error. For instance, Woo et al [53] used an AlOx/HfO2 bilayer to demonstrate linear conductance dynamics. Obviously, the number of states that can be programmed is of primary importance for neuromorphic computing. However, it is not straightforward to define a value for the number of levels or states, especially when the conductance evolution as a function of the number of pulses is not linear (see figures 6(a) and (b)), i.e. when the possible resistance states are not evenly spaced. For this reason, a precise comparison among literature results is not straightforward. However, the best literature results may correspond to a few hundred effective resistance states, most of them reported for HfO2-based devices [54,83].
A mathematical definition of the effective number of states given a certain conductance dynamics has been proposed in [84], which can be useful for a quantitative assessment and device engineering.
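To make the issue of unevenly spaced states concrete, the following minimal sketch applies a train of identical pulses to a device whose conductance follows a soft-bound update rule, in which each step is proportional to the remaining distance to the maximum conductance. This rule is a common phenomenological assumption and is not the definition of effective states proposed in [84]; the parameter values (`g_min`, `g_max`, `alpha`) are purely illustrative.

```python
def soft_bound_potentiation(g, g_max, alpha):
    """One identical weak set pulse: the step shrinks as g approaches g_max
    (assumed soft-bound model, not taken from the source)."""
    return g + alpha * (g_max - g)

def pulse_train(n_pulses, g_min=10e-6, g_max=100e-6, alpha=0.1):
    """Apply n_pulses identical pulses starting from g_min; return all conductances."""
    g = g_min
    trace = [g]
    for _ in range(n_pulses):
        g = soft_bound_potentiation(g, g_max, alpha)
        trace.append(g)
    return trace

trace = pulse_train(30)
# Conductance change produced by each pulse: large at first, then saturating.
steps = [later - earlier for earlier, later in zip(trace, trace[1:])]
```

With such nonlinear dynamics the first pulses produce much larger steps than the last ones, so counting the number of pulses overestimates the number of usefully distinct states.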
Another aspect that requires further device improvement for analog conductance update regards the memory window, which is usually quite limited for HfO2 [28,81]. As a matter of fact, resistance windows of an order of magnitude have only been reported for either the set [82] or the reset operation [85], while the reverse operation always occurs abruptly. Another evidenced issue of analog weight update is what has been named switching noise [86] or stimulated telegraph noise [87]. Such noise is ubiquitous in all filamentary devices and it was studied with reference to HfO2-based devices [84]. Such noise is particularly relevant for reset and at high resistance values and results from a dynamical equilibrium between the processes of drift and diffusion of oxygen vacancies [84]. Although such an analog weight update is unconventional with respect to standard memory programming, it has already been demonstrated in TiN/Ti/HfO2/TiN devices wire-connected to 350 nm CMOS technology node neurons [69] and in a TiN/TaOx/HfAlyOx/TiN 1T1R 1 kbit array [11]. An example of repeated analog set and reset updates for the latter devices is reported in figure 6(c). As a matter of fact, the use of weak programming pulses is expected to have a beneficial effect on device endurance. As an example, figure 6(d) reports repeated analog set/reset update cycles up to some tens of thousands of pulses in TiN/Ti/HfO2/TiN devices.

Stochastic programming
In some devices or programming conditions, the switching events are so fast that weak programming conditions do not result in any gradual resistance modulation. Instead, the switching can be considered stochastic in the sense that only two distinct levels are obtained, and weak programming pulses produce the switching from one to the other and vice versa with some probability [28]. Stochastic programming has been employed with success in neural networks as an alternative to multilevel or analog weight storage [9,85,89]. As stated above, stochastic programming needs the use of weak programming pulses, as in the case of gradual programming. Yu et al [85] report a stochastic set operation for devices composed of a HfO2/TiO2 multilayer. Figure 7(a) reports various set experiments (different colors) in which the switching occurs after a different number of weak programming pulses. Figure 7(b) reports an endurance experiment using weak programming conditions for the set operation. Garbin et al [9] use the stochastic switching to obtain a gradual resistance change of the parallel combination of many HfO2-based 1T1R devices, as shown in the circuit scheme of figure 7(c). The resulting set and reset dynamics as a function of the number of pulses is reported in figure 7(d).
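The idea of obtaining a gradual conductance change from many binary stochastic devices connected in parallel can be illustrated with a short simulation. This is a hedged sketch only: the number of devices, the switching probability per pulse and the HRS/LRS conductances are hypothetical values chosen for illustration, and each device is modelled as an independent two-state element.

```python
import random

def parallel_stochastic_set(n_devices=20, p_switch=0.1, n_pulses=40,
                            g_hrs=1e-6, g_lrs=50e-6, seed=1):
    """Apply identical weak set pulses to n_devices binary cells in parallel.
    Each cell still in HRS switches to LRS with probability p_switch per pulse.
    Returns the total parallel conductance (S) after every pulse."""
    rng = random.Random(seed)
    in_lrs = [False] * n_devices          # all cells start in HRS
    trace = []
    for _ in range(n_pulses):
        for i in range(n_devices):
            if not in_lrs[i] and rng.random() < p_switch:
                in_lrs[i] = True          # abrupt, stochastic set event
        n_lrs = sum(in_lrs)
        # Conductances of parallel devices simply add up.
        trace.append(n_lrs * g_lrs + (n_devices - n_lrs) * g_hrs)
    return trace

trace = parallel_stochastic_set()
```

Although every cell switches abruptly, the summed conductance increases in small steps, which is the behaviour exploited for the synapse built from parallel devices.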

Programming and uses of volatile RRAMs
Volatile RRAM devices have been mainly explored as selectors for memory crossbars [60,61], as well as for hardware security [90]. Such applications will not be dealt with in the present review, which will only concentrate on the use of volatile devices enabling actual neuromorphic functionalities. In particular, the filament self-dissolution after a programming event from HRS to LRS can extend from μs to seconds. The relaxation is easily tracked by measuring the resistance of the devices at low applied voltages and opens the possibility to emulate short-term dynamical elements in neuromorphic chips. The longest reported relaxation times in literature for Ag or Cu-based volatile RRAM are in the range of tens of ms up to seconds [18,55,56]. The relaxation time can be controlled by pulse voltage amplitude (figure 8(b)) and reading voltage (figure 8(c)). In figure 8(a), programming pulses with different amplitudes, Vp, are applied to a Pt/HfO2/Ag device. The device switches to a LRS whose resistance is related to Vp. In general, the final current value is related to the programming voltage amplitude (figure 8(a)) or time [55,56]. After the end of the pulse, the low-current state of the device is read at 0.1 V to monitor the self-relaxation process. The device current is progressively restored to the initial value during a relaxation time which is longer for higher initial current values (larger applied Vp) (figure 8(b)). Conversely, figure 8(c) shows that the relaxation time, tR, is also controlled by the reading voltage [56]. The latter results are achieved in the Ag/SiOx/C RRAM stack. As a matter of fact, Chekol et al [91] and Covi et al [56] showed that the relaxation time can be controlled to some extent by changing the programming conditions, like pulse width/amplitude and current compliance in 1T1R structures, respectively.
The existence of relaxation dynamics enables a cumulative or integrative response to repeated pulses. In particular, when pulses are repeated with a period shorter than the typical relaxation time, the effect of each pulse adds to that of the previous pulses, thus leading to a LRS resistance lower than that achieved by a single pulse [57,62,92] and possibly to an extension of the relaxation time with continuous pulse stimulation. This effect is exploited to emulate the so-called paired-pulse facilitation observed in biological synapses [57,62,92] and the synaptic metaplasticity [93].
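The cumulative effect of pulses arriving faster than the relaxation can be sketched with a simple leaky-integration model. The exponential decay law, the 10 ms time constant and the unit increment per pulse are assumptions made for illustration; real devices may relax with a different functional form.

```python
import math

def pulse_response(n_pulses, period, tau=10e-3, increment=1.0):
    """Internal state (arbitrary units) right after each pulse.
    Between pulses the state decays as exp(-t / tau) (assumed relaxation law);
    every pulse adds the same increment."""
    state = 0.0
    peaks = []
    for _ in range(n_pulses):
        state = state * math.exp(-period / tau) + increment
        peaks.append(state)
    return peaks

fast = pulse_response(5, period=2e-3)    # period shorter than tau: responses accumulate
slow = pulse_response(5, period=100e-3)  # period much longer than tau: full relaxation

# Paired-pulse facilitation index: second response relative to the first.
ppf_fast = fast[1] / fast[0]
ppf_slow = slow[1] / slow[0]
```

In this toy model the facilitation index is well above 1 for the short period and essentially 1 for the long one, mirroring the dependence on pulse spacing described in the text.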

Computing schemes
Thanks to in situ data processing where data movement is virtually suppressed, in-memory computing (IMC) can accelerate a broad range of computing processes in both the digital and analog domains. These may include ANNs for deep learning [94] and SNNs, which aim at mimicking the human brain, particularly regarding its ability of learning and adaptation [12]. Both fully-connected networks [11,77,95] and convolutional neural networks (CNNs) [9,54,96] have been implemented with HfO2-based RRAMs. Synaptic weights are generally quantized to a certain number of levels [26,77], including the extreme case of binary weights (e.g. LRS and HRS) in binarized [97,98] and ternary neural networks (TNNs) [99]. In addition to ANNs, other types of networks have been considered, e.g., decision trees and random forests implemented in ternary content addressable memory (TCAM) arrays [100]. Various types of SNNs have been implemented with HfO2 RRAM devices, with the aim of supporting spike-based learning [101] and spatio-temporal recognition [63].
The following is a summary of the main demonstrations of IMC primitives with HfO 2 RRAM devices for machine learning and SNNs.

Acceleration of machine learning algorithms
A strong benefit of IMC derives from the one-step, parallel matrix vector multiplication (MVM) operation which provides the backbone of the fully-connected ANN in figure 9(a) [77]. Here, each neuron input contains the summation of the input signals multiplied by a synaptic weight $W_{ij}$, which is readily expressed by the MVM operation. Figure 9(b) shows the crosspoint array which can execute the computation of the MVM in one step: the application of a voltage $V_i$ at the ith row of the array results in a current $I_j = \sum_i W_{ij} V_i$, which is equivalent to the MVM operation. In particular, the synaptic weight $W_{ij}$ is implemented as the difference between two conductance values, namely

$$W_{ij} = G_{ij}^{+} - G_{ij}^{-} \qquad (1)$$

where the minus sign can be obtained by subtraction of the currents in the two adjacent columns in figure 9(b) [102]. Figure 9(c) shows the distributions of individual device currents measured at $V_{read}$ = 0.5 V in a 4 kb array of HfO2-based RRAM [77]. The conductance $G_{ij}$ of each device in the array was obtained by the ISPVA program-verify technique, as described in the previous section [77]. A total of 5 levels were programmed in the array, including the HRS (L1) and four LRS levels (L2 to L5) with increasing conductance. These quantized conductance levels were used in equation (1) to describe the synaptic weights calculated from the back-propagation algorithm, which is a typical supervised off-line training technique [94]. Figure 9(d) shows the confusion matrix of the implemented hardware two-layer fully-connected ANN, namely the probability of a certain output response by the network as a function of the class of the input pattern. While on average the network gives a correct response, the accuracy is only around 83% compared to a software-level accuracy of 92%.
This is due to two main limitations of the conductance matrix, namely (i) quantization of the weights with only 5 levels, and (ii) stochastic variations of conductance with significant spread around the ideal value in figure 9(c).
To improve the accuracy of the ANN, more advanced program-verify techniques can be adopted, such as the IGVVA with small incremental gate voltage, e.g., 10 mV [26]. The improved programming precision makes it possible to reduce the standard deviation of conductance, thus increasing the number of levels and reducing the quantization error. The improved precision of conductance, combined with quantization-aware algorithms for off-line training, allows for a substantial increase of recognition accuracy, approaching the equivalent software performance [67]. Alternatively, error correction codes have been developed that take advantage of encoding/decoding strategies with the support of a hardware encoder device matrix, which impacts the area and energy requirements [103].
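The differential-pair MVM and the effect of a 5-level conductance quantization can be sketched as follows. This is a minimal sketch under stated assumptions: the five conductance values in `LEVELS` are illustrative placeholders (the source reports five programmed levels but the values used here are not taken from it), and the linear weight-to-conductance mapping in `weight_to_pair` is one possible choice among several.

```python
# Hypothetical conductance levels in siemens: one HRS level (L1) and four LRS levels (L2-L5).
LEVELS = [5e-6, 50e-6, 100e-6, 150e-6, 200e-6]

def quantize(g, levels=LEVELS):
    """Return the programmable level closest to the requested conductance."""
    return min(levels, key=lambda level: abs(level - g))

def weight_to_pair(w, w_max, levels=LEVELS):
    """Map a signed weight to a (G_plus, G_minus) pair so that the weight is
    proportional to G_plus - G_minus (assumed linear mapping)."""
    g_min, g_max = levels[0], levels[-1]
    g_target = g_min + abs(w) / w_max * (g_max - g_min)
    g_on = quantize(g_target, levels)
    return (g_on, g_min) if w >= 0 else (g_min, g_on)

def crossbar_mvm(voltages, g_plus, g_minus):
    """Column currents I_j = sum_i (G_plus[i][j] - G_minus[i][j]) * V_i,
    obtained by subtracting the currents of the two paired columns."""
    n_cols = len(g_plus[0])
    currents = []
    for j in range(n_cols):
        i_plus = sum(v * row[j] for v, row in zip(voltages, g_plus))
        i_minus = sum(v * row[j] for v, row in zip(voltages, g_minus))
        currents.append(i_plus - i_minus)
    return currents

# Usage: map a small weight matrix and read it out at 0.5 V on both rows.
weights = [[0.5, -1.0], [-0.25, 0.75]]
pairs = [[weight_to_pair(w, 1.0) for w in row] for row in weights]
g_plus = [[pair[0] for pair in row] for row in pairs]
g_minus = [[pair[1] for pair in row] for row in pairs]
currents = crossbar_mvm([0.5, 0.5], g_plus, g_minus)
```

Because every target conductance is rounded to the nearest available level, the programmed weights differ from the trained ones, which is the quantization error discussed above. Device variability would add a further random spread on top of it.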
In addition to inference accelerators, in situ training was demonstrated in IMC hardware with 1T1R arrays of RRAM devices with a Ta/HfO2/Pt stack [95]. The on-line training was achieved by the stochastic gradient descent algorithm, where the synaptic update was executed directly on the device by adjusting the gate voltage, similar to the IGVVA approach [95]. Similar results were obtained for a TiN/TaOx/HfAlyOz/TiN stack by applying a train of equal pulses with constant gate and top electrode voltage [11]. With a similar RRAM stack, a fully-memristive hardware implementation of a CNN with 32 levels of conductance was demonstrated in [54]. Other CNN implementations with HfO2-based RRAM were reported in [9,96].
To reduce the complexity of precise RRAM programming for multiple-level operation, binarized neural networks (BNNs) [104] and TNNs [99] were developed with HfO2-based RRAM. In a BNN, both neuron states and synaptic weights have binary values, such as +1 and −1, which strongly simplifies the computation and hardware implementation. In fact, in contrast to analog MVM implementations, all products and summations are carried out in the digital domain, with the binary product being implemented by an XNOR operation, while summation is replaced by the POPCOUNT operation, which counts all output signals equal to one. A BNN was implemented in hardware with HfO2-based RRAM arranged in the 2T2R structure shown in figure 10(a). Here, the two RRAM devices are programmed in a complementary state (HRS/LRS or LRS/HRS) and the synaptic weight is represented by the difference between the two RRAM currents, which is sensed by a precharge sense amplifier (PCSA) [104]. The 2T2R structure allows for a better immunity to errors resulting from tails of the distributions of the LRS and HRS conductance. As shown in figure 10(b), these errors can be minimized by increasing the compliance current, which reduces the LRS tails, and increasing the reset voltage and pulse width, which reduces the HRS tails. Figure 10(c) shows that write errors in the 2T2R structure increase quadratically with the single-bit error, as a result of LRS and HRS errors occurring independently in the memory array [104].
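The replacement of multiply-and-accumulate by XNOR and POPCOUNT can be illustrated with a short sketch. The bit encoding (1 for +1, 0 for −1) and the thresholded activation in `bnn_neuron` are common conventions assumed here for illustration; the function names are hypothetical.

```python
def xnor_popcount(inputs, weights):
    """Count positions where input and weight bits agree (XNOR equal to 1).
    Bits are encoded as 1 for +1 and 0 for -1."""
    return sum(1 for x, w in zip(inputs, weights) if x == w)

def bnn_neuron(inputs, weights, threshold):
    """Binary neuron: fires (1) when the popcount reaches the threshold."""
    return 1 if xnor_popcount(inputs, weights) >= threshold else 0

def signed_dot(inputs, weights):
    """Reference dot product on the equivalent +1/-1 values."""
    def to_signed(bit):
        return 1 if bit else -1
    return sum(to_signed(x) * to_signed(w) for x, w in zip(inputs, weights))

x = [1, 0, 1, 1]
w = [1, 1, 0, 1]
count = xnor_popcount(x, w)
# The popcount and the signed dot product are related by dot = 2 * count - n.
assert signed_dot(x, w) == 2 * count - len(x)
```

The final assertion checks the identity that makes the two formulations equivalent: for n binary inputs, the +1/−1 dot product equals twice the popcount minus n.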
The BNN concept was further demonstrated for learning, by taking advantage of the gradual potentiation and depression of RRAM devices, although with relaxed requirements about the symmetry and linearity of weight update with respect to on-line training with the back-propagation algorithm [97]. The main advantages of the BNN are the resilience to conductance variation and the fully-digital approach within the computing hardware, where analog-digital converters are no longer needed. However, the full parallelism of the analog domain MVM cannot be simply achieved within a hardware BNN. Extension to TNN was also reported by using the same 2T2R synaptic architecture, which allows for a slight increase in recognition accuracy for the same network size [99]. Binary RRAMs were also adopted for TCAMs [105], which find extensive applications in storing synaptic tags for spike routing in multi-core SNNs [106] and decision trees for machine learning [107]. TCAMs are usually implemented by static random-access memories (SRAMs), which however require a relatively large silicon area due to the six-transistor structure of SRAMs. TCAMs with nonvolatile RRAMs can be reduced to a smaller 2T2R structure where the three states can be obtained by the configurations LRS/HRS (state 1), HRS/LRS (state 0) and HRS/HRS (state X, or 'do not care') [105].
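The three-state encoding of a 2T2R TCAM cell can be sketched in software as follows. The state encoding (LRS/HRS, HRS/LRS, HRS/HRS) follows the text, whereas the search convention in `cell_mismatch` (which device is probed for a given search bit, and a probed LRS device signalling a mismatch) is an assumption made for illustration and may differ from the actual circuit in [105].

```python
LRS, HRS = "LRS", "HRS"

# Encoding from the text: state 1 = LRS/HRS, state 0 = HRS/LRS, state X = HRS/HRS.
ENCODING = {"1": (LRS, HRS), "0": (HRS, LRS), "X": (HRS, HRS)}

def store_word(pattern):
    """Program a word such as '1X0' into a list of 2T2R cells."""
    return [ENCODING[symbol] for symbol in pattern]

def cell_mismatch(cell, search_bit):
    """Assumed search convention: searching '1' probes the second device and
    searching '0' probes the first; a probed device in LRS signals a mismatch."""
    first, second = cell
    probed = second if search_bit == "1" else first
    return probed == LRS

def tcam_match(stored_word, key):
    """A word matches when no cell reports a mismatch."""
    return all(not cell_mismatch(cell, bit) for cell, bit in zip(stored_word, key))

word = store_word("1X0")
matches = [key for key in ("110", "100", "111", "000") if tcam_match(word, key)]
```

With the HRS/HRS configuration neither probed device conducts, so an X cell never reports a mismatch, which is what makes it a 'do not care' state.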
A key limitation of RRAM devices for hardware ANN implementation is the limited precision due to the program/read variations [68,70]. Such variations can be turned into a precious feature in stochastic computing circuits, e.g., true random number generators [109,110], Bayesian neural networks [108] and Monte Carlo Markov chains [111]. In a Bayesian neural network ( figure 11(a)), synaptic parameters usually consist of random variables, which well match the random nature of RRAM conductance obtained without programverify algorithms [108]. Figure 11(a) shows the methodology for describing a given probability distribution of weights with stochastic RRAMs: the distribution can be approximated by a combination of a number of Gaussian distributions, each obtained by programming RRAM devices with a fixed pulsed amplitude and time, without verify. To represent a single probability distribution, a relatively large number of devices (e.g., 1024) would be required, as opposed to the single RRAM device (or two RRAM devices in the case of a differential synapse) required for describing a fixed, non-probabilistic weight in an ANN. The output neuron distribution can be obtained in figure 11(c) by sampling multiple output results from predefined sub-sets of the RRAM synapses [108]. Similarly, probabilistic Monte Carlo Markov chains were demonstrated by harnessing the stochastic distributions of programmed RRAM conductance, thus taking advantage of the cycle-to-cycle and device-to-device variations of RRAM [111].
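The representation of one weight distribution by a population of devices programmed without verify can be sketched as follows. This is an illustrative sketch only: the three Gaussian components, their fractions, means and spreads are hypothetical numbers, and each programming condition is assumed to yield a Gaussian conductance spread. Only the population size of 1024 devices is taken from the text.

```python
import random

def program_population(components, n_devices, rng):
    """Program groups of devices without verify. Each component is a tuple
    (fraction_of_devices, mean_conductance, sigma) standing for one fixed
    programming condition with an assumed Gaussian spread."""
    devices = []
    for fraction, mean, sigma in components:
        for _ in range(round(fraction * n_devices)):
            devices.append(max(0.0, rng.gauss(mean, sigma)))  # conductance cannot be negative
    return devices

def sample_weights(devices, n_samples, rng):
    """Draw weight samples by picking devices at random from the population."""
    return [rng.choice(devices) for _ in range(n_samples)]

rng = random.Random(0)
# Hypothetical mixture approximating a broad distribution centred at 50 uS.
components = [(0.25, 20e-6, 5e-6), (0.5, 50e-6, 5e-6), (0.25, 80e-6, 5e-6)]
devices = program_population(components, 1024, rng)
samples = sample_weights(devices, 8, rng)
population_mean = sum(devices) / len(devices)
```

Each forward pass of a Bayesian network would use a different draw such as `samples`, so that repeated inference yields a distribution of outputs rather than a single value.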
The reported examples show that, depending on the RRAM multilevel precision, various computing schemes can be implemented which target different applications. The requirement on RRAM precision and the consequent computation accuracy can be relaxed in favor of a reduction of system complexity and cost, or in favor of ultimate low-power operation when the computing system is used, for instance, as a tool for the pre-processing or filtering of sensor data in the so-called intelligent edge computing concept [112].

Spiking computing schemes
While ANNs show excellent performance in terms of image, speech and object classification, they also have several key weaknesses such as catastrophic forgetting and limited learning capability. To overcome these limits, SNNs directly mimic the information processing in the brain to gain a better performance in terms of learning, adaptation, real-time interaction with the environment and energy efficiency [12]. Both nonvolatile and volatile HfO2-based RRAMs have been explored to study and demonstrate SNN concepts.
Nonvolatile devices are mostly used, as in the case of ANNs, to store static synaptic weights as conductance values. In addition, SNN training is generally driven by local learning protocols rather than by the minimization of global error functions as in the case of ANNs. This fact renders the implementation of online training protocols easier for SNNs than for ANNs. Indeed, online learning is often investigated in SNNs. A large number of publications have dealt with the implementation of the so-called spike-timing dependent plasticity (STDP) learning protocol in a synaptic device that mediates the communication between a pre-synaptic and a post-synaptic neuron. In biological STDP, the delay time, Δt, between the pre-synaptic and post-synaptic spikes dictates the synaptic weight update [114]. In particular, long-term potentiation takes place when the pre-synaptic spike precedes the post-synaptic spike, while long-term depression takes place for the opposite spike sequence.
Analog STDP weight modulation qualitatively similar to the biological one has been reproduced by stimulating 1R devices at their two terminals with properly shaped overlapping pulses, as shown in figure 12(a). Given the triangular shaped pre-spikes, the resulting voltage drop on the device depends on the relative timing of pre- and post-spikes, and the obtained conductance change results in the typical asymmetric STDP shape reported in figure 12(b). The reported results refer to TiN/HfO2/Ti/TiN devices properly optimized to give analog operation in response to trains of pulses with increasing amplitude, as well as in response to trains of identical pulses [10]. Alternative shapes or even binary versions of STDP curves can be obtained by designing the shape of the programming pulses driving HfO2-based devices, as attested by several publications [52,85,88,115].
A temporal overlap scheme has also been proposed for the 1T1R synapse structure [113,116]. Figure 12(c) illustrates the concept of 1T1R synapse for STDP [113]. The pre-synaptic spike is applied to the gate of the select transistor, while the post-synaptic spike, also called feedback spike, is applied to the top electrode of the RRAM device. Under pre-synaptic stimulation, the synaptic current, which is proportional to the RRAM conductance, is injected from the transistor source to the post-synaptic neuron. Under post-synaptic stimulation, a set transition takes place across the RRAM device in case of a small positive delay Δt, where the pre-synaptic spike overlaps with the positive pulse of the post-synaptic spike (figure 12(d)). On the other hand, a reset transition takes place in the case of negative delay, where the pre-synaptic spike overlaps with the negative side of the post-synaptic spike. Figures 12(e) and (f) show the measured and calculated STDP characteristics, respectively, supporting the effects of potentiation and depression for positive and negative delay, respectively [113]. Improved STDP characteristics with exponentially decaying weight update as a function of increasing positive/negative delay can be obtained by properly shaping the post/pre-synaptic spikes and introducing a second select transistor in the two-transistor/one-resistor (2T1R) structure [117].
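An exponentially decaying STDP window of the kind mentioned above can be written compactly as a function of the delay Δt. The sketch below is a generic behavioral model and not the measured characteristic of any device in [113] or [117]: the amplitudes, the time constant and the choice of returning zero for Δt = 0 are all assumptions, and the delay is taken as Δt = t_post − t_pre so that positive values correspond to the pre-synaptic spike arriving first.

```python
import math


def stdp_delta_g(dt_ms, a_plus=1.0, a_minus=0.5, tau_ms=20.0):
    """Relative conductance change for a spike delay dt = t_post - t_pre (ms).

    a_plus, a_minus and tau_ms are hypothetical fitting parameters.
    """
    if dt_ms > 0:
        # Pre-synaptic spike precedes the post-synaptic spike: potentiation (set).
        return a_plus * math.exp(-dt_ms / tau_ms)
    if dt_ms < 0:
        # Post-synaptic spike precedes the pre-synaptic spike: depression (reset).
        return -a_minus * math.exp(dt_ms / tau_ms)
    return 0.0  # assumption: simultaneous spikes leave the weight unchanged


for dt in (-40, -10, 10, 40):
    print(f"dt = {dt:+d} ms -> delta G = {stdp_delta_g(dt):+.3f}")
```

Using different amplitudes for the two branches reproduces the asymmetric shape that is typical of STDP curves.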
In all these works, the STDP protocol is implemented by including the temporal information in long-lasting pulses, which could complicate the management of a high number of devices in an array. VLSI neuromorphic chips are able to implement various STDP variants by encoding the temporal information in the discharge of capacitors in synapse and neuron CMOS circuits [118,119]. On the one hand, this solution avoids the use of long overlapping pulses [119]; on the other hand, the feasible time constants are limited by the physical dimension of the capacitors [120]. The programming of RRAMs according to a generalized and biologically plausible version of STDP, akin to the Bienenstock-Cooper-Munro theory [121], was demonstrated by connecting a TiN/Ti/HfO2/TiN device in between two CMOS neurons realized in the 350 nm technology node. Further, the proposal of a six-transistor/one-resistor (6T1R) HfO2-based synapse in connection with the same CMOS neuron circuits was validated through system-level simulations against the hand-written classification task [83]. Supervised training protocols can also be implemented in SNNs. For instance, the so-called delta rule, which is a spike-based version of the gradient descent, has been validated by simulation based on experimental data obtained from Pt/HfO2/TiOx/Ti RRAM devices [122].
The investigation of memristive SNNs has gradually moved from system-level simulations to mixed hardware/software experiments, to fully hardware realizations. Fully-memristive neural networks with STDP synapses were implemented for the demonstration of unsupervised learning [101]. Figure 13(a) shows a sketch of the one-layer neural network that was used for the unsupervised learning of a 4 × 4 image pattern. The 1T1R synapses were stimulated by spiking signals, either containing the image pattern (figure 13(b)) or noisy, sparse images. The synapses were initialized with a random configuration, resulting in a stochastic spiking of the single output neuron, with a higher probability of firing under presentation of the input image pattern. As a result, the presentation of the image pattern preferentially causes the pre-post sequence, hence potentiation of the stimulated synapses, whereas random noise stochastically causes a post-pre sequence leading to depression. The stochastic STDP dynamics thus allows for sequential image learning, where each submitted image is learnt by the synaptic array, then replaced by the newly arrived pattern, as shown in figures 13(c) and (d) [101]. A full-hardware SNN with integrate-and-fire neurons and 1T1R RRAM-based synapses was integrated in the 130 nm CMOS node, demonstrating inference accuracy of 84% with binarized weights for a simplified MNIST dataset [123]. Similar SNNs were developed by adopting other types of RRAM materials, such as SiOx [124].
A different usage of nonvolatile devices is the one proposed by Dalgaty et al [79]. They used 1T1R HfO2-based RRAMs as programmable resistors in RC circuits for the tuning and diversification of synaptic and neuronal time constants. As a matter of fact, in fully CMOS analog spiking chips, neuronal and synaptic dynamics are implemented with the charge and discharge of capacitors in order to match the signal timescales. Capacitors occupy a large silicon footprint and are, therefore, shared among all neurons and synapses of a chip. Thus, the solution proposed by Dalgaty et al [79] expands the tunability and diversification possibilities of purely CMOS neuromorphic chips through nanoscaled nonvolatile programmable RRAM resistors.
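The role of a programmable resistor in setting a time constant follows from the first-order RC relation τ = RC. The sketch below only illustrates this textbook relation; the capacitance and the resistance levels are hypothetical values chosen for readability and are not taken from [79].

```python
import math


def time_constant_s(resistance_ohm, capacitance_f):
    """Time constant of a first-order RC circuit, tau = R * C."""
    return resistance_ohm * capacitance_f


def decay(v0, t_s, tau_s):
    """Voltage left on the capacitor after t_s seconds of free discharge."""
    return v0 * math.exp(-t_s / tau_s)


CAPACITANCE_F = 1e-12  # hypothetical shared capacitor of 1 pF

# Hypothetical programmed resistance levels of the RRAM device.
for r_ohm in (1e4, 1e5, 1e6):
    tau = time_constant_s(r_ohm, CAPACITANCE_F)
    print(f"R = {r_ohm:.0e} ohm -> tau = {tau:.1e} s")
```

With a fixed capacitor, each programmed resistance level yields a different time constant, which is the mechanism that allows the time constants to be diversified without adding capacitors.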
On the other hand, time constants and synaptic/neuronal dynamics can in principle be implemented by taking advantage of device physics [1,125]. From this standpoint, volatile RRAMs based on HfO2 can mimic short-term memory in the human brain, thus supporting various cognitive processes such as spatio-temporal sequence recognition. Although nonvolatile RRAMs based on HfO2 were also shown to learn and recognize spatio-temporal patterns [126], volatile RRAMs can directly mimic transient effects, such as the excitatory post-synaptic current (EPSC), thus serving as an ideal hardware parallel for short-term memory effects. Figure 14(a) illustrates the concept for in-memory sensing and processing capable of direction sensitivity similar to the human retina [63]. The direction sensitive (DS) ganglion cell in figure 14(a) serves in this role by collecting the EPSCs from excitatory and inhibitory synapses stimulated by the photoreceptors in the retina. Due to their spatial configuration, excitatory and inhibitory synapses are stimulated at different times by a moving light image. For instance, an image moving from left to right in figure 14(b) stimulates excitatory synapse (A) followed by inhibitory synapses (B). This can be replicated in hardware by the neural network in figure 14(c), where excitatory and inhibitory synapses are mimicked by volatile RRAM devices, each contributing a transient current for a limited retention time t_ret. The difference between excitatory and inhibitory currents yields EPSC, which consists of a positive current peak for the preferred sequence A-B (image moving from left to right, figure 14(d)) or a negative peak for the non-preferred sequence B-A (image moving from right to left, figure 14(e)). Figure 14

Challenges, solutions and perspectives
As for most emerging memory technologies, the development of HfO2-based RRAM is still facing several challenges, mostly originating from the non-ideal reliability and performance of the device. The most relevant limitations of HfO2-based RRAM, which are shared with all other filamentary RRAMs, are the rather high variability, the random fluctuation of resistance, the relatively narrow resistance window and the limited endurance. These issues affect all IMC applications, particularly those where the device is supposed to operate in a multilevel mode to maximize the density and performance. Two types of variability effects are present, namely device-to-device and cycle-to-cycle variation of the programmed states [77]. Stochastic variations arise inherently from the filamentary nature of the conduction path in the RRAM device, where the variation of the defect number and of the filament shape can result in a relatively large change of resistance. Both variation phenomena can be addressed by accurate program/verify techniques to finely tune the resistance to the desired multilevel state [26]. However, post-programming rediffusion of defects and RTN can affect the resistance even after the program/verify operation [73]. These fluctuation phenomena result in the broadening of the distribution, thus affecting the bit precision of RRAM conductance levels.
In addition to variations and fluctuations, HfOx RRAM shows an intrinsic limit of the resistance window, particularly for the HRS which generally shows a finite, non-zero value of conductance. The resulting leakage current can affect the IMC accuracy if not properly compensated. In general, a differential synapse, including two RRAM devices with opposite currents to represent the positive and negative components of the synaptic weight, is recommended to achieve a highly precise zero weight. This is shown in figure 15(a), illustrating a differential synapse where the two opposite currents are obtained by biasing the two RRAM devices with opposite voltage, so that the overall weight is given by W = G+ − G−, where G+ and G− are the conductance values of the two devices in the differential pair [127]. Given the distribution of programmed conductance in figure 15(b), obtained by the IGVVA10 algorithm, one can properly combine the LRS levels L1-L9 to realize the differential weight distribution in figure 15(c). Although the HRS distribution in figure 15(b) displays a non-negligible conductance of about 8 μS, the weight W0 in figure 15(c) displays an average zero value thanks to the subtraction of two L9 levels in the differential pairs.
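The benefit of the differential pair for the zero weight can be checked with a few lines of arithmetic based on W = G+ − G−. In the sketch below, the HRS conductance of about 8 μS is the value quoted above, while the high conductance level of 225 μS, the other conductance values and the read voltages are assumptions used only to make the example concrete.

```python
def differential_weight(g_plus_us, g_minus_us):
    """Weight of a differential synapse, W = G+ - G- (microsiemens)."""
    return g_plus_us - g_minus_us


def column_current_ua(voltages_v, g_plus_us, g_minus_us):
    """Net column current, sum of V_i * (G+_i - G-_i); volt times microsiemens gives microampere."""
    return sum(
        v * differential_weight(gp, gm)
        for v, gp, gm in zip(voltages_v, g_plus_us, g_minus_us)
    )


G_HRS_US = 8.0     # HRS conductance quoted in the text (about 8 uS)
G_HIGH_US = 225.0  # assumed conductance of the level used on both sides for W = 0

# A single device cannot go below the HRS leakage, so its 'zero' is about 8 uS.
print("single-device residual weight:", G_HRS_US)

# Two equal conductances cancel exactly in the differential pair.
print("differential zero weight:", differential_weight(G_HIGH_US, G_HIGH_US))  # 0.0

# Hypothetical two-row column: the first pair encodes zero, the second a positive weight.
print(column_current_ua([0.1, 0.1], [225.0, 100.0], [225.0, 50.0]))
```

The cancellation holds on average only: in a real pair each device carries its own programming spread, so the standard deviation of the zero weight is set by the variation of the two conductances.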
Note that a significant drawback of the differential approach in figure 15 is the presence of relatively large currents to achieve relatively low weight values, e.g., two conductances of about 225 μS are needed to achieve a zero-valued weight with relatively low standard deviation. More generally, HfO2-based RRAMs display rather large conductance, in the range of several tens of μS, as shown in figure 15(b). While relatively high conductance values are beneficial thanks to a relatively small variation and a higher robustness against high-T annealing [77], large currents can cause a significant voltage drop across the bitlines (usually referred to as IR drop), due to the summation of synaptic currents in the MVM operation and to the parasitic wire resistance of the metallic interconnections. The IR drop can be mitigated by several techniques including both hardware training techniques [128] and replication circuits for compensation [129]. From this standpoint, the adoption of low-current emerging memory technologies, such as electro-chemical random access memories (ECRAMs) [130,131], is the most efficient solution. Neuromorphic computing may require extensive programming of devices, e.g., for continuous learning and adaptation of synaptic weights or integrate-and-fire neuron applications. From this point of view, a potential concern is the limited endurance upon repeated set/reset cycles [132]. The endurance of HfO2 has been shown to be typically in the range of one million cycles, although the maximum number of cycles drops exponentially for increasing reset voltage [133]. For incremental set/reset operations, which might be needed for neuromorphic plasticity and integration, a larger endurance might be expected, although a comprehensive study of endurance for neuromorphic applications is not available yet.
Regarding the programming operation, another key concern is the forming operation, which is needed to initialize the device from the initial, pristine state of high resistance. Forming generally requires a relatively high voltage, which is a burden from the circuit point of view as it may require charge pumps or high-reliability select transistors that can sustain the applied voltage. To minimize this burden, the HfO2 layer is engineered to minimize the forming voltage by introducing a suitable concentration of defects [42]. This can be achieved by an oxygen exchange layer generated by redox exchange at the interface between the switching HfO2 layer and a scavenging layer of moderately reactive metal, such as Ti, Hf or Ta [19], as also discussed in section 2.1.
Hardware accelerators of network training require in-memory execution of the outer product, namely the vector-vector product that yields one update term for each element of the weight matrix in the RRAM array [134]. For this type of application, the device should display a high linearity of conductance vs number of pulses at fixed voltage, to enable potentiation or depression which is proportional to the pulse width at a given amplitude [102]. From this standpoint, the HfO2-based RRAM is not optimized to achieve high linearity of conductance update, due to saturation effects (figure 5). RRAMs with bilayer oxide stacks, such as HfO2/TaO2, have been recently reported to improve the linearity of conductance update [135]. Alternative memory technologies have shown a better linearity, usually combined with a lower operating current [136,137]. Table 1 shows a summary of the properties of RRAM compared to other nonvolatile memory technologies [138], including commercial flash [139] and emerging memories such as phase change memory (PCM) [140], spin-transfer torque magnetic random access memory (STT-MRAM) [141], spin-orbit torque magnetic random access memory (SOT-MRAM) [142], ferroelectric random access memory (FeRAM) [143], ferroelectric field-effect transistor (FeFET) [144] and Li-ion based ECRAM [136]. In general, RRAM displays good compatibility for integration in CMOS circuits, a simple fabrication process in the back end and a small cell size. However, challenges still exist in terms of operating current, programming speed and reliability, including variability, endurance and fluctuations. Further progress is possible by a suitable combination of material engineering, programming/read/training algorithms and circuit/architecture design.
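The outer-product weight update required by training accelerators can be stated in a few lines. The sketch below is a purely numerical illustration that assumes an ideal, perfectly linear device: the learning rate, the input vector and the error vector are hypothetical, and no saturation of the conductance update is modeled.

```python
def outer_product_update(weights, x, delta, learning_rate):
    """Return weights + learning_rate * outer(x, delta).

    weights: matrix stored as a list of rows, one row per input element.
    x: input (row) activations; delta: error terms, one per column.
    """
    return [
        [w + learning_rate * xi * dj for w, dj in zip(row, delta)]
        for row, xi in zip(weights, x)
    ]


# Hypothetical 2 x 3 weight matrix, initially zero.
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
x = [1.0, 2.0]            # hypothetical input vector
delta = [1.0, 0.0, -1.0]  # hypothetical error vector

updated = outer_product_update(weights, x, delta, learning_rate=0.5)
for row in updated:
    print(row)
```

In a crosspoint array the product x_i·δ_j would be realized by the overlap of row and column pulses at each cell, which is why a conductance change proportional to the applied pulses is needed for the update to be accurate.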
As for the use of volatile devices for implementing the dynamical components of neuromorphic networks, the investigation is still at an early stage and the problems and challenges have not yet been precisely identified. However, one can consider that the use of the relaxation dynamics of volatile devices in a real system is beneficial only if the decay times are longer than the time constants that can be alternatively achieved with reasonably small capacitors and charging/discharging currents. In CMOS neuromorphic chips, time constants of hundreds of milliseconds have been reported thanks to the use of very low subthreshold transistor currents and relatively large capacitors, used to emulate the dynamics of a row of many synaptic devices by exploiting the superposition principle [118]. Relaxation times of seconds are close to the maximum values reported for HfO2-based devices. Therefore, it is not clear whether the replacement of capacitors with volatile devices is a real advancement. Conversely, when emulating the dynamics of individual network units, such as a single neuron or synapse with its own dynamics, the superposition principle no longer holds and the use of volatile RRAMs for each network unit provides a scaling advantage compared to the use of capacitors. In particular, volatile devices can be used to implement biological properties of individual synapses, such as paired pulse facilitation and depression [57,62,145], metaplasticity [146] or the short-to-long term memory transition [145] when the same RRAM device shows both volatile and nonvolatile retention [92,93,147]. These functions, which cannot be efficiently implemented in conventional CMOS technology, have not yet been demonstrated either with large statistics or at array level. In general, however, a critical issue that can already be identified for volatile devices is their variability, whose impact may depend strongly on their specific use in a neuromorphic system. In some cases, many volatile devices are connected in parallel, instead of using a single one, in order to mitigate the effect of variability [63].

Conclusions
This review article presents the status of HfO2-based RRAM devices for neuromorphic computing. The key device properties are illustrated for RRAM devices, highlighting the role of the top electrode material in controlling the forming voltage and the volatile/nonvolatile memory behavior. The programming algorithms for nonvolatile RRAM are described, covering both high-precision multilevel cell programming for off-line training and analog weight update for on-line training. An overview of computing schemes is provided, covering ANNs with binary, multilevel and stochastic weights, as well as SNNs for unsupervised learning and spatio-temporal recognition. Finally, we discuss open challenges, solutions and perspectives of HfO2-based RRAMs for neuromorphic applications in comparison with other existing and emerging technologies. From this report, HfO2 appears as one of the most important RRAM materials for demonstrating and prototyping neuromorphic circuits.