Two- and three-terminal HfO2-based multilevel resistive memories for neuromorphic analog synaptic elements

Synaptic elements based on memory devices play an important role in boosting neuromorphic system performance. Here, we show two types of fab-friendly HfO2-based resistive memories, categorized by configuration and operating principle, as analog synaptic devices suited to inference and training of neural networks, respectively. Since, from a recognition accuracy perspective, the inference task is mainly related to the number of states, we first demonstrate the multilevel cell (MLC) properties of compact two-terminal resistive random-access memory (RRAM). The resistance state can be finely subdivided into an MLC by precisely controlling the evolution of the conductive filament formed by the local movement of oxygen vacancies. Specifically, we investigate how the thickness of the HfO2 switching layer is related to the MLC, which is understood from a microscopic view by performing physics-based modeling in MATLAB. Meanwhile, synaptic devices driven by an interfacial switching mechanism instead of local filamentary dynamics are preferred for training-accelerated neuromorphic systems, where the analog transition between states ensures high accuracy. Thus, we introduce three-terminal electrochemical random-access memory, in which mobile ions move uniformly across the entire HfO2 switching area, resulting in a highly controllable and gradually tuned current proportional to the amount of migrated ions.


Introduction
As data in various formats are generated ever more rapidly worldwide, modern computing systems based on the von Neumann architecture consume increasing amounts of power to handle the data explosion. Neuromorphic architectures inspired by the structure of the biological brain have been introduced to enhance computing performance [1]. Artificial neural networks, which emulate neurons massively interconnected by numerous synapses, process data in a fully parallel fashion, significantly improving latency and power efficiency. To implement this architecture on semiconductor-based hardware chips, synapses and neurons, the fundamental building blocks for storing and computing data, must be realized with electronic devices [2,3]. The role of the neurons is to collect the signals transferred from the connected synapses, and silicon transistor-based peripheral circuits have been built for this purpose [4]. For the electronic synapse, static random-access memory (SRAM) has been used, in which the unit transistors, which can be aggressively scaled down to 2 nm [5], are cross-coupled [6].
Despite the maturity of silicon device technology, the power consumption of SRAM-based neuromorphic chips remained considerable because power had to be supplied continuously to the synaptic elements owing to the inherently volatile nature of SRAM. Besides, the small memory capacity of SRAM (i.e., tens of MB) is a challenge, as higher synaptic density is required to run more complex algorithms for extended or general-purpose applications beyond handwritten digit recognition. These drawbacks in power and density highlighted the importance of exploring and developing alternative synaptic components using resistive random-access memory (RRAM) devices, typically demonstrated in simple two-terminal structures. Nonvolatile RRAM scaled down to approximately 10 nm allows dense synaptic arrays in a crossbar architecture [7], reducing the occupied area. Recent studies of prototype RRAM chips [8,9] and a systematic end-to-end simulation analysis [10] found that the performance-power efficiency, expressed in tera operations per second per watt, could be improved dramatically. In addition to the structural advantages of the compact RRAM cell, the multilevel cell (MLC) characteristic paves the way for versatile classification beyond digital binary storage of '0' and '1'. In the inference phase, detected patterns converted into input signals are fed into the word lines of the synaptic RRAM array in parallel. Multiplication of each input signal by the stored data produces an output in the form of a current. Unknown patterns can thus be inferred by comparing the computed output currents from each bit line. When multiple states are assigned to the synaptic devices in the array, the number of possible combinations of output currents increases exponentially, resulting in high recognition accuracy.
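The inference step described above (inputs on word lines, stored states as conductances, summed currents on bit lines) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the 3 x 2 array, the four conductance levels, and the input voltages are all hypothetical values chosen for the example.

```python
def bitline_currents(conductances, input_voltages):
    """Return one summed current per bit line (column).

    conductances[i][j] is the conductance (S) of the cell joining
    word line i to bit line j; input_voltages[i] is the voltage (V)
    applied to word line i.
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    if len(input_voltages) != n_rows:
        raise ValueError("one input voltage is needed per word line")
    currents = [0.0] * n_cols
    for i in range(n_rows):
        for j in range(n_cols):
            # Ohm's law per cell, Kirchhoff's current law per bit line
            currents[j] += input_voltages[i] * conductances[i][j]
    return currents


def infer_class(conductances, input_voltages):
    """Pick the index of the bit line with the largest output current."""
    currents = bitline_currents(conductances, input_voltages)
    return max(range(len(currents)), key=lambda j: currents[j])


# Hypothetical 3 x 2 array with four conductance levels (2 bits per cell)
LEVELS = [1e-6, 2e-6, 3e-6, 4e-6]
G = [
    [LEVELS[3], LEVELS[0]],
    [LEVELS[2], LEVELS[1]],
    [LEVELS[0], LEVELS[3]],
]
V_IN = [0.1, 0.1, 0.0]  # read voltages on the three word lines
```

Calling `infer_class(G, V_IN)` compares the two bit-line currents and returns the column index of the larger one. With more levels per cell, the set of distinguishable column currents grows, which is the motivation for the MLC.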
Notably, neuromorphic systems not only perform robust classification iteratively based on a predefined dataset but also enable learning. In this training phase, the reversible shift between the acquired states plays a more crucial role than the number of states achieved [11]. That is, when an incorrect output occurs, the system adjusts the states up or down toward the expected output based on back-propagation learning algorithms. For this update process, an identical pulse train technique, in which pulses of fixed width and amplitude are applied repeatedly, is preferred to reduce the design complexity and burden of the driving circuit. Thus far, most RRAM devices have shown MLC behavior, but their nonuniform and abrupt updates remain a challenge, although various engineering methodologies have been attempted [12,13]. This challenge appears to have been overcome by employing electrochemical random-access memory (ECRAM), which is solely dedicated to the analog synaptic device, at the expense of a larger footprint [14]. The three-terminal ECRAM structure resembles a conventional silicon transistor, but the gate voltage-driven movement of mobile ions from the solid electrolyte into the channel region makes the device nonvolatile. Moreover, the ions move progressively into and out of the channel layer depending on the gate voltage polarity, resulting in a wide range of channel currents. Therefore, we demonstrated two- and three-terminal resistive memory devices as synaptic elements suitable for neuromorphic computing tasks. Fab-friendly HfO2 served as the switching layer for both devices, considering complementary metal-oxide semiconductor (CMOS) compatibility. In the first part of the discussion, we addressed the MLC characteristics of the two-terminal HfO2-based RRAM devices enabled by a locally formed conductive filament for inference-accelerated systems, as shown in figure 1.
Next, we demonstrated three-terminal HfO2-based ECRAM devices actuated by Cu ion migration through the entire area, which is favorable for achieving the analog-tuned multiple states required by training-accelerated systems.

Methods
Both RRAM and ECRAM were fabricated with fully CMOS-compatible materials and processes. For the RRAM, a 6 nm-thick HfO2 layer was deposited by atomic layer deposition on the TiN bottom electrode. A 15 nm-thick Ti metal electrode was then used as a scavenging layer to generate oxygen vacancies by absorbing oxygen ions from the HfO2 due to the lower Gibbs free energy, resulting in a Ti/HfO2/TiN RRAM stack (from top to bottom) [12]. In this simple and compact two-terminal structure, the top electrode was connected to the voltage source and the bottom electrode to ground for measurement. For the three-terminal ECRAM, ion migration was induced by introducing an additional gate. A WOx channel with a width and length of about μm each was first formed on the SiO2 substrate. Two W metal pads for the source and drain were deposited at the edges of the WOx channel to read the channel current. Then, HfO2 and Cu were deposited in sequence by sputtering as the electrolyte and the gate electrode that supplies mobile ions, respectively. Unlike in the two-terminal structure, voltages were applied differently for the programming and read operations, meaning that the two current paths were decoupled. By applying a voltage to the gate and grounding the source, mobile ions were made to migrate vertically during programming. The state change caused by the ion migration can be identified by applying the voltage to the drain instead of the gate. Figure 2(a) displays the current-voltage (I-V) characteristics of the HfO2 RRAM over 50 cycles. Bipolar resistive switching behavior was visible after a forming process at approximately 5 V that led to the creation and clustering of oxygen vacancies. The flow of electrons through the resulting oxygen vacancy-based conductive filament produced a low resistance state (LRS), corresponding to a set operation. The sudden increase in the current was limited by a compliance current (I_cc) to prevent permanent breakdown.
When a negative voltage was applied to the RRAM, the LRS was converted to a high resistance state (HRS), corresponding to the reset operation. In this process, the oxygen vacancies in the filament were dissolved, disconnecting the path between the two electrodes. The memory window between the LRS and HRS was greater than two orders of magnitude at a read voltage (V_read) of 0.1 V during repeated cycling, as shown in figure 2.

HfO2 RRAM-based inference-accelerated systems
Considering the filamentary dynamics of the RRAM, an MLC can be achieved by adjusting either I_cc [15] or the reset voltage (V_reset) [16], as shown in figure 3. A higher I_cc drove more oxygen vacancies toward the filament, thickening it. The LRS resistance was lowered by increasing the I_cc setting from 200 to 750 μA at a given V_reset of −1.2 V (figure 3(a)). However, placing the multiple states in the LRS regime produced a large output current, equal to the sum over all activated LRS cells, which worsened power consumption. On the other hand, the same number of states was obtained by modulating V_reset from −0.9 to −1.15 V in fine steps of −0.05 V at a given I_cc of 1 mA, as shown in figure 3(b). The I-V characteristics for each condition were measured 10 times before the V_reset amplitude was increased to the next value in the same device. The negative V_reset caused the vacancies to disperse from the filament, and the connected filament started to dissolve at about −0.5 V, increasing the resistance of the RRAM. As a larger V_reset amplitude was applied, the switching gap between the electrode and the ruptured filament widened. The multilevel states were thus positioned in the μA range, approximately 100 times lower than the LRS current. Realizing an MLC at these low current levels came at the expense of state uniformity during cycling [17] because the ruptured filament with its switching gap tended to be easily disturbed by residual vacancies, inducing state fluctuation.
The switching variability of the RRAM can be improved by confining the switching region [18], which was realized here by geometrical scaling of the switching layer thickness [19]. In general, a chemical reaction occurs at the interface as the electrode absorbs oxygen ions from the oxide, creating an additional mixed oxide layer. This means that the oxygen ions must first migrate from the HfO2 toward the reactive Ti electrode, and the reaction then proceeds once the oxygen ions come into contact with Ti atoms. That is, the extent to which the reactant, namely the Ti atoms, is exposed to the oxygen ions becomes important [18]. Therefore, the amount of oxygen vacancies created in the HfO2 by the chemical reaction is expected to be the same when a Ti scavenging layer of the same thickness is applied. The impact of the vacancies on the evolution of the filament was thus more pronounced in the thinner HfO2 layer. Since the preexisting vacancies were relatively sufficient to establish the filament, the probability of generating additional vacancies was reduced, thereby suppressing randomly formed paths. As shown in figure 4(a), the distribution of each current level read at 0.1 V, extracted from the I-V traces of 50 cycles for RRAMs operated under the same voltage conditions, was tighter when the switching layer was thinned to 4 nm. While the uniformity was improved, the memory window, which must be large to achieve an MLC, was narrowed, resulting in a trade-off. V_reset was varied for all RRAM devices to identify the root cause of the small memory window. As presented in figure 4(b), the average read currents over 10 cycles in the 5 and 6 nm-thick HfO2 RRAMs were well controlled by V_reset. Conversely, in the 4 nm-thick HfO2 RRAM, the current remained flat without a noticeable change over the entire V_reset range, making it difficult to distinguish the states.
In addition, as the switching layer decreased in thickness, the set voltage (V_set) appeared to become independent of V_reset, as shown in figure 4(c). In general, V_set is proportional to V_reset because the distance the vacancies must travel depends on the extent of filament rupture caused by V_reset, so the required V_set, which is the driving force for moving the vacancies, is correspondingly larger. As expected, V_set and V_reset closely followed a linear relationship for the 6 nm HfO2 RRAM. For the 5 nm-thick HfO2 film, the linear trend shifted toward lower V_set, and for the 4 nm-thick HfO2 layer, V_set eventually became nearly constant with respect to V_reset.
We performed physics-based modeling in MATLAB to understand these observations from a microscopic view. The switching gap, g, was designed to vary over time based on field-induced ion migration through the generation and recombination of the vacancies [20,21]. The equation describing this process is as follows:

$$\frac{dg}{dt} = -\nu_0 \exp\left(-\frac{E_a}{kT}\right) \sinh\left(\frac{\gamma a_0 q V}{L k T}\right)$$

where ν_0 is the velocity containing the attempt-to-escape frequency, E_a is the activation energy for vacancy migration, γ is the field enhancement factor, a_0 is the hopping site distance, and L is the thickness of the switching layer. γ is extracted from a separate relation in which all the parameters are fitting parameters. Then, the current, which is directly related to the extent of filament growth, is described as

$$I = I_0 \exp\left(-\frac{g}{g_0}\right) \sinh\left(\frac{V}{V_0}\right)$$

where I_0 is a constant, and g_0 and V_0 are the gap and voltage coefficients, respectively. The parameters used in the simulation are listed in table 1. The specific gap and its range for each HfO2 thickness were chosen empirically. The experimental traces (symbols) were fitted with the simulated MLC I-V curves (lines) obtained by expansion and contraction of the gap, as shown in figure 5. More importantly, we identified that lowering E_a from 1.46 through 1.17 to 1.07 eV, with the other parameters unchanged, was the primary reason that V_set became invariant, accompanied by a narrowed interval between the multiple HRSs, as a function of HfO2 thickness. As discussed above, a relatively larger number of vacancies actively participated in the filament evolution in the thinner HfO2, where vacancy migration was easier due to the lowered E_a [22]. Although the filament was disconnected at approximately −0.5 V during the reset operation for the 4 nm HfO2 RRAM, numerous vacancies remained around the filament, making it difficult to lower the HRS current level. A larger V_reset helped reduce the HRS current by pushing the vacancies further away from the filament.
However, a reset breakdown phenomenon, namely an irreversible set at negative voltage instead of a lowering of the HRS current, was observed (not shown here). It limited the maximum allowable V_reset to −1.2 V for the 4 nm-thick HfO2, compared with −1.9 V for the 6 nm-thick HfO2. Therefore, our findings revealed that the switching layer thickness needs to be selected by considering both the MLC and reliability aspects such as uniformity and failure.
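The gap-dynamics model described above can be explored numerically with a short sketch. This is not the authors' MATLAB code. It assumes an Arrhenius factor exp(−E_a/kT), a constant γ (the gap-dependent relation for γ is not reproduced here), and the current form I = I_0·exp(−g/g_0)·sinh(V/V_0) commonly used in gap-based compact models. Every numerical value below (ν_0, γ, a_0, L, gap limits, I_0, g_0, V_0) is a hypothetical placeholder rather than an entry from table 1.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K


def gap_rate(v, e_a, gamma, a0_nm, thickness_nm, nu0_nm_s, temp_k=300.0):
    """dg/dt in nm/s; negative voltage widens the gap (reset)."""
    kt = K_B_EV * temp_k  # thermal energy in eV, so q*V/kT becomes V/kt
    field_term = gamma * a0_nm * v / (thickness_nm * kt)
    return -nu0_nm_s * math.exp(-e_a / kt) * math.sinh(field_term)


def apply_pulse(g_nm, v, duration_s, e_a, *, gamma=8.0, a0_nm=0.25,
                thickness_nm=5.0, nu0_nm_s=1e13,
                g_min_nm=0.1, g_max_nm=3.0, steps=100):
    """Integrate the gap with forward Euler over one voltage pulse."""
    dt = duration_s / steps
    for _ in range(steps):
        g_nm += gap_rate(v, e_a, gamma, a0_nm, thickness_nm, nu0_nm_s) * dt
        # Keep the gap inside an assumed physical range
        g_nm = min(max(g_nm, g_min_nm), g_max_nm)
    return g_nm


def read_current(g_nm, v, i0_a=1e-3, g0_nm=0.25, v0=0.25):
    """Assumed current form: decays exponentially with the gap."""
    return i0_a * math.exp(-g_nm / g0_nm) * math.sinh(v / v0)


def hrs_levels(v_resets, e_a, g_start_nm=0.2, pulse_s=0.1, v_read=0.1):
    """Read current after a single reset pulse for each V_reset value."""
    return [read_current(apply_pulse(g_start_nm, v, pulse_s, e_a), v_read)
            for v in v_resets]
```

For example, `hrs_levels([-0.9, -1.0, -1.1, -1.2], e_a=1.17)` returns a list of read currents that decreases as the reset amplitude grows, mirroring the V_reset-controlled multilevel HRS qualitatively. Rerunning with a lower `e_a` makes the gap respond faster at the same voltage.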
In this regard, the 5 nm RRAM was optimal in this work. Based on the achieved MLC (figure 6(a)), the feasibility of inference tasks on the Modified National Institute of Standards and Technology (MNIST) dataset was evaluated. Ten output neurons were connected to 784 input neurons through two hidden layers with 250 and 125 neurons to classify handwritten digits from 0 to 9, as shown in figure 6(b). The network was pretrained on the dataset with 10 bits, and the synaptic weights were then quantized [23]. Since the relatively simple MNIST dataset was used for classification, a high inference accuracy of approximately 94% was obtained in the ideal case even when the synaptic device had only binary states, as shown in figure 6(c). As the number of bits per RRAM was increased, an accuracy greater than 95% was ensured. When state instability was included to reflect the real case, the accuracy for 1 bit per RRAM dropped significantly as the deviation increased. The MLC of the RRAM noticeably mitigated this accuracy degradation.
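The quantization step mentioned above, in which pretrained weights are mapped onto the limited number of states of an MLC, can be illustrated with a uniform quantizer. This is a sketch under the assumption of evenly spaced levels between the smallest and largest weight. The scheme actually used in reference [23] may differ, and the function name and sample weights are hypothetical.

```python
def quantize_weights(weights, bits):
    """Map each weight to the nearest of 2**bits evenly spaced levels."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    levels = 2 ** bits
    w_min, w_max = min(weights), max(weights)
    if w_max == w_min:
        return list(weights)  # nothing to quantize
    step = (w_max - w_min) / (levels - 1)
    # Round to the nearest level index, then convert back to a weight
    return [w_min + round((w - w_min) / step) * step for w in weights]


def max_quantization_error(weights, bits):
    """Largest absolute difference between a weight and its level."""
    quantized = quantize_weights(weights, bits)
    return max(abs(w - q) for w, q in zip(weights, quantized))


# Hypothetical pretrained weights
SAMPLE_WEIGHTS = [-1.0, -0.4, 0.1, 0.6, 1.0]
```

With 1 bit, every weight collapses onto one of two levels. Each additional bit roughly halves the spacing between levels and therefore the worst-case rounding error, which is one way to see why more states per cell help inference accuracy.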
From a practical point of view, the reliability of the multiple states must be investigated for hardware implementation of RRAM-based neural networks [24-26]. By introducing a verification algorithm, equally spaced multilevels can be demonstrated at the chip level [27]. State stability is more important for inference engines, where the states are rarely programmed. Specifically, state instability can be accelerated by the elevated temperature that occurs while the chip is running. In this case, a refresh algorithm can mitigate resistance drift, whether it proceeds in a specific direction or randomly [28].
Finally, permanent failures should be carefully analyzed and managed. As discussed above, the unexpected reset breakdown of the device, which was one of the failure modes, induced a stuck-at-LRS fault [29]. Since the combined contribution of all RRAMs located in the selected column is what matters in neuromorphic systems, the high current of a single faulty device makes it difficult to classify the output values during inference. Therefore, robust multilevel RRAM should be assessed systematically through an understanding of the root causes of possible failure scenarios in various operating environments.

HfO2 ECRAM-based training-accelerated systems
In the aforementioned HfO2 RRAM, transitions between states were available only when either I_cc or V_reset was changed at every step, moving the state upward or downward, respectively. Using nonidentical conditions in every cycle places a massive burden on the driving or peripheral circuitry, eventually degrading system power and latency. Hence, we studied the three-terminal ECRAM device designed to respond to identical pulses for training-accelerated systems. For the desired analog synaptic operation, mobile ions in the electrolyte, actuated by an additional gate terminal, reach the channel, which usually consists of an oxide semiconductor [14,30]. The oxide semiconductor channel then becomes highly conductive because the injected ions serve as dopants. This results in a channel current that is controllable as a function of the gate input. In the early stage, Li ions and a corresponding ion conductor, such as lithium phosphorus oxynitride, were used as the mobile ions and solid electrolyte, respectively, inspired by rechargeable secondary ion batteries [14,31-34]. Various other mobile ions, such as Na [35] and H [36], have been explored recently based on their excellent synaptic behavior. Here, taking CMOS compatibility into account, we configured the ECRAM device with a Cu/HfO2/WOx stack. A Cu electrode, which can be directly integrated into the back-end-of-line interconnect process, was used to provide the mobile Cu ions. An HfO2 electrolyte was utilized to allow gate-controlled Cu ion motion. Since the W atom has multiple valence states in the WOx channel, it can induce a wide range of conductivity changes. Figure 7(a) presents a transmission electron microscopy (TEM) image of the multilayered ECRAM gate stack, and each layer was clearly identified by energy-dispersive x-ray (EDX) analysis, as shown in figure 7(b).
In addition to the HfO2 layer, we investigated other electrolyte materials, namely SiO2 [36] and MoO3 [37], selected from previously reported literature. Figure 8 demonstrates that the normalized read current, which reflects the ions migrated by a gate pulse of 8 V with a pulse width of 1 s, depended strongly on the electrolyte used. The read current of the MoO3/WOx stack did not change regardless of how many gate pulses were successively applied. In contrast, the field-driven Cu ions began to enter the channel area with both the SiO2 and HfO2 electrolytes. Notably, in a preliminary study, the channel current responded only when the SiO2 was doped with hydrogen. Therefore, it can be inferred that the observed change in the channel current was not due to interface defects but was mainly caused by mobile ions additionally incorporated in the electrolyte. Since a hydrogen-free undoped SiO2 electrolyte was utilized here, we believe that Cu ion migration is essential in adjusting the channel current. Compared with the SiO2 electrolyte, the HfO2 electrolyte produced a more sensitive increment under continuously applied gate pulses, indicating progressive ion movement in the HfO2/WOx stack.
The synaptic characteristics shown in figure 9(a) were achieved through material optimization of the HfO2/WOx structure, which is discussed in detail in reference [38]. The lateral channel current read at 0.5 V was tuned in an analog manner by identical gate voltage pulses of ±3 V with a pulse width of 1 s. A higher gate voltage amplitude enlarged the dynamic range of the current change. Furthermore, the gate-controlled channel property was electrically verified using an area scaling analysis, as shown in figure 9(b). As the channel area was reduced, the read current was lowered proportionally, meaning that the entire channel area participated in the resistive switching. To clarify the dominant switching location, we measured the gate current by applying a voltage to the gate and grounding the source. The current remained below the μA range while the voltage was swept to 3 V, without an abrupt increase, as shown in figure 9(c). Instead, the current increased smoothly under repeated voltage sweeps. This indicated that no local leakage paths were formed by either oxide breakdown or clustering of the Cu ions, and the impact of the gate side on the channel current was thus negligible.
The obtained synaptic characteristics were expressed in terms of a linearity factor, α, which is a performance index of the synaptic device [39]; α was 2.42 for the current increase and 0.21 for the current decrease, as shown in figure 10. The impact of the linearity on the recognition accuracy was then evaluated using a neural network with back-propagation algorithms constructed in MATLAB. When an error arose in the output neurons during propagation, the states of the ECRAM synaptic devices were updated following the state versus pulse number curve (figure 10(a)). The filamentary RRAM exhibited a nonlinear and asymmetric synaptic response, resulting in a low accuracy of approximately 10%. On the other hand, the training accuracy (90.8%) was dramatically improved by the gradual update operation of the ECRAM, as shown in figure 10(b).
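To make the role of a linearity factor concrete, the sketch below uses one behavioral form that is sometimes adopted for conductance versus pulse number, G = ((G_max^α − G_min^α)·w + G_min^α)^(1/α) with w = n/N, where α = 1 gives a perfectly linear update. Whether this is the exact definition behind reference [39] is an assumption, and the conductance bounds and pulse count are hypothetical.

```python
def conductance(n, n_pulses, g_min, g_max, alpha):
    """Conductance after n of n_pulses identical pulses (assumed model)."""
    w = n / n_pulses  # normalized pulse number between 0 and 1
    base = (g_max ** alpha - g_min ** alpha) * w + g_min ** alpha
    return base ** (1.0 / alpha)


def update_curve(n_pulses, g_min, g_max, alpha):
    """Full state-versus-pulse-number curve, endpoints included."""
    return [conductance(n, n_pulses, g_min, g_max, alpha)
            for n in range(n_pulses + 1)]


def max_nonlinearity(n_pulses, g_min, g_max, alpha):
    """Largest deviation from the ideal straight line, normalized
    to the dynamic range (0 means perfectly linear)."""
    worst = 0.0
    for n, g in enumerate(update_curve(n_pulses, g_min, g_max, alpha)):
        ideal = g_min + (g_max - g_min) * n / n_pulses
        worst = max(worst, abs(g - ideal) / (g_max - g_min))
    return worst


# Hypothetical dynamic range and pulse count
G_MIN, G_MAX, N_PULSES = 1e-6, 1e-5, 50
```

In this model, a curve with α above 1 bows above the straight line and a curve with α below 1 bows below it, so the distance of α from 1 indicates how unevenly identical pulses change the state.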

Conclusion
We addressed CMOS-compatible HfO2-based RRAM and ECRAM devices for inference- and training-accelerated neuromorphic systems, respectively. We examined how the HfO2 switching layer thickness affected the achievement of an MLC by means of electrical characterization and MATLAB-based modeling. Our measurement results revealed a trade-off between the MLC and switching uniformity that depends on the HfO2 thickness. The appropriate thickness of the main switching layer should be considered when the V_reset modulation technique is used to obtain an MLC in filamentary switching. For training-enhanced analog synaptic behavior, whole-area switching was preferred over a locally formed filament. The Cu ion-actuated ECRAM with the HfO2/WOx stack has thus been exploited, and a gradual synaptic response driven by the number of identical gate voltage pulses was demonstrated. We hope that our findings can provide insight into designing device structures aimed at a specific target application.