Classification tasks using input driven nonlinear magnetization dynamics in spin Hall oscillator

The inherent nonlinear magnetization dynamics in spintronic devices make them suitable candidates for neuromorphic hardware. Among spintronic devices, spin torque oscillators such as spin transfer torque oscillators and spin Hall oscillators have shown the capability to perform recognition tasks. In this paper, with the help of micromagnetic simulations, we model and demonstrate that the magnetization dynamics of a single spin Hall oscillator can be nonlinearly transformed by harnessing input pulse streams and can be utilized for classification tasks. The spin Hall oscillator utilizes the microwave spectral characteristics of its magnetization dynamics for processing a binary data input. The spectral change due to the nonlinear magnetization dynamics assists in real-time feature extraction and classification of 4-binary digit input patterns. The performance was tested for the classification of the standard MNIST handwritten digit data set and achieved an accuracy of 83.1% in a simple linear regression model. Our results suggest that modulating time-driven input data can generate diverse magnetization dynamics in the spin Hall oscillator that can be suitable for temporal or sequential information processing.


Results and discussion
SHO device model. A conceptual schematization of the SHO-based feature filter is shown in Fig. 1a. It consists of a pulse input, a SHO, a frequency domain output and a filter. The modelled SHO is composed of platinum/permalloy bilayer (Pt/NiFe) with a lateral size of 100 × 100 nm 2 . Each layer has a thickness of 5 nm. The magnetization is aligned at an angle φ = 90° (+ Y), by applying an in-plane magnetic field, µ 0 H ext = 100 mT . The SHE in Pt causes spin-dependent electron scattering to the top and bottom surfaces of Pt upon the flow of current I c in the X direction. This leads to spin accumulation at the interface of Pt/NiFe and a subsequent transfer of spin angular momentum to NiFe results in a transverse flow (Z) of spin current ( I s ). The ratio between charge current density ( j c ) and spin current density ( j s ) is characterized by the spin Hall conversion efficiency, θ SH (spin Hall angle) 25,26 . The j s results in two SOTs known as field-like torque (FLT) and damping-like torque (DLT). FLT is the result of an interfacial effect due to the strength of spin accumulation at the interface, whereas DLT is the result of bulk phenomena of the SHE 41,42 . Due to the minimal impact of FLT for the 5 nm thick Pt Layer, we have only taken into account the role of DLT on the dynamics of magnetization in the simulations. The natural NiFe damping can be controlled by DLT and fully compensated to produce auto-oscillations in the gigahertz frequency range by increasing j c 29,33,34 . The "Methods" section contains information on the simulation technique and the material parameters. The input data are in the form of current pulses. The binary digits "1" and "0" are encoded as two distinct current values, I 1 and I 0 , respectively. The output channel is assumed to be of anisotropic magnetoresistance (AMR) based electrical read-out which depends on the magnetization component Mx 24,43 . To represent the collective behavior of the SHO for the given input signal, the fast Fourier transform (FFT) transforms the simulated time-dependent Mx into frequency spectra. The main benefit of frequency domain analysis is the reduced volume of output data for additional computations. The following section will discuss how a filtering mechanism can be used to perform feature extraction and a straightforward linear classification of input data. Inherent magnetization dynamics excited by SOT. We begin by examining the magnetization dynamics of the modeled SHO device as a function of I 1 for a single current pulse with a pulse width ( τ ) of 3 ns and pulse rise and fall times of 1 ns. In Fig. 1b, the magnetization component Mx's temporal evolution is depicted for I 1 = 5.0, 5.5, 6.0, and 6.5 mA with I 0 = 0 mA. SOT excitation of a ferromagnetic resonance mode causes magnetization oscillations at I 1 = 5.0 and 5.5 mA. They exhibit small angle precession, in which the cone angle increases with an increase of I 1 [44][45][46] . As shown by the colored circles in Fig. 1b, the initiation and relaxation times vary depending on the strength of I 1 . This indicates the manipulation of effective damping by the SOT. These excitations can be converted into self-sustaining auto-oscillations by gradually increasing the precession amplitude with τ until it saturates at the limit cycle of auto-oscillation 47 . For I 1 = 6.0 and 6.5 mA, the oscillations correspond to the in-plane and out-of-plane auto-oscillation modes, respectively. The equilibrium energy, which regulates the auto-oscillation's limit cycle, is influenced by the device geometry, mode of excitation, and www.nature.com/scientificreports/ direction of the effective field. The auto-oscillation orbit for the single magnetic domain model is circular in the out-of-plane direction and shaped like a clamshell in the in-plane direction 45 . The single cycle trajectories for each oscillation mode are depicted in Fig. 1c. In Fig. 1d, the FFT amplitude spectrum for the I 1 range (5 mA to 6.5 mA) is displayed. For I 1 < 5.5 mA, the FFT amplitude increases linearly with increasing I 1 , but the oscillation frequency, which can be seen from the peak position of the FFT amplitude in the spectra, is constant at 9.0 GHz. This constant frequency corresponds to the ferromagnetic resonance frequency (Supporting Fig. S1). For 5.5 mA < I 1 < 6.0 mA, the frequency undergoes a red shift due to the large angle motion of magnetization components, as can be seen in Fig. 1c ( I 1 = 6.0 mA), which reduces the effective demagnetization field in the NiFe layer. The magnetization component transverse to H ext undergoes oscillations at twice the oscillation frequency in order to maintain a constant magnitude and thus reduces the frequency. As I 1 is increased further, the frequency decreases, reaching 7.1 GHz at I 1 = 6.0 mA. This frequency shift is attributed to the complex coupling of oscillatory amplitude and phase as predicted by the nonlinear auto-oscillator theory for STOs 48 . For I 1 = 6.5 mA, the frequency increases to 9.1 GHz and the FFT amplitude reduces due to the out-of-plane oscillation, as can be seen in Fig. 1c ( I 1 = 6.5 mA). As a result, the magnetization dynamics in the SHO can be divided into three regimes, as shown in Fig. 1e: the linear excitation regime for I 1 < 5.5 mA, where the frequency is constant with increasing precession amplitudes as a function of I 1 , the nonlinear excitation regime for 5.5 mA < I 1 < 6.0 mA, where the FFT amplitude is saturated and the www.nature.com/scientificreports/ frequency decreases drastically with increasing I 1 and the out-of-plane oscillation regime for I 1 > 6.5 mA where the frequency increases and the FFT amplitude decreases. These nonlinear frequency amplitude relationships can be used to classify the inputs.

Binary inputs classification. Regular pulse scheme.
After investigating magnetization dynamics, we look into the SHO's ability to classify n-binary input data. The pulse stream is represented by the encoded input signal n -bi(t), which has current values I 0 and I 1 for "1" and "0", respectively. The pulse period ( t ) and width ( τ ), respectively, 4 ns and 3 ns. The pulse width τ , includes a rise time of 1 ns and a fall time of 1 ns. Figure 2 represents 4 -bi(t) input pulse patterns, Mx responses, and FFT amplitude spectra (frequency) with input current values of I 1 = 3.5 mA and I 0 = 0 mA, which lies in the linear excitation regime. For the input pattern 1111 in Fig. 2a, the Mx response in Fig. 2b shows the magnetization oscillations with varying amplitude for each "1" input, and the corresponding FFT spectrum is shown in Fig. 2c. For the 1001 pattern in Fig. 2d, the magnetization relaxes to its initial state in the time between the two "1" inputs, as shown in Fig. 2e. The corresponding amplitudes in the FFT spectrum (Fig. 2c,f) for the input patterns of 1111 and 1001 allow one to clearly see the difference in magnetization dynamics. In order to categorize the patterns of the input data and extract features from the outputs, we employ the classic signal processing tool of filtering. We apply the filter neuron concept, which is inspired by recurrent neural networks' use of it for feature extraction and handling time-varying outputs, to filter a specific feature in the output data 49,50 . The linear excitation regime frequency, 9.0 GHz, is fixed as the filter and the FFT amplitude value at this frequency is set as the filter output characteristic feature in order to separate the input patterns in the output spectrum. The filtered amplitude values for the input patterns 1111 and 1001 at 9.0 GHz are 0.0042 and 0.00002, respectively, as shown in Fig. 2c,f. Note that this filtering strategy differs from the standard bit slicing techniques used in the computing paradigm. The bit slicing method maps n input elements to n output values and then performs additional computations for weight optimization 9 . But in this case, n input elements are mapped to a single output value using the quantization technique. This method is well suited for the reduction of output data and does not call for weight optimization for the classification task of input patterns, which can reduce the computation costs 51,52 .
The relaxation of magnetization precession during the interval between two consecutive pulses poses a challenge to the classification task. For the 0101 input pattern in Fig. 2g, the Mx response is displayed in Fig. 2h. Since the magnetization has relaxed to its initial state prior to the second "1" pulse arrival, the individual "1" pulses exhibit the same oscillating amplitude. As can be seen in Fig. 2f,i, the resulting FFT spectra have the same amplitude as the 1001 pattern. In this case, it is not possible to distinguish between the SHO's output and any of the possible 4 -bi(t) input patterns. Input pulse parameters I 0 , I 1 and τ can be varied to affect the dynamics of the magnetization, but in the linear excitation regime, patterns like 1000 and 0001 still produce the same FFT spectra. This is caused by the same magnetization dynamics, but in a different time frame, for each "1" pulse. We refer to this type of input pulse stream as the regular pulse scheme. Supporting Figures S2 and S3 provide an analysis of the magnetization dynamics for various I 1 values, as well as the resulting FFT spectra and the inability to separate 4 -bi(t) input patterns.
Modified pulse scheme. To tackle the challenge faced in the regular pulse scheme, we resort to modifying the input driven magnetization dynamics rather than the internal structure of the device 53 . Figure 3a   www.nature.com/scientificreports/ reaches an amplitude of 0.18 and relaxes to the ground state within 2 ns when I e = 3.5 mA and I 0 = 0 mA. This can be clearly seen from the upper envelope plot of Mx shown after the pulse is off at 8 ns in Fig. 3c. However, by introducing an offset value for I 0 , the relaxation time can be extended. Figure 3d shows an excitatory pulse with an offset value for I 0 ( I e = 3.5 mA, I 0 = 1.2 mA). The oscillation amplitude increases to 0.38 due to the increasing precessional amplitude with the offset current, as shown in Fig. 3e, and the relaxation period is extended by 8 ns, as shown in Fig. 3f. This allows us to modify the magnetization dynamics in the SHO during the inputs for different 4 -bi(t) patterns, as will be discussed below.
In order to extend the dynamics of magnetization relaxation for the duration of the input pulse pattern, a modified pulse scheme ( I mod ) that includes the excitatory pulse ( I e ) and a pulse gap ( δ ) prior to the introduction of the 4 -bi(t) is used. Hence, the SHO responds to a combination of two input signals, I e and bi(t) given by, A modified pulse scheme with I e = 3.0 mA, t 1 = 7 ns, δ = 5 ns, I 0 = 1.2 mA, I 1 = 2.4 mA, t = 4 ns, τ = 3 ns is used for illustration. For simplicity, we denote these input patterns with the above mentioned parameters as IP 1 . Figure 4a shows the input pattern of 1010, the Mx response in Fig. 4b, and the FFT spectrum in Fig. 4c.
Similarly, Fig. 4d shows the input pattern for 0101, the Mx response in Fig. 4e, and the FFT spectrum in Fig. 4f. As can be seen from the Mx responses in Fig. 4b,e, each "1" pulse exhibits a different oscillating amplitude as the magnetization dynamics is influenced by both the prior I e and the corresponding pulses. Since I e 's influence gradually diminishes over time, each output has a unique dynamic, and the degree of influence of previous inputs varies as a function of time. As anticipated, the variation in relaxation dynamics has a significant impact on oscillation amplitude. For the I mod scheme, the FFTs were obtained from the Mx in the range of input pulse patterns n -bi(t) (Supporting Fig. S4). Figure 4c,f show the variation in the FFT amplitude and frequency as well www.nature.com/scientificreports/ as the filtered (9.0 GHz) amplitude values 0.037 and 0.017 for the patterns 1010 and 0101, respectively. Because of the input pattern-specific nonlinear magnetization dynamics, I mod scheme can easily classify these two input patterns from the filtered amplitudes, whereas the previous regular pulse scheme could not. Similarly, the 1000 and 0001 patterns can also be distinguished using the filtered amplitude due to the variations in the relaxation rates. Supporting Figure S5 shows the 16 different n -bi(t) input pattern combinations and their corresponding magnetization dynamics. Figure 5a displays the filtered FFT amplitude for the 16, 4 -bi(t) input patterns using IP 1 . Due to the distinction in the filtered amplitude values, a 4 -bi(t) pattern can be quantized and represented as a 1-dimensional analog output.
Seperability index. We further examined how different input I e and δ values in IP 1 affect the separation of 4-bi(t) digit patterns. The results are compared using a commonly used metric in pattern recognition problems, called separability index, (SI) which is a measure of the average difference between the outputs, for different classes of input parameters 54,55 . SI for IP 1 with I e = 1.4-6.0 mA and δ = 5 ns is shown in Fig. 5b. For I e = 1.4 and 2.0 mA, the magnetization precession excited by the first "1" pulse relaxes before the arrival of the next pulse. This results in the FFT amplitudes being very close to each other, leading to a low SI. On the other hand, for I e = 4.0, 5.0, and 6.0 mA, the magnetization precession excited by the first "1" pulse shows a large amplitude of auto-oscillation, resulting in the consecutive bi(t) pulses oscillating at the same amplitude. As a result, input patterns like 0111, 1011, 1101, 1110, and 1111 produce FFT amplitudes that are very close to one another. The close range of SI values for I e = 4.0, 5.0, and 6.0 mA indicates that the excitatory pulse's ability to distinguish between the various input patterns has been lost. We observed a maximum SI in the case of I e = 3.0 mA, which indicates that a balance between the I e pulse and bi(t) pulses is needed. Figure 5c shows SI for IP 1 with a range of pulse gaps, δ = 1-25 ns and I e = 3.0 mA. The effect of I e on bi(t) pulses decreases as δ increases. Therefore, for input patterns like 0010 and 0001, the FFT amplitudes become similar to one another. I e has a noticeable impact on bi(t) for up to 20 ns before it is lost completely at δ = 25 ns. For δ > 25 ns, the magnetization dynamics are identical to the regular pulse scheme. Therefore, we can conclude that the separation of 4-bi(t) inputs can be optimized to obtain the highest separation at the filtered FFT amplitude with proper I e and δ . A comprehensive analysis of magnetization dynamics for various δ and I e are summarized in the Supporting Figs. S6 and S7.
The MNIST handwritten digit classification. Finally, the SHO device with the modified pulse scheme is evaluated for the classification of handwritten digits of the Modified National Institute of Standards and Technology (MNIST) handwritten database 56 . The database has 60,000 training images and 10,000 test images for the digits 0 to 9. Each image is 28 × 28 pixels in size and is displayed in grayscale, with pixel intensities ranging from 0 to 255. Figure 6a depicts the hypothetical model network, which contains an input layer, the SHO layer, and a classifier layer. The workflow of the network is assumed to have the following layer operations: the images are preprocessed in the input layer so that the original grayscale format is binarized with a threshold (pixel intensity > 125 = 1), where 1 stands for a white pixel and 0 for a black pixel. Following this, the images are divided into 4 pixel segments that move along rows and are then converted into input current pulses, creating a total of 196, 4 − bi(t) input patterns. Each of the 4 − bi(t) inputs are encoded as a pulse stream and fed to the SHO layer using the modified pulse scheme with IP 1 parameters. As was discussed in the previous section, the filtered FFT amplitude is collected as the output of the SHO layer for 4 − bi(t) pattern at 9.0 GHz. For the results shown here, custom functions were created in Matlab and Python programming languages to replace 4 pixel values www.nature.com/scientificreports/ with corresponding FFT amplitudes. The classifier layer has 10 nodes that are all fully connected and are used to categorize the 10 handwritten digits from the maximum entry. The MNIST handwritten digit classification accuracy is evaluated as the ratio of the total number of correctly classified digits to the total number of digits. First, using simple linear regression the weight matrix was calculated using the training data and tested on the test set of 10,000 images. The accuracy obtained by this one-step calculation is 83.1%. The predicted vs. true digit confusion matrix, shown in Fig. 6b as a color map, displays the classification success for the IP 1 modified pulse scheme. The digit 5 is the least successful at being classified, which lowers the success rate as a whole. In addition, the SHO network model was evaluated in a supervised learning process carried out in Python using the Tensorflow machine learning module 57 . We fixed the softmax activation function for the classifier layer and categorical cross entropy as loss function. The classification accuracy achieved with supervised learning with 20 epochs and 32 batch sizes was 86.6%. This indicates the overall www.nature.com/scientificreports/ classification accuracy can be improved with classifier layer training. However, our aim is to reduce the previous layer computations for feature extraction.
To illustrate the reduced computations, we performed network training comparing with a software FNN. The software-FNN model consisted of 784 input neurons for each input pixel, a fully connected middle layer of 196 neurons for feature extraction, and ten neurons (the classifier layer) for classification. We used the rectified linear activation function (ReLU) for the feature extraction layers and the softmax activation function for the classifier layer. It is clear that the software-FNN requires 196 neurons and associated computations to generate 196 features for the classifier layer, similar to the SHO layer outputs. The confusion matrix of the software-FNN, shown in Fig. 6c, achieved an accuracy of 93.1%. When compared to the SHO network, the software-FNN improves accuracy by 7.5%. However, the computational cost of calculating the weight matrix is high. Figure 6d shows the accuracy and weight parameters required for the classification for the linear SHO (single linear weight computation), SHO-FNN (supervised learning), and software-FNN. It can be seen that with the use of the SHO layer, the weight computations required for a software network can be significantly reduced from 155,820 weight parameters to 1960 parameters. The ability of the SHO layer to directly infer distinct outputs for 4 − bi(t) patterns reduces the computation required. Although there is a trade-off between recognition accuracy and inference computations for the MNIST handwritten digit classification due to various factors like binarization of images, choice of classifier layer training methods, and loss functions, the SHO network is still favorable since it can achieve 83.1% with linear regression. Moreover, a sequence of 4 binary digits can be easily classified without any requirement of weight storage or computations, which can be beneficial for applications that need faster inference with reduced computations. Nevertheless, there is a huge opportunity to engage in the co-design of SHO as a dedicated feature mapping layer alongside already existing hardware and algorithms.

Conclusion
In this study, a single spin Hall oscillator capable of real-time classification of sixteen different binary 4 − bi(t) patterns (0000 to 1111) was demonstrated. Control and tuning of the intrinsic magnetization oscillations of the spin Hall oscillator were performed by modifying an input digit pulse pattern. The performance of the model was tested with the standard MNIST handwritten digit data set classification and achieved an accuracy of 83.1% with a linear training network and reduced output layer computations. Hence, these results are expected to provide an insight into the controlling and tuning of spintronic oscillators for real time and on-device neuromorphic computations.

Methods
Micromagnetic simulations. The micromagnetic simulations were carried out using the LLG micromagnetic simulator in the framework of a single domain model [58][59][60] . The temporal dynamics of the ferromagnetic layer was solved by the Landau-Lifshitz-Gilbert (LLG) equation with the spin transfer torque term, where m = M M s is the normalized magnetization vector, γ is the gyromagnetic ratio, α is the Gilbert damping parameter, µ 0 is the vacuum permeability, M s is the saturation magnetization, ℏ is the reduced Planck constant, e is the electron charge, and t FM is the thickness of the magnetic layer. The effective field H eff includes the external magnetic field, the magneto-crystalline anisotropy field, and the demagnetization field. θ SH is the spin Hall angle, which characterizes the conversion efficiency of charge current density j c to spin current density j s in the heavy metal layer. σ = −sgnθ SH ( z × j c ) is the orientation of spin injected into the ferromagnet, where z and j c are the unit vectors in the direction of surface normal and the electrical current, respectively. The −sgn factor changes with the position of ferromagnet, i.e., if the ferromagnet is atop the HM, z would face into the HM, or if the ferromagnet is below the HM, z would face into the ferromagnet but it's sign would be opposite to the prior case. In accordance with experiments, the material parameters used in the simulations 38,39 , are: µ 0 M s = 1.0T , α = 0.02 , θ SH = 0.07 , resistivities ρ NiFe = 4.5 × 10 −7 m and ρ Pt = 2.0 × 10 −7 m . The magnetic anisotropy is ignored, and an in-plane external magnetic field of strength µ 0 H ext = 100mT is applied to saturate the magnetization along the + Y direction. Seperability index. The separability index (SI) were calculated between the N different outputs ( a i , .., a n ) of the spin Hall oscillator (filtered FFT amplitudes), corresponding to each of the N different input patterns via, where i and j are the indices of summation and N = 16 (16 patterns). The SI value of a particular input current scheme indicates how well the different input patterns can be distinguished from the closeness of the output values. Thus, we can infer that the input current scheme having the highest SI value is the most suited for performing classification and recognition tasks.
Network training and testing. Linear regression. For the SHO based network training, all of the images in the MNIST dataset were binarized first, and with custom functions another dataset replacing the 4 × 1 or 2 × 2 pixels with FFT amplitudes was created. For the direct software based networks implementation, we used the inbuilt Matlab/Python functions for converting black and white images within the respective classification pro- www.nature.com/scientificreports/ grams. For the linear regression readout layer, we used linear regression function executed in Matlab software. During the learning process, for a particular image, the output amplitudes were mapped into a one-dimensional feature vector of 196 elements. This process was repeated for all 60,000 training images, creating a 60,000 × 196 output matrix O. Then, a 196 × 10 weight matrix W (where each column corresponds to each digit from 0 to 9) was calculated using the output matrix O and a label matrix L (60,000 × 10) containing the true labels for each training image. In each row of the label matrix, the ( l + 1)th column had a value of 1 and the remaining columns had a value of 0. l is the true digit ( l = 0, 1..., 9). In linear regression, assuming a linear relationship between the output matrix O and the label matrix L, the weight matrix W was calculated using pseudo inverse W, Nonlinear activation functions. The SHO and software-FNN network models were evaluated in a supervised learning process performed in Python using the Tensorflow. All the networks were trained for 20 epochs for the weight optimization. The categorical cross-entropy error is reduced using the ADAM optimizer with default learning rate and a batch size of 32. The activation functions ReLU and softmax for the input (z) are given by, The softmax z i are the results of previous layer and K is the number of outputs, which corresponds to the number of classes to be distinguished ( K = 10 for MNIST handwritten digits 0-9).

Data availability
The datasets generated and analysed for this study is available from Kyushu Institute of Technology repository at http:// hdl. handle. net/ 10228/ 00008 940. The MNIST dataset is available at http:// yann. lecun. com/ exdb/ mnist.