An artificial neural network as a predictor of electrical characteristics of nanoelectronic device channel based on a low-dimensional heterostructure

In this paper a computational algorithm for calculating current density of low-dimensional semiconductor heterostructures based on an artificial neural network is proposed. The neural network training is performed using the quantum-mechanical model of Green’s Functions.


Introduction
The rapid development of modern electronics has made it possible to use low-dimensional semiconductor heterostructures to extend the operating frequency range of so-called "ultrafast" devices [1,2]. However, the small channel size of such nanoelectronic devices makes it difficult to ensure their reliability, because degradation processes have a critical influence not only during operation but even during manufacture. Modern technology computer-aided design (TCAD) systems are in most cases based on semi-classical current transfer models [3], which makes them inapplicable to the latest structures with quantum confinement and does not allow degradation in such structures to be assessed at the design stage. In practice, developers are forced to rely on "blind" selection or on intuitive methods "from production experience" to choose design parameters of ultrafast devices that provide the required performance characteristics. For these reasons, a pressing problem is the development of a TCAD software package for devices based on low-dimensional semiconductor structures with quantum confinement that allows the operating properties and reliability parameters of a nanoelectronic device to be evaluated at the design stage [4,5].

Current density prediction model
When developing a TCAD system for designing modern ultrafast devices based on low-dimensional structures, it is crucial to use computing resources efficiently, because the current transport model must be evaluated many times to assess the kinetics of the current-voltage characteristic (CVC) and to predict reliability parameters. For example, the stochastic Monte Carlo models now commonly used to simulate current transport in devices based on low-dimensional semiconductor heterostructures require, despite several advantages, significant computational resources [6]. Semiclassical models use computer resources more efficiently. However, because of the various approximations of the carrier distribution functions on which they are based, semiclassical models simulate the electrical parameters of nanoscale devices less accurately [7]. Previously, the authors proposed a combined quantum-mechanical model for predicting electrical parameters, based on the Non-Equilibrium Green's Function (NEGF) formalism [8]. The proposed model predicts the electrical parameters of low-dimensional structures with high accuracy. To increase the computational efficiency of the electrical characteristics prediction module, the authors consider in this article the possibility of using an artificial neural network, trained on the combined quantum-mechanical model, as a predictor.

Artificial neural networks: basic concepts and parameters
Artificial neural networks are now widely used in various fields of technology to approximate complex functional dependencies [9,10]. It is well known that any continuous function of many variables can be approximated by a neural network to a given accuracy [11]. In general, an artificial neural network is a system of connected simple computational elements called neurons. The input of a neuron receives data either from the outside or from the outputs of other neurons. A weighted sum of all incoming signals is then computed. Next, a non-linear transformation is applied to this sum by a so-called activation function, and the result is passed to the neuron's output. The network structure comprises an input layer that receives all features of the described model, an output layer that delivers the model results, and hidden layers between them. The number of neurons in the input layer is determined by the number of independent variables of the simulated function, and the number of neurons in the output layer is equal to the number of output variables. The number of hidden layers and the number of neurons in them are specified when developing the structure of a neural network according to the problem to be solved.
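The operation of a single neuron described above (weighted sum followed by an activation function) can be illustrated with a minimal sketch. The input values, weights, bias and the choice of tanh as activation are illustrative assumptions and are not taken from the authors' model.

```python
import numpy as np

def neuron_output(inputs, weights, bias, activation=np.tanh):
    """Weighted sum of the incoming signals followed by a non-linear activation."""
    z = np.dot(weights, inputs) + bias  # weighted sum of all incoming signals
    return activation(z)                # non-linear transformation

# Hypothetical example values (assumptions, for illustration only)
x = np.array([1.0, -2.0, 0.5])   # signals arriving at the neuron
w = np.array([0.2, 0.1, -0.4])   # synaptic weights
b = 0.7                          # bias term

y = neuron_output(x, w, b)       # weighted sum is 0.5, so y = tanh(0.5)
```

Any other activation function can be passed through the `activation` argument without changing the rest of the sketch.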
The neural network must be trained to perform the required calculations correctly. Training amounts to solving an optimization problem: finding the weights of the synaptic connections between neurons at which the learning error is minimal. The main training method for neural networks used in regression is supervised learning. In this approach, which is based on a finite set of examples, the weights are corrected to minimize the error between the neural network outputs and the results known a priori. Given the specifics of the algorithms and models used in neural networks, the initial training set should be normalized. This procedure minimizes the influence of the different dimensions of the physical quantities included in the training sample. The most important property of an artificial neural network (ANN) is its generalization capability, that is, the ability of the model to map input data to the desired results for the whole set of input data, not only for the training examples.
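The normalization step and the error measure mentioned above can be sketched as follows. Zero-mean, unit-variance scaling is assumed here; the text does not state which normalization the authors applied, and the feature columns and values are hypothetical.

```python
import numpy as np

def fit_scaler(X):
    """Compute per-feature mean and standard deviation on the training set."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0.0] = 1.0  # avoid division by zero for constant features
    return mean, std

def transform(X, mean, std):
    """Rescale features so that their physical units no longer dominate."""
    return (X - mean) / std

def mse(pred, target):
    """Mean squared error between network outputs and known results."""
    return float(np.mean((pred - target) ** 2))

# Hypothetical raw features: barrier width (nm), Al fraction, voltage (V)
X_train = np.array([[2.0, 0.30, 0.1],
                    [3.0, 0.35, 0.5],
                    [4.0, 0.40, 0.9]])

mean, std = fit_scaler(X_train)
X_norm = transform(X_train, mean, std)

# New samples are scaled with the statistics of the training set
X_new = transform(np.array([[3.0, 0.35, 0.5]]), mean, std)
```

Reusing the training-set statistics for unseen samples keeps the evaluation of generalization capability honest, because no information from the test data enters the scaling.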

Development of an artificial neural network structure for the current density prediction
As part of the reliability prediction model for devices based on low-dimensional structures, an important task is to predict electrical parameters and to simulate their ageing-related changes over time. It is necessary to estimate the current density for preset parameters of the structure, taking into account the technological errors that occur during production, and also to evaluate the kinetics of the operating point caused by the degradation processes that occur over time. In this paper, the authors consider a stationary problem, but the proposed solution can also be extended to non-stationary problems.
When designing the neural network, the authors used a feedforward network structure (multilayer perceptron) to predict the current density. The design and technological parameters of the structure (thickness and chemical composition of the heterostructure layers, including the technological errors that occur at the production stage) and the value of the applied voltage were used as input data for simulation. Accordingly, the number of neurons in the input layer is determined by the number of parameters used. In the output layer, one neuron outputs the predicted value of the electrical characteristic of the simulated structure. A network with one hidden layer is used in this work. The number of neurons in the hidden layer is determined empirically by comparing the learning efficiency of the developed networks on test samples. The configuration of a neural network has a significant impact on its generalization capability. Using an overly complex model leads to the problem of overfitting ("overtraining"). The Bayesian regularization algorithm is used to solve this problem. To reduce computational cost, the authors use a modification of the ReLU activation function: to avoid the "dying ReLU" problem, the smooth ReLU variant Softplus [12] is used. The mean squared error is used as the error function.
where J is the Jacobian matrix, P is an additional regularization parameter, H is the error vector, and I is the identity matrix.
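The symbols listed above match those of a Levenberg-Marquardt weight update, which is commonly paired with Bayesian regularization. Assuming that form, the step would read Δw = -(JᵀJ + P·I)⁻¹ JᵀH, with H taken as network output minus target. The sketch below applies such a step to a network with one hidden Softplus layer. The data, layer size, finite-difference Jacobian and the accept/reject rule for P are all illustrative assumptions; the Bayesian re-estimation of the regularization weights is omitted.

```python
import numpy as np

def softplus(z):
    return np.logaddexp(0.0, z)  # numerically stable log(1 + exp(z))

def predict(w, X, n_hid):
    """One hidden Softplus layer, one linear output neuron; w is a flat vector."""
    n_in = X.shape[1]
    i = n_hid * n_in
    W1 = w[:i].reshape(n_hid, n_in)
    b1 = w[i:i + n_hid]
    W2 = w[i + n_hid:i + 2 * n_hid]
    b2 = w[i + 2 * n_hid]
    return softplus(X @ W1.T + b1) @ W2 + b2

def errors(w, X, y, n_hid):
    return predict(w, X, n_hid) - y  # error vector H

def jacobian(w, X, y, n_hid, eps=1e-6):
    """Forward-difference Jacobian of the error vector with respect to w."""
    h0 = errors(w, X, y, n_hid)
    J = np.empty((h0.size, w.size))
    for k in range(w.size):
        w_step = w.copy()
        w_step[k] += eps
        J[:, k] = (errors(w_step, X, y, n_hid) - h0) / eps
    return J

def lm_step(w, X, y, n_hid, P):
    """Assumed update: w - (J^T J + P I)^-1 J^T H."""
    H = errors(w, X, y, n_hid)
    J = jacobian(w, X, y, n_hid)
    return w - np.linalg.solve(J.T @ J + P * np.eye(w.size), J.T @ H)

def loss(w, X, y, n_hid):
    return float(np.mean(errors(w, X, y, n_hid) ** 2))  # mean squared error

# Synthetic stand-in data (assumption): 3 normalized inputs, 1 output
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] * X[:, 2]

n_hid = 5
w = 0.1 * rng.standard_normal(n_hid * 3 + 2 * n_hid + 1)
mse_before = loss(w, X, y, n_hid)

P = 1e-2
for _ in range(50):
    candidate = lm_step(w, X, y, n_hid, P)
    if loss(candidate, X, y, n_hid) < loss(w, X, y, n_hid):
        w, P = candidate, P * 0.5   # step accepted: trust the model more
    else:
        P *= 2.0                    # step rejected: increase damping
mse_after = loss(w, X, y, n_hid)
```

Because a step is accepted only when it lowers the mean squared error, the training error in this sketch cannot increase from one iteration to the next.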

Forming a training sample
To train the developed neural network, it is necessary to form a training sample representing a set of design and technological parameters and the electrical characteristics of the corresponding structures. The ideal option is to use a set of experimental data for this purpose. However, supervised training of a prediction model based on a neural network with nonlinear activation functions requires a significant number of experiments. Manufacturing the required number of low-dimensional semiconductor structures would require significant financial expense, which is not feasible when designing nanoelectronic devices. According to the approach proposed by the authors, the main part of the training set is data obtained through numerical modeling, while experimental data are used to refine and supplement the set and to verify the model. The model used to generate the training sample is based on the Green's function, which describes the response of the quantum system to an external disturbance, and is used to calculate the current density in low-dimensional semiconductor heterostructures. In general, the current density can be determined as follows:

where D(E) is the supply function.

Figure 1 shows a three-dimensional plot, obtained with the ANN-based model, of the dependence of the I-V characteristic and the peak current value on the barrier widths.

Figure 1. Three-dimensional plot of the dependence of the I-V curve and the peak current value on the barrier widths.

As part of the verification, the results obtained with the developed model were compared with those obtained with the model based on the Landauer-Buttiker formalism (Fig. 2 (a)). The final error of current density prediction with the artificial neural network model does not exceed 3% (Fig. 2 (b)).
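The formation of a training sample from numerical modeling, as described in this section, can be sketched as follows. The function `placeholder_solver` is a hypothetical stand-in with an arbitrary analytic shape and is not the Green's function model; the parameter names, grids and the Gaussian model of technological error are likewise assumptions made for illustration.

```python
import itertools
import numpy as np

def placeholder_solver(barrier_nm, well_nm, voltage):
    """Stand-in for the quantum-mechanical model (NOT the NEGF calculation)."""
    peak_voltage = 0.3 + 0.02 * well_nm
    return np.exp(-barrier_nm) * np.exp(-((voltage - peak_voltage) / 0.1) ** 2)

def build_training_set(barriers, wells, voltages, solver, sigma_nm=0.0, seed=0):
    """Sweep the design grid, perturb layer widths, and record solver outputs."""
    rng = np.random.default_rng(seed)
    rows = []
    for b, w, v in itertools.product(barriers, wells, voltages):
        # Technological error modelled as Gaussian scatter of layer widths
        b_real = b + rng.normal(0.0, sigma_nm)
        w_real = w + rng.normal(0.0, sigma_nm)
        rows.append((b_real, w_real, v, solver(b_real, w_real, v)))
    data = np.array(rows)
    return data[:, :3], data[:, 3]  # inputs (widths, voltage), target (current)

X, y = build_training_set(
    barriers=[1.5, 2.0, 2.5],
    wells=[4.0, 4.5],
    voltages=np.linspace(0.0, 1.0, 11),
    solver=placeholder_solver,
    sigma_nm=0.05,
)
```

In practice the `solver` argument would be replaced by the quantum-mechanical model, and experimental points could be appended to `X` and `y` to refine the set.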
A comparative analysis of the time complexity of the computational algorithms based on the ANN and of the models based on the Landauer-Buttiker and NEGF formalisms leads to the conclusion that the model based on the trained neural network is more efficient in multi-iteration calculations. The dependence of the computation time for the test model structure on the step of the discrete coordinate grid is shown in Figure 2 (c).

Figure 2. (a) Three-dimensional graph of current density as a function of the applied voltage and the width of the quantum well; (b) cross-section of the three-dimensional graph of current density at a quantum well width of 8 monolayers of GaAs; (c) dependence of the computation time of the current density for one value of the applied voltage on the step of the discrete coordinate grid.

Conclusion
The authors developed an efficient computational algorithm for modeling the current-voltage characteristic of ultrafast devices based on low-dimensional semiconductor heterostructures. An artificial neural network trained on a quantum-mechanical model based on the formalism of Green's functions was used as a predictor. According to benchmark tests of algorithms for predicting current-voltage characteristics, the algorithm based on the trained artificial neural network increases computational efficiency by more than two orders of magnitude in comparison with other quantum-mechanical models or models based on conditional analytical methods. As part of the validation of the developed computational algorithm, the difference between the model and the initial section of the experimental CVC of a resonant tunnelling diode was estimated; the resulting error does not exceed 5%. The results of verification and validation of the developed computational algorithm for predicting the electrical characteristics of a nanoelectronic device channel based on a low-dimensional heterostructure support the conclusion that its integration into commercial TCAD systems is expedient.