Simulating lattice thermal conductivity in semiconducting materials using high-dimensional neural network potential

We demonstrate that a high-dimensional neural network potential (HDNNP) can predict the lattice thermal conductivity of semiconducting materials with an accuracy comparable to that of density functional theory (DFT) calculation. After a training procedure based on the force, the root mean square error between the forces predicted by the HDNNP and DFT is less than 40 meV/{\AA}. As typical examples, we present the results for Si and GaN bulk crystals. The deviation from the thermal conductivity calculated using DFT is within 1% at 200 to 500 K for Si and within 5.4% at 200 to 1000 K for GaN.

Heat generation in semiconducting materials has become a critical problem in modern nanoscale electronics. As the electric device size decreases, the power density and device temperature increase. This becomes one of the major factors contributing to degradation of the device performance and reliability 1 . To design semiconductor materials with better thermal manageability, efficient methods for theoretical simulation of the thermal conductivity are highly demanded.
The main carrier of heat in semiconductors is the phonon, which is the quantum of lattice vibration. Current methods of simulating the lattice thermal conductivity can be classified into three categories 2,3 : 1) anharmonic lattice dynamics (ALD) in combination with phonon transport calculation using the Boltzmann transport equation (BTE) and Fourier's law 4-8 , 2) equilibrium molecular dynamics (EMD) using the Green-Kubo formula [9][10][11] , and 3) direct evaluation of the heat flux by nonequilibrium molecular dynamics (NEMD) 12,13 . In these theoretical frameworks, accurate prediction of the interatomic force in the solid is essential.
Density functional theory (DFT) calculation is one of the most well-established techniques for accurate force prediction, including the effect of the electronic state change with atomic displacement. However, the high computational cost limits the application of DFT calculation in thermal conductivity simulations. Although a combination of DFT calculations with the ALD 7 or EMD 14 approach has been successful in accurate prediction of the lattice thermal conductivities of semiconductor crystals, the application of this technique to systems with more complex structures such as defective or disordered ones is not realistic because of the rapid increase in computational cost with increasing system size. Regarding NEMD, direct combination with DFT calculation is almost impossible, because the system size used in NEMD simulations must be much larger to reproduce a reasonable temperature gradient 13 .
The computational time can be reduced by using the empirical potential, but the accuracy is insufficient compared to that of DFT-based calculation 8 . An alternative simulation technique that can resolve the trade-off between the accuracy of the force prediction and the computational cost is urgently needed.
Machine learning techniques are a promising approach 15,16 , and applications to solid-state physics have been rapidly developed in recent years. As shown in early works by Behler et al. 17,18 and subsequent studies [19][20][21][22][23][24][25] , the high-dimensional neural network potential (HDNNP) can describe the relationship between the total energy of a system and its atomic arrangement. It is naturally expected that the force acting on atoms can also be described by the HDNNP, because the derivative of the total energy with respect to the atomic displacement gives the force. Force prediction in semiconducting materials using an HDNNP or other machine learning techniques has already been reported in several studies 26,27 .
However, the accuracy of the prediction is limited to on the order of 100 meV/Å, which is much larger than the DFT accuracy (a few tens of meV/Å) required to simulate the thermal conductivity. Here, we show that much higher accuracy can be obtained by training HDNNP parameters with a focus on force fitting. We chose crystalline Si and GaN as representative semiconducting materials with one and two atom types, respectively. In both systems, we obtained an HDNNP that can predict the force with the accuracy of DFT. The obtained phonon frequency and thermal conductivity are within 5.4% of those calculated by DFT.
In this study, we adopted the HDNNP model developed by Behler et al 16,17 . In this model, the total energy of the system ( tot ) is expressed as the sum of the energy contributions from each atom ( ), i.e., tot = ∑ i . Here we neglect the effect of the long-range electrostatic potential, and is determined by the local atomic environment, which is described by the cutoff and symmetry functions. The cutoff function determines the sphere of the local environment, and the symmetry functions represent the radial and angular distributions of neighboring atoms. Here, we use the following cutoff function: where is the cutoff distance, and is the interatomic distance. For the symmetry functions, we use the following three types of functional forms: (4) Here, atom represents the number of atoms inside the cutoff sphere. The hyperparameters c , s , , , and , which should be set before the HDNNP is trained, must be tuned for better prediction performance. The values of the hyperparameters used in this work are summarized in the supplementary materials.
As a simple example, let us consider a neural network (NN) consisting of a single hidden layer with n nodes and s input nodes associated with the symmetry functions. The atomic energy and the force with respect to the atomic displacement along the Cartesian coordinate i ν ( = , , ) output by the NN ( and , respectively) are given by the expressions where 0 , ≠0 , and are the bias, weight, and activation functions in the k-th layer, respectively. represents the total number of atoms in the system. These formulas for the atomic energy and force can be extended to the deeper NN and subnet structure in the HDNNP.
The above HDNNP model was implemented in our homemade code, which is publicly accessible via a Github repository 28 . Training of the weights and biases in the NN by backpropagation and the differentiation required for the force calculation were implemented using the Chainer 29,30 library. The loss function for the training procedure is defined as where RMSE({}) is the sum of the root mean square error between the HDNNP prediction and DFT training data. Note that the units of the RMSE of the energy and force are eV/atom and meV/Å, respectively. Here, the loss function L is evaluated using the unitless RMSE values.
The training data sets for Si and GaN were generated from a combination of a classical molecular dynamics (MD) simulation using LAMMPS 31,32 and DFT calculation using the Vienna Ab initio Simulation Package (VASP) [33][34][35][36] . A crucial point for good training of an HDNNP is the randomness of the atomic configurations in the training data, which requires a long MD simulation and sparse sampling of the MD trajectories. Because of this sparseness, it is more efficient to generate structures using classical MD with a well-established interatomic potential and then evaluate the energies and forces of structures sampled from the MD trajectories by DFT calculation than to perform the entire procedure using DFT. We first conducted a classical MD simulation using the Stillinger-Weber potential 37 40 , we also included the data obtained using slightly different lattice constants. The total numbers of the atomic configurations in the DFT data sets were 2800 and 3000 for Si and GaN, respectively.
In the training procedure using the above data set, we used an NN topology with two hidden layers. The number of nodes in each layer was set to 500, and the hyperbolic tangent was adopted as the activation function. Adam (Adaptive Moment Estimation) was chosen as the optimization algorithm 41 .
In Fig. 1, we compare the forces in the Si and GaN systems predicted by DFT and the HDNNP.
During training, 90% of the DFT data sets were used as training data, and the remaining 10% were used as validation data. We set the parameter = 0.99 in the loss function in Eq. (7) so that the RMSE of the force is dominant in the training procedure. The final RMSE of the force prediction in the validation data is 25.5 meV/Å for Si and 37.8 meV/Å for GaN. Even at the very low weight of the loss function, the final RMSE of the total energy prediction for the validation data reaches 32.7 meV/atom for Si and 66.5 meV/atom for GaN (see the supplementary materials). We found that the lower value of increases the RMSE of the force prediction, which indicates that focusing on the force itself during training is important to obtain more accurate force prediction.
Next, to check that the force accuracy is sufficient for simulation of the phonon-related thermal properties, we evaluated the phonon dispersions in Si and GaN crystals by a combination of the HDNNP and phonopy 42 . We used HDNNP to predict the forces for the irreducible displacement patterns given by phonopy. For comparison, we also performed a phonon dispersion calculation using a combination of VASP and phonopy.
The phonon dispersion curves obtained using HDNNP are in good agreement with the DFT calculation results and previous reports 43, 44 for both Si and GaN, as shown in Fig. 2. Note that we focus on the comparison between the HDNNP and DFT results, and we did not include the non-analytic correction for LO-TO splitting 45 . Thus, the frequencies of the optical modes in GaN differ from the experimental values.
Then, we simulated the lattice thermal conductivity based on ALD by combining the HDNNP and phono3py package 4 , using a procedure similar to that for the phonon dispersion calculation. The irreducible displacements for evaluating the third-order potential for three phonon processes were obtained using phono3py, and the forces acting on atoms in each displacement pattern were predicted using the HDNNP. The phonon-phonon interaction strength and the corresponding lifetime were extracted from these force data; then, the lattice thermal conductivity ( ) was calculated using the single-mode relaxation time approximation of the linearized phonon BTE. Figure 3 compares the temperature dependence of the thermal conductivity obtained from the force predictions of the HDNNP and VASP calculations. We note that the thermal conductivity in GaN is underestimated compared to a previous report 46 .

Values of hyperparameters used in this work
The hyperparameters of radial and angular symmetry functions used in Si and GaN simulations are listed in Table S1 and Table S2, respectively.