Artificial neural network-based path integral simulations of hydrogen isotope diffusion in palladium

The contribution of nuclear quantum effects (NQEs) to the kinetics and dynamics of interstitial H isotopes in face-centered cubic Pd was intensively investigated using several path-integral techniques, along with a newly developed machine-learning interatomic potential based on artificial neural networks for Pd–H alloys. The diffusion coefficients (D) of protium, deuterium, and tritium in Pd were predicted over a wide temperature range (50–1500 K) based on quantum transition-state theory (QTST) combined with path-integral molecular-dynamics simulations. The importance of NQEs even at high temperatures was illustrated in terms of the characteristic temperature dependence of the activation free energies for H-isotope migration in Pd. This illuminates the overall picture of anomalous D crossovers among the three H isotopes in Pd. In addition, the D of protium in Pd was directly computed using two approximate quantum-dynamics methods based on Feynman’s path-integral theory, i.e. centroid molecular dynamics (CMD) and ring-polymer molecular dynamics (RPMD), in the temperature range 370–1500 K. The D values obtained from the CMD and RPMD simulations were very similar and agreed better with the reported experimental values than the QTST results in this temperature range. Our machine learning-based path-integral calculations elucidate the underlying quantum nature of the ‘reversed S’-type nonlinear behavior of D for the three H isotopes in Pd on the Arrhenius plots.


Introduction
Pd is a unique material with a strong affinity to H owing to both its catalytic and H-absorbing properties [1]. This is characterized by the fact that Pd can absorb large amounts of H to form a hydride (PdH x where x is the atomic ratio of H to Pd) under ambient conditions [2,3]. This property enables various important applications, such as H storage for clean portable energy, catalytic converters, and radioactive H-isotope adsorption (i.e. tritium recovery) [1,4,5]. Pd-based membranes are expected to be used in H separation and purification processes because of their relatively high H permeability and the ease of dissociation, adsorption, recombination, and desorption of H molecules on the surface. In general, alloying of Pd is necessary in these membranes to reduce the amount of expensive Pd used while maintaining or even improving the membrane properties [6][7][8]. From the viewpoint of alloy design, it is essential to evaluate quantitatively and in detail the rate-limiting step of the H-permeation process in Pd alloys, and to characterize and control the net H permeation depending on the amount of alloying elements. For this purpose, it is indispensable to have a theoretical approach that can efficiently evaluate and explore the dominant kinetic processes of H in various Pd alloys.
The behavior of lightweight particles such as H-isotope nuclei in materials is significantly influenced and even dominated by nuclear quantum effects (NQEs) [9]. In this context, an atomistic approach based on Feynman's path-integral theory [10] can be quite effective in the precise description of material-H systems

Construction of an ANN-based MLP
We used the Atomic Energy Network (aenet) package [26,39] for the construction and application of atomic-interaction potentials based on ANNs of the Behler-Parrinello type [40]. The Chebyshev [39] and Behler-Parrinello [40] descriptors for atomic environments are implemented in aenet. Here, we adopted the Chebyshev descriptor because of its fewer parameters, ease of setup, and faster evaluation time. It also has advantages when extending to multi-element systems to account for the effects of alloying elements. In the Chebyshev descriptor (basis set), the radial distribution function (RDF) and angular distribution function (ADF) are expanded in a basis of Chebyshev polynomials. The expansion orders (i.e. the number of polynomials minus one) for the RDF and ADF were 16 and 4, respectively, for all the atomic elements. The cutoff radius both for the RDF and ADF was chosen to be 5.5 Å. The ANN-based MLPs were trained using the bounded limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS-B) algorithm [41] for 30 000 iterations. Note that ANNs were fitted only to the DFT-calculated energies because aenet has not yet implemented a force fitting function apart from that with respect to numerical forces as energy deviation in random directions [42].
We determined the ANN architecture from training and test datasets in an ad-hoc way without using a validation dataset; we fixed the number of hidden layers to two, varied the number of nodes to 15, 18, and 20, set a tanh-type or twisted-tanh-type activation function, and picked those with good mean absolute errors (MAEs) and root-mean-squared errors (RMSEs) of energies from ten training cases each. As a result, the chosen MLP that exhibited the best MAE and RMSE values used two hidden layers with 15 nodes each and a twisted-tanh-type activation function. From each of these base structures, we generated deformed structures by changing the cell vectors of the system and displaced structures either by displacing atoms randomly or from snapshots of molecular dynamics (MD) calculations at finite temperatures. Such deformed and/or displaced structures are necessary when training the ANNs; otherwise, the ANNs may provide unphysically lower energies for deformed/displaced structures relative to the true low-energy structures. The training process was repeated several times in some cases, adding the snapshots of MD calculations using the tentative MLP to the training dataset. In total, we generated 27 097 structures and divided them randomly into a set used for training and a test set used for monitoring the convergence.

Generation of training data
To evaluate the total energies for the structures listed above, DFT calculations were carried out using the Vienna Ab initio Simulation Package [45]. The projector-augmented wave method [46,47] was used to describe the interaction between ions and electrons. The valence electron configurations considered in this study included 4d 9 5s 1 for Pd and 1s 1 for H. We used the generalized gradient approximation in the scheme of Perdew-Burke-Ernzerhof [48] to describe the exchange-correlation functional. Calculations were performed with a plane-wave energy cutoff of 300 eV using a Monkhorst-Pack k-point mesh [49] for integration over the Brillouin zone. In the previous DFT study for the Pd-H system [22], a convergence check of the cutoff energy showed that 300 eV was sufficient to evaluate the total energies. The MEPs and saddle-point (S) sites for H migration between the different interstitial sites were obtained using the nudged-elastic-band method [50] with a constant cell shape. The vibration analysis confirmed that the obtained image near the barrier top, which corresponds to the saddle point, has only one imaginary mode. Spin polarization (SP) was not considered in the calculations since the DFT results with SP were not very different than those without in the case of H migration in Pd. For example, the migration barrier of H for the Pd 32 H 1 system (i.e. 2a × 2a × 2a supercell, where a is the lattice constant of the fcc unit cell) with and without SP was 0.155 eV and 0.162 eV, respectively, and the difference between the energies of the H atom at the T and O sites with and without SP was 0.044 and 0.053 eV, respectively.

PIMD-based free-energy profile for H-isotope diffusion
The PIMD approach [32] provides an effective and versatile route to computing the static properties of quantum systems at finite temperatures, in which the partition function of a quantum particle is identical to that of a classical, closed ring polymer composed of P replicas of the original particle (or beads) connected with harmonic springs. The spring constant of the ring polymers is m I P/(βℏ) 2 , where P represents the number of beads in each ring polymer; m I , the mass of the Ith quantum particle in the N-particle system; ℏ, the Planck constant; β, the inverse temperature (k B T) −1 ; k B , the Boltzmann constant; and T, the absolute temperature. All equilibrium distances of the springs are equal to zero. The isomorphism becomes exact in the limit P → ∞. To evaluate the path-integral-based reaction free energy [11], we calculated the reversible work of moving the centroids of the ring polymers along the reaction coordinate [12,13,[51][52][53]. Each of the images obtained along the MEP was assumed to correspond to a centroid configuration r cent = {r cent I } (1 ⩽ I ⩽ N) of the ring polymers in the N-particle system. The MLP-based PIMD calculations were carried out for each individual centroid configuration for 5 × 10 5 steps with an integration time step of 0.1 fs, using 16 and 32 beads for temperatures above and below 150 K, respectively. The path-integral average of the centroid force f cent was calculated at 14 temperatures from 50 to 1500 K in a statistical ensemble with fixed centroid positions. The PIMD-based free-energy profiles were obtained by numerically integrating the centroid mean force along the H-isotope migration pathway q connecting states X and Y [22,23], i.e.
where q X and q Y represent the centroid configurations in which the H atom is located either at an O, T, or S site (X = O or T; Y = S), and δ is Dirac's delta function. The bracket denotes the ensemble average of the quantity it encloses. The differences between the free energies of the system F Y−X for H at different interstitial sites were calculated as F S−O ≡ ∆F(q S , q O ) and F S−T ≡ ∆F(q S , q T ).

CMD and RPMD
CMD and RPMD are approximate quantum-dynamics techniques in which the Kubo-transformed correlation function is derived approximately from the dynamical trajectories on the effective potential in the path integral. Although the CMD and RPMD methods are conceptually distinct, they differ from each other and from the PIMD method in how the fictitious mass and the thermostats for temperature control are set. In the CMD and RPMD approaches, dynamical rearrangements of Pd atoms during the process of H diffusion are automatically taken into account. Thus, the results obtained include the thermal lattice vibration (phonon) effects on H diffusion in the Pd lattice. For the CMD and RPMD calculations in this study, all atoms were treated quantum-mechanically by discretizing the path into P = 16 imaginary time slices.
In the CMD calculations, the mean square displacements (MSDs) of the centroids of H atoms in Pd were determined at 450, 600, 750, 900, 1050, and 1200 K. To carry out the CMD calculations, massive Nosé-Hoover chains [54] were attached to normal-mode coordinates [33][34][35], while a single Nosé-Hoover thermostat [55,56] was attached to the centroid coordinates. We considered a 3a × 3a × 3a Pd 108 H supercell under periodic boundary conditions, in which one H atom was initially placed at one of the O sites in the fcc structure with a = 3.942 Å. The integration time step was ∆τ = 0.1 fs. In the CMD calculations, the multiple-time-step method [57] was employed for time integration; the normal-mode and centroid coordinates were integrated at intervals of 1 20 ∆τ and ∆τ , respectively. To decouple the fictitious motion of the noncentroid modes from the centroid's physical motion, the masses of the noncentroid degrees of freedom were artificially reduced by a factor of 10. The centroids evolved with time according to an instantaneous force; the evolution was based on an on-the-fly scheme [33][34][35]. The diffusion coefficients at each temperature were evaluated using Einstein's relation from the MSDs within the canonical ensemble for a 1.2 ns run.
In the RPMD calculations, the MSDs of H atoms in Pd were determined at 370, 450, 600, 750, 900, 1050, 1200, 1350, and 1500 K. Since the RPMD method does not use thermostats, one can employ a larger time step for RPMD calculations than for CMD calculations. For each temperature, we obtained ten independent RPMD trajectories using the Pd 108 H supercell with different initial positions and velocities of atoms (beads) with a time increment of ∆τ = 0.25 fs within the microcanonical ensemble; each trajectory was 250 ps long and generated after a unique 1 ps PIMD run within the canonical ensemble at each target temperature. The diffusion coefficients and their numerical errors were estimated in a statistical framework proposed by Bullerjahn et al [58].
In this study, the path-integral simulations were carried out using the Parallel Molecular Dynamics Stencil [59] and PIMD [60,61], which both support the interface to aenet [26,39].

Potential-energy prediction through machine learning
The MLP for Pd-H alloys was created using 27 097 structures generated by DFT calculations for the Pd-H system; 90% of these structures were used for the training dataset of the MLP, and the remaining 10% were used for the test. The ANNs of the MLP were trained for 30 000 epochs using the L-BFGS-B method [41]. The scatter plots of the potential energies obtained from the DFT and MLP calculations are shown in figure 1(a). A good linear relation was observed between the MLP and DFT results for the energies. The resulting MAEs and RMSEs of energies for the training dataset (the test dataset) were 0.5 meV/atom (0.6 meV/atom) and 1.0 meV/atom (1.2 meV/atom), respectively. Note that these errors monotonically decreased during the iteration process, and furthermore, the errors for the test dataset were kept sufficiently small compared to those for the training dataset. This indicates that the ANNs do not overfit to the training data. The scatter plots of the atomic forces obtained from the DFT and MLP calculations are shown in figure 1(b). Although inferior to the results for the energies, a decent linear relationship between the DFTand MLP-calculated forces was observed. Table 1 lists the formation energies of the various Pd-H configurations obtained from the MLP, along with the DFT-calculated counterparts. We found that the results from the MLP agreed reasonably well with those from the DFT calculations. The ANN-based MLP had the ability to describe the Pd-H system not only at low H concentrations (i.e. H solid solution in Pd) but also at high H concentrations (i.e. Pd hydrides). Interestingly, the SAVs in Pd hydrides [43,44], which are vacancies of Pd atoms formed in Pd hydrides at concentrations as large as 25 at.%, were also described using the MLP in an energetically consistent manner. Fundamental properties related to H dissolution, H diffusion, H trapping at a vacancy, and hydride formation in Pd were evaluated using DFT and the MLP (table 2). For comparison, the results obtained using an embedded-atom-method (EAM) potential developed by Zhou et al [62] for Pd-H alloys are also included in table 2. Although the binding energies of H to a vacancy tend to be overestimated when the number of trapped H atoms is small, the MLP shows good transferability and can reproduce the energetics of the Pd-H systems. We note that, unlike the present MLP, the EAM potential by Zhou et al does not adequately describe SAV formation in Pd hydrides.  It should be also noted that the MLP underestimates the dissolution energy of the H atom to Pd (at the O site) by approximately 0.05 eV because of the significant underestimation of the bond energy of the H 2 molecule, as in table 1. In addition, the average binding energies for H atoms at a vacancy tended to be underestimated (by approximately 0.04 eV/H atom, as in table 2) as the number of H atoms increases. The reason why the formation energies of Pd 107 Vac 1 H 5 and Pd 107 Vac 1 H 6 were positive, whereas DFT indicated they are energetically favorable, is that the above errors were multiplied by the number of H atoms when calculating the formation energy of the H-vacancy complex. Therefore, problems may arise when describing phenomena involving the dissociation process of H 2 molecules using the present MLP. The description of H atoms 'dissolved' in Pd does, however, remain accurate. Figure 2 shows a comparison between the MEPs for H migration in Pd obtained using the DFT and MLP calculations within the classical regime (i.e. without NQEs). For comparison, the result obtained using the EAM potential by Zhou et al [62] is also included in figure 2. Note that the MLP described the MEP for H migration with accuracy similar to that of DFT. On the contrary, while the overall shape of the EAM-calculated MEP was consistent with those obtained from the DFT and MLP calculations, the EAM potential did not adequately describe the details of the potential-energy profile. It should be noted that the MLP-calculated MEP for Pd 108 H was closer to the DFT-calculated one for Pd 32 H than that for Pd 108 H. This may be attributed to the fact that many of the DFT data used for training of the MLP were derived from the base 2a × 2a × 2a Pd 32 supercell (see appendix). Table 2. Calculated values of the bond length of a H2 molecule (dH 2 ), the lattice constant of the face-centered cubic (fcc) unit cell of pure Pd (a0), the formation energy of a vacancy (E f Vac ) in fcc Pd, change in volume for the formation of a vacancy (∆VVac) in a 3a × 3a × 3a fcc Pd supercell (i.e. at 0.93 at.    decreases in the low temperature range (<150 K). It is noted that the F S−O values were increased for all H isotopes, owing to the NQEs in the temperature range 50-1200 K, so that the NQEs impede the diffusion of H isotopes. This is consistent with the ab initio PIMD results reported for H in fcc metals, depending on the preferential H location at the interstitial sites [21][22][23]. Further, the F T−O value increased with decreasing temperature. The migration barriers (F S−O ) of H isotopes in Pd are plotted as functions of temperature from 50 to 1500 K in figure 3(d). The F S−O values of the three H isotopes increase, reach their peaks, and then decrease as the temperature decreases, keeping values greater than the classical limit of 0.16 eV. Moreover, F S−O increases overall as the mass of H isotopes decreases in the temperature range above 75 K. The characteristic decrease in the migration barrier at low temperatures can be ascribed to the onset of quantum tunneling, which is observed at temperatures around 150 K, 100 K, and 75 K for protium, deuterium, and tritium, respectively, as shown in figure 3(d). It should be noted that the MLP-based PIMD results show quite good agreement with the DFT-based PIMD results reported for protium and tritium [22] over this wide temperature range, but not at 60 K.

H-isotope diffusivities from quantum transition-state theory
The isotropic diffusivity for a network of interstitial O and T sites in the fcc lattice is defined as [22,23,63]: where Γ and ρ denote the jump rate and equilibrium site probability, respectively. Note that the correlation effects in atomic diffusion are not taken into account in this model. The jump rate at which an H-isotope  [66], and a line terminated by inverted triangles (T) [67]. Note that the experimental data obtained for H are essentially those for protium.
atom initially located at the O site (or T site) escapes to the neighboring T site (or O site) is described by transition state theory (TST) as where dq represents a line element along the pathway, i.e. dq = |dq|. F O (q) and F T (q) denote the one-dimensional free-energy profiles as functions of q, where the zero energy is assumed to be at the bottom of the well corresponding to the O site (q O ) and T site (q T ), respectively. The integral in equation (3) runs between the tops of the two barriers (i.e. neighboring S sites) across the well; v * is the forward flux at the top of the barrier, i.e. (2πβm) − 1 2 , with m being the mass of the H-isotope atom. The equilibrium site probabilities are expressed using the following rates: such that the normalization ρ O + 2ρ T = 1 holds.
The D values of H isotopes in fcc Pd were systematically calculated based on quantum TST (i.e. using the PIMD values of Γ O→T and Γ T→O ). Figure 4 shows the Arrhenius plots of D for protium, deuterium, and tritium in the temperature range 50-1500 K. For comparison, the experimental D values [64][65][66][67] are plotted in figure 4. In the high-temperature region, the D values for H isotopes approached the classical TST results (the almost straight lines in figure 4) obtained from equation (3) when excluding NQEs. It should be noted that the quantum-TST-based D curves of all H isotopes exhibited reversed-S shapes on Arrhenius plots; similar trends have been reported for H-isotope diffusion in fcc metals [22,23]. Moreover, the quantum TST results were highly consistent with the experimental observation that the D values of protium, deuterium, and tritium are in reverse order of their masses in the intermediate-temperature region (130-470 K) [64,65,67]. On the other hand, in the temperature region 200-1200 K, the D values obtained from TST tended to be overestimated compared to their experimental counterparts. This may be due to the omission of recrossing in the TST model.
In the experiment reported by Higelin et al [66], D became larger at temperatures below 75 K when the mass of H isotopes decreased. On the contrary, the quantum-TST-based D values, which were overall in good agreement with experimental values [64][65][66][67], did not clearly follow this trend. It is noted that those obtained from the DFT-based PIMD calculations in the previous study [22] agreed better with experimental values than the present MLP-based results at low temperatures (below 75 K). This implies that the discrepancy between MLP-based and DFT-based PIMD calculations tends to increase at lower temperatures. Such a discrepancy at temperatures below 75 K suggests that the accuracy of the present MLP may be insufficient when applied to low-temperature regions, where the delocalization of H-isotope atoms in the lattice is pronounced and the anharmonicity of the potential-energy surface for H isotopes becomes significant. In particular, the error in energy of the MLP with respect to DFT is about 1 meV/atom (equivalent to 10 K); problems may arise when the magnitude of the error is of the same order of magnitude as the thermal energy k B T. Although this is only a problem at low temperatures, this suggests that more descriptors are needed to more accurately match the energies. Further, a higher level of machine learning (see, for example, [68]) may be required for the potential-energy surface for H migration when targeting low-temperature regions where deep tunneling of H-isotope atoms may occur. The descriptors we employed cover only short-range interactions. Since atomic charges are small in the metal-hydrogen system, this is considered a reasonable approximation. Nevertheless, the accuracy might be improved if long-range electrostatic interactions were included (as in [68]). This will be the subject of future work.

H diffusivities based on CMD and RPMD
To examine the effects of thermal lattice vibrations on the H diffusivity at finite temperatures, we calculated the D of protium in Pd by the CMD and RPMD methods at various temperatures between 370 K and 1500 K, a range in which sufficient statistical sampling for the jump processes of H atoms could be obtained. We also calculated D from classical (P = 1) MD simulations; the corresponding results are plotted in figure 4(b). It was found that both CMD and RPMD results were consistent with each other in this temperature range. Note that the classical MD results showed much larger D values than the CMD and RPMD results; this was expected, because of the lack of NQEs. The quantum CMD/RPMD results approached the classical MD results as temperature was increased.
The Arrhenius plot of protium diffusivity, which exhibits a concave-down curve in the quantum TST results in this temperature range, is almost a straight line with respect to the inverse temperature in the CMD and RPMD results, as in figure 4. It is noted that the apparent activation barrier for H diffusion derived from the CMD/RPMD calculations (0.23 eV) is in quite good agreement with that from the experiments (0.23-0.24 eV) [64,65,69], while the corresponding value from the classical MD calculations is 0.17 eV. Furthermore, the D values obtained from the CMD and RPMD calculations are closer to the experimental values than those obtained from quantum TST. This is because the TST model assumes that there is no recrossing of the dividing surface. The effects of thermal lattice vibrations in CMD and RPMD calculations appropriately and reasonably reduce the transmission coefficient of the migration barrier compared with that in the TST model. Consequently, the CMD/RPMD approach provides a good description of H diffusion behavior in Pd at temperatures above 370 K. The results suggest that the coupling of quantum and thermal fluctuations plays an important role in the H migration process in Pd over a wide temperature range.

Conclusions
Using path-integral simulations along with an ANN-based MLP, we elucidated a comprehensive picture of the temperature-dependent diffusivities of H isotopes in fcc Pd, which had been only partially understood before. We successfully evaluated the activation free energies for H-isotope migration and the diffusion coefficients of interstitial H isotopes in Pd. In this paper, the D values for the three H isotopes in Pd are presented in detail over a wide temperature range for the first time. This was made possible by developing an MLP for Pd-H alloys with near-DFT accuracy, which allowed systematic PIMD calculations to be performed at a realistic computational cost. Furthermore, with the help of the MLP, the dynamics of protium in Pd were quantitatively analyzed by the CMD/RPMD calculations at an ab initio DFT level, while considering both quantum and thermal fluctuations. The results obtained reaffirm the importance of NQEs of H isotopes in fcc metals, not only at low temperatures but also at high temperatures.
Interpolation of atomic potential-energy surfaces by machine learning, which allows nearly automatic construction of highly accurate potentials, is an important technique for making full use of advanced molecular simulations that require many copies of the system (replicas) and/or long sampling times. As demonstrated in this study, path-integral approaches combined with MLPs are essential for the study of H diffusion in materials, allowing the exploration of previously unstudied interactions between materials and H in complex systems.

Data availability statement
The data generated and/or analyzed during the current study are not publicly available for legal/ethical reasons but are available from the corresponding author on reasonable request.