Elemental vacancy diffusion database from high-throughput first-principles calculations for fcc and hcp structures

This work demonstrates how databases of diffusion-related properties can be developed from high-throughput ab initio calculations. The formation and migration energies for vacancies of all adequately stable pure elements in both the face-centered cubic (fcc) and hexagonal close packing (hcp) crystal structures were determined using ab initio calculations. For hcp migration, both the basal plane and z-direction nearest-neighbor vacancy hops were considered. Energy barriers were successfully calculated for 49 elements in the fcc structure and 44 elements in the hcp structure. These data were plotted against various elemental properties in order to discover significant correlations. The calculated data show smooth and continuous trends when plotted against Mendeleev numbers. The vacancy formation energies were plotted against cohesive energies to produce linear trends with regressed slopes of 0.317 and 0.323 for the fcc and hcp structures respectively. This result shows the expected increase in vacancy formation energy with stronger bonding. The slope of approximately 0.3, being well below that predicted by a simple fixed bond strength model, is consistent with a reduction in the vacancy formation energy due to many-body effects and relaxation. Vacancy migration barriers are found to increase nearly linearly with increasing stiffness, consistent with the local expansion required to migrate an atom. A simple semi-empirical expression is created to predict the vacancy migration energy from the lattice constant and bulk modulus for fcc systems, yielding estimates with errors of approximately 30%.


Introduction
Many materials phenomena are affected by atomic diffusion properties, and knowledge of these properties is of great technological importance in applications ranging from structural materials to electrochemical devices. Extensive experimental efforts have been exerted over many decades in order to determine the activation energies for vacancy diffusion, Q, in pure elements and alloys. The vacancy diffusion mechanism is the dominant mechanism of selfdiffusion in monoatomic close-packed crystals [1]. As attempt frequencies are relatively similar, within a simple Arrhenius model, the quantity Q largely determines self-diffusion properties for elements in close-packed structures. The activation enthalpy for vacancy diffusion can be found by summing together the enthalpy of vacancy formation, H vf , and the vacancy migration enthalpy, H vm . Accurate experimental determination of these two quantities can be quite challenging. The large range of vacancy formation enthalpies reported from the different experimental methods (e.g. quenching, positron annihilation, etc) indicates that H vf determination is non-trivial. One complication that is commonly encountered in experimentally determining H vf is the isolation of monovacancy contributions from those of other defects, such as divacancies [2]. H vm can be directly measured in quenching experiments. However, this often leads to less accurate results than simply subtracting the vacancy formation enthalpy from Q found in diffusion experiments [1,3]. Thus, the accuracy of H vm is dependent on methods used to find H vf , making both quantities difficult to determine consistently and accurately. These difficulties partially provide the incentive to use theoretical alternatives to calculate the diffusion properties of materials.
One theoretical approach that has gained popularity in recent decades is the use of ab initio density functional theory (DFT) based quantum codes in order to determine materials properties. DFT is an efficient technique used to solve the fundamental quantum-mechanical equations for complex systems of atoms. DFT has had good success in predicting accurate diffusion properties [4][5][6]. The evolution of fast and robust DFT codes has made it possible to automate computational techniques in which large amounts of data relevant to a specific materials property can be obtained [7]. High-throughput ab initio calculations are being used to study an increasing range of problems, including surface energy determination [8], crystal structure prediction [9], band structure calculation [10], catalyst design [11] and Li-ion battery cathode design [12]. A number of tools to aid in high-throughput computation are also becoming available, including AFLOW [10], the Materials Project 2 and the Atomic Simulation Environment [13]. However, little work has been done to develop databases of basic elemental or alloy diffusion data using high-throughput ab initio methods.
There are a few key advantages to having a large database of diffusion properties formulated from automated ab initio calculations. The consistency of the methods used to obtain the results makes for more reliable relative values. Furthermore, using ab initio techniques allows one to study systems that are theoretical or do not exist in a stable natural form.
Predictions for metastable systems can be useful if these systems are ever synthesized. These predictions are also often necessary when fitting the dependence of diffusion energetics on alloy composition. Finally, having a consistent and consolidated database allows one to establish correlations between diffusion and system properties. These correlations could be used as a materials screening tool that would help facilitate the choice of materials for technological applications. This work will demonstrate the utility of high-throughput calculations for establishing an elemental diffusion database, and this database will be used to explore trends in vacancy formation and migration energetics. While this work is focused on simple pure elements as an initial demonstration, it suggests a path to a more comprehensive effort currently underway that is focused on treating diffusion in alloys with the same high-throughput approach.

Total energy calculations
All calculations were performed using the Vienna Ab Initio Simulation Package (VASP) version 5.2.2, a quantum mechanical code based on plane-wave DFT [14][15][16][17]. The workflows of the calculations were automated using the MAterials Simulation Toolkit (MAST), a code package developed at the University of Wisconsin-Madison and built on the pymatgen toolset [18]. For all of the studied materials, the Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional with the generalized gradient approximation (GGA) for the exchange-correlation energy was used, and the projector augmented wave (PAW) method was implemented to model the electron-ion interactions within each of these systems [19][20][21][22]. All ionic relaxations were performed using a conjugate gradient algorithm. For these, a first-order Methfessel-Paxton smearing method [23] was used with a smearing width of 0.2 eV. For all ground state energy calculations with fixed ions, a tetrahedron method with Blöchl corrections was used. Spinpolarization was neglected for all systems with the exception of Ni, Co and Mn in both facecentered cubic (fcc) and hexagonal close packing (hcp). Because Fe is non-magnetic in the fcc structure [24], only hcp Fe was attempted with spin-polarization (although no moment was found). For this subgroup of element-structure combinations, non-spin-polarized and spinpolarized results are both presented. Note that, in the case of Mg in fcc and Au in hcp, these are the vacancy formation energies in a metastable host structure.

Calculation parameters
The size of the supercell and the density of points in the k-point mesh were held constant for all elements in a given crystal structure. For the fcc systems, a Monkhorst-Pack [25] 4 × 4 × 4 k-point mesh and a 3 × 3 × 3 supercell of the conventional cell consisting of 108 lattice sites were used. For the hcp systems, a Gamma-centered 9 × 9 × 9 k-point mesh and a 3 × 3 × 2 supercell of the conventional cell consisting of 36 lattice sites were used. Table 1 summarizes the results of convergence testing for k-point mesh performed on aluminum in fcc and magnesium in hcp. Convergence with cell size was tested by using a finite-size scaling approach. Following previous literature we plotted the values of interest (H vf and H vm ) as a function of N −1 atoms and looked for trends to enable extrapolation to infinite cell size [26][27][28][29][30]. In general we found very robust fits for a simple linear dependence versus N −1 atoms , consistent with the defect interactions being based on strain. Figure 1 shows the effect of cell size on H vf for Au and Mg in both the fcc and hcp structures. Based on figure 1, the estimated size-effect error for H vf is 20-30 meV for both the fcc and hcp systems. Almost no change in the hcp vacancy formation energy is observed with respect to system size, suggesting that the 3 × 3 × 2 supercell is sufficient for hcp calculations. Figure 2 shows the effect of cell size on H vm for Au and Mg in both the fcc and hcp structures. From figure 2, the estimated size-effect error for H vm is again between 20 and 30 meV for both the fcc and hcp systems. While migration energies were not calculated for the 6 × 6 × 4 hcp supercell due to computational limitations, both basal and c-axis migration energies show little dependence with system size and are sufficiently converged. Plane wave cutoff energies were assigned for each element at 1.5 times the maximum energy cutoff in the PAW potential file (denoted ENMAX) for that element. This energy cutoff gives errors in the reference energy of about 5 meV atom −1 or less. In both the fcc and hcp structures, the same pseudopotential was used for each element. Based on linear extrapolation to 1/N = 0 for these test cases, the size-effect error of the migration energy for both the fcc and hcp systems treated in this paper is between 20 and 30 meV. Note that the migration energies for the 6 × 6 × 4 hcp supercell were not calculated.

Parameters used in saddle point calculations
The Climbing Image Nudged Elastic Band (CI-NEB) [31] method was used to obtain the vacancy migration enthalpy, H vm , of each system. This procedure involves finding the saddle point structure and the associated minimum energy pathway using a chain of images representing intermediate structures that are linearly interpolated from two endpoint structures. Both endpoint structures consisted of cells containing a monovacancy at one site (107 on-lattice atoms +1 vacancy or 35 on-lattice atoms +1 vacancy). In all cases, the vacancies were positioned at nearest-neighbor sites with respect to the migrating atom. For all CI-NEB -based calculations, a 5.0 eV Å −2 spring constant was used to maintain a constant distance between the images. A quasi-Newton algorithm with a force scaling constant of 0.5 was utilized in the relaxation of ions into their respective ground states. Further detail regarding the calculation of migration energies is given in section 2.5.

Calculation of the vacancy formation enthalpy
The vacancy formation enthalpy, H vf , was determined by comparing the total energy of the cell containing the monovacancy with that of the undefected cell. The undefected cell energy is scaled by a fraction in order to conserve the appropriate number of atoms. This calculation is represented in (1): In the above equation, N is the number of lattice sites, H N −1 is the enthalpy of the defected cell containing a single vacancy and H N is the enthalpy of the undefected cell. The calculation of H vf using VASP involved a series of relaxations of both an undefected and a defected cell.
For each element, calculations began with the geometric optimization of the undefected cell, allowing the cell volume and ionic positions to relax to their minimum energy states. Multiple optimizations were performed until the total energy of the undefected cell was converged to within 1 meV cell −1 . Once this was achieved, the electronic degrees of freedom of the system were re-optimized in a fixed-ion calculation of the total energy to give H N in (1). To get H N −1 in (1), a new cell was then formed from the relaxed undefected cell by adding a single vacancy of arbitrary placement. A similar series of calculations were performed using the same energy threshold at a fixed volume equal to that of the relaxed undefected cell. From these resulting energies, H vf was calculated. Note that all calculations are performed at approximately zero pressure so the enthalpy and energy (H vf and E vf ) are taken to be equal in these calculations. Therefore, the energy is calculated and set equal to the enthalpy.

Calculation of the vacancy migration enthalpy
Transition state theory [32] was applied to find H vm for each system. H vm was obtained by comparing the total enthalpy of an endpoint containing a monovacancy with the total enthalpy of the system containing an atom at the transition-state position of a migration event. This is quantified in (2): In the above equation, TS signifies the transition state and IS signifies the initial state of the system. As with formation energies, the vacancy migration enthalpy and energy (H vm and E vm ) are taken as equal, as the calculations are performed at approximately zero pressure. For computational efficiency, an exact construction of the second endpoint was made from the first using symmetry in both the fcc and hcp systems. This method involved translating each atom of the first endpoint by a vector equivalent to a nearest-neighbor hop associated with each crystal system. The first and second endpoint structures were taken and used to linearly interpolate a single transition state image used in the CI-NEB calculation. H N −1,TS in (2) is given by the energy of this intermediate image, while the initial-state enthalpy, H N −1,IS , can be obtained from the total energy of either equivalent endpoints. To ensure that the single-image CI-NEB calculations discovered an energy maximum, the migrating atom of the linearly interpolated middle image was displaced by a fraction of an angstrom in a symmetry-breaking direction. This displacement guaranteed that the climbing image would not be stuck in a local minimum due to the existence of a high-symmetry point. Since all of the atom hops studied in this work are symmetric, this approach will find the maximum of the potential energy surface assuming the migration path has one or two maxima. If the energy surface has three or more maxima then they could be missed by our approach, but no examples of such complex energy surfaces were found in the subset of four element-structure systems checked with additional images.

Calculation of the cohesive energy and bulk modulus
As the formation and migration of vacancies can both be considered bond-breaking processes, it is of interest to see how the energy barriers correlate with the cohesive energy, E C , of each element. Cohesive energies were calculated as the difference in energy per atom between the crystal and the single atom system. The latter reference energy was calculated with a Gammacentered 1 × 1 × 1 k-point mesh in a 12 × 13 × 14 Å cell at an energy cutoff of 600 eV. In the case of Ni, the asymmetric cell used for the atom calculations could not be converged despite extensive efforts, and a cell with 10 × 10 × 10 Å dimensions was used. The cohesive energy of Ni therefore runs a risk of errors associated with inadequate symmetry breaking in the ground state of its atom reference state. All isolated atom calculations were performed with spin-polarization.
The bulk modulus B 0 and its pressure derivative B 0 were obtained by fitting energy-volume data of the undefected cell to the third-order Birch-Murnaghan equation of state (EOS), shown in (3) [33]. The fit was performed using seven energy-volume pairs, in addition to the undefected cell volume V 0 and energy E 0 , with volumes ranging from 80 to 112% of V 0 : Figure 3 shows all of the element-structure combinations for which calculation of diffusion properties was attempted. In the fcc systems, elements with atomic numbers 1-83 as well as 89-91 were attempted. In the hcp systems, elements with atomic numbers 1-83 with the exclusion of 59-71 were attempted. For both structures, exclusions were made on the basis of likelihood of success and estimated computing time. If systems in either structure were unable to relax after the addition of a vacancy, calculations for that system were discontinued.  [2,[35][36][37][38] for both the fcc and hcp systems. Only in-plane hcp migration data are included in this plot. All data presented in both of these plots are for elements in their stable, naturally occurring crystal structures.

Omission of phonon calculations
The attempt frequency of a migrating atom can be determined from ab initio phonon calculations [5]. The calculation of phonons was omitted from this work as they require large amounts of computing resources, but these calculations could easily be implemented in future work. With knowledge of the attempt frequencies and the migration energetics in the elemental systems, the vacancy self-diffusion coefficient, D, can be calculated.

Results and discussion
Tables A.1 and A.2 found in the appendix list all of the data calculated in this work for elements in the fcc and hcp structures respectively. Comparisons to selected experimental data are given in figures 4(a) and (b). In these figures, as throughout figures in the paper, lines are superimposed as a guide to help the reader see trends in the data. For the fcc systems, calculated vacancy formation enthalpies are generally lower than the corresponding experimental data. It has previously been proposed that, due to an intrinsic error in treating surfaces, GGA DFT can underestimate H vf for certain systems [34]. The intrinsic surface error involved can be removed by adding a correction value that is a function of electron density [34], although this correction is not included in this work. Based on the limited data in figure 4(a), this same error is not seen with the hcp systems, as the calculated hcp vacancy formation enthalpies generally overestimate the experimental results. Further work is necessary to understand why the typical errors between fcc and hcp, which are both quite similar close-packed structures, should be so different.
The H vm values calculated in this work generally agree well with the experimental data.
It should be noted that previous ab initio work has identified an anomalous monovacancy migration pathway for Group IV hcp metals with bcc-hcp phase transitions at higher temperatures. For migration in the basal plane, the migrating atom finds a bcc-like atomic environment along the pathway which is an energy local minimum that exists between the two saddle points [39]. It is very possible that the naturally bcc elements treated in this work follow this tendency when placed in the hcp structure. Because only a single image is used in the CI-NEB calculations, it is possible that the single image falls into this local minimum, thereby underestimating the migration barrier.
In figures 5 and 6, the H vf and H vm values respectively are plotted against the elements' Mendeleev numbers. The Mendeleev number is a phenomenological coordinate that captures size, electronegativity, valence and bond orbitals in a single chemical scale in which each element is assigned one unique number [40]. It is important to note in these figures that the legend corresponds to the natural crystal structure of each element at standard temperature and pressure (STP), not to the structure in which the elements were relaxed. These plots show that the vacancy formation and migration energies are generally smooth and continuous when plotted as a function of chemical species. Notable in both figures is the peak formed by the middle transition metals (Re, Os, Tc, Ru, Mn and Fe). Also notable are the troughs formed by the weakly interacting elements with essentially zero defect energetics (He, Ar, Kr, etc).
The results in figures 5 and 6 suggest at least some correlation between H vf and H vm and the bonding strength of the systems. To see these trends more clearly, in figures 7 and 8, the calculated H vf and H vm values are plotted against the cohesive energy of each element. Figures 7(a) and (b) show an overall trend of linearly increasing vacancy formation energy with increasing cohesive energy for both structures. A linear regression was performed on both sets of data. For the fcc data, a fitted slope of 0.317 was found with R 2 = 0.655 (rms error = 0.477 eV). For the hcp data, a slope of 0.323 was found with R 2 = 0.683 (rms error = 0.501 eV). These results are consistent with the fact that a simple fixed bond-strength model is not sufficient in describing the energetics of vacancy formation. Because the remaining bonds nearby are strengthened when the vacancy is formed, and because the lattice can lower its energy by relaxing into the void space formed by the missing atom, it is common for H vf to fall somewhere between 0.2E C and 0.8E C [37]. Figures 8(a) and (b) show a continuous trend of H vm with E C . A fit was performed using data points with cohesive energies less than 6 eV. In this regime, the data appears linear with a slope of 0.191 (rms error = 0.243 eV) for the fcc data and 0.159 (rms error = 0.209 eV) for the hcp data. The effects of the cohesive energy are especially strong when E C > 6 eV. In this range, there appears to be a nonlinear increase of the migration enthalpy with increasing cohesive energy. These trends are consistent with the intuition that the migrating atom's mobility is strongly hindered by an increasingly cohesive lattice. However, the trend of H vm with E C does not appear to be simply linear, and a change in the migration physics may occur for some of the most stable elements. These simple correlations could provide quick screening for H vf and H vm as the cohesive energy is much faster to calculate than either H vf or H vm .
As atomic migration can be considered a bond-breaking phenomenon, it is intuitively reasonable that H vm might scale with cohesive energy. However, the migrating atom must also push its way through a constrained space, and therefore H vm might additionally be expected to scale with elastic properties. This is explored further in figures 9(a) and (b), in which the calculated H vm values are plotted against the calculated bulk moduli of the corresponding elements in the fcc or hcp structures. The plots for both structures show a linearly increasing The legend corresponds to the natural crystal structure of each element at STP, not the structure in which the elements were relaxed. In the case of elements that are gas-phase at room temperature, the legend corresponds to high pressure or low temperature-induced solid phase structures [41]. Gray lines are superimposed as a visual guide for seeing trends. Gaps in the guiding line correspond to portions of the plot that are difficult to interpret due to insufficient or scattered data.   A fit was performed using data points with cohesive energies less than 6 eV. In this regime, the data appear linear with a slope of 0.191 (rms error = 0.243 eV) for the fcc data and 0.159 (rms error = 0.209 eV) for the hcp data. After this linear regime, H vm increases sharply with cohesive energy, showing that a change in the migration physics may occur for some of the most stable elements. A gray dashed line is added as a visual guide to aid in seeing the increase. Using the bulk modulus as the elastic constant and performing a linear fit (R 2 = 0.84, rms error = 0.240 eV) on the H vm data versus a 3 B 0 yields the expression given in (4): This equation applies only to fcc systems due to the relatively more complex symmetries in the hcp structure. Attempts were made to find a similar expression for the H vm of hcp systems by substituting the volume of four hcp atoms for a 3 in (4) and performing a linear fit. However, the resulting equation provided no better representation of H vm than the regression against the bulk modulus itself shown in figure 9(b). The validity of (4) is depicted in figure 10 in which H vm values given by (4) are plotted against the ab initio H vm values obtained in this work.

Conclusion
Using a high-throughput approach, a large elemental diffusion energetics database was created that contains data for both metastable and naturally occurring systems. Having such a database is advantageous in that the methods used to calculate the results are rigorously consistent, accurate and able to provide data for theoretical systems on which it is potentially impossible to perform experiments. The inclusion of data for metastable systems is useful if these systems are ever synthesized or if metastable predictions are required as end-member data inputs for trend fitting software, such as the thermodynamic simulations performed in CALPHAD. By plotting the data against various chemical, physical and mechanical properties, several important controlling factors for diffusion in the elemental systems have been discovered.
These correlations are fundamental to understanding diffusion in various systems. The vacancy formation enthalpy was shown to depend linearly on the cohesive energy, and the vacancy migration enthalpy plotted against cohesive energy showed an initial linear region followed by a rapidly increasing region for the most stable systems. As vacancy formation and migration can be interpreted as bond-breaking phenomena, cohesive energy as a controlling factor makes intuitive sense. The relationship shown by equation (4) demonstrates that there is also a strong correlation between the bulk modulus and the migration enthalpy. As the migrating atom must push through a constrained space, incompressibility as a controlling factor also makes intuitive sense. It is interesting to observe that, in general, the metastable data fit into the same trends as those formed by data from the naturally occurring systems.
The above correlations could all be used as screening tools for more complicated systems, such as alloys, although the validity of the correlations must be demonstrated for alloy systems. Properties such as the cohesive energy and the bulk modulus are less computationally intensive to compute, and therefore knowledge of these properties as controlling factors for migration could be useful in the screening process for specific migration energetics. This work has by no means explored all of the potential ways that the database can be used to discover trends of diffusion energetics with other properties of solids. Additional highthroughput diffusion work will increasingly generate more data and allow new correlations to be discovered.    The instability of these elements in the hcp structure precluded the gathering of certain data. b This slightly negative value signifies a lack of stability for this element-structure combination.