Quick-start guide for first-principles modelling of point defects in crystalline materials

Point defects in crystalline materials often participate in electronic (carrier generation and recombination) and optical (absorption and emission) processes. First-principles calculations of defects based on density functional theory have been widely used to complement and even validate experimental observations. In this 'quick-start guide', we discuss the best practice in how to calculate the formation energy of point defects in crystalline materials and analysis techniques appropriate to probe changes in structure and properties.

'The perfect crystal is one of the idealizations commonly found in theoretical physics and science fiction'. 1 Defects are present in all crystals -they can be detrimental to device performance or can be beneficial for use in a wide variety of processes, including: • optical, e.g. the colour centre V Br in CsBr; • chemical, e.g. the catalytic centre Li Mg in MgO; • mechanical, e.g. hardening of Fe using C; • electrical, e.g doping by V O in ZnO.
Computational techniques allow us to investigate properties of point defects at a level of detail that is often difficult to access via experiments. It is possible to isolate the behaviour of particular defects and predict their spectral signatures and physical effects. One challenge is that much of the infrastructure for materials modelling is built on translational symmetry (e.g. Bloch wave functions). Defects break the periodicity of a crystal, and their accurate description is a continuing endeavour for materials modelling. Following our quick-start guide on interfaces, 2 this is a primer for researchers starting to work on first-principles defect simulations.

DEFECT NOTATION
A defect is often referred to according to its spectroscopic signature. An anion vacancy with a trapped electron in an ionic crystal may absorb light in the visible range, making the transparent host material colourful; an F center (Farbe means colour in German). In electrical measurements, such as deep level transient spectroscopy (DLTS), defects are labeled in order of their energy levels. For example, D1 and D2 for electron donors and A1 and A2 for electron acceptors, as shown for the case of the a) Electronic mail: a.walsh@imperial.ac.uk FIG. 1. Intrinsic defect levels in CuInSe2 with respect to the valence (VBM) and conduction (CBM) bands that have been measured and calculated from first-principles. The height of the histogram columns on the right side represents the spread in experimental data. Reproduced with permission from Ref.

3.
chalcopyrite semiconductor CuInSe 2 in Fig. 1. Matching atomic models with spectroscopic signals is of considerable importance in physics and chemistry of materials.
A widely used notation for the atomic defect models is Kröger-Vink. 5 A defect is represented by X Y , where X is the species occupying the atomic site Y. A vacancy and an interstitial site are denoted by V and i, respectively. The relative charge of a defect can be represented by a superscript of × , • and for a neutral, positive and negative charges, respectively. A numerical notation 6 of the charge state (q) is popular in the recent literature. For example, the negatively-charged B-on-Si is represented

by B -1
Si , and the neutral Si self-interstitial is represented by Si 0 i . When more than one symmetrically inequivalent site exists, an additional subscript of the Wyckoff position or the point group symmetry of the site can be added to distinguish distinct species.

EQUILIBRIUM CONCENTRATIONS
The equilibrium concentration of defects (n d ) at a fixed temperature and pressure is given by the density that minimises the free energy: where N site and g denote the number of available sites of the defect in the unit volume and the degeneracy of the defect, respectively. ∆G f is the Gibbs free energy of formation of the defect, k B is the Boltzmann constant, and T is temperature. This can be further decomposed into contributions from enthalpy (H) and vibrational entropy (S) as follows: The enthalpy change dominates under standard conditions and is easier to compute, therefore vibrational terms are often neglected. The formation energy of a defect is given as: where ∆E is a change in the total energy due to the formation of the defect. The second term takes into account the energy cost to exchange n i atoms of kind i with their reservoir chemical potential µ i (e.g. this term for X q Y is µ X − µ Y ). q and E F denote the charge state of the defect and the Fermi level, respectively. The correction term (E corr ) will be discussed in the next section. These terms are illustrated in Fig. 2. The typical magnitudes of first, second, and third terms in Equation 3 are a few eV, while the value of the forth term is usually < 1 eV. Since a macroscopic crystal should be charge neutral overall, the concentrations of electrons n 0 , holes p 0 , positively-charged donors n qi Di , and negatively-charged acceptors n qj Aj must satisfy electroneutrality: 7 The equilibrium population of charge carriers are given by: where N C and N V are the effective density of states of the conduction (E C ) and valence (E V ) bands, respectively. Since the formation energy of charged defects depends on the Fermi level (Equation 3), which in turn depends on the population of charged defects, the concentrations must be solved self-consistently as illustrated in Fig. 3.

PRACTICAL CALCULATIONS
Several approaches exist for calculating defect formation energies. Those based on embedding potentials (a dilute defect in a host matrix) offer several advantages but remain technically challenging to setup and analyse. 8,9 For first-principles approaches, including density functional theory (DFT), the supercell method is by far the most widely employed. 10 With the supercell method, periodic boundary conditions remain in place, but an expanded repeat unit is employed. This repeat unit should be large enough so that the host material is well described and the periodic interactions between repeating defects can be corrected. For example, if a cubic unit cell with lattice spacing (a) is expanded using a matrix of: there will be a spacing of 2a between repeating defects along each axis. Vacancies and substitutions can be introduced by simply removing atoms and replacing atoms, respectively. Interstitial sites require further care because of the larger configurational space. Candidate sites for interstitials can be assigned in a way that resembles well-known structure motifs such as tetrahedra or octahedra. 11 PyLada adapts a scheme based on Voronoi tessellation. 4 The range of possible charge states is inferred from the oxidation states of atoms. For example, the charge state of Sn Zn can be 0 or +2 which correspond to Sn(II) and Sn(IV), respectively.
We will now breakdown Equation 3 into its components and discuss how to compute the four terms in turn.

A. Term 1: Raw defect formation energy
To obtain ∆E, the standard operating procedure is as follows: 1. Fully optimise the crystal structure of the host material in its primitive unit cell.
2. Create a supercell expansion that is as close to cubic as possible, which minimises anisotropy in defect-defect interactions.
3. The total energy of the pristine supercell becomes the E H reference for defect formation energies. 4. Introduce a defect of your choosing and optimise the structure under constant volume conditions keeping the lattice vectors fixed.
5. The total energy of the optimised defective supercell is E D,q .

The raw defect formation energy in Equation 3
is the difference of two total energy calculations, One common mistake is to assume an incorrect spin state. Even if a host material is non-magnetic, the ground state of a defect may require spin-polarisation, e.g. in an open-shell singlet or triplet configuration. 12

B. Term 2: Atomic chemical potentials
The raw defect formation energies are not meaningful as they do not represent balanced reactions (atoms may have been created or annihilated). The chemical potentials µ i account for the exchange of species with their environment. In practice, a range of environments are possible. In some cases we may try to mirror an experiment, e.g. an environment of oxygen gas at a particular temperature or pressure. More generally, we treat µ i as a parameter than can vary over the accessible phase space of the material. The standard operating procedure is as follows: 1. Calculate the total energy of the standard states of each element found in your host material (e.g. Zn metal and oxygen gas for ZnO).
2. Calculate the total energy of all possible secondary phases that could form (e.g. ZnO and Al 2 O 3 for a ZnAl 2 O 4 host).
3. Solve the accessible chemical potential region considering these boundaries. Any point in this stability field can be used safely in Equation 3.
For multi-component materials, the associated simultaneous equations become cumbersome to solve. One freely available package developed for this purpose is CPLAP. 13

C. Term 3: Fermi level
In semiconductors and dielectrics, the electronic chemical potential (Fermi level) can vary between the valence and conduction bands, depending on the doping regime and history of a sample. Here, qE F represents the cost of exchanging electrons or holes with the host and is therefore proportional to the defect charge. It has become standard to present defect formation energies as a function of a parametric E F in the range [0,E g ], where E g is the bandgap of the host compound. In reality, the full range will not always be accessible. The equilibrium Fermi level can be solved conveniently using a package such as SC-FERMI. 14

Code
Purpose Correction Scheme PyLada 4 Automate point defect calculations Lany-Zunger CoFFEE 15 Electrostatic corrections for charged defect calculations FNV PyDEF 16 Defect formation energies using VASP Lany-Zunger PyCDT 11 Facilitate high-throughput DFT defect calculations FNV and KO sxdefectalign 17 Point defects in bulk within SPHinX 18 FNV sxdefectalign2d 19 Point defects at surfaces and interfaces within SPHinX FNV CPLAP 13 Calculation of stable chemical potential ranges .. SC-Fermi 14 Determine the Fermi level based on defect formation energies .. CarrierCapture.jl 20 Calculate non-radiative carrier capture by anharmonic defects ..

D. Term 4: Charged defect corrections
The treatment of charged defects is a longstanding issue for periodic boundary conditions due to the longrange nature of the Coulomb interaction. There are two issues to resolve. Firstly, charged defects are able to interact with their periodic images. Secondly, a homogeneous "jellium" background charge is introduced to enforce charge neutrality and a convergent Coulomb energy. These issue results in a shift to the average electrostatic potential the supercell and in E D,q . Large supercells are optimal to minimise these errors; however, we are often limited by computational cost and available resources.
A number of correction schemes have been developed that result in the term E corr . Some are discussed below and for a more complete description of these issues, we refer the reader elsewhere. 21,22 1. The Leslie-Gillan correction 23 models a point charge q interacting with its periodic images through an isotropic dielectric medium. This correction takes a simple analytic form that depends on the charge state q, static dielectric constant ε, separation between images L and the Madelung constant α m characteristic of the lattice.
2. The Makov-Payne correction 24 includes an additional term to account for higher-order multipoles: An issue associated with this approach is in determining the quadrupole moment Q.
3. The Lany-Zunger correction 25 combines the Makov-Payne correction, including a procedure for calculating Q, with potential alignment to correct for the shift in electrostatic potential.

The Freysoldt, Neugebauer and van de Walle
(FNV) method 26 models the defect charge as a Gaussian distribution. The difference between the electrostatic potential of the charged defect and perfect bulk supercells, calculated far from the defect, is aligned with the defect model potential.

5.
Kumagai and Oba (KO) extended the FNV method using atomic site potentials combined with the Gaussian charge model for an anisotropic dielectric medium. 27 Such schemes were initially developed for use with threedimensional crystal with homogeneous dielectric screening. Recent work has extended these methods to twodimensional 28,29 and one-dimensional 30 materials.
There is no standardised approach to defect charge corrections, which can lead to a spread in calculated defect formation energies in the literature, and predicted defect densities that differ by orders of magnitude. FNV and KO have become the most widely used methods. These are implemented in several computational tools such as PyLada and sxdefectalign, see Table 1. Developing an efficient scheme to account for microscopic effects and anisotropy remains an active area of research. 21,22 As a side note, for shallow defects where the valence or conduction bands become occupied, an additional band filling correction can be required to obtain results in the dilute limit. 31

Carrier capture and recombination
Our previous discussion was limited to an equilibrium description of defects and charge states (e.g. for a crystal in the dark at a certain temperature). Either following a pump pulse or under steady-state illumination, the kinetics of carrier (electron and hole) capture by defects becomes important.
A defect may capture a delocalised free carrier with the aid of electron-phonon coupling. As the charge is transferred from the delocalised state to the localised state around the defect, the local atomic configuration is rearranged. Such processes can be described in the framework of a configurational coordinate (CC) diagram. The CC is usually in the form of a one-dimensional pathway between two local minimum structures as shown in Fig.  4.
To describe such non-radiative defect processes, both the nuclear and electron wave functions must be ade- Configuration coordinate diagram for SnZn in Cu2ZnSnSe4. It describes the non-radiative electron-hole recombination process: Reproduced with permission from Ref. 35. quately described. Several schemes have been proposed to calculate the cross-sections and rates of carrier capture based on first-principles simulations. [32][33][34] The current schemes require a significant amount of researcher expertise and computational resource. Further developments are required to develop reliable and robust procedures suitable for general applications.

Dynamics and transport
Materials engineering often requires a certain spatial distribution of defects. To achieve a desirable profile of defect species in a sample, one needs to understand atomic diffusion in crystals. However, diffusion is a rare event compared to typical vibrational frequencies (1-10 THz). Standard ab initio molecular dynamics simulations can not properly capture diffusion kinetics due to the limited time and length scales that are accessible. To overcome such difficulties, statistical methods such transition state theory (TST) 36 can be employed. The nudged elastic band (NEB) 37 method has been developed to find a minimum energy path and the energy barrier at the saddle point, which can be used to paramaterise kinetic models.
Real diffusion processes may consist of multiple barriers, even in relatively simple structure types such as zincblende. 38 The role of excited states in crystals with defect-mediated mass transport, for example as found in halide perovskite solar cells 39,40 , is one area that requires further development.

Automation and databases
Defect studies are demanding on human time due to the large number of individual calculations involved. There are a growing number of publicly available packages to assist with pre-or post-processing (see Table  I). Powered by the rapidly expanding computational capacity and the automation frameworks such as AiiDA, 41 databases for first-principles calculations including modelling of defects are being developed, which will offer the opportunity for to gain insights when combined with statistical analysis and machine learning models. CONCLUSION We have discussed common considerations and procedures for simulating point defects in bulk materials. First-principles modelling can provide both qualitative and quantitative descriptions of properties of crystalline solids including carrier generation (dopants) and recombination (traps). Emerging renewable energy technologies -including batteries, solar cells, and thermoelectrics -can benefit from first-principles modelling of point defects, complementing experimental characterisation. The prospect of developing large defect databases with the aid of automation offers an opportunity to extract valuable insight toward defect-engineered materials.