Fingerprinting defects in diamond: Partitioning the vibrational spectrum

In this work, we present a computational scheme for isolating the vibrational spectrum of a defect in a solid. By quantifying the defect character of the atom-projected vibrational spectra, the contributing atoms are identified and the strength of their contribution determined. This method could be used to systematically improve phonon fragment calculations. More interestingly, using the atom-projected vibrational spectra of the defect atoms directly, it is possible to obtain a wellconverged defect spectrum at lower computational cost, which also incorporates the host-lattice interactions. Using diamond as the host material, four test case defects, each presenting a distinctly different vibrational behaviour, are considered: a heavy substitutional dopant (Eu), two intrinsic defects (neutral vacancy and split interstitial), and the negatively charged N-vacancy center. The heavy dopant and split interstitial present localized modes at low and high frequencies, respectively, showing little overlap with the host spectrum. In contrast, the neutral vacancy and the N-vacancy center show a broad contribution to the upper spectral range of the host spectrum, making them challenging to extract. Independent of the vibrational behaviour, the main atoms contributing to the defect spectrum can be clearly identified. Recombination of their atom-projected spectra results in the isolated defect spectrum.


I. INTRODUCTION
Vibrational spectroscopy is an important tool for the structural investigation and characterization of solids. [1][2][3][4][5][6][7] Quantum mechanical modeling of such experimental spectra starts from the calculated vibrational spectrum of the system, with the appropriate intensities for the individual spectral modes determined depending on the target experimental technique (e.g., Infra-Red, Raman,...). 2 Because of the inherent control over the atomic structure, such calculations can provide an invaluable source of information and understanding with regard to the structure of the system, and more specifically the impact of defects. [8][9][10] However, the calculation of the vibrational spectrum of such defects faces a significant limitation: the calculation of an accurate and converged vibrational spectrum requires the use of large cells, which is computationally very demanding. 11,12 In the case of defects, the main interest goes to the modifications of the host spectrum due to the defect. It is therefore of interest not to spend computational resources on the reconstruction of the host spectrum, but to limit the calculations to the contributions to the spectrum due to the defect itself. In contrast to solid state modeling, where the phonon spectra are generally only considered for small unit cell systems, 10,13 several approaches have been developed to deal with large systems within the context of (bio-)molecular structure investigations, 14 . Examples are the selective calculation of specific normal modes, 12,15 the partial optimization of the molecular geometry, 16 and the reduction of the Hessian by assuming rigid subsystems. [17][18][19] Others have presented a fragment approach, in which a large system is decomposed in fragments of which high quality properties are calculated. The resulting fragment properties are then recombined again as approximation of the original system. 20 Yamamoto et al. 21,22 showed this method reproduces spectra of large molecules faithfully as long as suitable fragments were selected. They also note that the force field transfer accounted for nearly half of the observed error. 21 Hanson-Heine et al. 23 presented a local mode approach which can be used within the context of 2DIR spectroscopy of large systems, where it provides a platform for the parameterization of site frequencies and coupling maps with regard to the geometry of different functional groups. Another proposed fragment strategy is centered on the construction of the Hessian matrix considering only the atoms in the region of interest. 24 This approach efficiently succeeds in reproducing the spectra of interest, requiring only a small number of atoms to be considered. In this method, the selection of the atoms belonging to the fragment is rather ad hoc. Furthermore, delocalized modes can not be treated with this approach. More specifically, within this approach the remainder of the system is kept frozen which may in some cases lead to unphysical behaviour if a normal mode is not tightly localized. A similar partial Hessian approach is the so-called Mobile Block Hessian approximation. 18,19 In this approach, the remainder of the system is also considered, but to reduce the computational cost, the atoms outside the fragment of interest are grouped in rigid blocks, which have no internal degrees of freedom, only 6 external degrees of freedom. In contrast to these fragment approaches, Teodoro et al. 12 consider the full system using a computationally cheap approximate approach to obtain the initial full spectrum, after which only the normal-modes of experimental interest are selected and arXiv:2001.06277v1 [cond-mat.mtrl-sci] 17 Jan 2020 re-evaluated with a more accurate method. In this work, we present a method to identify the atoms contributing to the vibrational spectrum of a defect. The overlap of the atom-projected spectrum with the reference host spectrum is presented as a suitable quantitative measure. We show that small supercells suffice to clearly identify the defect atoms and that the latter are limited in number. We also show that the defect spectrum obtained through the combination of the atom projected spectra of the defect atoms is already well-converged when using a small supercell for the defect system. The resulting defect spectrum is continuous in nature, due to the incorporation of defect-host interactions. Four diamond based test cases are considered, for which the individual defect spectra are determined. Within the context of the fragment approximations mentioned above, this method could resolve the ad hoc nature of atom selection. Furthermore, within the context of the partial Hessian approximations, having a quantitative measure of the defect nature of an atom, would allow for more targeted selection of Hessian sub-blocks. In both cases, small supercells can be used to identify specific defect-atoms, and partial Hessian calculations on large supercells to obtain the spectrum of interest, reducing the computational cost of obtaining an accurate quantum mechanical vibrational spectrum in a periodic solid.

II. COMPUTATIONAL METHODS
First-principles calculations are performed within the Density Functional Theory (DFT) framework using the VASP package. 25 The kinetic energy cutoff of the plane wave basis set is set to 600 eV to obtain well converged forces, while the exchange correlation functional as proposed by Perdew, Burke and Ernzerhof (PBE) is used to describe the valence electron interactions. 26 The defects are imbedded in a 64-atom conventional supercell, with the first Brillouin zone sampled by a 5 × 5 × 5 Monkorst-Pack grid. Further details on the computational settings used are presented elsewhere. 9,27

III. HARMONIC PHONON SPECTRUM OF SOLIDS
In the following, we opted to use a very explicit notation for the dynamical matrix and its components over the more common and compact notation often found in the lattice dynamics literature. 28 This was done with the aim of clarity, also for those less familiar with the field. Vectors are indicated in bold, and inner products are written as ·.
A. Construction of the atom-projected phonon DOS Most modern quantum mechanical and quantum chemistry packages provide access to the vibrational spectrum of a system at the center of the first Brillouin zone. This vibrational spectrum at the Γ-point, can be obtained by the diagonalisation of the mass-weighted Hessian matrix, also called dynamical matrix: with N a indexing the atoms of the system, m a the atomic mass, and the 3 can be determined either as the second derivative of the total energy with regard to the displacements x i (N a ) and x j (N b ) of atom N a and N b , respectively, or as the derivative of the forces acting on atom N a by the displacement of atom N b : 29 For molecular systems, or clusters, a dynamical matrix as presented in Eq. (1) would provide a complete picture. However, in the case of a periodic solid there are two complications that need to be dealt with: (1) the infinite nature of a theoretical crystal and (2) the finite size of the first Brillouin zone. For a periodic crystal, all relevant physics is contained in a single unit cell, reducing the number of atoms to consider from infinity to a (small) finite number. On the other hand, to obtain the phonon density of states of a periodic solid, one needs to integrate the spectrum over the full Brillouin zone (similar as for the calculation of the electronic density of states). As such, one infinity is replaced by another, albeit a more manageable one. The vibrational spectrum at each point q of the Brillouin zone (BZ) is determined through the diagonalisation of the dynamical matrix: 28 with r Na − r N b the real space vector from atom b to atom a. Furthermore, because interatomic interactions are infinitely ranged, the dynamical matrix needs to incorporate interactions with other unit cells as well. Indexing the unit cells, with R = 1 being the reference unit cell (UC), the general form of the dynamical matrix can be written as: (4) For practical purposes, R can be truncated to a finite number of unit cells, R max , as the contributions to the dynamical matrix of unit cells farther away becomes vanishingly small. 28 The convergence of the vibrational band structure and density of states (DOS), as function of R max , is shown in Figure 1. Note that diamond has a rather small primitive unit cell. For large unit cell systems, such as for example metal-organic frameworks, 30 a converged spectrum may be obtained already at the unit cell level (i.e., R max = 1). Note that for supercell calculations, which are generally used to obtain vibrational spectra from quantum mechanical calculations, 31 the supercell dynamical matrix, Eq. (3), and the unit cell dynamical matrix, Eq. (4), are related through symmetry. Both give rise to the same phonon DOS, however, as matrix diagonalization scales approximately as O(n 3 ), Eq. (4) is much more efficient for larger supercells. This is of interest when using a dens sampling of the BZ.
The dynamical matrix is diagonalised by solving the following eigenvalue problem: with ω 2 (q, j) the j th eigenvalue at wave vector q and v(q, j) the corresponding eigenvector. This eigenvector v(q, j) represents the mass-weighted displacement vec-tors associated with phonon-mode j. From this it is possible to construct a weighing for each atom: allowing for a partitioning of the phonon DOS. The weighing factors normalize to one as a w a (q, j) = 1. 32 The atom-projected phonon spectrum for atom a at a frequency ν is then calculated as: with V BZ the volume of the first Brillouin zone and δ the Dirac delta function.

B. Differences of spectra
When trying to extract the part of the vibrational spectrum due to a defect, one may be tempted to take the difference of this full spectrum and a reference spectrum (i.e., the spectrum of the host material). The result will contain clear defect features-such as new peaks outside the host spectrum, and new intense features within the range of the host spectrum. This can provide a reasonable qualitative picture, even though significant noise as well as negative intensities are to be expected. Correct normalisation with regard to the host spectrum is complicated by the possible difference in number of atoms (e.g., due to an interstitial or vacancy), but also by the question of which atoms belong to the defect (e.g., only the substitutional dopant, or also nearest neighbours?). To move beyond the qualitative identification of defect related vibrational states and properties it is necessary to obtain a well-normalised spectrum (i.e., the integrated DOS corresponds to the actual number of states involved in the defect spectrum) as well as an associated listing of defect-contributing atoms. The resulting defect-host partitioning should be independent of the defect system, and provide a quantitative measure for identifying atoms belonging to the defect.

C. Isolating the phonon spectrum of a defect
In this work, we address the problem of isolating the phonon spectrum of a defect, starting from the uncertainty of which atoms contribute to the defect spectrum. Although the approach can be extended easily to host materials with multiple inequivalent atomic sites and multiple atomic species, we present the methodology from the perspective of a host material containing only one atomic species and one inequivalent atomic position: diamond. Since all atoms are perfectly equivalent in this system, their contribution to the phonon spectrum is exactly the same (i.e., 1 n times the total phonon spectrum of a unit cell containing n atoms). Therefore, the impact of a defect can be identified clearly as the deviation from this reference spectrum. A straightforward method to quantify this difference is by means of the Root-Mean-Square-Deviation (RMSD) of the two (normalized) spectra: with ν max the highest frequency of the spectraω andω a , the normalised host and atom spectrum. In this case, a host atom has theoretically an RMSD of zero. A defect atom, on the other hand, has a positive non-zero RMSD. However, as the upper bound of this function strongly depends on the shape of the spectraω andω a , the information gained is too limited for our purpose. Alternately, the overlap of the (normalized) phonon spectrum obtained for the host system and atom a of the defect system presents a bound function with an upper value of 1 (or 100%): In this case, a host atom shows 100% overlap while a defect atom shows a lower value. A value of 0% could in theory be obtained for a defect atom which gives rise only to vibrational contributions outside the host spectrum. The substitutional Eu atom in diamond, which we will discuss later, approaches this theoretical limit with χ Eu = 10%. We noted earlier that the vibrational spectrum and DOS for systems with a small unit cell (such as prospective host systems) may require rather large supercells to present a converged picture (cf., Figure 1). Defects, in contrast, are modelled using large supercells to approximate the experimentally relevant "low" defect concentrations. As a result, longer ranged vibrational interactions are by default incorporated for such systems, leading to a more converged phonon spectrum than is the case for a small host system unit cell (cf., black curve in Figure 1). It is therefore essential to obtain a sufficiently converged host reference spectrum,ω(ν) , (cf., the 5 × 5 × 5 spectrum in Figure 1) to avoid artificial overlap mismatch when calculating χ a . This mismatch can be quantified by calculating the convergence of χ a of the host spectrum itself. In the case of diamond, this is shown in Figure 2. E.g., the χ a of a 2-atom diamond system gives a mismatch of about 20%. As such, a "host atom" in a defective system, when using the 2-atom reference data, would have a χ a of about 80%, instead of the theoretical maximum of 100%. This is seen in Figure 3, showing χ a of the host C atoms in the defect systems to be in the range of 79-84% for the 1 × 1 × 1 reference spectrum. Fortunately, the computational cost of obtaining well converged reference host spectra is not extreme when taking advantage of (translational) symmetry. We therefore assume in the following that the reference spectra are sufficiently converged. For the "host atoms" in defect systems, long ranged interactions will impact their expected χ a as well. As can be seen in Figure 3, this leads to a leveling of the host atom χ a as function of the reference spectra used. For a defect modeled with a 64 atom conventional supercell (which is relatively small for a defect cell), the χ a of the host atoms ranges between 89 and 92%, which is consistent with the convergence of the host atom overlap in the primitive 2×2×2 cell (cf., Figure 2). For larger defect cells, the overlap of the host atoms increases, as can be seen for the example of the C i defect (cf., Figure 3). Placing the defect in a conventional 3 × 3 × 3 supercell, χ a increases to about 95%, which in turn, is consistent with the convergence of the reference host atom in the primitive 3 × 3 × 3 cell (cf., Figure 2). More interestingly, this increase is not due to a gradual increase in χ a for atoms ever farther away from the defect, but rather due to a general upward shift of the overlap of the non-defect atoms, as can be seen in Figure 4.
In contrast, χ a remains the same for the defect atoms, indicating the χ a of defect atoms to rapidly converge with regard to system size. This means that even using small defect cells, χ a is a usefull measure to effectively determine the atoms belonging to the defect, and even to which degree. This allows for the efficient calculation of the fragment spectra of a defect in a larger supercell system. Furthermore, the χ a values also allow for the systematic improvement of such fragment spectra (cf. below and Figure 6). Instead of defining a purely spatial threshold function, the threshold for inclusion can now be directly related to the atom's contribution to the defect spectrum. Alternately, it is also possible to directly construct a defect spectrum from the atom projected spectra, using χ a for selection purposes. The resulting spectrum will contain all (defect) features missing in the host spectrum. In addition, this defect spectrum will also contain contributions due to the interaction of the defect and the host lattice. These are incorporated through the (defect) atom projection of (delocalized) host system modes. This partitioning of the system into a host and defect fraction could be used as a platform to calculate derived thermodynamic contributions due to the defect (which goes beyond the scope of the current work). Furthermore, because of the small size of the fragment to consider, one could more easily move beyond the standard harmonic approximation. 11,33,34

IV. DIAMOND BASED DEFECTS
To evaluate our method, four different defects in diamond are considered.
• The substitutional Eu dopant in diamond (Eu sub ): This heavy lanthanide dopant gives rise to low lying atomic phonon bands with a clearly distinguishable peak in the phonon spectrum. 9 • The 001 split interstitial (C i ): This intrinsic defect places two C atoms at a single site. It provides a local breaking of the symmetry with limited change in the chemical environment. As a result, two very distinct optical phonon peaks are created well above the bulk spectrum. • The neutral C vacancy (C V ): This intrinsic defect is obtained by removing a single C atom, and as a result, it resembles pristine diamond most closely.
(The complex electronic structure and its influence on the defect geometry was modeled using the DFT+U method, 27 in contrast to the other defects where no +U correction on C was used.) • The negatively charged nitrogen-vacancy centre (NV − ): One of the most discussed and studied defects in diamond. This defect presents a combination of a substitutional dopant and a carbon vacancy. Due to a mass comparable to that of C, the N atom gives rise to a spectrum comparable to that of C itself, making it challenging to extract while being of great interest for applications.

A. Defect phonon spectra
The overlap χ a is calculated for each atom in the defect systems. In Figure 4, χ a is shown as function of the distance of atom a to the defect center. The atoms at the center of the defect are indicated, as well as the nearest neighbour (NN) atoms to the defect. In Figure 5, the phonon spectrum of the different defects is presented in comparison to the phonon spectrum of pristine diamond. Both the Eu sub and the C i defect give rise to clearly distinguishable phonon peaks, which show little to no overlap with the host phonon spectrum. This results in very low χ a values for Eu and the interstitial C atoms, as is seen in Figure 4. The shells of NN and next-FIG. 5: The phonon spectra of the four diamond defects systems (red curves) and the partial phonon spectrum due to the defect (blue curve) with a threshold of χa < 85%. The bulk diamond spectrum (black curve) is given as reference.
NN atoms already show rather large χ a values of 70-80%, indicating that their atom-projected spectra are harder to distinguish from that of a host atom, but still clearly different. In contrast to these outspoken differences, the C V defect system shows a phonon spectrum quite similar to that of the host system, making it hard to indicate the differences and their sources. However, looking at Figure 4, four atoms stand out clearly with a χ a ≈ 70%: the four atoms surrounding the vacancy. Next-NN C atoms present a converged host character, showing this defect to be strongly localized on the vacancy and its four surrounding atoms. Considering the projected phonon DOS associated with these atoms (blue curve in the bottom left panel of Figure 5), it becomes clear that the defect spectrum consists of a peak at the high end of the spectrum, and two peaks around 14.5 and 17 THz. Turning our attention to the NV − defect, we notice in Figure 4 that for the N atom, as well as for the three C atoms surrounding the vacancy, χ a ≈ 70%, similar as for the C V defect. The NN C atoms surrounding the N atom, on the other hand, present a high χ a associated with host atoms. The defect spectrum is also quite similar to that of the C V defect, with additional peaks in the range of 14-17 THz. However, in contrast to the C V defect, the peak at the high end of the spectrum is much less pronounced.
B. Comparison to the fragment spectrum: the 001 split interstitial In Figure 6 the defect spectrum of the C i defect is shown, obtained in both a smaller 2×2×2 (brown curve) and larger 3 × 3 × 3 (black) diamond supercell. It shows that the 2 × 2 × 2 supercell is sufficient to construct a well-converged defect spectrum. The two optical peaks at about 45 THz and 55 THz are found to be within 0.5 THz of the results obtained with the 3 × 3 × 3 supercell, while the feature at 13 THz shows no visible deviation. Furthermore, the broad band, due to defect-host system interactions, is well converged. It is important to note that the computational cost for generating the firstprinciples Hessian matrix within a periodic plane waves approach (shown in Table I), for the 2 × 2 × 2 supercell is 24× lower than for the larger supercell, making this a cost-efficient approach. The defect spectrum is also compared to different fragment spectra obtained using the 3 × 3 × 3 supercell. The atoms belonging to the fragment are determined using their χ a value: χ a ≤ 34% (2 atoms), 77% (6 atoms), 81% (10 atoms), and 89% (18 atoms). The resulting defect spectra obtained using the fragment approach are shown in Figure 6. All fragments (except the smallest 2-atom fragment) give rise to the two optical modes, and it is only the largest fragment which positions them with an accuracy comparable to the 2 × 2 × 2 defect spectrum (at almost twice the computational cost). More interestingly, the feature at 13 THz is not retrieved in the fragment spectra, neither is the broad interaction band.

V. CONCLUSIONS
In this work, a method is presented for determining the phonon-spectrum of a defect using relatively small periodic first-principles calculations. Our method provides a quantitative measure for assigning atoms to a defect. This allows it to be used in tandem with a fragment ap- proach to efficiently obtain incrementally more accurate fragments in much larger supercells. Alternately, combining the atom projected vibrational spectra of the defect atoms gives rise to a quickly converging defect spectrum which combines the defect specific features of the spec-trum with the contributions due to defect-host interactions. The resulting partitioning of the system spectrum into a host and defect component opens up the possibility for similar partitioning of properties derived from the phonon spectrum, which is the subject of ongoing research. The presented methods have been implemented in the HIVE package. 35