First-principles theory of the luminescence lineshape for the triplet transition in diamond NV centre

In this work we present theoretical calculations and analysis of the vibronic structure of the spin-triplet optical transition in diamond nitrogen-vacancy centres. The electronic structure of the defect is described using accurate first-principles methods based on hybrid functionals. We devise a computational methodology to determine the coupling between electrons and phonons during an optical transition in the dilute limit. As a result, our approach yields a smooth spectral function of electron-phonon coupling and includes both quasi-localized and bulk phonons on equal footings. The luminescence lineshape is determined via the generating function approach. We obtain a highly accurate description of the luminescence band, including all key parameters such as the Huang-Rhys factor, the Debye-Waller factor, and the frequency of the dominant phonon mode. More importantly, our work provides insight into the vibrational structure of nitrogen vacancy centres, in particular the role of local modes and vibrational resonances. In particular, we find that the pronounced mode at 65 meV is a vibrational resonance, and we quantify localization properties of this mode. These excellent results for the benchmark diamond nitrogen-vacancy centre provide confidence that the procedure can be applied to other defects, including alternative systems that are being considered for applications in quantum information processing.


Introduction
In the past decade, the negatively charged nitrogen-vacancy (NV) centre in diamond [1] has emerged as a very versatile solid-state system for studies of quantum information [2]. The main characteristics that make it unique [1] are its paramagnetic ground state [3], bright luminescence, extremely long spin coherence times [4], coupling to nearby nuclear spins [5], and the ability to initialize and read out the spin using optical techniques [6,7]. Increasingly, NV centres in bulk crystals and nanodiamonds are used for metrological applications at the nanoscale, i.e., for measuring local magnetic [8] and electric [9] fields, temperature [10,11,12], and pressure [13]. The negatively charged NV centre possesses C 3v symmetry and consists of a substitutional nitrogen atom adjacent to a nearby carbon vacancy (figure 1(a)) with an additional trapped electron, the total electric charge thus being −1. The electronic structure of the ground and the lowest excited states of the centre is mainly determined by four electrons in atomically localized states of a 1 and e (e x and e y ) symmetries; the energy level diagram of the many-electron system is shown in figure 1(b) [14,15]. The basics of NV physics is understood in terms of the ground-state triplet 3 A 2 state (configuration a 2 1 e 2 ), the excited state triplet 3 E state (configuration a 1 1 e 3 ), and two singlet "dark" states 1 E and 1 A 1 (configuration a 2 1 e 2 ). The singlets play a crucial role in both initialization and read-out of the ground-state spin [1].
Nearly all of the applications of NV centres rely on measuring photoluminescence between 3 E and 3 A 2 electronic states as a function of other experimental parameters [1]. At low temperatures the luminescence band [16] consists of a sharp zero-phonon line (ZPL) at E ZP L = 1.945 eV (637 nm), and ∼ 4 increasingly broad phonon replicas with a phonon energy of ∼ 63-65 meV, as schematically shown in figure 1(c). A detailed analysis of the experimental phonon sideband was performed by Davies [17]. In particular, he determined the weight of the ZPL w ZP L ≈ 2.4 % and a Huang-Rhys (HR) factor [18], in essence the average number of phonons emitted during the optical transition (see Section 2 for a quantitative definition), S = 3.73.
The relevance of the NV centre to a variety of applications and the crucial importance of the luminescence band in all these applications raises a question: can the luminescence lineshape, i.e., the electron-phonon coupling during the optical transition, be calculated using first-principles calculations that require no experimental input? Such calculations should address an accurate determination of the Huang-Rhys factor, frequencies of dominant phonon modes, as well as the fine structure of the phonon sideband, including the coupling to long-range acoustic phonons. Previous work [19,20] has addressed the vibrational structure of NV centres to some extent; however, because of finite-size effects the results of these calculations are somewhat ambiguous. No firstprinciples calculation of the luminescence lineshape has been performed to date. Such a calculation would provide valuable information about electron-phonon coupling at NV centres, which at present is incompletely understood [1]. In addition, if the theory is predictive, it can be applied to other defects, for example alternative systems that are currently being actively considered for quantum information and metrology applications [21,22,23,24], or defects that play an important role in light-emitting diodes [25,26].
In this work we present accurate calculations of the vibronic structure pertaining to the triplet luminescence band of the NV centre in diamond. We demonstrate that the combination of state-of-the-art first-principles methods, in particular hybrid density functional theory [27], and computational techniques to address electron-phonon coupling at large enough length scales to accurately include long-wavelength acoustic phonons, is very successful in describing the luminescence lineshape and all the related parameters. The experimental luminescence spectrum which serves as a benchmark for the theoretical study has been measured in our laboratory. The measured luminescence band is of a comparable quality to those of Refs. [28] and [29]. This paper is organized as follows.
In Section 2 we outline the general theory to calculate the vibrational structure of luminescence bands and describe our computational approach. In Section 3 the details of acquiring and processing the experimental spectrum are presented. The results are presented in Section 4 and analyzed in Section 5. Section 6 contains our conclusions. The paper is supplemented with four appendices that discuss specific technical issues in more detail.

Luminescence
The excited state 3 E is an orbital doublet that forms an E ⊗ e Jahn-Teller system via coupling to e phonon modes [17,30,31]. The Jahn-Teller effect is dynamical, since the energy splitting between the vibronic sub-levels is larger than the barrier in the adiabatic potential energy surface δ ≈ 10 meV [31]. The presence of this effect is manifest in the broadening of the ZPL that follows a ∼ T 5 rather than the usual ∼ T 7 dependence at low temperatures [30,31]. However, the effect is weak [31], and we will neglect it when calculating the phonon sideband. One way to judge the validity of this approximation is via an a posteriori comparison [17]. If a linear model of electron-phonon interactions, such as the one employed in this work, accurately describes the lineshape of the optical transition, then the Jahn-Teller effect can be considered negligible for this particular transition. As we show below, this turns out to be the case for the triplet luminescence at NV centres.
We also assume that the transition dipole moment µ eg between the excited and the ground state depends weakly on lattice parameters (the Franck-Condon approximation). At T = 0 K the absolute luminescence intensity I(hω) (i.e., photons per unit time per unit energy) for a given photon energyhω and for one emitting centre is given by (in SI units) [32]: Here n D = 2.4 is the refractive index of diamond; χ e0 and χ gm are vibrational levels of the excited and the ground state; E gm is the energy of the state χ gm , being the sum over all vibrational modes k, i.e., E gm = k n kh ω k ; and n k is the number of phonons of type k in this state. The absolute angle-averaged value of µ eg is ∼ 5.2 Debye, as extracted from the radiative lifetime τ = 13 ns of the m s = 0 spin state of the 3 E manifold [1]. A prefactor ω 3 in equation (1) arises from the density of states of photons that cause the spontaneous emission (∼ ω 2 ), and the perturbing electric field of those photons (| E| 2 ∼ ω). This prefactor has to be taken into account when determining parameters pertaining to the luminescence lineshape, and this will be discussed in Section 2.2. Since in both the excited and the ground electronic state the system has C 3v symmetry, only fully symmetric a 1 phonons contribute to the sum in equation (1). The experimental determination of the absolute luminescence intensity given in equation (1) is difficult. Thus, in this work we will consider the normalized luminescence intensity, defined as where is the optical spectral function, and C is the normalization constant: C −1 = A(hω)ω 3 d(hω). I(hω) is related to L(hω) via I(hω) = n D /(3Cε 0 πc 3h )L(hω). The evaluation of the overlap integrals χ gm |χ e0 immediately poses a challenge. Vibrational modes that enter into equation (3) are not those of the pristine bulk, but rather those of the solid with a defect. The use of bulk modes can lead to large discrepancies with experiment, as we will show in Section 4. Lattice imperfections induce localized or quasi-localized vibrational modes that depend on the local electronic structure; in addition, the normal modes in the excited state and the ground state can be in principle quite different [33]. This results in highly multidimensional integrals that can in practice be evaluated only for molecules [34], small atomic clusters [35], or model defect systems [36].
Some kind of approximation is thus unavoidable. Here we assume that (i) the normal modes that contribute to the luminescence lineshape are still those of the solid with a defect, but (ii) the modes in the excited electronic state are identical to those in the ground state. Such an assumption is implicit in virtually all studies of defects in solids [37,38]. First-principles calculations [19,20], as well as a comparison of experimental absorption and emission spectra [16], indicate that the assumption does not strictly hold for the NV centre. Since the more exact calculation is not feasible, the validity of this approximation has to be checked by comparing the results with the experimental spectrum.
When vibrational modes in the ground and the excited state are identical, the optical spectral function A(hω) (equation (3)) can be calculated using a generating function approach proposed by Lax [38], as well as Kubo and Toyozawa [39]. The fundamental quantity that has to be calculated is the spectral function (also called spectral density) of electron-phonon coupling [40] where the sum is over all phonon modes k with frequencies ω k , and S k is the (partial) Huang-Rhys factor for the mode k. It is defined as [37] S k = ω k q 2 k /(2h) with q k = αi m 1/2 α (R e;αi − R g;αi )∆r k;αi .
α labels atoms, i = {x, y, z}, m α is the mass of atom α (carbon or nitrogen, average atomic masses of naturally occurring isotopes were used), R {e,g};αi is the equilibrium position in the initial (excited) and the final (ground) excited state, and ∆r k;αi is a normalized vector that describes the displacement of the atom α along the direction i in the phonon mode k. One can use an alternative expression for q k : where F e;αi −F g;αi is the change of the force on the atom α along the direction i for a fixed position of all atoms when the electronic state of the defect changes from 3 E to 3 A 2 . The latter equation directly follows from the relationship ( R e − R g ) = −Ĥ −1 ( F e − F g ), wherê H is the Hessian matrix, different from the dynamical matrix only because of additional mass prefactors in the latter. The two formulations are completely equivalent in the harmonic approximation. In Appendix A we show that if the dynamical Jahn-Teller effect is neglected the anharmonicities are indeed minute. While being in principle equivalent, the use of equation (7) instead of equation (6) offers a huge advantage when dealing with large systems, i.e., when extrapolating S(hω) to the dilute limit, and this is discussed in Section 2.3 and Appendix B. Once S(hω) is determined, the spectral function A(hω) (equation (3)) is given as the Fourier transform of the generating function G(t) [38,39]: The generating function G(t) itself is defined as where and is the total HR factor for a given optical transition. In equation (8) the parameter γ represents the broadening of the ZPL. In real situations this broadening has two contributions: the homogeneous broadening due to anharmonic phonon interactions [41,42] and the inhomogeneous broadening due to ensemble averaging. Since neither of these two effects is modeled in our approach, γ is chosen to reproduce the experimental width of the ZPL.

Huang-Rhys and Debye-Waller factors
The partial HR factor S k defined in equation (5) is the average number of phonons of type k emitted during an optical transition [18]. The total HR factor, defined in equation (11), is then the number of phonons off all kinds that are emitted during the same transition. The HR factor is thus an important parameter that characterizes the vibrational structure of the luminescence band. If (i) there was no additional prefactor ∼ ω 3 in the expression for the luminescence intensity in equation (2) and (ii) the vibrational modes in the excited and the ground state were indeed identical, then the weight of the zero-phonon line would be given by [43,17,37,38] w ZP L = e −S . Since this line corresponds, by definition, to zero absorbed or emitted phonons, w ZP L is often called the Debye-Waller factor, in analogy with x-ray scattering, where it represents the ratio of the elastic to the total scattering cross section. Therefore, we also use this nomenclature to comply with the accepted practice. The Debye-Waller factor w ZP L is a quantity that is directly measurable in experiment, and its determination is therefore unambiguous. In practical situations the HR factor is often deduced from the relationship S = − ln(w ZP L ), where we have added a "tilde" to distinguish this quantity from the actual HR factor S, which is defined by equation (11). S differs from S because of the additional assumption (i).
The spectral weight in L(hω) (equation (2)) moves to slightly higher energies in comparison to A(hω) due to the prefactor ω 3 . This increases the weight of the ZPL, w ZP L , if determined from L(hω), and thus decreases the value of S with respect to S. This distinction has to be borne in mind when comparing different experimental papers. In this paper we will consistently use w ZP L and S in their original definitions.

First-principles approach
In this work the spectral function of electron-phonon coupling S(hω) (equations (4)-(6)) is calculated within density functional theory (DFT). The electronic, atomic, and vibrational properties of the NV centre are calculated in the supercell approach [44], whereby one defect is embedded in a sufficiently large piece of host material, which is periodically repeated. We take a conventional cubic cell with 8 carbon atoms as the building block for larger supercells. The cubic supercell N×N×N, for example, contains M=8N 3 atomic sites.
To study the electronic and vibrational structure of defects we use two different exchange-correlation (XC) functionals: the generalized gradient approximation (GGA) in the form proposed by Perdew, Burke, and Ernzerhof (PBE) [45] and the screened hybrid functional of Heyd, Scuseria, and Ernzerhof (HSE) [27]. PBE is known to describe structural properties of many materials with high accuracy, but the calculated band gaps of semiconductors and insulators agree poorly with experiment, and this also affects the position of defect levels within the band gap. The HSE functional overcomes this problem by incorporating a fraction a = 1/4 of screened Fock exchange (screening parameter ω = 0.2Å −1 ). HSE calculations yield excellent results for excitation energies for the spin-triplet optical transition in NV centres [46].
The properties of the excited state 3 E have been calculated using the constrained occupation method of Slater [47], as first applied to the NV centre by Gali et al. [46]. In this method one electron from the a 1 orbital is promoted to one of the e orbitals. The electronic and the atomic structure is optimized with a hole in the a 1 state. To circumvent the problems with the Jahn-Teller distortion in the excited state, resulting from the degeneracy of the nominal a 1 e 2 x e 1 y and a 1 e 1 x e 2 y configurations, the coordinate dependence of the total energy in the excited state is studied here by constraining the configuration to a 1 e 1.5 x e 1.5 y . This is a practical solution to restrict the excited state density to be the average of the two degenerate configurations, retaining a C 3v symmetry.
We find that while the integrated parameters, for example the total HR factor S (equation (11)) converge quickly when the size of the defect supercell is increased, the convergence of the spectral function S(hω) (equation (4)) is significantly slower. This is a particular concern for the spectral function at lower energies, i.e., coupling to longrange acoustic phonons. As an example, let us consider a 2×2×2 simple cubic supercell, containing 64 lattice sites. Without a defect, the lowest energy Γ-point vibration of such a supercell corresponds to the bulk transverse acoustic (TA) mode at the Λ point with an energy of about 68 meV. This is even higher than the experimentally determined energy of the dominant phonon mode at the NV centre, beinghω 0 =63-65 meV. Clearly, supercells of this size are insufficient to determine S(hω).
To obtain converged results and determine the nature of vibrational states, we have performed calculations for a series of supercells: from 2×2×2 (64 sites) up to 11×11×11 (10648 sites). Since a direct approach for supercells containing more than a few hundred atoms is computationally too demanding, we have developed a special methodology to achieve this goal. First, in Appendix A we show that it is an excellent approximation to calculate vibrational properties at the PBE level, since the relevant vibrational modes are very similar in the PBE as compared to the HSE functional. This presents huge computational savings, since HSE calculations are up to two orders of magnitude more expensive. Then in Appendix B we present a methodology to calculate vibrational spectra and spectral functions S(hω) for very large systems. In short, the procedure is as follows. Partial Huang-Rhys factors for large systems are determined from equation (7). Forces F {e,g} in the large supercell N×N×N (N > 3), needed in that approach, are obtained from the calculation of a smaller supercell (4×4×4) via a suitable embedding procedure, explained in Appendix B. For these large defect supercells, vibrational modes and frequencies, that also appear in expression (7), have been determined by diagonalizing the dynamical matrix constructed from dynamical matrices of bulk diamond and NV centre in the 3×3×3 supercell. The validity of the procedure relies on the fact that the dynamical matrix of diamond is rather short-ranged. Specific parameters of the procedure are determined from accurate convergence tests, and are discussed in Appendix B.
Defect calculations have been performed with the vasp code [48,49], and the interaction with ionic cores was described via the projector-augmented wave (PAW) formalism [50]. A kinetic energy cutoff of 400 eV (29.4 Ry) has been used for the expansion of electronic wavefunctions. For the 2 × 2 × 2 supercell the Brillouin zone was sampled using a 2 × 2 × 2 k-point mesh, and Γ-point sampling was used for larger supercells.
To produce additional insights, we have also calculated S(hω) (equation (5)) and L(hω) (equation (2)) with an additional assumption, namely that phonon modes that contribute to the luminescence lineshape are those of the unperturbed host [51,52]. For this purpose we have determined the vibrational modes of bulk diamond using density functional perturbation theory [53], reproducing earlier calculations [54]. Vibrational modes were determined on a very fine 27×27×27 k-point grid close to the Brillouin zone centre, and a courser 9 × 9 × 9 grid elsewhere. These calculations have been performed using the quantum espresso code [55] within the local density approximation [56]; this XC functional describes phonons modes of bulk diamond very well [54]. To evaluate S(hω), the modes were mapped to the Γ-point of the desired supercell. The contributions from the vacancy site are set to zero in equations (6) and equation (7), while the mass of the nitrogen atom was set to be equal to that of the carbon atom in this case.

Experimental spectrum
The NV centre photoluminescence (PL) spectrum was taken at 8 K on an ensemble of NV centres using a home-built confocal setup. The diamond sample used was a Sumitomo high-pressure, high-temperature grown Ib Sumicrystal with a specified nitrogen content of 30 − 100 parts per million. The sample was irradiated with 2 MeV electrons at a dose of 1 × 10 17 electrons/cm 2 and annealed at 850 0 C for 2 hours to generate a high density of NV centres within the bulk. The NV centres were photoexcited with 532 nm light with sufficiently low intensity to suppress the luminescence of neutral (NV 0 ) centres. Subsequent PL was collected into a spectrometer with ∼0.3 meV spectral resolution. The spectrum intensity was calibrated by measuring the nominallyknown spectrum of an OceanOptics LS-1-LL tungsten halogen light source placed at the same position of the diamond sample within the optical setup.
The experimentally obtained spectrum was normalized to 1 for comparison with the theoretical calculations. For normalization purposes the low-energy tail of the spectrum was modeled as an exponential function. The weight of the zero-phonon line (Debye-Waller factor) was determined to be ∼ 3.2%. This corresponds toS = 3.45, in very close agreement with Ref. [29]. The actual Huang-Rhys factor can be estimated to be S ≈ 3.85 ± 0.05.

Excitation energies
For the 4×4×4 supercell, the largest system for which we have performed actual electronic structure calculations, E ZP L was calculated to be 1.757 eV using the PBE functional, and 2.035 eV using the HSE functional. The latter is thus much closer to the experimental value of 1.945 eV. Our calculations agree with those of Gali et al. [46] and Weber et al. [21]. The HSE functional is clearly superior for describing the local electronic structure of the NV centre [46]. The difference of about 0.1 eV between the experimental and calculated ZPL is within the error bar of the HSE calculations, but would complicate direct comparisons between theoretical and experimental lineshapes. To enable a more meaningful comparison, in all subsequent analysis we set E ZP L to the experimental value. Thus, the broadening of the ZPL γ in equation (8) and the value E ZP L are the sole instances where information from experiment has been used in the theoretical results.

Spectral function of electron-phonon coupling S(hω)
We first analyze the convergence of S(hω) when the size of the supercell is increased. In addition to providing justification for the computational procedure, such a study provides insights into the origin of vibrational modes that contribute to the phonon sideband.  (4)) and partial Huang-Rhys factors (equation (5)) pertaining to the spin-triplet optical transition at NV centres for increasingly larger supercells, from 2×2×2 to 11×11×11 (some intermediate results are not shown). S(hω): left vertical axes and black solid lines; to enable a meaningful comparison the range of these vertical axes is the same for all supercells. S k : right vertical axes and blue bars; the range of these vertical axes decreases for larger supercells.
In figure 2 we show S(hω) (equation (4)) and partial Huang-Rhys factors (equation (5)) as a function of the supercell size, from 2×2×2 to 11×11×11 (results for five intermediate supercells are omitted). The range of the left vertical axes for S(hω) was kept identical for all supercells, but note that this is not the case for the right vertical axes that apply to S k . For the calculation of S(hω) δ-functions in equation (4) were replaced by Gaussians with widths σ = 6 meV. HSE results are discussed here.
In the case of the smallest 2×2×2 supercell, only a few phonon modes contribute to S(hω). The most important of these are modes with energies 60.4 meV and 77.8 meV, in complete agreement with the results of Gali et al. [19] (59.7 and 77.0 meV), who studied local vibrational modes for this size of supercell. While the energy of the first mode is close to the energy of the most pronounced phonon mode seen in experiment, this agreement is largely fortuitous, since, as mentioned in Section 2.3, the lowest-energy bulk TA phonon mode in this supercell has a similar energy. The low-energy tail that represents the coupling to long-range phonons (<60 meV) is completely missing for this supercell, and the total HR factor S = 3.02 (inset of figure 2) is 20% smaller than the converged value S = 3.67.
In the case of a larger 3 × 3 × 3 supercell, the most dominant vibration is the 45.9 meV mode. This result is an artifact resulting from the use of a small cell, since this vibration corresponds to the lowest-energy Γ-point bulk TA phonon-it is not an actual defect-derived mode. While the total HR factor increases to 3.27, S(hω) is still far from converged. This emphasizes possible dangers in drawing conclusions about local vibrational modes from small-size supercells [20].
When increasing the size of the supercell further, S(hω) slowly attains its converged form. figure 3 shows that the spectral function is essentially converged for the two largest supercells we use, even though there are still apparent changes in individual partial HR factors S k . The peak of S(hω) occurs athω 0 = 65 meV, in excellent agreement with experimental findings (see Sec. 4.3). This is the first time that theoretical calculations yield the energy of the peak decisively. Interestingly, the total HR factor, i.e. the integral of S(hω), is within ∼1% of the converged value already for the 4 × 4 × 4 supercell. Figure 2 also allows us to draw the following conclusions about lattice distortions or, equivalently, coupling to phonons, that occur during the 3 E → 3 A 2 optical transition: (i) The 65 meV vibration is not a localized phonon mode, but a defect-induced vibrational resonance: it occurs within the spectrum of bulk phonon modes (0-167 meV). In figure 2 this result is evident from the fact that for larger supercells this mode splits into many closely spaced modes, with a simultaneous decrease of their absolute contributions. The 65 meV resonance is induced by the NV centre itself, and cannot be understood solely by considering bulk phonon spectrum. This is demonstrated in Figure 3 and discussed in more detail in section 4.3.
(ii) In agreement with a general theory of vibrational broadening of luminescence lines [41], the spectral function is linear for small energies, i.e., S(hω) = αhω for ω → 0. Indeed, partial HR factors corresponding to acoustic modes scale like 1/ω, which, multiplied with the density of states of acoustic modes ∼ ω 2 , yields this linear dependence. While this general behaviour is known [41], we emphasize that the prefactor to the linear dependence is system dependent, and only accurate atomistic calculations such as the ones presented here can provide the actual value. In our case we obtain α ≈ 3.6×10 −4 meV −2 = 360 eV −2 . Interaction via acoustic phonons has been recently proposed as a promising mechanism to couple two NV centres in nanodiamonds [57]. The coupling of isolated qubits is essential for any quantum computing protocol. Our calculations provide information about the coupling of NV centers to acoustic phonons in bulk diamond, and can be useful pursuing the ideas proposed in Ref. [57] ideas further.
(iii) 99% of the lattice distortions due to the optical transition, as quantified by their contribution to S(hω), occur within ∼ 12Å of the NV centre. This follows from our finding that the total HR factor for the 4×4×4 supercell is within 1% of the converged value. However, long-range relaxations, while contributing little to the total HR factor S, are manifest in the low-frequency part of S(hω), and are actually observed in the luminescence lineshape (see Section 4.3).
In figure 3 we show a comparison of S(hω) calculated using three different approaches. From here on we use the following notation when we refer to our calculations: (i) "HSE" refers to calculations where atomic displacements or forces in equations (6) and (7) are calculated using the HSE hybrid functional, but vibrational modes are calculated using the PBE functional. As discussed in Section 2.3 and Appendix A, calculations for smaller supercells show that vibrational modes calculated at the PBE level are very similar to HSE results. (ii) "PBE" refers to calculations in which all quantities are determined at the PBE level. In both (i) and (ii), vibrational modes correspond to the defect system. (iii) "Bulk phonons" refers to calculations in which atomic distortions or forces were determined at the HSE level, as in (i), but vibrational modes correspond to those of the unperturbed host. The comparison of (i) and (iii) should inform us whether the introduction of the defect modifies the vibrational spectrum, and whether the phonon sideband can be understood by considering bulk modes alone.
S(hω), calculated at the PBE level, is qualitatively very similar to the HSE result. The function has a peak athω = 64 meV, but the absolute value of S(hω) is smaller for  almost all energies. In particular, the total HR factor is S = 2.78, a quarter smaller than in HSE. In contrast, when the bulk phonon spectrum is used, S(hω) is even qualitatively completely different. In this case the spectral function closely follows the density of vibrational states of bulk diamond [54,58], with a pronounced peak athω ≈150 meV. The total HR factor is 4.48 in this case. However, the coupling to low-energy (<20 meV) acoustic modes is very similar to cases (i) and (ii); indeed long-range phonons are expected to be little affected by the presence of the defect.

Comparison with experiment: luminescence lineshape and Huang-Rhys factors
In figure 4 we compare the luminescence lineshape L(hω) (equations (2) and (8)), calculated using the HSE functional, with the experimental one. The agreement between theory and experiment is extremely good. Not only is the overall shape of the luminescence band described correctly, but all the specific features are described very accurately. In particular: (i) The weight of the ZPL of the theoretical spectrum w ZP L = 3.8% is very close to the experimental result w ZP L = 3.2%. Both of these quantities have been determined directly from luminescence lineshapes shown in figure 4, as discussed in Section 2.2. The theoretical Huang-Rhys factor S = 3.67 is thus also very close to the experimental HR factor S = 3.85 ± 0.05; the latter has been extracted from the experimental spectrum as described in Sec. 2.2.
(ii) Both the experimental and the theoretical band show about 4 increasingly broad  phonon replicas. The theoretical phonon frequencyhω 0 =65 meV is in very good agreement with the experimental valuehω 0 =64 meV.
(iii) The fine structure near the ZPL, which is representative of the coupling to acoustic phonons, agrees closely.
We conclude that calculations based on hybrid density functionals describe the vibrational properties and the luminescence lineshape of NV centres with a very high accuracy.
In figure 5 we present luminescence lineshapes calculated using all the three different theoretical approaches discussed in Section 4.2. The experimental curve and the one that corresponds to the HSE functional are the same as those in figure 4. The lineshape calculated at the PBE level is qualitatively similar to the HSE one, but there are quantitative differences. In particular, the weights of the first two phonon replicas are larger, and the overall band is narrower. Figure 5 also shows that when bulk phonons are used instead, the calculated luminescence lineshape bears no resemblance to the experimental curve: it is much broader and has a very different fine structure. This result clearly shows that the consideration of the bulk phonon spectrum is not sufficient to understand the phonon sideband, challenging the discussion of Ref. [29]. Taking into account vibrational modes of the defect system is essential.

Analysis: localized vs. delocalized phonon modes
In Section 4.2 we mentioned that the 65 meV phonon that dominates the phonon sideband, is not a localized mode, but rather a vibrational resonance. This means  We illustrate the fact that the 65 meV is a vibrational resonance in the following way. For each defect supercell studied we choose the individual phonon mode that has the largest Huang-Rhys factor in the energy range 49 − 81 meV (right axis, figure 2). The energy range corresponds to a FWHM of the 65 meV peak in the converged function S(hω). In figure 6 (a), we plot this largest value of the partial Huang-Rhys factor S k as a function of the supercell size N. The value of S k for this mode decreases steadily, albeit with some oscillations, as a function of supercell size. Since the total HR factor does not change much as the system size grows, the decrease of this particular S k is compensated by an increase in other phonon modes ( figure 2). This is a signature of a vibrational resonance, which is also called a quasi-local mode.
To gain more insight, we study the inverse participation ratio (IPR) for the mode k [59]: where IP R k defined in this way measures the number of atoms onto which the vibrational mode is localized. If, e.g., only one atom vibrates for a given mode, IP R = 1. If all M atoms in the supercell vibrate with the same amplitude, IP R = M. Note that the definition in equation (12) is different from the one used in Ref. [20] to analyze vibrational modes of the NV centre in a 216-atom supercell, and is more in line with the traditional definition [59]. In figure 6(b), the IP R k for the most pronounced mode in the energy range 49 − 81 meV is shown as a function of the supercell size N. For all supercell sizes IP R k is but a fraction of the total number of atoms M, but steadily increases with N, albeit with similar oscillations as for S k . This underpins the finding that the 65 meV mode is a vibrational resonance. This resonance represents the lion's share of the distortion of the defect geometry (cf. equations (6) and (7)). It has the largest amplitude on the four atoms surrounding the vacancy, and the vibrational pattern is shown in Fig. 6(c). The N atom is vibrating along the defect axis, while the vibrational vectors of the C atoms form an angle of ∼110 • with this axis.
By analyzing partial Huang-Rhys factors and inverse participation ratios of all the modes we were able to identify a few other, weaker resonances. These are modes with frequencies 161, 134, and 120 meV (in the order of decreasing localization). All of these weaker resonances were recently identified in the experiment of Kehayias et al. [29]. The 153 meV resonance seen in the same experiment is not very pronounced in our calculations. As a measure of localization we define the "localization ratio" β, which is the ratio of the number of atoms in the supercell (M = 8N 3 ) to the largest IP R k corresponding to one of these resonances: We obtain the actual value of β k by fitting the IP R k for a given mode with a function 8N 3 /β k (see figure 6(b)). The larger the ratio β k , the more pronounced the resonance. For a truly localized mode in the limit M → ∞, β k would be infinite, since for a localized mode IP R k remains constant as M increases. The results are summarized in Table 1.
For example, the localization ratio β k for the 65 meV mode is ∼11. Values for the "localization ratio" should be considered as rough estimates, but they are useful when comparing different modes. Together with these vibrational resonances, we do find one truly localized defectinduced phonon mode. In figure 6(d) we show S k as a function of the supercell size for a phonon mode with a frequency ≈167 meV, which is slightly (≈0.2 meV) above the theoretical bulk phonon spectrum. Increasing the size of the system, S k approaches a constant value of ≈0.02. When the size of the supercell grows, the IP R k of this mode also approaches a constant value of ∼80 (figure 6(e)). In analogy with shallow defect levels with energies close to bulk band edges, one could name this mode a shallow defect-localized vibration. While this mode is "shallow", half of its total weight is distributed over 6 carbon atoms: 3 that are immediately adjacent to the vacancy, and 3 more that are nearest neighbours of the first trio along the defect axis. The vibrational pattern associated with this vibration is shown in figure 6(f). It is an optical mode with vibrational vectors of all atoms only slightly off the z direction (by ∼14 • ) due to the influence of the defect. The participation of the nitrogen is negligible in this vibration.
The 167 meV mode contributes less than 1 % to the total HR factor of 3.67, and therefore its role in the formation of the phonon sideband is not very significant. However, since this is a truly localized vibrational mode, it can play an important role in other physical processes at nitrogen-vacancy centres. Kehayias et al. [29] recently found that a phonon mode that has the signature of a localized vibration and an experimental energy of 169 meV plays a noticeable role in the infrared transition 1 E → 1 A 1 . Due to very similar atomic geometries of the 3 A 2 , 1 A 1 and 1 E electronic states [11,14,15,29] we suggest that the localized phonon mode found in our current study is the same as the one observed in the experiments of Kehayias et al. [29].

Conclusions
In this work we have developed a first-principles methodology to calculate the vibrational structure of defect luminescence bands. Both localized, quasi-localized, and bulk phonons are taken into account on equal footing. The methodology was applied to study the phonon sideband pertaining to the 1.945 eV spin-triplet transition at nitrogenvacancy centres in diamond. Calculations based on hybrid density functional theory yield a luminescence lineshape and all related parameters that are in excellent agreement with experiment. The phonon sideband is dominated by a vibrational resonance with an energy of ∼ 65 meV, but a few other, weaker resonances, are also identified. 99 % of all atomic relaxations that contribute to the phonon sideband occur within ∼12 A of the defect, but the interaction with long-range acoustic phonons is also directly manifest in the luminescence spectra close to the zero-phonon line. We find a truly localized phonon mode slightly above the phonon spectrum of bulk diamond. While this mode, being localized on ∼ 75 atoms, contributes little to the spin-triplet optical transition, it can play an important role in other physical processes at this defect, as recent experiments suggest [29]. Our findings provide a deeper understanding of the coupling of electronic states to a 1 phonon states at nitrogen-vacancy centres. The success of the computational methodology developed here provides confidence that it can be fruitfully applied to other systems of high current interest that exhibit a complex vibrational structure of luminescence bands [21,22,23,24,25,26]. A related quantity (∆R) 2 = αi ∆R 2 αi . is also useful in analyzing theoretical results, and can be alternatively used as a measure of atomic displacements during optical excitation. The plot that shows the dependence of total energies in the ground and the excited states E {e,g} as a function of Q is called the configuration coordinate (cc) diagram [32] (cf. figure 1(c)).
In figure A1 we present an explicit calculation of the 1D cc diagram for the NV centre (results from the 4 × 4 × 4 supercell were used). The HSE calculations (filled disks) yield ∆Q HSE =0.71Å·amu 1/2 and ∆R HSE =0.20Å. The PBE calculations (open disks) yield somewhat smaller values, ∆Q P BE =0.62Å·amu 1/2 and ∆R P BE =0.18Å. (In passing, we note that in their seminal paper Davies and Hamer [16] also estimated the total displacement ∆R based on a simple model for the defect. Despite the fact that their model turned out to be not entirely correct, their estimated ∆R = 0.18Å is astonishingly close to accurate first-principles results.) In order to more meaningfully compare HSE and PBE results, we show the 1D cc diagram calculated at the PBE level on the same graph, but shift the potential energy curve of the excited state horizontally to Q = ∆Q HSE . Both the HSE and PBE curves are adjusted vertically to match the experimental E ZP L = 1.945 eV, as discussed in Section 3. A simple visual inspection of figure A1 then shows that if plotted this way the potential energy curves determined in the two approaches lie virtually on top of each other.
More quantitatively, we have performed numerical fits to these one-dimensional potential energy curves using the function E(Q) = 1/2Ω 2 Q 2 + βQ 3 . It can be easily shown that Ω determined in this way is the mean square average of all the phonon modes contributing to the phonon sideband, the weight of phonon mode k being given by q 2 k (equations (6) or (7)). For the two functionals, these average frequencies differ by 1% in the ground state, and 1.6% in the excited state, and in all cases the coefficient β is essentially negligible. These findings justify the assumptions at the beginning of this section.
The similarity of vibrational modes calculated in PBE and HSE can also be demonstrated by a direct calculation of the vibrational spectrum of the supercell. Because of the high computational cost of the HSE calculation, we have performed this calculation only for the smallest 2 × 2 × 2 supercell. Vibrational modes and frequencies calculated using the two functionals are indeed very similar, supporting the conclusion achieved by analysing figure A1. Therefore, it is a very good approximation to use vibrational modes calculated at the PBE level in all calculations, and we adopt it for the present study.
The main difference between PBE and HSE functionals are the atomic relaxations ∆Q (or R e − R g ). It is because of this difference that spectral functions of electronphonon coupling S(hω) and total Huang-Rhys factors (figure 3) are different in the two approaches.

Appendix B. Calculations for very large supercells
A direct evaluation for the dilute limit, i.e., the use of very large supercells that would yield a converged S(hω), is nearly impossible not only for an HSE hybrid functional, but also for a less expensive PBE functional. This applies, in particular, to the calculation of vibrational modes. To obtain results for large systems, we have used the following methodology.
For the two smallest supercells, i.e., 2×2×2 (64 lattice sites) and 3×3×3 (216 sites) a direct approach has been applied. In particular, partial HR factors S k have been evaluated using equations (5) and (6). Vibrational modes and frequencies have been determined by diagonalizing dynamical matrices obtained directly from the supercell calculation.
For larger supercells we have used an alternative approach. First, we performed constrained geometry relaxations for the 4×4×4 supercell (512 lattice sites) with a defect in the middle of the supercell. In the calculations for the excited state the atoms within 3Å of the vacancy were allowed to relax, while the remaining atoms were kept in their ideal lattice positions ( figure B1(a)). This procedure yields zero forces F e;αi within this chosen radius (white inner circle in figure B1(a)). The forces F e;αi are non-zero for the atoms that were kept in their bulk positions. However, actual calculations indicate that the forces are appreciable only within ∼7Å away from the vacancy (i.e., about 4Å away from the atoms that were allowed to relax, indicated as an outer yellow circle in figure B1(a)). The crucial point is that there are no net forces exerted on atoms that are at the boundary of this supercell. Subsequently, we kept the geometry of the defect as optimized according to this procedure, but determined the forces F g;αi on atoms when the electronic state is changed to that of the ground state ( figure B1(b)). The resulting forces are non-zero in the entire region, shown as a yellow circle in figure B1(b), but essentially vanish beyond it. These two calculations yield the difference (F e;αi − F g;αi ) needed to determine partial HR factors via equations (5) and (7). The fact that the forces are essentially zero beyond the yellow circle in figure B1 (a) and (b) does not mean that these atoms stay in their bulk position. If the constraints were relieved, these atoms would move to find their equilibrium positions during a full geometry optimization, since the movement of their neighbours during this optimization would result in a build-up of forces. The point is that equation (7) includes this automatically.
To determine the vibrational spectrum for this supercell we have made use of the fact that in covalent semiconductors the dynamical matrix is short-ranged. For example, tests show that the inclusion of five nearest-neighbour shells is sufficient to obtain a vibrational spectrum of bulk diamond. Thus, if the atoms in the defect system are further away from each other than 4Å, we set the dynamical matrix element to 0. Otherwise, if one of the atoms is within 2.5Å of the vacancy or the nitrogen atom, the matrix element is taken from the calculation of the 3×3×3 supercell. For other pairs we use bulk diamond values. The choice or parameters leads to a converged vibrational spectrum of the 4×4×4 defect supercell. A similar procedure to construct the dynamical matrix was recently used for defects in GaN by Shi and Wang [60]. Partial HR factors are then determined from equations (5) and (7).
For larger supercells N×N×N (N > 4) the procedure is as follows. First, the two 4 × 4 × 4 defect supercells from the previous steps were embedded into a larger N×N×N supercell for both the excited ( figure B1(a)) and the ground state ( figure  B1(d)). This automatically yields the force difference (F e;αi − F g;αi ) for this larger system. The wavefunction of the NV centre is very localized, and thus we might expect that the actual calculation for a larger supercell, were it possible, would yield very similar force difference (F e;αi − F g;αi ). It is now clear why the formulation based on equation (7) is hugely advantageous. When the atoms away from the defect are kept fixed in ideal bulk positions during the geometry optimization in figure B1(a) this excludes the elastic interaction between periodically repeated replicas of the defect, and eventually enables embedding this smaller system into a large one. Vibrational spectra for these larger supercells have been determined in the same way as for the 4×4×4 supercell. Using these techniques we were able to study supercells as large as 11×11×11 (10648 lattice sites). Our procedure procedure yields results of nearly the same quality as if explicit first-principles calculations were performed for these large supercells. Appendix C. NV centre in the 13 C diamond lattice In all above discussions, we have considered the NV centre in natural diamond, with the atomic mass of carbon atoms set to 12.0111 a.m.u. This is useful when comparing calculations to ensemble measurements, as done in the present work. Our results apply to NV centres in 12 C diamond as well. We have verified that the frequency of the dominant phonon modehω 0 =65.0 meV, the Huang-Rhys factor S=3.67, and the Debye-Waller factor w ZP L =2.4% in 12 C diamond are within 0.1 % of the values in natural diamond.
It is of interest, however, to study NV centres in 13 C diamond, where the different mass of carbon atoms may lead to more noticeable changes. The comparison of the main parameters pertaining to the vibrational sideband of NV centres in natural (or 12 C) diamond vs. 13 C diamond is shown in Table C1. Calculations have been performed at the HSE level. Compared to NV centres in natural diamond, the total HR factor Table C1. Comparison of calculated parameters pertaining to the phonon sideband in natural and 13 C diamond. m C is the mass of the carbon atom,hω is the energy of the most pronounced phonon mode, S is the total Huang-Rhys factors, and w ZP L is the weight of the zero-phonon line.