Supplementary information for “ Automated effective band structures for defective and mismatched supercells ”

where the transformation or supercell matrix M is nonsingular with integer components, which implies that the SC is commensurate to the pc. The determinant of M is the multiplicity N of the SC, i.e. the ratio of the respective volumes VSC/vpc. The hexagonal pc and orthorhombic SC of the two-dimensional honeycomb net is depicted in figure A.1a. In reciprocal space, there are consequently two distinct Brillouin zones (see figure A.1b): the primitive cell Brillouin zone (pbz) and the smaller supercell Brillouin zone (SBZ). Their respective basis vectors~bi (~Bi) are again connected by the supercell matrix M: ~B = M−1~b. (A.2)


Introduction
E versus k band structures and derived quantities such as effective mass or group velocity are valuable concepts to describe the electronic properties of crystalline materials. The E( k) dispersion is usually evaluated along certain high-symmetry paths in the Brillouin zone (BZ) of a few-atom primitive cell (pc). These small cells cannot usually accommodate defects at experimentally relevant concentrations, or represent composite structures with significant lattice mismatch. In plane wave DFT [1,2], which requires periodicity, such structures are usually represented in supercells (SC). This corresponds to a BZ with a fraction of the size of that of the primitive cell and a multiple of the number of bands. Dispersion relations within this reduced BZ are cumbersome to interpret, promoting the use of integrated constructs such as the density of states (DOS). Figure 1 Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. 2 Current address: Centre for Predictive Modelling, School of Engineering, University of Warwick, Library Road, Coventry CV4 7AL, UK.
illustrates this problem: The bands of the 6×6 perfect supercell of graphene overlap and obscure the underlying physics. In calculations on defective or disordered supercells, one often wishes to make meaningful connection to (for example) band dispersions probed in angular-resolved photoemission measurements [3] in terms of smearing or shifting of bands in the primitive BZ.
There are several situations that require supercells: • random alloys, • interfaces between materials with different lattice constants, • defects.
In recent years, several methods to unfold supercell bandstructures have been proposed to map energy eigenvalues obtained from supercell calculations into an effective band structure (EBS) in the BZ of the primitive cell [4][5][6][7][8][9]. Of those, the scheme of Popescu and Zunger [6,8] is particularly attractive in the context of plane-wave pseudopotential calculations. They described their method primarily in the context of random alloys. In this paper, we demonstrate the utility of this scheme for supercell calculations in general and present an implementation within the popular plane wave DFT (a) primitive cell (b) 6 × 6 supercell The band energies are calculated at 42 non-unique wave vectors along this path and are represented by red plus symbols. The solid lines denote the 'traditional' representation of the (primitive cell) graphene band structure, which is obtained by connecting those band energies that originate from the same orbitals. As the supercell BZ is smaller than the primitive cell BZ, the path through the special -M-Kpoints of the SC is shorter. The SC path forms part of the pc path and is highlighted by a hatched background.
code CASTEP [10,11]. We demonstrate the efficacy of this approach for the study of band dispersion in defective, doped and incommensurate structures. We will first summarize the supercell EBS method (section 2). In section 3, we describe our implementation called bs sc2pc. Section 4 contains three representative examples using bs sc2pc to calculate effective band structures.

The supercell EBS method
The effective bandstructure method uses band unfolding techniques [4][5][6][7][8][9] to restore the E( k) picture in the primitive cell. It determines the spectral weight of the supercell eigenvalues and constructs a spectral function A( k, E) continuous in energy E, which replaces the discrete eigenvalue spectrum E( k) of the primitive cell. The method evolves from earlier techniques, such as the one by Wang et al [4], which was used to determine band edge states in disordered alloys. Zhang and Wang [7] use the generalised moment method to calculate A( k, E) for a narrow energy window at high-symmetry k points. By combining the spectral composition present in these descriptions with a k unfolding method, spectral functions can be obtained for the same k values and energy ranges as in 'traditional' pc band structures, which allows a direct comparison between the two. The core of the EBS method is obtaining the spectral weight of a SC energy eigenvalue E( K) by projecting the corresponding eigenstate on all pc Bloch states corresponding to the equivalent k (see supplementary information (stacks.iop.org/JPhysCM/26/485501/mmedia) and [8] for further details). This is a measure of the Bloch character of the pc state preserved in the SC, or, conversely, at which of several pc k i that map to a single SC K an energy eigenvalue contributes. From this information, the spectral function A( k, E) can be derived, which can be seen as a primitive cell representation of the supercell band structure. It should be noted, that the original derivation of Popescu and Zunger has to be modified for ultrasoft pseudopotentials [12] that fulfil a generalized orthonormality condition [13]. The necessary minor adjustments are described in the supplementary information.

Implementation
The EBS method requires full eigenstate information, i.e. both the eigenfunctions represented as plane wave coefficients and the band energies as eigenvalues. For larger systems, this is a considerable amount of data and to our knowledge there is no portable standard to encode such information. This makes it infeasible to implement the EBS method as a generic postprocessing tool to be used with output from various DFT codes. On the contrary, it is rather reasonable to implement the EBS method within a band structure calculation, so that the information can be drawn off at the source. As the effort of implementing a band structure DFT code from scratch is prohibitive, we instead modify the well-established CASTEP [10,11] package. In this way, our efforts could focus on programming the calculation of spectral weights, using preexisting functionality wherever possible.
The bs sc2pc utility automatically generates the supercell K points from primitive cell k point path, taking the symmetry of pc and SC into account. From the SC band energies E( K) it then extracts A( k i , E). Further implementation details can be found in the supplementary information (stacks.iop.org/JPhysCM/26/485501/mmedia) to this article.

Examples
We demonstrate the functionality of bs sc2pc in three systems which each require large supercells to capture either structural or substitutional defects or the combination of two mismatched structures. The three examples illustrate the utility of our tool for one-, two-and 3D systems.

Bismuth substitution in crystalline silicon
The first system is a single bismuth substitutional defect in bulk crystalline silicon. Bismuth-doped silicon is a promising candidate material for quantum information applications due to an anomalously strong hyperfine coupling [14][15][16][17]. A large supercell is required to eliminate interactions between periodic images of the defect and hence recover the electronic structure of an isolated substitution [18]. Here, the supercell is a perfect 3 × 3 × 3 multiple of the cubic Si unit cell with eight atoms, which is in turn a four-fold supercell of the facecentered cubic (fcc) primitive cell with two atoms. One of the resulting 216 Si atoms was replaced by a Bismuth atom and the atomic positions were optimized with CASTEP using a variation of the BFGS method [19], while keeping the cell dimensions fixed at the bulk Si value. This emulates the embedding of a single defect in the bulk material. The presence of the significantly larger Bi atom pushes the nearest neighbor silicon atoms away by almost 0.2 Å. The calculations used generalized gradient approximation (GGA) exchangecorrelation (XC) functionals in the PBE formulation [20] and ultrasoft (US) pseudopotentials [12] created on the fly using the standard pseudo-atom definitions distributed with the academic release of CASTEP 6.1. We used a plane-wave cutoff of (500 eV) and a 2 × 2 × 2 Monkhorst-Pack k-point grid [21]. At this level of representation atomic forces around the unrelaxed defect were converged to within 0.02 eV Å −1 with respect to both K point density and plane-wave cutoff energy. This compares to a geometry relaxation tolerance of 0.05 eV/atom. Forces on silicon atoms maximally distant from the bismuth site were less than 0.03 eV/atom, demonstrating system size convergence to a similar level. The effective band structure of the final configuration was then calculated along the L--X-U,K-path in the pbz using the same parameters. This band structure is shown in figure 2. For comparison, the band structure of silicon in the primitive cell calculated under the same condition is shown in the background.
It should be noted that the experimental band gap of silicon of 1.17 eV [22] is not reproduced in GGA calculations [23]. (While there are empirical or expensive methods to correct for this deficiency [23], this is of no relevance to the functionality of the bs sc2pc program.) This trend is confirmed by our pc band structure, which shows an indirect band gap of 0.60 eV between the valence band maximum at the point and the conduction band minimum along the -X-azimuth. The introduction of the bismuth atom in the supercell leaves most of the band structure intact; the highest intensities in the EBS coincide with the lines from the pc band structure.
There are subtle changes compared to the primitive cell: First, there is a nonvanishing spectral intensity across wide areas in the (E, k) plane. These low but non-zero contributions show oscillations with the underlying supercell periodicity. They are most remarkable near the Fermi energy E F , where they seem like echos projected over from SC-equivalent K points. Secondly, there is an additional low-energy band at just below −12 eV. Most notable, however, is the change to the band gap: The conduction band minimum is now at the X point at a band gap of only 0.55 eV. The new conduction band minimum is formed by a partial band (≈20% of a full band) splitting off from the main silicon conduction band, which remains at 0.72 eV. This information cannot be obtained in this form from a DOS calculation.
These results could then be used as a starting point for a further investigation into the electronic structure of a real defective system with sufficiently diluted dopants (precluding any interaction between them). In that case, the band structure for comparison to experiment can be determined as a weighted average of defective supercell band structures and that of the dopant-free (perfect) primitive cell. In addition, the intuitive representation of the electronic structure provided by the bs sc2pc tool could serve to study the known limitations of (a) (b) basic exchange-correlation functionals and their mitigation by more advanced descriptions. This information might be lost in the folded supercell band structures.

Stone-Wales defect in graphene
Stone-Wales (SW) defects [24] are topological defects in sp 2bonded carbon materials. The basic SW defect in graphene is a point defect where one bond of the honeycomb net undergoes a 90 • in-plane rotation. This replaces four six-rings of carbon by a pair of each five-and seven-rings. It has previously been shown that the most favourable configuration of this defect introduces an out-of-plane buckling of the graphene sheet [25]. We reproduced these results quantitatively for a basic SW defect in a hexagonal 6 × 6 supercell (72 atoms) with a 28 Å out-of-plane vacuum gap, using the PW91 GGA XC-functionals [26] with US pseudopotentials at a planewave cutoff of 560 eV on a 2 × 2 × 1 MP grid. Typical forces are then converged to within 5 meV Å −1 with respect to vacuum gap, cutoff and k point density. The defect is however not converged with respect to the supercell size; our simulations thus correspond to a periodic array of Stone-Wales defects (see [25] for a comparison of different supercells). We optimized the atomic positions using the LBFGS method [27]. In the final structure (shown in figure 3), the difference along the z axis between the highest and the lowest carbon atom in the supercell is 1.31 Å, in agreement with published results [25].
The key feature of the graphene band structure is the relativistic (linear) dispersion of the bands near the Dirac (K) point, where valence and conduction band meet to make graphene a zero-gap semiconductor with extremely high carrier mobility. These properties are captured almost perfectly in band structure calculations in graphene, which are shown as solid lines in the background of figure 4. The EBS of the Stone-Wales shown in the same figure shows distinct changes: Several band degeneracies are lifted by the introduction of the topological defect (top valence band at , lowest two bands at K) and also introducing band splitting away from the special points. In fact, from an EBS it is impossible to distinguish between band splitting and lifting a degeneracy; the information that allows assigning energy eigenvalues at different k-points to a single band and which is used to connect these eigenvalues in 'classical' band structure plots, is lost in the process of creating an EBS.
The magnification of the SW defect EBS in figure 4 shows the changes around the Dirac point: Here the valence and conduction band both split into three to four branches of different weights, with a significant share of the spectral weight opening up a partial band gap of around 0.2 eV and showing a band curvature. This could suggest that the presence of Stone-Wales defects adversely affects the conduction properties of graphene.

Potassium iodide in a carbon nanotube
The central cavity of carbon nanotubes (CNT) can be used as a template for the controlled growth of 1D crystals of binary halides [28,29], such as potassium iodide [30]. The constraints imposed by confinement and reduced coordination are responsible for drastic structural changes in these crystals compared to the bulk [31,32]. We study a (10, 10) singlewalled CNT filled with cubes of KI, where the ions sit on alternating corners, repeated along the CNT axis. As the unit cell lengths along the axis of nanotube (2.47 Å) and the KI nanowire (6.86 Å) are different, a DFT calculation with periodic boundary conditions requires either an unrealisticly high lattice mismatch strain, a vacuum gap in tube or filling [33,34], or a supercell. We used a structure consisting of eleven units of (10, 10)-CNT and four units of KI (472 total atoms) with a mismatch strain of 1.06%, which is a reasonable compromise between system size and mismatch strain. This is a significant improvement over previous band structure calculations by Yam et al [35], who determined the band structure for a single unit of KI at a lattice mismatch strain of 7.6%. The initial structure was optimized with the DFT package QUANTUM ESPRESSO [36], using PBE XC functionals [20] and CASTEP ultrasoft pseudopotentials [12] at a cut-off energy of 600 eV with a 16 Å vacuum gap. At these settings, forces are converged on a similar scale as in the previous examples. The final unit cell has a length of 27.14 Å; large enough that a computationally less demanding gamma-point-only calculation is sufficient for the preliminary self-consistent calculation. A section of the final unit cell is shown in figure 5. 4 See footnote 3. We then used bs sc2pc with identical settings and pseudopotentials to determine the effective band structure of the filled nanotube ( figure 6). For comparison, the band structure of an empty (10, 10)-CNT is shown in the background. To obtain this plot, the supercell periodicity was exploited in order to dramatically reduce the computational effort. The 23 k i sampling points along the CNT axis in the pbz fold into only three unique K i = {0, 1 4 , 1 2 } B z , where B z is the SBZ reciprocal basis vector parallel to the CNT. The EBS reproduces all bands of the empty CNT pc, including the crossing bands at the Fermi energy which confirm the metallic character expected of an empty (10, 10)-CNT [37] also for the KI-filled version. There are a number of additional, almost dispersion-free bands visible (e.g. at around −2 eV, −12 eV and −14 eV), while there are no KI states near the Fermi energy. Previous studies suggested that there is a transfer of negative charge from the KI to the nanotube [33,34,38]. Our simulations quantitatively confirm the charge transfer from published work [38], but this appears to have no significant effect on the EBS of the system.

Conclusion
In this work, we demonstrated the bs sc2pc program to determine effective band structures for supercells, based on the band structure unfolding method described by Popescu [8]. We demonstrated its functionality in three systems: a substitutional point defect in a bulk material, a topological defect in a 2D sheet and a mismatched supercell in a 1D nanotube. In each case, bs sc2pc could provide an EBS that offers insights into the electronic properties of these materials, which would be difficult to obtain from standard band struture calculations. The program is included in the current release of the DFT package CASTEP.
In a broader context, it would be desirable to provide a generic standalone supercell band structure unfolding utility that could be used with various plane-wave DFT codes. However, this would require access to the full wave function information (coefficients for spin, k-point, band and plane wave) in these codes, for which there currently exists no common format or interface. In addition, the symmetry analysis used to reduce the number of supercell k-points makes extensive use of the facilities provided by the CASTEP code and would need to be re-implemented for a standalone utility. This would make creating a generic band structure unfolding tool a challenging endeavor, but the instructive representation of effective band structures could still justify this effort.