Introduction

The increasing role of quantum chemical calculations in drug and materials design has led to a demand for methods that can describe the electronic structures of large and complex systems. Semiempirical methods based on the neglect of diatomic differential overlap (NDDO) approximation (e.g., the MNDO [1, 2], MNDO/d [3], AM1 [4], AM1* [5], and PMx [6–8] methods) are important representatives of such approaches. Many of these methods have been implemented in the massively parallel program EMPIRE [9], which enables the full quantum-mechanical treatment of systems containing 100,000 atoms or more.

Periodic boundary conditions (PBC) enable quantum chemical programs to treat condensed-phase systems, such as proteins in a periodic water box or solids. This allows molecular materials to be studied in their “native” environment, instead of comparing experimental bulk properties with gas-phase monomer calculations. For semiempirical methods, the most practical way of implementing PBC is the cyclic-cluster approach [10–12] in which the system is approximated by a supercell and by imposing Born–von Kármán boundary conditions [13]. Using a large unit cell allows the calculation to be performed entirely in real space. This is easily affordable because of the generally low computational cost of NDDO calculations. The main advantage of this technique is that program features like the calculation of local properties [14] or excited states are directly transferable from nonperiodic calculations [15]. We have, for example, used periodic EMPIRE calculations to model amorphous carbon [16].

EMPIRE, which was designed specifically for calculations on systems with very large numbers of atoms, is also well suited to systems with very large unit cells (e.g., disordered and amorphous systems). EMPIRE can, for example, be used in combination with a classical molecular dynamics (MD) code to perform electronic structure calculations on snapshots from an MD run on a periodic system. In the first section of this paper, we discuss the implementation of periodic boundary conditions in EMPIRE. In the second, the performance of the program is discussed briefly. Finally, some exemplary applications of large-scale periodic NDDO calculations are presented.

Implementation

Periodic calculations in EMPIRE are performed entirely in real space, so no major changes to the NDDO SCF algorithm were required. Only small adjustments are necessary in the treatment of the two-electron two-center integrals, the exchange energy, and the electrostatic interactions. These adjustments are discussed below; for more background information, we refer the reader to [10, 12].

Two-electron two-center integrals

The values of the two-electron two-center integrals γ_AB used in NDDO calculations (and the associated potential) decrease quickly with the distance between the centers but remain nonzero. Since these small values add up to unphysical, infinite potentials in a periodic system, they must be corrected [12]. This is achieved by introducing a Gaussian damping function that sets in at a cutoff value c_cut (the default is 10.0 bohr). The functional form for these integrals is (in atomic units)

$$ {\gamma}_{AB}=\frac{1}{r+{\mathrm{e}}^{-0.25{\left(r-{c}_{\mathrm{cut}}\right)}^2}{\left(\frac{1}{G_A}+\frac{1}{G_B}\right)}^2},\kern0.5em r>{c}_{\mathrm{cut}}, $$
(1)

where G_A and G_B are parameterized constants for elements A and B, respectively, and r is the distance between the two centers.
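For illustration, (1) can be evaluated in a few lines of Python. This is a sketch, not EMPIRE code; the G values used below are arbitrary placeholders rather than actual element parameters, and the undamped branch is the corresponding Klopman-type form implied by (1):

    import math

    def gamma_AB(r, G_A, G_B, c_cut=10.0):
        # Damped two-electron two-center integral of Eq. (1), atomic units.
        # G_A and G_B are element-specific parameters (placeholders here).
        add = (1.0 / G_A + 1.0 / G_B) ** 2
        if r > c_cut:
            # The Gaussian damping sets in at the cutoff and drives the
            # integral toward the point-charge limit 1/r.
            add *= math.exp(-0.25 * (r - c_cut) ** 2)
        return 1.0 / (r + add)

    # Beyond the cutoff, the integral rapidly approaches 1/r:
    for r in (5.0, 10.0, 12.0, 20.0):
        print(r, gamma_AB(r, G_A=0.5, G_B=0.5), 1.0 / r)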

Exchange interaction energy

The next adjustment concerns the two-electron two-center exchange integrals h_μν that appear in the Fock matrix. These terms depend on the density matrix elements P_μν. In a periodic calculation, the exchange interactions for an orbital centered on a given atom are only evaluated within the Wigner–Seitz cell surrounding it. The net result is the neglect of very weak exchange interactions with distant electrons, which causes no loss of accuracy [10].
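As a minimal sketch of the underlying geometric criterion, assuming an orthorhombic cell for simplicity (the Wigner–Seitz construction for general cells is more involved), restricting the evaluation to the Wigner–Seitz cell amounts to the familiar minimum-image convention:

    def min_image(disp, box):
        # Fold each displacement component into [-L/2, L/2]. The folded
        # vector points to the one periodic image of atom B that lies inside
        # the Wigner-Seitz cell centred on atom A; exchange terms between
        # orbitals on A and B are evaluated for that image only.
        return [d - L * round(d / L) for d, L in zip(disp, box)]

    # In a 10 x 10 x 10 cell, an atom 9.0 units away along x is represented
    # by its image at -1.0, which lies inside the Wigner-Seitz cell:
    print(min_image([9.0, 0.0, 0.0], [10.0, 10.0, 10.0]))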

Electrostatic interactions

MNDO-like NDDO methods describe electrostatic electron–electron and electron–core interactions using multipole–multipole interactions. In a periodic system, small interactions with an infinite number of distant charges lead to unphysical results. This can be alleviated by introducing a simple, one-parameter screening function.

Simply put, distant charges are relocated to an effective distance r_eff, which is a function of the actual distance r. The space around a charge is divided into three regions, delimited by a parameter α: at close distances (r < α), the actual and effective distances are equal (r_eff = r). At large distances (r > 2α), all charges are moved to a constant radius of 1.5α. In this manner, their effects cancel each other out due to symmetry [12]. In the intermediate region, the distance is scaled so as to satisfy the conditions

$$ {r}_{\mathrm{eff}}\left(\alpha \right)=\alpha $$
(2)

and

$$ {r}_{\mathrm{eff}}\left(2\alpha \right)=1.5\alpha . $$
(3)

This scaling function [12] is defined as

$$ {r}_{\mathrm{eff}}=-\frac{\alpha }{2}+2r-\frac{r^2}{2\alpha }. $$
(4)

MNDO-like methods treat charges as distributed multipoles, i.e., point charges at defined distances from a center. To keep this screening scheme completely consistent, the positions of all the individual point charges that make up distant multipoles would need to be scaled. This is undesirable because it would require calculating the distance between all atoms and all distributed multipole point charges. To avoid this, we introduce a second scaling function for the distance between the point charges of a multipole and the atom on which they are centered. At distances r < α, the multipoles are unaffected. At large distances r > 2α, all multipoles are reduced to point charges. In the intermediate region α < r < 2α, the multipoles are scaled by a factor λ(r), with the boundary conditions

$$ \lambda \left(\alpha \right)=1 $$
(5)

and

$$ \lambda \left(2\alpha \right)=0. $$
(6)

The corresponding function is the derivative of (4), which ensures that the constituent point charges of a multipole contract consistently with the local compression of the scaled distance:

$$ \lambda (r)=2-\frac{r}{\alpha }. $$
(7)

The effect of scaling the multipole size can be evaluated by considering three different scenarios for the interaction between a point charge and a dipole. The rigorous but impractical solution is to scale the positions of the constituent point charges of the dipole individually, whereas the simplest approach would be to scale only the center of the dipole. Finally, we can scale the center of the dipole according to (4) and the distances of the distributed point charges from their center by (7).

Figure 1 shows the positions of the distributed monopoles for all three cases (α = 15.0 Å). Clearly, the scaling function λ closely reproduces the exact scaling of the multipole distances. This can also be shown by considering the Coulomb energy of the interaction between the point charge and the dipole. The dependence of the absolute error in this energy (i.e., the difference between the unscaled or scaled case and the exact case) on the distance is also shown in Fig. 1. Below the cutoff value, all models are by definition equivalent. At the cutoff value, the scaled and unscaled errors are identical. With increasing distance, however, the error decreases in the scaled case and increases in the unscaled case.
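The behavior shown in Fig. 1 is easy to reproduce. The following self-contained sketch (our illustrative numbers, in the spirit of Fig. 1 with α = 15.0) compares the three scenarios for a unit point charge at the origin interacting with a unit dipole of length d centered at distance r:

    def r_eff(r, a):   # piecewise effective distance, Eqs. (2)-(4), as above
        return r if r < a else 1.5 * a if r > 2 * a \
            else -0.5 * a + 2.0 * r - r * r / (2.0 * a)

    def lam(r, a):     # piecewise multipole scaling, Eqs. (5)-(7), as above
        return 1.0 if r < a else 0.0 if r > 2 * a else 2.0 - r / a

    def dipole_energy(pos_plus, pos_minus):
        # Coulomb energy of a unit charge at the origin with charges +1/-1
        # placed at the given distances along the axis.
        return 1.0 / pos_plus - 1.0 / pos_minus

    a, d = 15.0, 1.0
    for r in (16.0, 20.0, 25.0, 29.0):
        exact = dipole_energy(r_eff(r + d / 2, a), r_eff(r - d / 2, a))
        scaled = dipole_energy(r_eff(r, a) + lam(r, a) * d / 2,
                               r_eff(r, a) - lam(r, a) * d / 2)
        unscaled = dipole_energy(r_eff(r, a) + d / 2, r_eff(r, a) - d / 2)
        print(r, abs(scaled - exact), abs(unscaled - exact))

Running this shows the trend described above: the error of the scaled model stays small throughout the intermediate region, whereas that of the unscaled model grows with distance.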

Fig. 1

Top: positions of distributed multipole point charges in the exact, scaled, and unscaled scenarios. Bottom: absolute error in the Coulomb energy for the scaled and unscaled cases

Performance

Setting up periodic calculations

Periodic EMPIRE calculations require little more input than nonperiodic ones: apart from the Cartesian coordinates of all atoms, a unit-cell vector is required for each periodic direction. Calculations can be periodic in one, two, or three dimensions. Since no k-space sampling is performed, the unit cell must be sufficiently large. The MOPAC manual suggests that 7–8 Å per repeat-unit vector should be sufficient for most compounds, and that larger unit cells should be used for highly conjugated π systems and small-band-gap materials [17]. It is good practice to check the convergence of the calculated results with respect to the size of the unit cell. Figure 2 shows the convergence of the calculated heat of formation with the unit-cell volume for diamond, ZnO, NaCl, and the adamantane molecular crystal. The unit-cell size required for convergence may differ for other properties: Bredow et al. showed that excitation energies require significantly larger cells than the ground-state energy [15].

Fig. 2

Energy convergence with respect to the unit-cell volume for diamond, ZnO, NaCl, and adamantane

The electrostatic screening parameter can be modified via the keyword ScreeningR, which sets the value of 2α in Å. The default of 30.0 Å is a conservative cutoff that corresponds to the MOPAC default. Smaller values reduce the computational cost, but it should be checked that the heat of formation is not affected by the change. The energy convergence and computational cost for different values of α are shown for diamond and ZnO in the “Electronic supplementary material” (ESM, Figs. S1 and S2).

Single-node OpenMP scaling

Table 1 shows timings for AM1-SCF calculations on differently sized diamond and ZnO unit cells performed with the single-node OMP version of EMPIRE. The corresponding speedups for different numbers of cores are plotted in Fig. 3. The scaling is quite efficient: the speedup factor exceeds 7 on eight cores for all systems investigated. The largest system considered here is the C512 unit cell, for which an SCF calculation takes less than 1 min on eight cores. This shows that periodic calculations with EMPIRE are easily affordable on a modern desktop computer (for multi-node timings, see Table 2 and Fig. 4).

Table 1 Wall-clock times for AM1-SCF calculations performed with the single-node OMP version of EMPIRE. These calculations were performed on a node consisting of two quad-core 2.83-GHz Intel® Xeon® E5440 processors with 8 GB of memory. No hyperthreading was used
Table 2 Wall-clock times for AM1-SCF calculations performed with the multi-node hybrid MPI/OMP version of EMPIRE. Each node was equipped with two six-core Intel® Xeon® 5650 “Westmere” chips; the nodes were connected by an Infiniband interconnect fabric with 40 Gbit/s bandwidth per link and direction. We used two MPI tasks per node and six OMP threads for each. No hyperthreading was used
Fig. 3

Relative speedup factors for OMP parallel calculations of differently sized diamond and ZnO unit cells, performed on a single node consisting of two quad-core 2.83-GHz Intel® Xeon® E5440 processors with 8 GB of memory. No hyperthreading was used

Fig. 4

Relative speedup factors for MPI parallel calculations of differently sized diamond unit cells, performed on the LiMa cluster. Each node was equipped with two six-core Intel® Xeon® 5650 “Westmere” chips; the nodes were connected by an Infiniband interconnect fabric with 40 Gbit/s bandwidth per link and direction. We used two MPI tasks per node and six OMP threads for each. No hyperthreading was used

Multi-node hybrid OMP/MPI scaling

The scaling of the hybrid OMP/MPI multi-node version of EMPIRE was tested on the LiMa cluster at the Regionales Rechenzentrum Erlangen using differently sized diamond unit cells from C1728 to C13824. Note that very large unit cells also require large amounts of memory, because the two-electron integrals, which are relatively expensive to calculate in periodic systems, are stored. It is therefore not possible to use the same reference number of nodes when determining the scaling for all of these systems; the speedup is always given relative to the lowest number of nodes feasible for a given system. Optimizing the SCF procedure for periodic calculations may improve the performance of EMPIRE on fewer nodes. As implemented, the calculations scale very well up to twice the minimum number of nodes; increasing the number of nodes further leads to a plateau (Fig. 4).

Application

The application of NDDO methods to crystalline materials has been thoroughly tested and evaluated by Stewart, and will therefore not be discussed here in any detail [18]. Instead, we would like to focus on two aspects unique to EMPIRE: firstly, the calculation of local properties; secondly, the fact that even unit cells with thousands of atoms can be treated easily.

Local properties

A local property is any property that can be derived from the wavefunction of a structure and mapped onto a real-space grid, such as the electron density and the molecular electrostatic potential (MEP). These can be calculated with most electronic structure codes. EMPIRE (in combination with an auxiliary program) gives access to several additional local properties derived from molecular orbitals and their energies. These are the local electron affinity (EA_L), local ionization energy (IE_L), electronegativity, and hardness, which have been used for biochemical QSPR studies and to predict the electron-transport properties of nanostructures [19–27].
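For orientation, IE_L corresponds to the average local ionization energy familiar from the literature. In terms of the occupied orbital densities ρ_i(r) and orbital energies ε_i, a sketch of the commonly used working definition is (EA_L is defined analogously from the virtual orbitals; sign conventions vary between implementations):

$$ {\mathrm{IE}}_{\mathrm{L}}\left(\mathbf{r}\right)=\frac{-{\sum}_i^{\mathrm{occ}}{\rho}_i\left(\mathbf{r}\right)\,{\varepsilon}_i}{{\sum}_i^{\mathrm{occ}}{\rho}_i\left(\mathbf{r}\right)} $$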

Figure 5 shows the molecular electrostatic potentials of a pristine and a defective ZnO\( \left(10\overline{1}0\right) \) surface. The nonpolar \( \left(10\overline{1}0\right) \) surface consists of rows of ZnO dimers that are separated by trenches [28]. The most abundant atomic defects on this surface are ZnO dimer vacancies [29]. The entire geometry was re-optimized with the MNDO/d Hamiltonian for both the pristine and the defective surface. The removal of one ZnO dimer clearly affects the electrostatic potential around the defect, making it necessary to choose a large unit cell to avoid interactions of the defect with its periodic image. This calculation required around 1 min per optimization step and converged in 32 steps. The cell contains 766 atoms and 3,064 electrons. Note also that the MEP does not depend on simplifications such as point-charge models. The multipole formalism used in MNDO-like techniques gives a very good representation of the MEP calculated at higher levels of theory, including anisotropic distributions around heavy atoms [30].

Fig. 5

Left: the MNDO/d molecular electrostatic potential (MEP) projected onto an electron isodensity (0.01 e Å⁻³) surface of a ZnO\( \left(10\overline{1}0\right) \) slab calculated with two-dimensional periodic boundary conditions. The image on the right shows the same slab after a ZnO dimer was removed from the surface. The surface color code ranges from −50.0 (blue) to 50.0 (red) kcal mol⁻¹

A recent application of periodic local property maps lies in the study of charge transport in organic materials [25–27, 31]. In the condensed phase, the local ionization energy (IE_L) and electron affinity (EA_L) can be interpreted as the local valence-band maximum and conduction-band minimum, respectively. They can therefore be used to visualize the anisotropic electronic properties of a molecular crystal. More recently, the local properties have been used as external potentials to simulate charge transport (see [32]; Bauer T et al., A multi-agent quantum Monte Carlo model for charge transport: application to organic field-effect transistors, submitted). Figure 6 shows the local ionization energy (IE_L) of a rubrene crystal projected onto volume slices that cut through the unit cell along its main axes.

Fig. 6

AM1 local ionization energy (IE_L) volume slices cutting through the rubrene unit cell perpendicular to the x, y, and z directions. The color code ranges from 360 (blue) to 600 (red) kcal mol⁻¹

Low IE_L values (shown in blue) correspond to electron-donating/hole-conducting pathways, whereas high IE_L values (shown in red) represent energy barriers. In Fig. 6, the IE_L maps look vastly different depending on the orientation of the volume slice. This is in line with experimental reports, which show that the field-effect mobilities in rubrene single crystals depend strongly on the orientation of the contacts [33, 34].

Large unit cells

As an example of a large system, we chose the solvated lipid-bilayer membrane shown in Fig. 7. Specifically, the model consists of 128 1,2-dilauroyl-sn-glycero-3-phosphocholine (DLPC) molecules and 3,840 water molecules equilibrated for 400 ns in a classical molecular dynamics simulation [35]. The unit cell contains 25,088 atoms and spans 62.502 × 65.506 × 58.441 Å³. An AM1-SCF calculation was performed on 384 cores of the LiMa cluster (64 MPI tasks on 32 nodes with 2 × 6 cores each). The SCF converged in 31 cycles and took 3 h 7 min.

Fig. 7

Left: model lipid-bilayer membrane based on DLPC units and 3,840 water molecules (taken from [35]). Right: MEP volume slice, color coded from −50 (blue) to 50 (red) kcal mol⁻¹

Note that periodic calculations of this size push double-precision (64-bit) arithmetic to its limit, since many small values are summed to a very large result during the energy summation. To avoid numerical inaccuracies for large systems, this step is performed in quadruple precision (128-bit), and special care is taken in the ordering of the summands.
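EMPIRE addresses this with quadruple precision and careful ordering of the summands. As a language-neutral illustration of the underlying effect (not the EMPIRE implementation), the following sketch shows how a naive double-precision sum loses digits that compensated (Kahan) summation retains:

    def kahan_sum(values):
        # Compensated (Kahan) summation: c carries the low-order bits that a
        # naive running sum discards at each addition.
        s = c = 0.0
        for v in values:
            y = v - c
            t = s + y
            c = (t - s) - y
            s = t
        return s

    # One large term plus a million tiny ones; the true sum is 1.0e10 + 1.0.
    terms = [1.0e10] + [1.0e-6] * 1_000_000
    print(sum(terms) - 1.0e10)        # naive double precision: clear error
    print(kahan_sum(terms) - 1.0e10)  # compensated: close to 1.0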

The resulting HDF5 binary wavefunction file has a size of 21 GB and can be used to calculate local property maps. The molecular electrostatic potential across the membrane is shown in Fig. 7 (right). This clearly visualizes the polar water layer and head groups and the nonpolar lipid bilayer. Such calculations could, for instance, be used to predict the permeability of membranes to different chemicals.

Every EMPIRE calculation also includes a Coulson population analysis. Figure 8 shows a plot of the Coulson charges of all oxygen atoms as a function of their vertical position. Five charge groups are discernible in this plot, corresponding to the four chemically distinct oxygen atoms in DLPC and the oxygen atom in water. This opens an interesting perspective for the development of force fields for condensed-phase applications, since the charges can be derived directly for the solid or liquid of interest.
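In the ZDO basis used by NDDO methods, the Coulson charge of an atom reduces to its core charge minus the corresponding diagonal elements of the density matrix. A minimal sketch (function and variable names are ours, not EMPIRE's):

    def coulson_charges(P_diag, orbital_atom, core_charge):
        # Coulson charge: q_A = Z_A - sum of P_mumu over orbitals mu on atom A.
        q = list(core_charge)
        for p, atom in zip(P_diag, orbital_atom):
            q[atom] -= p
        return q

    # Toy example: an sp basis on atom 0 and a single s orbital on atom 1
    # (illustrative numbers, not EMPIRE output).
    print(coulson_charges(P_diag=[1.3, 0.9, 0.9, 0.9, 1.0],
                          orbital_atom=[0, 0, 0, 0, 1],
                          core_charge=[4.0, 1.0]))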

Fig. 8

Distribution of Coulson charges and vertical coordinates of all oxygen atoms in the DLPC membrane/water system

Conclusions

We have implemented periodic boundary conditions in the massively parallel semiempirical molecular orbital theory code EMPIRE. The standard SCF procedure of EMPIRE reliably converges the wavefunctions of a broad range of periodic systems, including covalent, ionic, and molecular crystals and surfaces, as well as disordered biological systems such as a lipid bilayer. Like the nonperiodic version of EMPIRE, the program is parallelized via OpenMP in the single-node version and via a hybrid OpenMP/MPI approach in the multi-node version.

The single-node version was shown to perform well for calculations on unit cells containing between 64 and 512 atoms, and to scale very efficiently on up to eight cores. The multi-node version allows systems with tens of thousands of atoms to be treated; the largest system described here consisted of 25,088 atoms. The program scaling is similar to that observed for nonperiodic calculations with EMPIRE [9].