Electron localisation descriptors in ONETEP: a tool for interpreting localisation and bonding in large-scale DFT calculations

Electron localisation descriptors, such as the electron localisation function (ELF) and localised orbital locator (LOL) provide a visual tool for interpreting the results of electronic structure calculations. The descriptors produce a quantum valence shell electron pair repulsion (VSEPR) representation, indicating the localisation of electron pairs into bonding pairs and lone pairs in single molecules, coordination compounds and crystalline solids. We have implemented the ELF and LOL within ONETEP, a DFT code designed to perform calculations on systems containing thousands of atoms with plane-wave accuracy. This is possible using a linear-scaling formulation of DFT in which the Kohn–Sham orbitals are expressed in terms of a set of strictly localised non-orthogonal generalised Wannier functions (NGWFs), themselves expanded in a psinc basis set. In this paper, we describe our implementation and explore the chemical insights offered by electron localisation descriptors in ONETEP in a range of bonding and nonbonded situations.


Introduction
The electron localisation function (ELF) [1], devised by Edgecombe and Becke for Hartree-Fock (HF) calculations in 1990 and later extended to density functional theory (DFT) by Schmider and Becke in 2000 [2], provides a useful tool for understanding chemical bonding from electronic structure calculations. The ELF indicates where pairs of electrons exist in the system, including lone pairs and bonding pairs. It is a dimensionless quantity, in the range of 0-1, with the purpose of providing a visual description of the chemical bond for almost all compounds [3]. The descriptor has been widely used over the years in the study of catalysis, organic synthesis, transition states and crystalline structures, for example [4][5][6]. The ELF is implemented in conventional DFT codes including CASTEP, Gaussian, Quantum Espresso and VASP [7][8][9][10].
The ELF provides a valence shell electron pair repulsion (VSEPR) representation of coordination compounds. VSEPR is a simple model for understanding and interpreting electronic structure based on numbers of electron pairs, taught in many undergraduate chemistry courses [11]. σ and π bonds are distinguishable, as well as metallic and ionic bonding. The dimensionless quantity equals 1 in regions of perfect localisation with respect to a reference electron of like-spin. The ELF vanishes for one-electron regions dominated by a single, localised σ-spin orbital. The ELF corresponds to the electron gas-like pair probability when equal to 0.5. The quantity is a scalar field evaluable at all points in space, so can be visualised and interpreted alongside other similar quantities, for example electron density.
Here we describe the implementation of the ELF and a simpler version, the localised orbital locator (LOL) [2], in ONETEP, which is a linear-scaling DFT code, with the aim of providing the capability for visual description of chemical bonding in larger systems than currently available, for example, protein-ligand complexes involving a few thousand atoms. We present the reformulation of the ELF theory within the context of the density matrix DFT implemented in ONETEP, in section 2, experimental details in section 3, and validation using examples in section 4. This includes the results for single molecules and the key features of their covalent bonds, bulk structures, and the directionality of nonbonded interactions in an entire protein-ligand complex with ∼2600 atoms. We finish with some conclusions in section 5.

Theory and implementation
2.1. ONETEP ONETEP [12], the order-N electronic total energy package, is a DFT code whose computational cost is linear-scaling with respect to the number of atoms. The code uses a reformulation of the plane wave pseudopotential method and retains the plane wave accuracy offered by conventional cubic-scaling DFT codes.
In ONETEP, non-orthogonal generalised Wannier functions (NGWFs) are used to expand the molecular orbitals (MOs). The NGWFs are constrained to spherical localisation regions about the atom centres, to offer a way of exploiting the locality of the density matrix, in order to achieve linear-scaling of calculations with respect to system size. These NGWFs, are expanded in an orthogonal basis of periodic cardinal sinc (psinc) functions, D(r − r m ). Each psinc is centered on a point of a regular real-space Cartesian grid, r m . Only the psincs within the localisation region, L α , are included, using a non-zero c mα [13,14]. A Fourier transform links psincs to plane waves, therefore, ONETEP ultimately runs calculations with plane waves as with conventional plane wave pseudopotential DFT approaches, which allow for systematic improvement of the basis. Thus, the same control of accuracy is achieved, controlled just by the grid spacing, which is specified via the kinetic energy cut off of the psinc functions. This accuracy has been demonstrated by comparison against other non-linear-scaling codes such as CASTEP and NWChem [15,16].
The NGWFs are optimised during the calculation rather than fixed, to allow for fewer local orbitals to be used and to avoid transferability issues. This also has the important result of eliminating basis set superposition error (BSSE) [17].
The electronic density is constructed from the NGWFs as follows; using the Einstein summation for repeated Greek indices, for non-orthogonal quantities, where σ is the spin. The density kernel, for spin σ, where M is a linear transformation between the set of localised NGWFs, {φ α (r)}, and the MOs, {f iσ } are the occupation numbers. In order to avoid cubic scaling steps, the {ψ i } and M are not calculated in practice, the calculation proceeds by calculating the NGWFs and the density kernel.

Electron localisation function
The ELF was defined by Becke for HF and DFT and extended by Savin to periodic crystalline systems and closed shell systems [1,18], as follows: for the individual spin, σ, where χ σ (r) is the ratio, τ σ (r) is the kinetic energy density, τ W σ (r) is the von Weizsäcker approximation to the kinetic energy density, and where n σ (r) is the electronic density. The kinetic energy density for the uniform electron gas, τ LSDA σ (r), named the local spin density approximation (LSDA), is where c F is the Fermi constant, 3 10 (3π 2 ) 2 3 . The numerator, τ P σ , contains all the electron localisation information. τ P σ will be small in regions of space, r, where there is a high probability of finding a localised electron.
To gain a physical interpretation, τ P σ is the difference between the kinetic energy density and the von Weizsäcker kinetic energy density. τ W σ is the semi-local approximation to the kinetic energy density. The difference, τ P σ , therefore is excess kinetic energy density due to Pauli repulsion, which is called the Pauli kinetic energy density. Also, the von Weizsäcker kinetic energy density is a lower bound to the positive definite orbital kinetic energy density, in the limiting case where a single orbital is the main contributor to the density in that region, at which point τ σ and τ W σ are equivalent, hence the Pauli kinetic energy density is always non-negative [18][19][20].
Pauli repulsion is lower in regions where electrons are less likely to encounter same-spin electrons, for example, regions where electrons are more localised in covalent bonds or lone pairs. τ P σ is, therefore, small in regions of high electron pair localisation. τ P σ ranges from arbitrarily small values, which give no indication of how localised the electron is, to infinitely high values, therefore Becke and Edgecombe [1] proposed scaling it with respect to the kinetic energy of the uniform electron gas, as shown in equation (5), and introduced the mapping function (ELF) of equation (4). This formulation was chosen in order to obtain a more desirable range of 0-1 for plotting, as opposed to having an open bound. Electron pair localisation can be found at unity, i.e. when the probability of finding an electron in close proximity to another electron of the same spin is zero.
We have implemented the ELF in ONETEP using Becke's definition with n σ (r) as the electronic density in ONETEP, as defined above, in equation (2).
The kinetic energy density has been implemented in ONETEP [21], for overlapping NGWFs, φ α and φ β , for each spin σ, and the density kernel K αβ σ as defined above. In the case of spin unpolarised systems, the up and down spin densities are equivalent to half of the total electronic density; n σ = 1 2 n, therefore τ σ = 1 2 τ and τ W σ = 1 2 τ W . χ, in terms of the total electronic density, is therefore: χ σ is dimensionless, hence, the ELF is also dimensionless. For the case of the uniform electron gas, τ P σ and τ LSDA σ are the same, therefore χ σ equals 1; making the ELF equal to 0.5. This is a useful comparison value for the analysis in the results and discussion section.

Localised orbital locator
We have also implemented the LOL, also introduced by Becke [2]. The LOL is similar to the ELF but is simpler. Here, just the orbital kinetic energy density and the LSDA kinetic energy density are compared, without considering the Pauli repulsion; The quantity is, therefore, dimensionless for the LOL and ranges from 0-1 too, with localisation at unity. For the spin unpolarised case, like before we substitute in half of the total electronic density and half of the kinetic energy density:

Calculation details
The ELF and the LOL were calculated for systems ranging in size, some of which were compared with the original results of Edgecombe, Schmider and Becke, as well as additional analysis performed by Savin et al [3,18,20]. The XC functional rPBE was used for the nanoparticle systems, as used previously by Verga [22]. PBE functionals were used for all other results. VMD version 1.9.4 [23] was used to plot the isosurfaces, and Jmol version 14.29.24 [24] for the contour maps. An NGWF radius of 8.0 Bohr was used for every system, except for the Pt atoms of the nanoparticle and the CO adsorbate, which had an NGWF radius of 9.0 Bohr. The kinetic energy cut off used was 900 eV for all systems. A spherical Coulomb cut off of radius 28 Bohr was applied to the isolated hydrocarbons and 23 Bohr to the water molecule. All systems were initially geometry relaxed using ONETEP.

Results and discussion
It is interesting to observe how the ELF compares with the ingredients that make it, the density and the kinetic energy density. We show in figure 1 these three quantities as contour plots for the water molecule. These plots clearly indicate the chemical information gained from calculating the ELF, while the density and kinetic energy density are much less revealing in chemical terms. The regions of electron pairs are clearly shown in blue at high values of the ELF. The lone pair can be identified above the oxygen, and the O-H bonding pair is also clearly indicated.

Bonding types
The contour map of the ELF in figure 2 shows the covalent σ and π bonds in the molecules ethane, ethylene and acetylene. Each type of C-C bond (single, double, triple) gives rise to a characteristic form of the ELF, which enables the bonding type between the carbons to be identified without reference to the MOs. The π bond is pictured as more of a 'doughnut' shape between the carbons, whereas the σ bond is oval. The π orbital is more extended than the σ orbital. The region of the ELF that corresponds to the triple bond is much narrower compared with the double bond, which seems to indicate that the ELF is not as intuitive in describing triple bonds, possibly due to the delocalisation of the bonding electrons in two different planes. This is in agreement with literature [3].
The C-H bonds are represented by the diffuse blue regions surrounding the hydrogens in the contour maps through the C-C and C-H plane. The atomic cores are coloured in red, representing no electron pair localisation in the core region since we use norm-conserving pseudopotentials.

Larger systems
To demonstrate the ELF calculated for larger systems, we tested two crystalline lattices; diamond and sodium chloride. Figure 3 shows the ELF of these two in the 110-plane. The diamond lattice is comparable to results by Savin et al [18]. The blue regions of high value indicate the shared pair of electrons of the C-C covalent bond. The red centres on the carbons that match the value of the delocalised regions surrounding the molecules are due to the norm-conserving pseudopotentials used. The core electron density is absent and the valence electrons are not concentrated in this region. The NaCl lattice shows localisation solely on the atoms, as one expects for ionic bonding, with no electron pairs between the atoms.
To test systems with metallic character, we use previous results from our group, by Verga et al, of CO adsorption onto a Pt nanoparticle in hollow (hcp) sites [22]. This is helpful for gaining insights to the uses of the nanoparticle for catalytic activity, as these nanoparticles are models for catalysts. Electron localisation descriptor (ELD) analysis here is helpful for gaining insights on electronic structure and the interactions between reactants and surfaces. A cross section along the 111-plane through the Pt 55 cuboctahedral nanoparticle is shown in figure 3. Here the contour map demonstrates electron delocalisation between the atoms. There are yellow-green valence regions on the Pt atoms, indicating that there is no electron localisation. The green-blue surrounding regions do not correspond to the highly localised regions of covalent bonding between the atoms seen in the diamond structure, but instead, these are continuous regions of relatively delocalised electrons between the Pt atoms, indicated by a small ELF value of 0.4. This is clearly different from the red region between the atoms of NaCl and between the carbon atoms of diamond where there is no electron pair presence. The electron pair localisation only peaks at 0.44, for the Pt nanoparticle, which is not particularly high compared with 0.93 of the covalent diamond and 0.88 of the ionic NaCl. This behaviour of the ELF is consistent with a system with metallic character.
The ELF of the nanoparticle is also plotted as an isosurface in figure 4, at a value of 0.75. The first plot is of the relaxed geometry of the Pt 55 CO system. Going from left to right, the CO adsorbate is moved away from the Pt surface in increments of 1 Å, in the vertical direction. First of all, looking at the Pt nanoparticle, on the Pt atoms the ELF is negligible at a high localisation value of 0.75. This can be confirmed by the contour plot in figure 3, the scale of which shows there are no regions taking this value. In contrast, the ELF can be seen to show the delocalisation regions of the metallic nanoparticle as an isosurface in the supplementary information (http://stacks.iop.org/EST/2/027001/mmedia), plotted at a much smaller value of 0.35.
Returning to the varying distance of CO from the Pt surface in figure 4, we observe a more well-defined emergence of the lone pair on the carbon as the CO is moving away from the Pt, especially after a shift of 1 Å. This is consistent with the disrupting of the σ-donor contribution of the CO as it is pulled away from Pt and is returning to the isolated structure of the CO molecule. Equally observable is the strengthening of the bonding between carbon and oxygen. This is further supported by contour plots of the CO in the supporting information. The changes observed in the ELF as the Pt-CO distance increases are consistent with the conventional picture of metal-carbonyl bonding.

Lone pair directionality in biomolecular interactions
Finally, we investigate the ELF on a system of biomolecular and pharmaceutical interest. This consists of the interaction of a ligand with an entire protein (T4-lysozyme) with ∼2600 atoms, which we can calculate in its entirety by using a linear-scaling code such as ONETEP. Here we perform calculations on one snapshot of this complex from previous simulations by Fox et al [25], with the aim to examine what information we can obtain from the ELF about the ligand-protein nonbonded interactions. For example, in figure 5, where we focus on the region of the active site in the presence of the ligand, the ELF allows us to distinguish the presence of hydrogen bonds and lone pairs, which contribute to the protein-ligand stabilisation interactions.
The visualisation of the ELF can also indicate directionality of the interactions. Here, the hydrogens of the catechol ligand are pointing towards the oxygen electron pairs of the host protein, T4-lysozyme. The lone pairs can also be seen on the oxygen of the catechol, as well as all of the bonding pairs in the carbon ring.

Linear-scaling computational effort
The key feature of ONETEP is its linear-scaling capability. In ONETEP, both the electronic density and kinetic energy density, as given by equations (2) and (9), are quadratic forms, computed by the 'FFT box' technique, in which Fourier transforms are applied to a subregion of the simulation cell, instead of the entire simulation cell [12,13]. Efficient algorithms have been developed to obtain these quantities, with similar procedures, except that the evaluation of the kinetic energy density is more costly than the electronic density since it additionally involves the application of the gradient operator and a scalar product over Cartesian components [21,26]. As discussed also in these previous papers, these quantities are nonzero only when the NGWFs α and β overlap  . Measured walltime for calculating the ELF as a function of the number of T4-lysozyme protein-catechol ligand complexes (from 1 to 4) (black squares). The system was increased by one protein-ligand complex in the same direction, and the simulation cell dimension was extended by its original dimension size used in the previous work of Fox et al [25], 154 Bohr, each time. The (solid black) trendline, produced by the least squares method, shows a linear fit to the measured data. Calculations were performed on the Iridis5 cluster using 16 nodes with 40 Intel Skylake cores each, running at 2.0 GHz, for a total of 64 MPI processes, each spawning 10 OMP threads. Calculation parameters: kinetic energy cut off: 900 eV, NGWF localisation radius: 8.0a 0 .
spatially. Given the localisation of the NGWFs, the number of these spatial overlaps is expected to increase asymptotically linearly with respect to the system size. The implementation for the calculation of these quantities, in equations (2) and (9), is such that only the overlapping elements are included, and as a result, the overall cost is expected to be linear-scaling. The ELF formula, in equation (4), is a local function of these quantities so it is also expected to have a linear computational cost with the system size.
The calculation of the ELF was tested for its computational effort with respect to system size, for over 10 000 atoms, repeating the T4-lysozyme-catechol complex up to 4 times in an expanding simulation cell. The results, in figure 6, demonstrate that the software does indeed achieve linear-scaling for the ELF calculation, as the trendline of the measured data points is indeed linear. The linear-scaling cost of the total energy calculation in ONETEP has been demonstrated elsewhere [12,21]. In figure 6 the combined walltime for calculating the kinetic energy density, the electronic density and the construction of the ELF, is plotted. This combined walltime is approximately 4% of the total walltime for the ONETEP properties calculation.

LOL
The LOL [27], a simpler version of the ELF, has also been implemented in ONETEP. The only difference is that in practice the values of the LOL range from 0 to only ∼0.6, as shown in the examples in the supplementary information. From the examples given, the LOL can be regarded as producing slightly less chemically intuitive contour plots, however the desirable features are still present.

Conclusions
We have presented the implementation of the electron localisation descriptors, the ELF and the LOL, in the ONETEP linear-scaling DFT code. These descriptors provide a visualisation representation of electron pair localisation in a system in terms of a scalar field, with values ranging from 0 to 1. Importantly, the ELF is a visual descriptor and interpretative tool. We have compared our implementation with other codes qualitatively, without the consideration of basis sets or any other fundamental differences between codes. Our implementation of the ELF in ONETEP reproduces key qualitative features of other implementations. We have demonstrated the chemical information that these descriptors provide using a range of bonding situations, from isolated molecules to crystalline solids, with an example of how σ and π bonding types can be distinguished using the ELF. We have presented examples on covalently bonded materials such as diamond, ionic materials such NaCl, and metallic materials such as Pt nanoparticles. The bonding between the atoms is distinctly different between these materials, and this is reflected in the ELF representations. Furthermore, to demonstrate the usefulness of the ELF in biomolecular contexts, we have presented an example of nonbonded interactions in a large protein-ligand complex, of interest to drug design.
The availability of the ELF and LOL in a large-scale DFT code such as ONETEP opens the possibility for providing additional chemical insight in a large range of applications, such as surface chemistry, reaction mechanisms, and drug design. Future extensions of these developments in ONETEP could include extension to PAW calculations by including core densities and implementation of other visual electronic structures which make use of the same quantities as the ELF.
The supplementary information contains further isosurface plots of the CO at varying distances from the Pt 55 nanoparticle and contour plots of these, with a comparison between the ELF and the LOL. Additionally an isosurface of the LOL of the catechol ligand in the active site of T4-lysozyme is plotted.