A two-phase Hessian approach improves the DFT relaxation of slabs

A two-phase Hessian approach to DFT slab relaxation of slabs has been implemented and tested. It addresses weaknesses in the modified Broyden and Pfrommer BFGS algorithms specific to relaxing slabs. Complete Hessian and then inverse Hessian matrices with no strain/stress components are first constructed at high force signal-to-noise ratios with no accompanying relaxation. In a second phase the static inverse Hessian is used to relax the slab down to a low force tolerance.


Introduction
DFT calculations are often used to find the equilibrium conformations of groups of atoms: the forces on the atoms are calculated using DFT and the atoms moved until those forces are within a defined tolerance of zero. Some invest igations into ferroelectricity in slabs need precise phonon and polarisation calculations based on residual atomic forces in the range of 10 −4 -10 −5 eV Å −1 [4,8,10,14], which are difficult to attain. This applies particularly to thicker slabs and lower phonon frequencies. Other applications can have similar requirements.
When performing DFT simulations of SrTiO 3 , there was a requirement to distinguish between relative permittivities of 300 and 500, at an applied external electric field of 1 V Å −1 , equivalent to an approximate polarisation of 0.1 C m −2 . This requires a polarisation resolution of around 1.3 ×10 −4 C m −2 . Calculations using a Bsite cation Born dynamic charge of 8 [16], LDA (local density approximation exchange correlation functional) value for the bulk modulus of 220 GPa [12] and lattice constant of 4 Å give a nominal maximum force on any one atom of 3.4 ×10 −4 eV Å −1 . However, the SrTiO 3 polari sation is mainly dependent on the Ti ion movement, which forms part of the 'softmode' transverse optical phonon, for which the spring constant will be significantly less than indi cated by the bulk modulus. Hence our force tolerance conv ergence target was set to 10 −4 eV Å −1 . This paper concerns the problems encountered when attempting to use conventional DFTbased approaches to relax the atomic positions of slabs with more than a few layers to very low force tolerances.
The modified Broyden quasiNewton algorithm as described in Johnson 1988 [5] and the BFGS (Broyden-Fletcher-Goldfarb-Shanno) quasiNewton algorithm as mod ified and described by Pfrommer et al 1997 [11], while good at relaxing bulk solids, do not work well for perovskite slabs more than 8 unit cells thick in a vacuum. These algorithms are described in detail in books such as Numerical Recipes by Press et al [13] and Numerical Optimization by Nocedal and Wright [9]. Figure 1 shows the BFGS relaxation of a layered slab with a first layer of 3 unit cells thickness of SrTiO 3 , a second layer of 6 unit cells thickness of SrRuO 3  Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. cells thickness of SrTiO 3 . The entire slab is surrounded by vacuum at both surfaces. The DFT electronic energy conv ergence was set to 10 −7 eV per atom. |F|max, the blue contin uous line, is the maximum absolute residual force on any atom as a ratio of the convergence target of 0.01 eV Å − 1 . Hence a convergence ratio of 3 would represent a maximum atomic force of 3 × 10 −2 eV Å − 1 . Vertical black lines denote the start of new relaxation runs.
The maximum residual atomic force convergence ratio shows a downward trend from iterations 61-78, and thereafter oscillates randomly between 8 and 1 (representing 8 × 10 −2 to 1 × 10 −2 eV Å − 1 . All three parameter ratios reduce to below their respective convergence ratios of 1 at iteration 108. Meeting the convergence criterion appears to be more a matter of random chance than a consistent downward trend. There seems little chance of getting down to 10 −4 eV Å −1 (force convergence ratio 10 −2 ).
Our project required DFT relaxation of perovskite slabs of 12 and 14 unit cells in a vacuum down to atomic forces of less than 10 −4 eV Å −1 . Without manual intervention such low tolerances could not be obtained using standard DFT codes such as CASTEP (Cambridge Serial Total Energy Program) described in Clark et al 2005 [3], a planewave DFT code, or SIESTA (Spanish Initiative for Electronic Simulations with Thousands of Atoms) [1], an atomiclike orbital DFT code. CASTEP implements the standard DFT version of the Pfrommer BFGS algorithm. SIESTA uses the modified Broyden algorithm.
Inspection of the forces during slab relaxations indicated that a manual approach might succeed where the standard algorithms failed. After the successful manual relaxation of a simple slab, an automated algorithm more suited to slabs was defined and implemented. That is described here.

Pfrommer BFGS relaxation
In the linear regime the Pfrommer BFGS relaxation process for N atoms with 3N independent coordinates defines a Hessian matrix A as follows : (1) where ∆F is the set of stress tensor ∆σ i and atomic force ∆F i component changes and ∆X is the set of strain tensor changes ∆ i and atomic fractional coordinate changes ∆X i as in figure 2. This charts the convergence of iterations of BFGS geometry relaxation across multiple CASTEP DFT runs for a slab with layers of SrTiO 3 × 3 unit cells, SrRuO 3 × 6 unit cells and SrTiO 3 × 3 unit cells normal to the slab surface and surrounded by vacuum. The DFT SCF (selfconsistent field) step electronic energy convergence was set to 10 −7 eV per atom. Three measures of convergence are shown. |F|max (blue, continuous line) is the maximum residual force on any atom as a ratio of the target. dE/ion (red, dashed) is the change in total energy per iteration, averaged per atom, and |dR|max (green, dotted) is the largest atom move between iterations, both as ratios of their respective targets. Vertical black lines denote the start of new relaxation runs. From iteration 78 onwards the residual forces are almost random just above 10 −2 eV Å −1 (force convergence ratio of 1) with no obvious downward trend. There appears to be little chance of getting down to 10 −4 eV Å −1 (force convergence ratio 10 −2 ). Change vector structures for strain/individual atomic displacements and stress/atomic forces. In this notation ∆σ 1 = ∆σ 11 , ∆σ 2 = ∆σ 12 and ∆σ 9 = ∆σ 33 etc.
If the Hessian matrix A is known, it can be inverted to give a negative semidefinite inverse Hessian matrix H such that which drives the relaxation process. Care must be exercised during the inversion. The matrix A can be decomposed into eigenvalues λ k and eigenvectors e k , and the inverse matrix would normally then be specified as For a 3D bulk relaxation, up to six orthogonal eigenvectors of A form the basis set for translations and rotations of the entire structure. In an exact representation of A, the translations have eigenvalues of zero, as the force between atoms does not change on displacing the whole structure. Periodic boundary conditions on the DFT supercell will normally prevent rota tions from having zero eigenvalues. The corresponding e k i e k j /λ k terms with zero eigenvalues must be omitted from the summation in (3).
Consider the effect of an set of ion displacements de i which are a multiple d of the i th nonzero eigenvector e i of A. If the atoms are perturbed from an equilibrium position then the restoring force must be in exactly the opposite direc tion. Hence all the nonzero eigenvalues of H must be nega tive. Both A and H must therefore be negative semidefinite matrices.
The stress tensor σ is defined in terms of the individual strain tensor components ij by where U is the energy and Ω the supercell volume as in Knuth et al 2015 [7]

Structure of the inverse Hessian relaxation matrix
The block structure of the Pfrommer BFGS inverse Hessian relaxation matrix is illustrated in figure 3. The top left block maps stress to changes in strain tensor components. The bottom right block maps atomic forces to changes in indi vidual ion fractional coordinates.
In normal bulk solids, where all bonds are of somewhat sim ilar strength, the upper right X i j cross terms mapping atomic forces to strain changes, and lower left X i j cross terms map ping stress to atomic displacements would typically be small, but may not necessarily be zero. But if a previously relaxed material has a plane of atoms with weak bonding on one side and strong bonding on the other, a uniform stretch normal to the plane creates nonzero net atomic forces on some atoms in the plane, which would introduce significant upper right and lower left X i j cross terms in the inverse Hessian.
Pfrommer et al 1997 [11] recommend a specific block diagonal, initial inverse Hessian matrix which has 9 identical scalar terms on the top left diagonal relating strain and stress, and 3 × 3 blocks on the bottom right diagonal relating atomic forces to changes in position.
Starting with this initial inverse Hessian, the n + 1 th move is calculated using the Pfrommer et al 1997 [11] formulae where λ is determined by a line minimisation. Close to equi librium, λ = 1 is the only value used most of the time. After each step, the new values for the atomic forces are calculated and the inverse Hessian is modified using a formula given in Pfrommer et al 1997 [11].
In theory, in the harmonic regime, for a system of N inde pendent atoms, after 3N moves the inverse Hessian should be fully developed and the structure fully relaxed.

Weaknesses of the Pfrommer BFGS approach applied to slabs
For both slab and bulk solid relaxations, the inverse Hessian matrix H must remain negative semidefinite. That is, if v is any nonzero vector then the scalar s = v * Hv (7) must be either zero or a negative real number.
For noisefree forces, and displacements from equilibrium in the quadratic regime, H can be shown to be negative sem idefinite by representing v in terms of the orthogonal eigen vectors e m of A and H as follows : From (3), using the fact that the eigenvectors e m are orthogonal: (C m ) * C m is always positive. From section 2.1 Pfrommer BFGS relaxation, all the eigenvalues λ m of A are negative or zero. The eigenvectors corresponding to zero eigenvalues of A are not used in the construction of H. Thus for any v, which is a linear combination solely of such excluded eigenvectors, the effective eigenvalue of H is zero. Hence v * Hv 0 and H is therefore negative semidefinite. After each move, CASTEP performs an indirect check that H is still negative semidefinite. Although detailed checking  confirmed that CASTEP accurately implements the Pfrommer BFGS algorithm, for the slabs of interest, the check that H was negative semidefinite failed every 4 or 5 moves. This resulted in a reset of H to its initial values. Working closely with mem bers of the CASTEP development team did not enable them to identify a potential fix for the problem. Although the exact cause of the failure to stay negative semidefinite is unknown, it may be related to noise in the forces. It is relevant to some of the problems in relaxing slabs discussed below. One specific difference between relaxations of slabs and bulk systems is the presence of a null move. This consists of a change in strain normal to the slab, exactly counterbal anced by changes to atomic fractional coordinates in the same direction. Although both changes are nonzero, there is no change to the relative position of bonded atoms, measured in Ångstroms (although the size of the DFT supercell vacuum region is no longer the same). The corresponding eigenmode has an eigenvalue of zero. If this is not removed during incre mental updating of the inverse Hessian it can potentially cause numerical problems as the corresponding inverse Hessian eigenvalue diverges.
Another problem arises because average strain in a slab is directly caused by the movements of surface atoms, and not those in the interior of the slab.
For a bulk solid at equilibrium, the forces on each atom are zero. Applying a small tensile strain, in the direction of one unit cell lattice vector, will produce a proportionate stress. If the bulk solid consists of only one type of atom, this should produce no net atomic forces. If the bulk solid con sists of multiple atomic types, and all bonds are of similar strength, it will produce only small net atomic forces, as forces from stretched bonds in one direction will mainly be offset by forces from stretched bonds in the opposite direc tion. However, the stress and crystal energy will increase as the crystal is stretched relative to the lowest energy equilib rium configuration. Application of the initial inverse Hessian to the stress tensor of the strained material defines a change in strain approximately equal and opposite to the original strain applied, leaving small local atomic forces to be relaxed away.
A slab at equilibrium does not behave the same way. A small tensile strain normal to the slab surface will not only cause a proportionate stress, but also a net force towards the slab centre on atoms at the slab surface. These atoms have no outwardfacing bonds to counterbalance the inward force due to the stretched inside bonds. The multiplication of the stress and atomic force vector by the initial inverse Hessian will define, not only a recommended strain change, but also super fluous fractional coordinate moves to slab surface atoms. In the next iteration, the forces on the atoms in a plane are trans mitted, diluted, to the next inward atomic plane, and so on. In theory the dynamic updates to the inverse Hessian will even tually introduce cross terms between stress and atomic moves in fractional coordinates, and between surface atom atomic forces and strain. Although these should lead to an effective inverse Hessian for the slab, this did not happen, prevented by resets of the inverse Hessian, which was no longer nega tive semidefinite after each four or five relaxation iterations, as discussed above.
Suppressing changes to supercell lattice constants to switch off the processing of stress and strain also fails. Strainlike distortions of the whole slab, would be present, for example, if the lattice parameter normal to the surface were incorrect. These must then be relaxed by motions of individual atoms, related to the atomic forces via the lower righthand block of the inverse Hessian matrix. Unfortunately, the only large net atomic forces present in a strained slab are those on the sur face layers; the bonds inside the slab are stretched too, but bulk atoms have similar forces in opposite directions which mainly offset each other to leave a much smaller net force. To correct the larger forces on only the surface atoms requires correlated moves of at least (N − 1)/2 atoms before the BFGS inverse Hessian is fully developed. When there is significant noise in the DFT forces, to develop such a correlated move from the initial inverse Hessian is difficult. The relaxation of a stretched slab produces waves of atomic moves towards the centre, decaying as they propagate.
As the relaxation proceeds, and the atomic forces reduce, the finerresolution updates to the inverse Hessian matrix take place at decreasing force signaltonoise ratios, also making optimum convergence less likely.
A slab can be viewed as a system for which, applying the usual DFT periodic boundary conditions, the bond via the vacuum between the two slab surfaces is just extremely weak. Other systems contain both weak and strong bonds, such as an array of molecules bonded together by Van der Waals forces. Therefore, for these systems, similar problems are expected to arise and the standard Pfrommer BFGS algorithm may not be effective.

Modified Broyden relaxation
Johnson 1988 [5] developed a modified version of the Broyden 1965, quasiNewton minimisation method [2] for coupled systems of nonlinear equations. This combined Srivastava's 1984 [17] modifications to the Broyden method, which minimised use of memory, with Vanderbilt and Louie's 1984 [18] process and convergence improvements. Although emphasising the application of the modified Broyden method to electronicstructure energy minimisation, Johnson notes that it can also be used for moleculardynamics simulations.
Johnson defines the modified Broyden algorithm using a Jacobian matrix, whereas Pfrommer [11] defines enhance ments to BFGS using a Hessian. However, a Hessian is the Jacobian of a gradient, and the Jacobian of a force vector is equivalent to the Hessian of an energy scalar, so, while differing in detail, the two methods are based on similar constructs.
The reasons why modified Broyden relaxation failed for our slabs were not investigated in detail. However, due to the similarities between the methods, the reasons for failure of modified Broyden to relax thick slabs sufficiently are likely to be similar to those causing Pfrommer BFGS to fail. Both the presence of a null move, causing divergence, and the interfer ence of surface atom forces and positions with standard stress and strain processing, are likely to contribute to the failure to relax slabs sufficiently, as described in section 2.3 Weaknesses of the Pfrommer BFGS approach applied to slabs.

The two-phase Hessian approach
A more robust approach to high precision relaxation of a slab in a vacuum is to completely determine the inverse Hessian matrix at a relatively large force signaltonoise ratio before using it to drive the relaxation of the structure in a second, separate phase. The better force signaltonoise ratio results in a more accurate inverse Hessian than one built dynami cally during relaxation, allowing convergence to a lower force tolerance.
The stress/strain mapping is dropped from the inverse Hessian. In its place, the algorithm relies on sufficient precision in the formation of the Hessian matrix to identify relaxation modes involving moves of most of the atoms in the structure. This enables the algorithm to respond to stresses induced by a pure strain without incorporating an explicit strain/stress map ping. Hence the process uses a fixed supercell lattice vector in the direction normal to the slab surface. This approach is suitable only for slabs and not for bulk materials, for which the stress/strain processing is relatively independent of that for atomic forces and fractional coordinate positions.
An effective twophase automatic method of performing 1D relaxations in the direction perpendicular to the surface of the slab is as follows: Phase 1-Hessian discovery and processing 1. A DFT run is performed for a base configuration with fractional coordinates X base , producing forces F base . For the i th independent atomic coordinate the positions and forces are X base i and F base i . Each atomic coordinate specifies the position along the axis normal to the slab of one atom or a set of atoms constrained to the same coordinate.

A series of DFT runs is performed, equal in number
to N, the number of independent atomic coordinates, during each of which only a single independent coor dinate is temporarily displaced from the base atomic coordinates by a displacement h, chosen as described below. The nth run uses atomic fractional coordinates X n of which the ith coordinate is X n i where : and results in forces F n . 3. The changes in all atomic forces for each displacement are calculated using :

The Hessian matrix A is defined by
5. The Hessian matrix A is decomposed into normalised eigenvectors e k each with eigenvalue λ k .
from (3) above. This inverse Hessian is never changed during the phase 2 slab relaxation below.

Phase 2-Slab relaxation
8. The atomic positions within the slab are now relaxed iteratively using X r+1 = X r + HF r (15) where r is the relaxation iteration number, X 0 = X base and F 0 = F base , until the largest residual error is within a factor of 10 of the noise expected in each element. 9. The atomic positions within the slab are now relaxed iteratively using a lower proportion, 0.3, of the calcu lated move: until either the desired force convergence tolerance has been reached, or the DFT steps show no systematic reduction in residual force. 0.3 is just a factor which works well across later relaxation steps, because it reduces the impact of a large coordinate change solely due to a random, high level of force noise. 10. I f the desired force convergence tolerance has not been reached, consider reducing the force noise in each step by one or both of: • increasing the energy cut off • applying specific DFT code eggbox removal tech niques (discussed below). and then performing further relaxation steps.
If the desired residual force threshold is not reached in a reasonable number of phase 2 relaxation iterations then the process is now repeated from the first phase 1 step using the atomic configuration with the lowest maximum residual force as the new base configuration.

Phase 1-Hessian discovery and processing
The Hessian discovery step size h needs to be chosen to be sufficiently large to give a good force signaltonoise ratio (recommended to be in the range of 30-100) for the largest force change on any atom in each phase 1 Hessian discovery step. It must also be sufficiently small to stay within the linear displacement/force region around equilibrium, if the initial atomic starting positions permit this. Values for h were typi cally chosen to result in temporary displacements of 3 × 10 −3 Å. The step size, h, was kept constant for all atomic coordinate changes during the Hessian discovery phase.
Using a constant step size h in all phase 1 discovery steps results in a varying force signaltonoise ratio across discovery steps. A temporary displacement h of certain single atomic coordinates results in a bigger maximum force change on any atom than is produced by the same displacement of other single atomic coordinates. For the second and subsequent times through the two phase process, for each discovery step, h could be varied inversely with the maximum force change produced on any atom for that step the first time through. This would result in similar absolute values for the largest resulting atomic force change for every discovery step, and provide a consistent maximum force signaltonoise ratio across steps. This enhancement was not implemented.
In the 1D case the single translational eigenvector to be removed from inverse Hessian construction should have an eigenvalue close to zero. It will also have similar values for each eigenvector element.
In the 3D case there will be three orthogonal translational eigenvectors, all with eigenvalues close to zero. The transla tional eigenvectors' separate x, y and z components should independently have nearly constant elements within that comp onent. Periodic boundary conditions should ensure that rotations within a DFT supercell have nonzero eigenvalues. Figure 4 shows the Hessian eigenvectors for a 1D triple layer slab relaxation using the twophase Hessian approach.
The inplane x and y coordinates of the atomic positions are fixed. In the z dimension the first layer is 4 unit cells thickness of PbTiO 3 , with a second layer of 6 unit cells thickness of SrRuO 3 and a final layer of 4 unit cells thickness of PbTiO 3 . In the z dimension both slab surfaces are surrounded by vacuum.
Each eigenvector coefficient is plotted on the chart yaxis against the z fractional coordinates of the atom corresponding to that eigenvector element. The two different atomic types in each atomic plane have similar z coordinates but different eigenvector element values. The lines are smoothed using splines with no markers on the points. This produces a clear chart, but also multiple values for some fractional coordinates, where multiple atoms in the same plane have similar fractional coordinates but different eigenvector coefficients.
The six eigenvectors of the 1D Hessian with the least nega tive lowest eigenvalues are displayed. The single translational mode, represented by the flat, dark blue line, has an eigen value of 0.68-close to zero as expected. The cosinusoidal modes on the chart represent negative Hessian eigenvalues whose eigenvector elements average to zero. Their spatial fre quencies increase monotonically as the eigenvalues become more negative.
The twophase Hessian approach can succeed without incorporating the explicit relationship between stress and strain into the inverse Hessian because the relaxation process is now accurate enough to generate automatically the set of atomic moves required to eliminate linear stress and strain.
If the base structure starts too far from equilibrium then the change in forces used to build the Hessian may not be linear with the additional displacements and the residual force conv ergence target may not be reached. Further, the noise in the forces in the Hessian discovery steps can magnify particular crossterms in the Hessian, shifting Hessian eigenvalues, which ought to be negative, closer to zero.
The quality of the discovered Hessian can be evaluated by examination of its eigenvalues and eigenvectors, and the matrix should be symmetric. For eigenvectors which should not be discarded, eigenvalues which are positive or too close to zero are signs that the inverse Hessian quality will not be good enough to relax to a tight residual force tolerance. However, in 1D relaxations, using the phase 1 inverse Hessian in a phase 2 relaxation always lowered the residual forces somewhat, even if the discovered Hessian was not ideal. The partiallyrelaxed atomic configuration with the lowest residual forces was then selected as the new base atomic configuration and the entire procedure repeated.
Eigensystem decomposition instead of SVD (singular value decomposition) was used in an exploratory manual relaxation of a slab and use continued by default in the automated two phase Hessian relaxation process. It helped build an intuitive understanding of the quality of the Hessian.

Phase 2-Slab relaxation
In theory, if the forces are perfectly linear with the displace ment from equilibrium, a single application of H to the forces in the base configuration DFT step will completely relax the forces to zero. However, although the complete inverse Hessian H has a lower signaltonoise ratio than an inverse Hessian generated dynamically using the Pfrommer BFGS algorithm, it is still not perfect. Further, the residual forces to which H is applied also contain noise. Hence multiple itera tions are necessary.
Problems were encountered iteratively applying H to low residual forces, just above the convergence tolerance. Hence, from that point, a fraction 0.3 of the calculated move was applied, until the target residual force was reached, or there was no further reduction in residual forces. Using a ball and spring model with Gaussian force noise applied, invest igations showed that it is best to allow the initial relaxation steps to use the full move, changing to a fraction of 0.3 only once the maximum residual force on any atom reduces to an order of magnitude larger than the estimate of residual force noise.

Reducing force noise.
During the later relaxation steps the residual forces become lower and the relaxation less effec tive because the force signaltonoise ratio also reduces. At this point the force signaltonoise ratio can be improved in various ways.
Increasing the energy cut off will directly reduce the force noise, as it increases the resolution of the DFT realspace grid(s). Increasing the energy cut off partway through phase 2 relaxation temporarily increases the residual forces because the equilibrium atomic configuration also changes slightly. However, after a few relaxation steps the residual force will be reduced below that before the increase of energy cut off. See figure 5 on page 14.
In DFT codes small changes in nuclear positions with respect to the realspace integration grid cause significant energy changes-the eggbox effect described in Junquera's slides [6] and in RuizSessano et al 2012 [15]. Apart from increasing the energy cut off, specific DFT codes have specific ways of reducing the impact of the eggbox effect.
In SIESTA, eggbox noise can be reduced by specifying a set of realspace grid fractional offsets with regular spacing at which forces are recalculated. Taking the average of the results across all fractional offsets often reduces the force noise by one or two orders of magnitude, though sometimes this is not effective unless the energy cut off is also increased. CASTEP and other plane wave codes use a realspace fine grid with a resolution which can be altered independently, both of the energy cut off, and of the standard grid scale used for the reciprocalspace grid. Increasing the realspace fine grid resolution reduces eggbox force noise.
SIESTA provides the total of forces along each lattice vector direction for each DFT step-referred to below as the 'total force discrepancy from zero'. Physically the total should . The green, two dots, one dash line is for the CASTEP BFGS relaxation of a shorter SrTiO 3 × 3/SrRuO 3 × 6/SrTiO 3 × 3, 62 atom slab with no field. Both use a DFT SCF (selfconsistent field) energy convergence of 10 −9 eV/atom. The twophase Hessian relaxation iteration number shown is for the second, relaxation phase, before which, all 59 phase 1 discovery steps were run. The twophase Hessian slab converged down to a maximum force on any atom of 1.2 × 10 −5 eV Å −1 after parameters affecting the force noise were changed twice. The number of offsets from each grid point at which forces are calculated and averaged was doubled from 40 to 80 at iteration 15, which did not disrupt the relaxation. The energy cutoff, which also determines the realspace grid spacing, was changed from 1600 eV to 6400 eV at iteration 31, causing an immediate increase in the residual force after which the downward movement resumed. The BFGS relaxation shows much slower convergence, similar to the same slab in figure 1 which was performed with a higher DFT SCF energy convergence of 10 −7 eV/atom. be zero but in a SIESTA DFT step it often is not. However, when relaxing a centred atomic configuration showing inver sion symmetry, the total force discrepancy from zero in each lattice vector direction can be zero or very low. This does not necessarily indicate low force noise, just that most force noise on one atom is not random and is equal and in the opposite direction to the force noise on its inversionsymmetric partner atom, as eggbox effect force noise would be. A similar effect was seen in CASTEP during DFT runs to analyse force noise. Displacing the entire atomic configuration slightly from the centre may expose a higher total force discrepancy from zero.
The techniques above enable most of the processing, including Hessian discovery, to be performed efficiently at higher absolute force noise, but, by using an appropriate dis covery step size h, still with large force signaltonoise ratios. More expensive, lowernoise steps are required only towards the end of the relaxation, optimising overall efficiency.

Number of DFT steps
The number of DFT job steps for phase 1 Hessian discovery is the number of independent atomic coordinates plus one. For a well formed inverse Hessian the number of phase 2 relax ation steps should be independent of the number of atoms. Typically 30 to 50 relaxation steps sufficed for perovskite slabs of 42-72 atoms.
The number of SCF (DFT self consistent field) iterations per job step (each job step resulting in one set of atomic moves) is greatest for the initial base run. Thereafter, reuse of wave function or density output files as input to subse quent steps reduces the number of SCF iterations. The phase 1 (Hessian discovery) second and subsequent DFT steps take approximately the same time. However, each fractional real space grid offset contributing towards the average of forces, as described in section 4.2 above, requires a similar time to an SCF iteration.
Although relaxation steps take less time as the phase 2 relaxation proceeds, the constant processing time for the force calculation at multiple fractional offsets starts to dominate the reducing number of SCF iterations per DFT relaxation step. If higher energy cut offs and/or more realspace grid offsets are used to reduce force noise in the later DFT relaxation steps, each subsequent step will be considerably slower than earlier steps. The number of steps cannot readily be compared with those of Pfrommer BFGS (CASTEP) or modified Broyden (SIESTA) as neither of these provided the desired conv ergence for perovskite slabs without manual intervention.

Results
The convergence of the relaxation for an SrTiO 3 × 4 (unit cells)/SrRuO 3 × 6/SrTiO 3 × 4, 72 atom slab with external applied field of 1 V Å −1 ; is shown in figure 5. Compared to the results obtained using the initial parameters, parameter changes to reduce the noise in the forces restored the down wards convergence, enabling a reduction in residual forces of a further two orders of magnitude.
Using the twophase Hessian algorithm the slabs of interest were relaxed to well below the target of 10 −4 eV Å −1 using SIESTA 4.1b3. Typically, multiple twophase Hessian runs were required. This compared favourably with those few SIESTA relaxation runs using the modified Broyden method which came close to a higher atomic force convergence target of 2 × 10 −4 . But, in those cases, the slab had to be manually (de)strained multiple times to remove obvious stress before this target could be reached.
A BFGS 1D relaxation of an SrTiO 3 × 3 (unit cells)/ SrRuO 3 × 6/SrTiO×3 slab, with no applied field, is also shown on the chart (maximum force on any one atom shown by the green, two dot, one dash line). The convergence of this is very slow. This slab is slightly shorter than those in the SIESTA, twophase Hessian relaxation, and there is no applied electric field (allowing imposed symmetry). Both of these are reasons why the BFGS relaxation would be expected to be easier and quicker than the twophase Hessian relaxation.
A comparable SIESTA, modifiedBroyden relaxation was not possible, as SIESTA does not allow only one supercell lat tice vector to be varied to perform a 1D relaxation.

1D simple ball and spring model results
To provide further insight the Pfrommer BFGS and twophase Hessian relaxation algorithms were tested on a simple 1D ball and spring relaxation model. This model is described in more detail in the supplementary materials (stacks.iop.org/ JPhysCM/30/315901/mmedia).
The system studied consisted of a chain of 15 atoms joined by harmonic springs with arbitrary, nonregular, initial spacing and with both random force noise with a Gaussian distribu tion and eggbox force noise, described in section 4.2.1. The eggbox force noise is sawtoothshaped, varying periodically with atomic position offset from the nearest grid line of a randomlypositioned grid. The zeroes of the sawtooth occur when an atomic position coincides with a grid line.
For a 15atom chain, the ball and spring model force conv ergence reached for each algorithm is shown in table 1.
The ball and spring model was run 100 times each with Gaussian force noise, eggbox noise, or a mixture of the two. Each run consisted of 100 iterations, and the lowest value for the maximum residual force on any individual atom was noted. The higher the fraction of eggbox noise relative to random Gaussian noise, the worse the Pfrommer BFGS algo rithm performed relative to the twophase Hessian algorithm. However, the minimum residual force was highly dependent on the exact resolution and offset of the nominal grid, relative to the expected equilibrium atomic positions. Thus the values in table 1 should be regarded as indicative only.
In the presence of eggbox noise, the Pfrommer BFGS algorithm performed worse. There were also significantly more numerical problems than with the twophase Hessian approach, caused by breaches of the requirement that the inverse Hessian remains negative semidefinite.
The twophase Hessian approach gave maximum ben efits close to the noise level when only 0.3 of the calculated displacement was applied in the following iteration. This tech nique enabled convergence to lower residual forces than either the Pfrommer BFGS approach or the twophase Hessian approach when the full calculated displacement was always used.
The twophase Hessian approach appeared to perform similarly in real DFT runs and in ball and spring model runs. However, the Pfrommer BFGS relaxation performed worse in real DFT runs than in ball and spring model runs, for reasons which we do not understand.

Conclusions
The twophase Hessian approach works well for slabs not successfully relaxed by the modified Broyden or Pfrommer BFGS algorithms. It could also be useful for bulk materials where very low residual forces are more important than run time. The inverse Hessian matrix derived using individual atom moves is constructed from forces with high (30-100) signaltonoise ratios. In the standard Pfrommer BFGS algo rithm the residual forces (i.e. signal), used to dynamically update the inverse Hessian, reduce as the relaxation proceeds, whereas the force noise remains constant. The reduced signal tonoise ratio degrades dynamic updates applied to the inverse Hessian during the later relaxation steps.
If the twophase Hessian algorithm is operating in the linear force regime, it is more robust and relaxes to lower residual forces than standard Pfrommer BFGS. If outside the linear force regime, the full twophase Hessian process needs to be repeated.
The twophase Hessian procedure allows for increasing the force signaltonoise ratio towards the end of the relaxation, by increasing the energy cutoff and/or taking additional measures to counter eggbox effects. This can reduce the residual forces efficiently, incurring the overhead of increased calculation only when necessary. In principle this could also be imple mented in the Pfrommer BFGS algorithm, but care would be necessary to avoid updating the dynamic inverse Hessian for the step during which a higher energy cut off was first used, as the forces would change purely because of a new equilib rium atomic configuration caused by the higher energy cut off. This would cause a bad inverse Hessian update for that step. Similar considerations apply to modified Broyden.
The supplementary materials contain implementation details for the twophase Hessian process and further details of the 1D ball and spring model. Comparison of the residual atomic forces, for the Pfrommer BFGS and the twophase Hessian algorithms, calculated using a 1D ball and spring model of 15 atoms. For each of the force noise settings, each algorithm was run 100 times. Each run consisted of 100 iterations from which the lowest maximum force on any one atom was chosen. The 100 iterations of the two phase Hessian model do not include the iterations required for the Hessian discovery phase (equal to the number of atoms in the chain plus one). The starting position of atoms was manually adjusted at random. With the addition of significant eggbox noise, the Pfrommer BFGS algorithm performed worse relative to the two phase Hessian algorithm.