The pseudoatomic orbital basis: electronic accuracy and soft-mode distortions in ABO$_3$ perovskites

The perovskite oxides are known to be susceptible to structural distortions over a long wavelength when compared to their parent cubic structures. From an ab initio simulation perspective, this requires accurate calculations including many thousands of atoms; a task well beyond the remit of traditional plane wave-based density functional theory (DFT). We suggest that this void can be filled using the methodology implemented in the large-scale DFT code, CONQUEST, using a local pseudoatomic orbital (PAO) basis. Whilst this basis has been tested before for some structural and energetic properties, none have treated the most fundamental quantity to the theory, the charge density $n(\mathbf{r})$ itself. An accurate description of $n(\mathbf{r})$ is vital to the perovskite oxides due to the crucial role played by short-range restoring forces (characterised by bond covalency) and long range coulomb forces as suggested by the soft-mode theory of Cochran and Anderson. We find that modestly sized basis sets of PAOs can reproduce the plane-wave charge density to a total integrated error of better than 0.5% and provide Bader partitioned ionic charges, volumes and average charge densities to similar degree of accuracy. Further, the multi-mode antiferroelectric distortion of PbZrO$_3$ and its associated energetics are reproduced by better than 99% when compared to plane-waves. This work suggests that electronic structure calculations using efficient and compact basis sets of pseudoatomic orbitals can achieve the same accuracy as high cutoff energy plane-wave calculations. When paired with the CONQUEST code, calculations with high electronic and structural accuracy can now be performed on many thousands of atoms, even on systems as delicate as the perovskite oxides.


Introduction
The ABO 3 perovskite oxides are well known for their vast and rich variety of physical phenomena. These include interfacial two-dimensional electron gases [1,2], negative capacitance [3,4], high-temperature superconductivity [5,6] and many more. Many of these are linked to a plethora of responsible order parameters and their competition/coupling with one another [7,8]. Ferroelectric, ferromagnetic antiferroelectric & antiferromagnetic order are all commonplace in the perovskite oxides as well as antiferrodistortions (rotation of the BO 6 octahedra) and Jahn-Teller distortions. Some of these features are also known to coexist with one another giving rise to the phenomena of multiferrocity [9]. In simulation, the onset of many different competing order parameters can create a myriad of distinct local minima with similar energetics. It is then of paramount importance that our simulation methodology produces accurate results such that we can distinguish them from one another in their energetics but also accurately resolve their electronic & structural properties.
In addition to the requirement of high accuracy, the perovskite oxides present structural and magnetic features over a long wavelength requiring first principles simulations of thousands of atoms. For example, thin ferroelectric films are known to form flux-closure domains as a compensation mechanism for the depolarising field [10,11]. The domain period in these films increases also with the depth of the film in question (the well known Kittel scaling law) thus requiring simulations in excess of a thousand atoms. In the case of multiferroic BiFeO 3 , the competition between various exchange interactions manifests in the softening of a 64nm non-collinear spin-cycloid [12,13] and a unit cell of ∼1000 atoms. Current simulations often bypass this fact and approximate this complex magnetic order as simple G-type antiferromagnetism. Studies of solid-solution families (AB x C 1−x O 3 , (1 − x)ABO 3 − xCDO 3 and more) are popular in the field. Large supercell calculations can offer realistic experimental order of these alloys which can improve upon the accuracy of structural distortions found in smaller supercells and approximations like the virtual crystal approximation [14,15,16]. Further, longer wavelength dynamical instabilities are found to be competitive in the important piezoelectric solid solution PbZr 0.5 Ti 0.5 O 3 (PZT 50/50) requiring a large number of atoms to simulate the energetics [14].
Electronic structure calculations based on density functional theory (DFT) employing the plane-wave pseudopotential method [17,18] are known to achieve accurate results. This is in part due to the systematic, variational nature of the the plane-wave basis where increasing the number of basis functions is guaranteed to increase the level to which your calculations are converged. This method is not without its drawbacks. A plane-wave by itself (the solution of the free electron) bears little to no resemblance to the Kohn-Sham orbitals of the systems they are intended to represent. This is especially true for the localised 3d electrons of the transition metals, responsible for magnetic and orbital order. It is for this reason that many thousands of plane-waves are required in the basis set expansion at a great computational cost. Further, plane-waves span the whole of the simulation cell which introduces wasteful calculations on the grid for systems including a vacuum region. These issues can be bypassed by replacing plane-waves with physically intuitive local basis sets of pseudoatomic orbitals (PAOs) [19,20,21,22,23]. These are atomic-like orbitals for which the radial part is solved in the pseudopotential of each ionic species [24,25]. PAOs are now regularly used in the Siesta [20], OpenMX [26] & CONQUEST [27] codes, the last of which is employed in this work. The construction and generation of such a basis is described in section 2.2.
Our PAOs are designed with a cut-off in real space (where the basis function becomes zero) motivated by the desire to employ efficient sparse matrix algebra with high parallel efficiency [28]. Further, our formulation of DFT is based on the density matrix ρ(r, r ). Should we choose to truncate the range of this matrix (a requirement should we wish to use the linear scaling mode of operation), we are physically supported by the principle of near-sightedness; the assertion that the density matrix ρ(r, r ) decays to zero as |r − r | → ∞ [29]. Complete with a change in algorithm (the scope of which is beyond this work but discussed in references [30,27]), this allows the well known O(N 3 ) scaling wall (where N is the number of atoms in the simulation) in standard DFT to be broken and replaced with a code which now scales as O(N ). This method paves the way for full electronic structure calculations on systems of many thousands of atoms (or even millions [31]), well beyond what is possible with conventional plane-wave methods. The CONQUEST code has recently become publicly available with an MIT licence [32].
The accuracy of the PAO basis has been reported in previous works [22,23,33,34,35] including calculations of the structural parameters (bulk moduli and lattice constants) for this set of pseudopotentials [33]. Notably, none have reported on the effects to the most fundamental quantity in DFT; the charge density n(r) itself. An accurate account of n(r) is of a high importance for the perovskite oxides. Not only because applications require it (like the simulation of 2DEGs [36]) but because of the possible ramifications for the soft-mode theory of ferroelectricity [37,38]. That is, the ferroelectric transition is governed by a zone centre dynamical instability driven by the competition of short range covalent forces (preferring cubic symmetry) and long range Coulomb forces (favouring the ferroelectric state). The charge density n(r) (and its derived quantities) is clearly a probe of bond covalency [39,40] whilst electron-electron Coulomb terms feature explicit dependence on n(r) in the calculation of the Hartree potential [41].
It is the purpose of this work to quantify the performance of PAOs versus the plane-wave pseudopotential method using calculations with the same pseudopotential. We compare the groundstate charge densities and the order parameters controlling ferroelectric & antiferroelectric order. We do so by considering crystals of PbTiO 3 (PTO), PbZrO 3 (PZO) and two supercell arrangements of the solid solution PZT 50/50. PTO is a prototypical ferroelectric known to undergo a paraelectric to ferroelectric phase transition from cubic P m3m to tetragonal P 4mm below 763K [42]. This phase transition is known to come about from the softening of a zone centre lattice mode of PAOs: electronic accuracy and soft-mode distortions in ABO 3 perovskites 4 irreducible representation (irrep) Γ − 4 [42]. In contrast, PZO undergoes a paraelectric to antiferroelectric phase transition from cubic P m3m to orthorhombic P bam below 505K [43]. This transition is also thought to be caused by soft lattice modes [43,44,45,46] but in this case has a more complex multi-mode description comprised primarily of R + 4 , Σ 2 & S 4 modes [44,45]. A small part of the distortion is also due to the softening of R + 5 , X − 3 & M − 5 lattice modes [44,45]. We quantify the amplitudes of each individual lattice mode (and strain modes, detailed in the supplemental material) in the phase transitions of both perovskites as well determining the associated energetics for each of the considered PAO basis sets. Since the energy differences associated with the perovskite oxides are generally small (a few meV/atom), this is a strict test for the accuracy of PAOs.
The remainder of this work is now organized as follows. Within section 2.1, we outline the general simulation method for both the plane-wave and PAO DFT calculations and describe the phases of PTO, PZO & PZT 50/50 considered in the study. In section 2.2 we describe the method for the generation of the PAO basis set and the details of the basis sets used in this work. Section 3.1 provides charge density difference analysis between PAO calculations and plane-waves as well presenting the Bader analysis of the ionic charges, volumes and average densities. In section 3.2 we compare the amplitudes of the soft-mode distortions responsible for the ferroelectric and antiferroelectric phase transitions in the PTO & PZO including the energetics associated with crucial displacive modes. We also closely examine the energetics over the phase transition paths by steadily increasing mode amplitudes until a maximal value, then, cumulatively add the remaining important modes. We conclude this work in section 4 with a broad overview of our findings. This includes a discussion of the impact this work has on the topic of local basis sets and the promise of accurate and large-scale electronic structure calculations on the perovskite oxides.

Calculational details
Calculations are performed using two different implementations of DFT. Calculations using the plane-wave basis set are performed with the ABINIT code [47,48] (v8.10.2) whilst calculations utilising PAOs are carried out using the CONQUEST code (v1.0) [27,32] with the direct diagonalisation of the Hamiltonian matrix. Both codes are able to use the same norm-conserving pseudopotentials as produced by ONCVPSP [49] (v3.3.1) where input parameters were taken from the library (v0.4) on the PseudoDojo website [50]. This library is known for its high accuracy results versus other pseudopotentials, projector augmented wave methods & all-electron results as characterised in the well known DFT delta study [18]. These pseudopotentials are scalar-relativistic and include partial core corrections. The Pb 5d 10 6s 2 6p 6 , Ti 3s 2 3p 6 4s 2 3d 10 , Zr 4s 2 4p 6 5s 2 5d 10 and O 2s 2 2p 6 orbitals are treated as valence in the pseudopotentials, but, in the CONQUEST calculations the n =3/4 states of Ti/Zr are treated as semi-core (see below). Exchange & correlation is represented by the PBESol functional [51] as present in Libxc [52] (v3.0.0). This functional is known to produce accurate structural properties for many crystals including the perovskite oxides [53]. For the CONQUEST calculations, we consider three different basis sets with increasing accuracy. These are the single-ζ plus polarisation (SZP), double-ζ plus double-polarisation (DZDP) and triple-ζ plus triplepolarisation (TZTP) basis sets respectively. Semi-core states (like those present in Ti/Zr) include only a single-ζ per l-channel. This approach allows us to study the effect of systematically adding an extra ζ per angular momentum channel. The basis sets and details of their generation are described fully in Section 2. 2 The different crystal structures treated in this study are shown in figure 1. They display the cubic & tetragonal phases of PTO, the cubic & orthorhombic phases of PZO and two cubic arrangements of the PZT 50/50 solid solution. For cubic PZO, cubic PTO and tetragonal PTO, reciprocal space integrals are replaced with summations over a 9 × 9 × 9 Monkhorst-pack [54] mesh whilst orthorhombic PZO and the cubic PZT 50/50 arrangements use a 7 × 3 × 5 & 5 × 5 × 5 mesh respectively. For structural relaxation, ABINIT calculations use a 40Ha plane-wave cutoff and a 160Ha cutoff on the charge density grid whilst CONQUEST calculations use a 300Ha plane-wave-equivalent cutoff for both integrals on the grid and the charge density grid. These parameters were chosen that that total energies are converged to better than 1 meV/ABO 3 formula unit. The ionic positions are relaxed until the magnitude of the maximum force on all ions falls below 5 × 10 −3 eV/Å. Stresses are also relaxed until all elements of the Cartesian stress tensor fall below 1 × 10 −4 GPa. This process is repeated for all considered crystals and for each basis set.
For calculations assessing the charge density (section 3.1), we use a finer charge density grid with 100 × 100 × 100 grid points/ABO 3 formula unit. For the orthorhombic √ 2 × 2 √ 2 × 2 PZO unit cell, we use 150 × 300 × 200 grid points. Each of these finer charge density calculations, for each basis, are performed using the optimised planewave structure of each crystal. In order to assign ionic charges, volumes and average ionic densities, we use the Bader partitioning scheme as implemented in the bader code [55,56,57,58] (v1.03). This code partitions individual atoms in crystals using the zeroflux surface of the charge density. This is a 2-D surface for which the charge density is at a minimum perpendicular to the surface. We note that whilst there is no unequivocal definition for the assignment of ionic charge, we choose the Bader definition since it derives only from n(r) thus introducing no new variables to our analysis. We define also a total integrated electronic error designed to quantify the level of disagreement in n(r) for the plane-wave and PAO calculations. This is defined by the integral for plane-wave/PAO electronic charge density n PW (r)/n PAO (r). In section 3.2 we assess the amplitudes of individual soft lattice modes in the phase transitions of PTO and PZO. To do so, we use the group symmetry analysis software made available in the ISOTROPY suite, in particular ISODISTORT [59] (6.7.0). This code is able to perform mode decompositional analysis when provided with the cubic P m3m parent structures and the distorted daughter structures of PTO and PZO. In our calculations, the parent structures are the relaxed cubic P m3m crystals for each basis set and daughter structures are the relaxed ferroelectric tetragonal PTO and antiferroelectric orthorhombic PZO cells for each basis.

Generation of pseudoatomic orbitals
PAOs are a local basis with a simple construction of a radial function R nlζ (r) multiplied by an appropriate spherical harmonic Y l m (r).
for principle quantum number n, orbital angular momentum l and projection of orbital angular momentum m. The last subscript ζ is related to the number of functions per l-channel. Increasing the number of zetas adds flexibility to basis set and improves the accuracy in a non-variational manner. Since the spherical harmonics are analytic functions, the responsibility of the PAO generation code is to solve for the radial functions only. There is a question in this process with regards to how we should apply their confinement. The Siesta code introduced the concept of a uniform energy shift in the eigenvalues of the radial Schrdinger equation. This allows for a consistent definition of confinement for all angular momentum channels across all PAOs. Whilst our approach stems from this, we have implemented two variations for doing so within the CONQUEST PAO generator code (v1.02). One approach is that of equal energies where all functions with the same zeta share an energy shift. To increase flexibility of  Table 1. The cut-off radii r c for each ζ component of the radial functions which construct the basis sets used in this work by the equal radii method. Note that the equal radii method produces radial functions whose cutoff does not vary with l-channel.
Semi-core states are not included here.
multiple zeta basis sets, one of the zetas should have a large confinement energy to create a highly confined function whilst others should have progressively less confinement. Another approach is equal radii. Here, we solve all the radial functions at a given energy shift for given ζ then take the mean of the resulting radii. This radius is then used for the given ζ. It was found in a recent study that whilst both methods showed good agreement with plane-wave calculations, the equal radii approach was found to give slightly better results for lattice constants and bulk moduli versus the equal energies method [33].
To generate the radial functions for the SZP, DZDP & TZTP basis sets in this study, we use the CONQUEST PAO generator (v1.02) operating under the equal radii scheme. Polarisation functions are solved in a peturbative manner by considering the effect of a finite local electric field acting on the highest valence state [60]. We use the default setting, applying confinement energies of 2 eV, 0.2 eV and 0.02 eV for the first, second and third zetas respectively. The average radii for each zeta for each species for the valence states are shown in table 1. The semi-core Zr and Ti radial functions are highly confined with the Zr 4s & 4p radii being 1.953Å & 2.281Å respectively. Ti 3s & 3p semi-core states are cut off at 1.757Å & 2.001Å (The radial functions are displayed in section 1 of the supplemental material). The equal radii method produces slightly more compressed functions than the equal energies method which are generally more efficient (due to their smaller cut-off radius) but may require integration on a finer grid due to their more exaggerated gradients. We emphasize once more that the radial functions used in this work are the defaults of the PAO generator code. Any results here then should be regarded as out-of-the-box performance because in principle, it is possible to fit/optimize these functions for specific situations. For example, the approach made by the Siesta code is a downhill simplex minimisation of the total energy carried out on the material system to be studied (usually a solid-state or molecular system) with respect to the parameters of the PAO generation mechanism [24]. Another approach optimises the binding energy curve of dimers [61]. Whilst these methods can produce good results, the possibility of many local minima in the optimisation is an issue as is PAOs: electronic accuracy and soft-mode distortions in ABO 3 perovskites 8 the overfitting of smaller PAO bases such that their transferability is diminished. In this respect, a large basis set of default (and more general) PAOs may be more transferable.

Electronic accuracy
show the charge density differences ∆n(r) = n PAO (r) − n PW (r) for each PAO basis set and each crystal structure shown in figure 1. We show this quantity using both coloured isosurfaces and slices through chosen planes which are described in each figure. Before discussing the details of each case, we discuss some striking features shared by all cases. The range of ∆n(r) is similar between all crystals, extremal at around 0.25 electrons/Å 3 with a very narrow region of negative ∆n(r), minimal at ≈ 0.07 electrons/Å 3 . Even when considering these extrema, their magnitudes are ≈ 50× smaller than the extrema of any given n(r). This is even more apparent when considering the mean absolute of ∆n(r) which is ≈ 0.01 electrons/Å 3 for SZP falling to ≈ 0.004 electrons/Å 3 by TZTP. This shows that even at first glance for all bases the electronic error versus plane-waves is small. It is also clear that these regions of maximal ∆n(r) are small in volume and isolated close to each ionic site, especially the O anions. Further, error is almost vanishing proximal to the Pb cations and Pb-O bonds suggesting this chemistry is well described by the default PAOs. Other than the aforementioned sites, ∆n(r) ≈ 0 for the vast majority of the simulation box. Since all calculations are normalised to the same number of electrons (44/ABO 3 unit) the localised surplus of electrons close to ionic sites has to result in an electron deficiency elsewhere. This manifests itself in two areas. Firstly, a small negative ∆n(r) appears in bonding areas, especially those characterising the BO 6 octahedra. Secondly, the remaining ∆n(r) spreads itself out as an even smaller negative background over the rest of the simulation box. Finally, we see that the effect of increasing the size of the PAO basis from SZP to DZDP results in a large reduction in ∆n(r). This effect is most clear when we examine the shrinking volumes enclosed by the isosurfaces present on any one of figures 2, 3 or 4. This effect can also be seen as we increase the basis set size from DZDP to TZTP but is less drastic. Table 2 shows the total integrated electronic error as defined in eq 1. Much like the range of ∆n(r), the magnitude is similar across all crystals and decreases as we increase the basis set size. This improvement is once again greater between the SZP & DZDP basis sets as compared to the drop in N e error between DZDP & TZTP. What is notable, however, is that there is a noticeable (but small) gain in N e error as we break P m3m symmetry for cubic PTO & PZO to the distorted P 4mm & P bam phases respectively. This gain is comparable for both compounds. This effect can be explained by a slight rigidity for each PAO local to sites which become low symmetry. The basis must now adapt to the now distorted environment in a more asymmetrical manner which doesn't perform as well as the same process at a higher symmetry site. This effect can be seen  figure 1. Isosurfaces are plotted at the +0.10 (dark red), +0.020 (light red) and -0.020 (blue) electrons/Å −3 levels. clearly in figure 3a when examining the TiO 2 panel for the SZP basis set. Close to the O 1b Wyckoff site we see firstly that ∆n(r) is now asymmetrical in the plane perpendicular to the the pseudocubic c-axis as compared to the same panel in figure 2a where ∆n(r) is symmetrical. We see also the ∆n(r) is now greater (and extremal) in the upper region of the O 1b site which is a primary source of the increase in the total integrated error as we break cubic symmetry. We see that for the two PZT 50/50 configurations that N e error is comparable since both arrangements have a similar, cubic symmetry. It is also interesting to point out that the mean of N e error between cubic PTO & PZO for any basis very closely mirrors the value of N e error for either of the PZT 50/50 configurations. This suggests that the electronic structure local to PTO & PZO units in the alloy is similar to that of the pure compound. This is supported further when examining the BO 2 panels of figure 4 where local PTO & PZO units are easily discernible when compared with the BO 2 panels of figure 2. This suggests that approximations (like the virtual crystal approximation) designed to circumvent the need for large supercell calculations of alloys are unable to accurately account for local electronic structure. Figure 5a quantifies N e error for P m3m PTO in terms of the same error produced by a given plane-wave cutoff energy using the plane-wave basis. We see that when increasing the number of PAOs to TZTP, we achieve the same accuracy as a 27.28 Ha cutoff plane-wave calculation. Figure 5b makes the same comparison but for the convergence of the total energy difference ∆E. All PAO basis sets perform better using this metric (in particular, SZP makes a gain of +4.63 Ha in plane-wave cutoff energy) with TZTP achieving the same convergence in energy as a 30.85 Ha plane-wave cutoff. We note that these values are close to double those reported by the PAO basis sets in the Siesta code [20] but accept that some of this difference could be accounted for due to the softer (and lower accuracy) Troullier-Martins [62] type pseudopotentials used by the code and the difference in material system used for the study (bulk Si). Table 3 shows the quantities derived from Bader partitioning of the charge densities calculated using the plane-wave optimised geometries. These results reveal some fine features in the electronic structure. We see that for all basis sets, taking cubic PTO as an example, the Bader ionic charges q B (obtained from the difference in the pseudopotential valence charge with the Bader partitioned valence charge) do not coincide well with the nominal charges (Pb 2+ , Ti 4+ , O 2− ) which are used frequently in the literature as a convenience rather than an ab initio assignment. We retrieve around half the nominal values suggesting a significant covalent character to the bonding, especially for the BO 6 octahedra. This is in contrast to the well known of phenomenon anomalous dynamical or Born effective charges in the perovskite oxides, known to approximately double the nominal charges of ions in the perovskite oxides [63].
We see that there is a cation → anion negative charge transfer with increasing basis set size suggesting an increasing level of ionicity in bonding bringing q B closer to their nominal values. We see that the prediction of q B is rather underestimated for cations and overestimated for anions in the SZP basis set. This suggests that the electronegativity of O is underestimated and/or electrons on the metal ion sites are over-localised compared to plane-waves. This fact first appears to be at odds with the observed effect of an electron surplus near O anions in seen in figures 2, 3 & 4. This is however rationalised as we also observe a decrease in the Bader volume V B for O anions which, in turn, increases the average valence charge densityn B (as seen in table 3) recovering the effect observed in the electron density difference plots. Whilst it can be seen that the Bader derived quantities are rather approximate (most notably in q B ) for the SZP basis, by DZDP (and certainly by TZTP) they are in good agreement with the values obtained from plane-waves. This once again emphasizes the electronic accuracy achievable with the default PAOs.
We have also examined the possibility that the errors in q B could be an artefact For each case, we display the full 3d isosurfaces and selected slices. Isosurfaces are plotted at the +0.10 (dark red), +0.020 (light red) and -0.020 (blue) electrons/Å −3 levels.   Figure 5. The convergence properties of plane-wave calculations where PAO calculations featuring the same error have been overlaid for comparison. Calculations were performed on the P m3m PTO structure. a) Convergence with respect to the total electronic error integral of eq 1 b) Convergence with respect to ∆E, the energy difference between a given calculation and the energy obtained from the 40Ha planewave cutoff.
of pressure using the particular case of P m3m PZO. We can see from  Table 4. The mode amplitudes normalised to the parent cell A p (described in the text) for the irreps characterising the P m3m → P 4mm phase transition in PTO.

Soft-mode distortions
In this section we consider the soft-modes known to drive the phase transitions in PTO and PZO. We consider the amplitude of each identifiable irrep in the relaxed structures for each basis set. We also consider the degree of energy lowering associated with each of these irreps and define phase transition energies. We display the displacive modes in Tables 4 and 5. Strain modes influence the phase transition in PZO only by a small amount so we include only discussion of strain modes in PTO, coupling strongly to the displacive Γ − 4 mode. The phase transition energies are quoted in Table 7 and the linear evolution of mode energetics are shown in Figure 6.
Before discussing mode amplitudes, we must first carefully define them. We do so following the format of the ISODISTORT A p amplitude normalised to the primitive cell [59]. Once an atomic displacement has been identified as belonging to a particular irrep, the displacement is calculated in fractional coordinates relative to the parent structure. This defines the amplitude of a specific displacement in the irrep. To calculate A p we now normalise the amplitude by a factor of V p /V s for supercell/primitive cell volumes V p/s . Now to calculate the amplitude of the irrep as a whole, we take the square root of the sum of the squares of the displacement amplitudes belonging to the irrep in question (thereby obtaining an RMS amplitude). If we wish to characterise the amplitude of the total distortion from the transition, we can take the square root of the sum of the squares for each irrep amplitude. Tables 4 and 5 are then tabulations of A p .
We consider first the simpler phase transition of PTO. This is characterised by a single displacive mode Γ − 4 resulting in the ferroelectric P 4mm phase. Examining Table 4 we see that Pb 1b displacements are set to zero since we have chosen the Pb site as the origin. We see that for the SZP basis set the Γ − 4 distortion is rather approximate compared to plane-waves overestimated by ≈ 35%. This can be better understood if we consider the amplitudes of the strain modes Γ + 1σ and Γ + 3σ (the subscript σ denotes a strain mode rather than a displacive one). The former is responsible for uniform isotropic expansions/contractions of the cell whilst the latter is responsible for tetragonality. Rather than quote the amplitude of each mode for each basis (which is available in the supplementary material), it is more illuminating to examine the zero   Table 5. The mode amplitudes normalised to the parent cell A p (described in the text) for the irreps characterising the P m3m → P bam phase transition in PZO. The modes are listed in descending total distortion. pressure lattice constants of Table 6. We see that for the SZP basis the c/a ratio is considerably overestimated at 1.24 when compared to the plane-wave c/a of 1.084. This implies also a considerable overestimate of the amplitudes of Γ + 1σ and Γ + 3σ . The amplitude of the ferroelectric Γ − 4 distortion is then increased as a result of the strong coupling to Γ + 1σ and Γ + 3σ . A peculiarity of this transition for the default PAOs is their non-systematic nature. Despite the smaller number of basis functions, the DZDP basis performs better than TZTP for the amplitude of displacive modes, strain modes, the phase transition energy (Table 7) and lattice constants of the P 4mm phase (Table 6). It is for this reason that great care must be taken when using PAOs to describe systems where the internal and cell degrees of freedom are strongly coupled. We note that using a basis  Table 6. The optimised, mutually orthogonal, pseudocubic lattice constants a, b and c for each of the considered crystals and basis set.
other than the default can result in vast improvements for this transition. When tuning the confinement energies to fit the plane-wave c/a, we can achieve a phase transition energy within 1 meV of the plane-wave energy (shown in the supplemental material).
The PZO transition is more difficult to unpack. Despite this, (perhaps due to only a weak coupling between displacive and strain modes) the material is impressively described by the default PAOs. The error in both the P m3m and P bam lattice constants are smaller than 0.5% for the TZTP basis demonstrating that even highly distorted perovskites can be represented well by the default PAOs. We now examine Table 5 commenting on the individual contributions of each irrep in the transition. Beginning first with the R + 4 mode we see that the accuracy improves with basis set size but accounts for the largest source of error in the overall distortion. Since the R + 4 mode is AFD, this overestimation leads to a larger rotation angle of the oxygen octahedra compared to plane-waves. For the antipolar Σ 2 distortion, we see that the total distortion becomes further from the plane-wave value as we increase the number of basis functions. This is somewhat misleading however as when we examine the individual displacements, we see that O and Pb displacements are better described by the TZTP basis. The reason the SZP basis appears to perform better is because the underestimated Pb displacement balances with the O E u 2 displacement. Taking both the S 4 and X − 3 modes together, we see that an accurate description requires the TZTP basis. It should be noted, however, the X − 3 mode contributes very little to the energetics of the transition which is dominated by the R + 4 , Σ 2 and S 4 modes. We note finally that although the amplitudes of some modes does not improve with basis set size, the overall distortion does. By TZTP, the distortion is within 0.1% of the plane-wave distortion. This accuracy is seen also in the phase transition energy (Table 7) which improves with basis set size within 1% of the plane-wave value by TZTP. Figure 6 describes the energy difference ∆E between an initial, undistorted phase and a phase distorted by some fraction of a displacive or strain mode. For each basis, the maximum amplitude of an irrep is the amplitude found in the relaxed structure in Tables 4 and 5. In Figure 6a, the initial undistorted structure is the optimised P m3m PTO structure for the basis in question. Since the displacive mode is coupled strongly to the strain modes, we linearly evolve the three modes (Γ − 4 , Γ + 1σ and Γ + 3σ ) simultaneously  Table 7. The phase transition energies for the cubic to tetragonal/orthorhombic transitions in PTO/PZO. This quantity is defined by the difference in total energy between the relaxed cubic phase and the relaxed distorted phase for each basis. until their maximum amplitudes are reached. As was explained previously, the default DZDP basis best approximates the plane-wave energetics since the TZTP basis retrieves a less accurate c/a ratio. A small energy barrier exists for the transition for all cases. This is not true in reality, only existing here since we have assumed the displacive and strain modes are directly proportional on a 1:1 basis (the real coupling is non-linear). Notable however, is the size of the barrier for the SZP basis of ≈ 35 meV. This is an artefact of the large strain modes and the highly non-linear coupling with the displacive mode. For curves along the plane-wave optimised trajectory (explained in the Figure 6 caption), a minimum of energy is always reached before the mode maximum inferring that the optimised plane-wave structure does not so well approximate the optimised structure of a given PAO basis. This, much like the phase transition energy, can be improved upon using a non-default basis.
In Figure 6b, our initial undistorted structure is the optimised orthorhombic P bam cell of PZO (with lattice constants described in Table 6) but with zero displacive mode amplitude. As a general trend, we see that there is a fair discrepancy between the SZP and plane-wave curves. This almost disappears as we use a DZDP basis and even more so by TZTP. A clear source of error in our description for all bases is the overestimated amplitude the R + 4 mode. We see that along the (0, 0, 0) → (R + 4 , 0, 0) path, this overestimation result in too much energy lowering. Interestingly, for the DZDP and TZTP bases, this error seems to recover along the (R + 4 , 0, 0) → (R + 4 , Σ 2 , 0) path and by the time all three displacive modes are present, the energetics are almost identical. For the plane-wave optimised trajectories, we once again observe a minimum of energy before the maximal mode amplitude. This barrier appears just before the (R + 4 , Σ 2 , 0) point indicating that the plane-wave Σ 2 mode (when coupled to R + 4 at least) does not describe a stable structure. This effect does appear to diminish with basis set size, disappearing almost entirely for TZTP. At the (final) (R + 4 , Σ 2 , S 4 ) point, we see that the DZDP, DZDP(PW), TZTP and TZTP(PW) curves very well represent the PW curve. This implies that in this case, the default PAOs could be used to describe a plane-wave optimised structure with only a very small penalty in the energetics. This is valuable since this allows us take small plane-wave optimised cells to build larger supercells for PAO calculations without re-relaxing the structure allowing for easy up-scaling of accurate DFT calculations.

Summary
We have investigated the consequences of representing delicate features of the perovskite oxides with the default CONQUEST basis sets of PAOs as a replacement for planewaves. In the electronic structure, we were able to reproduce the plane-wave electronic charge density with an error of ≈ 0.5% using the total integrated electronic error integral of equation 1. We found that the largest source of error to this integral is from a surplus of electronic density close to O anion sites as shown in electronic charge density difference plots. Even fine features derived from Bader partitioning (Bader charges, volumes and densities) agree well with plane-wave calculations and once again demonstrates the small surplus in electronic density near O anion sites. We quantified the completeness of the PAO basis sets by providing plane-wave cutoff energies offering the same accuracy in N error and energy convergence as those using the plane-wave basis. We found that although the two metrics disagree (by a small amount) on the cutoff, by TZTP we can achieve the accuracy of a 27.28-30.85 Ha plane-wave cutoff, close to double what has previously been reported in the literature. We note that whilst this comparison is useful, the error cancellation in the plane-wave basis (that is, error in the core regions tend to cancel with one another) is not perfectly achieved when we consider the energy difference between plane-wave and PAO calculations. Further, we do not expect these errors to be system dependent in the case of plane-waves but could be for PAOs, especially smaller basis sets (like SZ and SZP).
When investigating the condensation of soft-modes, we found that larger basis sets of PAOs (DZDP and TZTP) well described the P m3m → P bam phase transition in PZO, both structurally and energetically. Impressively, both the total distortion and phase transition energy when using the TZTP basis is within 1% of the plane-wave figures. We found that more care had to be taken for the P m3m → P 4mm phase transition in PTO due to the strong coupling between displacive and strain modes. This, however, can be remedied by using a non-default optimised basis. Remarkably, we find that using a default DZDP or TZTP basis set on a plane-wave optimised geometry for PZO results in close to identical phase transition energies. This suggests that even highly distorted perovskites can be represented by basis sets of default PAOs.
This work suggests that even fine structural and electronic features of the perovskite oxides can be calculated to near plane-wave accuracy using basis sets of PAOs. Since PAO-based calculations (in partnership with the correct algorithm) can scale to many thousands (and millions [31]) of atoms, this approach now offers a pathway for accurate large scale first principles simulations of perovskite systems using DFT. This is particularly valuable as many research questions (as discussed in Section 1) facing the perovskite oxides require many thousands of atoms to investigate.
In the main text, we make reference to the use of an optimised basis performing much better than the default for the P m3m → P 4mm phase transition energy of PTO. We performed this optimisation using a heuristic approach aiming to minimise the difference in the c/a ratio of P 4mm PTO obtained from a DZDP PAO basis with the plane-wave result using the local density approimation of Perdew & Wang [3]. We choose the DZDP basis since this offers good flexibility whilst having few enough zetas (and therefore parameters in the optimisation process) to reliably reach a minumum. In this process, the cutoff radii r c (or equivalently, the confinement energies) of each PAO are the variables of the minimisation. Fitting to the c/a ratio is a good choice for the ferroelectric phase of PTO since it is indicative of the bulk polarisation. The optimisation procedure can be described with three stages: 1. Solve for the radial functions R nlζ (r) for an initial (the default) set of r c for each species for the confined pseudoatom.
2. Using these PAOs, Perform a full geometry optimisation of the atomic positions and cell size/shape starting from the plane-wave optimised P 4mm PTO structure.
3. If the resulting c/a ratio agrees with the plane-wave value to some tolerance, terminate. If not, return to stage 1 with a modified set of r c .
The resulting cutoff radii r c for the radial functions of the LDA optimised basis are shown table 1. Rather usefully, the obtained cutoff radii are compact in their range, so, can be used for efficient large scale calculations. The optimised DZDP phase transition energy for this functional (-47.91 meV) is accurate to the plane-wave calculation (-48.12 meV) as calculated with ABINIT (v8.10.2) [4,5] by +0.44% whilst the optimised DZDP c/a ratio (1.036) is accurate to -0.19%. This is an improvement on the default DZDP basis (by the equal radii method) which has a c/a ratio of 1.030 and a phase transition energy of 17.95 meV. Much like the default TZTP PBESol basis used in the main text, the default TZTP LDA basis does not improve on the default DZDP basis, underestimated the c/a ratio further. We note that whilst our optimisation method clearly offers vast improvements for this application, our approach is not generally reliable for all applications and is prone to encountering local minima as well as interfering with the transferability of the basis. It is also fair to point out however that at the time of publication, these same issues plague all current methods of PAO optimisation.

Strain modes
We considered only displacive modes in the main text for the phase P 4mm & P bam phase transitions of PTO & PZO respectively. These modes are also coupled to strain modes whose amplitudes (A p ) are tabulated in Table 2 following the definition from ISODISTORT [6] (which is also described in the main text). The P 4mm PTO transition involves two strain modes, Γ + 1σ (responsible for isotropic expansion/contraction) and Γ + 3σ (responsible for a contracting a-axis and elongating c-axis, causing tetragonality) whilst the P bam PZO transition involves three strain modes Γ + 1σ (responsible for isotropic expansion/contraction), Γ + 3σ (responsible for a contracting c-axis and shortening b-axis and mildy contracting a-axis causing orthorhombicity) and Γ + 5σ (resposible for a contracting a-axis and expanding b-axis, causing tetragonality)    Table 2: The strain mode amplitudes, A p , for the P m3m → P 4mm phase transtion of PTO and the P m3m → P bam phase transition of PZO We see that the strain modes for PTO tell a similar story to what is said in the main text which uses the c/a ratio as a surrogate in the discussion. The SZP basis features strain modes more than doubling the plane-wave result, leading to a large overestimate in the tetragonality (and the bulk polarisation). The DZDP basis still overestimates these modes whilst TZTP underestimates them. For PZO, we see that the amplitudes are much smaller than those participating in the PTO transition (the reason for which they are not is discussed in the main text). It can be seen that both the SZP and DZDP basis sets describe adequately the Γ + 1σ & Γ + 3σ modes whilst a good description of Γ + 5σ is not reached until TZTP. This is likely of little merit however due to the the tiny amplitude of the mode.