HP -- A code for the calculation of Hubbard parameters using density-functional perturbation theory

We introduce HP, an implementation of density-functional perturbation theory, designed to compute Hubbard parameters (on-site $U$ and inter-site $V$) in the framework of DFT+$U$ and DFT+$U$+$V$. The code does not require the use of computationally expensive supercells of the traditional linear-response approach; instead, unit cells are used with monochromatic perturbations that significantly reduce the computational cost of determining Hubbard parameters. HP is an open-source software distributed under the terms of the GPL as a component of Quantum ESPRESSO. As with other components, HP is optimized to run on a variety of different platforms, from laptops to massively parallel architectures, using native mathematical libraries (LAPACK and FFTW) and a hierarchy of custom parallelization layers built on top of MPI. The effectiveness of the code is showcased by computing Hubbard parameters self-consistently for the phospho-olivine Li$_x$Mn$_{1/2}$Fe$_{1/2}$PO$_4$ ($x=0, 1/2, 1$) and by highlighting the accuracy of predictions of the geometry and Li intercalation voltages.


Introduction
The development of density-functional theory (DFT) [4,5] has allowed modeling of a broad spectrum of properties for a large variety of systems. In practical applications DFT relies on approximations to the exchange-correlation (xc) electronic interactions, among which the local-density approximation (LDA) and the generalized-gradient approximation (GGA) are the most popular ones. Both approximations suffer from self-interaction errors (SIE) [6], which limit the accuracy only to simple systems with chemistries determined by s-and p-type electrons. In systems with strongly localized electrons of d and f type, the SIE of LDA and GGA is quite large and leads to a significant overdelocalization of these electrons which translates into quantitative and sometimes even qualitative failures in the description of complex materials.
A popular approach to alleviate SIE in DFT calculations is to use Hubbard corrections to approximate DFT energy functionals [7,8,9]. The rationale for this is that Hubbard corrections impose piecewise linearity in the energy functional as a function of atomic occupations [10], and thus remove/alleviate SIE in the Hubbard manifold both in extended systems and in molecular ones [11,12]. Within this approach, often referred to as DFT+U, the Hubbard correction acts selectively on strongly localized manifolds (of d or f types, typically) through projectors on the corresponding states, while electrons on more delocalized states are treated at the level of approximate DFT. What has made this approach popular is certainly the possibility to achieve significant improvement in the description of systems with localized electrons, while maintaining a combination of simplicity and reduced computational costs. An extended formulation of DFT+U -so-called DFT+U+V [13] -takes into account inter-site Hubbard interactions through the parameter V and allows to improve the description of localized electrons even in presence of significant hybridization with neighbors. Recent works [14,15,3] have highlighted the quantitative accuracy of DFT+U+V calculations with both on-site U and inter-site V effective parameters. It is useful to mention about alternative recent formulations of DFT+U+V [16,17].
The effectiveness of the (extended) Hubbard functional depends critically on the values of the effective interaction parameters (i.e., the Hubbard U and V). Unfortunately, the values of these parameters are not known a priori and it is still quite a common practice in literature to evaluate them semi-empirically by fitting various experimental properties (when available), which prevents this method from being fully ab initio and from being predictive for novel materials. Most importantly, it is often forgotten that the Hubbard correction acts on a specific manifold that can be defined in different ways [18]. Hence, the values of the Hubbard parameters are not transferable and should not be considered as universal quantities for a given element or material (see the appendix in Ref. [12]). It is crucial to use the U and V parameters consistently, i.e., maintaining the same Hubbard projectors, pseudopotentials, oxidation states, functionals, etc. for which they were computed. During the past 30 years there has been a large effort to develop methods for the first-principles calculation of Hubbard parameters. Among these, the constrained DFT (cDFT) approach [19,20,21,22,23,24,10,25,26,27], the Hartree-Fock-based approaches [28,29,30,31,16,17], and the constrained random phase approximation (cRPA) approach [32,33,34,35,36,37,38] are the most popular. A linear-response formulation of cDFT (LR-cDFT) was introduced in Ref. [10] and generalized to the calculation of the inter-site Hubbard parameters V in Ref. [13] (see also Refs. [39,40]). LR-cDFT has recently been recast via density-functional perturbation theory (DFPT) [2,3], allowing to overcome several challenges of the supercell approach of Ref. [10]. It was shown that both LR-cDFT and DFPT give exactly the same values of Hubbard parameters (modulo the numerical noise), as they ought to by construction [2,3]. By constructing the localized perturbation (typically requiring costly calculations in supercells) as a series of independent monochromatic perturbations in the primitive unit cell, it improves significantly the computational efficiency, accuracy, user-friendliness, and automation [2,3], as also demonstrated by several recent applications [14,41,15,42,43,44,45,46,47,48,49]. Key to this successful implementation of the LR-cDFT is indeed the capability to express perturbation theory in reciprocal space as in the calculation of phonons using DFPT [50,51,52].
An open and recurring question concerns the extent at which the Hubbard parameters obtained from different approaches, e.g. cRPA [34] vs. LR-cDFT/DFPT [10,2] vs. ACBN0 [31] compare to one another. First and foremost, these methods have quite different definitions of the Hubbard U which complicates very much their comparison. In fact, only a few attempts have been made in the literature trying to identify the analogies and differences between these theories [1,35,40]. Moreover, different Hubbard projector functions can be used in these methods: cRPA is based on maximally localized Wannier functions [53], while LR-cDFT/DFPT and ACBN0 are used with several other popular types of Hubbard projectors (e.g. nonorthogonalized or orthogonalized atomic orbitals, projector-augmented-wave (PAW) functions, to name a few). Since the computed Hubbard parameters are very sensitive to the choice of Hubbard projector functions [47], the comparison between the values of U computed using different methods and different projector functions becomes even more complicated. Furthermore, in some works the Hubbard U is computed in a one-shot fashion (i.e. a single calculation from an uncorrected ground state) while in other works a self-consistent procedure is adopted where the effective Hubbard U is recomputed from a DFT+U ground state until self-consistency is reached [3,10,54]. The choice of pseudopotentials is also very important when computing the Hubbard parameters [12,27]: the value of U obtained for the same structure but with differing pseudopotentials may easily differ by as much as 2-3 eV, in particular if the pseudopotentials were generated from different oxidation states [12]. All in all, for a given material of interest, even if the calculations of U are done using the same crystal structure and atomic positions, the same kinetic-energy cutoff, k points sampling of the Brillouin zone (BZ) and same other technical details, all other aforementioned aspects must be 2 very carefully considered when comparing studies reporting different Hubbard U parameters. Therefore, we believe that it makes sense to compare U values computed using different methods only when the same computational setup is used. In this paper we introduce a computer code, named HP (Hubbard Parameters), which implements DFPT for the calculation of Hubbard parameters. HP is distributed under the terms of the GPL license [55], as a component of the Quantum ESPRESSO suite of open-source codes based on plane-wave basis sets, pseudopotentials, and using periodic boundary conditions [56,57,58]. This paper is organized as follows. In Sec. 2 we provide a theoretical background for the Hubbard-corrected DFT and for DFPT. In Sec. 3 we describe the components of HP . In Sec. 4 we provide the instructions for installing HP on UNIX systems and discuss various levels of parallelization implemented in it. In Sec. 5 we give an example of the usage of HP for the calculation of Hubbard parameters for Li x Mn 1/2 Fe 1/2 PO 4 and studying its ground-state properties. Finally, Sec. 6 contains conclusions and perspectives for future work. Appendix A presents examples of the input files for the PW and HP codes, and Appendix B contains the description of the input variables of the HP code. Hartree atomic units are used throughout the paper.

Hubbard-corrected density-functional theory
In this section we briefly review the formulation of the extended DFT+U+V approach [13]. All equations which are presented below can be easily reduced to the case of DFT+U by simply setting V = 0. For the sake of simplicity, the formalism is presented in the framework of norm-conserving (NC) pseudopotentials (PPs), for non-metallic ground states, in the collinear spin-polarized case.
As a generalization of DFT+U, DFT+U+V is also based on an additive correction to the approximate DFT energy functional, modeled on the Hubbard Hamiltonian [13]: where E DFT represents the approximate DFT energy (constructed, e.g., within the local spin density approximation -LSDA, or the spin-polarized generalized-gradient approximation -GGA), while E U+V contains the additional Hubbard term. At variance with the DFT+U approach, containing only on-site interactions, DFT+U+V is based on the extended Hubbard model including also inter-site interactions between an atom and its surrounding ligands. In the simplified rotationally-invariant formulation [9], the extended Hubbard term reads: where I and J are atomic site indices, m 1 and m 2 are the magnetic quantum numbers associated with a specific angular momentum, U I and V I J are the on-site and inter-site Hubbard parameters, and the star in the sum denotes that for each atom I, the index J covers all its neighbors up to a given distance (or up to a given shell). The atomic occupation matrices n I Jσ m 1 m 2 are based on a generalized projection of the Kohn-Sham (KS) states on localized orbitals ϕ I m 1 (r) of neighbor atoms: where v and σ represent, respectively, the band and spin labels of the KS (pseudo-)wavefunctions, k indicate points (wave vectors) in the first BZ, N k being their number, N occ is the number of occupied KS states, andP JI m 2 m 1 is the generalized projector on the localized orbitals of neighbor atoms: In Eq. (3) and hereafter, with the superscript • we indicate quantities which refer to the unperturbed ground state of the system. Here, ϕ I m 1 (r) ≡ ϕ γ(I) m 1 (r − R I ) are localized orbitals centered on the I th atom of type γ(I) at the position R I . Given their importance for the calculation of the Hubbard parameters it is convenient to establish a short-hand notation for the on-site terms of the quantities defined in Eqs. (3) and (4): n Iσ m 1 m 2 ≡ n IIσ m 1 m 2 andP I m 1 m 2 ≡P II m 1 m 2 . The standard DFT+U approach is recovered by setting V I J = 0 in Eq. (2). Based on the definitions above, it is quite straightforward to see from Eq. (2) that the two terms in the corrective energy functional, proportional to the on-site (U I ) and inter-site (V I J ) couplings, respectively, counteract each other. In fact, while the on-site term favors localization on atomic sites (typically suppressing hybridization), the inter-site one favors hybridized states with components on neighbor atoms. Computing the value of the U I and V I J effective interaction parameters is thus crucial to determine the degree of atomic localization of d-and/or f -type electrons when the system is in its ground state. The Hubbard projector functions {ϕ I m 1 (r)} can be constructed using different types of projector functions as a basis set (see Sec. 2.2). 3 The action of the Hubbard potential on KS wavefunctions can be obtained by taking the functional derivative of E DFT+U+V [see Eq. (1)] with respect to the complex conjugate of the same KS wavefunction [13,59]. The term corresponding to this functional derivative of E U+V [see Eq. (2)] is: This Hubbard potential is added to the standard DFT Hamiltonian [2,3], and then the modified KS equations are solved self-consistently. As was mentioned in Sec. 1, in DFT+U and DFT+U+V the two crucial aspects are: i) the choice of projector functions that are used to construct the Hubbard manifold, and ii) the choice of the Hubbard parameters (U I and V I J ). It is of utmost importance to understand that the latter strongly depends on the former. Therefore, it makes sense to discuss the values of Hubbard parameters only when the Hubbard projectors have been fixed.
The use of DFT+U+V instead of DFT+U is mainly motivated by the increased flexibility the former offers compared to the latter. This is particularly important for more covalently-bonded systems where electronic localization occurs on hybridized states with components on neighbor atoms. In fact, a too large U, favoring on-site localization, might suppress the inter-site hybridization and distort the electronic structure of these systems to the point that a negative U might be required to restore a more physical picture and to recover the results achieved with hybrid HSE06 functionals [60]. This effect was already observed in Ref. [13] that showed that the standard on-site only DFT+U can actually suppress the insulating behavior of covalent semiconductors (Si and GaAs) and decrease their band gap. In the same work it was first demonstrated that the use of a finite and positive V (within DFT+U+V), by favoring the hybridization of atomic states between neighbor atoms, can in fact re-establish the insulating character of their ground state also predicting a wider band gap than the one obtained using uncorrected DFT functionals. A finite positive V was also found effective in reestablishing a sound description of other problematic covalent systems without the need of negative on-site U's, as detailed in Ref. [17]. While the need of an inter-site interaction V makes a semi-empirical evaluation of the Hubbard parameters much harder, the calculation of V from first-principles can be achieved simultaneously to that of U, as illustrated already in a number of studies [2,3,13,16,17]. Therefore, DFT+U+V where both U and V values are computed ab initio constitutes a robust and accurate approach that describes accurately the on-site localization and inter-site hybridization of electrons without any manual calibrations of Hubbard parameters. More on this is discussed in Sec. 2.3.

Hubbard projectors
As was discussed above, one of the key aspects of the Hubbard-corrected DFT formalism is the choice of the projector functions for the Hubbard manifold. In other words, we need to choose the basis {ϕ I m (r)} for the projectorP JI Refs. [61,62]). In particular, we mention here nonorthogonalized [10,63] and orthogonalized atomic orbitals [14,15,43], nonorthogonalized [64] and orthogonalized Wannier functions [65], linearized augmented plane-wave approaches [66], and PAW projector functions [67,68]. A common feature of all these projector functions is that they are spatially localized and depend explicitly on atomic positions. In this work, we consider only two types of projector functions, nonorthogonalized and orthogonalized atomic orbitals. Let us comment briefly about each of them. Nonorthogonalized atomic orbitals represent one of the simplest choices for the Hubbard projectors, and they are well suited for systems with a pronounced ionic character. These functions are provided with pseudopotentials and, by virtue of their angular part, they are orthonormal within each atom but not between different atoms. Therefore, whenever interatomic overlaps between them become relevant, this type of projector functions is not suitable any more, and inter-site orthogonalization becomes necessary. Orthogonalized atomic orbitals are obtained by taking atomic orbitals of each atom and then orthogonalizing them to all the orbitals of all the atoms in the system. In this work, we will use the Löwdin orthogonalization method [69,70]. By doing so, a new set of orbitals is obtained that, by virtue of their mutual orthogonalization [18], provide a more accurate representation of inter-site hybridization. This choice is particularly appealing to define the Hubbard projectors, because it allows us to avoid counting Hubbard corrections twice in the interstitial regions between atoms. This is especially relevant in the case of DFT+U+V. As explained in Ref. [18], Löwdin orthogonalized atomic orbitals are defined as: where O is the orbital overlap matrix which is defined from its matrix elements as: (O) I J m 1 m 2 = φ I m 1 |φ J m 2 , where φ I m 1 (r) and φ J m 2 (r) are the nonorthogonalized atomic orbitals. It is important to note that the atomic and state indices must be understood as being in couples, (I, m 1 ) and (J, m 2 ), because for different types of atoms the considered atomic manifolds can be different. Here we orthogonalize all the states of all the atoms in the system. It is important to orthogonalize not only states that belong to the chosen Hubbard manifolds of each atom (e.g., d or f states), but also the remaining states, in order to preserve the on-site orthogonality. 4 In practice, we construct Bloch sums of the Löwdin orthogonalized atomic orbitals and then use only their latticeperiodic parts (see Eqs. (A2) and (A5) in Ref. [2]). Further, these quantities are Fourier-transformed from real to reciprocal space and used in the DFT+U(+V) and DFPT formalisms to compute various scalar products [see e.g. Eqs. (3) and (4)] as sums over reciprocal lattice vectors.
As shown in Ref. [47], the computed Hubbard parameters do depend strongly on the choice of the Hubbard projector functions. For example, for β-MnO 2 the U values computed using nonorthogonalized and orthogonalized atomic orbitals differ by about 1−2 eV while V values differ by about 0.3 eV [47]. Therefore, it is crucial that the same Hubbard projector functions are used for computing Hubbard parameters and for the subsequent DFT+U(+V) production calculations.

Hubbard parameters from density-functional perturbation theory
Following Ref. [10], Hubbard parameters can be defined as the second derivatives of the total energy with respect to the total occupation of a given atom, i.e. with respect to the trace of the site-diagonal occupation matrix n Iσ m 1 m 2 . This is consistent with the structure of the simple Hubbard corrective functional, shown in Eq. (2), that subtracts from the total energy a quadratic term in the atomic occupation, to substitute it with a linear one, thus removing the unphysical selfinteraction (delocalization) errors from approximate energy functionals. In this context, the inter-site part in Eq. (2) serves as an off-diagonal correction when removing self-interactions, which is especially relevant for systems with a covalent bonding. In practice, the calculation of the energy second derivative can be achieved by perturbing the system with a shift in the potential acting on the Hubbard states of a given atom J, λ J mP J mm (λ J is the strength of the perturbation), and then computing the response of all the atomic occupations. Applying this to all the Hubbard atoms in the system allows to construct the bare and self-consistent susceptibility matrices that are obtained, respectively, at the beginning of the perturbed run and at its self-consistent convergence: Here, n I = mσ n Iσ mm and similar definition holds for n I 0 . From these, the effective Hubbard parameters can be readily obtained [10]: It is important to stress that the procedure outlined above is based on isolated perturbations; therefore, it requires the use of large supercells (whose size has to be increased until the convergence of U I and V I J is achieved) [10] that makes these calculations computationally demanding and prone to accuracy issues due to their problematic convergence and their reliance on small energy differences. DFPT offers a more efficient approach to linear-response calculations and allows us to largely reduce these issues [2,3]. Within the framework of DFPT, the response of the KS wavefunctions to a small perturbation of the atomic potential [that induces a variation of the atomic occupations defined in Eq. (7)] is obtained as the self-consistent solution of the perturbative problem resulting from a first-order variation of the KS equations: Hub,σ is the total Hamiltonian of DFT+U+V, whereĤ • DFT,σ is the Hamiltonian of standard DFT, and V • Hub,σ is the corrective Hubbard potential defined in Eq. (5). ε • vkσ and ψ • vkσ are the KS energies and wavefunctions of the unperturbed system in the DFT+U+V framework.V J pert = mP J mm is the perturbing potential; dV Hxc,σ dλ J , dψ vkσ dλ J , and dε vkσ dλ J are the response Hartree and xc (Hxc) potential, response KS wavefunctions, and response KS energies, respectively [2,3]. It is important to remark that the response of the Hubbard potential is not present in Eq. (10) so that the Hubbard parameters are obtained, consistently with their definition, as second derivatives of the DFT part only of the total energy [2]. The problem has to be solved self-consistently because the response of the KS eigenvalues and of the Hxc potential appearing on the right-hand side of Eq. (10) depend on the response of the KS wavefunctions, obtained from the solution of the perturbative problem in the equation above. Once convergence is achieved, the variation of the diagonal (with respect to atomic sites) atomic occupation matrices [that define the self-consistent susceptibility matrix in Eq. (7)] are obtained [71]: The major advantage offered by the DFPT reformulation of LR-cDFT consists in the possibility to obtain the variation of atomic occupations as a sum of wavevector-specific contributions that can be computed independently from one another (thus leading to better scaling of the computational cost [2]) using the primitive unit cell of the system. In fact, the Fourier 5 spectrum of a perturbation that has the periodicity of a supercell (as needed to eliminate the interactions with periodic replicas) only contains fundamental vectors of its reciprocal lattice that map into a corresponding q points grid within the BZ corresponding to the primitive cell [2]. The total response of atomic occupations can thus be written as (see Eq. (42) in Ref. [2]): dn slσ where the atomic site indices I and J have been decomposed as I = (l, s) and J = (l , s ) indicating, respectively, the cell the atom belongs to (l and l ) and its position within the cell (s and s ). Here, N q is the number of q points in the first BZ (note that the dimension of the q points grid reflects directly that of the supercell it is the reciprocal-space image of). Hereafter, we use the over-bar to indicate lattice-periodic parts of the ground-state and response quantities. ∆ s qn s σ m 1 m 2 represents the lattice-periodic response of the occupation matrix to a monochromatic perturbation with a wavevector q, and it can be linked to the lattice-periodic variations of the KS wavefunctions as follows [2]: Here,ū • vkσ and ∆ s qūvkσ are the lattice-periodic parts of the unperturbed and linear-response monochromatic q component of the KS wavefunctions, respectively (see the appendices 1 and 3 of Ref. [2]). The lattice-periodic part of the projector on the Hubbard manifold, which appears in Eq. (13), reads [2]: The two terms on the right-hand side of Eq. (13) were made look similar (except for the inversion in the order of indices m 1 and m 2 ) by the use of time-reversal symmetry. As was mentioned above, due to the linear character of the perturbative problem [Eq. (10)], the lattice-periodic components of the response KS wavefunctions at different q can be computed independently from one another as solutions of q-specific Sternheimer equations [2]: where the perturbing potential reads:V The quantitiesĤ • k+q,σ and ∆ s qVHxc,σ are, respectively, the lattice-periodic parts of the unperturbed HamiltonianĤ • σ (which contains the Hubbard corrective potential with on-site U and inter-site V) and response Hxc potential for a specific q. The response Hxc potential ∆ s qVHxc,σ depends on the response spin charge density at the same q, ∆ s qρσ (r), which in turn depends on ∆ s qūvkσ (r) (see Ref. [2] for more details). The operatorsÔ k+q,σ andP k+q,σ are the lattice-periodic parts of projectors on the valence and conduction manifolds, respectively [52,72]: are the highest and the lowest energies of the occupied KS bands, respectively. The operatorÔ k+q,σ is inserted on the left-hand side of Eq. (15) in order to avoid singularity issues during the iterative solution; at the same time the operatorP k+q,σ avoids very expensive sums over numerous empty states [52,2,3]. Note that due to the presence of the projectorP k+q,σ in Eq. (15) the derivative of the KS eigenvalues disappears from the right-hand side of Eq. (15) in comparison to Eq. (10) (see also Ref. [2]). The KS wavefunctions at k + q points,ū • v k+qσ (r), which are present in Eqs. (17) and (18) are obtained by solving non-selfconsistently the KS equations and using the unperturbed ground-state spin charge densityρ • σ (r). All the operators in Eq. (15) appear with a specific q component as a result of recasting Eq. (10) in reciprocal space through the Bloch sums of all the quantities appearing in there (this is discussed in detail in Ref. [2]). The potential terms appearing on the righthand side of Eq. (15) represent the lattice-periodic components of the corresponding potential variations at the indicated wavevector q. Once these equations are solved self-consistently for all the wavevectors, Eqs. (12) and (13) are used to compute the susceptibility matrices using Eq. (7), from which the Hubbard interaction parameters are readily obtained as indicated in Eqs. (8) and (9). 6

Extensions of the DFPT formalism
The formalism presented above has been generalized in several ways: i) metallic ground states; ii) ultrasoft (US) PPs [73] and the PAW method [74]; and iii) explicit account of symmetry. A detailed discussion about the first two points can be found in Ref. [3].
Metallic ground states. The ground state of a given system computed using standard DFT can be used as a starting point for the DFPT calculation of Hubbard parameters. However, in some systems standard DFT predicts a metallic ground state due to large self-interactions errors, while in reality the system is insulating. Therefore, the DFPT approach presented above must be generalized to be able to work on top of metallic ground states. For this reason, an extension of the DFPT approach to metallic ground states has been developed and discussed in detail in Ref. [3]. It is based on the use of the smearing technique [75]. In metals, very dense k points sampling is needed to sample the Fermi surface, which is computationally very expensive. Thanks to the smearing of the Fermi surface it is possible to reduce greatly the number of k points needed to describe electronic states around the Fermi level, which helps containing the computational cost; as a consequence, these states have partial occupancy (between 0 and 1). The DFPT approach can still have the same form as for non-metallic systems but the following modifications are required [3]: i) electronic states should be allowed to have a fractional occupancy which translates into weighted sums over k points and band indices, ii) the definition of the projectorP k+q,σ is generalized, and iii) the response occupation matrix ∆ s qn s σ m 1 m 2 and the response spin charge density acquire an extra term proportional to the shift of the Fermi level when q = 0. The interested reader can find a detailed discussion in Ref. [3].
US PPs and PAW. In systems containing localized valence states (e.g., semicore states included in the valence region or atomic states of d or f kind in transition-metal and rare-earth compounds), high kinetic energy cut-offs in the plane-wave expansion are needed if NC PPs are used. In this case, it is useful to replace NC PPs by US PPs or PAW which allow to reduce significantly the kinetic energy cut-off and thus lower the overall computational cost of the calculations. However, when US PPs or PAW are used augmentation terms are added in the expressions for the spin charge density to restore the correct normalization, and consequently extra terms appear in the expressions for the local and Hxc potentials. In US and PAW formalisms, the standard KS equations must be replaced by generalized KS equations which contain the overlap operatorŜ [73], and therefore also the first-order response KS equations are modified [3]. The unperturbed and response occupation matrices are also generalized, as well as the Hubbard potential, with the projector on the Hubbard manifold acquiring theŜ operator, asP JI m 2 m 1 =Ŝ |ϕ J m 2 ϕ I m 1 |Ŝ . Detailed discussions are presented in Ref. [3]. Symmetry. The CPU time and memory requirements of the DFPT calculation can be significantly reduced by exploiting the symmetry of the system. In fact, as explained in abundant literature (and specifically in Appendix A.4 of Ref. [56] for Quantum ESPRESSO ) the use of symmetry allows to focus ground-state calculations only on a small portion of the regular k point grid used to sample the BZ (the so-called irreducible wedge of the BZ (IBZ)). Within DFPT, the use of symmetry is slightly more articulated due to the presence of perturbations. The response of a system (and the derivative of relevant quantities) is typically reconstructed from the response to a series of monochromatic perturbations that are computed one by one. Since a finite wavelength perturbation (q 0) lowers the symmetry of the crystal, at each specific q point all the ground-state quantities that are needed in DFPT calculations (e.g., KS wavefunctions, Hamiltonian, and eigenvalues) must be recomputed on an extended IBZ that is determined according to the so-called small group of q (i.e. the group of symmetries such that S q q = ±q + G, where G is a reciprocal lattice vector). The same procedure needs to be repeated for each q point. Once DFPT calculations are completed for all the inequivalent points of the q-grid, and the response on the other q-points is reconstructed by symmetry, the total response is computed according to Eq. (12). Symmetry is thus important to reduce the workload of both DFT and DFPT calculations. The use of primitive unit cells, and the possibility to reduce the number of q-specific linear-response calculations thanks to symmetry contribute to make these calculations substantially faster than supercell-based LR-cDFT which cannot escape the cubic scaling of DFT calculations with respect to the number of atoms.

Description of software components
The HP code is contained in a module of the Quantum ESPRESSO distribution [56,57,58], and it resides in a selfcontained directory HP under the root directory of the Quantum ESPRESSO tree. The HP code is tightly integrated in Quantum ESPRESSO , and it uses many routines from other modules, namely PW , Modules, and LR Modules (see Ref. [58] for more details). In addition, HP uses various domain-specific mathematical libraries of Quantum ESPRESSO , such as LAXlib (containing routines to perform linear-algebra operations) and FFTXlib (containing routines to perform Fast Fourrier Transforms (FFT's)). Basic linear-algebra operations (e.g. matrix-matrix and matrix-vector multiplications, scalar products, matrix inversions, etc.) are efficiently performed using BLAS and LAPACK libraries. In the following we discuss the workflow of the calculation of Hubbard parameters.

Ground-state calculation
In order to compute the Hubbard parameters for a given system, a standard ground-state DFT [or DFT+U(+V) with some initial guess of U (and V)] calculation has to be performed first, yielding the lattice-periodic parts of the unperturbed KS wavefunctionsū • vkσ (r) and the KS energies ε • vkσ for all the occupied states, the ground-state spin charge densityρ • σ (r), and the occupation matrix n Iσ m 1 m 2 . The information thus obtained is then used as input for the linear-response calculation with the HP code. This ground-state calculation is performed using the PW code (pw.x executable), which is one of the key components of the Quantum ESPRESSO distribution. In Appendix A sample input files for pw.x are shown (see the input samples 1 and 2). After the successful completion of the ground-state calculation, the pw.x code writes the ground-state quantities mentioned above to disk, together with all the relevant information about the system, including unit cell and atomic positions, pseudopotentials, energy cutoffs, k point grids, etc. This data is used by the HP code which reads it from file. Therefore, it is not necessary to redefine the system under study in the input file of hp.x.

Linear-response calculation
The linear-response calculation of Hubbard parameters (U in the DFT+U framework, or U and V in the DFT+U+V framework) is done using the HP code (hp.x executable). A list of all input variables of hp.x is given in Table B.3 of Appendix B, and a sample input file for hp.x is given in Appendix A (see the input sample 3). The total number of linear-response calculations that have to be performed is N pert N q , where N pert is the number of Hubbard atoms in the primitive unit cell that have to be perturbed, and N q is the number of points in the q grid.
The size of the response matrices χ and χ 0 is N sc H ×N sc H , where N sc H = N H N q , and N H is the total number of Hubbard atoms in the primitive unit cell [2]. Each column of the response matrices corresponds to the perturbation of a specific Hubbard atom of the primitive unit cell with a specific q. It is possible to reduce the number of linear-response calculations by perturbing only inequivalent Hubbard atoms in the primitive unit cell. The HP code contains the implementation of several algorithms which find inequivalent Hubbard atoms (i.e. N pert ); this is controlled by the keyword find atpert which is described in Table B The linear-response calculation for each perturbed Hubbard atom requires solving N q independent q-specific Sternheimer equations (15). N q is the number of points in the q point grid, N q = nq1 × nq2 × nq3. The strength of DFPT (see Sec. 2.3) resides in the fact that linear-response calculations at each q point are independent from other q points (in linear regime there is no coupling between perturbations at different wavelengths), and hence it is possible to parallelize calculations over q points as well (see Sec. 4.2). When the self-consistency of each q-specific Sternheimer equation has been reached, the q-specific response occupations matrices ∆ s qn s σ m 1 m 2 are computed using Eq. (13). Then, all the responses are summed up using Eq. (12), thus giving one column of the self-consistent response matrix χ (see Eq. (7)). One column of the bare response matrix χ 0 is computed in a similar way, however the sum of responses using Eq. (12) is computed after the first iteration in the self-consistent cycle of Eq. (15), i.e. before that the response of the Hxc potential (i.e. ∆ s qVHxc,σ ) builds up. Other columns of the response matrices χ and χ 0 are obtained from the perturbation of other inequivalent Hubbard atoms.
The Sternheimer equations (15) are solved iteratively and self-consistently using the conjugate-gradient method [52,56], and using the standard linear-response machinery of Quantum ESPRESSO [57,58]. In order to speed up the convergence of the iterative solution, the scheme of Ref. [76] is used for mixing the response Hxc potential ∆ s qVHxc,σ [2]. Table B.3 describes the parameters that control the convergence of the Sternheimer equations and of the response matrices χ and χ 0 .
The final step of the HP calculation is the postprocessing calculation of the Hubbard parameters using Eqs. (8) and (9). This step is computationally inexpensive (negligible compared to the linear-response calculation). In this final phase, the missing columns of the response matrices χ and χ 0 are reconstructed from the available data (i.e. columns which were computed explicitly, as discussed above) exploiting the symmetry of the system. This is done by comparing the inter-atomic distances, atomic types, and the orientation of spin (up or down). Once the full matrices χ and χ 0 have been reconstructed, they are inverted to compute the interaction matrix as showed in Eqs. (8) and (9).

Installation instructions
The HP program is distributed as source code, like the other components of the Quantum ESPRESSO distribution. Version control is handled using Git and the code is hosted on the GitLab platform [77]. The installation procedure of HP is the same as for all other modules of the distribution. Quantum ESPRESSO (including HP) makes use of GNU autoconf [78]. The HP repository, which contains the source HP code, is residing within the Quantum ESPRESSO tree. 8 The code is compiled with the following commands from within the Quantum ESPRESSO tree: ./configure make pw make hp Alternatively, it is possible to use cmake [79] instead of ./configure. Here, the first step sets up the environment (compilers, libraries, etc.), the second step compiles the PW code (pw.x), and in the third step, the HP code (hp.x) is compiled. Links to all these executables are created in the bin/ directory of the Quantum ESPRESSO tree. More detailed instructions on the installation can be found in the documentation that comes with the distribution. If problems are encountered during the installation or the use of HP (or of any other code in the Quantum ESPRESSO distribution) users can also take advantage of the Quantum ESPRESSO users forum [80] by posting specific questions about their difficulties. Users intending to make quick tests with HP or to use it for demonstrative or teaching purposes [81] could consider the Quantum Mobile [82] -a virtual machine that has Quantum ESPRESSO pre-installed along with several other codes for quantum-mechanical materials simulations. Quantum Mobile can be easily installed on laptops and desktops and its use avoids any issues related to the installation of the Quantum ESPRESSO package. Of course, the Quantum Mobile is not recommended for production runs; an optimized installation of Quantum ESPRESSO on workstations and highperformance computers should instead be preferred for these purposes.

Parallelization of the code
Like the other components of Quantum ESPRESSO, the HP code is optimized to run on a variety of different platforms, from laptops to massively parallel architectures. The parallelization of the HP code is achieved by using the messagepassing paradigm and calls to standard Message Passing Interface (MPI) libraries [83]. High performance on massively parallel architectures is achieved by distributing both data and computations in a hierarchical way across processors. The FFT's, which are used for transformations from real space to reciprocal space and vice versa, are also efficiently parallelized among processors [56]. The HP code supports four levels of parallelization: 1. The parallelization over perturbed Hubbard atoms, which is implemented by distributing independent linear-response atom-specific calculations (each having a grid of q points) across the processors, each taking care of one atom-specific perturbation. This parallelization is controlled by setting the input parameter perturb only atom(i) to .true., which specifies that the Hubbard atom with the index i will be perturbed in the current run; 2. The q points parallelization, which is implemented by distributing independent linear-response q-specific calculations of an atom-specific perturbation across the processors, each taking care of one or more q points. This parallelization is controlled using the input parameters start q and last q which specify the index of the starting and final q points from the list of all q points that have to be considered in the current pool of q points; 3. The k points parallelization, which is implemented by dividing all processors into pools, each taking care of one or more k points; 4. The plane-wave parallelization, which is implemented by distributing real-and reciprocal-space grids across the processors. Figure 1 shows the hierarchy of the parallelization levels of the HP code. First, the calculation can be parallelized over the Hubbard atoms that must be perturbed (see perturb only atom); second, for each perturbed Hubbard atom the calculation can be parallelized over q points (see start q and last q); third, for each q point (or a subset of q points) the linear-response calculation can be parallelized over the k points (by choosing an appropriate number of k point pools N pool ); and, last, within each k point pool all available CPUs are used to parallelize the calculation over plane waves (G points). If the system under study is quite small (say, a handful of atoms) and it can be run on a local workstation (with e.g. 8-16 cores), then it is convenient to skip the parallelization over perturbed Hubbard atoms and over q points, and use the parallelization over the k points (which can also be skipped) and use (only) the parallelization over plane waves. If the calculations are run on HPC clusters with many nodes, then it is highly recommended to use all aforementioned levels of parallelization in order to utilize the computational resources in the most effective way. Moreover, on HPCs it is recommended to avoid that the same compute node is split between k pools or groups of q points or perturbed atoms (due to the slow inter-node communications). Finally, we note that at present these levels of parallelization have to be chosen by the user manually.

Testing of the code
The HP code implements DFPT which is complex from the programming point of view, and hence it is crucial to have a test suite to ensure that the new developments do not break existing functionalities of the code -its availability facilitates maintenance of the code and ensures its long-term stability. As other components of Quantum ESPRESSO , the HP code relies on the test suite that is based on the testcode.py [84]: this provides the functionality to run tests 9 … perturb_only_atom(1) = .true.   Table B.3. Note, the specific parallelization over the k and q points is shown just for demonstration purposes, but in practice the total number of k and q can be split in many different ways, i.e. different number of k point pools N pool and different parallelization over q points (depending on the availability of computational resources).

PWs PWs
automatically (nightly) and compare selected quantities (Hubbard parameters) parsed from the output files generated by HP against reference values. The HP code is run both in serial and in parallel, using various combinations of commonly used compilers [Intel Fortran compilers (ifort), GNU Fotran compilers (GFortran), etc.] and libraries (Intel MPI, Open MPI, etc.), which ensures that the code function correctly on various high-performance computer (HPC) architectures and in different environments. Whenever new features are added to the HP code, the corresponding tests must be added to the test suite by the developers in order to guaranty the robustness of these features in the long term.

Benchmarking
We now showcase how to use the HP code for computing the Hubbard U for the Fe(3d) and Mn(3d) states and Hubbard V for Fe(3d)-O(2p) and Mn(3d)-O(2p) in Li x Mn 1/2 Fe 1/2 PO 4 at x = 0, 1/2, and 1. We recall that the validation of the DFPT implementation versus the finite-difference supercell approach of Refs. [10,13] was already done in our previous works [2,3]. After computing U and V self-consistently, we present the results for this material obtained in the framework of DFT+U and DFT+U+V.
The DFPT calculations of Hubbard parameters were performed using the uniform Γ-centered k and q point meshes of size 3 × 4 × 5 and 1 × 2 × 3, respectively, which give an accuracy of 0.01 eV for the computed values of U and V. These k and q point meshes were determined by performing convergence tests as described in detail in Ref. [2]. The KS wavefunctions and potentials were expanded in PWs up to a kinetic-energy cutoff of 65 and 780 Ry, respectively, for calculation of Hubbard parameters. The linear-response KS equations of DFPT were solved using the conjugate-gradient algorithm [92] and the mixing scheme of Ref. [76] for the response potential to speed up convergence.
Bulk Li was modeled at the DFT-PBEsol level using the bcc unit cell with one Li atom at the origin. The optimized lattice parameter is 3.436 Å. The BZ was sampled using the uniform Γ-centered k point mesh of size 10 × 10 × 10, and we have used the Marzari-Vanderbilt smearing method [53] with a broadening parameter of 0.02 Ry. The KS wavefunctions and potentials were expanded in PWs up to a kinetic-energy cutoff of 65 and 780 Ry, respectively.
The phospho-olivine Li x Mn 1/2 Fe 1/2 PO 4 has an orthorhombic crystal structure at x = 0 and x = 1 with a Pnma space group [93]. The unit cell contains four formula units, i.e. 24 atoms in the case of x = 0 and 28 atoms in the case of x = 1. The crystal structure at x = 1 is shown in Fig. 2. The transition-metal (TM) atoms (labeled as M) are coordinated by six O atoms forming a MO 6 octahedron of which it occupies the center. The P atoms are instead at the center of PO 4 tetrahedra that they form with neighboring oxygens. The three-dimensional structure of the crystal can be understood as being based on a network of corner-sharing MO 6 octahedra further linked by "interstitial" PO 4 tetrahedra that act as structural reinforcer [avoiding excessive volume variations upon Li (de-)intercalation] and chemical stabilizers (useful to avoid oxygen escapes). Li ions reside within octahedral channels along the intermediate-length side of the cell. The phospho-olivines are known to show an antiferromagnetic behavior below their transition temperatures. In the previous study (Ref. [14]) it was shown that different antiferromagnetic arrangements of spins result in total energies that differ not more than by ∼ 20 meV at the DFT+U+V level of theory. Here we use the magnetic configuration that minimizes the total energy (labeled "AF 1 " in Ref. [14]), and it is depicted in Fig. 2. Finally, there are several configurations for arranging two Mn and two Fe atoms in the unit cell of Li x Mn 1/2 Fe 1/2 PO 4 . Our goal here is not to investigate all configurations but rather to choose one as a representative case for comparing results obtained using different levels of theory. To this end, we choose to arrange Mn and Fe atoms such that two Mn atoms are antiferromagnetically coupled to each other and same for Fe atoms, as shown in Fig. 2.
The data used to produce the results of this work are available in the Materials Cloud Archive [95].

Ground-state calculation
In order to compute the Hubbard parameteres, independently from the specific functional adopted (DFT, DFT+U, or DFT+U+V), the first step is the ground-state calculation using the PW code. Since we are interested in a self-consistent calculation of Hubbard parameters [3], in Appendix A we show the input samples for DFT+U (input sample 1) and DFT+U+V (input sample 2) already with the converged values of U and V. If one wants to start from the DFT groundstate, then the values of U for Fe(3d) and Mn(3d) states have to be initialized to some small numbers (e.g. 10 −10 eV) for the sake of activating the Hubbard-related machinery (in the DFT+U+V case, there is no need to initialize V, instead initialize U for O(2p)).
The input files for the PW code contain the standard input parameters that will not be described here (the interested reader is invited to check the documentation of the PW code [96]). Instead, we focus on the initialization of the Hubbard-related input parameters. Since Quantum ESPRESSO v7.1, the input syntax for Hubbard-corrected DFT has changed to make it more user-friendly. More specifically, there is a new input card called "HUBBARD", where Hubbard-related data has to be specified. In particular, one has to specify the type of Hubbard projectors. Currently, in Quantum ESPRESSO there are two most popular types of Hubbard projectors, namely "atomic" that corresponds to the nonorthogonalzied atomic orbitals and "ortho-atomic" that corresponds to the Löwdin orthogonalized atomic orbitals (see Sec. 2.2). Here we use the second option, thus in the input file we specify "HUBBARD {ortho-atomic}". Next, one has to specify the Hubbard manifolds, the values of the Hubbard parameters, and the indices of neighbors I and J between which V is applied in the case of DFT+U+V. On the one hand, in the case of DFT+U (see the input sample 1 in Appendix A) we specify that we want to apply the Hubbard U correction to Fe1, Fe2, Mn1, and Mn2, so we write a letter U on each input line inside the "HUBBARD" card. For each of the Hubbard atomic types we specify the Hubbard manifold which is 3d, hence we indicate it as Fe1-3d, Fe2-3d, Mn1-3d, and Mn2-3d. Finally, we specify the corresponding values of the Hubbard U parameters on each input line: for Fe1(3d) and Fe2(3d) states we have exactly the same value of 4.97 eV, while for Mn1(3d) and Mn2(3d) we have 4.32 eV. These values were computed self-consistently using the HP code (see below). It is important to remark that Fe1 and Fe2 are crystallographically equivalent, and same for Mn1 and Mn2, and they differ only by the orientation of spin (this is why we have defined sublattices with different indices). That is why the values of Hubbard U parameters are the same for the same TM elements. On the other hand, in the case of DFT+U+V we need to specify both Hubbard U and V. The syntax for setting up Hubbard U is the same as in the DFT+U case. To specify Hubbard V, we need to indicate a letter V and then to indicate what is the couple of neighbors that we want to consider. For example, in the input sample 2 in Appendix A we specify the pair Fe1-3d and O-2p, and similarly for other nearest neighbors for each TM element. However, this is not all and we need to specify the indices I and J to say specifically to which atoms we are referring. Since Quantum ESPRESSO uses periodic boundary conditions, our real simulation cell is virtually replicated in all three Cartesian directions (positive and negative directions) and periodic replicas of atoms are generated (so we end up with a virtual 3 × 3 × 3 supercell). It is important to stress that in practice we still work with the unit cell while the virtual 3 × 3 × 3 supercell is only generated internally in the code just for the sake of determining the indices I and J of the neighboring atoms. This virtual supercell should not be confused with any of the real supercells that are used in the LR-cDFT approach to converge the computed Hubbard parameters. A priori the user is not supposed to specify these indices, instead, these indices and the values of Hubbard parameters are obtained as an output of the HP calculation. In the input sample 2 in Appendix A we can see that for each TM element we have specified six nearest neighbors because of the octahedral coordination; from our experience, setting six nearest neighbors is sufficient in the vast majority of cases. However, technically there is no restriction to include even further neighbors, but care must be taken in converging accurately the values of V.
In the case of magnetic insulators (which is the case here), the ground-state calculation must be performed using a twostep procedure. First step, we need to perform a self-consistent-field (SCF) DFT+U or DFT+U+V calculation as indicated in Appendix A by treating the system as a fake metal by using some smearing function (occupations = 'smearing', smearing = 'mv', and degauss = 0.01 in the system namelist). This is needed technically in order to allow for fractional occupations in the spin-polarized calculation (nspin = 2 in the system namelist) by setting some nonzero starting magnetization to each TM element (see starting magnetization). If we proceed directly to the HP calculation immediately after this first step there will be a problem because the density of states at the Fermi level is very small and there will be a diverging "metallic term" (see Eq. (79) in Ref. [52]). Therefore, we need to perform a second SCF calculation by restarting from the wavefunctions and spin charge density/potential of the first SCF calculation (by adding startingwfc = 'file' and startingpot = 'file' in the electrons namelist), by setting occupations to be fixed (occupations = 'fixed'), and by setting a total magnetization to be equal to the one determined in the first SCF calculation (tot magnetization = 0.00 in this case in the electrons namelist). After this second SCF calculation we obtain a converged ground state for a magnetic insulator, which allows us to proceed to computing Hubbard parameters using DFPT as implemented in the HP code.

Hubbard parameters
In this section we discuss how to compute Hubbard parameters using the HP code starting from the data generated from the ground-state calculation (see Sec. 5.2). The input file for the linear-response calculation is quite simple and is shown in Appendix A (see the input sample 3). First of all, one has to specify prefix = 'olivine' and outdir='./', which are the prefix and output directory that must be exactly the same as in the input samples 1 or 2 (this is needed for reading the ground-state data). Second, the q point grid must be set: in this case we use the grid of size 1 × 2 × 3 and we specify it as nq1 = 1, nq2 = 2, and nq3 = 3. The values of the Hubbard parameters computed using DFPT must be converged with respect to the size of the q point grid [2]; this is equivalent to converging Hubbard parameters with respect to the size of the (real) supercell when using the LR-cDFT approach of Ref. [10]. In principle, this should be enough in the majority of cases because the default values for other input parameters will be used. For the case of the phospho-olivine Li x Mn 1/2 Fe 1/2 PO 4 we have changed somewhat other two input parameters compared to the default values. Namely, we set conv thr chi = 1.0d-7 eV −1 which is the convergence threshold for the self-consistent response matrix (χ) I J [see Eq. (7)] during the iterative solution of the Sternheimer equation (15), and dist thr = 5.D-3 Bohr which is the threshold for comparing inter-atomic distances when reconstructing the missing elements of the response susceptibility matrices (χ) I J and (χ 0 ) I J in the post-processing step.
It is useful to comment on the different options available to make the HP code determine which Hubbard atoms must be perturbed. This is controlled by the input parameter find atpert (meaning "find atoms to perturb"). The default value (which is also used in this work) is find atpert=1: it checks the ground-state atomic occupations, Tr[n Iσ m 1 m 2 ], and compares them for different Hubbard atoms. If the differences between the atomic occupations of Hubbard atoms of the same type are smaller than the threshold docc thr (whose default value is 5 × 10 −5 ) then these Hubbard atoms 12 are considered to be crystallographically equivalent (even if the magnetic moments have opposite sign). By applying this check the Hubbard atoms are classified according to their occupations and one atom per class is then perturbed. This option is the default one because it is general and it works well in most cases (regardless the number of symmetries that the system features). A second option is find atpert=2: the code perturbs one Hubbard atom per type. It is important to stress that when using this option, Hubbard atoms of the same type will be always treated as equivalent (for the purpose of calculating the Hubbard parameters) even if they are crystallographically inequivalent or show different atomic occupations. This could lead to inaccuracies if not used properly. This option is useful when the user wants to distinguish Hubbard atoms that might accidentally assume the same occupation (see previous option). Option find atpert=3 corresponds to determining Hubbard atom perturbation classes based on the symmetries of the system. This option, mostly useful when the system is highly symmetric, serves to distinguish Hubbard atoms that accidentally show the same atomic occupation. Finally, find atpert=4 corresponds to perturbing separately all the Hubbard atoms in the unit cell (i.e., assuming they are all inequivalent), which is obviously the most computationally expensive case. All these options of find atpert provide a significant flexibility in determining which Hubbard atoms should be perturbed in a system of interest. In general, however, it is recommended to start with the default option find atpert=1. By applying this procedure to LiMn 1/2 Fe 1/2 PO 4 , 8 Hubbard atoms are perturbed in total within DFT+U+V: Fe1 (#1), Mn1 (#3), and 6 oxygen atoms (#5, #6, #9, #10, #13, #15) -see the numbering of atoms in the ATOMIC POSITIONS card in Appendix A.
As can be seen, within DFT+U+V when we compute the Hubbard parameters using DFPT we perturb also O atoms -this is needed in order to determine the inter-site Hubbard V parameters. However, we do not report and do not use Hubbard U for O atoms.
The resulting Hubbard parameters in the DFT+U and DFT+U+V frameworks are shown in Table 1. It is interesting to observe how the values of Hubbard parameters change upon the lithiation of the material. In the DFT+U case we can see that when going from x = 0 to x = 1/2 the U value for Fe(3d) states is only slightly increased (by 0.13 eV), while the U value for Mn(3d) states decreased significantly (by 1.66 eV). This means that the extra two electrons went to two Mn atoms and hence the corresponding U values dropped substantially. When further going from x = 1/2 to x = 1 we see that the U value for Fe has now decreased (by 0.17 eV), while U for Mn has further decreased (by 0.25 eV). Such a nonmonotonic behavior of U for Fe upon the lithiation is quite confusing, while for Mn we observe a systematic decrease with x. As will be shown in the following, this is a consequence of the fact that DFT+U does not take into account the inter-site Hubbard interactions that are very important for materials with covalent bonding.  Table 1: Self-consistent Hubbard parameters (HP) in eV computed using DFPT in the DFT+U and DFT+U+V frameworks for Fe(3d) and Mn(3d) states in Li x Mn 1/2 Fe 1/2 PO 4 for x = 0, 1/2, and 1. This is the case study presented also in Ref. [48].
In the DFT+U+V case we take into account the inter-site Hubbard interactions hence we can see more clear trends in Table 1. More specifically, when going from x = 0 to x = 1/2 the U value for Fe(3d) states stays essentially constant, while the U value for Mn(3d) states decreases (by 1.46 eV). This clearly shows that the extra two electrons went to two Mn atoms, while Fe atoms remain unaffected. When further going from x = 1/2 to x = 1 the U value for Fe(3d) states decreased (by 0.16 eV), but also the U value for Mn(3d) states decreased (by 0.23 eV, which is much smaller than when going from x = 0 to x = 1/2). This means that the extra two electrons now went mainly to Fe atoms. The relatively small change in U for Mn(3d) states seems to indicate that these states are very active chemically and hence they are very sensitive to changes in the chemical environment around them. In addition, it is important to remark that these are self-consistent values of Hubbard parameters, i.e. a structural optimization is performed for each concentration of x [3]. This might explain in part while the U values for Mn(3d) states still change when going from x = 1/2 to x = 1. The overall decrease in the U values for Fe and Mn can be explained by the fact that the 3d manifolds of TM ions acquire an extra electron due to the insertion of Li; the more electrons there are in the Hubbard manifold, the weaker is the screened Coulomb interaction between them, and hence the U value is smaller. Regarding the Hubbard V, in Table 1 we show the 13 range of obtained values for different couples Fe(3d)-O(2p) and Mn(3d)-O(2p). The intersite Hubbard V also decreases on average when going from x = 0 to x = 1. This latter observation is justified by the fact that the Li insertion leads to an increase of the cell volume and of the Mn-O bond lengths; the larger the bond lengths between two atoms, the smaller is the intersite Hubbard V interaction.

Accurate geometry and energetics from DFT+U+V
Using the self-consistent Hubbard parameters presented in Sec. 5.3, various ground-state properties of the phosphoolivine Li x Mn 1/2 Fe 1/2 PO 4 can be computed, such as lattice parameters, electronic structure, atomic occupations, magnetic moments, Li intercalation voltages, and others. The reader is invited to check Ref. [48] where all these properties are discussed in great detail for several phospho-olivines including Li x Mn 1/2 Fe 1/2 PO 4 . Here, for the sake of demonstration purposes only, we highlight briefly the accuracy of predictions of lattice parameters and Li intercalation voltages.   Table 1). The experimental data is from Ref. [93]. This is the case study presented also in Ref. [48].
To the best of our knowledge, the experimental lattice parameters and volume for Li x Mn 1/2 Fe 1/2 PO 4 are available only at x = 1 [93]. Hence in Table 2 we show the optimized lattice parameters and the experimental one at x = 1. It can be seen that DFT underestimates the lattice parameters and the cell volume while both DFT+U and DFT+U+V slightly overestimate them, with DFT+U+V marginally overcoming DFT+U in terms of accuracy. The effect of V on the crystal structure of this material is very small though. This trend is consistent with the one we found for LiMnPO 4 in Ref. [3]. Hence, overall we find that DFT+U+V gives the closest agreement with the experimental lattice parameters at x = 1. In Ref. [48] the optimized lattice parameters are presented also for other values of x. 1/2 < x < 1 Figure 3: Voltages vs. Li/Li + (in V) for Li x Mn 1/2 Fe 1/2 PO 4 for 0 < x < 1/2 and 1/2 < x < 1 computed using DFT, DFT+U, and DFT+U+V with self-consistent U and V determined from first-principles (see Table 1). The experimental data is from Ref. [93]. This is the case study presented also in Ref. [48].
The Li intercalation voltages for Li x Mn 1/2 Fe 1/2 PO 4 are shown in Fig. 3. A detailed discussion on how these voltages were computed can be found in Ref. [48]. Experimentally it is known that in Li-ion batteries with this material as a cathode there are two plateaus, with 4.1 V for 0 < x < 1/2 and 3.5 V for 1/2 < x < 1 [93]. The former corresponds to the average voltage of Li x MnPO 4 , while the latter corresponds to the average voltage of Li x FePO 4 . Indeed, when intercalating Li and changing its concentration from x = 0 to x = 1/2, Mn ions are first to react while Fe ions remain unchanged, hence the voltage corresponds to the pristine material Li x MnPO 4 . When further increasing the concentration of Li from x = 1/2 to x = 1, now Fe ions react while Mn remain unchanged and hence the voltage corresponds to the pristine material Li x FePO 4 . As can be seen, DFT largely underestimates the voltages, while DFT+U overestimates them. DFT+U+V gives the best agreement with the experimental voltages, which means that applying the on-site U correction alone is not 14 sufficient and it is important to include also the inter-site V correction to take properly into account the interactions of 3d electrons of TM ions with the 2p electrons of ligands. It is instructive to compare the accuracy of the DFT+U+V voltages with the ones obtained when using HSE06. As reported in Ref. [48], the HSE06 voltages (computed using the DFT+U+V optimized geometry) are 4.34 and 4.03 V for 0 < x < 1/2 and 1/2 < x < 1, respectively. Therefore, we find that DFT+U+V outperforms even HSE06 in terms of accuracy for predicting cathode voltages in this class of materials. Thus, DFT+U+V provides the most accurate and reliable framework for predicting voltages in phospho-olivines [14,48], and currently we are investigating the predictive accuracy of this approach for other types of cathode materials.

Conclusions
We have presented the HP code that implements DFPT for the calculation of the Hubbard parameters. DFPT allows to reduce significantly the computational costs and to improve the numerical accuracy of the Hubbard parameters by recasting the linear response to a localized perturbation into an array of monochromatic perturbations that can be calculated in the primitive cell independently of one another. Moreover, the calculation of empty electronic states is avoided [52] which greatly speeds up linear-response calculations of U and V. The HP code is one of the core components of the Quantum ESPRESSO distribution. It has multiple levels of parallelization which allows efficient usage of high-performance computers. Moreover, due to the high level of automation, HP can be readily used for high-throughput calculations e.g. using AiiDA [97,98].
The effectiveness of the code has been demonstrated by computing the Hubbard parameters self-consistently for the phospho-olivine Li x Mn 1/2 Fe 1/2 PO 4 at x = 0, 1/2, and 1. It has been shown that the Hubbard parameters change upon changes of x which means that U and V should be recomputed at each Li concentration and not treated as global xindependent parameters. The predicted crystal geometry and intercalation voltages are in very good agreement with the experimental data, thus validating this statement.
In the same spirit as the Quantum ESPRESSO project, HP provides scientists worldwide with a well documented and open-source framework for implementing their ideas. It is in our best hope that HP can benefit from the already well established users community of Quantum ESPRESSO for incorporating new ideas and keep growing in the future. The HP code is hosted in a community accessible Git repository [77] and hence, apart from the releases of Quantum ESPRESSO [96], researchers who are willing to test the latest experimental implementations are welcome to do so and to contribute with their feedback. Finally, the HP code can be extended so as to employ various new features, in particular: (maximallylocalized) Wannier functions as the projectors of the Hubbard manifold [99]; calculation of Hubbard parameters on top of meta-GGA functionals (e.g. SCAN [100]); extension to multi-channel and noncollinear spin-polarized frameworks; ability to be run on novel GPU-enabled architectures, to name a few. These are some of the topics of future investigations of the HP developers.  Prefix which is prepended to input/output filenames; must be the same used in the calculation of unperturbed system.

outdir './'
Path to the working directory containing temporary files (wavefunctions, spin charge density, occupation matrix, XML file with the system's data, etc., which are generated by a ground-state pw.x run).

find atpert 1
Method for searching of atoms which must be perturbed. Possible values: 1 -find how many inequivalent Hubbard atoms there are by analyzing the trace of unperturbed occupation matrices, Tr[n Iσ m 1 m 2 ]; 2 -find how many Hubbard atoms to perturb based on how many different Hubbard atomic types there are. Note: atoms which have the same type but which are inequivalent by symmetry or which have different occupations will not be distinguished in this case; 3 -find how many inequivalent Hubbard atoms there are using symmetry. Atoms which have the same type but are not equivalent by symmetry will be distinguished in this case; 4 -perturb all Hubbard atoms (this is the most expensive option).  skip equivalence q .false.
If .true. then hp.x will skip the equivalence analysis of q points, and thus the full grid of q points will be used. Otherwise, the symmetry is used to determine equivalent q points (star of q), and then perform calculations only for inequivalent q points.
dist thr 6 × 10 −4 (Bohr) Threshold for comparing inter-atomic distances when reconstructing the missing elements of the response susceptibility matrices in the post-processing step.
Continuation on the next page... Keywords controlling the convergence conv thr chi 10 −5 (eV −1 ) Convergence threshold for the self-consistent response matrix χ [see Eq. (7)] during the iterative solution of the Sternheimer equations (15).

thresh init 10 −14
Initial threshold for the solution of the Sternheimer equations (first iteration). Needed to converge the bare (non-interacting) response matrix χ 0 [see Eq. (7)]. The specified value will be multiplied by the number of electrons in the system (i.e. it is an extensive quantity).
ethr nscf 10 −11 (Ry) Threshold for the convergence of KS eigenvalues during the iterative diagonalization of the Hamiltonian in the non-self-consistent calculation at k and k + q points. Note, this quantity is not extensive.
niter max 100 Maximum number of iterations for the self-consistent iterative solution of the Sternheimer equations (15).
alpha mix(i) 0.3 Mixing parameter (for the i-th iteration, i runs from 1 to niter max) for updating the response Hxc potential ∆ s qVHxc,σ using the modified Broyden method [76]. nmix 4 Number of iterations used in the mixing of the response Hxc potential ∆ s qVHxc,σ using the modified Broyden method [76].
Keywords for the parallelization of the calculations (optional) perturb only atom(i) .false.
If perturb only atom(i)=.true. then only the i-th atom (not the atomic type) will be perturbed and considered in the run. This variable is useful when one wants to parallelize the whole calculation over perturbed Hubbard atoms (see Sec. 4.2).

start q 1
This keyword is used for the parallelization of the calculation over q points [see Sec. 4.2)] for a fixed perturbed atom (see perturb only atom). start q specifies the q point starting from which the calculations will be performed; see also last q.

last q N q
This keyword is used for the parallelization of the calculation over q points [see Sec. 4.2)] for a fixed perturbed atom (see perturb only atom). last q specifies the q point up to which the calculations will be performed; see also start q.
If it is set to .true. then hp.x will collect pieces of the response occupation matrices ∆ s qn s σ m 1 m 2 [see Eq. (13)] for all q points and compute the sum of them including the respective phase factors and using the normalization 1/N q , according to Eq. (12). This variable should be used only when start q, last q and perturb only atom are used; otherwise, hp.x will automatically compute the sum using Eq. (12).
Continuation on the next page...

19
Variable name Default Description compute hp .false.
This keyword is used to perform post-processing calculation of the Hubbard parameters. If it is set to .true., hp.x will not perform linear-response calculations; instead, it will assume that selected columns of the χ 0 and χ matrices were already computed in previous runs [each column corresponds to the response of occupations on all atoms to a perturbation of a specific Hubbard atom, see Eq. (7)]. The hp.x code will look for the files prefix.chi.i.dat (i runs over perturbed Hubbard atoms I) that must be stored in outdir/HP/. This keyword must be set to .true. when the calculation was parallelized over perturbations (or when the postprocessing step must be re-run). outdir and prefix are defined in Appendix A.