Fisher information and density functional theory

Funding information: National Research, Development and Innovation Fund of Hungary, Grant/Award Number: 123988

Abstract: The relationship between Fisher information and density functional theory is reviewed. Links between the Fisher and Shannon information, the local wave-vector and the relative information are displayed. Euler equations for the Fisher and the relative Fisher information are presented. A combined information theoretical and thermodynamic view of density functional theory is analyzed. The extremum of the Shannon and Fisher information results in a constant temperature and, for Coulomb systems, in a simple relation between the total energy and the phase-space Fisher information. Relations for the phase-space fidelity, fidelity susceptibility and relative information are also presented.


| INTRODUCTION
Fisher information [1] has proved to be very useful in physics and chemistry, among other fields. It has turned out to be particularly valuable in density functional theory (DFT) [2]. Its suitability was first emphasized in the fundamental paper of Sears, Parr and Dinur [3], which presented a relationship between the Fisher information and the quantum mechanical kinetic energy functional. In the last four decades a great number of papers (e.g. [4][5][6][7][8][9][10][11][12][13][14][15][16][17]) have analyzed this topic. The Fisher information can be considered a functional of the probability density distribution. In DFT the key quantity is the density; that is why the Fisher information is so appropriate in DFT. The basic theorem of DFT has the consequence that the density determines the external potential, hence the Hamiltonian, and therefore any property of the system. It is interesting to add that there are other quantities with this extraordinary characteristic. For example, the local Shannon information also determines every property of a finite Coulomb system, both in the ground and in excited states [18].
If the density is known, the Fisher information can also be obtained. It can be done for any state of the system. On the other hand, if the densities of two different states are known, relative Fisher information can be determined.
It has been shown that the Euler equation of DFT can be derived from the principle of extreme physical information. In case of Coulomb systems it can be done for excited states as well. Though there are several ways to treat excited states in DFT (see e.g. [19][20][21][22][23][24][25][26][27]), here the recent time-independent theory for a single excited state of a Coulomb system [28][29][30] is considered. Coulomb systems constitute a very important class of systems, as all atoms, molecules and solids belong to it. The Coulomb density has the remarkable property that it determines both the Hamiltonian and the degree of excitation. There exists a universal excited-state variational functional for the sum of the kinetic and electron-electron repulsion energies. This functional is relevant for the ground state and all bound excited states. Furthermore, the Euler equation and the Kohn-Sham equations have the same form as in the ground-state theory.
Dedicated to the memory of István Mayer.
The Euler equation of the non-interacting system is

δT_s^Coul/δϱ_k + v_KS = μ,    (1)

where T_s^Coul, v_KS and μ are the non-interacting kinetic energy functional, the Kohn-Sham potential and the chemical potential, respectively. ϱ_k is the density of the kth excited state. In orbital-free DFT, the Euler equation is solved instead of the Kohn-Sham equations. As there is only one Euler equation, while there are several Kohn-Sham equations for a large system, it is worth using orbital-free DFT provided that an adequate approximation for the kinetic energy functional is available. Information theoretical concepts have turned out to be useful in approximating kinetic energy functionals [13,[31][32][33][34]]. It is worth mentioning that there exists an orbital-free scheme in which the knowledge of the kinetic energy functional is not needed [35,36].
Information theoretical aspects can be combined with a thermodynamic transcription of DFT. Ghosh, Berkowitz and Parr (GBP) [37] proposed a thermodynamic interpretation according to which ground-state DFT can be considered a local thermodynamics, and a local temperature can be defined. This fascinating method has found several extensions and applications. For example, it is possible to define the local temperature by maximizing the phase-space Shannon information [60,61] or minimizing the phase-space Fisher information [62].
The review is organized as follows: Section 2 summarizes the definition and properties of the Fisher information, its relationship with the Shannon information, the local wave-vector, the relative information and the fidelity. In Section 3 the Euler equation and the principle of extreme physical information are studied, including the Euler equations for the Fisher and the relative Fisher information. Section 4 presents a combined information theoretical and thermodynamic view of DFT. Utilizing the Ghosh-Berkowitz-Parr theory and its extension, the phase-space fidelity, fidelity susceptibility and Fisher information for Coulomb systems are analyzed. The last section is devoted to discussion.

| FISHER INFORMATION
Fisher information [1] is a measure of intrinsic accuracy in statistical estimation theory. Consider a normalized probability density function f(x|θ) that contains a parameter θ. The Fisher information functional [1] is defined as

I = ∫ f(x|θ) [∂ln f(x|θ)/∂θ]² dx.    (2)

If we make a measurement of x we can try to estimate the parameter θ, looking for the best estimate θ̂ = θ̂(x). Examine a large number of x-samples and denote by e² the mean-square error of the estimate. According to estimation theory, I e² = 1 for the best possible estimator; for any other estimator the mean-square error is larger. That is, the Fisher information provides the lower bound: the Cramér-Rao inequality states

I e² ≥ 1.    (3)

There is a family of distribution functions of the form f(x|θ) = f(x − θ). This shift invariance has the consequence that

I = ∫ [f′(x − θ)]²/f(x − θ) dx    (4)

does not depend on θ. Therefore, the parameter θ is taken to be zero:

I = ∫ [f′(x)]²/f(x) dx.    (5)

For a three-variable function f(r), Equation (5) takes the form

I = ∫ |∇f(r)|²/f(r) dr.    (6)

| Relationship with Shannon information and the local wave-vector

Two decades after Fisher [1] defined the information Equation (2), Shannon [63] introduced a new kind of information,

S = −∫ f(x) ln f(x) dx.    (7)

S is often expressed with the density ϱ because ϱ is the fundamental quantity in DFT.
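A quick numerical sketch (not part of the original review; the width σ is an arbitrary model choice) of the shift-invariant form Equation (5): for a Gaussian of width σ the Fisher information is exactly 1/σ², and the Cramér-Rao inequality then bounds the mean-square error of any location estimator by e² ≥ σ².

```python
import numpy as np

# Fisher information of a Gaussian N(0, sigma²) from the shift-invariant
# form I = ∫ f'(x)²/f(x) dx; the exact value is 1/sigma².
sigma = 1.5                                  # arbitrary model width
x = np.linspace(-12, 12, 200001)
dx = x[1] - x[0]
f = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
fp = np.gradient(f, x)                       # numerical derivative f'(x)
I = np.sum(fp**2 / f) * dx                   # Equation (5)

print(I, 1 / sigma**2)                       # both ≈ 0.4444
```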
Introducing the shape function σ(r) = ϱ(r)/N, the corresponding Shannon information is

S_σ = −∫ σ(r) ln σ(r) dr.    (8)

Obviously,

S_ϱ = −∫ ϱ(r) ln ϱ(r) dr = N S_σ − N ln N,    (9)

because ϱ is normalized to the number of electrons N. Frequently, the Fisher information Equation (6) is also expressed with the shape function,

I_σ = ∫ |∇σ(r)|²/σ(r) dr,    (10)

or with the density,

I_ϱ = ∫ |∇ϱ(r)|²/ϱ(r) dr = N I_σ.    (11)

It is worth introducing the specific Shannon information (Shannon information per particle) s(r) = −ln ϱ(r) and the specific Fisher information (Fisher information per particle) i(r) = (|∇ϱ(r)|/ϱ(r))² [64].
To establish a relationship between the Fisher and the Shannon information [11,12] it is worth introducing the local wave-vector [65] as the ratio of the density gradient to the electron density,

q(r) = ∇ϱ(r)/ϱ(r).    (12)

Therefore [10,11,64],

q(r) = −∇s(r)    (13)

and

i(r) = q²(r).    (14)
That is, the local wave-vector provides a link between the Shannon and Fisher information. Namely, q is (apart from sign) the gradient of the specific Shannon information, while the square of q gives the specific Fisher information.
The relationship between the Shannon and Fisher information can be formalized in another way [9]. The local Shannon and Fisher information can be defined as ϱ_Shannon(r) = −ϱ(r) ln ϱ(r), ϱ_Fisher(r) = |∇ϱ(r)|²/ϱ(r) and ϱ̃_Fisher(r) = −∇²ϱ(r) ln ϱ(r). Note that ϱ_Shannon(r) = ϱ(r) s(r) and ϱ_Fisher(r) = ϱ(r) i(r). ϱ_Fisher and ϱ̃_Fisher are different functions of r, but both integrate to the same Fisher information. The Shannon information can be expressed as [9]

S = (1/4π) ∬ [ϱ_Fisher(r′) − ϱ̃_Fisher(r′)]/|r − r′| dr dr′ − N.    (15)

Consider now two probability densities f and f_ref. Their "distance" can be measured by the relative Fisher information,

J_f = ∫ f(x) |∇ ln[f(x)/f_ref(x)]|² dx,    (16)

and the relative Shannon or Kullback-Leibler information (also called cross-entropy) [72],

G_f = ∫ f(x) ln[f(x)/f_ref(x)] dx.    (17)

The relative Fisher information obtained from the density takes the form

J = ∫ ϱ(r) |∇ ln[ϱ(r)/ϱ_ref(r)]|² dr.    (18)

J can also be rewritten as

J = ∫ ϱ(r) |∇ϱ(r)/ϱ(r) − ∇ϱ_ref(r)/ϱ_ref(r)|² dr.    (19)

On the other hand, the relative Shannon or Kullback-Leibler information constructed from the density has the form

G = ∫ ϱ(r) ln[ϱ(r)/ϱ_ref(r)] dr.    (20)

G has been applied in chemistry to quantify electrophilicity, nucleophilicity and regioselectivity [73].
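These definitions can be checked numerically. A small sketch (illustrative parameters, using the standard forms of Equations (16) and (17)) for two Gaussians of equal width σ whose means differ by d, where the closed forms are J = d²/σ⁴ and G = d²/(2σ²):

```python
import numpy as np

# Relative Fisher (J) and Kullback-Leibler (G) information between
# N(0, sigma²) and N(d, sigma²).  Exact: J = d²/sigma⁴, G = d²/(2 sigma²).
sigma, d = 1.2, 0.7
x = np.linspace(-14, 14, 200001)
dx = x[1] - x[0]
f     = np.exp(-x**2 / (2*sigma**2)) / (sigma*np.sqrt(2*np.pi))
f_ref = np.exp(-(x - d)**2 / (2*sigma**2)) / (sigma*np.sqrt(2*np.pi))

r = np.log(f / f_ref)                       # log-density ratio
J = np.sum(f * np.gradient(r, x)**2) * dx   # relative Fisher information
G = np.sum(f * r) * dx                      # Kullback-Leibler information

print(J, d**2 / sigma**4)                   # both ≈ 0.2363
print(G, d**2 / (2*sigma**2))               # both ≈ 0.1701
```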
The relative local wave-vector was defined as [74]

q̃(r) = q(r) − q_ref(r),    (21)

where

q_ref(r) = ∇ϱ_ref(r)/ϱ_ref(r).    (22)

From Equation (13) follows

q̃(r) = −∇[s(r) − s_ref(r)].    (23)

The relative specific Shannon information can be defined as

s̃(r) = s(r) − s_ref(r) = ln[ϱ_ref(r)/ϱ(r)].    (24)

It is related to the relative local wave-vector by q̃(r) = −∇s̃(r).

| Fisher information, fidelity, fidelity susceptibility and Kullback-Leibler information
For pure states the fidelity is defined by the overlap between the states Ψ and Φ as

F = |⟨Ψ|Φ⟩|.    (26)

Fidelity is an important concept of information theory to measure the "closeness" of two quantum states. In DFT the density is the fundamental quantity, therefore the DFT fidelity [75] is defined with two normalized densities f and g as

F(f, g) = ∫ [f(r) g(r)]^{1/2} dr.    (27)

In quantum chemistry quantum similarity measures have been used for a long time (see e.g. [76]). One of the most popular similarity measures is the Carbó index [77],

R_AB = ∫ ϱ_A(r) ϱ_B(r) dr / {[∫ ϱ_A²(r) dr]^{1/2} [∫ ϱ_B²(r) dr]^{1/2}},    (28)

where ϱ_A(r) and ϱ_B(r) are the molecular electron densities.
Later a generalized quantum similarity index (QSI) was defined [78] as

QSI(q) = ∫ g̃₁^{q/2}(r) g̃₂^{q/2}(r) dr / {[∫ g̃₁^q(r) dr] [∫ g̃₂^q(r) dr]}^{1/2},    (29)

where g̃₁ and g̃₂ are the distribution functions and q is a real number. Obviously, we get back the Carbó index for q = 2. The case q = 1, on the other hand, gives the fidelity. The QSI between Krypton and all neutral atoms with nuclear charge 1-103 for different values of q in position and momentum spaces was presented in [78].
Consider now continuous density distributions depending on some parameter θ. An expansion around θ leads to the definition of the fidelity susceptibility χ:

F(f(θ), f(θ + δθ)) = 1 − (δθ)² χ/2 + … .    (30)

(For convenience only the dependence on the parameter is written in the argument of f.) Expanding f(θ + δθ) around f(θ) we arrive at

F(f(θ), f(θ + δθ)) = 1 − (δθ)² I/8 + …,    (31)

that is,

χ = I/4.    (32)

That is, the DFT fidelity susceptibility χ is proportional to the Fisher information (Equation (2)). In case of shift invariance χ is proportional to the Fisher information of Equation (5) (or Equation (6)).
Following the derivation of Frieden [79] one can find a link between the fidelity susceptibility and the relative or Kullback-Leibler information [72]. Rewrite the Fisher information as

I = (1/δθ²) ∫ f(θ) [f(θ + δθ)/f(θ) − 1]² dx    (33)

in the limit δθ → 0. Introducing the small quantity ν = f(θ + δθ)/f(θ) − 1 and using the expansion ln(1 + ν) = ν − ν²/2 + …, we arrive at

G(f(θ + δθ), f(θ)) = ∫ f(θ + δθ) ln[f(θ + δθ)/f(θ)] dx ≈ (δθ²/2) I.    (34)

Therefore, Equation (32) has the form

χ = G/(2 δθ²)    (35)

or

G = 2 δθ² χ.    (36)

That is, the DFT fidelity susceptibility is proportional to the relative information constructed from the densities f(θ) and f(θ + δθ).
In case of shift invariance, G(f(x − δθ), f(x)) ≈ (δθ²/2) I with the Fisher information of Equation (5) [79].
From the definition Equation (30) it follows that there is a linear relationship between the relative information and the DFT fidelity:

F = 1 − G/4 + … .    (37)
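The chain of small-shift relations above (fidelity deficit proportional to I, Kullback-Leibler information proportional to I, hence F ≈ 1 − G/4) can be verified numerically; a sketch for a slightly shifted Gaussian with arbitrary illustrative parameters:

```python
import numpy as np

# For f(x|θ) = N(θ, sigma²) the Fisher information is I = 1/sigma².  For a
# small shift δθ: 1 - F ≈ δθ² I/8, G ≈ δθ² I/2, and therefore F ≈ 1 - G/4.
sigma, dth = 1.0, 0.05
x = np.linspace(-12, 12, 400001)
dx = x[1] - x[0]
f0 = np.exp(-x**2/(2*sigma**2)) / (sigma*np.sqrt(2*np.pi))
f1 = np.exp(-(x - dth)**2/(2*sigma**2)) / (sigma*np.sqrt(2*np.pi))

F = np.sum(np.sqrt(f0*f1)) * dx              # fidelity (1D analogue of (27))
G = np.sum(f1*np.log(f1/f0)) * dx            # Kullback-Leibler information
I = 1/sigma**2

print(1 - F, dth**2*I/8)                     # both ≈ 3.12e-4
print(G, dth**2*I/2)                         # both ≈ 1.25e-3
print(1 - F, G/4)                            # linear relation F ≈ 1 - G/4
```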

| Local values of observables and variance
Inequality Equation (3) is related to the variance of observables. To reveal this link it is worth introducing local values of quantum observables.
Consider a quantum mechanical observable Â acting on the N-electron wave function Ψ. Define a local quantity as

A = ÂΨ/Ψ = Ā + iÃ.    (38)

The real part Ā gives the expectation value

⟨Â⟩ = ∫ |Ψ|² Ā dx,    (39)

where the integration runs over all electron coordinates x (for a one-electron-like local quantity the weight |Ψ|² reduces to the electron density ϱ(r)). The imaginary part Ã is related to the variance [12]:

⟨Â²⟩ − ⟨Â⟩² = ∫ |Ψ|² (Ā − ⟨Â⟩)² dx + ∫ |Ψ|² Ã² dx.    (40)

That is, the real part of the local value of a quantum observable is associated with the expectation value of the quantum observable, while the imaginary part comprises the fluctuation.
Consider the total momentum operator P̂ as a sum of one-electron operators p̂_j,

P̂ = Σ_j p̂_j = −i Σ_j ∇_j.    (41)

It was shown [12] that the imaginary part of the local value of the momentum is half of the local wave-vector,

P̃(r) = q(r)/2,    (42)

or (apart from sign) half of the gradient of the specific Shannon information per particle:

P̃(r) = −∇s(r)/2.    (43)

Therefore the fluctuation of the total momentum contains the term

(1/4) ∫ ϱ(r) q²(r) dr = I_ϱ/4.    (44)

That is, the imaginary part comprises the fluctuation, which is proportional to the Fisher information.

| Kinetic energy, Euler equation
The relationship between the quantum mechanical kinetic energy and the Fisher information was established by Sears, Parr and Dinur [3]:

E_kin = (1/8) ∫ |∇ϱ(r)|²/ϱ(r) dr + ∫ ϱ(r) t_cond(r) dr,    (45)

where the first term is proportional to the Fisher information (Equation (11)) and t_cond(r) is a Fisher information density associated with the conditional density

Φ(r₂, …, r_N | r) = N |Ψ(r, r₂, …, r_N)|²/ϱ(r);    (46)

Ψ is the wave function. The first term in Equation (45) is the full Weizsäcker kinetic energy [83].
A comparison of Equations (48) and (11) shows that the Weizsäcker kinetic energy [83],

T_W = (1/8) ∫ |∇ϱ(r)|²/ϱ(r) dr,    (48)

is proportional to the Fisher information:

T_W = I_ϱ/8.    (49)

The difference of the total and the Weizsäcker kinetic energies is called the Pauli energy [84][85][86][87]:

T_p = E_kin − T_W.    (50)

As the density is the same in the interacting and in the non-interacting systems, the Pauli energy can be defined by Equation (50) in both systems, with E_kin denoting the kinetic energy of the given system.
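A numerical sketch (hydrogen 1s density in atomic units; an illustration, not from the original) of the relation T_W = I/8: for ϱ(r) = e^{−2r}/π the Fisher information is 4 and T_W = 1/2, which here equals the exact kinetic energy because a one-orbital system has vanishing Pauli energy.

```python
import numpy as np

# Hydrogen 1s density (atomic units): ϱ(r) = e^{-2r}/π, so |∇ϱ|²/ϱ = 4ϱ,
# I = ∫ |∇ϱ|²/ϱ dr = 4 and T_W = I/8 = 1/2 = exact kinetic energy.
r = np.linspace(1e-6, 30, 300001)
dr = r[1] - r[0]
rho = np.exp(-2*r) / np.pi
drho = -2*rho                                  # analytic radial derivative
I = np.sum(4*np.pi*r**2 * drho**2/rho) * dr    # radial integration
T_W = I / 8

print(I, T_W)   # ≈ 4.0, 0.5
```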
The functional derivative of the Weizsäcker term is

δT_W/δϱ = (1/8) |∇ϱ|²/ϱ² − (1/4) ∇²ϱ/ϱ.    (51)

The functional derivative of the Pauli energy is called the Pauli potential:

v_p = δT_p/δϱ.    (52)

T_p and v_p embody all the effects of the Pauli principle (the antisymmetry requirement for the wave function). The Pauli energy has been used to identify strong covalent interactions [88] and to appraise the quality of approximate kinetic energy functionals [89].
The Euler equation of the non-interacting system has the form

δT_s/δϱ + v_KS = μ,    (53)

where v_KS is the Kohn-Sham potential and μ is the Lagrange multiplier arising from the condition that the number of electrons is kept fixed. The Euler equation can also be written as a single Schrödinger-like equation for ϱ^{1/2}:

[−(1/2)∇² + v_KS + v_p] ϱ^{1/2} = μ ϱ^{1/2}.    (54)

It should be emphasized that the Euler Equation (53) of DFT is also the Euler equation for the Fisher information, because the Weizsäcker kinetic energy is proportional to the Fisher information. Equations (49) and (53) lead to [4]

δI/δϱ + w = ν,    (55)

where

w = 8(v_KS + v_p)    (56)

and

ν = 8μ.    (57)

An Euler-like equation can be derived for the specific Shannon information. First rewrite the Euler Equation (53) using Equations (12)-(14).
We can immediately obtain

∇²ϱ/ϱ = ∇·q + q².    (58)

Therefore, the functional derivative of the Weizsäcker term in Equation (53) is

δT_W/δϱ = −(1/8) q² − (1/4) ∇·q    (59)

or, with the specific Shannon information,

δT_W/δϱ = −(1/8) |∇s|² + (1/4) ∇²s.    (60)

Substituting Equation (60) into Equation (53) we arrive at the Euler-like equation of the specific Shannon information:

(1/4) ∇²s − (1/8) |∇s|² + v_KS + v_p = μ.    (61)

Equation (61) has the advantage that it can provide the specific Shannon information directly [64].

| Euler equation for the relative information
Suppose we have another (reference) density ϱ_ref(r) that also satisfies an Euler equation,

δT_s/δϱ_ref + v_KS^ref = μ_ref,    (63)

where v_KS^ref and μ_ref are the Kohn-Sham and the chemical potentials of the reference system, respectively. Equation (63) is equivalent to the Euler equation of the reference Fisher information,

δI_ref/δϱ_ref + w_ref = ν_ref,    (64)

where w_ref = 8(v_KS^ref + v_p^ref) and ν_ref = 8μ_ref. The difference of the Euler Equations (55) and (64) gives [90]

δI/δϱ − δI_ref/δϱ_ref + w̃ = ν − ν_ref,    (65)

where w̃ = w − w_ref. Consider now the relative Fisher information J (Equation (19)) and calculate its functional derivative with respect to ϱ:

δJ/δϱ = δI/δϱ − δI_ref/δϱ_ref.    (69)

Comparing Equations (64) and (69) we arrive at the Euler equation for the relative information:

δJ/δϱ + w̃ = ν − ν_ref.    (70)

The Euler-like equation of the reference specific Shannon information can be written as

(1/4) ∇²s_ref − (1/8) |∇s_ref|² + v_KS^ref + v_p^ref = μ_ref.    (71)

Comparing Equations (61) and (71), an analogous Euler-like equation follows for the relative specific Shannon information s̃ = s − s_ref.

| DFT from the principle of extreme physical information
The principle of extreme physical information is a variational principle that can be used to derive major physical laws. According to the principle formalized by Frieden [79], the "physical information" K of the system is an extremum:

K = I − J_b = extremum.    (73)

I is the fixed form of the "intrinsic" information defined above (Equations (5) and (6)). J_b is the bound Fisher information that incorporates all constraints imposed by the physical phenomenon under measurement.
The fundamental equations of DFT can be derived using this principle [4,91]. Here the derivation of the time-independent Euler equation is summarized. The minimization is done under the following conditions: 1. The wave function is antisymmetric. It was pointed out [92,93] that this requirement can be taken into account by a local potential u_P(r). 2. The density is kept fixed. This constraint can be ensured by a local potential u(r), as is done in the Kohn-Sham system, where the potential is determined by requiring that the density be equal to the density of the interacting system.
3. The electron density is normalized to N. A Lagrange multiplier χ is introduced. Minimization of the Fisher information leads to the Euler-Lagrange equation

δI/δϱ + 8[u_P(r) + u(r)] = 8χ.    (75)

Comparing Equation (75) with the Euler Equations (53) and (55) we see that they are equivalent, with u_P + u playing the role of v_KS + v_p and χ that of μ. The principle of extreme physical information can be used to derive the Euler equation for the relative Fisher information, too [90]. Then we have to minimize J with the constraints above.
The variation leads to the Euler Equation (70). The time-independent Kohn-Sham equations [91] and the time-dependent Euler equation of density functional theory were also derived using the principle of extreme physical information [4]. Moreover, the time-dependent Euler equation for the pair density has also recently been derived utilizing this principle [94]. It is worth mentioning that maximizing the relative Shannon entropy has also turned out to be very useful for several chemical applications [95,96].

| Ghosh-Berkowitz-Parr theory
A combined information theoretical and thermodynamic interpretation of DFT was proposed by Ghosh, Berkowitz and Parr (GBP) [37]. It is summarized and extended in this section. Consider an electronic system with density ϱ that integrates to the number of electrons N:

∫ ϱ(r) dr = N.    (83)

We search for a phase-space distribution function f(r, p) satisfying the conditions

∫ dp f(r, p) = ϱ(r)    (84)

and

∫ dp f(r, p) p²/2 = t(r).    (85)

The kinetic energy density t(r) integrates to the kinetic energy:

∫ t(r) dr = E_kin.    (86)

The phase-space distribution function can be determined by maximizing the information

S = −k ∫ f(r, p) [ln f(r, p) − 1] dr dp,    (87)

where k is the Boltzmann constant, with keeping the density (Equation (84)) and the kinetic energy density (Equation (85)) fixed. The variation leads to a Maxwell-Boltzmann-like distribution function,

f(r, p) = e^{−α(r)} e^{−β(r) p²/2},    (88)

where α(r) and β(r) are r-dependent Lagrange multipliers arising from satisfying the conditions (84) and (85). The distribution function Equation (88) provides the kinetic energy density

t(r) = (3/2) ϱ(r)/β(r),    (89)

reminiscent of the familiar ideal gas expression. Note that Equation (89) is a direct consequence of the fact that the kinetic energy density is kept fixed. Because of this analogy the r-dependent β(r) is called the local inverse temperature, β(r) = 1/(kT(r)). f can be recast in the form

f(r, p) = ϱ(r) [β(r)/(2π)]^{3/2} e^{−β(r) p²/2}.    (90)

The phase-space information Equation (87) with the maximizing distribution function Equation (90) leads to the Sackur-Tetrode expression

S = k ∫ ϱ(r) {5/2 + ln[(2π/β(r))^{3/2}/ϱ(r)]} dr.    (91)
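The conditions (84) and (85) can be verified numerically at a single point r (a sketch with arbitrary sample values of ϱ(r) and β(r), assuming the Maxwell-Boltzmann-like form of Equation (90)):

```python
import numpy as np

# At a fixed r, with f = ϱ (β/2π)^{3/2} exp(-β p²/2), the momentum integral
# returns ϱ and the kinetic energy density is t = (3/2) ϱ/β.
rho, beta = 0.3, 2.0                       # sample local values of ϱ(r), β(r)
p = np.linspace(0, 20, 200001)
dp = p[1] - p[0]
f = rho * (beta/(2*np.pi))**1.5 * np.exp(-beta*p**2/2)

norm = np.sum(4*np.pi*p**2 * f) * dp           # ∫ f dp, Equation (84)
t    = np.sum(4*np.pi*p**2 * f * p**2/2) * dp  # ∫ f p²/2 dp, Equation (85)

print(norm, rho)            # ≈ 0.3
print(t, 1.5*rho/beta)      # ≈ 0.225
```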

| Extremum of Shannon and Fisher information
It is well known that the kinetic energy density t(r) is not uniquely defined. The kinetic energy is unique and does not change if we add to the kinetic energy density a term that integrates to zero. But, of course, β(r), and consequently f, will be different. There are several forms of the kinetic energy density (e.g. [97][98][99]) and one can apply the one that is most suitable for one's purpose. It has recently been proposed to take the kinetic energy density for which the phase-space Fisher/Shannon information is minimum/maximum [60,62]. We seek the particular kinetic energy density that maximizes S with keeping the orbitals and the density fixed. We arrive at

k β(r) = ζ,    (94)

that is, the Lagrange multiplier ζ fixes β to be constant:

β(r) = const = β.    (95)

The same result can be obtained if the phase-space Fisher information is minimized. Take the normalized phase-space distribution function

g(r, p) = f(r, p)/N    (96)

and construct the phase-space Fisher information

I_g = ∫ g(r, p) [∂ ln g(r, p)/∂β]² dr dp.    (97)

Observe that the parameter θ of Equation (2) is now the r-dependent inverse temperature β(r). Substituting Equation (96) into Equation (97) and integrating over p, we get

I_g = (3/(2N)) ∫ ϱ(r)/β²(r) dr.    (98)

The minimization of I_g with fixed kinetic energy leads to

β(r) = const = β,    (99)

that is, to a constant inverse temperature. It has the consequence that the kinetic energy density is proportional to the electron density,

t(r) = (3/(2β)) ϱ(r),    (100)

and there is a simple relation between the Fisher information and the kinetic energy:

I_g = 3/(2β²) = (2/3) (E_kin/N)².    (101)
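The p-integral of the phase-space Fisher information can be checked numerically for the constant-β distribution (a sketch; since ∂ln g/∂β = 3/(2β) − p²/2, the momentum integral is the variance of p²/2, namely 3/(2β²)):

```python
import numpy as np

# Momentum part of I_g for g = σ(r)(β/2π)^{3/2} exp(-β p²/2) at constant β:
# ∫ dp (momentum factor) (3/(2β) - p²/2)² = Var(p²/2) = 3/(2β²),
# independent of r, so the normalized r-integration leaves I_g = 3/(2β²).
beta = 2.0
p = np.linspace(0, 20, 200001)
dp = p[1] - p[0]
gp = (beta/(2*np.pi))**1.5 * np.exp(-beta*p**2/2)   # momentum factor of g
score = 3/(2*beta) - p**2/2                          # ∂ ln g / ∂β

I_g = np.sum(4*np.pi*p**2 * gp * score**2) * dp

print(I_g, 3/(2*beta**2))   # both ≈ 0.375
```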

| Fisher information for Coulomb systems
The GBP transcription was proposed for the Kohn-Sham scheme and applied to the non-interacting kinetic energy. But we can observe that the derivation makes no reference to the non-interacting system, so it remains valid if the true interacting kinetic energy is employed [100]. The extremum of the Shannon and Fisher information can also be studied both for the non-interacting and for the interacting kinetic energies. If we consider Coulomb systems we can obtain further important relations. At the equilibrium nuclear geometry the virial theorem has the form

E = −E_kin,    (102)

where E and E_kin are the true (interacting) total and kinetic energies, respectively. Therefore, Equation (101) can be rewritten as

I_g = (2/3) (E/N)²,    (103)

or we can express the total energy with the Fisher information:

E = −N (3 I_g/2)^{1/2}.    (104)

Having another look at the derivation above, we can also observe that it is valid for excited states as well. One only needs the density and the kinetic energy density; they do not have to be the ground-state density and kinetic energy density. Moreover, the virial theorem is also valid for excited states. Therefore, Equations (103) and (104) can be written for an excited state. Then the ith excitation energy can be given as

E_i − E_0 = N [(3 I_g^0/2)^{1/2} − (3 I_g^i/2)^{1/2}],    (105)

where E_0, I_g^0 and E_i, I_g^i are the total energy and the phase-space Fisher information of the ground and the ith excited state, respectively.
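A hydrogen-atom arithmetic sketch (atomic units; assuming the reconstructed forms of Equations (103) and (104) given above):

```python
# Hydrogen atom, atomic units.  Virial theorem: E_kin = -E for each state.
# Assumed reconstructed relations: I_g = (2/3)(E/N)², E = -N (3 I_g/2)^{1/2}.
E0, E1 = -0.5, -0.125            # 1s and 2s total energies (N = 1)

Ig0 = (2/3) * E0**2              # phase-space Fisher information, ground state
Ig1 = (2/3) * E1**2              # 2s excited state

E0_back = -(3*Ig0/2)**0.5        # energies recovered from I_g
E1_back = -(3*Ig1/2)**0.5

print(Ig0)                       # 1/6
print(E1_back - E0_back)         # 0.375, the 1s -> 2s excitation energy
```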

| Phase-space fidelity, fidelity susceptibility and Fisher information for Coulomb systems
Based on Equation (27) the phase-space fidelity can be written with the phase-space density functions g(r, p|β) and g_ref(r, p|β_ref) as

F(g, g_ref) = ∬ [g(r, p) g_ref(r, p)]^{1/2} dr dp.    (106)

Using Equation (96) we obtain [100]

F(g, g_ref) = ∫ [σ(r) σ_ref(r)]^{1/2} {2 [β(r) β_ref(r)]^{1/2}/[β(r) + β_ref(r)]}^{3/2} dr.    (107)

For constant inverse temperatures F has the form

F(g, g_ref) = {2 (β β_ref)^{1/2}/(β + β_ref)}^{3/2} F(σ, σ_ref),    (108)

where

F(σ, σ_ref) = ∫ [σ(r) σ_ref(r)]^{1/2} dr    (109)

is the position-space fidelity between the states with densities ϱ and ϱ_ref. That is, the phase-space fidelity is proportional to the position-space fidelity, where the factor of proportionality depends on the inverse temperatures. Instead of the inverse temperatures one can use the total energies, too [100]: as β = −3N/(2E),

F(g, g_ref) = {2 (N N_ref |E| |E_ref|)^{1/2}/(N |E_ref| + N_ref |E|)}^{3/2} F(σ, σ_ref).    (110)

Based on Equation (17) the relative or Kullback-Leibler information, defined as

G(g, g_ref) = ∬ g(r, p) ln[g(r, p)/g_ref(r, p)] dr dp,    (111)

measures the "distance" between the phase-space density functions g and g_ref. From Equations (96) and (111) follows

G(g, g_ref) = G(σ, σ_ref) + (3/2) ∫ σ(r) {ln[β(r)/β_ref(r)] + β_ref(r)/β(r) − 1} dr.    (112)

For constant inverse temperatures,

G(g, g_ref) = G(σ, σ_ref) + c,    (113)

where

G(σ, σ_ref) = ∫ σ(r) ln[σ(r)/σ_ref(r)] dr    (114)

and

c = (3/2) [ln(β/β_ref) + β_ref/β − 1].    (115)

That is, the phase-space relative information G(g, g_ref) is equal to the position-space Kullback-Leibler information plus a term depending on the inverse temperatures. c can also be expressed with the total energies [100].
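The momentum factor of the constant-β phase-space fidelity can be checked by direct integration (a sketch with arbitrary β values; the Gaussian momentum integrals give the factor [2(β β_ref)^{1/2}/(β + β_ref)]^{3/2} that multiplies the position-space fidelity):

```python
import numpy as np

# Momentum integral of [g g_ref]^{1/2} for two Maxwell-Boltzmann factors
# with constant inverse temperatures b1 and b2.
b1, b2 = 2.0, 3.5
p = np.linspace(0, 20, 200001)
dp = p[1] - p[0]
g1 = (b1/(2*np.pi))**1.5 * np.exp(-b1*p**2/2)   # momentum factor of g
g2 = (b2/(2*np.pi))**1.5 * np.exp(-b2*p**2/2)   # momentum factor of g_ref

factor = np.sum(4*np.pi*p**2 * np.sqrt(g1*g2)) * dp
exact  = (2*np.sqrt(b1*b2)/(b1 + b2))**1.5

print(factor, exact)   # both ≈ 0.9437
```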
The phase-space Fisher information can be rewritten in complete analogy with Equation (32): for small δβ,

I_g = 4 χ_g = (2/δβ²) G(g(β + δβ), g(β)).    (116)

That is, the Fisher information, the fidelity susceptibility and the Kullback-Leibler information are proportional quantities. There is a linear relationship between the Kullback-Leibler information and the fidelity [100] for "close" states.
Instead of the relative informations J_f (Equation (16)) and G_f (Equation (17)), the symmetrized forms called the Fisher divergence and the Jensen-Shannon divergence can also be used [66,67]. These divergences have been studied for different families of probability distributions. They provide relevant information on the atomic shell structure both in position and in momentum space.
Fisher information has found several interesting applications in DFT. A stimulating utilization is related to the steric effect. The steric effect is a widely applied qualitative concept of chemistry: it indicates that an atom occupies a certain amount of space in a molecule. A precise definition was given within DFT by Liu [84]. According to it, the steric term E_steric equals the Weizsäcker kinetic energy: E_steric = T_W. It means that the steric term is proportional to the Fisher information. The Weizsäcker kinetic energy has also been employed to predict stereoselectivity [101,102].
The Euler equation presented in several forms in Sections 3.1 and 3.3 is valid for the ground state and, if Coulomb systems are considered, even for excited states. The Euler equation for the relative information in Sections 3.2 and 3.3 can be obtained from any two density functions, and an Euler-like equation can be derived for the relative Shannon information with any appropriate reference density. So the relative Shannon information can be studied, for example, between an excited state and the ground state, two different excited states, or two (excited or ground) states of different systems. The specific relative Shannon and Fisher information of the hydrogen atom was analyzed in [90].
In Section 4 phase-space distribution functions are considered. Phase-space Fisher information has interesting properties [103]. These distribution functions are supposed to be nonnegative and produce the correct marginal distribution functions [104][105][106][107]. It means that integrating f with respect to the momentum (position) coordinates we are led to the position (momentum) density. We mention in passing that the Wigner function [108] has the correct marginals, but is not everywhere nonnegative. Moreover, Wigner showed that bilinear distribution functions are not universally nonnegative. As Cohen and Zaparovanny [109] proved there are distribution functions that are nonnegative, provide the correct marginals, but they are not bilinear.
In the Ghosh-Berkowitz-Parr approach only the position-space marginal of the distribution function is correct. The momentum-space marginal is different from the correct one as the correct kinetic energy density is constrained instead.
The choice of the constant temperature is especially appealing because in Coulomb systems the GBP phase-space distribution function is a function of the density. It is the consequence of the fact that the density decays asymptotically as

ϱ(r) ~ e^{−2[2(E_0^{N−1} − E)]^{1/2} r},    (121)

where E_0^{N−1} − E is the vertical ionization potential of the N-electron system; E is the energy of the state considered and E_0^{N−1} is the ground-state energy of the (N − 1)-electron system [28][29][30]. Thus, the asymptotic decay of the density determines the energy E, and β via the virial theorem. Therefore, the density determines the phase-space distribution function g (Equation (96)). Obviously, the density dictates the phase-space Fisher information I_g (Equation (103)), too.
In Section 3 the position-space Fisher information with translational invariance is applied. In Section 4, on the other hand, the phase-space Fisher information (I_g) with parameter β is utilized. It is easy to derive a simple relation between these quantities. From Equation (50) it follows that E_kin ≥ T_W. Then, using Equations (49) and (101), we arrive at an inequality between the two kinds of Fisher information:

I_g ≥ I_σ²/96.    (122)

There is equality for one-level systems. The simplest example is the hydrogen atom. In the ground state the kinetic energy is E_kin = 1/2 (in atomic units). Equation (49) gives I_σ = 4, while Equation (101) provides I_g = 1/6. That is, we are led to equality in relation (122).
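The two Fisher informations of the hydrogen atom can be checked numerically (atomic units; assuming the reconstructed form of Equation (101) above):

```python
import numpy as np

# Hydrogen atom: I_σ = 4 from the 1s shape function σ(r) = e^{-2r}/π,
# I_g = 1/6 from the constant-β relation, and I_g = I_σ²/96 holds with
# equality because hydrogen is a one-level system (E_kin = T_W).
r = np.linspace(1e-6, 30, 300001)
dr = r[1] - r[0]
sigma = np.exp(-2*r) / np.pi                    # 1s shape function
I_sigma = np.sum(4*np.pi*r**2 * 4*sigma) * dr   # |∇σ|²/σ = 4σ here

E_kin = 0.5
I_g = (2/3) * E_kin**2                          # Equation (101) with N = 1

print(I_sigma)              # ≈ 4.0
print(I_g, I_sigma**2/96)   # both ≈ 1/6
```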
Another advantage of the GBP phase-space distribution function is that the phase-space fidelity is the product of the position-space fidelity and a term depending on the energies (Equation (110)). It should be emphasized that the normalized phase-space distribution functions g and g_ref are not necessarily associated with the same system. These functions can belong to any Coulomb systems, and the phase-space fidelity can uncover the similarity of different systems.
In Coulomb systems the phase-space fidelity and the Kullback-Leibler information are also determined by the density; that is, the (position-space) density comprises phase-space information. Earlier, the position- and momentum-space Fisher information were generally analyzed separately (e.g. [110][111][112]). The phase-space fidelity and Kullback-Leibler information include information from both spaces and can provide a combined measure of resemblance.
Fidelity and fidelity susceptibility have been applied in several fields, e.g. in quantum phase transitions [113] and topological phase transitions [114,115]. The link between relative information and fidelity susceptibility [116,117] has also been revealed and illustrated with the quantum phase transition.
Chemical reactions have turned out to be an important field of application of information theoretical concepts. Reactivity and selectivity descriptors have mainly been used in ground-state theory, though these descriptors are often linked to excitability. (Even the first excitation energy can be used as a reactivity index [118].) Excited-state reactivity is now a new frontier [59,119,120]. A recent paper studies chemical selectivity through well-selected excited states [121]. It is expected that the Fisher information and the related quantities presented above will turn out to be valuable tools in investigating excited-state reactivity descriptors.