Variational improvement of the statistical model for heavy atoms, with minimization of correlation energy

The statistical model of Thomas–Fermi–Dirac treats many-electron atoms in terms of statistical density, and derives the well-known single particle potential US which in turn gives the determinantal function ΦS that carries the many-body feature. This paper presents: (a) a variational improvement of the model is presented by simply modifying the Ritz principle with a special form of trial function that contains both the ΦS and dynamical correlation parts. With the ΦS assuming the principal role, the Ritz principle is improved for treating heavy atoms. (b) Mediated by ΦS, the variational functional reduces immediately to that of the correlation energy Ect, and, when optimized, yields an upper bound on the exact energy. The exact Ec is derived to clarify the contents of the Ect, and, several approximations using the bound property are discussed. (c) Finally, the RRU is critically compared with the density functional theory, indicating that the RRU can be a viable variational alternative, as they are operationally inverse of each other, but with the important feature that the theory can be systematically improved.


Introduction
For the ground state of an atom with a small number N of electrons, the Rayleigh-Ritz (RR) variational principle [1], with its bound property, provides binding energies, often of very high accuracy. As N increases, the RR rapidly becomes more difficult to apply, as more correlation parameters are required. The Hartree-Fock-Slater method (HFS) [2,3], with Slater determinants and self-consistency, has been the most widely adopted approach for all N, but the self-consistency requirement for each orbital makes it computationally intensive, although several computation program packages are available. The many-body theories (MBT) have been developed [4][5][6] 1 , some along the HFS, and e.g. the correlation energy of an inhomogeneous electron gas has been calculated in terms of density fluctuations.
On the other hand, the statistical model of Thomas-Fermi-Dirac (TFD) [7][8][9] treats the many-body feature of a large atom in terms of the statistically averaged electron charge distribution, which in turn generates, via the Poisson equation, the well-known single-particle TFD potential U S . The individual shell orbitals are generated by the U S , which can then be used to construct the determinantal function Φ S and its energy E S for the ground state. Gross properties of many complex systems have been described well by the model [8]. The density functional theory (DFT) [10][11][12] has been designed to improve the TFD and also to simplify the HFS. Once the complicated exchange-correlation energy E X xc is evaluated, e.g. by the MBT [6], the rest of the theory is represented by the firstorder Kohn-Sham density ρ X and the single-particle potential U X derived from E . section 2 by choosing a special form of two-component trial function that contains explicitly the Φ S and an Ω t for dynamic correlation. This is a trivial extension, but makes the RR more readily applicable for systems with large N. Since the RRU turns out to be of more general applicability, we introduce a class of statistical potential model (SPM), with a single-particle U t and associated determinant function Φ t . The choice Φ t =Φ S and U t =U S gives the TFD as a prominent special case of SPM.
When a pre-fixed Φ t is inserted into the RR, the energy functional is shown to reduce immediately to that of correlation energy E ct and gives the RRU. Thus, the Φ t not only induces separation of E ct from the total energy, but also allows evaluation of E ct for each given Φ t and improves it variationally. Obviously, the RRU would be most effective if the part Φ t is made dominant, e.g. by optimal variation of a trial U t , and the Ω t can be simply approximated, as partially realized with the TFD and DFT. It will be assumed that such optimization can be and has been done.
In section 3, the exact E c is derived to clarify the contents of Ω t and E ct , and several sample approximations are considered in section 4. The RRU is then critically compared with the DFT in section 5, showing that the RRU can be a viable variational alternative to DFT. Driven by an optimized Φ t , the RRU enhances applicability of the Ritz principle for heavy atoms with many electrons. As will become clear, the RRU has the operational structure that is inverse of the DFT.

Modified RR with SPM functions-RRU
The RR principle for the ground state of an N-fermion bound state is given for a square-integrable trial function where η t is the Lagrangian multiplier. (Subscript 't' for trial, or approximate.) The full Hamiltonian H=H 0 +V, where H 0 =T+U Z for the kinetic energy and core potential. With the variations of J with respect to Ψ t and η t , we obtain η t →E t =J, and the well-known bound property E t E, where E is the exact ground state energy. The RR is especially effective for systems with small N (< 4), for which one-component trial functions are employed, as contrast to the approach given below.
2.a. The SPM model emphasizes the role of single-particle potential U t and is simply constructed by an independent-particle model Hamiltonian where the U t is given by = S = U u ; i N t 1 t it represents on average the potential for all the occupied orbitals that satisfy Fermi statistics. Its explicit form needs not be that of TFD, and will be the subject of study in the following. Typical examples are the atomic and nuclear shell models, and the TFD. The H U t generates the individual orbitals j nt mt nm In the following, all the Φ's are assumed to be of SPM type, constructed with the orbitals {j i ( j)} generated by U t , and n=1 for the ground state, we often set Φ 1t =Φ t .
The simplest case of RR is to take Ψ t =Φ 1t . Then, from the RR and (1), we have Note that the first term in (3a), 1t needs not be a bound on E 1 , in which case the second term is an important correction that restores the bound.
The E 1t can be optimized by varying the U t , taken as a nonlinear parameter. An obvious choice for U t is the U S and F = F , E  where the symbol∼indicates that no bounds can be proven. Incidentally, the Φ nt 's are especially convenient in defining density matrices, as * r = F ⋅ F , U N,nt nt nt and all the lower order r , U k,nt 1kN. They can be generated by the usual reduction procedure, which is reversible without loss of information in the present case of SPM functions Φ nt . (Notation: r U k,nt for the density matrix of order k for the model state n, and subscript 't' for trial or approximate.) The state label n=1 for the ground state will be dropped for simplicity.
2.b. A special form of the trial function for the ground state is constructed that contains two components, as We first examine the case where both the Φ t and Ω t are varied independently. The form (4) for Ψ t contains two parts, one for the SPM and the other for the 'correlation'. For a given Φ t , the Ω t term corrects variationally, and vice versa.
the J can be written as dual variational functionals. That is, and in which both the Φ t and Ω t are independently varied. The result is a set of coupled equations, where the normalizations depend on the coupling terms. Ordinarily, the bound state problem is usually handled in the RR by one component trial functions Y = W , t t without the Φ t . But, the Φ t term is included in the RRU to facilitate the RR for heavy atoms, still maintaining the bound property.
Incidentally, the two-component description is adopted frequently in describing scattering systems [13,14], for different reasons related to complicated asymptotic conditions. Nevertheless, it improves the variational treatment, with the bound property on the amplitude [15] 2 .
2.c. If a Φ t is pre-selected and held fixed, at F = F t s with F F = ( ) , 1 s s for a SPM function Φ s . The subscript s is to identify the fixed Φ s . Then, the J of RR immediately reduces to The variation d dW = J 0 c t then gives, with the above normalization condition on Ω t , Equation (8) is the first of the two main results of this paper on the RRU and correlation energy that improves the effectiveness of RR for heavy atoms with many electrons. (See also the form (10) below.) Its derivation is simple and elementary, and the many-electron feature is principally included in the Φ s , as it is supported by the TFDtype SPM potential U t . Explicit examples for the Ω t are given in section 4 that illustrate the effect of normalization condition. As stated earlier, the choice F = F s S and = U U t S gives the case of TFD, and the J c of (8) is a variational improvement of the TFD.
The identification made in (8) of J c as a correlation functional E ct is not immediately apparent, and must wait for the analysis given in section 3 where the exact correlation energy E c is derived. Evidently, the optimized E ct depends on the choice of Φ s ; for any Φ s , corresponding E ct can be defined. Thus, within the RRU, generally there are no unique definition of E c , as it represents the remainder of the correlations carried by E s and Φ s . However, one special case is described in section 3, where the U t is chosen such that the exact and model energies are made equal.
2.d. The bound property of E ct follows from the original RR, by letting and showing that The results (7)- (9) are the desired variational expression for the correlation and total energies. In the case of TFD, the TFD energy = E E s S is variationally improved by E ct . The derivation is simple, elementary and direct, and the (7) is of general validity, to be applicable to a wide variety of systems, especially in improving the TFD, shell model, pseudo-states with polarization potentials, and problems in surface physics, etc. The case with a more concrete choice of Φ t will be discussed in the next section.
The form (8) is not convenient for obtaining a direct explicit proof of the bounds. For this purpose, using the normalization condition given in the curly bracket in (7), (8) can be re-written, with  (10) becomes indicating that its bound cannot be given; for an accurate η t it is likely to be negative.

The exact E c and analysis of E ct
It is not immediately clear that the variational functional forms given by (8) and (10) are in fact that for the 'correlation energy'. In this section, the contents of E ct and its physical meaning are clarified by first specifying the function Φ s and then examining the 'exact' correlation function Ω t associated with it. This will also help comparison of the RRU with the DFT, in section 5.
The analysis basically involves comparison of the two spectra, of H and H .
U t The spectrum of H is given by the set {Ψ n , E n }, where D n Ψ n =0 with D n =E n − H and (Ψ n , Ψ m )=δ nm . In addition, we have the independent-particle SPM Hamiltonian n can be readily constructed, in determinantal form. As in section 2, we are interested only on the ground state properties, with n=1, the subscript n=1 is dropped for simplicity in the following, unless more clarity is needed; e.g. Φ=Φ 1 , E=E 1 , and Ω=Ω 1 , etc).
3.a. The exact ground state wave function Ψ is set, as in (1) with the special choice Φ t =Φ, The non-trivial solution for Ω, Ω≠−Φ, is obtained by firstly setting R=DΦ, and write Ω=−D −1 R.  (12a) and (13), the single-particle potential U has been left unspecified, and the discussion has been general. Now, we choose the special U=U M such that

3.b. In
, by adjusting . 14 U This choice of U in (14) connects for the first time the original system described by H to the model system H U . Writing U M simply as U ≡ U M in the following, the left hand side of (13) is given as Q That is, we have, with (12a), the self-consistency condition for the U, as , and the correlation energy is defined uniquely by (16) is the direct consequence of the requirement Δ=0 of (14), and thus they are equivalent. With Ψ t =Φ t +Ω t , the relations (14)- (17) can also be written as by adjusting the U t , and The E c of (17a) clearly represents the 'dynamical correlation', as it depends on the V's of second and higher orders, but its magnitude depends also on the difference Y=V−U, and not on V alone. This is physically reasonable, as whatever included in U must be subtracted out in E c . From (4) and (16), 3.c. One of the drawbacks with (17) is the explicit dependence on the exact energy E which appears in G Q . In principle, with Δ=0, E can be replaced by E U , or by E t . This problem does not appear in (8) or in (10), where the Lagrangian multiplier η t replaces the exact E.
The W of (17) is the correlation potential, defined in the full space spanned by the determinantal Φ n 's, nonlinear and nonlocal, and in multi-particle form (connecting up to four particles at a time). But, it is always to be averaged over the density matrix of order * r = F ⋅ F N, U N, 1 1 1 of order N, reducing it effectively down to the order two after the reduction. On the other hand, the U M in (14) is by definition in the single-particle form, so that we cannot simply set U M =V+W unless both the V and W are converted to single-particle, localized forms. The necessary conversion may be accomplished simply and approximately by performing the integrations only over the N -1 variables, dr T − 1 , which omits dr 1 .
Thus, the (16) and (17) support the main results of this paper (8)- (10); they give the set that must be solved selfconsistently for the U and Φ. The explicit derivation of E c and Ω in (17) is new, and shows for the first time all the necessary components to be self-consistent; U M , Φ, Y=V -U M , and G Q , all of which are intimately related to each other. The G Q itself must also be related to these components.

Approximations on Ω t and W
The development of the RRU and the exact E c given in sections 2 and 3 is useful only if simple approximations can be devised. We consider here a few sample cases, some retaining the bound property but always requiring the self-consistency (16), under the condition Δ=0.
4.a. Approximations directly on the Ω t in the variational functional J c can be obtained by constructing a properly normalized trial function. Consider the one-term case for Ω t , The E ct of (10) is then evaluated by inserting (18b). Further optimization of J c can be carried out through choices in the form of W¢ .
t For example, a determinantal form W¢ t = F¢ , t but F¢ t ≠Φ 1t . The trivial case W¢ t =Φ 1t simply gives Ψ t =−Φ 1t , and reduces to the case given by (3). Generally, more terms can be added in Ω t to improve the J c , as e.g.
where the normalization condition in (7) must be imposed. For convenience, the n=1 term is omitted in (19), such that (Ω t , Φ 1t )=0. It is of interest to compare (19) and the direct RR with Ψ t , expanded for example in a form Y = S ¢F = c , n n t 1 Nx nt with the original normalization condition (Ψ t Ψ t )=1. Since the Ω t of (19) includes a small component of Φ 1t , the fixing of Φ t =Φ 1t in the RRU eventually should give the same total energy E t , with ¢ c 1 ≈1. But, this point is made explicit in RRU, while the rest of the terms are treated separately.
4.b. The closure approximation [16,17] 3 for G Q and E c,av can be introduced, in which the energy denominators (E -E n ) −1 , n>1 and E=E 1 , in G Q are replaced by (E -E av ) for some E av . Then, by closure of the complete set {Ψ n }, as I=∑ n=1 |Ψ n )(Ψ n }, we have where E av is the effective 'average excitation' energy; it is often close to E 3 through E 5 . Methods to estimate E av are available [18] 4 . On the other hand, for example, assume that E av is approximately known, say E av ≈E 4 . Then, for A form similar to (21b) was also derived for the van der Waal's potential [16] (see footnote 3).
From (17) c,av av av 2 2 av 3 For the E av for the Lamb shift. Reference [13]. Specially see p 690 for the closure approximation. 4 Evaluation of E av for the average fluctuation potential was discussed in some detail.
But, the second term in (22a) vanishes; (Φ, Y Ψ)=Δ (Φ, Ψ)=0 by the definition of U M . Therefore, we finally have, with E=E U for Δ=0, dx i =dr i ds i including the spin. The density in (22b) immediately reduces to that of order three, and even lower with further approximation on Y 2 .
The E c,av is one of the simplest results of this paper for the ground state n = 1. Evaluation of the parameter E av requires the potential U M in Y, as well as the Φ and E U . All the terms in (22b) are calculable; the steps involved in its evaluation are completely spelled out. The iteration loop is: U M →Φ, E U →E c,av , (Φ, VΦ)=E dx →check with E UV =(Φ, U M Φ) and adjust U M and return back. Therefore, a self-consistency condition is involved in the evaluation of E c,av .
The corresponding W av can be defined as which is an average fluctuation potential, and is 'local', but can involve up to four-body coupling; by contrast W of (12) is generally nonlocal. Thus, the E c,av requires the statistical densities r U k of orders up to k = 3, which can be readily obtained from the * r = F ⋅ F U N,1 1 1 by the usual reduction procedure, but the W av may further be approximated. Incidentally, the potentials such as W and W av are routinely derived in composite system scattering [19].
4.c. Variational approximations on the G Q can be derived systematically using the bound property W<0 and the Hylleraas-Undheim-MacDonald (HUM) theorem [20] 5 . For example, an expansion of G UQ in terms of the set {Φ n } may be used. A variational estimate of G Q with the bound property [20][21] (see footnote 5) can be derived, as where the Ψ t is a square-integrable trial function. Equation (24b) can be improved by increasing its ranking, N x >1, which requires a diagonalization of the matrix with D, via the HUM. The result is similar to (19).
Unlike with (18), (24) depends on the exact ground state energy E≡E 1 . This is an unsatisfactory feature that is always present in the Green's function formalism, but operationally can easily be adjusted. On the other hand, relying on the formal condition Δ=0, we may replace the G Q t of (24b) by G , UQ t which involves Φ nt . The bound property is restored when W is inserted into J c and adjusting c t for proper normalization.

Comparison of the RRU and DFT
The RRU, developed in section 2 and elaborated on in sections 3 and 4, is given in two component form, involving the spectra of H U and H, representing the independent-particle model and the correlation effects. Thus, the RRU has the general structure very much similar to that of DFT, and this is by design in search for an alternative. In fact, the RRU variationally improves any SPM of the TFD type. The original motivations for developing the RRU is not only to enhance the applicability of the RR in treating heavy atoms by incorporating the many-body aspect of the TFD, but also to provide a variational alternative to DFT, which has been highly successful in treating many complex systems. 5.a. We first briefly summarize the principal structures involved in the RRU and DFT, for a detailed comparison: (i) While the DFT relies on the densities as the main variable, the RRU employs the wave function description, which is convenient for the RR. If the RRU is to be converted to that involving density matrices, both the first and second orders must be present, in sharp contrast to the principal part of the DFT where only the first-order Kohn-Sham density r X 1,1 is involved in the energy functional; evidently, the complication due to the dynamic exchange-correlations is packaged in the E xc .
(ii) The E X xc obtained by the MBT in the DFT, that includes both the exchange and correlation effects, must be compared with the E ct of sections 2 and 3, where the exchange effect is included through the properly antisymmetrized, determinantal wave functions Φ s and Ω t .
(iii) Within the RRU, the unique definition of the E c is given in the case D º - by adjusting the U t , such that (16) is satisfied; they are equivalent. The Δ can be replaced by D = = E E U t 1 t 1 t in practice. (iv) The difference Y=V -U appears in the definition of E c , as it is mediated by Φ s , so that the 'correlation' is defined by whatever is not covered by Φ s . Presumably, the E X xc includes this effect in terms of the density fluctuation [6,11].
(v) The energies E ct and E s =(Φ s , HΦ s ) can be systematically improved in the RRU, using the bound properties (9) and (17); the former by trial function Ω t and the latter by adjusting the U t . The choice Φ s =Φ S of TFD should facilitate this process, especially for systems with large N.
(vi) Specifically, the RRU gives a variational improvement of the TFD when the Φ S of the TFD is chosen for Φ s , and the corresponding J c is minimized for E ct . That is, the TFD is first represented by constructing the determinantal function Φ S with the potential U S that generates the single particle orbitals j .
S i This is the essential step in adopting the RRU. Then, equation (4) is Y = F + W , S t S S t and construction of the trial function W S t follows the procedure outlined in section 4.a.
5.b. To make further a clear comparison, explicitly summarize the operational steps: RRU-The principal steps involved in the self-consistency loop are: (a) choose the U t , similar to U S but not necessarily equal to it, and generate the orbitals {j i }, which in turn gives Φ t . This part has been done frequently, e.g. within the central potential model for atoms and shell model of nucleus. (b) Choose a square-integrable trial function Ω t and evaluate the functional J c of (5b); this part is difficult to carry out for systems with large N, but not with (23), for example. (c) Minimization of J c [Ω t : Φ t ] with respect to Ω t for a given, fixed Φ t gives E ct . The E ct part is missing in the shell model. Return to (a) and repeat the cycle, until the lowest energy is achieved. As stated above, instead of Ω t in (b), the E ct can be estimated in terms of the G t in one of the approximations described in section 4, such as that of (22); this will avoid some of the complications with integrals involved with Ω t .
DFT-on the other hand, the DFT [10,11] operationally involves (a′) construction of E X xc by a MB method [5,6], which involves a careful estimate of density fluctuation [6].
1,1 is the Kohn-Sham first-order density. It is then combined with the rest of the direct potentials to form the U X . (c′) Finally, the individual orbitals j { } i X are generated by U X , which in turn gives the density ρ X . The total binding energy is then evaluated. The iteration cycle back to (a′) gives a new E , X xc which depends on the new ρ X .
Clearly, the two theories involve steps which are the inverse of each other. That is, (a) versus (c′), (b) versus (b′), (c) versus (a′); the RRU is operationally inverse of the DFT.
The main simplification is in the treatment of E c , in the steps (b) and (b′). Thus, if one is to start the RRU with the Φ X , obtained e.g. with the Kohn-Sham orbitals (or its slight variations) by a dilation (=inverse of reduction) procedure, then the final result on E ct should be comparable to that of DFT. This is variationally feasible. Therefore, we may infer the potential effectiveness of the RRU from that the DFT.
The DFT formulation [10] was originally based on the RR, but presumably the choice of the density as the primary variable in relating the DFT to the TFD necessitated reliance on the MBT analyses for the E X xc involving the density fluctuation. By contrast, the RRU formulation starts with the RR and the trial functions Φ t and Ω t , which immediately result in dual variational functionals. But, with Φ t =Φ s held fixed, the functional J c can be minimized, thus resulting in a Φ s dependent E ct .

Summary and conclusion
We have formulated the RRU and focused on the detailed analysis of its theoretical contents. The principal results discussed in this paper are: (a) the derivation of a variational improvement of the TFD, especially for heavy atoms, (b) explicit definitions of the correlation energy and the proof of its bound property, and (c) a close comparison of the structures between the RRU and DFT. The RRU is a viable alternative to the DFT, with the capability of systematic improvements, and could potentially be extended to treat excited and scattering states.
The RRU for the ground states of heavy atoms with large number of electrons N is effected by introducing a two-component trial function, the dominant Φ s of determinantal form and the small correlation function Ω t . The former allows ready treatment of heavy atoms, while the latter is driven by Φ s in implementing the correlation. Simply summarized, RRU=SPM (with single particle U s and Φ s )+RR (with Ω t and J c ). For example, in the case of TFD with Φ s =Φ S , an accuracy of the E S is ∼5% or better for N>7, so that, to attain an overall accuracy of 1%, the E ct (with Ω t ) must be obtained at the level of only ∼20%. For systems with N<7, the Ω t plays as important a role as Φ s .
The establishment and retention of the bound property of the variationally determined energy are crucial in allowing systematic improvements. This property is absent in the DFT. Furthermore, the unique E c is defined for Δ=0 in (17) with the self-consistency condition (16), where the definition W involves explicitly on the difference Y=V -U. That is, the E c represents only the remaining part of correlation that is missed by E s , through the model potential U t . This is the direct consequence of the dependence of E c on Φ s . Finally, in section 5, the RRU is critically compared with the DFT. They both have the similar general structure, in a two-component form, but operated in reverse order. Simply stated, DFT=TFD (with U X and E X xc )+MBT (for E X xc ). Furthermore, the RRU and DFT employ two different variables; the RRU employs the wave functions while the DFT is developed with the densities as basic variables.
The effectiveness of the RRU has not been addressed directly in this paper, as it requires a careful and extensive study. It is beyond the scope of the present paper. However, from the close 'inverse' relationship between the RRU and DFT, as shown in section 5.b, the effectiveness of the RRU may be inferred from that of DFT which has been successful in treating a variety of complex systems over the years. On the other hand, several careful analyses of properly chosen systems should eventually help to fully demonstrate the effectiveness of the RRU.
The RRU should be extended to excited states and ensemble density functionals, preferrably with a bound property of the form (9). The insertion of SPM, combined with the HUM [20] (see footnote 5), may yield an effective approach for the excited states. For the scattering state, the open and closed channels are usually treated separately [21] 6 , as only the wave functions for the latter are integrable. Previously, an ensemble DFT was adopted for the closed channel component, assuming that an equivalent ensemble energy functional B X xc is to be given by MBT's. Obviously, the B X xc must be augmented by the excited-state RRU noted above. A detailed application of the RRU is being carried out, and works to extend the theory to excited and scattering states are being completed.
Several critical comments made by Dr B C Chang on this paper are gratefully acknowledged.