Variational Cluster Approximation to the Thermodynamics of Quantum Spin Systems

We derive a variational cluster approximation for Heisenberg spin systems at finite temperature based on the ideas of the self-energy functional theory by Potthoff for fermionic and bosonic systems with local interactions. Partitioning the real system into a set of clusters, we find an analytical expression for the auxiliary free energy, depending on a set of variational parameters defined on the cluster, whose stationary points provide approximate solutions from which the thermodynamics of spin models can be obtained. We explicitly describe the technical details of how to evaluate the free energy for finite clusters and remark on specific problems and possible limitations of the method. To test the approximation we apply it to the antiferromagnetic spin 1/2 chain and compare the results for varying cluster sizes and choices of variational parameters with the exact Bethe ansatz solution.


Introduction
Magnetism is one of the fundamental phenomena in physical systems emerging from the interplay between Pauli's principle and the Coulomb interaction. In condensed matter systems the macroscopic nature of the system together with the crystal structure and details of the electronic bonding can lead to a huge variety of effects related to magnetism [1], and a proper theoretical description is still a major challenge. Loosely speaking one can distinguish two classes of magnetic materials, viz itinerant and localized magnets. The latter can be found quite frequently in so-called Mott insulators based on transition metal compounds [2] and can be well represented by models of localized spins interacting via an exchange interaction [1]. The simplest of such models is the Heisenberg model [1] where we introduce a magnetic field and allow for an anisotropic exchange interaction when J zz ij = J −+ ij . The physics of this model is well-known for the one-dimensional case [3]. For dimensions D ≥ 4 one can apply Weiss mean-field theory which correctly describes the universal properties of phase transitions and can be used to at least qualitatively calculate physical quantities [4]. For two and three dimensions, nearest-neighbor exchange and simple lattices one can use highly efficient Monte-Carlo simulations to investigate the static and dynamic properties of the model (1) [5].
The situation is quite different when the magnetic interactions become frustrated, for example for the Heisenberg model on triangular or Kagome lattices or in the presence of longer-ranged exchange interaction on square-lattices. In this case Quantum Monte-Carlo is plagued by a severe sign problem and reliable results for thermodynamic quantities cannot be obtained at low temperatures [6]. For certain situations, the sign problem can be circumvented by cleverly choosing the operator basis [7], but in general it poses a severe restriction on Monte-Carlo simulations [8]. Alternative numerical approaches, like the density-matrix renormalization group [9] or variational Monte-Carlo [10] are restricted to either one spatial dimension or groundstate properties only.
In the case of itinerant fermions Potthoff proposed a quite different ansatz, the self-energy functional approach (SEFA) [11]. It is based on the observation going back to Luttinger, Ward, Baym and Kadanoff [12], that the free energy can be formally represented as functional of the fermionic Green function, with a non-trivial part called Luttinger-Ward-Baym-Kadanoff functional. The important aspect of this approach is that this latter functional only depends on the structure of the interaction, but not on the kinetic energy of the fermions. This feature allows one to create well-defined approximations for models with strictly local interactions by replacing the lattice by a collection of clusters, which can then be treated exactly. These variational cluster approaches (VCA) have been used to study various models for interacting fermion systems [13][14][15][16][17][18][19][20]. One can also derive the dynamical mean-field theory [21] within this framework [11]. Koller and Dupuis later developed a formulation of the VCA for systems consisting of interacting bosonic particles, for example the Bose-Hubbard model [22].
Potthoff's SEFA rests on the representability of the free energy as a unique functional of the single-particle Green function respectively self-energy [23], with the contributions due to interactions being strictly separated from the non-interacting part. This property is in turn based on a linked-cluster expansion for the free energy involving Wick's theorem, or equivalently relies on standard Bose or Fermi commutation relations among the field operators constituting the non-interacting system [23]. Spin operators, however, form another algebra, and so the standard formalism does not work. Nevertheless, due to the reasons mentioned above it would be interesting to have an analogous method for Heisenberg Hamiltonians since it would open a new possibility to tackle the problem of frustrated spin systems. An additional category of models which cannot be treated easily within the standard Potthoff approach are those based on projected Hilbert spaces, like it is the case for the t-J model relevant for high-T c superconductors [24]. There do exist previous approaches where a diagrammatic perturbation theory for operators with non-standard algebra was devised [25,26], or a bosonization of the spin operators [27] introduced. However, none of these approaches met their expectations.
Another class of systems which are not included in the standard VCA formulations are those with non-local interactions as the necessary separation of local and non-local parts in the Hamiltonian is not possible any more. Early attempts to include such interactions in theories like the dynamical mean-field theory were usually based on scaling arguments [28,29] or assumptions about the structure how the fluctuation spectrum generated by these non-local interactions enters the free-energy functional [30]. In 2005 Tong proposed a so-called extended variational cluster approximation (EVCA) for fermionic models with non-local interactions [31]. In this approach, a Luttinger-Ward-Baym-Kadanoff functional was explicitly constructed from a fermionic coherent-state representation and tools of functional analysis were used to establish a cluster approximation for such systems. Tong also suggested that such an approach could be used for a Heisenberg spin system [31].
In this paper, we will derive a formulation of the variational cluster approximation for spin systems and test them for a simple spin model. The paper is organized as follows. In the next section we will introduce a coherent-state representation for spin operators. Within this formulation, we will derive an expression which has the structure of a Luttinger-Ward-Baym-Kadanoff functional and which will serve as basis to define an approximation based on a separation of the full system into clusters. This approach will be tested for the spin 1/2 Heisenberg chain in section 3. A discussion of the features and deficiencies of the presented approach in section 4 will conclude the paper.

The Free Energy Functional
The following discussion is based on the notation introduced by Tong [31], which was developed for non-local electron-electron interactions. Since there are considerable differences for a Heisenberg Hamiltonian we have to treat the derivation of a spin variational cluster approximation (SVCA) thoroughly in this paper.
As the starting point we use the spin path integral which can be derived by introducing spin-coherent states [32,33]. For a Hamiltonian of the form (1) with isotropic interactions J ij and zero magnetic field the partition function can then be written in the following form [34,35] The variables s i are vectors of length S which represent the quasi-classical path of the spins S. The function B( s i ) only depends on the structure of the manifold spanned by the possible paths and incorporates topological effects of a spin system. It represents the Berry phase [32] and does not include any information on the interaction J ij , which is solely present in the second term of the action (2). The term B( s i ) is one reason why the explicit evaluation of the spin path integral is rather complicated. However, to derive expressions for the free energy of a certain model and subsequently equations that allow to establish a spin variational cluster approximation one only needs formal functional dependencies following from (2), i.e. the precise form of B is not important. Let us define J ij (τ − τ ′ ) := J ij δ(τ − τ ′ ). As the exchange interaction will later serve as variable for performing variations, we introduce an auxiliary fieldJ, with the propertyJ = J for the true physical system. The action is then formally written as a functionalS Henceforth, functionals will be denoted by a tilde which will be omitted if the corresponding quantity assumes its physical value.
With this notation the partition function and free energy becomẽ The form of the functionalF [J] depends on the structure of the exchange interactioñ J only, but not its specific value. With the help of the functional (4), the full spin-spin correlation functionΠ is introduced as Note that this definition is somewhat different from the standard one, which explicitly subtracts the expectation values of the spins and results in the connected correlation function. Although it is possible to formulate the theory with this object, too, it turns out that the corresponding Hartree-like terms appearing in the action (3) lead to an additional set of constraints on local fields in the final formulation of the VCA for spin models, which are hard to satisfy for open spin systems. We therefore do not follow this route further here.
In this paper we treat the more general case of a Hamiltonian (1) where a finite magnetic field can be applied and which has the option of an anisotropic interaction. Here, a similar path integral representation can be derived using the spin-coherent states [33]. In contrast to the SU(2)-symmetric case we need to define two functional fieldsJ zz andJ −+ to introduce a functional actionS[s η ,J zz ,J −+ ] and the corresponding free energy functional. This is done in complete analogy to the above isotropic case. The integrands are now explicitly dependent on the spin vector components s η . On the other hand it can be shown that the Berry phase B( s i ) remains invariant [33,35]. This is expected since this topological term does not depend on a specific Hamiltonian but rather on the paths of the single spins along the sphere.
The actionS[s η ,J zz ,J −+ ] contains a local part which consists of terms originating from the finite magnetic field and of the Berry phase. This is justified since the latter is composed of a sum over the individual spins. The longitudinal and transversal spin-spin correlation functions can now be defined asΠ There exists another correlation function,Π +− , which is given by an expression similar to (8). In the final expressions we will encounter traces over these two quantities, which then lead to identical contributions. For that reason we will not considerΠ +− explicitly here.
We will now use the functional relations (7) and (8) to derive the spin VCA equations. To keep the formulae simple, we use a compact notation without explicit reference to the components of the functions, where appropriate. Note that in this case any trace also involves a sum over the different longitudinal and transversal parts of the functions appearing.

Luttinger-Ward Functional for Spin Systems
We start by introducing a Legendre transformed auxiliary functional where Tr denotes the trace over spatial indices, imaginary time and spherical components. The derivatives of the functionalÃ are where for simplicity we denote the longitudinal zz and transversal (−+, +−) correlation functions by a single z and t, respectively. With the help of Eq. (9) we can write the free energy functional as Legendre transform of the functionalÃ[Π], i.e.
So far not much can be said about the properties of the auxiliary functional. Of course the goal is to eventually derive some sort of Luttinger Ward functional. To this end we need to introduce the concept of a self energy for the correlation functionsΠ. A similar quantity is used for example in the extended Dynamical Mean-Field Theory (EDMFT) for fermionic models with non-local interactions [28,36]. Here, a generalized self-energy Γ can in principle be derived for a two-particle correlation function by using a cumulant expansion [29]. The resulting structure can be written in the general form [31] with J being a matrix consisting of the interaction parameters of the model and α some constant introduced to control the analytical properties of the approach. A typical choice is α = 1/2, which is also used by Tong [31]. However, in the case of spin models such a derivation does not readily exist, but one can nevertheless define a self-energy of the form (12) from analogy arguments [29,30]. This definition for Γ is sensible because it allows to introduce a Luttinger-Ward functional with respect to correlation functions which has the same structure as the standard functional for single-particle Green functions [31,37].
On this level, the nature of the quantity Γ seems somewhat artificial. Interestingly we can give it a well-defined meaning using the spin diagram technique introduced by Vaks, Larkin and Pikin [25] and further developed by Izyumov and Skryabin [38]. It is a perturbative approach with respect to the spin exchange interaction, and one can formally resum the diagrams to find relations similar to Dyson's equation, which in the present context are called Larkin's equations [38]. They can be written in matrix notation as [38,39] Π where ξ stands for z or t. The entries of the self-energy matrix Σ represent the collection of all diagrams in the expansion of the correlation function that are irreducible with respect to one interaction line. This is a conceptual difference to the diagrammatic definition of the usual self-energy and responsible for the slightly different structure of Eq. (13). It also means that Larkin's equation must not be identified with a Dyson equation of standard perturbation theory. Nevertheless we can formally rewrite Eq. (13) as Comparing this result to the expression (12), we now see that the previously defined quantity Γ corresponds to the inverse of Larkin's self-energy with α = 1. There is no need to provide an explicit expression for the Larkin self-energy in the present approach, so (14) can be used as a suitable and reasonable spin self-energy. From now on we will conveniently use Γ ξ for the inverse Larkin self-energy (Σ ξ ) −1 .
We can now proceed with the derivation of a Luttinger-Ward-like functional for spin systems. Using Eqs. (10) and (11) one can show that the following functional derivatives hold, where we used the self-energies (14). We now define a generalized Luttinger-Ward functional according tõ Of course one is faced with the question about the nature of this functional. The standard Luttinger-Ward functional can be derived as the collection of all connected closed skeleton diagrams [40]. A similar identification is not obvious for the formal definition (16), as the quantities appearing there are not explicitly connected to a diagrammatic expansion. Yet, as Tong already pointed out for non-local fermionic interactions [31], aΦ such as (16) is closely related to the formal derivation of the Luttinger-Ward functional for the standard Hubbard model introduced by Potthoff [37]. It can be shown that this Luttinger-Ward functional has several important properties, which we will discuss now with respect to (16). First, the free energy of the system can be written as a functional of the correlation functions usingΦ. Equations (11) and (16) lead to Secondly, the functional derivatives of the Luttinger-Ward functional with respect to the correlation function are given by the self-energy (14), i.e.
This property is readily shown using Eqs. (15) and (16). It defines the self-energies as functionals of the two correlation functions. When evaluated at the physical values Π, the functionalsΓ acquire their physical value in Larkin's sense. The third property the generalized Luttinger-Ward functional should have is that it is universal in the sense that it does not explicitly depend on the interaction parameters J. This is however a direct consequence of the definition of the free energy functionalF [J] in (4) and (11). It is defined by the structure of the local action (6) and the form in which theJ are introduced, not their explicit values. The property is inherited by the functional derivatives ofF [J] and the Legendre transform A[Π]. Therefore the functional (16) by construction does not depend on the specific interaction parameters.

The Spin VCA Equations
Even though the functionalF from Eq. (17) was derived in a formal way and is not directly based on a perturbative approach it can be used as starting point for approximations like the VCA. It was cast into a form with a structure which closely resembles the Baym-Kadanoff functional [12], and its constituent parts are well defined. The generalized Luttinger-Ward functional has the postulated properties, and a self-energy can be defined in a Larkin irreducible sense.
Following the idea of Potthoff's original approach and Tong's work we now rewrite the free energy as a functional of the spin self-energies Γ (14). For this step we need to introduce a Legendre transform of the Luttinger-Ward functional (16) The functional derivates with respect to the self-energies are These equations can be seen as defining theΠ as functionals ofΓ, i.e. we can write the free energy (17) with the help of equation (19) as a functional of these self-energies according to where all quantities are now to be taken as functionals ofΓ. The last two terms can be absorbed intoP [Γ] by using the defining relations (15), and we end up with the A central feature of the fermionic Luttinger-Ward functional not invoked yet is that it is stationary with respect to the corresponding single-particle Green function [12]. A similar stationarity condition holds for the functional (22) at the physical pointJ ξ = J ξ . We will refer to the functional (23) as spin VCA (SVCA) functional in the following. ForF SV CA we find We previously argued that the Luttinger-Ward functional (16) is universal in the sense that it does not depend explicitly on the interaction functionalsJ. This property is inherited byP which means that its functional form is the same for (22) and (23). We can thus eliminate it by subtracting the two free energy functionals, to arrive at This functionalF SV CA [Γ] is the equivalent of a Potthoff functional for a Heisenberg spin system. Although it was derived in a rather abstract way, it nevertheless can serve as starting point for a SVCA.
To this end one has to choose a proper reference system, i.e. a system of exactly solvable clusters that shares the local Hamiltonian H loc with the original system, while differing in the interactions J. The functionalsJ ,Γ andF in (25) are then replaced by the corresponding quantities of this reference system, their particular values depending on the interactions J c of the cluster. With a specific choice of these parameters we restrict the space of spin self-energies and obtain the expression Examples of suitable cluster systems for a square lattice can be found in figure 1, including possible variational parameters. Note that like in the fermionic approaches one has a rather large freedom regarding the clusters and parameters [41]. The exchange interactions connecting different spins as well as the components J z c and J t c can in principle be varied independently from one another. Also, additional sites can be added to the boundary of the cluster, and so on. However, to keep the computation practicable one usually restricts the variational space to a reasonable set of parameters. Similar to the usual SEFA for fermionic systems, the task then consists in finding a stationary point of the SVCA equation with respect to the chosen J c . Therefore Eq. (26) needs to be evaluated explicitly. Details on the actual technical implementation of this step can be found in Appendix A. An extremal point determined in this way represents an approximation to the stationarity condition (24). The results will of course depend on the specific choice of the cluster system and its parameters. In general one will expect that the approximation becomes better and less dependent on the actual selection with increasing cluster size. Similar to the SEFA for fermions it is hard to define a limit where the method becomes exact. For our theory this is the special case J → 0, i.e. for a model of decoupled spins. Here, the SVCA becomes exact for a reference system of single-site clusters.
Before we test the approach on a specific model two problems have to be discussed. As pointed out in Appendix A, there exists the possibility to obtain complex frequency poles in the approximate physical correlation functions. Since this should not be the case for a system in thermal equilibrium we conclude that the SVCA method can possibly break down and may not lead to meaningful results for a certain reference cluster with a specific set of parameters. Although such a breakdown need not be related to physical effects in any way, a possible interpretation could be that it signals a phase transition. The finite imaginary parts of the poles do not appear suddenly, but usually evolve continuously from the real axis to the imaginary axis by crossing the origin. However, as discussed in Appendix A, one has to demand that for consistency reasons there is no pole at zero frequency [22]. In bosonic systems a non-vanishing excitation at the complex frequency origin signals the development of a condensed phase.
For example, in VCA calculations for the Bose-Hubbard model, where breakdowns are also encountered, they are identified as a signal for the appearance of a superfluid phase [22]. This can be amended by formulating the VCA using a 'pseudo-particle' approach [42] or more rigorously within the Nambu formalism [43]. Both methods give valuable insight how condensed phases of bosonic systems can be treated in a variational cluster approximation. Although the physics of the spin operators is entirely different, it is tempting to invoke a similar interpretation for the breakdown of the approximation behind the SVCA, viz that a different magnetic phase evolves for which a certain cluster structure is not suited any more. If this is true, one should in principle be able to set up a generalized cluster Hamiltonian that respects such a phase transition.
One further comment has to be added. The breakdown we just discussed takes place below certain temperatures, which means that often T = 0 cannot be reached. This is the reason why it is advisable to evaluate the free energy (26) using a method suitable for finite temperatures as described in Appendix A.
A second problem for the SVCA is that it has a limitation on possible variational parameters of the reference system. In the derivation of the SVCA free energy it was necessary to assume that the formal structure of the local action (6) respectively the corresponding Hamiltonian describing the reference system is the same as for the model under consideration. The definition of the free energy functional (11), its Legendre transform and the generalized Luttinger-Ward functional critically depend on this property. In particular, we could not set up the SVCA equation (25) by eliminatingP otherwise.
So the local action (6) needs to be unchanged which is naturally fulfilled for the Berry phase term since it is independent of the actual Hamiltonian. The only limitation here would be that the magnitude of the spins remains the same in the reference cluster system, which is natural and reasonable in any case. The second term of (6) on the other hand imposes the restriction that the local magnetic fields h are fixed during the variation. Also, no additional local fields can be included in the reference system. Only entries in the interactions J ξ are eligible as variational parameters. On the other hand, variational local fields have proven to be a valuable tool in conventional VCA approaches, necessary to study certain states and phases of a system [14,15,20,22]. That we cannot use magnetic fields in such a way is a limitation of our SVCA method and it is currently not clear how to lift this restriction. The only possible solution at present is the introduction of local anisotropies like (S z i ) 2 in the cluster Hamiltonian. However, such terms give no benefit for the spin 1/2 chain used in the next chapter.
We would like to point out that the above limitation is also the reason why we do not use the connected longitudinal correlation function in the SVCA. In this case, the local part (6) of the action would also include terms proportional to S z i . The requirement that these terms remain unchanged for the original and cluster system can in general not be met for the open spin clusters treated so far.

Results
As a model to test our method we use the antiferromagnetic spin-1/2 Heisenberg chain with nearest neighbor interaction. This well-known model has been studied extensively and can be treated exactly via the Bethe-ansatz [44,45]. The spin chain including an applied magnetic field was solved with the help of analytical and numerical methods [3,46]. It will therefore be a good model to test the spin VCA since we can compare our results with the exact solutions by Klümper [46].
We use the Hamiltonian of the isotropic Heisenberg chain with N sites and periodic boundary conditions To apply our cluster approximation, we tile the chain into clusters with sizes between two and six spins and introduce spatially uniform cluster interactions J z c and J t c , which act as our variational parameters. In principle we could also attach additional auxiliary sites as seen in figure 1, thus enlarging the variational space. We will not use this freedom here as we could not observe a significant improvement of the results for the spin chain. For each of these cluster systems we need to evaluate the SVCA free energy (26) as described in Appendix A. To this end we use full diagonalization to find the eigenvectors and -values of the cluster Hamiltonian and then compute (26) Explicit expressions for these quantities are given in Appendix A, equations (A.10) and (A.14).
As an example we show in Fig. 2 the results of an evaluation of the SVCA free energy for a two-site cluster and a magnetic field h/J = 1. The cluster interaction is chosen to be isotropic, so the only variational parameter is J z c = J t c ≡ J c . To obtain a better picture of the relevant features the difference F SV CA − F c = T (K t + K z ) is plotted. Some characteristic properties can be discussed with the help of this plot. In VCA approaches one generally searches for extremal points of thermodynamical functionals respectively functions with respect to the variational quantities. It can be easily seen in Fig. 2 that extremal points exist for some temperatures, at least down to T = 0.4J. With decreasing temperature we observe that the algorithm does not give meaningful results for certain regimes of J c in accordance with the arguments in the previous section. In such cases no SVCA free energy is plotted in Fig. 2. Note that the regions where the algorithm breaks down become more and more extended as the temperature decreases, until we finally do not find any reasonable solutions to the SVCA any more. When this is happening for certain parameters the approximation as a whole fails. In the present example this takes place around T = 0.4J. For those regions where the algorithm works, we obtain smooth curves with well developed extremal points, which are actually maxima. This is a direct consequence of the derivation of Eq. (26). We note that in the results shown in Fig. 2 we do not observe a significant dependence on the k-mesh, but encounter such a problem when we reach the parameter regime where the present algorithm does not converge any more.
Once the free energy F SV CA (J c ) has been calculated, we need to determine the locationĴ c of the maxima. Within the present implementation of the algorithm this can be done with an accuracy of order of 10 −3 , which also provides an estimate of the numerical error of the computation. Within the SVCA the extremal point F SV CA (Ĵ c ) provides an approximation to the physical free energy of the system. We can use it to derive other thermodynamical quantities, for example the magnetization as the derivative ∂F ∂h , which is performed numerically via the central difference. By scanning the parameter space we can then determine these thermodynamical quantities as functions of the temperature T and magnetic field.
In Fig. 3 the magnetization per spin for a magnetic field h = 3J > h sat = 2J obtained from a two-site SVCA is plotted versus temperature. As can be seen from the exact solution derived with the Bethe ansatz the magnetization saturates for small T [46]. As will be discussed later we do not expect a breakdown of the SVCA approximation for this large magnetic field. Fig. 3 compares the exact solution with our results for several choices of variational parameters. We tested the isotropic case J t c = J z c we already discussed above as well as the case where J t c = J z c are two independent parameters. The latter means that we have to find an extremal point in a two-dimensional parameter space, which naturally is numerically more challenging. For the third choice presented in Fig. 3 we set J z c constant as J and only vary the transversal interaction J t c . Obviously, this latter selection of variational parameter yields the best result when compared to the exact solution. The isotropic and even more so the twodimensional anisotropic variation lead to obviously wrong magnetization curves, showing a suppression of S z for lower temperatures. The reason for this behavior is that when we use J z c as independent variational parameter the extremal points of F SV CA appear at values up to J z c ≈ 3.5J for low temperatures, while J t c stays between 0.6J and 0.8J. As a consequence the approximation is dominated by artificially enhanced antiferromagnetic correlations counteracting the external magnetic field, which can be clearly seen in Fig. 3. In the isotropic case, i.e. if we vary the two parameters together, the transversal part holds the longitudinal part at bay. Here values range between J c ≈ 0.8J for T = 5J and J c ≈ 1.5J for low T . If we only vary J t c the extremal points of F SV CA are found at values that always lie below 0.9J, which obviously leads to a better approximation of the spin chain thermodynamics.
It may at first seem odd that restricting the variational degrees of freedom results in a better representation of the physics. It can however be understood if one remembers that at least for one-dimensional spin models the spin operators can be mapped onto fermionic operators by the Jordan-Wigner transformation [47]. Under this mapping the transversal terms S + S − become the kinetic energy of the new Hamiltonian while the longitudinal terms S z S z lead to density-density interactions of the fermions. For the fermionic version of the VCA on the other hand it turns out that variational parameters connected to the kinetic energy lead to sensible approximations, while the interaction part should be kept fixed.
Another aspect of the behavior of the variational parameter J z c is in our opinion related to the magnetic field h applied in the z-direction. As we stated above we cannot use h as a variational parameter, i.e. it has to remain fixed. In the fermionic or bosonic version of the VCA local fields can be used to enforce thermodynamical consistency between cluster and real system, for example with respect to the occupation number in electronic systems [17,20]. As in the present formulation of the SVCA we do not have such direct control over the magnetization in z-direction through a variation of local magnetic fields, the SVCA seems to maximize the effect of the fixed external field by strongly increasing the longitudinal interaction J z c . This behavior is generally encountered in our SVCA computations for the spin chain. In the following we will thus present results solely with J z c = J fixed, using J t c as the variational parameter. Figure 4 shows the magnetization curves as function of temperature for four different magnetic fields, including the critical h sat = 2J. For larger fields the magnetization per spin is known to smoothly saturate to 1/2 [48]. This region is represented in Fig. 4 by a plot for h = 3J. When h < 2J the magnetization curves go through a maximum to settle for a finite value less than 1/2 as T → 0. Two graphs show results for magnetic fields below the critical field. Each plot in Fig. 4 contains the exact Bethe ansatz solution and SVCA results for three different cluster sizes, namely two-, four-and six-site clusters. We choose only even numbers of spins because a dangling spin for odd number of sites prohibits singlet formation in the individual cluster which leads to an expectedly unreliable approximation of the antiferromagnetic spin chain at least for small h.
As can be seen in Fig. 4, the SVCA approximation works best for the largest magnetic field, h = 3J. This holds true in general for any h > 2J, where the magnetization saturates. In this case we also find that the approximation remains stable down to T = 0. The saturation value of 1/2 is found numerically with good precision, which is remarkable without some variational local field as a control parameter. This also supports our choice to only vary J t c . The dependence on cluster size appears to be very mild, the curves for the four-and six-spin systems already nearly coincide. Thus, at least in the case of larger magnetic fields reliable approximations can be obtained for small to moderate cluster sizes.
The results for the critical value h = 2J seem to behave similarly at first glance, in particular the approach to the value 1/2 for T = 0. However, the specific form of the exact solution with its non-analyticity as T → 0 is not captured properly. Moreover, from the kink of the magnetization curve for the two-site cluster at low T one can infer the appearance of additional irregular behavior. This is a general feature in all our SVCA calculations close to the critical field, which is more or less pronounced depending on the actual quantity under consideration. For example, it is extremely prominent in the specific heat, see below. Finally, for h < 2J our approximation starts to break down at some finite temperature, as can be seen in Fig. 4. It thus seems that the SVCA indeed realizes that at h c a special situation arises for the model. Therefore, at h/J = 0.01 and h/J = 1 zero temperature can not be reached anymore as the SVCA starts to develop complex poles in the propagators for the whole parameter space and the results become meaningless. We however note that with increasing cluster size the stability region also increases and solutions are found for lower T . Indeed, especially the curves for the four-and six-spin cluster do not deviate much from one another until at a certain point the solution for the smaller system breaks away while the larger one continues to provide reasonable results down to lower T , until it starts to become unstable, too. This behavior shows that the SVCA indeed is an approximation that at least improves systematically with cluster size. Note that such a behavior can in principle be expected from the formulation, but is nevertheless by no means trivial.
As can be seen in the plots the magnetization usually becomes divergent to either plus or minus infinity when the SVCA breaks down. It is of course tempting to interpret this breakdown as a signal for a phase transition, albeit an artificial one. The antiferromagnetic Heisenberg spin chain does not have a true phase transition at any temperatures [49], but is critical in the sense that it develops algebraic correlations at T = 0 [1]. Adding a not too large magnetic field does not change this situation, but only results in a finite magnetization less than 1/2 [1,46]. Since mean-field like theories such as the SVCA tend to produce phase transitions if the correlations in the clusters become too strong, we suggest that the SVCA here tries to form an ordered state to accommodate the slow decay of correlations in the cluster. As the analytical structure of the quantities entering the SVCA should be different in such a situation, we cannot expect that our present implementation is suitable to handle it properly. In particular the appearance of a finite sub-lattice magnetization will lead to intrinsic consistency problems as discussed in the previous section.
In any case, apparently the SVCA for small clusters is not able to properly describe the region where the spins form some correlated, non-saturated state at low temperature. However, the curves in Fig. 4 for the six-spin cluster at least show the maximum of the magnetization present in the exact solution, although the magnetization values for both h = 0.01J and h = J are systematically too high, overestimating the exact value in the maximum by ≈ 25%. As mentioned earlier such a behavior should actually be expected generically because we do not have the option to adjust a local field as control parameter to enforce a certain magnetization value. Yet it is interesting that the positions of the maxima are close to their exact values: For h = 0.01J the SVCA predicts T max,SVCA ≈ 0.55J to be compared with T max,BA ≈ 0.65J, while for h = J we obtain T max,SVCA ≈ 0.44J versus T max,BA ≈ 0.5. Thus the SVCA with our present setup seems to systematically overestimate the magnetization -this is also true for larger magnetic fields, albeit not that strongly -while it underestimates the fluctuation scale related to the position of the maximum. Note that of course both features are related and the directions they are going consistent.
To conclude the discussion of the SVCA results for the antiferromagnetic spin chain we show in Fig. 5 our results for the specific heat as a further example. The cluster sizes and parameters used are the same as for the magnetization in Fig. 4. For the heat capacity c = −T ∂ 2 F ∂T 2 holds, i.e. one needs to numerically calculate the second derivative, which is more prone to numerical errors. From the plots in Fig.  5 we directly see that the SVCA results for the heat capacity are less accurate than for the magnetization, even for larger magnetic fields. For the critical value h = 2J we find additional artificial structures at low temperatures. Increasing the cluster size improves the agreement with the exact Bethe ansatz curve down to roughly T ≈ 1.4J, but deviations remain significant for lower T and do not seem to improve systematically. The situation becomes even worse for h < 2J where again divergencies appear and also the overall shape is not reproduced that well any more. At least for h = J the tendency seems to follow the expectation, viz that increasing the cluster size yields a systematic improvement of the results, but for h = 0.01J even this feature seems to be lost. While one can expect that deficiencies of the approximation as well as numerical errors are more pronounced in quantities which are obtained as higher derivatives, it is not clear at the moment why the scaling with cluster size of the heat capacity so significantly deviates from the expected behavior for small magnetic fields.
In the case of a large magnetic field h = 3J we again can calculate the specific heat down to T = 0 and the curves approximate the exact solution reasonably well. As expected, the largest cluster provides the best approximation. Again, the SVCA results tend to overestimates the heat capacity around the maximum, but predict its position with good accuracy, which is improving with increasing cluster size. This confirms our previous observation from the magnetization that for magnetic fields above the critical field the SVCA results are more reliable than below h = 2J.

Summary and Discussion
With the SVCA we present a new cluster approach for Heisenberg spin systems. It is inspired by the SEFA which was originally proposed by Potthoff for fermionic systems with local interactions [11] and which has been subsequently extended to bosonic degrees of freedom [22] and to cluster approximations like Tong's EVCA for more complex models [31]. Using a path-integral representation for the partition function of a Heisenberg model we developed a theory that allows to devise such a variational method for spin models. In the original SEFA, the key object is the single-particle self-energy of fermions or bosons, and approximations are applied directly to this quantity. For spin models a similar object is not readily available, but by means of spin diagram techniques we could identify the perturbatively defined Larkin selfenergy [38,39] as a suitable quantity for this step of the theory. We finally derived a set of variational equations which yield approximate solutions for the properties of spin systems by introducing certain solvable cluster reference systems and searching for stationary points of the variational free energy F SV CA . These approximate values for F SV CA can be used as a starting point to derive other thermodynamical quantities for the spin model under consideration.
There are however certain specific problems which arise and have to be addressed for the SVCA. One important limitation in the present formulation of the method is that we cannot use local fields as variational parameters. These variational parameters appear routinely in fermionic or bosonic theories and are actually necessary to control thermodynamic consistency between reference cluster and physical lattice. Moreover, the analytical structure of the SVCA can lead to a failure to find physical solutions in certain situations. This breakdown of the theory can possibly be interpreted as a signal for a phase transition, although it need not be one of the physical model. A more conservative point of view is that such a breakdown indicates that the present choice of cluster or set of variational parameters is not able to treat certain properties like an increase of correlation lengths properly. If this is the case, one should be able to see an improvement by choosing larger clusters or variational parameters that are better suited for the problem at hand. In any case such a breakdown of the SVCA can be used to monitor important changes in the physical state of the spin model.
To test the SVCA we studied the antiferromagnetic Heisenberg S = 1/2 chain in a magnetic field where we could compare our results with the exact Bethe ansatz solutions. Three reference systems -full Heisenberg interaction, transversal and longitudinal interaction separately and transversal interaction only as variational parameters -with different cluster sizes were used for several magnetic fields to apply the SVCA to the model. It turned out that the longitudinal interaction is not suited as variational parameter, probably due to the fact that one cannot use a local field in the cluster Hamiltonians as variational parameter to control the local magnetization explicitly.
Beyond the critical point h = 2J the SVCA results are satisfactory and at least qualitatively resemble the exact solution. For magnetic fields below the critical value we start to encounter a breakdown of the approximation for low temperatures and small clusters, with a systematic improvement with increasing cluster size, but the results are in general less reliable than for h > 2J. Accompanied with a breakdown we often find divergencies in the thermodynamical quantities. We attribute these to the inability of the used clusters to cope with the algebraic correlations that emerge for low T . This interpretation is in accordance with the observation that the results typically become better and more reliable with growing cluster size.
To summarize, we provided a proof of principle that the SVCA can lead to reasonable approximate results for Heisenberg spin models. To this end we chose a simple Heisenberg spin chain, concentrating on understanding the analytical structure of the SVCA equations and using a full exact diagonalization to treat the individual clusters. Further investigations with larger clusters are needed to confirm and extend our observations. However, to achieve this goal one will need more efficient algorithms to treat open spin clusters and possibly also employ different algorithms to evaluate the quantities entering the SVCA expression for the free energy .
Of course applying the SVCA to an antiferromagnetic spin chain can only be the first step. To really establish its usefulness one should also apply it to other models, e.g. ladder systems or two-dimensional lattices, including larger spins. Our present results indicate that the SVCA improves when long-ranged correlations are suppressed. This observation makes it particularly interesting to apply the approximation to frustrated spin systems, where short-ranged correlations often dominate for low temperatures.
For instance, this condition is met in several two-dimensional antiferromagnetic spin lattices [50]. One specific one-dimensional example would be the Heisenberg zig-zag ladder at the Majumdar-Ghosh point, where the ground state are dimers on the rungs [51,52]. Work along these lines is in progress.
in our case with the cluster spin correlation function. The transversal function Π t c = S − i (τ )S + j (τ ′ ) c for the cluster can be written using Matsubara frequencies as where the vectors |n are the eigenstates of the cluster system Hamiltonian and the E n the corresponding energies. This object can be analytically continued to establish a function of the complex variable ω. We want to write the correlation function in the Q-matrix representation. To this end we define with the help of a multi-index α = (n, m) where the ω in the last line has to be understood as being multiplied by the unity matrix. It is important to note that a certain combination α is taken into account only when the correlation function has a non-vanishing pole at λ t α . With the definitions in (A.4) we can now write The other term under the trace in (A.2) can also be rewritten with the above definitions as The step from the second line to the third can be shown by expanding the inverse [18]. In the last line we introduced the modal matrix M which diagonalizes L t = (λ t + Q t V t (Q t ) + g t ). The η t (k) denotes a matrix with the eigenvalues η t α (k) on the diagonal. These are wave vector dependent by virtue of the interaction matrix V t .
Due to its derivation in chapter 2 the term (A.6) can be viewed as a matrix containing approximations to the correlation functions of the original system. This view is supported by the general structure we derived in the last line. If we sum over the wave vectors k the eigenvalues η t α (k) represent an approximation to the excitations of the full system. This directly leads to a problem of the present approach. Since Q t V t (Q t ) + is a hermitian matrix and g t has varying entries ±1 the matrix L t is non-hermitian. This means that one can in principle obtain eigenvalues that lie on the imaginary axis which is not in accordance with their supposed interpretation as physical excitations. The possible imaginary poles will also pose mathematical problems for the evaluation of the SVCA free energy. We will discuss this point below.
The next step is to carry out the traces in (A.2). It is clear that the Matsubara frequency sum will be the most challenging part. Usually, the correlation function (A.3) decreases relatively fast with ω l . Therefore, for very large T it is sufficient to consider only a finite number of terms. For T → 0 on the other hand the sum over the Matsubara frequencies can be carried out efficiently as a numerical integration [20]. For the present approach however it is mandatory to devise an algorithm working at arbitrary temperatures, for reasons discussed at the end of section 2.3. The analytical technique we will apply was originally introduced for fermions [13] and bosons [22]. In the evaluation of the SVCA we can proceed along the lines of the latter because the correlation function of the spin operators has a structure similar to the bosonic Green function [54].
We start with the term in (A.2) that only incorporates the matrix of cluster correlation functions and is thus independent of the wave vectors k where the π t i (ω n ) are the eigenvalues of Π t c . In evaluating these objects a subtlety arises: each contains a certain number of excitations in such a way that every individual λ t α defined in Eq. (A.4) appears only once. This is not trivial to see and cannot be discussed within this paper. It is however important to ensure a proper normalization of the trace.
Next we have to evaluate the individual frequency sums over the terms ln π t i (ω n ). A standard way to do this is by means of Poisson's summation formula. An alternative approach suited for the terms appearing in (A.7) is covered in detail by Koller and Dupuis [22]. The result depends solely on the poles λ t α and zeros ζ t α of the functions π t i (ω) Note that the sum over the index i has already been taken into account. For the derivation it is important that the correlation functions are analytical at ω = 0. This is the case for the transversal correlation function Π t , except for a SU (2)-symmetric system. We will handle this point in detail when the longitudinal correlation function is discussed.
Using a similar computation one finds for the k-dependent part of (A. The poles η t α (k) were found in (A.6) as the eigenvalues of the matrix L t . The zeros ν t α (k) on the other hand are determined by the poles of the spin self energy Γ t defined in equation (12). In the central approximation of this approach that led to (26) one effectively uses Γ t c as the self energy for both the cluster as well as the lattice correlation function. So the collection of zeros {ζ t α } is equal to {ν t α (k)} which means that (A.8) and (A.9) combined leads to where the self-adjoint character of S z has been taken care of and the special contribution to the zeroth component been separated.
One comment is in order before we proceed with the evaluation. In K z the inverse and the logarithm of Π z appear. It is easy to see that in case total S z is conserved this matrix has zero as an eigenvalue for all Matsubara frequencies except ω 0 . Thus, (Π z ) −1 and ln Π z individually have to be in principle understood as containing a small regularizing parameter ǫ to take care of these zero eigenvalues. It can however be shown that the individual divergencies which develop for ǫ → 0 exactly cancel if we take into account all terms in K z respectively (A.1). Therefore the SVCA free energy as a whole is well-defined. Nevertheless, care has to be taken when one tries to evaluate the different contributions individually.
Let us now proceed with the evaluation of the longitudinal parts. The first line of (A.11) resembles terms that also appear in the transversal correlation function (A.3), and in principle one could perform a similar computation. Unfortunately the second line proves to be problematic. If we extend (A.11) to be a function on the complex plane it has a discontinuous point at the origin. This non-analyticity of Π z ij (ω) would render it impossible to evaluate the Matsubara frequency sum as in the transversal case. The deformation of the contour leads to a proper result only if the function is analytical at the origin [22].
So we introduce a matrixΠ z c which is defined by (A.11) but with the last line omitted. The corresponding functions are analytical at the origin. The first terms on the right hand side of (A.12) and (A.13) can now be evaluated as in the transversal case, withΠ z c , the Q z -matrices and corresponding quantities like L z defined in analogy to their transversal partners. The poles λ z α are obtained from the matrix (A.11) and the η z α (k) are determined by the eigenvalues of L z . Both of them are used in the formula corresponding to (A.10).
The last two terms from (A.12) and (A.13) which only give a contribution for ω 0 can be combined and computed directly. Putting everything together we arrive at For the last two terms we used the identity Tr ln M = ln det M. The matrices under the determinants cover the cluster site indices. One has to add that in case of a SU (2)-symmetric system the treatment of the transversal part has to be performed of course identically to the longitudinal part. Expressions (A.10) and (A.14) together give the SVCA free energy (A.1). Besides the cluster system free energy F c and excitations λ ξ α one has to compute the determinants in (A.14). One also needs to determine the eigenvalues of the L ξ which can become numerically challenging. These matrices have a rank of the order of excitations in the system and grow rapidly. Finally, the k-summation appearing in the equations has to be performed and is done over a mesh of different sizes to reduce numerical errors.
Of course a basic problem is to find the eigenvalues and eigenvectors of the cluster Hamiltonian. We use in this paper full diagonalization, which is limited to smaller clusters. An alternative would be some finite temperature Lanczos method, densitymatrix renormalization group or Monte-Carlo approach.