Some more remarks on the Witten-Veneziano formula for the $\eta'$ mass

We discuss some subtleties in connection with the new attempts to provide a firm basis for the Witten-Veneziano formula.


Introduction
More than twenty years ago Witten [1] and Veneziano [2] proposed a formula connecting the mass of the $\eta'$ meson to the quenched topological susceptibility $\chi_t^{\rm qu}$ of QCD. This formula, referred to below as eq. (1), takes its simplest form in the chiral limit, where it states that $\chi_t^{\rm qu}$ is proportional to $F_\pi^2 m_{\eta'}^2/N_f$. Recently Giusti, Rossi, Testa and Veneziano [3] tried to put the old arguments on a firmer basis by starting with a well-defined lattice version, taking advantage of the recent progress in the understanding of chiral lattice fermions (see for instance the review [4] and references therein).
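For orientation, the chiral-limit relation referred to as eq. (1) is usually quoted in the following form. The numerical factor is an assumption here: it corresponds to the normalization $F_\pi \approx 93$ MeV, and other conventions for $F_\pi$ change it.

```latex
% Sketch of the Witten-Veneziano relation in the chiral limit.
% Assumed normalization: F_pi ~ 93 MeV; other conventions change the factor 2.
\begin{equation}
  m_{\eta'}^2 \;=\; \frac{2 N_f}{F_\pi^2}\,\chi_t^{\rm qu}
  \qquad\Longleftrightarrow\qquad
  \chi_t^{\rm qu} \;=\; \frac{F_\pi^2\, m_{\eta'}^2}{2 N_f} \;\ge\; 0 .
\end{equation}
```

Written this way, the relation visibly requires a nonnegative quenched susceptibility, which is the point taken up in the discussion of reflection positivity.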
The crucial input for all derivations is the anomalous Ward identity for the U(1) axial current, which leads to the vanishing of the topological susceptibility in the full theory with dynamical fermions in the chiral limit: $\chi_t^{\rm full} = 0$ (eq. (2)). But the interpretation of both equations is tricky, as was pointed out long ago in [5]; there it was also observed that Witten's original arguments would require the cancellation of terms of equal sign against each other. (The 1987 paper remained unpublished, but a scanned version is available from KEK via Spires.) Here I will try to explain more clearly the main point of that old paper, correct some imprecisions and discuss its implications for the recent attempts.
The conclusion will be that the WV formula can be given an interpretation that makes it true, but that it is ambiguous as it stands. The recent lattice approaches offer the prospect of eliminating this ambiguity and providing a reliable foundation for the formula. It will also become clear that short-distance fluctuations of the topological density do play a crucial role.

'Axiomatic' considerations
There is a rather close analogy between 4D QCD and 2D (multiflavor) QED; the latter model, being exactly soluble, therefore provides a good testing ground for the considerations connecting mass generation in the flavor-neutral pseudoscalar channel to the topological susceptibility. We will at first formulate 'axiomatically' the conditions which the topological charge density operator should fulfil in a continuum quantum field theory; in this general discussion the 4D and 2D models can be handled together. In the case of QED$_2$ the 'axioms' are true statements that hold in the explicit constructions. The situation for QCD$_4$ is different, however, since the Millennium Prize problem [6] of constructing this theory with or without fermions has not yet been solved. Here we assume that such a theory exists and that its gauge invariant fields satisfy the Wightman axioms or, after continuation to the euclidean world, the Osterwalder-Schrader axioms (see for instance [7]).
Initially we do not have to distinguish between the quenched (no dynamical fermions) and the full models. The field of interest is the topological density $q(x)$, given in QED$_2$ by a multiple of the (single-component) field strength and in QCD$_4$ (formally) by a multiple of ${\rm tr}\,F\tilde F$, where $F$ is the Yang-Mills field strength tensor and $\tilde F$ its dual. The crucial point is to notice that $q$ is odd under time reflections and therefore satisfies an unusual form of reflection positivity (RP), which for its 2-point function reads $G(x) \equiv \langle q(x)\,q(0)\rangle \le 0$ for $x \neq 0$. Another way of saying this is that the Euclidean field $q(x)$ corresponds to an antihermitian field operator because it contains one time derivative. This form of RP was stressed in [5] and later in [8].
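The time-reflection argument can be spelled out in a few formal lines (unsmeared fields, non-coinciding points). The normalizations of $q$ quoted in the comments are the conventional ones that make the total charge an integer; they are assumptions for illustration, as are the symbols $\theta$ (time reflection of points) and $\Theta$ (its action on fields).

```latex
% Conventional normalizations of the topological density (assumed, not taken from the text):
%   QED_2:  q = (e / 4\pi) \epsilon_{\mu\nu} F_{\mu\nu}
%   QCD_4:  q = (g^2 / 32\pi^2) F^a_{\mu\nu} \tilde F^a_{\mu\nu},
%           \tilde F_{\mu\nu} = (1/2) \epsilon_{\mu\nu\rho\sigma} F_{\rho\sigma}
\begin{align}
  \theta:\ (x_0,\vec x) \mapsto (-x_0,\vec x), \qquad
  (\Theta q)(x) &= -\,q(\theta x)
  && \text{($q$ is odd under time reflection)} \\
  \langle (\Theta q)(x)\, q(x) \rangle &\ge 0 \quad (x_0 > 0)
  && \text{(reflection positivity)} \\
  \Longrightarrow\quad
  G(\theta x - x) = \langle q(\theta x)\, q(x) \rangle &\le 0 .
\end{align}
% By euclidean invariance this extends to G(x) <= 0 for all x != 0.
```

The sign flip comes entirely from the single time index carried by $q$; for a field that is even under time reflection the same steps give the familiar $G(x) \ge 0$.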
The topological susceptibility $\chi_t$ is supposed to be the integral $\int G(x)\,dx$. Two questions arise immediately:
(1) Is $G(x)$ integrable over the whole space?
(2) How can $\chi_t$ be positive, as required by eq. (1), if the integrand is negative?
The answers to these questions are closely related. For (1) one would expect a negative answer: in QED$_2$ $q(x)$ seems to be a dimension 2 field, whereas in QCD$_4$ one has to expect that it has dimension 4; in both cases $G(x)$ should not be expected to be integrable at short distances.
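The power counting behind this expectation can be made explicit. The sketch below assumes that $G(x)$ scales like $|x|^{-2 d_q}$ at short distances, up to logarithms; the symbols $D$ (space-time dimension), $d_q$ (dimension of $q$) and $\Omega_D$ (surface area of the unit sphere) are introduced here for illustration.

```latex
% Short-distance power counting (sketch).
% Assumption: G(x) ~ |x|^{-2 d_q} as x -> 0, up to logarithms.
\begin{equation}
  \int_{|x|<1} \frac{d^D x}{|x|^{2 d_q}}
  \;=\; \Omega_D \int_0^1 r^{\,D-1-2d_q}\, dr
  \;<\; \infty
  \quad\Longleftrightarrow\quad 2 d_q < D .
\end{equation}
% QED_2: D = 2, d_q = 2  ->  radial integrand r^{-3}, divergent at r = 0.
% QCD_4: D = 4, d_q = 4  ->  radial integrand r^{-5}, divergent at r = 0.
```

In both models the naive count therefore fails by the full dimension $D$, which is why contact terms with up to $D$ derivatives can appear.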
On closer inspection, it actually turns out that in QED$_2$ there is a lucky coincidence: the coupling constant $e$ has the dimension of a mass and the topological density turns out to be proportional to $e$ times a dimension 0 field plus some white noise producing a contact term (see [9]).
This kind of accident cannot be expected in QCD$_4$. To give the space-time integral of $G(x)$ a meaning, counterterms concentrated at $x = 0$, i.e. divergent contact terms, are needed (see for instance [11]). The answer to question (2) is then that with a suitable choice of those contact terms one can indeed make $\chi_t$ nonnegative. The validity of formulae like eqs. (1) and (2) thus depends crucially on the right choice of contact terms.
We will now discuss the two cases QED$_2$ and QCD$_4$ separately in a little more detail.

QED$_2$
This case has been discussed for one flavor in [5] and for $N_f$ flavors in [10]. The construction employed there fixes the possible contact terms (which are finite in this case). We cite from the latter reference the result for full QED$_2$ with $N_f$ dynamical fermions (eq. (6)); the quenched correlation is obtained by setting $N_f = 0$ and is just the pure contact term contained in eq. (6).
As was discussed in [5,10], this construction and the choice of contact terms inherent in it make the formula eq. (1) true, provided $F_\pi$ is interpreted appropriately. As remarked before, it is a special feature of this two-dimensional model (related to the fact that the charge has the dimension of a mass) that $G(x)$ is, aside from a δ-function, an integrable function. Correspondingly its Fourier transform satisfies a dispersion relation (= Källén-Lehmann representation) consisting of a constant $c$ and a dispersive integral over a spectral density $\rho$, where the constant $c$ is, up to some trivial numerical factor, equal to the quenched topological susceptibility $\chi_t^{\rm qu}$ and the spectral density $\rho$ is a δ-function.
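As an illustration, the structure of this dispersion relation can be written out explicitly. The sketch below assumes the normalization $q = \frac{e}{2\pi}F_{12}$, the Fourier convention $\hat G(p) = \int d^2x\, e^{ipx}\, G(x)$ and the standard Schwinger mass $m^2 = N_f e^2/\pi$; the conventions of [10] may differ from these by numerical factors.

```latex
% Multiflavor Schwinger model (sketch under the conventions stated in the lead-in).
\begin{align}
  \hat G(p) &= \frac{e^2}{4\pi^2}\,\frac{p^2}{p^2+m^2}
             \;=\; c - \int \frac{\rho(t)\,dt}{t+p^2},
  \qquad c = \frac{e^2}{4\pi^2}, \quad \rho(t) = c\,m^2\,\delta(t-m^2), \\
  \hat G(0) &= 0 \quad (N_f > 0), \qquad
  \hat G(p)\big|_{N_f = 0} = c = \chi_t^{\rm qu}, \\
  m^2 &= \frac{N_f\, e^2}{\pi} \;=\; 4\pi N_f\, \chi_t^{\rm qu} .
\end{align}
```

If eq. (1) is normalized as $m_{\eta'}^2 = 2 N_f \chi_t^{\rm qu}/F_\pi^2$ (again an assumption), the last line has exactly that form with $F_\pi^2$ read as $1/(2\pi)$, which is one way of interpreting $F_\pi$ "appropriately" in two dimensions.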

QCD$_4$
By dimensional analysis and tree level perturbation theory $q(x)$ is expected to be a dimension 4 field and hence, up to possible logarithms, $G(x) = O(|x|^{-8})$ for $x \to 0$. The Wightman axioms, which are assumed to hold for $q(x)$, guarantee that in the Euclidean world $G(x)$ is an analytic function for $x \neq 0$. Before we can talk about the Fourier transform $\hat G(p)$ of $G(x)$ or the topological susceptibility, we have to promote $G(x)$ to a distribution, and this means prescribing certain formally divergent contact terms. Mathematically the procedure goes as follows: $G(x)$ can already be smeared with test functions that vanish to sufficiently high order at the origin, i.e. this smearing defines a linear functional on a certain subspace of the test function space. To extend this linear functional to all test functions in a way that is consistent with euclidean invariance requires the choice of 3 free parameters, corresponding to counterterms of the form $c_1\,\delta(x) + c_2\,\Delta\delta(x) + c_3\,\Delta^2\delta(x)$. Once the extension has been fixed, the distribution $G$ can be Fourier transformed according to the rules for distributions (see for instance [11]). Since neither in Yang-Mills theory nor in full QCD do we expect the presence of a massless pseudoscalar particle with the quantum numbers of $q(x)$, we will make the further assumption that $G(x)$ decays exponentially at large $|x|$: $|G(x)| = O(e^{-m|x|})$ with some $m > 0$. This has the consequence that the Fourier transform $\hat G(p)$ is analytic in a neighborhood of real momenta; reinterpreted as a function of $p^2$ it is analytic near the real axis except for a cut from $-\infty$ to $-m^2$. At large momenta $p$, $\hat G(p)$ grows like $O(|p|^4)$ up to some possible logarithms. By the Källén-Lehmann representation (which actually follows from RP and euclidean invariance) we obtain the subtracted dispersion relation stated in [3] (eq. (11)): $\hat G(p)$ equals $a_1 + a_2\,p^2 + a_3\,(p^2)^2$ plus a three times subtracted dispersive integral over $\rho(t)\,dt$, where the constants $a_i$ are proportional to the free parameters $c_i$ and $\rho(t)\,dt$ is a positive measure, growing at most like $t^2$ for $t \to \infty$.
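The origin of the three subtractions can be seen from an elementary identity. The sketch below assumes that, away from coinciding points, $\hat G$ is formally given by minus a positive spectral integral (the sign reflecting $G(x) \le 0$ for $x \neq 0$) and that Fourier normalization factors are absorbed into $\rho$; the sign and normalization conventions used in [3] may differ.

```latex
% Three-times-subtracted dispersion relation (sketch; sign convention assumed, see lead-in).
\begin{align}
  \frac{1}{t+p^2}
  &= \frac{1}{t} - \frac{p^2}{t^2} + \frac{(p^2)^2}{t^3}
     - \frac{(p^2)^3}{t^3\,(t+p^2)} , \\
  -\int_{m^2}^{\infty} \frac{\rho(t)\,dt}{t+p^2}
  \;&\longrightarrow\;
  \hat G(p) = a_1 + a_2\,p^2 + a_3\,(p^2)^2
     + (p^2)^3 \int_{m^2}^{\infty} \frac{\rho(t)\,dt}{t^3\,(t+p^2)} .
\end{align}
% For rho(t) ~ t^2 the first three terms of the expansion give divergent integrals;
% they are replaced by the free constants a_1, a_2, a_3.
% The remainder converges, since its integrand behaves like t^{-2} at large t.
```

Whatever the sign convention, the remainder carries an explicit factor $(p^2)^3$, so that $\hat G(0) = a_1$ is fixed by the contact term alone.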
It is obvious from this discussion that the 'topological susceptibility', given up to normalization by $\hat G(0)$, does not have any unambiguous meaning, be it in full or quenched QCD. In full QCD in the chiral limit one postulates $\hat G(0) = 0$, based on the anomalous Ward identity and the absence of zero mass particles. This equation can clearly be made true by simply putting $a_1 = 0$. It is also clear that eq. (1) can likewise be made true by a suitable choice of the constant $a_1$ for the quenched case, but that way the formula would of course not have any predictive value. The authors of [3] propose to derive the WV formula from eq. (11) by first sending the parameter $u \equiv N_f/N_c$ to zero and then going to $p = 0$. A crucial assumption is that for $u \to 0$ at fixed $p \neq 0$ the left hand side goes to the quenched value $\hat G(p)^{\rm qu}$. The right hand side is treated by an expansion in powers of $u/p^2$ followed by sending $p \to 0$. This is a dangerous procedure, because truncating such an expansion at order $(u/p^2)^k$ leaves an error term $O((u/p^2)^{k+1})$, and therefore sending $p \to 0$ termwise in the expansion is not justifiable. This problem can, however, be circumvented by rewriting the dispersion relation eq. (11) in a form (eq. (14)) in which the contribution of the $\eta'$ meson, a pole term with weight $R^2$, is separated from the rest of the dispersive integral, which it is expected to dominate; the polynomial part then has new coefficients $b_i$. Simply putting $p = 0$ in this equation, one arrives at a relation (eq. (15)) between $b_1$, $R^2$ and $m_{\eta'}^2$. By standard arguments one derives from this relation a WV-like formula, in which, however, the contact term $b_1$ takes the place of $\chi_t^{\rm qu}$. This was essentially the proposal made in [5] (where, however, an unsubtracted dispersion relation was used, which is only justified after approximating the spectral density $\rho$ by a δ-function at the $\eta'$ mass; see also [12]).
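The rewriting described above can be sketched as follows. The starting point is a subtracted dispersion relation whose dispersive part enters with the sign appropriate for $G(x) \le 0$ at $x \neq 0$ (an assumption about conventions; [3] may differ). The symbol $\tilde\rho$ for the spectral density without the $\eta'$ pole, and the explicit relations between the $a_i$ and $b_i$, are introduced here for illustration.

```latex
% Separating the eta' pole (sketch; conventions as stated in the lead-in).
\begin{align}
  \hat G(p) &= a_1 + a_2\,p^2 + a_3\,(p^2)^2
     + (p^2)^3 \int \frac{\rho(t)\,dt}{t^3\,(t+p^2)},
  \qquad \rho(t) = R^2\,\delta(t-m_{\eta'}^2) + \tilde\rho(t), \\
  \frac{(p^2)^3}{m_{\eta'}^6\,(p^2+m_{\eta'}^2)}
  &= \frac{1}{m_{\eta'}^2} - \frac{p^2}{m_{\eta'}^4} + \frac{(p^2)^2}{m_{\eta'}^6}
     - \frac{1}{p^2+m_{\eta'}^2}, \\
  \hat G(p) &= b_1 + b_2\,p^2 + b_3\,(p^2)^2 - \frac{R^2}{p^2+m_{\eta'}^2}
     + (p^2)^3 \int \frac{\tilde\rho(t)\,dt}{t^3\,(t+p^2)},
  \qquad b_1 = a_1 + \frac{R^2}{m_{\eta'}^2}, \\
  \hat G(0) = 0 \quad&\Longrightarrow\quad b_1 = \frac{R^2}{m_{\eta'}^2} .
\end{align}
% Similarly b_2 = a_2 - R^2 / m_{eta'}^4 and b_3 = a_3 + R^2 / m_{eta'}^6.
```

The last line does not depend on the remainder integral, which vanishes at $p = 0$; it is consistent with the choice $a_1 = 0$ in the full theory.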
This latter identification of $b_1$ with the quenched topological susceptibility can be based on the following reasoning, following the route taken by [3]: We also first send the parameter $u = N_f/N_c$ to zero at fixed $p \neq 0$ and accept the assumption of [3] that this corresponds to quenching on the left hand side of eq. (14); on the right hand side, assuming with [3] that both $m_{\eta'}^2$ and $R^2$ are $O(u)$, after taking the second limit $p \to 0$, one obtains just $b_1$; thus one concludes that $\chi_t^{\rm qu}$ equals $b_1$ (up to normalization) or, using eq. (15), the same quantity expressed through $R^2$ and $m_{\eta'}^2$, which now leads by the usual arguments to the WV equation (1) in its standard form. But the fact remains that without suitably fixing the contact terms, the WV relation does not hold, and, as the discussion above shows, the quenched topological susceptibility is in fact equal to the contact term $b_1$, similar to the situation in QED$_2$. To put more meaning into the WV relation, a self-contained lattice derivation is certainly desirable, and the paper [3] takes some important steps in that direction. We will make some comments about this in the next section.
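Why the order of the two limits matters can be seen on the pole term alone. As a sketch, write $R^2 = r\,u$ and $m_{\eta'}^2 = \mu\,u$ with $u$-independent constants $r, \mu > 0$ (hypothetical notation for the assumption that both quantities are $O(u)$), and take the pole term with a negative sign, as appropriate for $G(x) \le 0$ at $x \neq 0$.

```latex
% Non-commuting limits u -> 0 and p -> 0 for the eta' pole term (sketch).
\begin{align}
  -\frac{R^2}{p^2+m_{\eta'}^2}
  &= -\frac{r\,u}{p^2+\mu\,u}
   = -\frac{r\,u}{p^2}\sum_{k\ge 0}\Bigl(-\frac{\mu\,u}{p^2}\Bigr)^{k}
   \qquad (\mu\,u < p^2), \\
  \lim_{p\to 0}\,\lim_{u\to 0}\Bigl(-\frac{r\,u}{p^2+\mu\,u}\Bigr) &= 0 ,
  \qquad
  \lim_{u\to 0}\,\lim_{p\to 0}\Bigl(-\frac{r\,u}{p^2+\mu\,u}\Bigr)
  = -\frac{r}{\mu} = -\frac{R^2}{m_{\eta'}^2} .
\end{align}
% The series in u/p^2 converges only for mu*u < p^2, so it cannot be continued to p = 0 term by term.
```

In the first order of limits the pole term drops out and only the contact term survives at $p = 0$; in the second it stays finite and can cancel the contact term, as required in the full theory.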
The statement that $\chi_t$ is defined only up to a free parameter and could have either sign seems to clash with the 'obvious' identity $\chi_t = \lim_{V\to\infty} \frac{1}{V}\langle Q_V^2\rangle$ with $Q_V = \int_V q(x)\,dx$, which seems to show manifestly that $\chi_t \ge 0$. But this argument is too naive. A harmless point is that a sharp volume cutoff as in $Q_V$ is not allowed due to the singular nature of the correlators of $q(x)$. This can easily be fixed by replacing the quantity $Q_V$ by $Q(f_V)$, where $f_V$ is a smooth approximation of the characteristic function of the volume $V$. But one can still not conclude that $\langle Q(f_V)^2\rangle \ge 0$, because there is no physical principle that restricts the free parameters $c_1, c_2, c_3$. More generally, there is no physical principle requiring that the continuum correlation functions are moments of a positive measure and that the symbol $\langle\,\cdot\,\rangle$ used to denote euclidean expectations really means an expectation value in the probabilistic sense. Of course it is possible to choose $a_1 \ge 0$; the dispersion relation then guarantees that $\chi_t = \frac{1}{(2\pi)^4}\hat G(0) \ge 0$. But again everything depends entirely on the choice of the contact term, which does not have any intrinsic physical meaning.
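How the free parameters enter the smeared quantity can be made explicit. The sketch below assumes contact terms of the form $c_1\,\delta + c_2\,\Delta\delta + c_3\,\Delta^2\delta$ and a smooth, compactly supported real test function $f$ (a generic stand-in for $f_V$), and uses only integration by parts.

```latex
% Contribution of the contact terms to <Q(f)^2> (sketch).
\begin{align}
  \langle Q(f)^2 \rangle_{\rm contact}
  &= \int\!\!\int f(x)\,\bigl[c_1\,\delta + c_2\,\Delta\delta + c_3\,\Delta^2\delta\bigr](x-y)\,f(y)\;dx\,dy \\
  &= c_1 \int f^2\,dx \;+\; c_2 \int f\,\Delta f\,dx \;+\; c_3 \int f\,\Delta^2 f\,dx \\
  &= c_1\,\|f\|_2^2 \;-\; c_2\,\|\nabla f\|_2^2 \;+\; c_3\,\|\Delta f\|_2^2 .
\end{align}
% Each norm is nonnegative, but the sign of the sum is not fixed unless the c_i are restricted.
```

Since the non-contact part of $G$ is nonpositive, any positivity of $\langle Q(f)^2\rangle$ has to come from these terms, in particular from the choice of $c_1$.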

Lattice versions of the WV formula
There have been earlier attempts to derive lattice versions of WV-like formulae [13], but the important progress that has taken place in the construction of chiral lattice fermions (see for instance [4] and references therein) suggested a new attack on the problem using Ginsparg-Wilson (GW) fermions, and this is what the authors of [3] proposed to do.
Everything is now well defined and in the absence of any vacuum angle $\theta$ one really has a positive measure determining the euclidean expectation values (Nelson-Symanzik positivity holds). So in this framework it is simply a fact that $\chi_t^{\#} \ge 0$ for # = qu or full, and the good chiral properties of the GW fermions assure that the anomalous Ward identity holds and hence $\chi_t^{\rm full} = 0$ in the chiral limit. The arguments sketched for the continuum depend on dispersion relations that are not valid in this form on the lattice. But if one assumes that they hold up to some corrections that disappear in the continuum limit, one obtains a lattice derivation of the WV formula which now has an unambiguous meaning (at least once one has settled on a definite solution of the Ginsparg-Wilson relation) and hopefully has a finite continuum limit on both sides. So there is a good chance that the work of [3] can be the starting point for a solid foundation of the WV formula. It would be very interesting to study the approach to the continuum of the quenched 2-point function of the topological charge density in this GW framework and see how the subtleties discussed above emerge. Even though RP does not hold for GW fermions before taking the continuum limit, it should (hopefully) become valid in this limit. So one should expect that in this framework the correlator of the lattice version of $q(x)$ is negative, except at distances of a few lattice spacings, and one should see the emergence of a divergent contact term.
These phenomena have been studied in some detail in two-dimensional spin models: by Balog and Niedermaier [14] in the 2D O(3) model and by Vicari [8] in the 2D $CP^{N-1}$ model in the $N \to \infty$ limit. In this work it can be seen clearly that the correlator is negative, in accordance with reflection positivity, except at coinciding points, where the compensating contact term emerges. An analogous study for the case of QCD$_4$, especially with the definition of the topological density suggested by [3], might be illuminating.
The main conclusion of this discussion is: the WV formula is ambiguous as it stands, and its truth depends strongly on the right choice of contact terms. If one starts from the lattice, everything therefore depends on the right treatment of the short-distance fluctuations. The GW framework offers some hope for a self-contained lattice derivation, and the anomalous Ward identity suggests the right choice of the topological density with the right short-distance fluctuations.
The author is grateful to P. Weisz for discussions.