Scale Anomalies, States, and Rates in Conformal Field Theory

This paper presents two methods to compute scale anomaly coefficients in conformal field theories (CFTs), such as the c anomaly in four dimensions, in terms of the CFT data. We first use Euclidean position space to show that the anomaly coefficient of a four-point function can be computed in the form of an operator product expansion (OPE), namely a weighted sum of OPE coefficients squared. We compute the weights for scale anomalies associated with scalar operators and show that they are not positive. We then derive a different sum rule of the same form in Minkowski momentum space where the weights are positive. The positivity arises because the scale anomaly is the coefficient of a logarithm in the momentum space four-point function. This logarithm also determines the dispersive part, which is a positive sum over states by the optical theorem. The momentum space sum rule may be invalidated by UV and/or IR divergences, and we discuss the conditions under which these singularities are absent. We present a detailed discussion of the formalism required to compute the weights directly in Minkowski momentum space. A number of explicit checks are performed, including a complete example in an 8-dimensional free field theory.


Introduction
Ken Wilson taught us to think of quantum field theory as the result of "zooming out" from a local quantum theory, formalized by the renormalization group (RG). For example, a lattice of spins can be approximated as a continuum quantum field theory at distances much longer than the lattice spacing. This paradigm has led to an understanding of the robustness of effective field theories in particle physics, and universality of critical behavior in condensed matter physics.
The RG flow can be viewed as the effect of a scale (dilatation) transformation. Interacting systems generally have a nontrivial behavior under scale transformations, leading to renormalization group flows that can display interesting phenomena such as asymptotic freedom, dimensional transmutation, and IR fixed points. Such nontrivial renormalization group flows are possible only due to the fact that scale transformations are generally anomalous. If there were no scale anomaly, the breaking of scale invariance would be simply a matter of dimensional analysis, with all couplings being either relevant, irrelevant, or exactly marginal.
In this paper we focus on theories at a conformally invariant fixed point of the RG. The IR limit of many flows may be a gapped theory or a theory of free massless particles, but there are also many interesting theories that have a nontrivial interacting conformal IR fixed point, such as the conformal window of QCD-like theories. At such a fixed point, beta functions for dynamical couplings are zero by definition, but we can have a nonzero beta function for the self-coupling of source fields that define local operators in the theory. For example, we can define the energy-momentum tensor by coupling the theory to a background metric g µν . In 4-dimensional theories, there is a purely gravitational coupling, with dimensionless coefficient α, proportional to the square of the Weyl tensor that is conformally invariant. The coupling α is renormalized even at the fixed point, and therefore its beta function is a nonzero constant at the fixed point, set by the coefficient c T . The coefficient c T is a basic parameter in 4-dimensional conformal field theory. For example, it appears as the coefficient of the 2-point function of the energy-momentum tensor, and as a coefficient of one of the 3 tensor structures in the 3-point functions. It is anomalies of this kind that we will study in this paper.
The anomaly coefficients appearing in 2- and 3-point functions are trivially related to the normalization of the fields and the OPE coefficients. Starting with the 4-point function, however, the anomaly coefficients depend non-trivially on the theory. In this paper we will study a scale anomaly in the 4-point function of a primary operator O of the form given in Eq. (1.3), where D is the dilatation operator (see Eq. (2.1) for a precise definition), d is the space-time dimension, we define x ij = x i − x j , and c O is an anomaly coefficient. Note that such a scale anomaly can occur only if the scaling dimension of O is given by ∆ O = 3d/4, since the dimension 4∆ O of the 4-point function must match the dimension 3d of the delta functions on the right-hand side. The motivation for studying such a special case is to develop techniques to deal with the more interesting but more complicated cases such as the c T anomaly in 4-dimensional CFTs.
We will show that scale anomaly coefficients such as c O defined in Eq. (1.3) can be written in terms of the CFT data (operator spectrum and OPE coefficients) in an OPE-like expansion, $c_O = \sum_\Psi \lambda_{OO\Psi}^2 C_\Psi$ (1.4), where the sum is over all primary operators Ψ that appear in the OO OPE, λ OOΨ are the OPE coefficients, and C Ψ are universal constants (independent of the theory) that we can compute. We can think of Eq. (1.4) as a manifestation of the "infrared face" of the scale anomaly in a CFT, since it relates the anomaly coefficient to the properties of the CFT data, which determines the correlation functions at separated points.
In this paper we derive two versions of the sum rule Eq. (1.4). The first is derived in Euclidean position space. In this case we exploit the similarity of Eq. (1.3) to Gauss' law with a point source, ∇ · E = c O δ^{3d}(X). This can be used to express the anomaly coefficient c O in terms of a flux integral. We give explicit expressions for c O as an integral over the conformal cross ratios u and v with a universal smooth weight function. In the resulting sum rule, the coefficients C Ψ are not positive definite. For example, in d = 4 the contribution from operators of spin ℓ = 0, 4, 8, . . . is positive, while the other spins and the identity operator give negative contributions. The second version of the sum rule is derived in Minkowski momentum space, and gives a sum rule where the identity operator does not contribute and C Ψ ≥ 0 for all operators Ψ. In this case the positivity arises because the scale anomaly is related to the coefficient of a logarithm in a forward 2-to-2 scattering amplitude, and this same logarithm gives the imaginary part of this amplitude, which is a positive sum over states by the optical theorem. The scattering amplitude is given by the connected part of the OOOO correlation function, and therefore the identity operator does not appear in this version of the sum rule. This argument is phrased in the language of scattering theory to make connections with well-known physical concepts, but it can be translated to an argument about CFT correlators and does not require any assumption about the existence of an S-matrix for a CFT.
The momentum space sum rule is not without caveats. In particular, its validity requires the absence of UV and IR divergences, which do occur in some theories. The UV divergences are well understood, and they restrict the validity of the sum rule to theories where the operator spectrum satisfies some conditions that are mild if O is a low-dimension operator. There are also IR divergences that are less well understood. We give some physical arguments and some examples, including a complete test of the momentum-space sum rule for an 8-dimensional free CFT that has no UV or IR divergences. We leave more detailed investigation of this question to future work.
There are a number of reasons this work may be of interest. First, we believe that the scale anomaly is intrinsically an interesting object of study, and we should understand all aspects of it. At a more practical level, the sum rule for the c-anomaly in 4-dimensional CFT may potentially lead to new analytic bootstrap bounds. If we can write a positive sum rule of the form Eq. (1.4) for the c T anomaly for 4-dimensional conformal field theory, it will take the form where we have moved the contribution from the energy-momentum tensor to the left-hand side (it is actually the sum of 3 terms). The quantity on the left is a positive-definite combination of the 3 T T T OPE coefficients, and because every term in the sum is positive, we obtain a lower bound on this quantity as a function of the OPE coefficient λ T T O for the lowest-dimension scalar operator. The validity of this bound requires us to understand the question of IR divergences, but it would be an example of a completely analytic bootstrap bound that is not easily obtained from the traditional Euclidean approach. We leave the investigation of this sum rule for future work.
Yet another reason this work may be of interest is that we develop the technology for summing over CFT states in Minkowski space, and in fact in momentum space. This is interesting because scattering amplitudes are natural observables that live in momentum space. We develop the analogs of conformal blocks for 4-point functions of the form  [1,2], but the treatment is heavily mathematical and is difficult to apply in practice. We hope that the present more physical discussion will prove useful for various Wightman-like correlation functions that have played a role in recent work on conformal field theory, for example in Refs. [3,4].
This paper is organized as follows. In Section 2, we derive the Euclidean position space version of the sum rule Eq. (1.4). In Section 3, we derive the Minkowski momentum space version of the sum rule. Section 4 contains our conclusions. The discussion in the main part of the paper focuses on the main ideas, but many details are provided in the appendices.

The Scale Anomaly in Euclidean Position Space
We begin by discussing the scale anomaly in Euclidean position space in arbitrary dimension d. We focus on the connected 4-point function of identical scalar primary operators O in Euclidean position space obeying the anomalous scale Ward identity of Eq. (2.1). This requires the special value of the scaling dimension ∆ O = 3d/4. Note that d = 4 examples of this include the operator O = ψ̄ψ where ψ is a free Dirac fermion, and O = φ 3 where φ is a free scalar field. The reason that Eq. (2.1) is written for the connected correlation function will be explained below.
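As a quick sanity check of the condition ∆ O = 3d/4, the short script below evaluates the scaling dimensions of the free-field examples quoted in this paper. It is a minimal sketch: the helper names are ours, and it assumes only the standard free-field dimensions (d − 2)/2 for a scalar and (d − 1)/2 for a Dirac fermion. The d = 8 entry anticipates the free scalar example used later in the paper.

```python
from fractions import Fraction


def free_scalar_dim(d):
    """Scaling dimension of a free scalar field in d dimensions: (d - 2)/2."""
    return Fraction(d - 2, 2)


def free_fermion_dim(d):
    """Scaling dimension of a free Dirac fermion in d dimensions: (d - 1)/2."""
    return Fraction(d - 1, 2)


def anomaly_dimension(d):
    """Dimension required for the 4-point scale anomaly studied here: 3d/4."""
    return Fraction(3 * d, 4)


# (dimension of the composite operator, dimension required by the anomaly)
examples = {
    "psi-bar psi, d = 4": (2 * free_fermion_dim(4), anomaly_dimension(4)),
    "phi^3, d = 4": (3 * free_scalar_dim(4), anomaly_dimension(4)),
    "phi^2, d = 8": (2 * free_scalar_dim(8), anomaly_dimension(8)),
}

for name, (dim, required) in examples.items():
    print(f"{name}: Delta_O = {dim}, 3d/4 = {required}, match = {dim == required}")
```

All three composites are built from free fields, so their dimensions are simply sums of the constituent dimensions.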

The Scale Anomaly and Flux Integral
By translation invariance, the correlation function depends on the three differences x 12 , x 23 , x 34 , which we combine into a single vector X ∈ R 3d . Eq. (2.1) can then be written in the form of Eq. (2.4), ∇ · E = c O δ^{3d}(X), which is Gauss' law for the electric field in the presence of a point charge c O at X = 0. The electric field away from X = 0 is given by the CFT data, and our job is to find the value of c O corresponding to that. The solution is to write c O as a flux integral, Eq. (2.6), over a surface Σ, where Σ is any codimension-1 surface in R 3d that encloses X = 0.
The integral in Eq. (2.6) depends on the correlation function at separated points, and shows that the anomaly coefficient c O can be computed from the CFT data. (The integral includes points X where 2 or 3 points coincide, but we will see below that these singularities are integrable.) We can interpret Eq. (2.1) as saying that the divergence of the correlation function vanishes everywhere except at the coincident points, where it has a delta function singularity. Just as in electrostatics, the behavior of the fields at finite distances determines the strength of the delta function singularity, explaining why it is independent of any short-distance regulator.
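The step from the scale Ward identity to Gauss' law can be sketched in one line. The following is our own hedged reconstruction, not a quotation of Eqs. (2.3)-(2.6): it assumes that D acts on the 4-point function G as the sum over the four points of x_i · ∂_i + ∆ O , and that G is written as a function of X = (x 12 , x 23 , x 34 ). The overall normalization and sign of E are conventions that we do not fix here.

```latex
% Assumption: D = \sum_{i=1}^{4} ( x_i \cdot \partial_i + \Delta_O ), with \Delta_O = 3d/4.
% Translation invariance gives \sum_i x_i \cdot \partial_i \, G = X \cdot \nabla_X G,
% and \nabla_X \cdot X = 3d in \mathbb{R}^{3d}.
\begin{align}
  D\, G(X) &= \left( X \cdot \nabla_X + 4\Delta_O \right) G(X)
            = \left( X \cdot \nabla_X + 3d \right) G(X)
            = \nabla_X \cdot \big( X\, G(X) \big), \\
  E(X) &\propto X\, G(X), \qquad
  \nabla_X \cdot E = c_O\, \delta^{3d}(X)
  \;\;\Longrightarrow\;\;
  c_O = \oint_{\Sigma} d\Sigma \cdot E .
\end{align}
```

The special value ∆ O = 3d/4 is exactly what turns the dilatation operator into a total divergence, which is why the flux through Σ does not depend on the choice of surface.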
We now show how to compute the flux integral Eq. (2.6) in terms of the CFT data. Writing the 4-point function using the (12)(34) OPE, we have where the sum over Ψ runs over all primary operators, λ OOΨ is an OPE coefficient, and g Ψ is a conformal block depending on the conformally invariant cross-ratios The OPE expansion is valid only for the full 4-point function, including the disconnected part: The disconnected part can have its own scale anomaly. For d = 4k with integer k, we have This gives a contribution to the anomaly that involves products of local and nonlocal terms. On the other hand, the anomaly for the connected correlation function is completely local, in the sense that the right-hand side of Eq. (2.1) is a delta function where all points are coincident. This is a manifestation of the fact that the scale anomaly for the quantum effective action is local: where ρ O is the source field for the operator O. The locality of the transformation of the quantum effective action is the defining property of an anomaly. It arises from the fact that the anomaly can be viewed as an effect of the UV regulator, which implies that the connected correlation functions change only by local terms under scale transformations.
We therefore obtain a sum rule of the form where C disc is the contribution to the flux integral from the disconnected part where E disc is the contribution to E from the disconnected part of the 4-point function.
Because the 2-point function is non-negative in Euclidean space, we have C disc ≤ 0.

Computing the Flux Integral
We now present the main ideas and results of the computation of the flux integral Eq. (2.6). Details are given in Appendix B.
The flux integral is over a (3d − 1)-dimensional space, but the OOOO correlation function that gives the integrand depends non-trivially only on the cross ratios u and v. It is therefore clear that we can rewrite the flux integral as an integral of conformal blocks g Ψ (u, v) over u and v weighted by a universal weight function that depends only on ∆ O . The strategy to obtain this representation of the flux integral is to use the Faddeev-Popov method to factor out the remaining integrals. The result, Eq. (2.14), is expressed in terms of a complex variable z and its conjugate z̄. The term in square brackets is proportional to the connected part of the 4-point function: it is the difference between the conformal block expansion of Eq. (2.7) and the disconnected part. This term is invariant under crossing (u, v) ↔ (v, u) ↔ (1/u, v/u). (Note that we are using the conventional definition of g Ψ (u, v) which has a nontrivial transformation under crossing, see Eq. (2.7).) The variables z and z̄ in Eq. (2.14) are the natural variables for the conformal blocks [5,6], defined by u = z z̄, v = (1 − z)(1 − z̄). The integration kernel K d is given by the integral in Eq. (2.16). In these variables, crossing symmetry follows from invariance under z ↔ 1 − z and z ↔ 1/z. This can be used to check that the integration measure in Eq. (2.14) as well as the integral (2.16) are invariant under crossing. The kernel K d is a positive function with a maximum at the crossing symmetric point u = v = 1, and is analytic everywhere except z = 0 and z = 1. A plot of K 4 is given in Fig. 1. More details on the structure of the kernel are presented in Appendix B.
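The action of crossing on the cross ratios can be checked numerically. The sketch below assumes the standard definitions u = z z̄ and v = (1 − z)(1 − z̄) from Refs. [5,6], with z̄ the complex conjugate of z in Euclidean kinematics; the function name and the sample point are ours.

```python
import math


def cross_ratios(z):
    """Return (u, v) for complex z, taking zbar to be the complex conjugate of z."""
    zb = z.conjugate()
    u = (z * zb).real
    v = ((1 - z) * (1 - zb)).real
    return u, v


z = complex(0.3, 0.7)          # arbitrary sample point in the upper half plane
u, v = cross_ratios(z)

# z -> 1 - z should exchange u and v.
u1, v1 = cross_ratios(1 - z)
swap_ok = math.isclose(u1, v) and math.isclose(v1, u)

# z -> 1/z should map (u, v) to (1/u, v/u).
u2, v2 = cross_ratios(1 / z)
invert_ok = math.isclose(u2, 1 / u) and math.isclose(v2, v / u)

print(swap_ok, invert_ok)
```

Both maps reproduce the crossing transformations (u, v) ↔ (v, u) ↔ (1/u, v/u) quoted in the text.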
The integral Eq. (2.14) goes over the entire upper half complex z plane, which includes regions where the OPE in Eq. (2.7) does not converge. However, we can use different OPE channels in different regions of integration so that only convergent OPE expansions are used. Since regions with different convergent OPEs are related to each other by crossing symmetry, we can equivalently restrict the domain of integration to a fundamental domain. One example is Re(z) This domain is illustrated in Fig. 2. In this case we are integrating only over regions where the OPE converges.
When this is done, we find that the integral Eq. (2.14) converges everywhere except possibly at z = 0, where the kernel has the limiting behavior Convergence of the integral near z = 0 requires therefore For d > 2, the unitarity constraints imply that divergences can only happen for scalar operators, and we conclude that our sum rule Eq. (2.12) is invalid for theories with "super-relevant" scalar operators Ψ in the OO OPE ( d−2 . The integral for Ψ = 1 also diverges, but it is canceled by the contribution from part of the disconnected piece. We can therefore rewrite Eq. (2.12) as where where the factors of 6 account for the fact that there are 6 fundamental domains. Associating C 1 with the unit operator is a slight abuse of notation, since we see that it is really associated with the portion of the disconnected contribution that is not the unit operator.
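The counting of 6 fundamental domains can be illustrated by generating all images of a generic point under z → 1 − z and z → 1/z. This is a minimal sketch with our own helper name; it only demonstrates that the two maps generate a group of order 6. Since the cross ratios depend on z only up to complex conjugation, each image can be reflected back into the upper half plane.

```python
def orbit(z, tol=1e-9):
    """Images of z under the group generated by z -> 1 - z and z -> 1/z."""
    images = [z]
    frontier = [z]
    while frontier:
        w = frontier.pop()
        for image in (1 - w, 1 / w):
            # Keep the image only if it is not already in the list.
            if all(abs(image - known) > tol for known in images):
                images.append(image)
                frontier.append(image)
    return images


images = orbit(complex(0.3, 0.7))   # generic point, away from fixed points of the maps
print(len(images))
for w in images:
    print(w)
```

For a generic starting point the images are z, 1 − z, 1/z, 1/(1 − z), (z − 1)/z and z/(z − 1).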
It is interesting to know the signs of the coefficients C Ψ . A numerical analysis of the d = 4 case shows an alternating pattern: for operators Ψ with spin ℓ = 2n, those with even (odd) n give a positive (negative) contribution, while C 1 < 0. It is somewhat disappointing that the sum does not have a definite sign, but we will derive a positive-definite sum rule in Section 3. The series does appear to converge very quickly. One piece of evidence for this comes from the values of the coefficients C Ψ themselves, which are shown in Fig. 3. Note that these decrease rapidly as a function of the twist τ = ∆ − ℓ up to about τ ∼ 30. For τ ≳ 30, the coefficients increase exponentially, but the OPE coefficients are expected to decrease exponentially [7][8][9]. We also note that at fixed twist, the coefficient for scalar operators always dominates by a large factor. We leave more detailed investigation of the convergence of this sum rule to future work.

An Example: the Free Scalar in 8 Dimensions
The free scalar field theory in d = 8 dimensions provides an interesting test of our sum rule: since the free field φ has scaling dimension 3, there is an anomaly of the form discussed above for the operator O = φ 2 / √ 2, where the normalization is chosen so that the 2-point function takes the canonical form. Since the theory is free, we also know directly the 4-point function in terms of the conformal cross-ratios u and v, see Eq. (C.13) of the appendix. Extracting the connected part, we obtain an expression for the anomaly as a flux integral, where K 8 is provided explicitly in Eq. (B.10).
Table 1. The contribution of all the operators in the φ 2 φ 2 OPE to the scale anomaly c φ 2 of Eq. (2.23). The second column indicates the disconnected part. The operator φ 2 itself gives by far the largest contribution to the anomaly. T µν is the stress-energy tensor for the free scalar theory. The columns denoted ∂ n φ 2 and ∂ n φ 4 indicate the total contribution of all operators of each type.
We have not performed the sum over conformal blocks in this case, since the calculation of the OPE is very cumbersome. However, we can test the numerical convergence by computing the contribution of low-lying operators. The results are shown in Table 1. We see that there is good evidence that the sum is converging rapidly.

The Scale Anomaly and Rates in Minkowski Momentum Space
We now give a different sum rule for the scale anomaly by relating it to a property of physical observables.

Scale Anomalies and Rates
The idea is to couple the operator O to a free quantum probe field A via the interaction of Eq. (3.1), where ε is a coupling constant with dimension d − ∆ O . We work to lowest order in ε for processes involving external A particles (no A loops), so that we can view A as a probe of the CFT dynamics. This is analogous to using the photon to probe QCD.
Because we want to study the 4-point function of O, we consider the AA → AA scattering amplitude at leading order in ε: where we define the Fourier transformed operators and division by the space-time volume V = (2π)^d δ^d(p = 0) cancels the overall momentum-conserving delta function. Note that the disconnected contribution has two momentum-conserving delta functions, and vanishes for general momenta. We will later consider forward kinematics, but it should be understood as a limit, so that the disconnected part never contributes to the amplitude.
We take A to be massless (p 2 i = 0), so the amplitude is a function of the usual Mandelstam invariants Note that we define all momenta to be in-going. In the forward limit t → 0 the amplitude is a function only of s. (In the following we assume that the t → 0 limit is well-defined. This point will be discussed further below.) Crossing symmetry s ↔ u implies that M(−s) = M(s), and scale invariance then fixes the amplitude to be Real analyticity of scattering amplitudes M * (s) = M(s * ) tells us that the constant α is real. For general ∆ O , the amplitude has a cut along the real s axis, and we have Note that Im M vanishes whenever This cannot be right, since by the optical theorem the imaginary part of the forward amplitude is proportional to the total annihilation cross section AA → CFT states , which is surely nonzero.
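The kinematics used above can be made concrete with a small numerical example. This sketch assumes the labelling s = (p 1 + p 2 )², t = (p 1 + p 3 )², u = (p 1 + p 4 )² with all momenta in-going, which is consistent with the forward limit p 3 → −p 1 , p 4 → −p 2 used in this paper; the mostly-minus metric, the choice d = 4 and the function names are illustrative assumptions only.

```python
import math


def minkowski_dot(p, q):
    """Mostly-minus Minkowski product (signature chosen for illustration)."""
    return p[0] * q[0] - sum(a * b for a, b in zip(p[1:], q[1:]))


def add(p, q):
    return tuple(a + b for a, b in zip(p, q))


def square(p):
    return minkowski_dot(p, p)


def massless_2to2(energy, theta):
    """All-ingoing massless momenta in the centre-of-mass frame (d = 4 for illustration)."""
    p1 = (energy, 0.0, 0.0, energy)
    p2 = (energy, 0.0, 0.0, -energy)
    # Outgoing momenta at scattering angle theta, sign-flipped so that they are in-going.
    p3 = (-energy, -energy * math.sin(theta), 0.0, -energy * math.cos(theta))
    p4 = (-energy, energy * math.sin(theta), 0.0, energy * math.cos(theta))
    return p1, p2, p3, p4


def mandelstam(p1, p2, p3, p4):
    s = square(add(p1, p2))
    t = square(add(p1, p3))
    u = square(add(p1, p4))
    return s, t, u


s, t, u = mandelstam(*massless_2to2(1.0, 0.4))
s0, t0, u0 = mandelstam(*massless_2to2(1.0, 0.0))   # forward limit

print(s + t + u)      # vanishes for massless external particles
print(t0, u0 + s0)    # forward limit: t = 0 and u = -s
```

In the forward limit the s ↔ u crossing therefore acts as s → −s, which is the origin of the condition M(−s) = M(s).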
Note however that this vanishing occurs precisely where the special values of the dimensions allow a scale anomaly. The corresponding expression, Eq. (3.9), is schematic because for r > 0 there may be more than one structure on the right-hand side. The scale anomaly modifies Eq. (3.6) by adding a logarithmic term with coefficient β. The log term arises from Eq. (3.9), and so β is a linear combination of the scale anomaly coefficients. We see that the imaginary part of the amplitude is determined by scale anomalies in all cases where they are present.³ We will later restrict the discussion to the case of the r = 0 anomaly defined in Eq. (1.3). In that case we have β = −c O /4 and we learn that c O > 0. In fact, c O can be written as a positive sum over states. In a CFT, the states are in one-to-one correspondence with the operators of the theory, so we expect a sum rule of the form $c_O = \sum_\Psi \lambda_{OO\Psi}^2 C_\Psi$ (3.12) with C Ψ ≥ 0. Note that the identity operator does not appear because it is purely disconnected, and does not contribute to scattering. Our main goal in this section is to make this sum rule precise and compute the coefficients C Ψ . In the derivation of this rule, however, we will mostly make statements about the imaginary part of the amplitude, i.e. we will work in the general case where ∆ O is not necessarily equal to 3d/4, and take the special case corresponding to the anomaly at the end.
³ If the dimension ∆ O continuously approaches a value where there is a scale anomaly, we expect that M(s) ∼ 1/δ where δ = 2∆ O − 3d/2 − r. Then as δ → 0 the imaginary part of the amplitude is finite, and the real part is local and divergent, but this can be canceled by adding a local counterterm ∼ ∂ 2r ρ 4 /δ to the theory to make the amplitude finite. In this way, everything depends smoothly on ∆ O .
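The interplay between the power law and the logarithm can be checked numerically. The sketch below is an illustration under stated assumptions, not a transcription of Eqs. (3.6)-(3.10): it takes the crossing-symmetric forms α[(−s)^γ + s^γ] and β[log(−s) + log(s)], evaluated at s + iε with the principal branch, and the function names are ours.

```python
import cmath
import math


def power_law_amplitude(s, gamma, alpha=1.0, eps=1e-12):
    """alpha * [(-s)^gamma + s^gamma] evaluated just above the real s axis."""
    s_phys = complex(s, eps)
    return alpha * (cmath.exp(gamma * cmath.log(-s_phys))
                    + cmath.exp(gamma * cmath.log(s_phys)))


def log_amplitude(s, beta, eps=1e-12):
    """beta * [log(-s) + log(s)] evaluated just above the real s axis."""
    s_phys = complex(s, eps)
    return beta * (cmath.log(-s_phys) + cmath.log(s_phys))


s = 2.0
# Generic exponent: Im M = -alpha * sin(pi * gamma) * s^gamma is nonzero.
print(power_law_amplitude(s, 0.5).imag)
# Integer exponent: the imaginary part of the pure power law vanishes.
print(power_law_amplitude(s, 1.0).imag)
# Logarithm: Im M = -pi * beta, so beta = -c_O / 4 gives Im M = pi * c_O / 4.
c_O = 1.0
print(log_amplitude(s, -c_O / 4).imag, math.pi * c_O / 4)
```

With these conventions a non-negative imaginary part requires β ≤ 0, in line with the statement that β = −c O /4 implies a positive c O .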
The above discussion has used the language of scattering theory because it relates the argument to well-known physical concepts. However, it is important to realize that our results do not depend on the assumption that the theory where A is coupled to the CFT has a well-defined S-matrix. The "amplitude" M is defined by the Fourier transform of the 4-point function ⟨T[OOOO]⟩. The "optical theorem" follows from the combinatoric operator identity of Eq. (3.13), where the sum over σ is over all partitions of 1, . . . , n into two groups, and T (T̄) denotes time ordering (anti-time ordering). This result is proved by writing out all the (anti-)time orderings and checking that they cancel pairwise. For the case n = 4 and the region of momentum space we are assuming, the only nonzero terms give the optical theorem that we will use to compute the coefficients C Ψ in Eq. (3.12).
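The pairwise cancellation can be verified by brute force for small n. The sketch below treats the operators as non-commuting symbols at distinct times and sums over all ways of splitting them into an anti-time-ordered group and a time-ordered group. The relative sign (−1)^k, with k the size of the anti-time-ordered group, is our assumption for the form of the identity; the text above only states that the terms cancel in pairs.

```python
from collections import Counter
from itertools import combinations


def largest_time_sum(times):
    """Signed sum over partitions of (anti-time-ordered word)(time-ordered word)."""
    labels = range(len(times))
    total = Counter()
    for k in range(len(times) + 1):
        for group in combinations(labels, k):
            rest = [i for i in labels if i not in group]
            # Anti-time ordering: earliest operator leftmost.
            anti = sorted(group, key=lambda i: times[i])
            # Time ordering: latest operator leftmost.
            ordered = sorted(rest, key=lambda i: times[i], reverse=True)
            # The product is a word in non-commuting operator labels.
            word = tuple(anti + ordered)
            total[word] += (-1) ** k
    # Keep only the words whose signed coefficients do not cancel.
    return {word: coeff for word, coeff in total.items() if coeff != 0}


remainder = largest_time_sum([0.3, 1.2, -0.5, 2.0])   # n = 4, arbitrary distinct times
print(remainder)
```

The cancellation works because the latest operator can sit either at the end of the anti-time-ordered group or at the front of the time-ordered group, giving the same word with opposite signs.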
Before we turn to computation of these coefficients, we address the possibility of UV and IR divergences that potentially invalidate the result.

UV Divergences
The amplitude M(AA → AA) may be ill-defined due to UV divergences. A UV divergence that corresponds to a local counterterm for AA → AA does not contribute to Im M, and therefore does not affect the argument above. On the other hand, Im M may have a UV divergence corresponding to a counterterm for the process AA → CFT of the form given in Eq. (3.14), where Ψ is a scalar primary operator in the OO OPE, and Λ is a UV cutoff mass scale. If such a UV divergence is present, it means that the amplitude AA → CFT depends on the dimensionful cutoff Λ, invalidating the arguments above. We can remove the dependence on Λ by adding a counterterm to the theory, but then the finite part of Im M is not uniquely determined by the coupling of Eq. (3.1). Dimensional analysis tells us that there can be no UV divergence of the form (3.14) for ∆ Ψ > 2∆ O − d (3.15), or equivalently ∆ Ψ > d/2 in the presence of the r = 0 anomaly. Whenever there is no such low-dimension operator Ψ, we cannot have UV divergences, and the rate is calculable in terms of ε.
There are also possible counterterms involving derivatives, which can all be written in terms of a primary operator Ψ µ 1 ···µ ℓ with spin ℓ ≥ 1 that appears in the OO OPE. (Note that we can use integration by parts to move all derivatives to act on the CFT operator.) The absence of UV divergences of this kind imposes an analogous condition on the dimensions of these operators.
UV divergences of the kind we are discussing are present in simple models. For example, consider the free scalar field theory with O = φ 3 . The rate for AA → CFT is given by the two classes of diagrams shown in Fig. 4, and the loop diagram is UV divergent for d ≥ 4. This coincides precisely with the case where the operator Ψ = φ 2 violates the bound Eq. (3.15).
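The free-field example can be checked against the dimensional-analysis bound with a few lines of arithmetic. This sketch reads Eq. (3.15) as ∆ Ψ > 2∆ O − d; that reading is our inference from the statement that the bound reduces to ∆ Ψ > d/2 when ∆ O = 3d/4, and the function names are ours.

```python
from fractions import Fraction


def no_uv_divergence(delta_psi, delta_o, d):
    """Assumed form of the bound: no scalar UV divergence if Delta_Psi > 2 Delta_O - d."""
    return delta_psi > 2 * delta_o - d


def phi_cubed_case(d):
    """O = phi^3 and Psi = phi^2 in the free scalar theory in d dimensions."""
    phi = Fraction(d - 2, 2)
    return no_uv_divergence(2 * phi, 3 * phi, d)


for d in (3, 4, 5, 6):
    print(d, "UV finite" if phi_cubed_case(d) else "UV divergent")

# With Delta_O = 3d/4 the threshold 2 Delta_O - d equals d/2.
d = 8
print(2 * Fraction(3 * d, 4) - d == Fraction(d, 2))
```

The bound is satisfied for d = 3 and first fails at d = 4, matching the statement that the loop diagram is UV divergent for d ≥ 4.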

IR Divergences
We now consider the possibility of IR divergences. The most obvious worry is that the rate AA → CFT is infinite. In a free CFT, this rate can be computed by Feynman diagrams, and involves an integral over final states that can be IR divergent. One example is the free scalar theory with O = φ 2 . In this theory the rate for AA → CFT is divergent for d ≤ 6 because the integral over final states is divergent when the φ particles have forward kinematics. Another example is the free fermion theory with O = ψ̄ψ, where the rate is similarly IR divergent for d ≤ 4. This divergence can be understood as a consequence of the existence of long-range interactions in these theories. In a general CFT where we do not have particle states, we may expect the same divergence in Im M(AA → AA) in the limit t → 0.
In the free-field theory examples we have considered, the IR divergence occurs precisely in the case where the coupling ε in Eq. (3.1) is relevant or marginal. When ε is irrelevant, the interaction (3.1) becomes weaker at small energies, and we expect that we can compute observables in a simple series expansion in powers of ε. On the other hand, if ε is marginal or relevant, we may need to resum the series, and we may expect IR divergences if we work at fixed order in ε. If this physical argument is correct, we expect no IR divergences in the imaginary part of the amplitude when ε is irrelevant.
There is a more subtle type of IR divergence that can spoil the connection between the scale anomaly and the imaginary part of the amplitude. If the amplitude M(AA → AA) has a finite limit as t → 0, we have seen that the scale anomaly c O is related to the imaginary part of the amplitude: the dilatation operator acting on the amplitude is This gives the relation between the scale anomaly and the rate presented above. Note that this amplitude is manifestly crossing symmetric, and is real analytic if α and β are real. The real part of M is singular as t → 0, but the imaginary part is finite: On the other hand, the dilatation operator acting on this amplitude for t = 0 is given by The relationship between the scale anomaly and the imaginary part of the amplitude is different from earlier, due to the divergence in the real part of the amplitude as t → 0. We see that an IR divergence in the real part of M can invalidate the relation between the imaginary part of the amplitude and the scale anomaly, even if both are IR finite.
We would like to have a rigorous understanding of IR divergences. We will see below that the IR divergences we are discussing do not show up as singularities in the individual coefficients C Ψ in our sum rule Eq. (3.12). This means that the singularity as t → 0 can only be due to a failure of the sum over operators to converge. This makes sense, because the divergences arise from the forward limit t → 0. Since this is localized in angle, it involves states of high angular momentum, which come from operators with large spin. We can hope to obtain rigorous conditions that rule out the existence of IR divergences by using bounds on the growth of OPE coefficients for large dimensions [7][8][9], but we leave this for future work.
In this paper, we will proceed without a rigorous understanding of IR divergences, and the reader should be aware that the absence of such divergences is a non-trivial assumption required for the validity of Eq. (3.12).

States in Minkowski Space
Using the optical theorem as given in Eq. (3.13) above, the imaginary part of the forward amplitude M is where the limits p 3 → −p 1 , p 4 → −p 2 are understood. We want to insert a complete set of states in the matrix element between the T̄ [· · · ] and T [· · · ] factors. These states can be described using the same operator-state correspondence used in Euclidean position space. We give the main ideas here, and provide details in Appendix A.
We begin with the radial quantization state defined by inserting a primary scalar operator at the origin in Euclidean space, |Ψ⟩ = Ψ(0)|0⟩. These states are defined on a Hilbert space that lives on a unit sphere in Euclidean space. We then use a combination of conformal transformation and Wick rotation to map this sphere into the t = 0 slice of Minkowski space. We view the conformal transformation as acting on the operators and not the states ("Heisenberg picture"), so this tells us how to interpret the same state |Ψ⟩ in terms of Minkowski space correlation functions. The result is that the state |Ψ⟩ is given by inserting the operator Ψ at a finite imaginary time. The next step is to Fourier transform the state |Ψ⟩ to define an eigenstate |Ψ(k)⟩ of the Minkowski momentum operator P µ . It is easy to see that this is the unique state in the conformal subspace of |Ψ⟩ that satisfies P µ |Ψ(k)⟩ = k µ |Ψ(k)⟩. This means that by Fourier transforming the state |Ψ⟩ we automatically include the full contribution from the conformal subspace containing |Ψ⟩. We provide a proof of this statement in the appendix based on the insertion of a complete set of states in an arbitrary two-point function. In this sense, the conformal blocks in Minkowski momentum space are simply given by (squares of) Fourier-transformed 3-point functions.
For a general tensor operator we define states where M = µ 1 · · · µ ℓ denotes the Lorentz indices and ε I M are a complete set of symmetric traceless polarization tensors. The completeness relation for states can then be written as in Eq. (3.28), where the step functions θ restrict the integral over k to physical momenta (momenta in the spectrum of the operator P µ ). The function Π Ψ measures the norm of the momentum-space state. Of course, we will find that all Lorentz non-invariance cancels when computing Lorentz invariant quantities.

Computation of the Imaginary Part of the Amplitude
We now discuss the calculation of the imaginary part of the amplitude, and subsequently of the coefficients C Ψ appearing in Eq. (3.12).
Using the completeness relation (3.28) with a sum over all operators appearing in the OO OPE, the imaginary part of the amplitude is given by The ordering in the 3-point function must be enforced by the correct iε prescription (see Eq. (3.35) below). Then the coefficient C Ψ is given by which is manifestly non-negative.
These integrals can be carried out explicitly, and are finite in general. For example, for a scalar operator Ψ we have and The dependence of the functions V I Ψ and Π Ψ on s cancels by scale invariance when ∆ O = 3d/4, so that the coefficient C Ψ of Eq. (3.12) is a pure number that can be written, for a scalar intermediate operator, The coefficients C Ψ are manifestly non-negative, and are non-singular except at ∆ Ψ = d/2. They vanish when Ψ has the dimension of a double trace operator, namely ∆ Ψ = 2∆ O + 2n with n = 0, 1, 2, . . . This must in fact be the case, as our sum rule must give zero for a generalized free field theory, where the 4-point function has no connected part. Such a theory however has a nontrivial OPE with an infinite number of double trace operators, and positivity means that the only way to satisfy the sum rule in this case is to have C Ψ = 0 for any scalar operator with double trace dimension. We have not computed the coefficients C Ψ for a general spin-ℓ operator, although this could be done with additional labor. We have calculated them for the special case where the spin-ℓ operators saturate the unitarity constraints, and are therefore higher-spin conserved currents. This is the case that appears in the free-field example, to which we turn next.

A Worked Example
We can apply the momentum space sum rule to the free scalar field theory in d = 8, taking $O = \phi^2$, whose dimension $\Delta_O = d - 2 = 6$ satisfies $\Delta_O = 3d/4$. This theory has a scale anomaly of the form Eq. (2.1), and one can check that there are no UV or IR divergences that invalidate the sum rule.
The $\phi^2 \phi^2$ OPE contains operators of the form $\sim \partial^{2n}\phi^4$, but these have double-trace dimension and therefore do not contribute to the sum rule. (This agrees with the fact that in terms of Feynman diagrams there are no connected diagrams for $AA \to \phi\phi\phi\phi$.) Because the theory is invariant under $\phi \to -\phi$, the only other operators we need to consider have the form $\partial^{2n}\phi^2$. Moreover, only operators with spin $\ell = 2n$ are primaries, and since they saturate the unitarity bound, they must be conserved higher-spin currents. We provide the details of the computation of the two- and three-point functions in Appendix C. The result can be written as a sum over n whose summand is proportional to $\frac{4n+5}{(n+1)(n+2)(2n+1)(2n+3)}$, and it evaluates to $\pi^{12}/4$.
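As a quick sanity check on the series quoted above, the following sketch sums its terms exactly with rational arithmetic. The function names are ours, the starting point n = 0 is an assumption, and the closed form for the partial sums is our own partial-fraction observation rather than a result stated in the text. The check concerns only the n-dependent series, not any overall prefactor multiplying it.

```python
from fractions import Fraction


def term(n: int) -> Fraction:
    """n-th term (4n+5) / ((n+1)(n+2)(2n+1)(2n+3)) of the series."""
    return Fraction(4 * n + 5, (n + 1) * (n + 2) * (2 * n + 1) * (2 * n + 3))


def partial_sum(N: int) -> Fraction:
    """Exact sum of the terms n = 0, ..., N (assumes the series starts at n = 0)."""
    return sum((term(n) for n in range(N + 1)), Fraction(0))


def partial_sum_closed_form(N: int) -> Fraction:
    """Telescoped form of the partial sum, from the partial-fraction split
    -1/(n+1) + 1/(n+2) + 2/(2n+1) - 2/(2n+3) of each term."""
    return 1 - Fraction(1, (N + 2) * (2 * N + 3))


if __name__ == "__main__":
    for N in (0, 1, 10, 100):
        print(N, partial_sum(N), float(partial_sum(N)))
```

Under these assumptions the partial sums approach 1 as N grows, so the n-dependent series by itself contributes a factor of unity.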

Conclusions
In this paper we investigated the scale anomaly of the 4-point functions for scalar operators in a general CFT, and showed, with some caveats, that the anomaly coefficient can be expressed in the form of an OPE, namely where the sum runs over primary operators Ψ that appear in the OO OPE. The most interesting version of this sum rule came from thinking of the OOOO correlation function as a scattering amplitude in Minkowski momentum space. In this version, the identity operator does not appear, and the coefficients C Ψ are non-negative. We developed a calculus for Minkowski correlation functions that allowed us to write a convergent OPE in momentum space. The main caveat to the momentum-space anomaly sum rule is that we do not have a rigorous understanding of the absence of IR divergences in the amplitude.
We also wrote a version of this sum rule that follows from the correlation function in Euclidean space. In this version, the identity operator does appear, and the coefficients C Ψ do not have a definite sign.
There are a number of extensions and directions for further investigation:
• Understand better the IR convergence of the momentum space sum rule.
• Extend the results to operators with spin, in particular conserved currents and the energy-momentum tensor, and obtain a sum rule for the anomaly coefficient c T in 4-dimensional CFTs.
• Extend this approach to the a anomaly coefficient of a 4-dimensional conformal field theory. This anomaly does not lead to an anomaly for scale transformations, but does give rise to an anomaly for special conformal transformations. It is not clear that one should expect positivity for such a sum rule, if one can be found.
• Investigate other possible uses of the calculus we have developed for the OPE for Wightman ordered products in Minkowski space.
We hope to address these questions in future work.

Acknowledgments
We have benefited from discussions and encouragement from many people, including S. Dubovsky

Appendix A: Radial Quantization States in Momentum Space
In this appendix we describe how to use the state-operator correspondence to write a complete set of states in Minkowski space. We start with radial quantization in Euclidean space, and use a combination of conformal transformations and Wick rotation to relate this to states defined on a time slice x 0 = 0 in Minkowski space. This allows us to write matrix elements of Minkowski space operators with radial quantization states. We also give an independent argument that these states are complete by using them to reproduce the 2-point functions of the theory.

A.1 Conformal Mapping
Let us recall some basics of radial quantization in CFT. In any quantization, we pick a time surface and define "in" ("out") states by inserting operators to the past (future) of the surface into the path integral. Correlation functions can then be viewed as the overlap of in and out states.
We begin with radial quantization in Euclidean space. We denote the standard Euclidean coordinates by $x^\mu$ ($\mu = 1, \ldots, d$) and define radial quantization states on the unit sphere $x^2 = 1$ by inserting operators at the origin and integrating over the region $x^2 < 1$. For example, 2- and 3-point functions in radial quantization are given by In radial quantization, Hermitian conjugation is given by inversion. This makes sense because inversion maps the path integral over fields inside the sphere to the path integral outside the sphere. For example, we have where $(x^{-1})^\mu = x^\mu/x^2$. Note that this implies
Any conformal transformation $x^\mu \to x'^\mu$ maps radial quantization operators and states to operators and states in a new quantization. In radial quantization, in-states are defined by integrating over the region $x^2 < 1$, while in the new quantization, in-states are defined by integrating over the image of this region. We apply these ideas to a particular conformal map that maps the unit sphere $x^2 = 1$ into the plane $x'^d = 0$. This gives a quantization where the states live on the plane $x'^d = 0$. The transformation is a combination of a translation and a special conformal transformation, and is given explicitly by where R is an arbitrary length scale. (We think of the unprimed variables $x^\mu$ as dimensionless.) Note that the origin is mapped to $x_0 = (\vec 0, -R)$, and the point at infinity is mapped to $x_\infty = (\vec 0, R)$. Conformal symmetry implies Because the conformal transformation is unitary, we have $|\Psi\rangle' = U|\Psi\rangle$, that is, this is a unitary transformation from one basis to another.
In the primed quantization, we therefore have In other words, in this quantization, the states $|\Psi\rangle'$ are defined by inserting an operator at the point $x_0$.
It is easily seen that inversion in the unprimed variables corresponds to reflection about $x'^d = 0$: This means that Hermitian conjugation is simply given by In particular, we have We see that it is convenient to choose $R = \frac{1}{2}$ to avoid factors of 2 in the normalization of states.
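The explicit formula for the map is not reproduced above, so the following is only a sketch of one map with the stated properties. As an assumption on our part, it is built from a special conformal transformation with parameter $b = -e_d$, a translation by $-e_d/2$, and an overall rescaling by 2R; the function names `sphere_to_plane` and `invert` are ours. The sketch lets one check numerically that the origin goes to $(\vec 0, -R)$, that points far away approach $(\vec 0, R)$, that the unit sphere lands on the plane $x'^d = 0$, and that inversion acts as a reflection about that plane.

```python
def sphere_to_plane(x, R=0.5):
    """Map a Euclidean point x (sequence, last entry = x^d) to primed coordinates.

    Assumed form (not taken from the text):
        x'^i = 2 R x^i / D,   x'^d = R (x^2 - 1) / D,   D = 1 + 2 x^d + x^2.
    """
    x2 = sum(c * c for c in x)
    D = 1.0 + 2.0 * x[-1] + x2
    if D == 0.0:
        # The point x = -e_d is sent to infinity by this map.
        raise ZeroDivisionError("x = -e_d has no finite image")
    image = [2.0 * R * c / D for c in x[:-1]]  # transverse components
    image.append(R * (x2 - 1.0) / D)           # component along x'^d
    return image


def invert(x):
    """Inversion x -> x / x^2 in the unprimed coordinates."""
    x2 = sum(c * c for c in x)
    return [c / x2 for c in x]


if __name__ == "__main__":
    print(sphere_to_plane([0.0, 0.0, 0.0, 0.0]))   # image of the origin
    print(sphere_to_plane([0.6, 0.0, 0.0, 0.8]))   # a point on the unit sphere
    q = [0.3, -0.2, 0.5, 0.4]
    print(sphere_to_plane(q), sphere_to_plane(invert(q)))
```

With the default R = 1/2 the images of the origin and of the point at infinity sit at unit separation, which matches the normalization choice made in the text.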

A.2 Wick Rotation
Note that we have not specified a Hamiltonian for the primed quantization above. In radial quantization, the Hamiltonian is the dilatation operator. If we use the image of the dilatation operator under this conformal mapping as the Hamiltonian in the primed quantization, we obtain the so-called "N-S quantization" [10]. But we can use any Hamiltonian we like as long as the surface $x'^d = 0$ is a surface of constant "time." For our purposes it is most useful to use $x'^d$ as the time variable.
We then define the Wick rotation by and let µ = 0, 1, . . . , d − 1. Now our states $|\Psi\rangle$ live on the $x^0 = 0$ time slice of Minkowski space.
The Minkowski space path integral then defines the matrix element Note that the path integral now gives the time ordered product of operators in the out state, so the 3-point function has the standard Feynman $i\epsilon$ prescription $x_{12}^2 \to \vec{x}_{12}^{\,2} - (x_{12}^0)^2 + i\epsilon$. The terms involving $x_0 = (iR, \vec 0)$ involve imaginary time and do not need any additional $i\epsilon$ prescription. The fact that Ψ is at a finite positive imaginary time ensures that it is inserted to the right of all the other operators. Note also that after the Wick rotation, Hermitian conjugation becomes equivalent to ordinary complex conjugation. The conjugate state $\langle\Psi|$ therefore corresponds to a local operator insertion at $x_0^* = (-iR, \vec 0)$, to the left of all other operators.

Fig. 6. The Poincaré patch in the cylinder coordinates, with $i\tau = i \log|x|$ running vertically and $x/|x|$ represented by an angle in the horizontal plane. The points labeled 0 and ∞ correspond to the origin and spatial infinity of Minkowski space, and both of them lie on the quantization surface τ = 0.

A.3 Alternative Derivation
There are other ways to obtain the results above through a combination of conformal transformations and Wick rotations. One way that is perhaps more familiar is to first map radial quantization to the Euclidean cylinder by $r = e^\tau$, then perform the Wick rotation of the cylinder time t = iτ. This gives a theory on the Minkowski cylinder, and states on the radial quantization sphere r = 1 are mapped to the sphere at t = 0 in this quantization. To obtain flat Minkowski space, we can then restrict the theory to the Poincaré patch shown in Fig. 6 using a special conformal transformation. This approach yields precisely the same final expressions as above. Conceptually it is a little less clear to us since the origin r = 0 is mapped to t → i∞, but then comes back to a finite (imaginary) point only after a final change of variables.
Yet another approach is to define the state $|\Psi\rangle$ in Minkowski space directly by inserting Ψ at $x_0 = (iR, \vec 0)$. It can then be seen that $|\Psi\rangle$ is annihilated by the operators and is an eigenstate of with eigenvalue $\Delta_\Psi$, where the operators P µ , K µ , M µν and D represent the generators of the SO(d, 2) conformal group in the primed coordinates. K µ and D are part of a representation of the Euclidean conformal group SO(d + 1, 1) acting on the unprimed coordinates of the previous section. They can be used to construct the full spectrum of states in the theory. Descendant states in particular are obtained by acting on $|\Psi\rangle$ with the operators P µ conjugate to K µ , given by This quantization procedure is the one described by Lüscher and Mack [1]. Its Hamiltonian is given in terms of Minkowski coordinates $x^\mu$ by Eq. (A.17).

A.4 Momentum space states
Where our quantization method differs from Lüscher and Mack's is in the construction of descendant states. Acting on the primary states $|\Psi\rangle$ with the raising operators of Eq. (A.18) generates all the states in the theory, but these states are not convenient to use. In particular they are not momentum eigenstates. If instead one tries to generate descendant states by acting on a primary state with the momentum operator, then the set of states will still be complete, since the relation between the momentum operator and the raising operators is linear, but the states obtained in this way are not orthogonal to each other, as $|\Psi\rangle$ is not an eigenstate of the dilatation operator D.
An elegant solution to this problem is obtained by taking the Fourier transform The states obtained in this way are remarkable in many ways: they have non-negative norm, are orthogonal to each other (see Eq. (3.29) and Eq. (3.30)), and are eigenstates of the translation operator $P^\mu$ with eigenvalue $k^\mu$. Moreover, their overlap with in or out states is finite due to the presence of the imaginary time component. These properties allow us to write a closure relation in the form of Eq. (3.28), which is correct provided that the set of states $|\Psi(k)\rangle$ with $k^\mu$ in the future light-cone is complete. This last point is not trivial, but we prove it in the next section. Note that we have only mentioned scalar operators so far. The generalization to operators with spin is trivial, and the completeness equation (3.28) readily incorporates them.

A.5 Two-point functions and completeness of the set of states
Let us consider the Wightman two-point function for a scalar operator O in Minkowski position space, In the Euclidean version of this two-point function, one can compute the right-hand side by inserting between O(x) and O(0) radial quantization states defined on a sphere of radius 0 < R < |x|. It is a known result that all the states in the conformal class of O must be taken into account to recover the full two-point function at finite separation x. This means that if we can reproduce the same result in Minkowski space inserting our set of states Eq. (3.28) on the left-hand side of Eq. (A.20), we will have proven that our basis is complete, at least in the conformal class of O. The generalization to fields with spin follows straightforwardly.
Plugging Eq. (3.28) into the above two-point function and using translation invariance to factorize the x dependence, one obtains the Fourier transform of a positive quantity,

integral. By definition, the integral is over the four points $x_1$ to $x_4$, but the four-point function only depends non-trivially on two real parameters: using conformal invariance, three of the four points can be brought to the points (0, . . . , 0), (1, 0, . . . , 0) and ∞, and the position of the last point is then only parametrized by its distances to the former two points. The integral should therefore be reducible to an integral over a two-dimensional space. We detail this procedure below, starting with the definition Eq. (2.6) and leading to the result Eq. (2.14).
The first step is to use translation invariance of the integrand to set $x_4 = 0$, and for later convenience to write the surface integral as a volume integral with a delta function, is related to the sum over conformal blocks of Eq. (2.7), minus its disconnected part,
The second step consists in moving $x_1$ to infinity using the Faddeev-Popov method: one starts by inserting the identity in the integral in the form of $\int d^d b\, \delta^d(b^\mu - x_1^\mu/x_1^2)$, and then one performs the change of variable Upon integration over $x_1$, the previously inserted delta function selects the point $x_1 \to \infty$, and we obtain
The next step is to use a scale transformation to set the norm of $x_2$ to unity, using again the Faddeev-Popov method. Inserting $\int_0^\infty d\lambda\, \delta(\lambda - |x_2|)$ in the integral and performing the change of variables $x_i \to \lambda x_i$, $b \to \lambda^{-1} b$, we get after successive
When d is an integer multiple of 4, the hypergeometric function in Eq. (B.8) reduces to a logarithm times a rational function, and the integral can be performed explicitly. In d = 4, for instance, we find the simple result where the term in square brackets is the Bloch-Wigner dilogarithm [11], a function that is known to appear in various places in physics. In d = 8, the kernel is also proportional to the Bloch-Wigner dilogarithm, A closed form result for $K_d$ in arbitrary dimension is not known.
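For readers who want to experiment numerically with the function appearing in the d = 4 and d = 8 kernels, here is a minimal sketch of the Bloch-Wigner dilogarithm using its standard definition $D(z) = \mathrm{Im}\,\mathrm{Li}_2(z) + \arg(1 - z)\,\log|z|$. The argument and normalization with which it enters the kernel are not reproduced here. The function names and the truncated power series for $\mathrm{Li}_2$, which restricts the sketch to $0 < |z| < 1$, are our own choices.

```python
import cmath
import math


def li2_series(z: complex, terms: int = 400) -> complex:
    """Dilogarithm Li_2(z) = sum_{k>=1} z^k / k^2, truncated after `terms` terms.

    Only meant for |z| < 1; accuracy degrades as |z| approaches 1.
    """
    if abs(z) >= 1.0:
        raise ValueError("power series is only used here for |z| < 1")
    total = 0j
    power = 1 + 0j
    for k in range(1, terms + 1):
        power *= z
        total += power / (k * k)
    return total


def bloch_wigner(z: complex) -> float:
    """Bloch-Wigner dilogarithm D(z) = Im Li_2(z) + arg(1 - z) * log|z|, for 0 < |z| < 1."""
    return li2_series(z).imag + cmath.phase(1 - z) * math.log(abs(z))


if __name__ == "__main__":
    z = 0.3 + 0.4j
    print(bloch_wigner(z), bloch_wigner(1 - z), bloch_wigner(z.conjugate()))
```

The printed values illustrate the standard antisymmetry of D under $z \to 1 - z$ and under complex conjugation, which also implies that D vanishes on the real axis.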

Appendix C: The Free Scalar Theory in 8 Dimensions
In this appendix, we provide more details about the free scalar theory in 8 dimensions, which is the simplest example of a theory with a scale anomaly in a scalar four-point function. In particular, we review some known results about the free scalar theory in arbitrary d and derive some identities that are used in the main body of our work.

C.1 The $\phi^2 \phi^2$ OPE in Arbitrary d
There are two types of operators that enter the $\phi^2 \phi^2$ OPE, schematically of the form $\partial^n \phi^2$ and $\partial^n \phi^4$, where the derivatives can actually act on any of the fields. In general, such operators can have spin $\ell = n, n-2, \ldots$, where $n - \ell$ of the partial derivatives have contracted indices. For the operators of type $\partial^n \phi^2$, the situation is however simpler, as they cannot be primaries if $\ell \neq n$: they must either vanish by the equation of motion $\partial^2 \phi = 0$, if the two contracted derivatives act on the same φ, or must be descendants. Therefore, all primary operators of type $\partial^n \phi^2$ have spin $\ell = n$, that is they are traceless symmetric tensors. Such operators saturate the unitarity bound and hence are conserved currents (except in the case n = 0). Using this conservation rule together with tracelessness and symmetry, one can show that there is a unique primary operator for each integer n (see for instance [14] or appendix F of [15]), which can be written as The $c^{(n)}_{s,k}$ coefficients are fixed by the requirements of tracelessness and conservation to be of the form This means in particular that there are no $\Psi_n$ operators with odd n, as the sum Eq. (C.1) vanishes in that case (which we know also from parity considerations). The coefficients c (C.5) The first operator in this series is $\Psi_2 = \phi^2/\sqrt{2}$ itself, and the second coincides with the stress-energy tensor for the free scalar theory.
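The dimension counting behind this argument can be spelled out in a few lines. The sketch below, with helper names of our own choosing, uses only the free-field value $\Delta_\phi = (d-2)/2$ and the standard unitarity bounds ($\Delta \geq \ell + d - 2$ for spin $\ell \geq 1$, $\Delta \geq (d-2)/2$ for scalars). It confirms that the operators $\partial^\ell \phi^2$ with all indices free sit exactly at the bound for $\ell \geq 1$ but not for $\ell = 0$, and that scalar operators of the form $\partial^{2n}\phi^4$ have the double-trace dimension $2\Delta_{\phi^2} + 2n$.

```python
from fractions import Fraction


def dim_phi(d: int) -> Fraction:
    """Scaling dimension of the free scalar field in d dimensions."""
    return Fraction(d - 2, 2)


def dim_bilinear(d: int, spin: int) -> Fraction:
    """Dimension of a bilinear with `spin` uncontracted derivatives acting on phi^2."""
    return 2 * dim_phi(d) + spin


def unitarity_bound(d: int, spin: int) -> Fraction:
    """Unitarity bound for a symmetric traceless operator of the given spin."""
    if spin == 0:
        return Fraction(d - 2, 2)
    return Fraction(spin + d - 2)


def dim_quartic(d: int, n: int) -> Fraction:
    """Dimension of a scalar built from four phi fields and 2n contracted derivatives."""
    return 4 * dim_phi(d) + 2 * n


def double_trace_dim(d: int, n: int) -> Fraction:
    """Double-trace dimension 2*Delta_O + 2n for O = phi^2."""
    return 2 * dim_bilinear(d, 0) + 2 * n


if __name__ == "__main__":
    d = 8
    print("Delta of phi^2:", dim_bilinear(d, 0), " 3d/4:", Fraction(3 * d, 4))
    for spin in (2, 4, 6):
        print(spin, dim_bilinear(d, spin), unitarity_bound(d, spin))
```

In d = 8 the first line of output shows that $\phi^2$ has dimension 6 = 3d/4, the value required for the momentum-space sum rule of the main text.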
The three-point function of two operators $\phi^2$ (or rather $\Psi_2$) with $\Psi_{2n}$ has a similar form where the OPE coefficients in that case are related to the previous case by
$$\lambda_{\phi^2\phi^2\Psi_{2n}} = 2\,\lambda_{\phi\phi\Psi_{2n}}. \tag{C.10}$$
Using known forms for the conformal block $g_{\Psi_{2n}}(u, v)$ of Eq. (2.7), one can check in any dimension that this implies
$$\sum_{n=0}^{\infty} \lambda^2_{\phi^2\phi^2\Psi_{2n}}\, g_{\Psi_{2n}}(u, v) = 4u^{(d-2)/2} + 4(u/v)^{(d-2)/2}. \tag{C.11}$$
The remaining operators in the OPE are those constructed out of four φ fields. The problem of finding the operators and OPE coefficients in that case is more delicate, as there are not only operators of twist $\Delta - \ell = 2(d-2)$ (totally symmetric tensors), but also of higher twist. We know that their contribution in terms of conformal blocks must be The six terms in the numerator on the right-hand side correspond to the six diagrams in Fig. 7. The last three form the connected part of the four-point function that is used in Eq. (2.23).
For completeness, we also provide the coefficient for the leading operator of type $\partial^{2n}\phi^4$ in the $\phi^2 \phi^2$ OPE, in Eq. (C.14). The operators and OPE coefficients computed in this section are used to derive the numerical results in Table 1.

(C.22)
The unknown overall coefficient a V,n can again be determined by looking specifically at a particular component. We get .