Causality constraints on corrections to Einstein gravity

We study constraints from causality and unitarity on $2\to2$ graviton scattering in four-dimensional weakly-coupled effective field theories. Together, causality and unitarity imply dispersion relations that connect low-energy observables to high-energy data. Using such dispersion relations, we derive two-sided bounds on gravitational Wilson coefficients in terms of the mass $M$ of new higher-spin states. Our bounds imply that gravitational interactions must shut off uniformly in the limit $G \to 0$, and prove the scaling with $M$ expected from dimensional analysis (up to an infrared logarithm). We speculate that causality, together with the non-observation of gravitationally-coupled higher spin states at colliders, severely restricts modifications to Einstein gravity that could be probed by experiments in the near future.


Introduction
Einstein's theory of general relativity (GR) has been extraordinarily successful since its inception over a century ago. Nevertheless, modifications to GR are often discussed in relation to various puzzles, such as the nature of dark energy and dark matter; see [1][2][3] for reviews. An important class of modifications adds higher-derivative terms to the equations of motion, leading to physical effects that grow at short distances. Such corrections, handily classified in [4], arise naturally in string theory, and presumably in any UV-complete theory of quantum gravity.
In this paper, we consider 2 → 2 scattering of gravitons and ask a simple question: Assuming that graviton scattering respects causality at all energies, by how much can the low-energy amplitude differ from the predictions of general relativity? A well-known result by [5,6] shows any Lorentz-invariant theory of massless spin-2 particles must reproduce general relativity at large distances. Our goal will be to bound the corrections to this limit, assuming that relativistic causality (as we understand it) holds.
Our answer will depend on the spectrum of the theory. Let us describe our setup and assumptions. At low energies, we assume there exists a massless graviton, together with a finite number of fields of spin ≤ 2, which can be described by an effective field theory (EFT). Schematically, the low-energy effective action (1.1) encodes modifications to Einstein gravity, where ${\rm Riem}^n$ denotes (possibly non-unique) contractions of products of $n$ Riemann tensors. An important idea is that the sizes of the Wilson coefficients $g_{R^{(3)}}, g_{R^{(4)}}, \ldots$ are constrained by causality, that is, the notion that signals cannot travel faster than light. For example, in [7] it was observed by Camanho-Edelstein-Maldacena-Zhiboedov (CEMZ) that in the presence of $g_{R^{(3)}}$, the two polarization modes of the graviton would move at different velocities in certain backgrounds, and inevitably one of them moves faster than "light". By considering a setup with large enough black holes and mirrors, this effect could lead to closed timelike curves, with ensuing grandparent-type paradoxes; the conclusion is that a classical theory with $g_{R^{(3)}} \neq 0$ is inconsistent. Ref. [7] further pointed out that paradoxes can be avoided at the quantum level if the graviton couples to higher-spin states. Denoting the mass of the lightest higher-spin state (spin 4 or higher) by $M$, this led to a parametric bound: $|g_{R^{(3)}}| \lesssim \frac{1}{M^4}$.
Our main goal in this paper will be to quantitatively bound higher-derivative corrections in (1.1) in terms of the mass $M$ of higher-spin states. This mass $M$ provides a UV-cutoff scale for the low-energy EFT in (1.1). We will assume a large hierarchy between the Planck scale and this cutoff, $M^2 \ll M_{\rm pl}^2$ (1.2), or, equivalently, $GM^2 \ll 1$, so that gravity is weakly interacting below the cutoff. Our methods will test causality of graviton scattering with arbitrary center-of-mass energies, although physically the most important region for us will be near the cutoff $M$.
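As a rough numerical illustration of the hierarchy (1.2), the following sketch evaluates $GM^2 = (M/M_{\rm pl})^2$ in natural units. The value $M_{\rm pl} \approx 1.22 \times 10^{19}$ GeV (with $G = 1/M_{\rm pl}^2$) and the two sample cutoffs are assumptions chosen for illustration; they are not inputs taken from the analysis of this paper.

```python
# Illustrative check of the hierarchy G M^2 << 1 in natural units (hbar = c = 1).
# Assumption: G = 1 / M_pl^2 with the (non-reduced) Planck mass M_pl ~ 1.22e19 GeV.
M_PL_GEV = 1.22e19


def coupling_strength(cutoff_gev: float) -> float:
    """Return the dimensionless combination G * M^2 for a cutoff M given in GeV."""
    return (cutoff_gev / M_PL_GEV) ** 2


# Hypothetical sample cutoffs: 1 TeV and 1e16 GeV.
for cutoff in (1.0e3, 1.0e16):
    print(f"M = {cutoff:.1e} GeV  ->  G M^2 = {coupling_strength(cutoff):.2e}")
```

For both sample values the combination is far below one, which is the regime in which loops within the EFT can be neglected.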
The notion of causality is subtle in gravitational EFTs because there is no globally well-defined lightcone in nontrivial backgrounds. As discussed recently in [8,9], one should contrast "asymptotic causality," which exploits a fixed causal structure at large distances, with "infrared causality" which compares local time delays between species [9]. We will use asymptotic causality, but crucially, imposed at all energy scales and not only within the EFT regime. This leads to sharp mathematical statements involving crossing symmetry, analyticity, and Regge boundedness of scattering amplitudes [10]. Many works have examined how these conditions constrain EFTs and their UV completions, see e.g.  and references therein. In particular, these conditions give rise to dispersion relations that relate high and low energies. In some cases, dispersion relations can be interpreted as expressing the commutativity of coincident shockwaves [35]. Initially, the use of dispersion relations in gravitational EFTs was hindered by divergences related to the forward limit of the low-energy graviton amplitude. These technical issues were recently partially overcome in [13] by studying dispersion relations in impact parameter space; hence our renewed interest in this problem.
There is a natural motivation to study scattering events which have center-of-mass energies above the "cutoff" $M$. In any scenario where a higher-derivative correction to GR might be observed in an astrophysical or other large-distance process, the suppression scale would have to be very low, $M \ll 1\,{\rm TeV} = (2 \times 10^{-19}\,{\rm m})^{-1}$: energies above such a cutoff are routinely probed at colliders. However, collider experiments have not yet reported any higher-spin particles of the type suggested above. Is it at all possible to modify GR in a way that simultaneously: (1) satisfies collider constraints, (2) is relevant at large scales, and (3) respects causality as we understand it?
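The conversion between 1 TeV and a length of about $2 \times 10^{-19}$ m quoted above can be checked with a short sketch. It assumes only the standard value $\hbar c \approx 197.327$ MeV fm; the function name is hypothetical.

```python
# Convert an energy scale to the corresponding length scale, L = hbar * c / E.
# Assumption: hbar * c = 197.3269804 MeV fm = 1.973269804e-7 eV m.
HBAR_C_EV_M = 1.973269804e-7


def energy_to_length_m(energy_ev: float) -> float:
    """Return the length (in metres) associated with an energy given in eV."""
    return HBAR_C_EV_M / energy_ev


# 1 TeV = 1e12 eV
print(f"1 TeV  <->  {energy_to_length_m(1.0e12):.2e} m")
```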
In this paper we take a modest step toward answering this question, by quantitatively relating higher-derivative corrections to the mass $M$ of higher-spin states, using methods from [13]. Our main results will be that dimensionless ratios, of the schematic form $g_{R^{(3)}} M^4$ or $g_{R^{(4)}} M^6$, are bounded by order-unity constants, times an infrared logarithmic divergence $\log(M/m_{\rm IR})$. Alternatively, given a measurement of such couplings, we constrain the mass $M$ of new states and their couplings to gravitons. The task of bounding their couplings to Standard Model fields is left to future work.
The operational definition of "gravity" in this paper is a force which grows linearly with energy at high speeds, corresponding in particle physics language to exchange of a spin-2 particle. We stress that static long-range forces, which could come from direct interactions between matter and light spin-0 or spin-1 particles (also sometimes called fifth forces), are unconstrained by our arguments.
The infrared logarithmic divergence in our bounds is related to the divergence of the eikonal phase in four dimensions. In the context of scalar scattering in AdS/CFT, this logarithmic divergence gets regulated by the AdS curvature scale, yielding rigorous, finite bounds proportional to $\log(M R_{\rm AdS})$ [36]. We expect the same mechanism to apply to graviton scattering as well. Thus, our bounds can be interpreted as finite bounds on gravitational Wilson coefficients in AdS$_4$. The key feature of our bounds is the absence of power-law infrared divergences; eventually, we hope that infrared logarithms can be removed by studying suitable IR-finite observables. In our view, an incredibly conservative assumption would be to replace $m_{\rm IR}$ with the Hubble scale, which for $M \sim 1$ TeV multiplies our bounds by a factor of only $\log(M/m_{\rm IR}) \sim 100$. Note that this would still strongly rule out a modification of GR that simultaneously satisfies (1), (2), and (3) above.
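The estimate $\log(M/m_{\rm IR}) \sim 100$ can be reproduced with a few lines. The sketch below assumes a Hubble rate of 70 km/s/Mpc (a round illustrative value, not one quoted in this paper) and standard values for $\hbar$ and the megaparsec; the function names are hypothetical.

```python
import math

# Assumed constants (standard values, rounded).
HBAR_EV_S = 6.582119569e-16   # hbar in eV s
MPC_IN_M = 3.0857e22          # one megaparsec in metres


def hubble_scale_ev(h0_km_s_mpc: float = 70.0) -> float:
    """Return hbar * H0 in eV for a Hubble rate given in km/s/Mpc."""
    h0_per_second = h0_km_s_mpc * 1.0e3 / MPC_IN_M
    return HBAR_EV_S * h0_per_second


def infrared_log(cutoff_ev: float, m_ir_ev: float) -> float:
    """Return the natural logarithm log(M / m_IR)."""
    return math.log(cutoff_ev / m_ir_ev)


m_ir = hubble_scale_ev()              # roughly 1.5e-33 eV
print(f"m_IR ~ {m_ir:.2e} eV")
print(f"log(M / m_IR) for M = 1 TeV: {infrared_log(1.0e12, m_ir):.1f}")
```

Because the dependence is only logarithmic, changing the assumed Hubble rate or the cutoff by an order of magnitude shifts the result by about 2.3.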
This paper is organized as follows. In section 2, we state our assumed axioms encoding causality of graviton scattering, review dispersive sum rules, and some of their known implications, notably positivity bounds from forward limits. In section 3, we review the impact parameter approach of [13] and provide example positive functionals which prove upper bounds on gravitational EFT coefficients. We explain why the bounds are only weakly affected by possible light matter fields. We then describe our numerical strategy to systematically search for optimal bounds, and how CEMZ-like bounds are automatically included. In section 4, we report the bounds obtained with this method, comment on their relations with known theories, and speculate about the possibility of modifications associated with a low scale M . We summarize in section 5. In appendix A, we review partial waves for graviton scattering amplitudes. In appendix B we present graviton amplitudes from light exchanges of spin-0 and spin-2 particles. Finally, we record our numerical set-ups in appendix C.

Review: Dispersive sum rules
In this section, we describe our key physical assumptions and tools: effective field theory at low energies (section 2.1), unitarity (section 2.2), causality (section 2.3), and Kramers-Kronig-type dispersion relations and some of their consequences (section 2.4). Most of this material is standard, except perhaps for our discussion of Regge boundedness in section 2.3, using compact-support wavefunctions.
Let us first briefly motivate our assumptions, before we state them technically. By unitarity, we mean that initial and final states of scattering processes are elements of positive-definite Hilbert spaces, whose norms are conserved by time evolution. This represents the idea that the probabilities of all possible events are positive and add up to one. The reason we assume this is obvious: we wouldn't know how to interpret negative probabilities.
By causality, we refer to the notion that "signals can't travel faster than light". We assume causality because of its tremendous explanatory power and past successes: by forbidding instantaneous action at a distance, it qualitatively explains why electromagnetic and gravitational waves must exist, why forces in nature are mediated by particles, why antiparticles exist, how their interactions are quantitatively related [37], and so much more. Abandoning causality without a good replacement principle seems akin to opening Pandora's box.$^{1}$
1 If one were not worried about instantaneous action at a distance, any many-body Hamiltonian such as or others, would trivially define a "quantum theory of gravity".
There is an interesting interplay between unitarity and causality, as displayed by quantum fields with wrong-sign kinetic terms, sometimes called "ghosts". In one quantization, positive-frequency modes propagate forward in time but have negative norms. In an alternative quantization choice, norms are fine but positive-frequency modes propagate backward in time. Lee and Wick famously proposed that the ensuing acausality could be made unobservably small if one treats backward-moving modes as resonances which decay to normal forward-moving modes [38], as is indeed seen at low orders in perturbation theory [39,40]. A problem is that, as soon as interactions with normal matter are included, negative-frequency modes make the vacuum unstable against resonant particle production. This instability can be nicely discussed in connection with a classical theorem by Ostrogradsky [41]; it seems incompatible with a long-lived Universe [42].$^{2}$
In short, we assume unitarity and causality because we see no alternatives. It is possible that Nature does not conform to these principles as we understand them, but resulting bounds can be viewed as tests of these principles.

Helicity amplitudes and low-energy EFT
Four-dimensional gravitons possess two helicity states. A complete set of independent amplitudes for graviton-graviton scattering is  ) is nonvanishing, and a large fraction of our results will be derived by studying only this amplitude. Here, we use spinor-helicity variables (see [44]) and we have introduced the Mandelstam invariants with s + t + u = 0. The functions f (s, u), g(s, u), and h(s, u) are analytic in the upper-half plane with Im s > 0, and crossing symmetric: Other helicity amplitudes may be obtained by complex conjugation and Schwarz reflection, for example the −−−+ amplitude is g(s, u) = (g(s * , u * )) * .
2 As was pointed out with much deference to the original authors, the Lee-Wick prescription is ambiguous and "an additional prescription would be needed to completely define the theory" [43]. In our view, the Lee-Wick idea fails to address the vacuum stability issue for the simple reason that the timescales in the relevant vacuum diagram are shorter than the decay time of Lee-Wick quanta.
Figure 1: 2-to-2 scattering amplitudes of gravitons within the low-energy effective theory. We include at tree-level both the graviton exchange and (higher-derivative) contact diagrams, as well as exchanges of possible light spin-0 and spin-2 particles. Other spins are forbidden by angular momentum conservation.
At low energies, we assume a spectrum comprising massless gravitons together with possible light particles of spin ≤ 2. These can be described by an effective field theory (EFT) of the generic form (1.1), with higher-derivative terms encoding modifications to Einstein gravity generated by physics above the EFT cutoff $M$. We assume $M \ll M_{\rm pl}$ and thus neglect loops within the EFT. States with spin greater than two are genuinely gravitational, and assumed to have mass above the cutoff, $m > M$.
By contrast, particles of spin two or less and mass $m \ll M$ can be interpreted as additional states in the Standard Model of particle physics and its extensions, or states arising from Kaluza-Klein reduction of massless gravity in higher dimensions. Angular momentum conservation forbids the decay of a state of half-integer or odd spin to a pair of gravitons, so only matter fields of spins 0 and 2 can affect graviton scattering at tree level. We will refer to both as "matter", even though this nomenclature is slightly unconventional for spin-2 fields.
The best way to enumerate EFT couplings is to list how they modify graviton scattering amplitudes. On-shell three-particle vertices are determined by Lorentz invariance up to overall parameters
The tree-level four-particle amplitudes (2.3) may then be written in terms of exchange diagrams, plus a sum of contact interactions, which are simply polynomials with the symmetry (2.5): h low (s, u) = 40πG g 3 stu + 1 2 g 4 (s 2 + t 2 + u 2 ) 2 + 2 g 5 stu(s 2 + t 2 + u 2 ) + g 6 (s 2 + t 2 + u 2 ) 3 + g 6 s 2 t 2 u 2 + . . . + h matter (s, u) + O(loops) , where hatted couplings are complex (the real and imaginary part representing parity-even and parity-odd couplings, respectively). The subscript "low" emphasizes that this expansion is used only for $|s| < M^2$. The signs on the first line have been chosen so that our couplings relate simply to those in [45].$^{3}$ The matter contributions $f_{\rm matter}(s,u)$, $g_{\rm matter}(s,u)$, and $h_{\rm matter}(s,u)$ are recorded in appendix B.
It is straightforward to write down Lagrangians that give rise to the above amplitudes. Before doing so, it is important to note that Lagrangian densities are only defined modulo field redefinitions (which change contact interactions by equations of motion) and total derivatives. In particular, any higher-derivative term involving the Ricci tensor $R_{\mu\nu}$ or scalar $R$ is removable, so only powers of the Riemann curvature $R_{\mu\nu\sigma\rho}$ must be kept.$^{4}$ Furthermore, numerous identities relate various contractions of Riemann tensors and derivatives. This is the reason why we do not include $R^2$: $R^2$ terms can be recast into the Gauss-Bonnet term, which is topological in $d = 4$. In contrast, the amplitudes (2.7) are unambiguous.
With this being said, it is straightforward to list a minimal set of irreducible higher-dimension operators and map them to the amplitudes (2.7) by computing the resulting tree-level amplitudes. For example, the parity-even sector of cubic gravity contains 10 different operators, but field redefinitions and various identities leave us with only one independent operator [46]. Up to dimension eight, our effective action is where we defined (2.10) It is then straightforward to expand $g_{\mu\nu} = \eta_{\mu\nu} + \sqrt{32\pi G}\, h_{\mu\nu}$ and apply the standard Feynman techniques to evaluate scattering amplitudes and compare with eqs. (2.7): Note that we absorbed a factor of $8\pi G$ in three-point couplings but not in four-point couplings.
3 The conversion is simply: $\{g_4, g_5, g_6, g_6'\}_{\rm here} = \{a_0, a_1, a_{2,0}, a_{2,1}\}_{\rm there}$. (2.8) In our notation the subscript always denotes half the number of derivatives in the contact interaction.
4 It is well-known for example that $f(R)$ gravity is equivalent to standard Einstein gravity minimally coupled to a scalar field with a specific potential. From our perspective, $f(R)$ gravity thus does not constitute a higher-derivative correction to Einstein's gravity. Instead, it is a specific choice of matter sector.

High energies: partial waves and unitarity
We will assume that graviton scattering remains sensible even at center-of-mass energies that exceed the EFT cutoff M (where the parametrization (2.7) no longer applies). Our minimal assumptions are that the amplitude remains causal (that is, analytic) and unitary, and that the spectrum is relativistic so that it can be organized in terms of mass, m 2 , and spin, J.
In other words, the amplitude admits a partial wave expansion of the form [47] Explicitly, the MHV amplitude $f(s,u)$ has distinct discontinuities in the $s$- and $t$-channels: where $d^{J}_{\alpha,\beta}$ are Wigner-D functions with stripped helicity factors, see appendix A for more details. The overall $m^{-8}$ originates from the prefactor in (2.3).
For other helicity configurations, we have similar relations, except that the "imaginary part" gets replaced by the discontinuity $\operatorname{Im} a \equiv [a(s + i0) - a(s - i0)]/(2i)$, and the right-hand sides are now complex numbers: The corresponding partial wave expansions are

Regge boundedness and all that
Low and high energies are related by Kramers-Kronig-type dispersion relations. It will be crucial that we can predict beforehand which dispersion relations converge. Typically one assumes a Froissart-Martin-like bound at fixed momentum transfer and large complex energies: $\lim_{|s|\to\infty} \mathcal{M}/s^2 \to 0$ at fixed $t < 0$ (not what we'll assume). (2.20) For example, in tree-level string theory, $\mathcal{M} \sim s^{2+\alpha' t} < s^2$. However, the validity of this bound is not generally established in an abstract theory of quantum gravity. Martin's original proof of the Froissart-Martin bound in axiomatic field theory [51] does not apply to gravity, due to the absence of a mass gap. For holographic theories it has been argued that the behavior (2.20) holds for physical kinematics as a consequence of the chaos bound [52,53]. This difficulty is a physical one and not merely technical: to bound amplitudes at large complex energies, one must generally combine analyticity with some boundedness property on the real axis, as we do shortly. The difficulty is that analyticity holds at fixed momentum, while boundedness holds at fixed impact parameter; these two spaces are related by a Fourier transform which is not easy to control. Namely, it is not straightforward to estimate large-impact-parameter contributions in the absence of a mass gap or of an explicit model of the dynamics. Thankfully, large-impact-parameter physics seems immaterial for bounding EFT couplings at the scale $M$. The intuition, stressed in [12,14], is that EFT parameters at the scale $M$ satisfy sum rules saturated by impact parameters $b \sim M^{-1}$.
Let us explain how we sidestep (2.20) by adapting a recent method from [36], which showed that the conclusions from flat space sum rules apply to quantum gravity in AdS (defined as a CFT with large but finite central charges and single-trace gap). The method is simple: we integrate scattering amplitudes against wavepackets that have finite support in momentum space and decay rapidly at large impact parameters $b$. Formally, for a wavefunction $\Psi(p)$, we define the smeared amplitude: It is apparent that for $|s| > \tfrac{1}{2} M^2$, all amplitudes on the right-hand side are in the physical region where the partial wave expansion (2.12) applies. (The offset of $s$ by $\tfrac{1}{2} p^2$ is not essential but ensures that $s \leftrightarrow u$ crossing symmetry is simply reflection of $s$.) Furthermore, thanks to compactness of the integral, $\mathcal{M}_\Psi(s)$ inherits the analyticity properties of the original amplitude: our fundamental assumption is that a crossing path exists which connects the two points $s = \pm\tfrac{1}{2} M^2$, and that the amplitude is analytic outside of that arc. Fast decay in $b$ requires $\Psi(p)$ to be smooth and to vanish rapidly enough at the endpoints; the precise condition is detailed below (see (3.17)). The upshot is that if the decay sets in at some $b > b_*$, then the spin sum in (2.12) is effectively limited to $J \le \sqrt{s}\, b_*$. Since individual $a_J$ are bounded (see (2.15)), one trivially gets the bound We thus have an analytic function which is bounded on the real axis. Unless this function grows exponentially at complex energies (which would imply blatant time advances when Fourier transformed to the time domain, a behavior which was not seen in theories of quantum gravity in AdS realized by unitary CFTs [36]), it must be bounded in all complex directions by a version of the maximum principle called the Phragmén-Lindelöf principle (see [52]): The results presented in this paper rely only on the above properties of smeared amplitudes $\mathcal{M}_\Psi$, and not on (2.20). (In fact, we will only use that $\lim_{|s|\to\infty} |\mathcal{M}_\Psi(s)|/|s|^2 = 0$, which is easily implied by (2.23).)
We believe that these are conservative assumptions directly traceable to causality and unitarity. We stress that causality is stronger than "particles cannot move faster than light". Since particles are waves, the notion that "signals" cannot travel faster than light is taken to mean that (asymptotic) measurements at space-like separated points A and B commute. This relates amplitudes for a particle moving from A to B to an antiparticle moving the other way: causality is entwined with crossing symmetry [54]. While this physical picture is compelling, we should note that crossing symmetry and analyticity are nontrivial to prove mathematically (for recent discussions see [55,56]). The use of standard S-matrix axioms in the context of quantum gravity is supported by the recent work [36], which showed that well-established CFT axioms imply that graviton scattering in AdS space satisfies dispersion relations.
The implications of (2.23) depend on the helicity of scattered particles. Recalling that for physical kinematics $\langle ij\rangle = \pm[ij]^*$, and The first line gives respectively the fixed-$u$ and fixed-$t$ Regge limits of the MHV amplitude. Notice that certain limits enjoy improved behavior $\sim s^{-3}$: when amplitudes are normalized so that contact interactions are polynomial (see eq. (2.7)), they vanish in some high-energy limits. This phenomenon is known as superconvergence and is the main reason why we will find stronger constraints on graviton contact interactions than for scalars.$^{7}$ Superconvergence is also related to the observation of [58] that a very limited number of graviton contact interactions obey (or more precisely, saturate) the classical bound (2.20). Although it is simpler to prove, the bound (2.23) is stronger and is not satisfied by any individual graviton contact interaction.

Dispersive sum rules
We are now ready to write dispersive sum rules for the amplitudes $f(s,u)$, $g(s,u)$, and $h(s,u)$. We begin with the MHV amplitude $f$. From the behavior (2.25), we get two types of constraints: from fixed-$u$ and fixed-$t$. For fixed-$u$ we can separate $f$ into combinations that are even/odd under $s \leftrightarrow t$ and obtain the following basis of sum rules, for integer $k$:
where the integrals are along a large circle at infinity. We additionally have three fixed-$t$ dispersion relations, which also integrate to zero for $k \ge 2$ even: To avoid confusion between different channels, we always write the fixed momentum transfer as $p$. These sum rules become strictly valid after the $p$-dependence is integrated against appropriate wavepackets as in eq. (2.21). The subscript $k$ indicates the Regge spin of a sum rule. This concept is closely related to, but distinct from, the "number of subtractions" or power of $1/s$ inserted to improve high-energy convergence. For example, $B_2$ has fewer subtractions than $B^{(2)}_2$ (and is even "anti-subtracted" since it has no denominator!), yet they possess the same convergence properties. The nomenclature is motivated by the fact that exchange of a single $t$-channel particle of spin $J$ yields an amplitude that grows like $\mathcal{M} \sim s^J$: we say that a sum rule has spin $k$ if it converges on exchanges with $J < k$ (and marginally diverges on spin $k$).
7 In general, superconvergence occurs in scattering of particles of spins $J_1$ and $J_2$ whenever J1 where $J_0$ is the Regge intercept of the theory, see e.g. [35,57]. The bound (2.23) amounts to $J_0 \le 1$ but all we ultimately use in this paper is $J_0 < 2$.
Regge spin is more important than subtraction-counting because the Regge growth (2.23) translates into the simple convergence criterion k > 1. This is the same criterion as convergence of the Froissart-Gribov formula which extracts partial waves of spin J > 1, or of the analogous Lorentzian inversion formula [59][60][61] which extracts CFT data for spin J > 1.
Sum rules are obtained by deforming the contour towards the real axis but avoiding the low-energy region: the contour in figure 2 relates low-energy data at the scale $M$ and heavy data above $M$: Note that the $s$- and $u$-channel cuts contribute identically due to symmetry of eqs. (2.26), so we included only the right cut. (The contour on the left is in reality the union of upper and lower half-circles, separated by the branch cut of the amplitude.) Let us focus on the first sum rule for simplicity. At tree-level we find only two residues, from $s = 0$ and $u = 0$, which contribute the same amount: Substituting in the low-energy amplitude (2.7a), only the exchange graphs contribute for $k = 2, 3$:
Figure 2: Contour deformation which gives rise to sum rules eq. (2.28). The final contour relates low-energy EFT data along the arcs to heavy discontinuities along the branch cuts.
The absence of contact term contributions is a hallmark of superconvergent sum rules. Examples that probe contact interactions include: A salient feature is that the same couplings appear in multiple sum rules: this reflects crossing symmetry. Another feature is the appearance of the cubic self-coupling in $B^{(1)}_4$: this is due to the rapid growth with $t$ of the $t$-channel exchange diagram with derivatives. This rapid energy growth at zero impact parameter will turn out to be a powerful mechanism to bound $g_3$, as was proposed in section 7 of [58]; this mechanism is distinct from the spin-2 growth at large impact parameter that was exploited by CEMZ [7].
At high energies $s \ge M^2$, the amplitudes are beyond our knowledge. We can nevertheless evaluate the contribution to the dispersive sum rule, by inserting the partial wave decomposition and using eq. (2.16) for even $k$: where $x = 1 - \frac{2p^2}{m^2}$ and for later convenience we define the heavy densities C and (dimensionless) The remaining four sum rules are: In summary, the constraints of analyticity and unitarity used in this paper are embodied in the relation (2.28), which connects EFT couplings (eqs. (2.31)-(2.32)) to heavy averages like (2.33) that involve a positive measure.

Review of simple bounds from forward limits
Let us explore the above sum rules. Evidently, the spin-2 sum rules (2.31) diverge in the forward limit p → 0. This is the famous "graviton pole" problem, whose resolution using smeared sum rules will be described in the next section. However, higher-spin sum rules have smooth limits which have been extensively studied.
As stressed before, forward limits can be dangerous. Should we trust these bounds? From the present perspective, one can argue that the answer is: yes, as long as the size of $g_4$ is much larger than the loop effects which cause the danger at loop level. Let us estimate this in detail. The one-loop amplitude diverges in the forward limit $p \to 0$ like $\mathcal{M} \sim i\pi G^2 \frac{s^3}{p^2} \log p^2$. This divergence can be avoided simply by replacing the limit with evaluation at a small scale $p \sim p_* \ll M$. (This will spoil positivity at large $b = \frac{2J}{m}$, but if we assume that this region is controlled by known low-energy-computable eikonal physics, positivity should not be crucial there.) The bound (2.36) becomes schematically (up to logarithms) On the other hand, as discussed in the next section, in the presence of spin-4 particles at the scale $M$ the expected size is $g_4 \sim \frac{G}{M^6} \gg \frac{G^2}{M^4}$. By choosing $p_*$ appropriately, it is easy to make the loop corrections negligible in comparison: This can be satisfied since we assume that the EFT cutoff is parametrically below the Planck scale. A similar argument was described in [63]. By the same token we can trust the Taylor expansion around the forward limit of other sum rules with spin $k \ge 3$. For $k = 5$ we find from which the two-sided bound $-g_4 \le g_5 M^2 \le g_4$ readily follows (again up to loop corrections). From the forward limit of $B^{(1)}_6$, we have We get additional sum rules of the same scaling dimensions by considering also forward limit derivatives of $k = 4$ and $k = 5$ sum rules: where $\mathcal{J}^2 = J(J + 1)$. The novelty is the presence of a "null constraint", that is, a sum rule with vanishing low-energy contribution. By taking linear combinations of (2.42) and (2.43) that maintain the positivity of the right-hand side, these reproduce the known bounds derived in [45]:$^{9}$ We will see in section 4.4 that the above bounds are not optimal; we will get tighter ones by using more null constraints. In particular we will show that indeed $g_6 \ge 0$ (up to $1/M_{\rm pl}^2$ corrections from loops, like other bounds in this paper).
A generic observation is that all dimensionless ratios of the form $g_k M^{2(k-4)}/g_4$ satisfy two-sided bounds, consistent with dimensional analysis scaling. We do not have a formal proof for all couplings, but we expect this to hold in analogy with the scalar theory case studied in [13]. The novel feature for gravitons is that we will be able to upper-bound $g_4$ itself by a multiple of $G/M^6$, as we now discuss.
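The dimensional-analysis statements above can be checked by simple power counting. The sketch below assumes that a contact coupling $g_k$ multiplies $2k$ derivatives and therefore carries mass dimension $-2k$, and that $G$ carries mass dimension $-2$ in four dimensions; the helper names are hypothetical and the counting is an illustration, not part of the paper's derivation.

```python
# Power counting in units of mass (natural units, four spacetime dimensions).
# Assumption: a contact coupling g_k multiplies 2k derivatives, so [g_k] = -2k.
# Assumption: Newton's constant has mass dimension [G] = -2.
DIM_G = -2


def dim_contact_coupling(k: int) -> int:
    """Mass dimension of the contact coupling g_k."""
    return -2 * k


def dim_ratio(k: int) -> int:
    """Mass dimension of the ratio g_k * M^(2(k-4)) / g_4."""
    return dim_contact_coupling(k) + 2 * (k - 4) - dim_contact_coupling(4)


# The ratios g_k M^(2(k-4)) / g_4 should be dimensionless for every k.
for k in range(4, 9):
    print(f"k = {k}: dimension of g_k M^(2(k-4)) / g_4 = {dim_ratio(k)}")

# G / M^6 should carry the same mass dimension as g_4.
print("dimension of G / M^6:", DIM_G - 6, "| dimension of g_4:", dim_contact_coupling(4))
```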
Bounds that relate gravity and higher derivatives

Impact parameter functionals
As we reviewed in the previous section, the simplest positivity bound is $g_4 \ge 0$, which is trivially established by evaluating $B^{(1)}_4$ at $p \to 0$. However, this says little about the size of $g_4$ since Newton's constant does not appear. On the other hand, one may expect that all graviton interactions shut down if $G = 0$, a concrete example being the CEMZ bound [7]. To probe Newton's constant with dispersion relations, we must inevitably use the spin-2 sum rules $B_2$ and deal with the $1/p^2$ pole. This precludes doing Taylor expansions around the forward limit.
Our strategy will follow [13]: we tame the graviton pole by considering processes with finite impact parameter $b \sim 1/M$. In spacetime dimensions $D > 4$ this removes all divergences. In $D = 4$, as we are considering in this paper, we will be left with infrared logarithms, still a major improvement over power-law divergences in our opinion. Being mindful of the compact-support property (see section 2.3), we consider functionals of the generic form: where $B_i(p^2)$ generically denotes dispersive sum rules and the $f_i(p)$ are compactly supported functions of $p$. The reader might worry that the integral in Eq. (3.1) is done all the way up to the cutoff, pushing the convergence of the EFT expansion to the limit. For the time being we will focus on superconvergent sum rules, for which this is not an issue. In section 3.4 we will review how this problem can be avoided by considering improved sum rules [12,13], which only receive contributions from a finite number of EFT coefficients.
In order to get a bound, the idea is to look for functionals whose action on any state above the cutoff is nonnegative: are the heavy densities defined in Eq. (2.34). In words, any nonnegative functional yields an inequality on low-energy observables, thanks to the relation between low and high energies (2.28).
In D = 4 there will be a tension between finiteness and positivity: finiteness on the graviton pole requires ψ i (p) to vanish faster than p at the origin, which is impossible for the Fourier transform of a positive function. In practice, we begin by finding functionals which are rigorously positive at all impact parameters but logarithmically diverge on the graviton pole. We then regulate by adding an infrared cutoff m IR ≪ M , and accept that this causes negativity at large impact parameters.
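The obstruction can be spelled out in one line. As a sketch, write the wavefunction as a two-dimensional Fourier transform of an impact-parameter profile, here called $\hat\psi(b)$ (a symbol introduced only for this illustration, not taken from the text), and suppose that profile is nonnegative:

```latex
\psi(\vec p) = \int d^2 b \; e^{i \vec p \cdot \vec b} \, \hat\psi(b),
\qquad \hat\psi(b) \ge 0
\quad \Longrightarrow \quad
\psi(0) = \int d^2 b \; \hat\psi(b) > 0
\quad \text{unless } \hat\psi \equiv 0 .
```

A profile that is nonnegative at every impact parameter therefore cannot give a wavefunction that vanishes at p = 0, let alone faster than p, which is why some negativity at large b has to be accepted once the graviton pole is made finite.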
Before explaining how we produce positive functionals, we first detail how we ascertain positivity.

Example positive functionals involving gravity
We now present explicit bounds that combine two ingredients: • The spin-2,3 sum rules B   In principle the spin-4 sum rules should also be smeared to make them rigorously valid, but as discussed in section 2.3 this is a technical modification, which we will ignore here.
Using these ingredients and the method of section 2.4, we have constructed the following two functionals: in units where M = 1. Notice the lower cutoff m IR makes the functionals infrared-safe. We claim that:
• Without the cutoff m IR , the functionals are positive for all states with m > M .
• With the cutoff m IR , they only become negative at some large b ∼ m −2/3 IR .
• They imply the respective bounds: where the matter contribution (coming from possible light scalars or Kaluza-Klein modes and detailed in eq. (3.9) below) is sign-definite: F matter ≥ 0.
The bound (3.4a) is a sharp version of the CEMZ constraint [7]: $g_3 \lesssim \frac{1}{M^4}$, up to logarithms. This shows that a cubic coupling of size $\frac{1}{M^4} R^3$ cannot be turned on without having a heavy state at the mass M or lighter. The bound (3.4b) is similar for quartic couplings.
The functionals (3.3) are not optimal: their main virtue is to be explicit enough to be analyzed in full detail in this section. They establish our main conceptual result: higher dimensional couplings can be bounded in terms of Einstein gravity. Optimal bounds are presented in the next section, see eq. (4.4).
How do we ascertain that a functional is positive on all heavy states? A simple strategy is to plot the action F[m 2 , J] as a function of m for various discrete spins J = 0, 2, 4, 5, 6, . . .. Note that the heavy contribution (2.33) is the sum of two positive unknowns (|c ++ J,m 2 | 2 and |c +− J,m 2 | 2 ) and we must ascertain that the coefficient of each is positive: we will refer to those as F ++ and F +− below. Thus two plots must be made for each (even) value of the spin, and one plot for each odd spin.
Since we cannot plot infinitely many spins, it is fruitful to exploit regularity of the functionals. We find that when plotted as a function of "impact parameter" b = 2J/m, and m, the curves vary smoothly with spin and display simple asymptotic trends. This allows us to draw contour plots where positivity is easy to ascertain, and potentially dangerous regions can be easily identified for finer sampling.
We display F g 3 in figure 3. Unless noted otherwise, all plots in this section are in units where M = 1. To make the plot, we considered data up to J max = 300, more specifically, J = 0, 2, . . . , 300 for F ++ g 3 and J = 4, 5, . . . , 300 for F +− g 3 , accounting for spin selection rules. For each spin, we sample the interval m ∈ [1,16], with a larger density of points closer to the origin (for example 80 points between 1 and 1.2 and 700 between 1.2 and 16). At the highest value m = 16 we have safely reached the m → ∞ limit, further discussed below, and the value b = 40 is well past any interesting structure. Therefore figure 3 establishes positivity of F g 3 . Similar plots are displayed for F g 4 in fig 4, where we used the same sampling.
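The sampling just described can be sketched in a few lines of Python. This is only an illustration of the grid, not the code used for the figures: the function names are ours, the split of the mass interval into a half-open piece [1, 1.2) and a closed piece [1.2, 16] is an assumption about how the quoted point counts were distributed, and `toy_functional` is a placeholder with no physical content that stands in for the actual action F[m^2, J].

```python
def spin_lists(j_max=300):
    """Spins sampled for the two kinds of heavy contributions."""
    same_helicity = list(range(0, j_max + 1, 2))   # F++: J = 0, 2, ..., j_max
    opposite_helicity = list(range(4, j_max + 1))  # F+-: J = 4, 5, ..., j_max
    return same_helicity, opposite_helicity


def linspace(start, stop, count, endpoint=True):
    """Evenly spaced points, written out to avoid any third-party dependency."""
    steps = count - 1 if endpoint else count
    return [start + (stop - start) * i / steps for i in range(count)]


def mass_grid():
    """80 points on [1, 1.2) followed by 700 points on [1.2, 16], units M = 1."""
    return linspace(1.0, 1.2, 80, endpoint=False) + linspace(1.2, 16.0, 700)


def impact_parameter(spin, mass):
    """b = 2J/m, the variable in which the curves vary smoothly with spin."""
    return 2.0 * spin / mass


def is_positive_on_grid(functional, spins, masses, tolerance=0.0):
    """True if functional(mass, spin) >= -tolerance at every sampled point."""
    return all(functional(m, j) >= -tolerance for j in spins for m in masses)


def toy_functional(mass, spin):
    # Hypothetical stand-in; replace with the real F[m^2, J] to run the check.
    return 1.0 / (1.0 + impact_parameter(spin, mass) ** 2)


same_helicity, opposite_helicity = spin_lists()
masses = mass_grid()
print(len(same_helicity), len(opposite_helicity), len(masses))
print(is_positive_on_grid(toy_functional, same_helicity, masses))
```

A scan of this kind only establishes positivity on the sampled points; the regularity in (b, m) noted above and the large-m scaling limit are what justify extrapolating between and beyond them.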
In addition to sampling a finite range, we find it useful to verify positivity in an m → ∞ scaling limit. The limit is nontrivial if the impact parameter b = 2J/m is kept fixed, and is dominated by the B 2 component of the sum rules. It is essentially the Fourier transform of its coefficient [13]:   where . . . represents higher orders in 1/m. With m IR = 0, positivity is easy to ascertain at all b, as shown for F g 3 in figure 5. (In this limit, F ++ and F +− coincide, so there is only one curve.) However, with m IR = 0 the action on low-energy gravity is infinite, and the resulting upper bound is vacuous. We thus add a cutoff θ(p > m IR ), which creates negativity (we take F g 3 as an example to illustrate): The second term overwhelms the first at b max ∼ m −2/3 IR and creates a negative plateau up until b ∼ m −1 IR , where the Bessel function gets damped. While the total area under the functional is necessarily 0 (because of vanishing at p = 0), the behavior past m −1 IR is not universal and could be modified by using a smoother cutoff. This is depicted in figure 5, where we contrast the functionals with m IR = 0 and m IR /M = 10 −6 .
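To get a feeling for the scales involved, the two parametric estimates above can be evaluated for the cutoff used in figure 5. The snippet below is plain arithmetic in units where M = 1; order-one coefficients are dropped, so the numbers indicate orders of magnitude only, and the variable names are ours.

```python
m_ir = 1e-6  # infrared cutoff in units of M, the value contrasted in figure 5

b_onset = m_ir ** (-2.0 / 3.0)  # b_max ~ m_IR^(-2/3): cutoff term starts to win
b_damped = 1.0 / m_ir           # b ~ 1/m_IR: the Bessel function gets damped

print(f"negativity sets in near b ~ {b_onset:.0e}")
print(f"plateau extends up to   b ~ {b_damped:.0e}")
```

With this choice the functional stays positive over roughly four decades in b above the scale 1/M, and the negative plateau occupies about two further decades.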
Note that the functionals F g 3 and F g 4 include derivatives of the spin-4 sum rule B (1) 4 around the forward limit. For the purposes of bounding | g 3 |, one can also find functionals that are pure linear combinations of B 2 and B 3 , with no forward limit component. 10 Such functionals may have useful technical applications, since they avoid the subtleties with forward limits discussed in section 2.5.

Light spin-0 and spin-2 matter fields don't lower the cutoff
In addition to the graviton, we allow for the presence of possible spin-0 and spin-2 light states (for example, Kaluza-Klein modes), whose amplitudes are given by eq. (B.1). They contribute to the low-energy part of sum rules (integral over the arc at s ∼ M 2 ): The spin-2 contribution with m → 0 is proportional to that from the cubic coupling | g 3 | 2 , see (2.31).
The key feature is that the unknown couplings |g J (m )| 2 are sign-definite. Therefore, if we take a combination of sum rules such that the coefficient of each unknown is positive for m ∈ (0, M ), then the unknown light and heavy states will all contribute with the same sign, leading again to a valid inequality on EFT parameters. For the functionals in eq. (3.3), for example, we get: Alternatively, we could move the matter contribution in eq. (3.4a) to the left-hand side, giving: We conclude that our bounds on g 3 limit the sum of squared cubic couplings to all light particles below the higher-spin scale M . It is remarkable that light scalars or spin-2 particles cannot couple strongly to two gravitons. This is very different from the scalar EFT studied in [12], where the only limit on the interaction strength of light scalars would be the unitarity bound Im a J (s) ≤ 2.
This result can be interpreted as follows. In Einstein's gravity, the decay rate of a Kaluza-Klein graviton to two massless gravitons is proportional to an overlap integral $\int dy\,\sqrt{g}\,\chi(y) = 0$, which vanishes by orthogonality of eigenfunctions (here y is a coordinate on the internal manifold and χ(y) is the eigenfunction corresponding to the mode in question). Thus Kaluza-Klein modes only decay through higher-derivative corrections from the higher-dimensional perspective. The 1/M 8 suppression in (3.10) confirms that such a suppression exists for any massive spin-two particle, irrespective of its microscopic origin. The scale of suppression is controlled by the mass of higher-spin particles.
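The vanishing of the overlap integral is easy to check numerically in the simplest setting. The sketch below assumes, purely for illustration, that the internal manifold is a circle of circumference 2π with a flat metric (so that the measure factor is 1) and that the eigenfunctions are cos(n y); none of these choices comes from the text, and `kk_overlap` is a name introduced here.

```python
import math


def kk_overlap(mode_number, samples=2000):
    """Midpoint-rule estimate of the integral of cos(n y) over the circle."""
    step = 2.0 * math.pi / samples
    total = sum(math.cos(mode_number * (i + 0.5) * step) for i in range(samples))
    return total * step


for n in range(4):
    print(n, kk_overlap(n))
```

Only the constant mode n = 0, which plays the role of the massless graviton wavefunction, has a nonzero overlap (equal to the volume 2π of the circle); every massive mode integrates to zero up to rounding error.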

Systematic strategy: improved sum rules
In general, a low-energy gravitational EFT includes an infinite number of contact terms. When evaluated at low energies, the spin-2 and spin-3 sum rules B 2 (t) and B 3 (t) are each sensitive to an infinite subset of these contact terms. It is useful to subtract from B 2 (t) a linear combination of forward limits of higher-spin sum rules to define "improved" sum rules [13], which are sensitive only to a finite number of higher-derivative corrections. Not all sum rules need to be improved: for example, the B 2,3 and B 2 sum rules are automatically free of higher-dimension contact terms. For the B 2 sum rules, we only need to define The improved spin-2 sum rules have the low-energy contributions −B (2) imp 2 (p 2 )| low, grav = 2πG where we have written only the contribution of graviton exchange for brevity. Similarly we can also define improved versions of higher-spin sum rules. An example is: (p 2 )| low, grav = 4πG| g 3 | 2 + 2g 4 . (3.14)
In this section, we apply these improved sum rules (together with some forward-limit sum rules) to derive bounds involving higher-derivative Wilson coefficients and gravity, using the parameter choices listed in appendix C. Following [13], we consider wavefunctions ψ i (p) that are polynomials in p. Since higher-spin sum rules are subleading at large m (with fixed b), their wavefunctions can safely start with n min = 0, as long as spin-2 sum rules are included in the functional. It is worth noting that compact support in p can lead to oscillations at large b, which can potentially hinder positivity. To suppress oscillations, it is useful to make the wavefunctions smoother near p = 1 (in units where M = 1). We do this by multiplying the wavefunctions by a power of (1 − p). For example, our example functionals in the previous section included factors of (1 − p) 3 to suppress large-b oscillations. In this section, we consider wavefunctions of the form where we truncate n to n max .
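The effect of the (1 − p) factor on the large-b tail can be illustrated with a short numerical experiment. The sketch below makes several assumptions that are ours rather than the paper's: the transverse plane is two-dimensional, so the radial Fourier transform is taken to be the integral of p ψ(p) J0(b p) over 0 ≤ p ≤ 1; the test wavefunction is ψ(p) = p^n (1 − p)^k with n = 2; and the Bessel function is evaluated from its standard integral representation so that only the Python standard library is needed.

```python
import math


def bessel_j0(x, nodes=64):
    """J0(x) = (1/pi) * integral_0^pi cos(x sin(theta)) d(theta).

    The integrand is periodic, so the trapezoid rule converges very fast;
    64 nodes are ample for the arguments x <= 60 used below.
    """
    inner = sum(math.cos(x * math.sin(math.pi * i / nodes)) for i in range(1, nodes))
    return (1.0 + inner) / nodes  # both endpoints contribute cos(0) = 1


def smeared_transform(b, n, k, intervals=1000):
    """Simpson estimate of integral_0^1 dp  p * p^n (1-p)^k * J0(b p)."""
    h = 1.0 / intervals

    def integrand(p):
        return p ** (n + 1) * (1.0 - p) ** k * bessel_j0(b * p)

    total = integrand(0.0) + integrand(1.0)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def tail_size(n, k, b_values=range(40, 61, 2)):
    """Largest magnitude of the transform over a window of large b."""
    return max(abs(smeared_transform(b, n, k)) for b in b_values)


hard_edge = tail_size(n=2, k=0)    # wavefunction cut off sharply at p = 1
smooth_edge = tail_size(n=2, k=3)  # wavefunction multiplied by (1 - p)^3

print(f"tail without (1-p)^3: {hard_edge:.2e}")
print(f"tail with    (1-p)^3: {smooth_edge:.2e}")
```

In this toy setting the smoothed wavefunction has a tail several orders of magnitude smaller over the window 40 ≤ b ≤ 60, which is the mechanism by which the (1 − p) factors make positivity at large b easier to achieve.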
Larger values of n max will correspond to more complicated functionals and stronger bounds. As mentioned before, to allow for low-energy matter particles, we add additional constraints to the solver SDPB [64,65] that impose negativity of matter contributions to low-energy sum rules for m < M . In practice, this means including positive matrices that are negatives of the matrices paired with low-energy three-point couplings g s (m ), g * s (m ), and g 2 (m ), g * 2 (m ) for m < M . In the MHV sum rules, matrices for matter particles are simply the coefficients of |g s (m )| 2 , |g 2 (m )| 2 . In numerics, we discretize m to the following values: m ∈ (0, 0.01, . . . , 1)M .

Relation with the CEMZ argument
In [7], CEMZ considered classical scattering of gravitons with various polarizations against a target (for example, a black hole) and argued that a too-large value of | g 3 |M 4 /G would lead to a time advance for one of the polarizations. It is instructive to see how the CEMZ argument is paralleled in our formalism.
The basic idea is to consider the B sum rules at large impact parameters bM ≫ 1. Momentarily ignoring the compact-support constraint p < M , we can define impact-parameter sum rules by Fourier transforming in the transverse space (3.18) At large b, the integral is highly oscillatory, which suppresses the heavy action (2.33) on states with small J. Acting on states with J ≫ 1, the d function simplifies according to (A.11), and the integral localizes 11 to This is positive, yielding a constraint on low-energy coefficients − B 2 (b) low . More generally, we can consider scattering of an arbitrary helicity graviton (1 and 4) against a positive-helicity target (2 and 3), see fig. 7, which motivates us to consider a matrix of sum rules: evaluated in the forward limit. In the center of mass frame, we have

11 A way to see this is using the identity 2 sum rules .

Let us transform to b space using (3.19) for the diagonal terms. For the off-diagonal terms, we use We find the following heavy contribution: which is positive-definite, since the bracket is (by construction) a positive-definite matrix. On the other hand, evaluating the low-energy contribution in eqs. (2.7) we find . . It is instructive to see how these calculations relate. In that reference the eikonal phase is computed from the Fourier transform of the non-analytic parts of the amplitude δ(s, b) = 1 2s .

A classical time delay (matrix) is then extracted from the energy derivative: $t = \frac{\partial}{\partial E}\delta(E^2, b)$. We see that the energy-growing part of the time delay is precisely what our matrix of sum rules captures: Now the dispersive sum rules state that (3.24) equals (3.25). In particular, they imply that the energy-growing part of time delays (in the linear regime) must be positive. The argument of ref. [7] is concluded by noting that the low-energy calculation (using graviton exchange) is only valid when b ≫ M −1 where M is the mass of heavy states. Thus we should only impose positivity of (3.25) in that range, which gives the parametric bound: Ref. [7] further argued that an infinite tower of higher-spin states needed to appear. This discussion highlights that CEMZ constraints are built into dispersive sum rules: they are a subset of the functionals enumerated in the preceding subsection. Namely, this subset consists of spin-2 functionals B 2 (p) integrated against wavepackets that are peaked at impact parameters b. The CEMZ requirement b ≫ 1/M then has a clear origin in our compact support property p < M (see section 2.3).

Table 1: One-loop contributions to low-energy parameters from heavy particles of various spins, divided by an overall factor G 2 N where N is the number of particles of the given spin circulating in the loop. Extracted from [45].
Despite the fact that the same mathematics appears, the physical assumptions are quite distinct. Ref. [7] considered very large center of mass energy, where the amplitude exponentiates, whereas we consider Gs ≪ 1 where the tree-level approximation is sufficient. (Appendix D of [7] also presented an argument valid in the linear regime and related to the "chaos bound" [52].) The upside of using large center of mass energies there was that the acausality becomes classical and macroscopic, which led to transparent "grandparent paradoxes". The downside is that the cutoff is imprecise, b ≫ 1/M : such a method gives parametric bounds, in contrast with the precise cutoff p < M which yields sharp bounds.
In addition, the fact that we scatter waves rather than particles enables the precise energy resolution that is needed to probe the mass and couplings of states above the cutoff. One might say that time advances constitute a classical statement of causality, while crossing symmetry and analyticity of the S-matrix provide a quantum version.

Results
In this section we will present our results for bounds on EFT modifications of Einstein gravity.

Comparison with model amplitudes
Following ref. [45] it will be instructive to compare our bounds with explicit models, with higher-dimension operators arising from integrating out a loop of massive particles of mass m and spin two or less. For this one-loop process, the mass of the lightest higher-spin states, which are two-particle states, is M = 2m. The resulting Wilson coefficients are shown in Table 1.
Note that the complex coupling g 3 (which enters non-MHV amplitudes) vanishes for supersymmetric spectra; correspondingly, the first column in the table is proportional to the number of degrees of freedom and a fermionic sign. Other complex couplings follow the same pattern and it thus suffices to record the contribution from a heavy spin-0 particle: , 3488/225225 , 512/1576575 . (4.1) It is important to note that since we neglect light graviton loops, the effects of integrating out a particle are suppressed by a power of GM 2 ≪ 1 compared with the effects we focus on, and are thus beyond the accuracy of our bounds. Nonetheless, the comparison with these models becomes physically meaningful (larger than neglected graviton loops) if a large number N of heavy particles circulate. In this sense, the upper bounds on couplings recorded below in fig. 9 can be interpreted as bounds on the number of species above mass M : $N \lesssim \# \frac{M_{\rm pl}^2}{M^2} \log(m/m_{\rm IR})$, as far as their effects on low energies are concerned. This has the expected dependence on M pl from "species bounds," although our bounds have an extra factor of an infrared logarithm due to the nature of the probes we are using.
In addition, following [45], we will consider three models of string theory: where $X = 1 - \frac{su}{M^2(t+M^2)}$ and p = 0, 1, 2 representing, respectively, the superstring theory, the heterotic string, and the bosonic string. In all cases M coincides with the mass of the first spin-4 particle exchanged between gravitons.
For the string models, we extract low-energy parameters by matching with the low-energy expansion (2.7) and subtracting the low-energy poles. For the bosonic string, we additionally subtract a tachyon pole $\frac{1}{t+M^2}$. (Because of the tachyon, the bosonic string model is not fully "physical"; however, after subtracting this pole it still has a positive heavy spectral density.) There is an ambiguity of whether we include the spin-0 and spin-2 contributions to the t = M 2 pole as part of f matter or as part of the "heavy" contributions: our bounds are valid in either case. In the plots below the string models thus span extended regions, depending on what fraction 0 ≤ γ i ≤ 1 of each pole we choose to subtract. For example, for the heterotic string we find

Bounds involving R 3 , R 4 and gravity
In the preceding section, we provided example functionals giving upper bounds (3.4) on | g 3 | 2 and g 4 in terms of gravity. As we emphasized, these bounds were not optimal, and we get with n max = 6, and we included additional ∂ q p 2 B (1) imp 4 (0) up to q = 2 to get the bound on g 4 . A finer way to present the constraint is to carve out the allowed space in the three EFT parameters | g 3 | 2 , g 4 and G, as shown in figure 8. These were computed by using all improved B 2 and B 3 for n max = 5 and additional forward-limit contributions from ∂ q p 2 B (1) imp 4 (0) up to q = 2.
A special limit of the bound is the dashed line in figure 8 which is tangent to the allowed region near the origin; from its slope we find numerically that This is effectively equivalent to the bound $\frac{g_4}{8\pi G} \ge \frac{1}{4} |g_3|^2 M^2$ reported in (6.13) of [52] using forward-limit bounds of spin k ≥ 4. This bound indicates that it is not possible to turn on a cubic coupling without having a quartic coupling as well. It is instructive to see the impact of allowing light spin-0 and spin-2 matter fields on the bounds. If we assume that such particles are absent, we of course obtain stronger bounds, as shown in figure 9, where we used the same space of functionals. Notice that both regions, with or without light matter, share the same tangent line (4.6).
Our approach can also compute bounds for mixed amplitudes, beyond MHV. In figure 10, we derive bounds for Re g 4 (i.e., in the all-positive-helicity configuration) and g 4 in terms of gravity. In particular, we instead use n max = 6 to help convergence of functionals built from combining B imp 2,3 and forward limits of improved sum rules ∂ q p 2 B (2) imp 4 up to q = 2. Although we consider only Re g 4 , this bound really applies to the magnitude | g 4 | since from the perspective of the graviton scattering amplitude the overall phase of a complex coupling can be removed by a little-group rotation and thus cannot be constrained (only relative phases between couplings can).
The dashed lines display the positivity constraints (2.38), which we reproduce here: We see that the allowed region is much smaller than the cone (2.38), which appears to be tangent to the allowed region. In particular, a theory which would saturate one of these inequalities is ruled out; this would correspond to a theory where either of α or α in (2.9) is set to zero. (A similar conclusion was reached recently using seemingly different arguments which used only causality within the EFT [66]; it would be interesting to understand the relationship.)

Bounds involving D 2 R 4
Moving to the next derivative order, we apply spin-2 sum rules to relate g 5 to g 4 and G, see figure 11a. Of course, we already know from eq. (2.41) that g 5 /g 4 has two-sided bounds, and that g 4 /G log is bounded, and therefore we expected two-sided bounds on g 5 /G log as well. Similarly we can compute bounds on non-MHV couplings, i.e., for g 5 in figure 11b. We follow the same strategy as for g 4 above to project onto the parity eigenstates and thus present bounds for Re g 5 , although the bounds really apply to | g 5 |.
An interesting aspect of these plots is that the allowed regions are rather smaller than a cone. The dashed lines in figure 11a are tangent to the plot at the origin, which reproduces the positive bounds (2.41): Similarly, the dashed lines that are tangent to figure 11b at the origin correspond to simple positive bounds Curiously, this bound appears stronger than what we could derive from forward limit functionals.
To compute the bounds plotted in figures 11a and 11b, we truncated the space of functionals to n max = 5 built from improved sum rules from spin-2 to spin-6, i.e., B imp i with i = 2 . . . 6. In the former case we also include forward limits of the sum rules ∂ k t B (1) imp 4 (0) and ∂ k t B (1) imp 5 (0) and in the latter ∂ q p 2 B (1) imp 4 (0) and ∂ q p 2 B (2) imp 4 (0), with up to q = 4 to guarantee that the large-J behaviour of the functionals is positive. Other detailed parameter choices are listed in table 2.

Bounds involving D 4 R 4 and low spin dominance
In section 2.5 we reviewed how expanding higher-spin sum rules around the forward limit produces homogeneous bounds involving e.g., g 6 /g 6 , which gives eq. (2.44), which we reproduce here: −90/11 ≤ g 6 /g 6 ≤ 6 (using forward limits and a single null constraint) . (4.10) An important observation made in [45] was that the space of couplings spanned by the theories in Section 4.1, a.k.a. "the theory island", is much smaller than that given by such homogeneous bounds. In [45], in order to approach the theory island, the authors propose an additional assumption called low-spin-dominance (LSD), which is a constraint on possible UV spectra stating that higher-spin states are suppressed compared to low-spin states. Quantitatively, for MHV amplitudes, LSD implies LSD : where α ≥ 1 was used to parameterize the size of suppression of higher-spin states. By increasing α, ref [45] pushes the bounds asymptotically to 0 ≤ g 6 /g 6 ≤ 2 (assuming LSD) , (4.12) thus narrowing down the space of couplings to that spanned by the aforementioned theories.

Although negative values of g 6 are excluded asymptotically, at any finite derivative order the boundary appears to be tangent to lines of slope 6 and −90/11.
In this paper we do not assume LSD. However, by considering inhomogeneous bounds involving g 6 /g 4 and g 6 /g 4 of increasing derivative orders n (meaning null constraints having up to the same scaling dimension as the coupling g n ), we find that we can further narrow down the space of couplings as shown in figure 12. 12 As we increase the number of null constraints at higher derivative order, we observe that g 6 is approaching g 6 ≥ 0. We can thus claim that positivity of g 6 holds asymptotically, which agrees with the prediction of LSD [45]. On the rightmost edge g 6 M 4 /g 4 = 1 we find the absolute upper bound g 6 M 4 /g 4 ≲ 2.38, which is significantly closer to the ratio predicted by LSD.
We can do even better by considering impact-parameter bounds on g 6 and g 6 normalized by gravity, which are the main novelty of this paper. This can be computed by using B imp i with i = 2 . . . 6 with n max = 8, with the result shown in Fig. 13a. Surprisingly, once we include spin-2 sum rules, we find that the dashed lines tangent to the plot at the origin give the bounds 0 ≤ g 6 /g 6 ≤ 3 (our bounds, using spin k ≥ 2 sum rules) . This lower bound is exactly the one predicted by LSD [45]. Furthermore, although the absolute upper bound is weaker than that predicted by LSD, the whole region is narrower since the upper dashed line is mostly excluded.

12 This result was found concurrently in Ref. [67], which also studies inhomogeneous bounds of the form gnM 2(n−m) /gm, and makes similar observations. We thank the authors of that paper for sharing their draft with us and coordinating submission.
As an additional example, we can also provide bounds involving couplings which contribute to the + + +− helicity configuration. More concretely, we compute bounds on Re g 6 and g 4 in terms of gravity using B imp i with i = 2 . . . 7 together with ∂ q p 2 B (1) imp 4 (0) and (4.14)
We conclude that the discrepancy between dispersive bounds and the known "theory space" is very much reduced when more sum rules are used. Compared with previous work, our constraints are tighter (without making additional assumptions such as "low spin dominance") mainly as a result of including more functionals and considering inhomogeneous bounds. The latter greatly helps since tangent slopes near the origin in figure 12 converge more slowly than other constraints. Given the tendency of higher-derivative coefficients to grow geometrically, we expect even more dramatic reductions in the allowed volume at higher derivative orders, although we have not studied those systematically.

Can higher-spin states be hidden from the Standard Model?
This subsection is less rigorous than the rest of this paper; we limit ourselves to non-exhaustive arguments and order-of-magnitude estimates.
What do collider searches tell us about higher-spin particles, of the kind that can lead to modifications of GR? Heuristically, because gravity is universal and couples to all matter, one might expect that modifications to it also couple to everything. Indeed, many specific scenarios of modified gravity, such as string theory models, predict resonances that couple directly to Standard Model matter. The non-observation of such resonances imposes strong constraints on the string scale: M ≳ 7.7 TeV, and on many other scenarios as well [68,69]. Can model-independent constraints be made on potential low-scale modifications to GR?
To orient the discussion, let us imagine a (very hypothetical) scenario where the dynamics of a 10 M ⊙ black hole, of size L ∼ 30 km ∼ 1/(10 −11 eV), were somehow observed to differ by more than 10 −20 from GR's prediction. (Such a signal strength is orders of magnitude weaker than considered in either LIGO or EHT contexts [70,71], but has been chosen to illustrate collider constraints.) An effective field theorist might try to attribute this to higher-derivative terms of size: 13 the latter being only relevant if the former vanishes. Note that g 5 and higher-derivative couplings can never dominate over g 4 , by (2.41). The bounds (4.4) indicate that light higher-spin states with M < 10 −6 eV or M < 10 −8 eV would then have to exist, depending on whether g 3 or g 4 dominates, respectively. 14 Could this be compatible with null results from collider searches?
Temporarily treating higher-spin states as point-like particles (left panel of figure 14), there are two possible index structures for their couplings to two gravitons. These have respectively eight and four derivatives, corresponding to same- and opposite-helicity of the two gravitons. The production cross-sections of the higher-spin particle at a hadron collider, in association with a graviton and a jet (to make missing momentum to trigger on), are then Here s is the partonic center of mass energy, and we have included a factor of α s to create the jet. For crude estimates, we will consider that missing-energy searches [72,73] exclude amplitudes of size |M| ≳ 1 when √ s ∼ TeV. Thus, despite the strong suppression by M pl ∼ 10 15 TeV, we see that in the first scenario M ≲ MeV is clearly ruled out. However, the analysis in this paper does not rule out the option that |c ++ | 2 = 0, which is the boundary of slope −1 in fig. 11a; but in this case there is still a tension with collider searches if M ≲ 10 −3 eV.
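The length and energy scales that open this estimate can be cross-checked with a few lines of arithmetic. The snippet below is a rough sketch: it takes the black hole to have ten solar masses, uses the Schwarzschild radius as a proxy for its size, and converts a length to an energy through ħc/L. The constants are standard reference values, and the function names are ours.

```python
HBAR_C_EV_M = 1.973269804e-7  # hbar * c in eV * m
GM_SUN = 1.32712440018e20     # solar gravitational parameter G * M_sun, m^3/s^2
C = 299_792_458.0             # speed of light, m/s


def schwarzschild_radius_m(solar_masses):
    """Schwarzschild radius 2 G M / c^2 in metres."""
    return 2.0 * GM_SUN * solar_masses / C**2


def inverse_length_ev(length_m):
    """Energy scale hbar * c / L associated with a length L."""
    return HBAR_C_EV_M / length_m


radius = schwarzschild_radius_m(10.0)
scale = inverse_length_ev(30e3)

print(f"r_s(10 solar masses) = {radius / 1e3:.1f} km")
print(f"hbar c / (30 km)     = {scale:.1e} eV")
```

Both numbers agree with the quoted L ∼ 30 km ∼ 1/(10 −11 eV) at the order-of-magnitude level that this subsection works to.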
We conclude that if higher-spin particles are point-like, colliders and our bounds easily exclude effects of the size (4.15); effects on larger distances would be even smaller. Now, the new states don't have to be point-like: the above amplitudes could be softer. However, softening mechanisms generically open up other production mechanisms that are also constrained, as in the stringy example discussed above. These are depicted in figure 14.
For example, the higher-spin states could be two-particle states of some light fields that couple to us only through gravity, which would soften the top vertex. However since two-graviton couplings in this scenario are small ∼ 1/M 2 pl , long-distance effects are negligible unless there are a very large number of such light fields (for the same reason that quantum effects from Standard Model loops are extremely small for simple observables around macroscopic black holes [74]). Specifically, in order to have a 10 −20 effect at L, at a minimum N ≳ 10 60 of them would be needed (such that N/(M 2 pl L 2 ) ∼ 10 −20 ; fields with mass m < L −1 can be treated as effectively massless in this estimate). This would lead to a very low "cutoff" M pl / √ N above which gravity becomes strongly coupled, and values N > 10 32 are typically not considered for this reason [75]. Besides possible cosmological constraints, let us simply mention that even with the minimal coupling (b), since (TeV/M pl ) 4 ∼ 10 −60 , for such N even current colliders could have detected a production cross-section (4.17)

14 In this subsection we ignore infrared logarithms since (log M R Universe ) 1/4 ≲ 3.
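The species-counting estimates in this paragraph are simple enough to reproduce directly. The sketch below uses the rough value M pl ∼ 10 15 TeV quoted earlier in this subsection and drops all order-one factors; the function name is ours.

```python
import math

M_PL_TEV = 1e15  # Planck mass in TeV, the rough value quoted in the text


def strong_coupling_scale_tev(species):
    """Scale M_pl / sqrt(N) above which gravity becomes strongly coupled."""
    return M_PL_TEV / math.sqrt(species)


for species in (1e32, 1e60):
    cutoff = strong_coupling_scale_tev(species)
    print(f"N = {species:.0e}: M_pl / sqrt(N) ~ {cutoff:.0e} TeV")

print(f"(TeV / M_pl)^4 ~ {(1.0 / M_PL_TEV) ** 4:.0e}")
```

With these inputs, N ∼ 10 32 already brings the strong-coupling scale down to a fraction of a TeV, while N ∼ 10 60 would push it to about 10 −15 TeV, that is, around 10 −3 eV. The same input gives (TeV/M pl ) 4 ∼ 10 −60 , so a multiplicity of 10 60 species compensates the gravitational suppression of the production rate.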
Other softening mechanisms may modify other ingredients in figure (a). For example new higher-spin states could be exchanged between Standard Model fields and the higher-spin particle. But these could then be produced directly with a four-derivative graviton-strength coupling as in (c): detectable if M ≲ 10 keV. This amplitude could again be softened, but again the cure is worse than the disease since this almost certainly requires new states exchanged on the horizontal propagator, leading to string-like resonances (d) as discussed above. These estimates, if correct, challenge the notion that modifications of gravity at large scales could be hidden from colliders. We hope that more robust and model-independent statements will be obtained in the future.

Conclusions
In this paper, we analyzed constraints on low-energy graviton-graviton scattering, assuming that causality and other basic principles apply at all energies. Theoretically, graviton scattering is an ideal way to probe potential modifications of Einstein's theory of gravity. At the same time, causality of a scattering process translates into well-understand mathematical properties like analyticity. This implies Kramers-Kronig-type dispersion relations, which express low-energy observables in terms of unknown but positive absorption probabilities at high energies, and which we have systematically analyzed.
Our key result (4.4) is simple to state: if a higher-derivative correction were measured, corresponding to $\mathcal{L} = \frac{1}{16\pi G}\left(R + r_0^4\,{\rm Riem}^3 + \ldots\right)$, and if Nature respects causality as we understand it, then a spin-4 particle must exist whose Compton wavelength is at least as long as the length $r_0$: $M^{-1} > r_0$.
The notion that higher-derivative corrections can come from heavy particles may not surprise an effective field theorist. The scaling with $M_{\rm pl}$ of our bounds is, however, interesting. An effective field theorist might argue that the apparent breakdown of the derivative expansion at the length $r_0$ suggests the existence of new states with mass $\sim r_0^{-1}$ or lighter, but their friend could have objected that the states could be much heavier: the required three-point couplings respect unitarity bounds as long as $M \lesssim (M_{\rm pl}/r_0^2)^{1/3}$. We excluded the second option: new higher-spin states must exist with mass $M \lesssim r_0^{-1}$. The $M_{\rm pl}$ scaling of our bounds means that heavy states couple to two gravitons with a strength that never exceeds the three-graviton coupling of Einstein's theory.
This means that the effective field theory approach to gravity can never describe modifications to Einstein's theory that are not parametrically small: in any situation where an EFT is justified at the distance $L$ ($ML \gg 1$), corrections are small ($r_0/L \ll 1$). Our bounds on cubic couplings are qualitatively similar to the parametric bounds obtained in [7], with the important novelty that we find sharp bounds with precise numerical coefficients. As discussed in section 3.5, this is done by imposing a quantum notion of causality based on commutators and crossing symmetry, rather than a classical one based on time advances. Technically, we apply dispersion relations to a physical region where transverse momenta are below the cutoff $M$ but energies are above $M$ (but still much less than $M_{\rm pl}$, so that tree-level diagrams give a good approximation). The finite momentum transfer also allows us to constrain local (quartic) and non-local (cubic) self-interactions on the same footing.
Our bounds assume that new states satisfy $M \ll M_{\rm pl}$. They are expected to be valid modulo relative corrections $\sim M^2/M_{\rm pl}^2$ from loop effects within the EFT. It would be interesting to explicitly compute some of these corrections.
We treated the graviton as exactly massless. By continuity, we expect our bounds to also control the scattering of transversely polarized gravitons with $m \ll M$. For massive gravitons, however, one anticipates much stronger constraints coming from the study of longitudinal polarizations, which do not have a smooth $m \to 0$ limit (see for example [32,76,77]).
Our results significantly reduce the gap observed in [45] between dispersive bounds and "theory space", as further discussed in section 4.4. Yet, it is noteworthy that our bounds are not saturated by known theories. This could indicate that stronger bounds are still possible; this is of course also suggested by the fact that all our bounds involving G contain infrared logarithms. It will be interesting to find a way to obtain infrared-safe bounds, and to further investigate the gap between theory space and dispersive bounds.
As explained in [36], dispersive bounds for scalar amplitudes in flat space can be lifted to AdS and imply corresponding bounds on OPE coefficients of holographic CFTs. Physically, the same must be true for graviton scattering as well. For example, (4.4) translates to a bound on three-point OPE coefficients of stress tensors in holographic 3d CFTs. In AdS$_4$, infrared logarithms are simply replaced by $\log(M R_{\rm AdS})$.
The notion of causality that we used in the paper applies to scattering processes in an otherwise empty and flat region of space. Physically, since we mainly probe energies of order $M$ and impact parameters $b \sim M^{-1}$, we expect our conclusions to remain valid in any spacetime in which a flat patch of radius $M^{-1}$ exists, including our own. (This is the reason for the agreement between bounds in flat space and AdS.) Evidently, studying energies above the EFT cutoff was essential to probe the mass and couplings of higher-spin states. An important question is whether some of our bounds can be understood from thought experiments within the EFT itself.
From the generic collider estimates given in section 4.5, assuming causality as we understand it, we find it difficult to imagine that higher-derivative terms could be visible at macroscopic distances. A better understanding of interactions between gravitational higher-spin states and matter should lead to more precise constraints.
In summary, experimental verifications of Einstein's theory test a basic question: can signals travel faster than the speed of light?
We can then convert to the center-of-mass frame where (A.10) Using this parameterization, eq. (A.9) can then be analytically summed to Wigner-d functions.
We will also be interested in the behavior at large J, taken with fixed impact parameter b = 2J/m, which is given by where J h−h (bp) on the right-hand side is a Bessel function of the first kind.
The following are also useful when expanding around the forward limit, which corresponds to expanding x around 1

B Amplitudes from matter and Kaluza-Klein exchanges
In this appendix, we record the contribution from light "matter" fields to the low-energy amplitudes (2.7). If the light matter fields have discrete masses m , these amplitudes are rational functions whose poles match the spin-0 and spin-2 contributions to the above partial waves: To get the last factor, for example, we used the J = 2 partial wave (2.16b) on the t = m 2 pole:d 2 0,0 1 + 2s Away from the pole, the spin-2 exchange (B.1) is unique up to contact interactions, which are, to the same derivative order, either linear in t or constant. (That is, different choices of writing (B.1) could shift the g 4 and g 5 coefficients in (2.7) by multiples of |g ++ 2 (m )| 2 .) The above choice will lead to the amplitude with the best possible Regge growth (spin-4 at fixed-s, and spin-2 at fixed-t), and to the simplest sum rules, but other choices would not significantly change the analysis.
In reality, since they can decay (in particular, to two gravitons), the light fields cannot give sharp poles but at best resonances; this simply replaces the sum by an integral: where the partial waves c ++ J,m 2 are normalized as in (2.16). Only same-helicity couplings g ++ 2 appear in the above due to angular momentum rules, which forbid a particle of spin < 4 from decaying to a (+−) two-graviton state. For the same reason, the single minus amplitudes do not receive contributions Finally, the all-plus amplitudes are It is important here that each numerator is quartic in Mandelstam invariants, so as to cancel the denominator in (2.3). Contact ambiguities have been fixed to avoid high-energy growth with Regge spin greater than 2.

C Details on numerics
In this appendix, we give details on our numerical implementation of dispersive bounds. We mostly follow [13], though some new complications arise when considering spinning external particles (as opposed to scalars). Most of the issues are not specific to 4d, so we phrase our discussion in general spacetime dimensions. A generic dispersive optimization problem takes the form (C.1). Aside from a few special cases at small $J$, the $\rho$'s fall into families where $J$ can be arbitrarily large, while $\lambda$ remains fixed. For each $\rho$-family, the heavy density $B_i[p^2, m^2, J, \lambda]$ is an $n \times n$ matrix, where $n$ (which depends on $\lambda$) is the number of three-point structures for two external particles producing an internal $\rho$-particle.
Following [13], we choose a polynomial basis for the wavefunctions $f_i(p)$, as in (C.2). Then (C.1) becomes a semidefinite program with decision variables $a_{i,n}$ and an infinite number of positivity constraints labeled by $m, \rho$. We truncate to a finite number of constraints using a combination of discretization and polynomial approximations. Specifically, we split $(m, \rho)$-space into the following four regimes:

Fixed J, fixed m. Choosing some $J_{\max}$, we impose positivity at the finite set of representations $\rho = [J, \lambda]$ with $J \le J_{\max}$ and masses $m^2 = 1/(1-x)$, where $x \in \{0, \delta x, 2\delta x, \ldots, \lfloor 1/\delta x \rfloor\, \delta x\}$. Our detailed parameter choices are listed in table 2.
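The mass discretization in the first regime can be sketched as follows. This is only an illustration, taking the last grid point to be $\lfloor 1/\delta x \rfloor\, \delta x$ and skipping any point with $x = 1$ (which would correspond to infinite mass); both choices, and the function name, are assumptions rather than details of the actual implementation.

```python
import math


def mass_squared_grid(delta_x):
    """Return m^2 = 1/(1 - x) for x = 0, dx, 2 dx, ..., floor(1/dx) dx.

    Since x lies in [0, 1), every m^2 is at least 1, i.e. masses are measured
    in units where the lightest heavy state has m = 1. A point with x = 1
    would correspond to infinite mass and is skipped here (assumption).
    """
    n_max = math.floor(1.0 / delta_x + 1e-12)  # guard against float round-off
    grid = []
    for k in range(n_max + 1):
        x = k * delta_x
        if x >= 1.0:
            break
        grid.append(1.0 / (1.0 - x))
    return grid
```

The map $x \mapsto 1/(1-x)$ places grid points densely near the threshold $m^2 = 1$ and increasingly sparsely at large mass, where the remaining regimes take over.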
Large J, fixed m. Unlike in the case of external scalar particles [13], we found that it is important to explicitly control the large-$J$ limit of our functionals at fixed $m$. To do so, we compute a series expansion of the integrated heavy density at large $J$:

$$\int_0^1 dp\, p^n\, B_i[p^2, m^2, J, \lambda] \sim \# J^{r} + \# J^{r-1} + \ldots \qquad \text{(C.3)}$$
Truncating this expansion to a finite number of terms, and multiplying by an appropriate power of $J$, we obtain a polynomial approximation for the heavy density as a function of $J$. We discretize $m$ as before and, for each $m$, impose positivity of the resulting matrix polynomial of $J$ on the interval $[J_{\max}, \infty)$.
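As a minimal sketch of this step, consider a single scalar entry rather than a matrix. A truncated expansion $c_0 J^r + c_1 J^{r-1} + \ldots + c_k J^{r-k}$ becomes a polynomial after multiplying by $J^{k-r}$, which is positive for $J > 0$ and so does not affect the sign. Positivity on $[J_{\max}, \infty)$ can then be rephrased as positivity on $y \ge 0$ via $J = J_{\max} + y$. The shift is an illustrative assumption about how a half-line constraint is commonly presented to a solver; the function names and any coefficients fed to them are hypothetical.

```python
def truncated_to_polynomial(coeffs):
    """Coefficients c_0..c_k of c_0 J^r + c_1 J^(r-1) + ... + c_k J^(r-k).

    Multiplying by J^(k-r) gives the polynomial c_0 J^k + ... + c_k, returned
    in ascending powers of J. The sign is unchanged for J > 0.
    """
    return list(reversed(coeffs))


def shift_polynomial(ascending, j_max):
    """Rewrite p(J) as q(y) with J = j_max + y, so J >= j_max maps to y >= 0."""
    shifted = []                     # ascending coefficients of q(y)
    for c in reversed(ascending):    # Horner: q <- q * (y + j_max) + c
        new = [0.0] * (len(shifted) + 1)
        for i, s in enumerate(shifted):
            new[i] += s * j_max
            new[i + 1] += s
        new[0] += c
        shifted = new
    return shifted
```

For instance, `shift_polynomial([0.0, 0.0, 1.0], 3.0)` expands $(y+3)^2$ into ascending coefficients. In the matrix-valued case each coefficient would be an $n \times n$ matrix, but the same two manipulations apply entry by entry.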
Large m, fixed $b = 2J/m$. An important set of positivity constraints comes from the impact parameter scaling limit $m \to \infty$ with fixed $b = 2J/m$. In this regime, the heavy densities behave as in (C.4), where $k_i$ is the Regge spin of the $i$-th sum rule. Consequently, only the lowest-spin sum rules contribute in this limit. Thus, in the scaling limit, we set all higher-spin sum rules to zero and replace the lowest-spin sum rules with their approximations (C.4). The matrices $J_{i,\lambda}(pb)$ have entries that involve Bessel functions and their derivatives. Following [13], we choose a cutoff $B$ and impose positivity at a discrete set of impact parameters below $B$ (C.5).

Large m, large b. As in [13], we must take care to impose positivity in the impact parameter scaling limit for $b > B$. Let us review the trick used there. For scalar scattering, the heavy density $C^{\rm imp}_{2,-p^2}[m^2, J]$ is simply a $1 \times 1$ matrix, with scaling limit (C.6). The key idea is to replace (C.6) with (C.7) and impose positivity for $b > B$ and arbitrary $\phi$, not just for $\phi = b - \frac{\pi(d-1)}{4}$. Although this is a stronger condition than positivity of (C.6) (and thus could lead to weaker bounds), we expect it to be a good approximation at large $b$, where $b - \frac{\pi(d-1)}{4}$ is a rapidly-varying phase relative to $A(b)$, $B(b)$, $C(b)$.
To see how to impose positivity of (C.7), we write it as in (C.8). Positivity of (C.8) for arbitrary $\phi$ is equivalent to the statement that $M(b)$ is positive-semidefinite, (C.9). We can now expand $M(b)$ at large $b$, and approximate it as a power of $b$ times a $2 \times 2$ matrix polynomial of $b$. We then impose positivity of this matrix polynomial for $b \ge B$.
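The equivalence can be made explicit with a half-angle identity. The following is a sketch under the assumption that the relaxed density takes the form $A(b) + B(b)\cos\phi + C(b)\sin\phi$, which is consistent with the half-angle vector $(\cos\frac{\phi}{2}, \sin\frac{\phi}{2})$ used in the matrix case; the explicit forms of (C.7)-(C.9) are not reproduced here, so the precise normalization is an assumption.

```latex
\begin{align}
A + B\cos\phi + C\sin\phi
 &= A\left(\cos^2\tfrac{\phi}{2} + \sin^2\tfrac{\phi}{2}\right)
  + B\left(\cos^2\tfrac{\phi}{2} - \sin^2\tfrac{\phi}{2}\right)
  + 2C\cos\tfrac{\phi}{2}\sin\tfrac{\phi}{2} \\
 &= \begin{pmatrix}\cos\tfrac{\phi}{2} & \sin\tfrac{\phi}{2}\end{pmatrix}
    \begin{pmatrix} A+B & C \\ C & A-B \end{pmatrix}
    \begin{pmatrix}\cos\tfrac{\phi}{2} \\ \sin\tfrac{\phi}{2}\end{pmatrix}.
\end{align}
```

As $\phi$ ranges over $[0, 2\pi)$, the half-angle vector sweeps every direction in $\mathbb{R}^2$ up to an overall sign, so nonnegativity for all $\phi$ holds exactly when the $2 \times 2$ matrix is positive-semidefinite, that is, when $A \ge \sqrt{B^2 + C^2}$.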
Let us now consider the case of interest for this work, where the heavy density is a matrix. In the impact parameter scaling limit, the integral against wavefunctions $f_i(p)$ has the same form as before, where now $A_\lambda(b)$, $B_\lambda(b)$, $C_\lambda(b)$ are $n \times n$ matrices ($n$ depends on $\lambda$). As before, we replace $b - \frac{\pi(d-1)}{4}$ with $\phi$ and seek to impose positivity for arbitrary $\phi$. In other words, we would like to impose the condition (C.12), where $M_\lambda(b)$ is a $2n \times 2n$ matrix of the form (C.9) with $n \times n$ block entries.
Note that we can freely rescale the vectors $v$ and $(\cos\frac{\phi}{2}, \sin\frac{\phi}{2})$ without changing the positivity conditions (C.12). Thus, we can think of $v$ as parametrizing a point $[v_1 : \cdots : v_n] \in \mathbb{RP}^{n-1}$, and $(\cos\frac{\phi}{2}, \sin\frac{\phi}{2})$ as parametrizing a point $[w_1 : w_2] = [\cos\frac{\phi}{2} : \sin\frac{\phi}{2}] \in \mathbb{RP}^1$. In this language, (C.12) is equivalent to imposing that $M_\lambda(b)$ is positive on the image of the Segre embedding

$$\sigma : \mathbb{RP}^1 \times \mathbb{RP}^{n-1} \to \mathbb{RP}^{2n-1}, \qquad \sigma : ([w_1 : w_2], [v_1 : \cdots : v_n]) \mapsto [w_1 v_1 : \cdots : w_1 v_n : w_2 v_1 : \cdots : w_2 v_n]. \qquad \text{(C.13)}$$

A theorem of [80] states that if a quadratic form $Q$ is nonnegative on a variety $X \subseteq \mathbb{RP}^{k-1}$ of minimal degree, then $Q$ is a sum of squares of linear forms (and hence represented by a positive semidefinite matrix on $\mathbb{R}^k$). $X$ has minimal degree if it is nondegenerate (not contained in a hyperplane) and $\deg(X) = 1 + {\rm codim}(X)$. Fortunately, the image of the Segre embedding $\sigma$ has precisely these properties: it is nondegenerate, has degree $n$, and has codimension $n - 1$.$^{15}$ Hence, we conclude that (C.12) is equivalent to the statement that $M_\lambda(b)$ is a positive-semidefinite $2n \times 2n$ matrix:

$$M_\lambda(b) \succeq 0. \qquad \text{(C.14)}$$

It is remarkable that such a simple condition captures the necessary positivity conditions, even for n = 1. We can now proceed as in the scalar case: we expand $M_\lambda(b)$ at large $b$ to $r_{\max}$ subleading orders and approximate it in terms of a matrix polynomial of $b$. We then impose positivity of this matrix polynomial for $b \ge B$.
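The role of the Segre embedding can be checked numerically. The sketch below assumes the block form $M_\lambda = \begin{pmatrix} A+B & C \\ C & A-B \end{pmatrix}$ with symmetric $n \times n$ blocks (an assumption about the explicit form of (C.9)), and verifies on random samples that evaluating $M_\lambda$ on the Segre vector $(w_1 v, w_2 v)$ reproduces $v^T (A + B\cos\phi + C\sin\phi)\, v$. All function names are hypothetical, and only the Python standard library is used.

```python
import math
import random


def quad_form(matrix, vec):
    """Return vec^T matrix vec for a square matrix given as nested lists."""
    return sum(vec[i] * matrix[i][j] * vec[j]
               for i in range(len(vec)) for j in range(len(vec)))


def block_matrix(a, b, c):
    """Assemble the 2n x 2n matrix [[A + B, C], [C, A - B]] from n x n blocks."""
    n = len(a)
    top = [[a[i][j] + b[i][j] for j in range(n)] + list(c[i]) for i in range(n)]
    bottom = [list(c[i]) + [a[i][j] - b[i][j] for j in range(n)] for i in range(n)]
    return top + bottom


def segre_vector(phi, v):
    """Image of ([cos(phi/2) : sin(phi/2)], [v]) under the Segre map."""
    w1, w2 = math.cos(phi / 2), math.sin(phi / 2)
    return [w1 * x for x in v] + [w2 * x for x in v]


def random_symmetric(n, rng):
    """Random symmetric n x n matrix with entries in [-1, 1]."""
    m = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    return [[(m[i][j] + m[j][i]) / 2 for j in range(n)] for i in range(n)]


def max_identity_error(n=3, trials=200, seed=0):
    """Largest mismatch between the two sides of the identity over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        a, b, c = (random_symmetric(n, rng) for _ in range(3))
        v = [rng.uniform(-1, 1) for _ in range(n)]
        phi = rng.uniform(0, 2 * math.pi)
        lhs = (quad_form(a, v) + math.cos(phi) * quad_form(b, v)
               + math.sin(phi) * quad_form(c, v))
        rhs = quad_form(block_matrix(a, b, c), segre_vector(phi, v))
        worst = max(worst, abs(lhs - rhs))
    return worst
```

This only confirms the algebraic identity on the image of the Segre map. The nontrivial statement, that nonnegativity on this image already forces positive-semidefiniteness on all of $\mathbb{R}^{2n}$, is the content of the theorem of [80] and is not tested here.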
Having imposed positivity of the heavy density in these four regimes (fixed m and J, large J and fixed m, large m and fixed b, and large m and b), we find that the resulting functionals are positive in practice for almost all m, J. Violations of positivity come from the functional becoming slightly negative between discretized values of m or b. Such violations can usually be fixed by perturbing the functional slightly (for example by including a small admixture of another nearly positive functional).
Our parameter choices are listed in table 2.