Soft-Pion Theorems for Large Scale Structure

Consistency relations -- which relate an N-point function to a squeezed (N+1)-point function -- are useful in large scale structure (LSS) because of their non-perturbative nature: they hold even if the N-point function is deep in the nonlinear regime, and even if they involve astrophysically messy galaxy observables. The non-perturbative nature of the consistency relations is guaranteed by the fact that they are symmetry statements, in which the velocity plays the role of the soft pion. In this paper, we address two issues: (1) how to derive the relations systematically using the residual coordinate freedom in the Newtonian gauge, and relate them to known results in $\zeta$-gauge (often used in studies of inflation); (2) under what conditions the consistency relations are violated. In the non-relativistic limit, our derivation reproduces the Newtonian consistency relation discovered by Kehagias & Riotto and Peloso & Pietroni. More generally, there is an infinite set of consistency relations, as is known in $\zeta$-gauge. There is a one-to-one correspondence between symmetries in the two gauges; in particular, the Newtonian consistency relation follows from the dilation and special conformal symmetries in $\zeta$-gauge. We probe the robustness of the consistency relations by studying models of galaxy dynamics and biasing. We give a systematic list of conditions under which the consistency relations are violated; violations occur if the galaxy bias is non-local in an infrared divergent way. We emphasize the relevance of the adiabatic mode condition, as distinct from symmetry considerations. As a by-product of our investigation, we discuss a simple fluid Lagrangian for LSS.


Introduction
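In studies of large scale structure (LSS), we are familiar with the consequence of linearly realized symmetries. For instance, consider a time-independent spatial translation:
$$\mathbf{x} \to \mathbf{x} + \boldsymbol{\epsilon},$$
where $\boldsymbol{\epsilon}$ is constant in time and space. For a small $\boldsymbol{\epsilon}$, we can think of its effect on the density field $\delta$ as:
$$\delta \to \delta + \Delta\delta, \qquad \Delta\delta = -\boldsymbol{\epsilon}\cdot\nabla\delta.$$
Notice how the change in the field $\Delta\delta$ depends linearly on the field, i.e. the symmetry is linearly realized. Its consequence for an N-point correlation function is well known:
$$\sum_{a=1}^{N} \boldsymbol{\epsilon}\cdot\nabla_a \langle \delta(\mathbf{x}_1)\cdots\delta(\mathbf{x}_N)\rangle = 0, \tag{3}$$
where $\nabla_a$ stands for $\partial/\partial\mathbf{x}_a$. This is nothing other than an infinitesimal form of the statement that the correlation function is translationally invariant: $\langle\delta(\mathbf{x}_1)\cdots\delta(\mathbf{x}_N)\rangle = \langle\delta(\mathbf{x}_1-\boldsymbol{\epsilon})\cdots\delta(\mathbf{x}_N-\boldsymbol{\epsilon})\rangle$. Assuming the initial conditions were translationally invariant, the correlation function at late times has the same property, since the dynamics itself is translationally invariant.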
What is somewhat less familiar in LSS is the consequence of nonlinearly realized symmetries. By this, we mean that the symmetry involves transforming the field of interest in a nonlinear way. In this paper, a field that transforms this way will turn out to be the velocity field $\mathbf{v}$, i.e.
$$\mathbf{v} \to \mathbf{v} + \Delta_{\rm lin.}\mathbf{v} + \Delta_{\rm nl.}\mathbf{v},$$
where $\Delta_{\rm lin.}\mathbf{v}$ depends linearly on the fluctuating variable $\mathbf{v}$, while $\Delta_{\rm nl.}\mathbf{v}$ does not depend on the fluctuating field, or any fluctuating field for that matter; in this sense, it might be more precise to call $\Delta_{\rm nl.}\mathbf{v}$ sublinear. An example is a time-dependent spatial translation: $\mathbf{x} \to \mathbf{x} + \mathbf{n}(\eta)$, where $\mathbf{n}$ depends on (conformal) time $\eta$ but not space. It is evident that under such a transformation, in addition to the usual linear transformation $\Delta_{\rm lin.}\mathbf{v} = -\mathbf{n}\cdot\nabla\mathbf{v}$, the velocity experiences a nonlinear shift:
$$\Delta_{\rm nl.}\mathbf{v} = \mathbf{n}', \tag{6}$$
where $'$ denotes a time derivative. It was pointed out recently by Kehagias & Riotto and Peloso & Pietroni [1,2] (KRPP) that just such a time-dependent spatial translation is in fact a symmetry of the familiar system of equations for a pressureless fluid coupled to gravity in the Newtonian limit: the continuity, Euler and Poisson equations. As pointed out by the same authors, the consequence of such a nonlinearly realized symmetry is not the simple invariance of a general correlation function (such as in Eq. 3), but rather a relation between an (N+1)-point function and an N-point function, schematically:
$$\lim_{\mathbf{q}\to 0}\frac{\langle \mathbf{v}_{\mathbf{q}}\,\mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle^{c'}}{P_v(q)} \sim \langle \mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle^{c'},$$
where $P_v$ is the velocity power spectrum, the superscript $c'$ denotes a connected correlation function with the overall delta function removed, and $\mathcal{O}$ denotes some observable, which could be different for each $\mathbf{k}$. This sort of consistency relation, which relates a squeezed (N+1)-point function to an N-point function, is well known in the context of single field inflation. The first example was pointed out by Maldacena [3] (see also [4]), schematically:
$$\lim_{\mathbf{q}\to 0}\frac{\langle \zeta_{\mathbf{q}}\,\zeta_{\mathbf{k}_1}\cdots\zeta_{\mathbf{k}_N}\rangle^{c'}}{P_\zeta(q)} \sim \langle \zeta_{\mathbf{k}_1}\cdots\zeta_{\mathbf{k}_N}\rangle^{c'},$$
where $\zeta$ is the curvature perturbation. It arises from a spatial dilation symmetry, which is nonlinearly realized on $\zeta$. Recently, more nonlinearly realized symmetries were uncovered, including the special conformal symmetry [5,6] and in fact a whole infinite tower of symmetries [7] (H2K hereafter). Recent work has emphasized the non-perturbative nature of these consistency relations as Ward/Slavnov-Taylor identities [7][8][9][10][11][12][13].
They can be viewed as the cosmological analog of the classic soft-pion theorems, which relate a scattering amplitude with N particles to another one with the same set of particles plus a soft pion.
In §2.2, we discuss the robustness and limitations of the consistency relations, and delineate the underlying assumptions concerning the galaxy dynamics and biasing. We mention in §2.3 a simple Lagrangian that reproduces the Newtonian LSS equations for dark matter, though this Lagrangian is not needed for the analyses in this paper. In §3, we place the KRPP consistency relation in a larger context: all consistency relations result from residual coordinate transformations (diffeomorphisms) allowed within a given gauge. We establish the connection between such symmetries in the $\zeta$ gauge and the Newtonian gauge. The KRPP symmetry is the sub-Hubble limit of one (or more precisely, two) of these symmetries. We conclude in §4. Readers not interested in the detailed arguments can find there a summary of our main results. A few technical results are collected in the Appendices.
A few words are in order on our notation and terminology. We use the symbol $\pi$ to represent the Nambu-Goldstone boson of a nonlinearly realized symmetry (the pion), in accordance with standard practice. For our LSS application, $\pi$ is the velocity potential. The same symbol is also used to denote the numerical value 3.14.... Which is meant should be obvious from the context. Essentially, the numerical $\pi$ always precedes the Newtonian constant $G$ in the combination $4\pi G$. In cases where both could potentially appear, we use $M_P^2 \equiv 1/(8\pi G)$ to avoid confusion. Also, we use the term nonlinear to refer to quantities that are not linear in the LSS observables (fields such as density or velocity). Sometimes, this has the usual meaning that such quantities go like the fields raised to higher powers: quadratic and so on.
But, sometimes, this means the quantities of interest do not depend on the field variables at all, such as the nonlinear part of certain symmetry transformations e.g. Eq. (6). We rely on the context to differentiate between the two.

Consistency Relation from Time-dependent Translation - a Newtonian Symmetry
In this section, we focus on the Newtonian symmetry uncovered by KRPP. §2.1 is a review of the symmetry and its implied consistency relation. In §2.2 we discuss the robustness and limitations of the consistency relation, and go over what assumptions can or cannot be relaxed, especially concerning the nonlinear, astrophysically messy, galaxy observables. We discuss what kind of galaxy dynamics, and what sort of galaxy selection, could lead to violations of the consistency relation. As a by-product of our investigation, we describe a simple Lagrangian for LSS in §2.3.

Time-dependent Translation Symmetry and the Background Wave Argument - a Review
We begin with a review of the Newtonian symmetry discovered by KRPP. We go over the background wave derivation of the consistency relation in some detail, emphasizing the underlying assumptions, and making the derivation easily generalizable to the general relativistic case. Two fundamental concepts are: (1) the existence of a non-linearly realized symmetry (one that shifts at least some of the LSS observables by an amount that is independent of the observables), and (2) an adiabatic mode condition, which is an additional condition that dictates the time-dependence of the symmetry.
Time-dependent Translation Symmetry. The set of Newtonian equations of motion for LSS is:
$$\delta' + \partial_i\big[(1+\delta)\,v^i\big] = 0, \qquad v^{i\prime} + v^j\partial_j v^i + H v^i = -\partial_i\Phi, \qquad \nabla^2\Phi = 4\pi G\bar\rho a^2\,\delta, \tag{9}$$
where $\delta$ is the mass overdensity, $' \equiv \partial/\partial\eta$ denotes the derivative with respect to conformal time $\eta$, $\partial_i$ denotes the derivative with respect to the comoving coordinate $x^i$, $v^i$ is the peculiar velocity $dx^i/d\eta$, $\Phi$ is the gravitational potential, $G$ is Newton's constant, $\bar\rho$ is the mean mass density at the time of interest, $a$ is the scale factor, and $H \equiv a'/a$ is the comoving Hubble parameter. The first equation expresses continuity or mass conservation. The second equation is the Euler equation or momentum conservation for a pressureless fluid. The third equation is the Poisson equation. Let us start with this basic set. We will later consider generalizations to include pressure, relativistic corrections, and even complex galaxy formation processes. KRPP pointed out that this system of equations admits the following symmetry:
$$\eta \to \tilde\eta = \eta, \quad x^i \to \tilde x^i = x^i + n^i, \quad v^i \to \tilde v^i = v^i + n^{i\prime}, \quad \Phi \to \tilde\Phi = \Phi - \big(H n^{i\prime} + n^{i\prime\prime}\big)x^i, \quad \delta \to \tilde\delta = \delta, \tag{10}$$
where $n^i$ is a function of time but not space. It can be shown that under this set of transformations, Eq. (9) takes on exactly the same form, with all the variables replaced by ones with a tilde on top. To see that this is true, it is important to keep in mind:
$$\frac{\partial}{\partial\tilde\eta} = \frac{\partial}{\partial\eta} - n^{i\prime}\frac{\partial}{\partial x^i},$$
where on the left, $\tilde x^i$ is held fixed, and on the right, $\partial/\partial x^i$ is at a fixed $\eta$, and $\partial/\partial\eta$ is at a fixed $x^i$. On the other hand, $\partial/\partial\tilde x^i = \partial/\partial x^i$. The symmetry transformation described by Eq. (10) is a time-dependent spatial translation. (Henceforth, we will occasionally refer to this somewhat sloppily as simply translation.) Under this translation, the velocity gets shifted in the expected manner. The gravitational potential needs to be shifted correspondingly to preserve the form of the Euler equation. The density $\delta$, on the other hand, does not change at all, in the sense that $\tilde\delta(\tilde{\mathbf{x}}) = \delta(\mathbf{x})$.
Eq. (10) is a symmetry regardless of the time-dependence of $n^i$. For the purpose of deriving the consistency relations, however, we need to impose an additional condition. Suppose we start with $v^i = 0$ in Eq. (10); we would like the velocity generated by the transformation, i.e. $\tilde v^i = n^{i\prime}$, to be the long wavelength limit of an actual physical mode that satisfies the equations of motion. We say long wavelength because $n^{i\prime}$ has no spatial dependence and so is strictly speaking a $q = 0$ mode ($q$ being the wavenumber/momentum in Fourier space). What we want to impose is this: if we take a physical velocity mode at a finite $q$, and make its $q$ smaller and smaller, we would like $n^{i\prime}$ to match its time-dependence. Following terminology used in general relativity, we refer to it as the adiabatic mode condition [25]. This condition ensures the symmetry transformation generates a velocity mode that evolves in a physical way. It is easy to see that at long wavelength, where the equations can be linearized, Eq. (9) can be combined into a single equation:
$$\partial_i\left[\left(\frac{v^{i\prime} + H v^i}{4\pi G\bar\rho a^2}\right)' - v^i\right] = 0, \tag{12}$$
or written in a more familiar form:
$$\delta'' + H\delta' - 4\pi G\bar\rho a^2\,\delta = 0.$$
One might be tempted to say a velocity going like $n^{i\prime}$ satisfies Eq. (12) trivially for an $n^{i\prime}$ of arbitrary time-dependence, since $n^{i\prime}$ has no spatial dependence. What we want, however, is for $n^{i\prime}$ to have the same time-dependence as that of a velocity mode at a low, but finite momentum. In other words, we impose the adiabatic mode condition:
$$n^{i\prime\prime} + H n^{i\prime} - 4\pi G\bar\rho a^2\, n^i = 0,$$
i.e. $n^i(\eta)$ has the same time-dependence as the linear growth factor $D(\eta)$, assuming growing mode initial conditions. Effectively, we demand that our symmetry-generated velocity shift (or more precisely, the nonlinear part thereof) satisfy Eq. (12) with the spatial gradient removed.
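As a quick numerical illustration of the adiabatic mode condition, the sketch below checks which power laws $n \propto \eta^p$ satisfy $n'' + Hn' - 4\pi G\bar\rho a^2 n = 0$. It assumes an Einstein-de Sitter background, which the text does not specify: there $a \propto \eta^2$, so $H = 2/\eta$ and $4\pi G\bar\rho a^2 = (3/2)H^2 = 6/\eta^2$. The function name `adiabatic_residual` and the sample time `eta=1.7` are hypothetical choices made for this illustration.

```python
def adiabatic_residual(p, eta=1.7):
    """Residual of n'' + H n' - 4 pi G rho_bar a^2 n for the trial n = eta**p.

    Assumption (not from the text): Einstein-de Sitter background,
    a ∝ eta^2, hence H = a'/a = 2/eta and 4 pi G rho_bar a^2 = 6/eta^2.
    """
    n = eta ** p
    n1 = p * eta ** (p - 1)            # first conformal-time derivative
    n2 = p * (p - 1) * eta ** (p - 2)  # second conformal-time derivative
    hubble = 2.0 / eta                 # comoving Hubble parameter
    source = 6.0 / eta ** 2            # 4 pi G rho_bar a^2
    return n2 + hubble * n1 - source * n


# p = 2 is the growing mode (D ∝ a), p = -3 the decaying mode;
# p = 1 is an arbitrary time-dependence that fails the condition.
for p in (2, -3, 1):
    print(p, adiabatic_residual(p))
```

Analytically the residual is $(p-2)(p+3)\,\eta^{p-2}$, so only the growing and decaying modes pass. The translation is a symmetry for any $p$, but only $n \propto D$ generates a shift that can be the soft limit of a physical growing-mode velocity.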
The Background Wave Argument. Next, we give the background wave derivation of the consistency relation. More sophisticated and rigorous derivations exist [7,8,[11][12][13], but the background wave argument has the virtue of being fairly intuitive. Our goal here is to go over the underlying assumptions, and formulate the argument in such a way as to ease later generalizations. The form of our expressions follows closely those in H2K [7]. Before we carry out the argument, it is convenient (especially for later discussions) to introduce the velocity potential $\pi$, assuming potential flow on large scales [footnote 2],
$$v^i = \partial_i\pi,$$
in which case, the symmetry transformation of Eq. (10) tells us:
$$\pi \to \tilde\pi = \pi + n^{i\prime}x^i.$$
The velocity potential $\pi$ has the hallmark of a pion or Nambu-Goldstone boson: it experiences a nonlinear shift under the symmetry transformation [footnote 3]. Note that $\pi$ also implicitly has a linear shift:
$$\pi(\mathbf{x}) \simeq \pi(\tilde{\mathbf{x}}) - n^i\partial_i\pi,$$
where we have Taylor expanded $\pi(\mathbf{x})$ to first order in $\tilde x^i - x^i$, assuming a small $n^i$. Here, linear and nonlinear refer to whether or not the transformation is linear in $\pi$ (or any other LSS fields). The total shift in $\pi$, i.e. $\tilde\pi(\mathbf{x}) - \pi(\mathbf{x})$, thus has both linear and nonlinear pieces:
$$\Delta\pi = \Delta_{\rm lin.}\pi + \Delta_{\rm nl.}\pi, \qquad \Delta_{\rm lin.}\pi = -n^i\partial_i\pi, \qquad \Delta_{\rm nl.}\pi = n^{i\prime}x^i. \tag{17}$$
Similarly, the overdensity changes by the amount $\tilde\delta(\mathbf{x}) - \delta(\mathbf{x})$:
$$\Delta_{\rm lin.}\delta = -n^i\partial_i\delta, \qquad \Delta_{\rm nl.}\delta = 0, \tag{18}$$
i.e. $\delta$ experiences no nonlinear shift.
Footnote 2. Assuming potential flow is not strictly necessary, as the argument can be made using the velocity $v^i$ itself in place of $\partial_i\pi$. The reason we make this assumption is that the velocity enters into our derivation mainly as a large scale or low momentum mode. Assuming the growing mode initial condition, the large scale velocity does take the form of a potential flow. Indeed, vorticity remains zero until orbit crossing. We do not assume potential flow on small scales.
Footnote 3. Note that $\Phi$ also experiences a nonlinear shift (Eq. 10), and thus can also be used as the pion in this derivation. The two give the same result, see §3.3.
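Consider an N-point function involving a product of N LSS observables. Let us denote each as $\mathcal{O}$, labeled by momentum, so that the N-point function is
$$\langle \mathcal{O}_{\mathbf{k}_1}\mathcal{O}_{\mathbf{k}_2}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle.$$
Here, the $\mathcal{O}$'s at different momenta need not be the same observable. For instance, one can be $\delta$, the other can be the gravitational potential, et cetera. They need not even be evaluated at the same time. We are interested in this N-point function in the presence of some long wavelength (soft) $\pi$. Let us imagine splitting all fluctuations into hard and soft modes, with $\mathbf{k}_1, ..., \mathbf{k}_N$ falling into the hard category. The N-point function obtained by integrating over the hard modes, but leaving the soft modes of $\pi$ unintegrated, can be Taylor expanded as:
$$\langle \mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle_{\pi_{\rm soft}} \approx \langle \mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle_0 + \int\frac{d^3p}{(2\pi)^3}\,\frac{\delta\langle \mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle}{\delta\pi^*_{\mathbf{p}}}\bigg|_0\,\pi^*_{\mathbf{p}},$$
where the subscript 0 denotes evaluation at vanishing soft $\pi$, and we have taken the functional derivative with respect to, and summed over the Fourier modes of, $\pi$ with soft momenta $\mathbf{p}$. Multiplying both sides by $\pi_{\mathbf{q}}$ (where $\mathbf{q}$ is also soft) and ensemble averaging over the soft modes, one finds
$$\frac{\langle \pi_{\mathbf{q}}\,\mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle}{P_\pi(q)} = \frac{\delta\langle \mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle}{\delta\pi^*_{\mathbf{q}}}\bigg|_0. \tag{20}$$
We have used the definition of the power spectrum: $\langle\pi_{\mathbf{q}}\pi^*_{\mathbf{p}}\rangle = (2\pi)^3\delta_D(\mathbf{q}-\mathbf{p})P_\pi(q)$, with $\delta_D$ being the Dirac delta function. We can on the other hand compute the derivative on the right hand side this way:
$$\int\frac{d^3p}{(2\pi)^3}\,\frac{\delta\langle \mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle}{\delta\pi^*_{\mathbf{p}}}\bigg|_0\,\Delta_{\rm nl.}\pi^*_{\mathbf{p}} = \Delta_{\rm lin.}\langle \mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle. \tag{21}$$
This statement says that the change to the N-point function induced by the symmetry transformation (the right hand side) is equivalent to the change to the N-point function by adding a long-wavelength background $\pi$ induced by the same symmetry (the left hand side). We will unpack it a bit more in §2.2.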
A careful reader might note that there is no reason why one should include on the right hand side only the linear part of the transformation of the N-point function. That is true: by including only the linear transformation, we are effectively dealing with the connected N-point function. For a proof, see H2K. Combining Eqs. (20) and (21), and adding the superscript $c$ for connected N-point function, we have:
$$\int\frac{d^3q}{(2\pi)^3}\,\frac{\langle \pi_{\mathbf{q}}\,\mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle^{c}}{P_\pi(q)}\,\Delta_{\rm nl.}\pi^*_{\mathbf{q}} = \Delta_{\rm lin.}\langle \mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle^{c}. \tag{22}$$
It is important to note that the connected correlation functions on both sides contain delta functions. This way of writing the consistency relation follows the Ward identity treatment of H2K, and is applicable to any symmetries with $\pi$ as the Nambu-Goldstone boson. The background wave argument has the advantage of being intuitive, but is a bit heuristic. Readers interested in subtleties can consult e.g. H2K.
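Our final result here matches theirs.
Let us apply Eq. (22) to the translation symmetry. To be specific, let us take our observable $\mathcal{O}$ to be the mass overdensity $\delta$. We use the following convention for the Fourier transform of some function $f$:
$$f(\mathbf{x}) = \int\frac{d^3k}{(2\pi)^3}\,f_{\mathbf{k}}\,e^{i\mathbf{k}\cdot\mathbf{x}}, \qquad f_{\mathbf{k}} = \int d^3x\,f(\mathbf{x})\,e^{-i\mathbf{k}\cdot\mathbf{x}}.$$
Eqs. (17) and (18) thus imply:
$$\Delta_{\rm nl.}\pi_{\mathbf{q}} = i\,n^{j\prime}\frac{\partial}{\partial q^j}\big[(2\pi)^3\delta_D(\mathbf{q})\big], \qquad \Delta_{\rm lin.}\delta_{\mathbf{k}} = -i\,n^j k^j\,\delta_{\mathbf{k}}, \qquad \Delta_{\rm nl.}\delta_{\mathbf{k}} = 0. \tag{24}$$
Substituting into Eq. (22), we find
$$n^{j\prime}(\eta)\,\lim_{\mathbf{q}\to 0}\frac{\partial}{\partial q^j}\left[\frac{\langle \pi_{\mathbf{q}}\,\delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c}}{P_\pi(q)}\right] = -\sum_{a=1}^{N} n^j(\eta_a)\,k_a^j\,\langle \delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c},$$
where $\eta$ is the time implicitly assumed for $\pi_{\mathbf{q}}$, and $\eta_a$ is the time for $\delta_{\mathbf{k}_a}$. Note that $k_a^j$ refers to the $j$-component of the vector $\mathbf{k}_a$. Since $n^j(\eta)$ has the time-dependence of the linear growth factor $D(\eta)$ (from the adiabatic mode condition), but can otherwise point in an arbitrary direction, we conclude:
$$\lim_{\mathbf{q}\to 0}\frac{\partial}{\partial q^j}\left[\frac{\langle \pi_{\mathbf{q}}\,\delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c}}{P_\pi(q)}\right] = -\sum_{a=1}^{N}\frac{D(\eta_a)}{D'(\eta)}\,k_a^j\,\langle \delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c}. \tag{26}$$
We have yet to remove the delta functions from both sides. To do so, we use the following, pure shift, symmetry:
$$\pi \to \pi + c(\eta), \qquad \Phi \to \Phi - \big(c' + Hc\big).$$
This symmetry does not involve transforming space-time at all, and so none of the observables receive a linear shift. The argument leading to Eq. (22) thus tells us
$$\lim_{\mathbf{q}\to 0}\frac{\langle \pi_{\mathbf{q}}\,\delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c}}{P_\pi(q)} = 0. \tag{28}$$
We use the superscript $c'$ to denote the connected correlation function with the overall delta function removed:
$$\langle \delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c} = (2\pi)^3\delta_D(\mathbf{k}_1+\cdots+\mathbf{k}_N)\,\langle \delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c'}.$$
Combining Eqs. (26) and (28), we have the Newtonian translation consistency relation:
$$\lim_{\mathbf{q}\to 0}\frac{\partial}{\partial q^j}\left[\frac{\langle \pi_{\mathbf{q}}\,\delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c'}}{P_\pi(q)}\right] = -\sum_{a=1}^{N}\frac{D(\eta_a)}{D'(\eta)}\,k_a^j\,\langle \delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c'}, \tag{30}$$
where $\eta$ is the time for the soft mode $\pi_{\mathbf{q}}$, and each $\eta_a$ is the time for the corresponding hard mode $\delta_{\mathbf{k}_a}$. This turns out to be a common feature for all consistency relations, as we will see: Eqs. (28) and (26) are two consistency relations that differ by one derivative with respect to $q$; the former allows us to remove the delta function from the latter in a straightforward way, and the result is Eq. (30). The form adopted by KRPP is to integrate the above over $\mathbf{q}$, and using Eq. (28), to obtain [footnote 5]:
$$\lim_{\mathbf{q}\to 0}\frac{\langle \pi_{\mathbf{q}}\,\delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c'}}{P_\pi(q)} = -\sum_{a=1}^{N}\frac{D(\eta_a)}{D'(\eta)}\,\mathbf{q}\cdot\mathbf{k}_a\,\langle \delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c'}. \tag{31}$$
The soft mode $\pi_{\mathbf{q}}$, by the linearized continuity equation, is related to $\delta_{\mathbf{q}}$ by [footnote 6]
$$\delta_{\mathbf{q}}(\eta) = \frac{D(\eta)}{D'(\eta)}\,q^2\,\pi_{\mathbf{q}}(\eta).$$
One can therefore rewrite Eq. (31) as [footnote 7]
$$\lim_{\mathbf{q}\to 0}\frac{\langle \delta_{\mathbf{q}}\,\delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c'}}{P_\delta(q)} = -\sum_{a=1}^{N}\frac{D(\eta_a)}{D(\eta)}\,\frac{\mathbf{q}\cdot\mathbf{k}_a}{q^2}\,\langle \delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c'}. \tag{33}$$
In the context of this Newtonian derivation, we would like to think of the form expressed in Eq. (30) and Eq. (28) as more fundamental, since it is $\pi$ that experiences a nonlinear shift in the symmetry transformation, acting as the pion, and since the expression in Eq. (33) contains two non-relativistic consistency relations, one trivial and one nontrivial. It is also worth stressing that Eq. (31) can contain $O(q^2)$ corrections, which Eq. (30) does not constrain, and Eq. (33) can contain $O(q^0)$ corrections.
Footnote 5. We are grateful to Lasha Berezhiani and Justin Khoury for helping us understand this point.
Footnote 6. In the context of a consistency relation which is purported to be non-perturbative, one might wonder if using the linear relation between $\delta$ and $\pi$ (for the soft mode only) is justified. It can be shown that including nonlinear corrections to this relation leads to terms subdominant in the squeezed limit of the correlation function. See [20].
Footnote 7. The use of the equation of motion within the connected correlation function does not lead to contact terms. See e.g. H2K and [11] on the role of contact terms in Ward identity arguments.
Let us close this derivation by observing that the only assumption made about the hard modes is how they transform under the symmetry (in particular, the linear part of their transformation; see Eqs. 18, 24). Thus, suppose we have some observable $\mathcal{O}$ whose linear transformation under the spatial translation is:
$$\Delta_{\rm lin.}\mathcal{O}_{\mathbf{k}} = -i\,n^j g^j\,\mathcal{O}_{\mathbf{k}}.$$
For instance, if $\mathcal{O}$ is the galaxy overdensity, we expect $g^j = k^j$, but $g^j$ could take other forms for other observables. Exactly the same derivation then gives the Newtonian translation consistency relation in a more general form:
$$\lim_{\mathbf{q}\to 0}\frac{\partial}{\partial q^j}\left[\frac{\langle \pi_{\mathbf{q}}\,\mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle^{c'}}{P_\pi(q)}\right] = -\sum_{a=1}^{N}\frac{D(\eta_a)}{D'(\eta)}\,g_a^j\,\langle \mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle^{c'},$$
where we have allowed the possibility that the N hard modes correspond to different observables, thus potentially a different $g_a^j$ for each $a = 1, ..., N$. Corollaries (analogs of Eqs. 31, 33) follow in the same way:
$$\lim_{\mathbf{q}\to 0}\frac{\langle \pi_{\mathbf{q}}\,\mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle^{c'}}{P_\pi(q)} = -\sum_{a=1}^{N}\frac{D(\eta_a)}{D'(\eta)}\,\mathbf{q}\cdot\mathbf{g}_a\,\langle \mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle^{c'} \tag{36}$$
and
$$\lim_{\mathbf{q}\to 0}\frac{\langle \delta_{\mathbf{q}}\,\mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle^{c'}}{P_\delta(q)} = -\sum_{a=1}^{N}\frac{D(\eta_a)}{D(\eta)}\,\frac{\mathbf{q}\cdot\mathbf{g}_a}{q^2}\,\langle \mathcal{O}_{\mathbf{k}_1}\cdots\mathcal{O}_{\mathbf{k}_N}\rangle^{c'}, \tag{37}$$
where it should be understood that Eq. (36) and Eq. (37) contain $O(q^2)$ and $O(q^0)$ corrections respectively. Phrased as such, the consistency relation is fairly robust: the detailed dynamics of the hard modes has no relevance; it matters not whether the corresponding observables are astrophysically messy or highly nonlinear. All we need to know is how they transform under a spatial translation. To understand this robustness better, it is helpful to study concrete examples, which is the subject of the next section.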
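The step from the real-space shifts to the momentum-space relation can be made explicit. The following is a sketch under two assumptions: the Fourier convention $f_{\mathbf{k}} = \int d^3x\, f(\mathbf{x})\,e^{-i\mathbf{k}\cdot\mathbf{x}}$, and the Ward-identity form of Eq. (22) used in H2K, $\int \frac{d^3q}{(2\pi)^3}\, G(\mathbf{q})\,\Delta_{\rm nl.}\pi^*_{\mathbf{q}} = \Delta_{\rm lin.}\langle\cdots\rangle^c$. The shorthand $G(\mathbf{q})$ is introduced here only for illustration. With the opposite sign in the exponent, the intermediate factors of $i$ flip sign but the final relations are unchanged.

```latex
\begin{align}
% shorthand for the soft-mode ratio appearing in Eq. (22)
G(\mathbf{q}) &\equiv \frac{\langle \pi_{\mathbf{q}}\,\delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c}}{P_\pi(q)}, \\
% Fourier transform of the nonlinear shift n^{j'} x^j
\Delta_{\rm nl.}\pi_{\mathbf{q}} &= n^{j\prime}(\eta)\int d^3x\; x^j e^{-i\mathbf{q}\cdot\mathbf{x}}
  = i\,n^{j\prime}(\eta)\,\frac{\partial}{\partial q^j}\big[(2\pi)^3\delta_D(\mathbf{q})\big], \\
% left-hand side: integrate by parts against the delta function
\int\frac{d^3q}{(2\pi)^3}\,G(\mathbf{q})\,\Delta_{\rm nl.}\pi^*_{\mathbf{q}}
  &= i\,n^{j\prime}(\eta)\,\frac{\partial G}{\partial q^j}\bigg|_{\mathbf{q}=0}, \\
% right-hand side: each hard mode shifts by -n.grad, evaluated at its own time
\Delta_{\rm lin.}\langle \delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c}
  &= -i\sum_{a=1}^{N} n^j(\eta_a)\,k_a^j\,\langle \delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c}, \\
% equate, and use n(\eta) \propto D(\eta) with arbitrary direction
\frac{\partial G}{\partial q^j}\bigg|_{\mathbf{q}=0}
  &= -\sum_{a=1}^{N}\frac{D(\eta_a)}{D'(\eta)}\,k_a^j\,\langle \delta_{\mathbf{k}_1}\cdots\delta_{\mathbf{k}_N}\rangle^{c}, \\
% soft-mode conversion from the linearized continuity equation \delta' = -\nabla^2\pi
\delta_{\mathbf{q}} = \frac{D}{D'}\,q^2\,\pi_{\mathbf{q}}
  \;&\Longrightarrow\;
  \frac{\langle \delta_{\mathbf{q}}\cdots\rangle}{P_\delta(q)}
  = \frac{D'(\eta)}{D(\eta)}\,\frac{1}{q^2}\,\frac{\langle \pi_{\mathbf{q}}\cdots\rangle}{P_\pi(q)} .
\end{align}
```

The last line is what converts the factor $D(\eta_a)/D'(\eta)\,\mathbf{q}\cdot\mathbf{k}_a$ of the $\pi$ form into the factor $D(\eta_a)/D(\eta)\,\mathbf{q}\cdot\mathbf{k}_a/q^2$ of the $\delta$ form.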

Robustness and Limitations of the Consistency Relation(s)
To understand better the robustness of the consistency relation, it is instructive to ask the question: when does it fail? As we will see, the consistency relation stands on three legs: the existence of the time-dependent translation symmetry, the single-field initial condition, and the adiabatic mode condition. All three are necessary in order for the consistency relation to hold. Here, we focus on the KRPP consistency relation as a specific example, but the points we raise are general, pertaining to other consistency relations (§3) as well.
1. Initial condition: the single-field assumption. A crucial step in the derivation is Eq. (21): that linear transformations of a collection of hard modes, represented by O k , can be considered equivalent to placing the same hard modes in the presence of a soft mode -the pion π q . That this single soft mode is sufficient to account for all the transformations of the hard modes is an assumption about initial conditions. In the context of inflation, the assumption is often phrased as that of a single field or a single clock. In our Newtonian LSS context, in addition to keeping only the growing modes, essentially the assumption is that of Gaussian initial conditions. 8 More precisely, one demands that the initial condition does not contain a coupling between soft and hard modes beyond that captured by Eq. (21). We will follow the inflation terminology and call this the single-field assumption.
2. Adiabatic mode condition. Another crucial ingredient in the derivation is the adiabatic mode condition, that the symmetry transformation have the correct time-dependence so that the nonlinear shift ∆ nl. π may be the long wavelength limit of an actual physical mode. We stress that this is an additional requirement on top of demanding a symmetry, with some non-trivial implications. To spell them out, it is useful to have a concrete example. Since the consistency relation is purported to be robust, in the sense that the hard modes can be highly nonlinear and even astrophysically complex, let us write down a system of equations that allow for these complexities: Here a labels the species: for instance, it can be dark matter, populations of galaxies, baryons and so on. R (a) represents a source term for the density evolution. For dark matter, we expect R (a) = 0 (barring significant annihilation or decay). For galaxies, R (a) quantifies the effect of galaxy formation and mergers. All particles are subjected to the same gravitational force plus a species-dependent force F (a) . The gravitational potential Φ is sourced by the total mass fluctuation: where δ T represents the effective total mass density fluctuation from all particles. A natural generalization of the (time-dependent) translational symmetry from the previous section would be: This means that R (a) and F (a) remain invariant under this symmetry. For instance, they are invariant if R and F depend only on δ, on the spatial gradient of v, or on the second gradients of Φ. R and F could even have explicit dependence on time η. What would violate invariance is if R or F depends on v with no gradients -that is, unless the dependence is of the form v (a) − v (b) (in which case the shifts in the velocities of the two different species (a) and (b) cancel out). 
Specializing to the case of galaxies, what this means is that the number density evolution and dynamics of the galaxies does not care about the absolute size of the velocity, but only about the velocity difference (either between neighbors, or between species). The only context in which the absolute size of velocity plays a role is through Hubble friction -this is the origin of the H dependent term in the nonlinear shift of Φ. In other words, Hubble friction aside, galaxy formation and dynamics is frame invariant, which seems a fairly safe assumption. For instance, dynamical friction or ram pressure, which no doubt exerts an influence on galaxies, should depend on velocity difference. Thus, let us assume Eq. (40) is a symmetry of our system -note that this is a symmetry regardless of the time-dependence of n i . As emphasized earlier, this is not enough to guarantee the validity of the consistency relation. To derive the consistency relation, n i must have the correct time-dependence: n i′ (the nonlinear shift in v i ) must match the time-dependence of a physical long-wavelength velocity perturbation. 9 This has to hold for all species, meaning the same n i′ matches the long-wavelength velocity perturbation of each and everyone of the species. In other words, all species should move with the same velocity on large scales. This leads to two subtleties, which are best illustrated by assuming an explicit form for F . Consider: The first term on the right of the expression for F represents some sort of pressure -c s is the sound speed -this would be relevant if the subscript (a) represents baryons at finite temperature. The second term represents some sort of friction that depends on the velocity difference between two species, with a coefficient β. The third term represents an additional fifth force, mediated by the scalar ϕ, with a coupling α. The scalar ϕ obeys a Poisson-like equation. 
In scalar-tensor theories, the tensor part of the theory mediates a universal gravitational force (described by the gravitational potential Φ), but the scalar need not be universally coupled: hence we allow the coupling α (a) to depend on the species (see e.g. [26,27]). The form for the additional force F proposed in Eq. (41) is fairly generic: counting derivatives, we can see that the pressure term goes like ∂δ, whereas the other two terms go like ∂ −1 δ. In terms of the symmetry transformation, one can see that is compatible with Eq. (40). Thus, we have a system (Eq. 38) that respects the translational symmetry spelled out in Eq. (40), even if many different kinds of forces are present, including non-gravitational or modified gravitational ones such as in Eq. (41). We wish to see how, despite the presence of the (time-dependent) translational symmetry, there can still be a breakdown of the consistency relation, due to obstructions in satisfying the adiabatic mode condition -that the velocity perturbations of all species should be equal on large scales.
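For orientation, the multi-species system described above can be sketched as follows. This is a reconstruction from the verbal description, not a quotation of Eqs. (38)-(41). The index placement, the signs and normalizations of the three force terms, and the form of the friction coefficient $\beta$ are all assumptions. The source term of the Poisson-like equation for $\varphi$ is not recoverable from the text and is left unspecified.

```latex
\begin{align}
% continuity with a species-dependent source R_{(a)} (formation, mergers)
\delta_{(a)}' + \partial_i\big[(1+\delta_{(a)})\,v^i_{(a)}\big] &= R_{(a)}, \\
% Euler equation: universal gravity plus a species-dependent force F_{(a)}
v^{i\prime}_{(a)} + v^j_{(a)}\partial_j v^i_{(a)} + H\,v^i_{(a)} &= -\partial_i\Phi + F^i_{(a)}, \\
% gravity is sourced by the total mass fluctuation
\nabla^2\Phi &= 4\pi G\bar\rho a^2\,\delta_T, \\
% assumed schematic force: pressure, inter-species friction, scalar fifth force
F^i_{(a)} &= -c_{s(a)}^2\,\partial_i\delta_{(a)}
            - \beta\,\big(v^i_{(a)} - v^i_{(b)}\big)
            - \alpha_{(a)}\,\partial_i\varphi, \\
% generalized time-dependent translation acting identically on all species
x^i \to x^i + n^i, \quad
v^i_{(a)} &\to v^i_{(a)} + n^{i\prime}, \quad
\Phi \to \Phi - \big(H n^{i\prime} + n^{i\prime\prime}\big)x^i, \quad
\delta_{(a)} \to \delta_{(a)}, \quad \varphi \to \varphi .
\end{align}
```

In this form the invariance is visible term by term: $R_{(a)}$ and $F_{(a)}$ contain only gradients of fields and velocity differences, so the common shift $n^{i\prime}$ drops out, while the $H$-dependent shift of $\Phi$ absorbs the Hubble friction acting on $n^{i\prime}$.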
2a. Soft dynamics constraint. In the long wavelength limit, one can ignore the pressure term compared to the other two terms in the expression for F (a) . Let us first focus on the fifth-force term. This term is at the same level in derivative as the normal gravitational force − ∇Φ, and thus both have to be taken into account on large scales. The problem with a long-range fifth-force is the non-universal coupling: if there is a different coupling α (a) for each kind of particles, the different species will move with different velocities even on large scales. This means no single n i′ can possibly generate long-wavelength velocity perturbations for all species. In other words, unless the soft (large-scale) dynamics obeys the equivalence principle, the consistency relation would be violated, as emphasized by [18,21]. We stress that, in our example, the violation of equivalence principle occurs without the violation of the translation symmetry described by Eq. (40). The fact that the consistency relation is not obeyed is entirely because of the failure to satisfy the adiabatic mode condition when the equivalence principle is violated. The friction term (second term on the right of Eq. 41), on the other hand, is compatible with the adiabatic mode condition -it simply vanishes if the velocities of different species are equal, and is therefore consistent with the large scale requirement that all species flow with the same velocity. To sum up, the soft dynamics constraint is: for the consistency relation to be valid, the dynamics on large scales must be consistent with all species moving with the same velocity.
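The obstruction can be seen in two lines. As a sketch, assume the fifth force enters the Euler equation of species $(a)$ as $-\alpha_{(a)}\partial_i\varphi$ (sign and normalization are assumptions), drop the pressure and friction terms at long wavelength, and linearize:

```latex
\begin{align}
% linearized large-scale Euler equation for species (a)
v^{i\prime}_{(a)} + H\,v^i_{(a)} &= -\partial_i\Phi - \alpha_{(a)}\,\partial_i\varphi, \\
% subtract the same equation for species (b): gravity cancels, the fifth force does not
\big(v^i_{(a)} - v^i_{(b)}\big)' + H\,\big(v^i_{(a)} - v^i_{(b)}\big)
  &= -\big(\alpha_{(a)} - \alpha_{(b)}\big)\,\partial_i\varphi .
\end{align}
```

Since $\partial_i\varphi$ is of the same derivative order as $\partial_i\Phi$, a relative velocity is sourced on large scales whenever $\alpha_{(a)} \neq \alpha_{(b)}$, and no single $n^{i\prime}$ can match every species. For a universal coupling the right-hand side vanishes, and any initial relative velocity decays as $1/a$.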
2b. Squeezing constraint. Let us next turn to the pressure term (first term on the right of Eq. 41). Since different species have different sound speeds, this also leads to differences in velocity flows. This is relatively harmless though, since the pressure term becomes subdominant on large scales. Thus, there is no problem with the adiabatic mode condition, which is really a condition on motions in the soft limit q → 0. The presence of pressure does lead to a practical limitation on the application of the consistency relation, however. The consistency relation is a statement about an (N+1)-point function in the squeezed limit q ≪ k 1 , ..., k N . There is the practical question of how small q has to be. An important requirement is: q must be sufficiently small such that the velocity perturbations of different species have the same time-dependence as that generated by a single n i′ . In the present context, it means q < H/c s , i.e. the length scale must be above the Jeans scale. 10 We refer to this as the squeezing constraint: the soft leg of the consistency relation must be sufficiently soft that any difference in force on the different species becomes negligible. This is worth emphasizing, because clearly dark matter and baryons are subject to different forces: while that does not by itself lead to the breakdown of the consistency relations, one has to be careful to make sure that the squeezed correlation function is sufficiently squeezed.
3. Galaxy-biasing. It is also instructive to approach the subject of consistency relation violation from the viewpoint of galaxy-biasing. What kind of galaxy-biasing would lead to the violation of consistency relation? We can only address this question in the perturbative regime, but it nonetheless provides some useful insights. Suppose the galaxy overdensity δ (a) (of type a) and matter density δ are related by: where b (a) is a linear bias factor (independent of momentum) and W (a) is a kernel that describes a general quadratic bias. To the lowest order in perturbation theory, it can be shown that the bispectrum between three types of galaxies a, b and c, at momenta q, k 1 , k 2 and times η, η 1 , η 2 respectively, is where P δ is the linear mass power spectrum -its two time-arguments signify the fact that the two δ's involved can be at different times. We are interested in the q → 0 limit: We have used and so on (appropriate only perturbatively). Comparing this expression with the consistency relation expressed in Eq. (37) (identifying g a with k a ), we see that the two agree if E, W (b) and W (c) can be ignored, and b (a) = 1. A number of comments are in order.
First, let us focus on the case with no galaxy biasing, so that Eq. (45) simply constitutes a perturbative check of the consistency relation for the mass overdensity i.e. Eq. (33). We see that the term E can be ignored compared to the terms kept only if the soft power spectrum is not too blue: assuming P (aa) (q) ∼ q n for small q, the validity of the consistency relation requires n < 3. This is a limitation on the consistency relation that is not often emphasized. In practice though, the realistic power spectrum has no problem satisfying this requirement.
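The unbiased perturbative check can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the paper's Eq. (44): it uses the standard tree-level bispectrum of perturbation theory with the symmetrized $F_2$ kernel (Einstein-de Sitter approximation), a hypothetical power-law spectrum $P_0(k) = k^n$ with $n = -1.5$, and arbitrary growth-factor values at the three times. It compares the squeezed bispectrum divided by $P_\delta(q)$ against the right-hand side of the KRPP form, $-\sum_a [D(\eta_a)/D(\eta)]\,(\mathbf{q}\cdot\mathbf{k}_a/q^2)\,\langle\delta_{\mathbf{k}_1}\delta_{\mathbf{k}_2}\rangle^{c'}$. All function names are made up for this example.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def f2(k1, k2):
    """Standard symmetrized second-order density kernel (EdS approximation)."""
    mu = dot(k1, k2) / (norm(k1) * norm(k2))
    r = norm(k1) / norm(k2)
    return 5.0 / 7.0 + 0.5 * mu * (r + 1.0 / r) + 2.0 / 7.0 * mu * mu


def p0(k, n=-1.5):
    """Hypothetical power-law linear spectrum, P0(k) = k**n."""
    return k ** n


def squeezed_ratio(eps, k1=(1.0, 0.5, 0.0), d=1.0, d1=0.8, d2=0.5):
    """Tree-level <delta_q delta_k1 delta_k2>'/P(q) over the KRPP right-hand side.

    d, d1, d2 are assumed growth factors at the times of q, k1, k2.
    They are deliberately unequal, since the 1/q pole cancels at equal times.
    """
    q = (eps, 0.0, 0.0)
    k2 = tuple(-a - b for a, b in zip(k1, q))  # momentum conservation
    pq, pk1, pk2 = p0(norm(q)), p0(norm(k1)), p0(norm(k2))
    # One term for each leg taken at second order in perturbation theory.
    bispec = (2 * f2(q, k2) * d * d1 ** 2 * d2 * pq * pk2
              + 2 * f2(q, k1) * d * d1 * d2 ** 2 * pq * pk1
              + 2 * f2(k1, k2) * d ** 2 * d1 * d2 * pk1 * pk2)
    lhs = bispec / (d * d * pq)        # divide by the soft power spectrum
    hard = d1 * d2 * pk1               # hard two-point function
    rhs = -((d1 / d) * dot(q, k1) + (d2 / d) * dot(q, k2)) / dot(q, q) * hard
    return lhs / rhs


for eps in (1e-2, 1e-3, 1e-4):
    print(eps, squeezed_ratio(eps))
```

The ratio approaches unity linearly in $q/k$, reflecting the $O(q^0)$ corrections to the relation. The third term in `bispec`, where the soft leg is the second-order one, corresponds to the kind of contribution the text calls $E$: relative to the $1/q$ terms it scales as $q^{3-n}$, which is why $n < 3$ is needed.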
Let us next consider the effects of galaxy biasing. The second point we would like to raise is that the soft mode must be kept unbiased. There are two reasons for this, one trivial, the other less so. The trivial reason is that the left hand side of the consistency relation has to be corrected by a factor of $b_{(a)}$, the linear bias factor for the soft mode. This is not a big problem: one can obtain an estimate of the linear bias and correct the consistency relation when comparing against observations of the galaxy bispectrum. The more non-trivial problem is the presence of the quadratic bias kernel $W_{(a)}$ in $E$. Consider for instance a local biasing model of the form: $\delta_{(a)} = b_{(a)}\delta + b_{(a),2}\,\delta^2/2$ in real space, where $b_{(a)}$ and $b_{(a),2}$ are constants, typically referred to as the linear and quadratic bias factors. In this case $W_{(a)} = b_{(a),2}/(2b_{(a)})$ has no momentum dependence, and so $E$ contains a contribution that goes like $q^{-n}$ for $P_{(aa)} \sim q^n$. This means one needs $n < 1$ for $E$ to be negligible compared to the terms we keep in the consistency relation. On the largest scales, $n$ approaches 1, though observations suggest it is slightly less than 1. On smaller scales (but still keeping $q \ll k_1, ..., k_N$), the relevant $n$ is on the safe side. Nonetheless, this perturbative check suggests that one should be careful in using a biased observable for the soft mode.
Henceforth, let us assume the soft-mode is unbiased but the hard modes are biased, in which case E is safely negligible in the squeezed limit as long as n < 3. The third point we wish to raise is that the validity of the consistency relation requires the hard modes be biased in a way that is not too infrared-divergent: W (b) (− q, k 1 ) and W (c) (− q, k 2 ) cannot contain terms that go like k 1 /q or k 2 /q i.e.
in the q → 0 limit. As mentioned above, the local biasing model typically assumed in LSS studies implies the kernels W (b) and W (c) are momentum independent, and is thus consistent with the consistency relation. It is worth emphasizing that the word local in local biasing is a bit misleading: it merely states that the galaxy density at a given point in real space is related to the mass density at the same point. In reality, galaxies form out of the collapse of larger regions, influenced by the tidal field of the environment: there are therefore good reasons to believe that galaxy biasing is at some level non-local, i.e. the galaxy density at a given point is affected by the mass density at other points. This non-locality is not non-locality in the field theory sense, in that there is nothing non-local in the dynamics, and the so-called non-local galaxy bias arises completely out of local processes. A violation of the consistency relation requires more than a non-local galaxy bias though. It requires the non-local biasing kernel to be infrared divergent. This does not appear to be so easily obtained in a random-walk halo-biasing model, even if tidal effects are taken into account [28]. One way it arises is in a model in which galaxies are born with a velocity bias, as pointed out by [29]: the quadratic kernel W (c) (or W (b) ), for some galaxy population with a velocity bias of b * v at birth (i.e. the galaxy velocity equals b * v times the dark matter velocity when the galaxy forms), is: where D * is the linear growth factor at the time of birth, and D is the growth factor at the time of interest. We display only the term that has a dipolar dependence on the angle between q and k, and have taken the late time limit. This has precisely the kind of infrared divergence in the q ≪ k limit which would invalidate the consistency relation.
It is interesting that this is also an example where we should have expected a violation of the consistency relation based on earlier arguments - the existence of a scale-independent velocity bias b * v means dark matter and galaxies do not flow in the same way, even on large scales. This violates the adiabatic mode condition, and so it is not a surprise that the consistency relation fails. Realistically, velocity bias is of course present at some level, but it is expected to approach unity on sufficiently large scales, unless the equivalence principle is violated [21]. As a general statement, we can say that a non-local galaxy bias that is more infrared-divergent than Eq. (46) is what one needs to violate the consistency relation. It is interesting to ask whether there are other ways to physically generate such a galaxy bias besides through equivalence principle violations. This naturally brings us to the issue of selection.
It is worth emphasizing that the galaxy bias is also partly a selection bias: one chooses to study galaxies of a certain luminosity, color, morphology or some other property of interest. The question is then: can one choose the galaxy sample in such a way as to violate Eq. (46)? What if one chooses galaxies based on their motions, for instance, selecting galaxies that have systematically large velocities? It would seem that we have introduced a velocity bias by hand, and thus a violation of the consistency relation. This is actually not a violation in the technical sense. Choosing galaxies based on their motions can be thought of as weighting the galaxies by velocities, i.e. δ g → δ g (1 + π). From the point of view of violating the consistency relation, it is most relevant to consider weighting by the large scale velocity. In that case, it is not surprising that one finds additional terms that diverge in the squeezed limit of the correlation function - this is because we have included in the correlation function additional soft modes that carry with them additional powers of 1/q. 11 be astrophysically messy observables, such as those associated with galaxies. The presence of pressure effects, multiple components, multiple-streaming, 13 star formation, supernova explosions, etc. does not lead to violations of the consistency relations, as long as the adiabatic mode condition - i.e. the soft dynamics constraint and the squeezing constraint - is satisfied. This is why the LSS consistency relations are interesting: they provide a reliable window into the non-perturbative, astrophysically complex regime.

A Simple Fluid Lagrangian for LSS
The time-dependent translation symmetry laid out above was justified at the level of the equations of motion. It would be useful to see the same at the level of the action. In this section, we provide the action that describes the dark (i.e. pressureless) matter dynamics under gravity. 14 We should stress that, for our discussion of the consistency relation, the action is not strictly necessary; the equations of motion are as good a guide to the symmetry. Moreover, the action we will write down concerns only dark matter; it does not cover realistic observables such as galaxies, while the consistency relation applies regardless of the complex astrophysics that might be present in such observables. Nonetheless, the dark matter action is useful for conceptual understanding. We provide it here for completeness, and connect it with a better-known fluid action in Appendix A. For simplicity, we assume potential flow; an extension to allow for vorticity should be straightforward, along the lines of [30]. Readers not interested in the action perspective can skip to §3 - the rest of the paper does not depend on this section.
Let us motivate the construction of the action by reducing the standard pressureless LSS equations (9) into a single equation for the velocity potential π. The Euler equation can be integrated once to give: where (∇π) 2 stands for ∂ i π∂ i π. 15 The Poisson equation then gives us: The continuity equation can thus be turned into a single equation for π: This is a complicated looking equation, but it is not too difficult to guess the form of the associated action: The overall normalization (and sign) is arbitrary from the point of view of reproducing the desired equation of motion, but is chosen to conform to a more general action discussed in Appendix A. It is straightforward to check that this action is invariant under the time-dependent translation symmetry discussed earlier, namely: The dynamics of the velocity potential π is completely fixed by this action. From this point of view, the π equation of motion has the interpretation of the continuity equation, if δ is defined by Eq. (49); the gravitational potential Φ is defined by Eq. (48) so as to reproduce the Poisson equation. With this understanding, the action takes a fairly simple form: i.e. the Lagrangian is the difference between what resembles potential energy and kinetic energy, though with an unexpected overall sign, which can be understood from the larger context of a fluid with pressure (see Appendix A).
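The reduction described above can be sketched as follows. This is a reconstruction under assumptions, not a copy of the numbered equations: we assume the standard pressureless fluid equations in conformal time, a velocity v = ∇π, and that any purely time-dependent integration constant has been absorbed into π. Signs and normalizations may differ from the conventions of Eqs. (47)-(49).

```latex
% Assumed conventions: conformal time \eta, H = a'/a, v_i = \partial_i \pi.
\begin{align}
  % Euler equation, integrated once (defines \Phi in terms of \pi):
  \pi' + H\pi + \tfrac{1}{2}(\nabla\pi)^2 &= -\Phi , \\
  % Poisson equation (defines \delta in terms of \Phi, hence of \pi):
  \nabla^2 \Phi &= 4\pi G a^2 \bar\rho\, \delta
  \quad\Longrightarrow\quad
  \delta = -\frac{1}{4\pi G a^2 \bar\rho}\,
           \nabla^2\!\left[\pi' + H\pi + \tfrac{1}{2}(\nabla\pi)^2\right] , \\
  % Continuity equation, which becomes a single equation for \pi
  % once \delta is substituted from the line above:
  \delta' + \nabla\cdot\left[(1+\delta)\,\nabla\pi\right] &= 0 .
\end{align}
```

Read this way, Φ and δ are derived quantities, and the only dynamical statement left is the continuity equation, which matches the interpretation given in the text.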

Consistency Relations from Diffeomorphisms -General Relativistic Symmetries
The time-dependent translation symmetry noted by KRPP (Eq. 10) appears to be a global symmetry of the Newtonian LSS equations. (Or, more generally, the time-dependent translation as described by Eq. 40 is a symmetry of the equations of motion for dark matter and galaxies.) Our goal in this section is to place it in a larger context: the claim is that this symmetry is actually part of a diffeomorphism in the context of general relativity. This perspective is useful for two reasons: first, it helps us make contact with the earlier work on consistency relations in inflation, which are based on diffeomorphism invariance; second, diffeomorphism invariance allows us to systematically write down further consistency relations. The earlier work generally uses the ζ-gauge, alternatively referred to as the unitary or comoving gauge. On the other hand, in LSS studies, the Newtonian gauge is the more natural one to use. Here, we take advantage of the fact that the full list of consistency relations is already known in the unitary or ζ-gauge [7], and transform each known symmetry in ζ-gauge into a symmetry in the Newtonian gauge. This way, we will obtain an infinite tower of consistency relations in the Newtonian gauge. We emphasize that we could equally well proceed by directly working in the Newtonian gauge, and obtain the same results (see [16] on the dilation and special conformal consistency relations obtained this way). One might wonder why writing down consistency relations in the Newtonian gauge is useful if we already know what they are in the unitary gauge. It has to do with the taking of the Newtonian limit, a subject we will discuss later in this section, and in §4.
In the interest of generality, we allow the presence of multiple components of which pressureless matter/dust is one. We assume adiabatic initial conditions in the sense that all components fluctuate in the same way in the long wavelength limit: in particular their velocity potentials coincide in this limit. We give in §3.1 the general prescription for transforming symmetries known in the unitary gauge to symmetries in Newtonian gauge. In §3.2 we focus on the dilation and the special conformal symmetries, which are the symmetries that generate only scalar modes, and we show how the KRPP Newtonian consistency relation arises as the sub-Hubble limit of the latter. We comment on the robustness and limitations of the consistency relations in §3.3, adding a relativistic twist to some of the comments made earlier. We also discuss the taking of the Newtonian/sub-Hubble limit. We close with §3.4 on further consistency relations that form an infinite tower -they generally involve the tensor modes. We comment on why there is no useful sub-Hubble limit in these cases.

Symmetry Transformations from Diffeomorphisms
Here we are interested in symmetry transformations coming from residual gauge/coordinate transformations (i.e. diffeomorphisms) that are allowed even after we have applied the usual gauge-fixing. In the context of inflation, a common gauge is the unitary or ζ-gauge: where we have omitted the time-time and time-space components of the metric which are obtainable from the given space-space parts by solving the Hamiltonian and momentum constraints. Here, ζ represents the scalar perturbation and the transverse traceless γ ij represents the tensor perturbation. Vector perturbations are ignored because they are not generated by single field models (a brief discussion of vector modes can be found in Appendix D). The equal time surface is chosen so that the matter field, which we have called φ, has no spatial fluctuation. For our application, there can in general be multiple components, in which case δφ is chosen to vanish for one of them. To be concrete, let us choose this to be the dark matter fluid, i.e. we model it as a fluid described by a Lagrangian of the form P (X), where P is some function of X ≡ −(∂φ)^2. The velocity potential π is related to φ by δφ = φ − φ̄ = −φ̄′ π, where φ̄ is the background, and φ̄′ is its conformal time derivative (see Appendix A). 16 The full list of residual diffeomorphisms that respect the unitary gauge is worked out in H2K. Since the unitary gauge is a complete gauge-fixing for diffeomorphisms that vanish at spatial infinity, the residual diffeomorphisms must be those that do not vanish at infinity. They take the form: No time-diffeomorphism is allowed since that would violate the δφ = 0 (or π = 0) unitary gauge condition, and the allowed spatial diffeomorphism, which we refer to as ξ i unit. , goes like x^n, where n = 1, 2, .... We will give explicit expressions for ξ i unit. later.
They satisfy: 17 scalar + tensor symmetries : This set of symmetries contains subsets that only generate (nonlinearly) scalar modes, and subsets that only generate tensor modes: The spatial diffeomorphism ξ i unit. can be considered to be time-independent. 18 In LSS studies, it is more common to employ the Newtonian gauge instead: where we no longer impose π = 0, Φ and Ψ are the scalar modes, the transverse traceless γ ij denotes the tensor modes as before, and the divergence-free S i represents the vector modes (which are set to zero in this paper). Here, we work perturbatively in the metric perturbations, since the Newtonian-gauge metric perturbations are expected to be small even in the highly nonlinear regime where the density fluctuation δ is large, and including higher order metric perturbations corrects the consistency relations by negligible amounts. 19 Under a small diffeomorphism ξ µ , the nonlinear transformations of the metric fluctuations are: 20 Given each symmetry in the unitary gauge, it is straightforward to deduce the corresponding symmetry in the Newtonian gauge. Let us break it down into a number of steps. First, we begin with the metric in Newtonian gauge, where π = π 0 ≠ 0. We assume Ψ = Φ, in the absence of anisotropic stress. 21 To convert to the unitary gauge, we apply a time-diffeomorphism ξ 0 = −π 0 to make the scalar field φ spatially homogeneous. Second, we apply the known unitary-gauge symmetry transformation ξ i = ξ i unit. . Third, we wish to return to Newtonian gauge. The first and second steps in general make Ψ ≠ Φ. To restore equality, we apply an additional time-diffeomorphism ξ 0 = π 0 + ξ 0 add. . We also need to ensure g 0i = 0 (no vector modes 22 ), and thus an additional spatial diffeomorphism ξ i add. may be necessary. It is shown in Appendix B that the requisite additional time- and space-diffeomorphisms are:

18 Adiabatic mode conditions in the unitary gauge actually make ξ i unit. time-dependent in general. As shown in H2K, its time-independent part alone is sufficient to deduce the consistency relations. We will implement the adiabatic mode conditions separately in the Newtonian gauge computation.

19 See footnote 32 for a more detailed discussion of this point.

20 The net (linear + nonlinear) transformation of the metric is given by

21 This is an adiabatic mode condition in the Newtonian gauge. See discussions in Appendix B.

22 The absence of vector modes is assumed in two places. Assuming ∇^2 ξ i unit. + ∂ i (∂ · ξ)/3 = 0 means there is no vector mode in the spatial part of the metric. In addition, our choice of ξ µ add. ensures there is no vector mode in the space-time part of the metric either.
Here, D is the linear growth factor satisfying the following equation: D ′′ + 2HD ′ − c = 0, where c is a constant (independent of time and space). In other words, the following diffeomorphism is a symmetry of Newtonian gauge: where ξ i unit. is the residual (time-independent) diffeomorphism allowed by the unitary gauge (Eqs. 57 & 58). Furthermore, it can be shown that this diffeomorphism satisfies the adiabatic mode conditions, i.e. the perturbations that are nonlinearly generated match the time-dependence of very soft (growing) physical modes. This is why the linear growth factor D appears in the diffeomorphism. The derivation is given in Appendix B. (The attentive reader might wonder why the linear growth factor D - a quantity that shows up in the Newtonian discussion of sub-Hubble perturbations - appears also in a general relativistic discussion, and how Eq. (62) is related to the more familiar growth equation (Eq. 14). This is discussed in Appendix C.) An important underlying assumption is that all fluid components move with the same velocity in the soft limit. Under this assumption, it is shown in Appendix C that the velocity, or velocity potential π, evolves as: In the context of a general relativistic discussion, this statement (strictly speaking) holds in the super-Hubble limit q ≪ H. What is interesting is that for the π q of pressureless matter, this statement holds also for sub-Hubble (but linear) scales. It is this fact that makes an interesting Newtonian consistency relation possible. 23 For the purpose of deducing the consistency relations, we also need to know how other LSS observables transform under a diffeomorphism. From the way a scalar should transform, one can see the velocity potential π ≡ −δφ/φ̄′ should transform by We will mostly need only the nonlinear part of the π transformation. As emphasized above, the assumption of potential flow is not strictly necessary.
The nonlinear transformation of the velocity can also be deduced by transforming the 4-velocity U µ : 24 Another LSS observable of interest is the mass density fluctuation δ. Its transformation is:

23 The fact that the soft π is proportional to D ′ is nicely consistent with ξ 0 ∝ D ′ , since ∆ nl. π = ξ 0 (see Eq. 65).

24 One can use U µ = (1 − Φ, v i )/a, valid to the lowest order in velocity and perturbations, with the understanding that v i = dx i /dη. In this paper, by relativistic effects, we are generally interested in effects on super-Hubble scales as opposed to effects associated with high peculiar velocities.
One could set −ρ̄′/ρ̄ = 3H for ρ̄ that redshifts like pressureless matter, but we will keep the discussion general. The generalization to the galaxy density fluctuation δ g (or the fluctuation of any component) is immediate: where ρ̄ g is the mean galaxy number density. In both cases, the linear part of the transformations would more closely resemble what one expects for a scalar if we consider δρ = ρ̄δ instead of δ:

Scalar Consistency Relations
Let us first derive the consistency relations that involve only scalar modes, i.e. where only scalar modes are nonlinearly generated. Recall from §3.1 that the scalar symmetries take the form: with ξ i unit. and ξ µ add. satisfying: where D is the linear growth factor obeying D ′′ + 2HD ′ − c = 0, with c being a constant. As discussed before, since the unitary gauge is a complete gauge-fixing for diffeomorphisms that vanish at spatial infinity, the residual diffeomorphism of interest must be one where ξ i unit. does not vanish at infinity. Following H2K, we can express ξ i unit. as a power series: where each M_{i ℓ0...ℓn} represents a constant coefficient, symmetric in its last n + 1 indices. As pointed out by [5,6], the only scalar symmetries are those associated with n = 0: ξ i unit. ∼ x (dilation) and n = 1: ξ i unit. ∼ x^2 (special conformal transformation).
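As an illustrative check that D ′′ + 2HD ′ can indeed be a time-independent constant, the sketch below evaluates it numerically in one simple background. The background is an assumption made for illustration only: an Einstein-de Sitter universe with a ∝ η^2, so that H = a′/a = 2/η and the growing mode is D ∝ a. The function names are hypothetical, and the normalization of D (and hence the value of c) is arbitrary.

```python
# Sketch: check that D'' + 2 H D' is constant in conformal time.
# Assumptions (not taken from the text): Einstein-de Sitter background,
# a ∝ eta^2, H = a'/a = 2/eta, growing mode D = eta^2 (arbitrary norm).


def conformal_hubble(eta):
    """Conformal Hubble rate H = a'/a for a ∝ eta^2."""
    return 2.0 / eta


def growth_factor(eta):
    """Growing-mode linear growth factor, D ∝ a ∝ eta^2."""
    return eta ** 2


def growth_constant(eta, h=1e-3):
    """Return D'' + 2 H D' at conformal time eta via central differences."""
    d_plus = growth_factor(eta + h)
    d_mid = growth_factor(eta)
    d_minus = growth_factor(eta - h)
    d_prime = (d_plus - d_minus) / (2 * h)
    d_double_prime = (d_plus - 2 * d_mid + d_minus) / h ** 2
    return d_double_prime + 2 * conformal_hubble(eta) * d_prime


if __name__ == "__main__":
    # The printed values should agree with each other (close to 10 for
    # this normalization), independent of eta.
    for eta in (0.5, 1.0, 3.0):
        print(eta, growth_constant(eta))
```

With this normalization D ′′ = 2 and 2HD ′ = 8, so c = 10 at every η; rescaling D rescales c, which is why only the combination D/c and D ′/c enters the diffeomorphism.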

The Dilation Consistency Relation
Dilation is described by ξ i unit. = λx i where λ is a constant. Plugging this into Eq. (72) tells us ξ 0 add. = −(λ/c)D ′ and ξ i add. = 0. In other words, the net residual diffeomorphism in Newtonian gauge is where λ is a constant. This symmetry involves a spatial dilation + an accompanying time translation, with the two related by a differential equation: ǫ ′ + 2Hǫ + λ = 0. We will refer to the resulting consistency relation simply as the dilation consistency relation, even though the symmetry involves more than spatial dilation. To deduce the associated consistency relation, we employ Eq. (22). Two pieces of information are needed to use it. One is the nonlinear shift of π in Fourier space, obtained by taking the Fourier transform of Eq. (65): The other piece of information we need is the linear transformation of the high momentum observable(s).
Here, let us use the density fluctuation δ k as the observable at high momentum. By a Fourier transform of Eq. (67), we find Plugging these two pieces into the master equation (22), we see that where the time-dependence should be understood as follows: the soft q mode is evaluated at time η, while the hard mode k a is evaluated at time η a , meaning each hard mode can be at a different time. This is why the ǫ on the left is at time η - it is associated with the nonlinear shift in π and therefore the soft mode - and the ǫ's on the right are evaluated at the respective η a , since each is associated with the linear transformation of the corresponding hard mode. The connected N- and (N+1)-point functions on both sides contain the momentum conserving delta function. Its removal requires some care since the derivatives with respect to momentum on the right hand side act on the delta function: which can be established by rewriting the delta function as (2π)^(−3) ∫ d^3x e^{i(k_1 + ... + k_N)·x}, and integrating by parts. Thus, removing the delta function on both sides, with ⟨...⟩_c′ representing the connected correlation function without δ D , we have 25 with the understanding that ǫ and (the constant) λ are related by Eq. (74), and where the N-point function depends on the time associated with each of the N modes. Using the relation, we can rewrite the dilation consistency relation as with the understanding that c = D ′′ + 2HD ′ is a constant. It is trivial to generalize the consistency relation by changing the hard modes from δ for the mass density to δ g for the galaxy density: simply change the mean mass density ρ̄ on the right hand side to the mean galaxy number density ρ̄ g . This consistency relation can be further rewritten in different forms. We will postpone this discussion until after we discuss the special conformal consistency relation.
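As a quick consistency check of the relation between the time shift and the dilation parameter quoted in this subsection, one can verify that ξ 0 add. = −(λ/c)D ′ solves ǫ ′ + 2Hǫ + λ = 0. The sketch below assumes only that ǫ denotes the net time shift, ǫ ≡ ξ 0 = ξ 0 add. , and that D ′′ + 2HD ′ = c with c constant.

```latex
% Assumption: \epsilon \equiv \xi^0 = \xi^0_{\rm add.} = -(\lambda/c)\, D'.
\begin{align}
  \epsilon' + 2H\epsilon
    &= -\frac{\lambda}{c}\left(D'' + 2 H D'\right) \\
    &= -\frac{\lambda}{c}\, c \;=\; -\lambda ,
\end{align}
% so that \epsilon' + 2H\epsilon + \lambda = 0, as stated.
```

The constancy of c is what allows a time-independent λ on the spatial side to be matched by a time-dependent ǫ on the temporal side.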

The Special Conformal Consistency Relation -Containing the Newtonian Translation Consistency Relation
Next, we consider the special conformal transformation: where b is a constant vector. Plugging this into Eq. (72), we see that the requisite accompanying diffeomorphism is: Putting everything together, we see that the symmetry is: where b is a constant vector. We refer to the implied consistency relation as the special conformal consistency relation, even though the full symmetry transformation involves a time diffeomorphism and a spatial translation in addition to the special conformal transformation -these transformations are related via n i′′ + 2Hn i′ + 2b i = 0. As we will see, the Newtonian translation consistency relation is contained in here.
Once again, we employ the master equation (Eq. 22), for which we need the nonlinear transformation of the velocity potential π and the linear transformation of the hard modes -as in the case of dilation, we choose the observable to be the density fluctuation δ for the hard modes. Under the current symmetry transformation, we have: Substituting the above into Eq. (22), we see that the left hand side (LHS) is The right hand side (RHS) is The connected N-point function δ k 1 ...δ k N c contains an overall momentum conserving delta function. The momentum derivative acts non-trivially on it. To simplify, it is useful to know: where k tot ≡ k 1 + ... + k N , and which can be proved by rewriting the delta function as the spatial integral of a plane wave. The term a 6 b · ∂ ka δ D ( k tot. ) can be rewritten as 6N b · ∂ ktot. δ D ( k tot. ); indeed, any ∂ k i a δ D ( k tot. ) can be written as ∂ k i tot. δ D ( k tot. ). Then there are terms that involve one momentum derivative on the delta function and one momentum derivative on the N-point function: where we have used rotational invariance of the N-point function to remove the last two terms on the first line. With this understanding, Eq. (85) can be expressed as The first line of the above can be equated with the first line of Eq. (84), since what multiplies the derivative of the delta function on both sides replicates the dilation consistency relation Eq. (79). Note that n i and b i are related by Eq. (81). Using this, eliminating the dilation consistency relation from both sides, 26 and removing the delta function, we obtain the special conformal consistency relation: with the understanding that c = D ′′ + 2HD ′ is a constant (Eq. 165). Just as in the case of the dilation consistency relation, this consistency relation can be easily generalized to the hard modes being the galaxy overdensity -changing δ to δ g , and changingρ toρ g . Examining the terms on the right hand side, we see that in the sub-Hubble limit, i.e. 
k ≫ H, the term that dominates on the right hand side is −Σ_a (D(η_a)/D′(η)) k^i_a ⟨δ_{k_1}...δ_{k_N}⟩_c′, reproducing the translation consistency relation (Eq. 30) derived from the Newtonian equations. We will have more to say about the non-relativistic limit in §3.3.
At the level of the spatial diffeomorphisms, dilation and special conformal transformations exhaust the list of purely scalar symmetries, since it is only dilation and special conformal transformations that respect the second expression of Eq. (71) and do not generate vector or tensor modes. It is also worth noting that the special conformal transformation consistency relation strictly speaking receives (small) corrections on the right hand side, a point to which we will return (footnote 32).
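The statement that dilation and the special conformal transformation respect the condition ∇^2 ξ i unit. + ∂ i (∂ k ξ k unit. )/3 = 0 can be checked numerically. The sketch below is illustrative only. It assumes the standard forms ξ^i = λ x^i for dilation and ξ^i = 2(b·x) x^i − b^i x^2 for the special conformal transformation (the explicit expression is not reproduced in the text above), and all function names are hypothetical. Central differences are exact for these polynomial fields up to rounding.

```python
# Sketch: finite-difference check of
#   laplacian(xi^i) + d_i (d_k xi^k) / 3 = 0
# for a dilation and a special conformal transformation.
# Assumed forms (standard, not copied from the text):
#   dilation:           xi^i = lam * x^i
#   special conformal:  xi^i = 2 (b.x) x^i - b^i x^2


def dilation(lam):
    """Return the vector field x -> lam * x."""
    return lambda x: [lam * xj for xj in x]


def special_conformal(b):
    """Return the vector field x -> 2 (b.x) x - b x^2."""
    def xi(x):
        b_dot_x = sum(bj * xj for bj, xj in zip(b, x))
        x_sq = sum(xj * xj for xj in x)
        return [2 * b_dot_x * x[i] - b[i] * x_sq for i in range(3)]
    return xi


def shifted(x, axis, step):
    """Copy of x with one coordinate displaced by step."""
    y = list(x)
    y[axis] += step
    return y


def divergence(xi, x, h):
    """Central-difference estimate of d_k xi^k at x."""
    return sum(
        (xi(shifted(x, k, h))[k] - xi(shifted(x, k, -h))[k]) / (2 * h)
        for k in range(3)
    )


def laplacian(xi, x, i, h):
    """Central-difference estimate of the Laplacian of component i at x."""
    centre = xi(x)[i]
    return sum(
        (xi(shifted(x, j, h))[i] - 2 * centre + xi(shifted(x, j, -h))[i]) / h ** 2
        for j in range(3)
    )


def condition_residual(xi, x, h=1e-2):
    """Components of laplacian(xi^i) + d_i(div xi)/3; zero if the condition holds."""
    residual = []
    for i in range(3):
        grad_div = (divergence(xi, shifted(x, i, h), h)
                    - divergence(xi, shifted(x, i, -h), h)) / (2 * h)
        residual.append(laplacian(xi, x, i, h) + grad_div / 3)
    return residual


if __name__ == "__main__":
    point = [1.0, 2.0, -1.5]
    print(condition_residual(dilation(0.7), point))
    print(condition_residual(special_conformal([0.3, -0.2, 0.5]), point))
```

Analytically, the assumed special conformal field has ∂ k ξ^k = 6 b·x and ∇^2 ξ^i = −2b^i, so the two terms cancel, while for dilation both terms vanish separately. The non-vanishing Laplacian in the special conformal case is what later feeds the spatial part of the accompanying diffeomorphism.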

Robustness and Limitations of the Consistency Relations -a Relativistic Perspective and the Newtonian Limit
It is useful to pause, and reflect on the fully relativistic consistency relations derived so far. Some of our discussions here mirror the earlier ones in the Newtonian context ( §2.2), but with a relativistic twist. We also discuss the issue of taking the Newtonian, i.e. sub-Hubble, limit.
1. Newtonian limit. The special conformal consistency relation Eq. (89) is the relativistic analog of the Newtonian translation consistency relation Eq. (30). The former reduces to the latter in the sense that: where the H^2/k^2-suppressed terms can be ignored in the sub-Hubble limit. Note that the unsuppressed (Newtonian) terms are of the order of (k/H) ⟨δ_{k_1}...δ_{k_N}⟩_c′. Similarly, we can think of the dilation consistency relation Eq. (80) as the relativistic analog of Eq. (28). The dilation consistency relation takes the form: which can be compared against q × Eq. (90). We can see that the right hand side of the above expression is O(H/k)O(H/q) times q × Eq. (90). In the sub-Hubble limit where H is small compared to both q and k, it is therefore consistent to think of Eq. (91) as vanishing - reducing to Eq. (28). 27

2. Combining consistency relations. It is worth pointing out that, just as in the Newtonian case where Eqs. (28) and (30) can be combined into a single equation (31), the general relativistic dilation and special conformal consistency relations can be combined into: where the constant c = D ′′ + 2HD ′ .

27 Eq. (28) was derived using the shift symmetry π → π + b, where b is a constant. The reader might wonder how that argument breaks down in the relativistic context. The point is that a constant shift in π has to be accompanied by a time-dependent shift in Φ (see e.g. Eq. 48). Such a time-dependent shift is not a symmetry of the kinetic term for the metric once time-derivatives are taken into account, unless coordinates change too. It is interesting to note that π → π + b/φ̄′ is a symmetry (see Appendix A) because of the shift symmetry in φ; however, this symmetry does not correspond to the growing mode vacuum and therefore does not lead to a consistency relation. See Appendix D for a further discussion.
3. Alternative pions. Recall that π, δ and Φ all shift nonlinearly under the symmetries of interest. 28 One might wonder whether we could have derived the consistency relation with δ or Φ playing the role of the pion instead. The answer is affirmative. Let us compare these nonlinear shifts: ∆ nl. π = ξ 0 , ∆ nl. δ = −ξ 0ρ′ /ρ, ∆ nl. Φ = −ξ 0′ − Hξ 0 . Recalling that ξ 0 ∝ D ′ , we see that ∆ nl. δ = −ρ ′ /ρ × ∆ nl. π, and ∆ nl. Φ = −(D ′′ + HD ′ )/D ′ × ∆ nl. π. One can thus run the same arguments as before, and arrive at essentially the same consistency relation Eq. (92), with the right hand side unaltered, but the left hand side replaced by or The consistency relations expressed using π, δ or Φ as the soft pion are all equivalent -with one important caveat, which is related to the squeezing constraint.
recover the Newtonian consistency relation: the term − a [D(η a )/D ′ (η)] q · k a δ k 1 ...δ k N c ′ dominates on the right hand side. 29 Similar statements hold for Φ as the soft pion. The same is not true for δ (here, we focus on the matter δ as the soft mode): from the continuity equation (143), it is evident that the time dependence of δ (which is the same as δ n for pressureless matter) depends on whether the wave-mode is inside or outside the Hubble radius. The consistency relation written using δ as the soft mode takes the form of Eq. (93) only for q < H. If the soft δ mode is within the horizon, the continuity equation tells us δ q = q 2 (D/D ′ )π q , and so the left hand side of the consistency relation should read: while as discussed above, the right hand side reduces to − a [D(η a )/D ′ (η)] q · k a δ k 1 ...δ k N c ′ . This reproduces the Newtonian translation consistency relation written in terms of δ q (Eq. 33). To conclude: the consistency relation expressed in terms of a soft δ q takes a different form outside versus inside the Hubble radius i.e. Eq. (93) versus Eq. (95). The consistency relation expressed using the matter π or Φ as the soft pion maintains the same form regardless. 30 5. The existence of an interesting Newtonian limit. From the discussion above, we see that the special conformal consistency relation has a non-trivial Newtonian limit (i.e. the right hand side is non-vanishing), whereas the dilation one does not. What is the underlying reason? From Eq. (72), we see that for a given unitary-gauge transformation ξ i unit. , the corresponding residual diffeomorphism in Newtonian gauge is The associated consistency relation, making use of the relation δ q ∼ q 2 π q /H in the sub-Hubble limit, can be written schematically as: Here, [ ] k denotes the Fourier transform of the quantity of interest at momentum k, with the delta function removed. 
For instance, for

29 There is one subtlety though: for a wave-mode that enters the Hubble radius during radiation domination, its time-evolution deviates from D ′ during part of its history, and so strictly speaking the consistency relation does not apply if the soft mode belongs to this category. An alternative way to put it is this: when the wave-mode is within the Hubble radius (or more precisely, within the sound horizon) during radiation domination, neither the matter nor the radiation moves with a velocity that agrees with D ′ . A diffeomorphism that obeys the adiabatic mode conditions (e.g. Eq. 74 for dilation, or Eq. 81 for special conformal transformation) cannot generate the correct velocity for either component. Even in this case, we expect the consistency relation to still be a good approximation in the late universe, to the extent that most of the late-time non-Gaussianity is generated after radiation domination. We thank Paolo Creminelli for discussions on this point.

30 Using the baryon π as the soft pion is permissible too, as long as one stays above the Jeans scale.
In the sub-Hubble (and squeezed) limit where H ≪ q ≪ k, this suggests we have the dimensionless ratio δ q δ k ... c ′ /(P δ (q) δ k ... c ′ ) ∼ (q/k) n (k/q) 2 . For n = 1, the special conformal case, this reproduces correctly the Newtonian translation consistency relation. For n = 0, the dilation case, this does not work. The reason is that the k 2 /q 2 term in Eq. (98), which is the dominant term in the sub-Hubble limit, originates from ∇ 2 ξ i unit. , which vanishes for the dilation ξ i unit. = λx i . Our naïve power-counting argument also suggests there could be additional n > 1 consistency relations that are non-trivial in the Newtonian limit. As we will see in the next section, the n > 1 consistency relations generally involve tensors, which complicates taking the squeezed mode to within the Hubble radius.
Let us close this section by emphasizing the robustness of the consistency relations. As in the Newtonian derivation, the general relativistic derivation makes no assumptions about the dynamics of the hard modes -all we need to know is how they transform under diffeomorphisms. Thus, we expect the consistency relations to hold even for nonlinear, or astrophysically messy, hard modes (though the right hand side of the consistency relations might need to be modified depending on exactly how the modes of interest transform; see comments after Eq. 80 and in footnote 32). Besides the existence of symmetries, which according to the general relativistic perspective are nothing but residual diffeomorphisms, the two key assumptions are the same as in the Newtonian derivation: single field initial condition and adiabatic mode conditions, in particular that all species move with the same velocity in the soft limit.

Consistency Relations Involving Tensor Modes
In this section, we move beyond dilation and special conformal transformation to discuss residual diffeomorphisms that generate tensor modes (with or without accompanying scalar modes). We apply the same strategy as the one used for the pure scalar symmetries: use the full set of symmetries derived in the unitary gauge by H2K, and map each to a symmetry in the Newtonian gauge by Eq. (63).
The unitary gauge residual diffeomorphisms can be written as (Eq. 73): where the constant coefficients $M$ satisfy: This condition is derived by substituting the power series into Eq. (57): $\nabla^2 \xi^i_{\rm unit.} + \partial_i (\partial_k \xi^k_{\rm unit.})/3 = 0$. Note that $M$ by definition is symmetric in its last $n + 1$ indices. Since we are interested in $M$ that generates tensor modes (in addition to possibly scalar modes), we should impose an additional adiabatic transversality condition: This condition can be understood as enforcing that the tensor generated by our diffeomorphism be extensible to the $q \to 0$ limit of a transverse physical tensor mode. Imagine an $M$ that is nearly constant but tapers off to zero at sufficiently large $x$: while a constant $M$ yields (derivatives of) a delta function peaked at $q = 0$ in Fourier space, a tapering $M$ yields a smoothed-out version thereof. A tensor mode at a small but finite momentum should be transverse to its own momentum. We demand that even as we take the $q = 0$ limit (allowing the tapering of $M$ to occur at larger and larger distances), transversality continues to hold, keeping the direction $\hat q$ fixed. This is the content of Eq. (101). The choice of $\hat q$ is arbitrary; one could for instance choose it to point in the $z$ direction. In addition, if the diffeomorphism generates only tensor modes, further conditions on $M$ come from Eq. (58): i.e. $M$ is traceless over any pair of indices, which also trivially satisfies Eq. (100).
As discussed in §3.1, for each unitary gauge residual diffeomorphism, there is a corresponding one in Newtonian gauge: with $D$ being the linear growth factor and where $c$ is a constant satisfying $D'' + 2HD' - c = 0$. Thus, at the level $n$, the Newtonian gauge diffeomorphism is: This expression holds for all $n$, with the exception of $n = 0$, in which case the last term on the right for $\xi^i$ is absent. One thing which is immediately clear is that for purely tensor symmetries (diffeomorphisms that generate only tensor modes) we have $\xi^0_{\rm add.} = \xi^i_{\rm add.} = 0$, and so they are identical in the unitary gauge and the Newtonian gauge, as expected. It is worth emphasizing that Eqs. (104) and (100) are general: they apply to symmetries that generate only scalar modes, or only tensor modes, or both. For symmetries that generate tensor modes, adiabatic transversality expressed in Eq. (101) is an additional requirement, and Eq. (102) applies if only tensor modes are generated.
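As a concrete illustration of the constant $c$ in $D'' + 2HD' - c = 0$, the following minimal sketch evaluates the left hand side numerically. It assumes, purely for illustration, a matter-dominated (Einstein-de Sitter) background, where $a \propto \eta^2$, the conformal Hubble rate is $H = 2/\eta$, and the growth factor can be normalized as $D = \eta^2$. The function names are hypothetical and not taken from the paper.

```python
# Sketch under an ASSUMED matter-dominated background (not a general result):
# a ~ eta^2, conformal Hubble H = 2/eta, growth factor normalized as D = eta^2.

def growth(eta):
    """Linear growth factor D(eta) in matter domination (assumed normalization)."""
    return eta ** 2

def hubble(eta):
    """Conformal Hubble rate H = a'/a for a ~ eta^2."""
    return 2.0 / eta

def derivatives(eta, h=1e-3):
    """Central finite-difference estimates of D' and D''."""
    d1 = (growth(eta + h) - growth(eta - h)) / (2.0 * h)
    d2 = (growth(eta + h) - 2.0 * growth(eta) + growth(eta - h)) / h ** 2
    return d1, d2

def c_value(eta):
    """Evaluate c = D'' + 2 H D' at conformal time eta."""
    d1, d2 = derivatives(eta)
    return d2 + 2.0 * hubble(eta) * d1

def growth_equation_residual(eta):
    """Residual of the matter growth equation D'' + H D' - (3/2) H^2 D = 0."""
    d1, d2 = derivatives(eta)
    return d2 + hubble(eta) * d1 - 1.5 * hubble(eta) ** 2 * growth(eta)

if __name__ == "__main__":
    for eta in (1.0, 3.0, 10.0):
        print(eta, c_value(eta), growth_equation_residual(eta))
```

With this normalization the combination is the same at every time sampled, which is what allows the additional Newtonian gauge diffeomorphisms to be built from $D$ and $D'$ with time-independent coefficients.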
To derive the corresponding consistency relations, we need a master equation analogous to Eq. (22) but generalized to allow for the possibility of tensor modes: Here, the label $s$ denotes one of the two possible tensor polarization states; a given tensor perturbation $\gamma_{ij}(\vec q)$ can be decomposed as $\gamma_{ij}(\vec q) = \sum_s \epsilon^s_{ij}(\hat q)\, \gamma^s_{\vec q}$, where the symmetric traceless polarization tensor $\epsilon^s_{ij}(\hat q)$ obeys $\hat q^i \epsilon^s_{ij}(\hat q) = 0$ and $\epsilon^s_{ij}(\hat q)\, \epsilon^{s'}_{ij}(\hat q)^* = 2\delta^{ss'}$. The tensor power spectrum is defined by: Eq. (105) can alternatively be written as: using the fact that $\Delta_{\rm nl.} \gamma^{s\,*}_{\vec q} = \Delta_{\rm nl.} \gamma_{ij}(\vec q)^*\, \epsilon^s_{ij}(\hat q)/2$. Let us step through a few low $n$ examples to get a feel for the kind of consistency relations that arise from these diffeomorphisms. The discussion follows that of H2K, with suitable deformations to the Newtonian gauge.
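The polarization conventions quoted above can be made concrete with a small sketch. It assumes, for illustration only, a soft momentum along the $z$ direction and real "plus" and "cross" polarization tensors; these explicit matrices and all function names are hypothetical choices, not taken from the paper. The sketch checks tracelessness, transversality, the normalization $\epsilon^s_{ij}\epsilon^{s'}_{ij} = 2\delta^{ss'}$, and the projection $\gamma^s = \gamma_{ij}\epsilon^s_{ij}/2$ (valid for real tensors).

```python
# ASSUMED conventions: soft momentum along z, real plus/cross polarization tensors.
Q_HAT = (0.0, 0.0, 1.0)
E_PLUS = ((1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, 0.0))
E_CROSS = ((0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0))
POLARIZATIONS = {"+": E_PLUS, "x": E_CROSS}

def contract(a, b):
    """Full contraction a_ij b_ij of two 3x3 tensors."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))

def trace(a):
    """Trace a_ii of a 3x3 tensor."""
    return sum(a[i][i] for i in range(3))

def q_dot(e, q_hat=Q_HAT):
    """Vector q_hat^i e_ij; it vanishes for a transverse tensor."""
    return tuple(sum(q_hat[i] * e[i][j] for i in range(3)) for j in range(3))

def build_tensor(amplitudes):
    """gamma_ij = sum_s e^s_ij gamma^s for amplitudes given as {'+': ..., 'x': ...}."""
    return tuple(
        tuple(sum(amplitudes[s] * POLARIZATIONS[s][i][j] for s in amplitudes)
              for j in range(3))
        for i in range(3)
    )

def project(gamma, e):
    """Recover gamma^s = gamma_ij e^s_ij / 2 (real polarization tensors)."""
    return contract(gamma, e) / 2.0

if __name__ == "__main__":
    gamma = build_tensor({"+": 0.3, "x": -1.2})
    print(project(gamma, E_PLUS), project(gamma, E_CROSS))
```

A symmetric traceless $3 \times 3$ tensor has 5 independent components; the three transversality conditions leave exactly the two polarizations used here.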
For $n = 0$, $M_{i\ell_0}$ can be written as a trace (dilation), an antisymmetric part (which does not generate a nonlinear shift in the metric 31 ) and a symmetric traceless part (anisotropic rescaling, which generates tensor perturbations). Note that $n = 0$ is a special case, in the sense that the last term of Eq. (104) does not exist (because $\nabla^2 \xi^i_{\rm unit.} = 0$). Focusing on a symmetric traceless $M_{i\ell_0}$, there are 5 independent components. Imposing the adiabatic transversality condition $\hat q^i M_{i\ell_0}(\hat q) = 0$ removes 3 of them, reducing the number of independent tensor modes to 2. Thus, at the level of $n = 0$, we have one pure scalar and 2 pure tensor symmetries. It is straightforward to infer the corresponding symmetries in Newtonian gauge: dilation gets deformed as discussed in §3.2.1; the purely tensor symmetries take exactly the same form in the two gauges. The $n = 0$ anisotropic rescaling tensor consistency relation reads: a relation first pointed out by Maldacena [3]. To derive this, we use ∆ nl.
For $n = 1$, there are 3 purely scalar symmetries (the special conformal transformations) and 4 purely tensor ones. The special conformal transformations correspond to: This is manifestly symmetric between $\ell_0$ and $\ell_1$, and satisfies Eq. (100). We see that plugging this into Eq. (104) reproduces Eq. (81), and thus the special conformal consistency relation of Eq. (89) follows. Each of the 4 tensor symmetries comes from an $M_{i\ell_0\ell_1}$ that is symmetric between $\ell_0$ and $\ell_1$, fully traceless over any pair of indices, and transverse in the sense of Eq. (101). The corresponding $n = 1$ tensor consistency relation reads: where the dependence on $M_{i\ell_0\ell_1}$ can be removed by applying suitable projectors (see H2K). For each $n \geq 2$, there are 4 purely tensor symmetries and 2 mixed symmetries where both $\pi$ and $\gamma$ transform nonlinearly. In general, any $n \geq 0$ consistency relation reads: This is our most general result: each Newtonian gauge residual diffeomorphism described by Eq. (104) gives rise to a consistency relation given by Eq. (111). 32 The consistency relations can also be written in a form in which the matrix $M$ is projected out (see H2K). The familiar dilation and special conformal consistency relations are contained here, and the $M$'s in those cases take a form that projects out the tensor term on the left hand side. Note that the soft modes are assumed to be at time $\eta$, while the hard modes are at time $\eta_a$ for each momentum $\vec k_a$. Purely tensor consistency relations follow from those $M$'s that are fully traceless, hence setting the scalar contributions on the left hand side to zero (and zeroing out terms proportional to the Kronecker delta on the right hand side as well). For $n \geq 2$, there are choices of $M$ (2 for each $n \geq 2$) that have a structure that gives both non-vanishing tensor and scalar contributions on the left hand side.
It is worth pointing out that on the right hand side, the first set of terms (second line) are time-independent; they originate from the unitary gauge diffeomorphisms. 33 The second set of terms (third line) originate from $\xi^0_{\rm add.}$, the additional time diffeomorphism that is necessary to keep us in Newtonian gauge. Likewise, the last set of terms (fourth line) come from $\xi^i_{\rm add.}$, and we have used the pure tensor consistency relation at level $(n - 2)$ to move the terms proportional to $D(\eta)$ to the right hand side.
Let us study the taking of the Newtonian, i.e. sub-Hubble, limit. As explained in §3.3, it is helpful to rewrite the consistency relations using $\delta_q \sim q^2 \pi_q / H$ (the precise relation is $\delta_q = q^2 \pi_q D/D'$ for the $\delta$ and $\pi$ of pressureless matter in the sub-Hubble limit). Recalling that $c = D'' + 2HD' \sim H^2$, we see that Eq. (111) naïvely has a sub-Hubble limit of the schematic form:
32 With the exception of the dilation consistency relation ($n = 0$ with $M_{i\ell_0} \propto \delta_{i\ell_0}$), these consistency relations in general receive corrections on the right hand side which either involve replacing one of the hard modes by a hard (scalar or tensor) metric perturbation, or involve higher powers of the metric perturbations. These corrections arise because the associated diffeomorphisms generally need to be corrected order by order in metric perturbations (H2K, [16]). What we focus on in this paper are the lowest order terms in the diffeomorphisms (i.e. metric-independent contributions). Even in the nonlinear regime where density perturbations are large, the metric perturbations are in general small. Thus, the corrections to the consistency relations are negligible in applications where the hard modes are density (as opposed to metric) perturbations on sub-Hubble scales.
33 The term $-\delta_{i\ell_0} \delta_{n0}/N$ arises from the removal of delta functions. See H2K for discussion.
where we have equated $\partial_q \sim 1/q$ and $\partial_k \sim 1/k$. Of the terms on the right hand side, the term suppressed in the sub-Hubble limit by $(H^2/q^2)$ arises from the second and third lines of Eq. (111), and the unsuppressed term comes from the last line of Eq. (111). It is also worth noting that the unsuppressed term is in general non-vanishing even if all the hard modes are at the same time, as long as the soft mode is at a different time; the $n = 1$ case (that gives rise to the KRPP consistency relation) is an exception rather than the rule. At first sight, this suggests there is a non-trivial Newtonian limit for each $n > 0$, with the $n = 1$ case (KRPP) being one example. This is not the case because of the presence of tensor modes. In all $n \geq 2$ cases where the diffeomorphism generates a soft scalar, the same diffeomorphism generates a soft tensor as well. The tensor equation of motion $\gamma''_{ij} + 2H\gamma'_{ij} - \nabla^2 \gamma_{ij} = 0$ (Eq. 151) tells us that (1) ignoring $\nabla^2$, $\gamma_{ij} = {\rm const.}$ is the growing mode solution (or more properly, the dominant mode solution; the other mode decays); (2) allowing for a small $\nabla^2$, the growing mode tensor solution gets corrected by a term proportional to $D$ (see Appendix B); (3) when $\nabla^2$ is important, the tensor mode oscillates with an amplitude that decays as $1/a$. Cases (1) and (2) pertain to super-Hubble modes while case (3) has to do with sub-Hubble ones. The purely tensor consistency relations follow from time-independent diffeomorphisms, which generate tensor modes of type (1). The mixed scalar-tensor consistency relations follow from diffeomorphisms that generate tensor modes of type (2) (see footnote 39). In neither case are we allowed to take the soft tensor mode to within the Hubble radius. This is in contrast with the purely scalar consistency relations (such as dilation and special conformal transformation), where the time-dependence of the soft $\pi_q$ remains the same whether it is outside or inside the Hubble radius.
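The three regimes of the tensor mode described above can be illustrated with a short sketch. It assumes, for illustration only, a matter-dominated background ($a \propto \eta^2$, conformal Hubble rate $H = 2/\eta$, so that $D \propto a \propto \eta^2$). In that case the regular solution of $\gamma'' + 2H\gamma' + q^2\gamma = 0$ is $\gamma \propto 3(\sin x - x\cos x)/x^3$ with $x = q\eta$. The function names are hypothetical.

```python
import math

# ASSUMPTION: matter domination, a ~ eta^2, conformal Hubble H = 2/eta.
# With x = q*eta the tensor equation becomes g'' + (4/x) g' + g = 0.

def tensor_mode(x):
    """Regular (dominant) tensor mode, normalized to 1 as x -> 0."""
    if abs(x) < 1e-3:
        # Series expansion avoids cancellation at very small x.
        return 1.0 - x * x / 10.0 + x ** 4 / 280.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3

def equation_residual(x, h=1e-3):
    """Finite-difference residual of g'' + (4/x) g' + g at x."""
    g0 = tensor_mode(x)
    gp = tensor_mode(x + h)
    gm = tensor_mode(x - h)
    first = (gp - gm) / (2.0 * h)
    second = (gp - 2.0 * g0 + gm) / h ** 2
    return second + 4.0 / x * first + g0

if __name__ == "__main__":
    # Super-Hubble: constant plus a correction -x^2/10, proportional to eta^2 ~ D.
    print(tensor_mode(0.1), 1.0 - 0.1 ** 2 / 10.0)
    # Sub-Hubble: oscillation with an envelope ~ 3/x^2, i.e. decaying as 1/a.
    for x in (20.0, 50.0, 100.0):
        print(x, tensor_mode(x) * x ** 2)
```

In this example the leading correction to the constant mode scales as $\eta^2 \propto D$, matching case (2), while deep inside the Hubble radius the envelope falls as $1/(q\eta)^2 \propto 1/a$, matching case (3).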
One might be tempted to say: within the Hubble radius, the tensor mode decays anyway, so why not just drop the tensor term from the consistency relations? This is not allowed because the tensor mode enters in both the numerator and denominator of $\langle \gamma_{\vec q} \cdots \rangle^{c\prime} / P_\gamma(q)$. In general this ratio is independent of the amplitude of the tensor mode; the consistency relations express precisely this fact.

Discussion
Let us give a summary of our main results.
1. Consistency relations, between a squeezed (N+1)-point function and an N-point function, are a generic consequence of nonlinearly realized symmetries, which shift the fields of interest by amounts that do not depend on the fields. They can be thought of as the LSS analogs of soft-pion theorems in particle physics, with the squeezed (soft) mode playing the role of a soft pion. The master formula for deriving consistency relations for any such nonlinearly realized symmetries is given in Eq. (22). 34
2. In this paper, we focus on nonlinearly realized symmetries that involve a change of coordinates; they are diffeomorphisms from the point of view of general relativity. These are residual diffeomorphisms that are allowed even after the usual gauge-fixing; they all share the property that they do not vanish at spatial infinity. From earlier work in the unitary gauge (or $\zeta$-gauge, where the equal time surface is chosen to be one where the matter perturbation vanishes, and where for us 'matter' means the dark or pressureless matter), it is known that there is an infinite number of such symmetries, of the schematic form $x^i \to x^i + M^i x^{n+1}$, where $n = 0, 1, 2, \ldots$ and the detailed index structures of $M^i$ and $x^{n+1}$ are suppressed (H2K). Associated with these changes of coordinates are both linear and nonlinear shifts in the fields, from which we derive consistency relations. For each $n$, the consistency relation (in unitary gauge) takes the schematic form: where $\zeta$ and $\gamma$ are the (soft) curvature and tensor perturbations, $P_\zeta$ and $P_\gamma$ are their respective power spectra, and $O$ represents some observables of interest at hard momenta $\vec k_1, \ldots, \vec k_N$. The correlation functions are connected and have the overall momentum-conserving delta functions removed. For $n \leq 1$, the tensor term on the left hand side is absent, hence the step function $\Theta(n > 1)$.
For LSS applications, especially when one is interested in sub-Hubble scales, it is useful to have the analogous relations in Newtonian gauge where the matter fluctuations are non-vanishing. We give the prescription for mapping each unitary gauge symmetry to its Newtonian gauge counterpart. This is given in Eq. (104). The diffeomorphism becomes: where some of the suppressed indices of $M$ need to be internally contracted; the factors of $H^{-1}$ and $H^{-2}$ are meant to indicate coefficients that are time-dependent, and the powers of Hubble reflect the order of magnitude of these coefficients. From these diffeomorphisms, the Newtonian gauge consistency relations read schematically: where $\pi_q$ is the soft velocity potential. The precise form is given in Eq. (111). Because $\pi_q$ is dimensionful (velocity $v^i = \partial_i \pi$), it is useful to rewrite this using the dimensionless $\delta_q \equiv q^2 \pi_q D/D' \sim q^2 \pi_q / H$: 35 Two noteworthy points: first, the right hand side should always be understood to contain corrections that vanish in the $q \to 0$ limit, i.e. we expect $O(q)$ corrections to the right; second, it appears that a non-trivial sub-Hubble limit exists (by sending $H/q \to 0$) for all $n > 0$, but there is a subtlety.
3. The subtlety has to do with the adiabatic mode condition. The consistency relations make three assumptions: the existence of nonlinearly realized symmetries, the single field initial condition, and the adiabatic mode condition. The last says that the soft mode generated nonlinearly by our symmetry transformation must satisfy the equation of motion at a low but finite momentum, i.e. the symmetry generated soft mode must have the correct time dependence to match that of a long wavelength physical mode. A corollary is that since each of our symmetries is a diffeomorphism, the same diffeomorphism must generate all the soft modes in the problem. This means that in a universe with multiple particle species, all the particles must move with the same velocity on large scales, implying that the equivalence principle is obeyed. (It is worth stressing that the equivalence principle needs to hold only on large scales; baryons or galaxies can move differently from dark matter on small scales.) This also means the sub-Hubble or Newtonian limit must be taken with care. Recall that the general relativistic consistency relations are strictly speaking q → 0 statements, and thus valid for super-Hubble scales q < H. For the relations to continue to hold even as one takes q > H (keeping k ≫ q of course), the soft mode must maintain the same time dependence across the Hubble radius. The soft π q has this property, and can be safely taken to be sub-Hubble. The same is not true for the soft tensor mode γ q . Thus, consistency relations which involve the tensor mode strictly hold only if the soft mode is super-Hubble. It is only in the special cases of n = 0, 1 (dilation and special conformal transformation) that the tensor term is absent from the left hand side, and the corresponding purely scalar consistency relations hold even within the Hubble radius (i.e. for H ≪ q ≪ k). 
Of these two cases, $n = 0$ does not give a non-trivial right hand side in the sub-Hubble limit; only $n = 1$ does, and this gives precisely the Newtonian consistency relation obtained by KRPP (Eq. 30). In fact, it is worth emphasizing that the KRPP relation as originally expressed (Eq. 33) should be thought of as containing two separate non-relativistic consistency relations: the lack of a $1/q^2$ pole is the sub-Hubble limit of the dilation relation, and the leading $1/q$ term comes from the special conformal transformation, which (suitably generalized to Newtonian gauge) reduces to a time-dependent spatial translation in this limit.

4. The fact that the symmetries of interest are diffeomorphisms, albeit ones that do not vanish at spatial infinity, suggests the consistency relations are fairly robust: no detailed dynamical assumptions need to be made about the hard modes, and the only information we need is how they transform. They could be highly non-perturbative and even astrophysically complex, such as galaxy observables on small scales (though some care is needed if the hard modes are metric as opposed to density fluctuations, see footnote 32). We demonstrate this in the context of the Newtonian consistency relation by writing down explicitly a model of galaxy dynamics that allows for galaxy formation, mergers, dynamical friction and so on (§2.2). From the symmetry standpoint, the key assumption is that the dynamics and formation process of galaxies be frame invariant, i.e. aside from Hubble friction, the only possible dependence on velocity arises from its gradients or from velocity differences between species. From the standpoint of the adiabatic mode condition, the key assumption is that all objects fall in the same way on large scales. From the practical standpoint, in terms of checking the consistency relation at a small momentum $q$ (as opposed to a literally vanishing $q$), it is important to ensure $q$ is sufficiently squeezed such that the soft modes evolve in a way consistent with the adiabatic mode condition. Checking the consistency relation observationally will allow us to test some fundamental assumptions in cosmology, in particular the single-field initial condition and the equivalence principle [18,21].

5. We carry out a perturbative check on the robustness of the Newtonian consistency relation, by including a galaxy bias at the quadratic level (§2.2). For the consistency relation to hold, we find that any non-local quadratic bias cannot be too infrared divergent: supposing the galaxy density $\delta_g$ is related to the mass density $\delta$ by , the quadratic kernel $W(\vec k', \vec k - \vec k')$ must grow slower than $1/k'$ in the small $k'$ limit (Eq. 46). Thus, a local quadratic bias where $W$ is independent of momenta, which is a form of bias commonly invoked, respects the consistency relation. Interestingly, the one case we know of where a $W \sim 1/k'$ behavior occurs is one where galaxies are born with a velocity bias, which is consistent with our understanding that having a large scale velocity bias violates the adiabatic mode condition [17]. The consistency relation allows the hard modes to be astrophysically complex observables such as the galaxy density $\delta_g$. But the soft mode strictly speaking should still be that of the dark (pressureless) matter. On the other hand, from an observational standpoint, galaxy density is much easier to measure. We show that the soft mode can be a galaxy observable provided that (1) the consistency relation is corrected by a multiplicative linear bias factor; (2) the galaxy power spectrum on large scales has a spectral slope of $n < 1$ (where $n \equiv d \ln P(q)/d \ln q$; see §2.2).
There are a number of outstanding questions for future investigations. The symmetries we are using are gauge symmetries: why is it that we manage to derive physically relevant statements out of gauge redundancies? We know consistency relations are not empty statements, because there are models that violate them, for instance models of inflation that involve multiple fields [4]. The answer presumably has to do with two aspects of the consistency relations: (1) they make certain physical assumptions about the initial condition, in particular single field initial condition; (2) while a strictly zero-momentum mode is unobservable, a soft mode with a small but non-zero momentum is observable, and consistency relations are ultimately statements about the dominant terms in a correlation function with a soft mode. It would be useful to clarify the curious role of gauge symmetries in physical statements. Further clarity in the derivation of the consistency relations is desirable for another reason: we see from our perturbative check in §2.2 that, even without galaxy bias, the validity of the consistency relations requires the soft power spectrum to be not too blue (n < 3). Why this should be so is not easy to see from the background wave argument. Can one see this from other arguments in the literature, such as the operator formalism (H2K), operator product expansion [8], or the effective action approach [11]?
Lastly, in this paper, we focus exclusively on nonlinearly realized symmetries that originate from diffeomorphisms. Could the LSS dynamics have other nonlinearly realized symmetries? Recently, it was pointed out by [22] and [23] that, indeed, further nonlinearly realized symmetries exist, albeit ones that involve the transformation of parameters as well. Can there be more?

A A Lagrangian for Fluid with Pressure
In this Appendix, we connect the LSS Lagrangian discussed in §2.3 with a more commonly used fluid Lagrangian. We will continue to work within the zero vorticity regime. For generalizations to include a non-vanishing vorticity, see [30]. Let us consider the following action: $S = \int d^4x\, \sqrt{-g}\, [\, M_P^2 R/2 + P(X) + \ldots\,]$ (Eq. 116), where the first term is the Einstein-Hilbert action, and $P(X)$ is the fluid action, where $P$ is some function of $X \equiv -g^{\mu\nu} \partial_\mu \phi\, \partial_\nu \phi$ with $\phi$ describing the single degree of freedom of an irrotational fluid. 36 The $\ldots$ stands for other possible matter or energy content in the universe, i.e. the background expansion need not be determined solely by the $P(X)$ fluid in question. This is a completely relativistic action, and has been used by many authors [14,15]. Our goal here is to take the non-relativistic limit, and connect the result with the action in §2.3 (Eq. 51).
We assume a metric of the form: The fluid energy-momentum $T_{\mu\nu}$ can be obtained from the fluid action by $\sqrt{-g}\, T_{\mu\nu} = -2\, \delta S_{\rm fluid}/\delta g^{\mu\nu}$: which is the energy-momentum of a perfect fluid, with the 4-velocity $U^\mu$, energy density $\rho$ and pressure $P$ given by: A fluid with an equation of state $P = w\rho$ can be modeled by a $P(X)$ of the form: We are interested in the case of a small $w$. Let us split $\phi$ into a background $\bar\phi(\eta)$ and perturbation: where we have defined $\pi$ in terms of the field fluctuation $\delta\phi$. This definition is consistent with the interpretation of $\pi$ as the velocity potential, as can be seen by working out $U^\mu$ in terms of $\phi$ and equating $U^\mu = a^{-1}(1, \vec v)$, i.e. $v^i = \nabla_i \pi$ to the lowest order in perturbations. The background $\bar\phi$ obeys: which implies $\bar\phi' \propto a^{1-3w}$, using the fact that $\bar\rho \propto a^{-3(1+w)}$. We denote by $\bar X$ the value of $X$ evaluated at $\phi = \bar\phi$. Using the fact that $P = w\rho$, we find $1 + \delta = (1 + \delta X/\bar X)^{(1+w)/2w}$, which implies $\frac{2w}{1+w} \ln(1 + \delta) = \ln\left(1 + \frac{\delta X}{\bar X}\right)$ (123)
36 The vanishing of vorticity can be expressed covariantly as $\epsilon^{\mu\nu\rho\sigma} u_\nu \partial_\rho u_\sigma = 0$.
Assuming both $w$ and $\delta X/\bar X$ are small, but without assuming $\delta$ is small, we can approximate this by: Let us write out $\delta X/\bar X$ explicitly in terms of the metric and $\phi$ fluctuations: where we have approximated $\bar\phi' \propto a$ (for small $w$), assumed $\Phi \sim v^2 \sim \pi' \sim H\pi \ll 1$, and ignored terms to higher order (we regard $\Phi^2$, $\Phi'$ and $w\Phi$ as all higher order). Eqs. (124) and (125) combined give: Applying the spatial gradient on this equation reproduces the Euler equation in the presence of pressure (Eq. 41), upon identifying $w$ with $c_s^2$, the sound speed squared. We are interested in rewriting the action in Eq. (116) in terms of the fluctuations. In other words, we are not so much interested in the background as in the dynamics of the fluctuations. Thus, we ignore the background term in $\sqrt{-g}\, [M_P^2 R/2 + P(X)]$. We also remove (tadpole) terms that are linear in fluctuations; they only serve to multiply the background equation of motion. Thus, we have (in the sub-Hubble, non-relativistic limit): where $S_{\rm EH}$ comes from the Einstein-Hilbert action, and $S_{\rm fluid}$ comes from the fluid part of the action. For the latter, we have used the fact that $P(X) = w\rho = w\bar\rho(1 + \delta)$, and removed the background piece $w\bar\rho$. We add and subtract $(1 + w)\delta X/(2w\bar X)$ to facilitate the removal of tadpole terms from expanding out $(1 + \delta) = (1 + \delta X/\bar X)^{(1+w)/2w}$. We are to understand the last line as follows: the first $(1 + w)\delta X/(2w\bar X)$ should be understood to have the linear fluctuations removed, while the second $(1 + w)\delta X/(2w\bar X)$ has all terms in it. 37 The fluid part of $S$ is therefore: where $F$ is: with $\delta$ and $\delta X$ understood to be expressible in terms of $\Phi$ and $\pi$ using Eqs. (124) and (125). We have already verified that the Euler equation with pressure holds (Eq. 126); from the point of view of the action $S_{\rm fluid}$, this merely serves as a definition for $\delta$. Let us verify we obtain the Poisson and continuity equations by varying the action. First, we see that $\Psi$ can be integrated out by setting $\Psi = \Phi$. In other words, let us work with the action: The variation $\Delta F$ when we vary $\Phi$, using Eqs. (124) and (125), is: giving us: and therefore the Poisson equation. This assumes that the only fluctuation sourcing $\Phi$ is from the $P(X)$ fluid, which of course can be relaxed. The $\pi$ equation of motion, on the other hand, follows from: which together with the variation $\Delta\left[-\tfrac{1}{2}(\nabla\pi)^2\right]$ gives us the continuity equation $\delta' + \nabla_i[(1 + \delta)\nabla_i \pi] = 0$. The action in Eq. (128) is a bit hard to use, because $F$ involves a fairly nonlinear function of the fields. There are two possible simplifications.
37 The determinant $\sqrt{-g}$ contains terms of order $\Phi$ and $\Phi^2$. Terms of order $\Phi$ multiplying the background are removed as tadpoles. Surviving terms can be seen to multiply at least one factor of $w$ or of $v^2 \sim \Phi$ (the latter with no compensating $1/w$), and so are small compared to what we keep (which are of order $v^2$ or $v^2 (v^2/w)$).
We are interested in the $w \to 0$ limit. Keeping $\delta$ finite, Eq. (126) tells us: which is just the pressureless Euler equation again. Sending $F \to 0$, and substituting the above into Eq. (130), we obtain: reproducing Eq. (51) that we wrote down in §2.3. This justifies the normalization and sign that was adopted there.
The other possible simplification is to expand out $(1 + \delta) = (1 + \delta X/\bar X)^{(1+w)/2w}$ to second order in $\delta X/\bar X$. We have resisted doing so earlier, because doing so effectively assumes $\delta X/\bar X$ is parametrically smaller than $w$ (which is itself small). This is equivalent to assuming small $\delta$, something we might not want to impose. It is nonetheless instructive to see what results: where we have set $w = c_s^2$. In the context of this action, we treat $\delta$ as defined by: Notice how this differs from Eq. (126) in replacing $\ln(1 + \delta)$ by $\delta$ on the left hand side. The reason for this definition is so that the $\Phi$ equation of motion gives the Poisson equation as usual. The $\pi$ equation of motion can be seen to give the continuity equation. In other words, the full set of equations in this system are: This is the set of equations one expects for a fluid with pressure, except the pressure term in the Euler equation is slightly modified from the non-perturbative one displayed in Eq. (41). Aside from this modification, this system of equations has the correct nonlinear structure. In particular, on length scales above the Jeans scale, i.e. $k < k_J$ where $k_J^2 \equiv a^2 \bar\rho/(2 M_P^2 c_s^2)$, one can ignore the pressure term, and the system reduces exactly to the standard pressureless LSS equations (Eq. 9). A useful feature of the action in Eq. (136) is that it shows clearly $\pi$ has a kinetic term of the correct sign.
To summarize, the fluid action Eq. (130) gives the exact nonlinear equations for the perturbations of a fluid with pressure in the Newtonian limit. It simplifies in the zero-pressure limit to the action in Eq. (51), which gives the exact nonlinear equations for a pressureless fluid. It can be approximated by the action in Eq. (136) which gives a linearized pressure term for the Euler equation, but otherwise retains the full nonlinear structure of the exact theory.
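The small-$w$ step used in this Appendix can be checked numerically. The sketch below compares the exact relation of Eq. (123), $\frac{2w}{1+w}\ln(1+\delta) = \ln(1 + \delta X/\bar X)$, with the approximation $\delta X/\bar X \approx 2w \ln(1+\delta)$. The latter is our reading of the approximation for small $w$ and small $\delta X/\bar X$ and should be treated as an assumption; the function names are hypothetical.

```python
import math

def delta_x_exact(delta, w):
    """delta X / X-bar from 1 + delta = (1 + dX/X)^((1+w)/(2w)), i.e. Eq. (123)."""
    return math.expm1(2.0 * w / (1.0 + w) * math.log1p(delta))

def delta_x_approx(delta, w):
    """ASSUMED small-w form: dX/X ~ 2 w ln(1 + delta)."""
    return 2.0 * w * math.log1p(delta)

def density_from_delta_x(dx, w):
    """Invert the exact relation: delta = (1 + dX/X)^((1+w)/(2w)) - 1."""
    return math.expm1((1.0 + w) / (2.0 * w) * math.log1p(dx))

def relative_error(delta, w):
    """Fractional difference between the approximate and exact dX/X."""
    exact = delta_x_exact(delta, w)
    return abs(delta_x_approx(delta, w) - exact) / abs(exact)

if __name__ == "__main__":
    # A large density contrast (delta = 10) with progressively smaller w.
    for w in (1e-1, 1e-2, 1e-4):
        print(w, delta_x_exact(10.0, w), relative_error(10.0, w))
```

The point of the example is that $\delta$ itself need not be small: for $\delta = 10$ the approximation is already good at the sub-percent level once $w \lesssim 10^{-4}$, because $\delta X/\bar X \sim w \ln(1+\delta)$ stays small.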

B Derivation of the General Relativistic Adiabatic Mode Conditions in Newtonian Gauge
In this Appendix, we derive the adiabatic mode conditions appropriate for the Newtonian gauge, and derive the additional diffeomorphism laid out in Eq. (61). For the purpose of deriving consistency relations, it is important that the modes generated nonlinearly by the symmetries be the low momentum limit of actual physical modes, i.e. they must obey adiabatic mode conditions (see §2.2). For the low momentum modes (and for them only), it is sufficient to consider the linearized Einstein equations, and study the time-dependence they imply for the perturbations. The linearized Einstein equations in Newtonian gauge are: We have allowed the possibility that there might be multiple fluid components present (for instance dark matter, baryons, radiation, etc.), hence the summation on the right hand side, though we suppress the label for each component. Also useful are the linearized conservation equations, assuming each fluid is individually conserved. The continuity equation for each fluid component is where δ n is related to the density fluctuation δ ≡ (ρ −ρ)/ρ by (1 + w)δ n = δ, with w = P/ρ being the equation of state parameter of the fluid component of interest. This definition of δ n is motivated by the fact thatρ ∝ a −3(1+w) , and so it isρ [1/(1+w)] that redshifts like a −3 , i.e. one can think of n ≡ ρ [1/(1+w)] as the "number" density, and of δ n as its fractional (small) fluctuation (for instance, for w = 1/3, n would be the number density of photons). In deriving Eq. (143), it is useful to know H 2 −H ′ = 4πGa 2 (ρ+P ). Note also that, in an analogous manner to Eq. (67): The relativistic Euler equation for each component is: where we have assumed the fourth Einstein equation has a vanishing source. 
38 Decomposing this last equation into scalar, vector and tensor parts, we have Following Weinberg [25], we demand that the (nonlinear part of the) symmetry-generated perturbations, as described in §3.1), solve the Einstein equations in a non-trivial way, that is, in a way that works even if we deform those perturbations slightly away from the zero momentum q = 0 limit. (See Eq. 14 for the Newtonian analog of this statement.) For scalar fluctuations, we therefore insist: scalar adiabatic mode condition : Ψ = Φ , −(Ψ ′ + HΦ) = 4πGa 2 (ρ +P )π , vector adiabatic mode condition : (151) Note that we do not wish to simply set the gradient to zero, because we are interested in diffeomorphisms generating a γ ij that is the soft limit of a finite momentum physical mode. Applying the above conditions to the (nonlinear part of the) symmetry-generated perturbations (Eqs. 60 and 65), we obtain: The first equality enforces Φ = Ψ. The second equality enforces the second part of the scalar adiabatic mode condition, with the understanding that in the soft limit, all fluid components share the same velocity perturbation π. The third equality equates the tensor mode with the traceless part of the spatial metric generated by the diffeomorphism -this holds only if a certain gauge condition is satisfied such that the scalar contribution to the spatial metric resides entirely in its trace (see below). As far as the adiabatic mode condition is concerned, the important point is that γ ij defined this way satisfies the tensor equation of motion (151). These three expressions constitute the adiabatic mode conditions on residual diffeomorphisms in Newtonian gauge. For a diffeomorphism to respect the Newtonian gauge, it must satisfy such that ∆ nl g 0i = 0 and the traceless part of ∆ nl g ij is transverse (see Eq. 60). As discussed in §3.1, one way to organize the set of Newtonian-gauge diffeomorphisms that satisfy Eqs. 
(152) and (153) is to relate each such diffeomorphism to a corresponding known residual diffeomorphism in the unitary gauge, $\xi_{\rm unit.}$ (Eq. 63):

where the time-independent $\xi^i_{\rm unit.}$ is supplemented by a time- and space-diffeomorphism $\xi^0_{\rm add.}$, $\xi^i_{\rm add.}$. The time-independent unitary-gauge diffeomorphism $\xi^i_{\rm unit.}$ satisfies Eq. (57). Comparing this with Eq. (153), we see that $\xi^i_{\rm add.}$ itself must satisfy the same condition: $\nabla^2 \xi^i_{\rm add.} + \partial_i(\partial_k \xi^k_{\rm add.})/3 = 0$. For this reason, we might as well absorb any time-independent part of $\xi^i_{\rm add.}$ into the definition of $\xi^i_{\rm unit.}$. From the second condition in Eq. (152), we see that $\partial_i \xi^i_{\rm add.}$ must be independent of time. Suppose it is equal to some function $f(x)$. One can express $\xi^i_{\rm add.}$ as a gradient and a curl (plus possibly some function that depends only on time). The divergence of the gradient is what matches up with $f(x)$, i.e. the gradient part is time-independent, and so by definition, it should have been absorbed into $\xi_{\rm unit.}$ already. Thus, we can set $f(x) = 0$ and assume $\partial_i \xi^i_{\rm add.} = 0$ without loss of generality. The first condition of Eq. (152) thus tells us

Recall from Eq. (65) that $\Delta_{\rm nl.}\pi = \xi^0 = \xi^0_{\rm add.}$. From Appendix C, we see that $\pi$ in the soft limit has the time dependence of $D'$, where $D(\eta)$ is the linear growth factor satisfying:

$$D'' + 2HD' - c = 0 , \tag{156}$$

where $c$ is a constant. Comparing this against Eq. (155) and keeping only the growing solution, we see that

confirming the time-diffeomorphism of Eq. (61). We can then solve for $\xi^i_{\rm add.}$ from the first expression of Eq. (153), which tells us $\partial_i \xi^0_{\rm add.} = \partial_0 \xi^i_{\rm add.}$, i.e.
where the second equality follows from Eq. (57). This confirms the space-diffeomorphism of Eq. (61). As a self-consistency check, one can see that Eq. (57) also implies that $\partial_i \xi^i_{\rm add.} = 0$. Lastly, it can also be checked that the tensor mode created by this diffeomorphism (the third expression of Eq. 152) obeys the tensor equation of motion. To see this, it is useful to note that because $\xi^i_{\rm unit.}$ satisfies $\nabla^2 \xi^i_{\rm unit.} + \partial_i(\partial_k \xi^k_{\rm unit.})/3 = 0$, we also know $\nabla^2 \partial_k \xi^k_{\rm unit.} = 0$, $\nabla^2 \partial_i \xi^j_{\rm unit.} = -\partial_i \partial_j \partial_k \xi^k_{\rm unit.}/3$, and $\nabla^2 \nabla^2 \partial_i \xi^j_{\rm unit.} = 0$.$^{39}$ It is worth noting that for pure tensor symmetries, where $\partial_i \xi^i_{\rm unit.} = 0$, both $\xi^0_{\rm add.}$ and $\xi^i_{\rm add.}$ vanish, and so the pure tensor symmetries coincide in the Newtonian gauge and unitary gauge, as they should.
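The three identities for $\xi^i_{\rm unit.}$ quoted in this Appendix follow in a few lines from the condition $\nabla^2 \xi^i_{\rm unit.} + \partial_i(\partial_k \xi^k_{\rm unit.})/3 = 0$. The sketch below assumes only that partial derivatives commute and that spatial indices are raised and lowered with $\delta_{ij}$, so that index placement is immaterial; the subscript "unit." is dropped for brevity.

```latex
\begin{align*}
% Starting point: the condition on the unitary-gauge diffeomorphism
&\nabla^2 \xi^i + \tfrac{1}{3}\,\partial_i(\partial_k \xi^k) = 0 . \\
% (1) Take the divergence \partial_i of the starting point
&\partial_i:\quad
  \nabla^2 \partial_i \xi^i + \tfrac{1}{3}\,\nabla^2 \partial_k \xi^k
  = \tfrac{4}{3}\,\nabla^2 \partial_k \xi^k = 0
  \;\;\Longrightarrow\;\; \nabla^2 \partial_k \xi^k = 0 . \\
% (2) Act with \partial_i on the j-th component of the starting point
&\nabla^2 \partial_i \xi^j = \partial_i\!\left(\nabla^2 \xi^j\right)
  = -\tfrac{1}{3}\,\partial_i \partial_j \partial_k \xi^k . \\
% (3) Apply \nabla^2 to (2) and use (1)
&\nabla^2 \nabla^2 \partial_i \xi^j
  = -\tfrac{1}{3}\,\partial_i \partial_j \left(\nabla^2 \partial_k \xi^k\right) = 0 .
\end{align*}
```

Step (1) is also what makes the pure tensor case special: when $\partial_k \xi^k = 0$ the right hand side of step (2) vanishes identically.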

C Derivation of the General Relativistic Velocity Equation
Our goal in this Appendix is to derive the following equation for the velocity potential $\pi$:

$$(\pi' + 2H\pi - C)' - 3wH(\pi' + 2H\pi - C) = w(g' + Hg) - (1 + 3w)\left[ (H^2 - H')\pi - 4\pi G a^2 (\bar\rho + \bar P)\pi \right] , \tag{159}$$

where $g \equiv \int d\eta\, \nabla^2 \pi$, and $C$ is a constant in time but not space (determined by initial conditions). Here, $\pi$ refers to the velocity potential of some particular fluid component of interest with an equation of state parameter $w$, except in the very last term where $(\bar\rho + \bar P)\pi$ refers to a sum over all fluid components. We will use this equation to deduce useful statements about the time-dependence of $\pi$ in the soft limit. The continuity equation (143) can be integrated once to obtain:

$$\delta_n = 3(\Psi + C) - g , \qquad g \equiv \int d\eta\, \nabla^2 \pi , \tag{160}$$

where $C$ denotes some integration constant, independent of time but dependent on space in general. This can be substituted into (the scalar part of) the relativistic Euler equation (145) and integrated once to give

$$\pi' + H(1 - 3w)\pi = -(1 + 3w)\Psi - 3wC + wg . \tag{161}$$

$^{39}$ In other words, the combined action of $\xi^i_{\rm unit.} + \xi^i_{\rm add.}$ generates a tensor mode of the form $\gamma_{ij} = [1 + (D/c)\nabla^2]\gamma^{\rm unit.}_{ij}$, where $\gamma^{\rm unit.}_{ij}$ is the tensor mode generated by the time-independent unitary diffeomorphism alone. That the constant tensor (growing) mode gets corrected at finite momentum by a term proportional to momentum squared should not be surprising. The time dependence can also be checked explicitly by solving the tensor equation of motion in the small but finite momentum limit.
On the other hand, (the scalar part of) the $\delta G^0{}_i$ equation (140) can be integrated once to obtain

$$-(\Psi' + H\Psi) = 4\pi G a^2 (\bar\rho + \bar P)\pi , \tag{162}$$

where we have assumed $\Psi = \Phi$. One can solve for $\Psi$ from Eq. (161), substitute the result into Eq. (162), and subtract $(H^2 - H')\pi$ from both sides. This gives Eq. (159). Note that in this derivation, we have not thrown away any gradient terms, i.e. we have not made any super-Hubble approximation. Equation (159) simplifies if $\pi$ happens to be the same for all fluid components, in which case what appears within the square brackets [ ] sums to zero, by virtue of $H^2 - H' = 4\pi G a^2 (\bar\rho + \bar P)$. This happens, for instance, if we work on super-Hubble scales and assume adiabatic initial conditions. One can check that this is a self-consistent solution on super-Hubble scales: assuming all fluid components move with the same $\pi$, the entire right hand side of Eq. (159) vanishes, implying:

$$\pi' + 2H\pi - C \propto a^{3w} . \tag{163}$$

This suggests different fluid components (with different $w$'s) evolve differently, unless the proportionality constant is in fact zero, i.e.

$$\pi' + 2H\pi - C = 0 . \tag{164}$$
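The algebra leading from Eqs. (161) and (162) to Eq. (159) can be sketched as follows. The shorthands $X \equiv \pi' + 2H\pi - C$ and $S \equiv 4\pi G a^2 (\bar\rho + \bar P)\pi$ (the sum over fluid components) are introduced here only for compactness and are not notation from the text; $C$ is constant in time, and $\Psi = \Phi$ is assumed as in Eq. (162).

```latex
\begin{align*}
% Eq. (161), rewritten in terms of X = \pi' + 2H\pi - C
X &= (1+3w)\,(H\pi - \Psi - C) + w g , \\
% Time derivative (C' = 0)
X' &= (1+3w)\,(H'\pi + H\pi' - \Psi') + w g' , \\
% Eq. (162): eliminate \Psi'
-\Psi' &= H\Psi + S , \\
% Eliminate \Psi and \pi' using the first line and the definition of X
(1+3w)\,H\Psi &= (1+3w)\,H\,(H\pi - C) + w H g - H X ,
\qquad H\pi' = H X - 2H^2\pi + H C , \\
% Collect terms: the C-dependent pieces cancel
X' - 3wH X &= w\,(g' + H g) - (1+3w)\left[ (H^2 - H')\pi - S \right] .
\end{align*}
```

The last line is Eq. (159). When its right hand side vanishes, $X' = 3wHX$ with $H = a'/a$ integrates to $X \propto a^{3w}$, which is time-independent for every $w$ only if $X = 0$.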
With this choice of the initial condition, it is thus consistent to have the same $\pi$ for all fluid components on super-Hubble scales. Interestingly, for pressureless matter ($w = 0$), Eq. (164) holds even on sub-Hubble (but linear) scales, after radiation domination. This can be seen by setting $w = 0$ in Eq. (159), and noting that during matter or cosmological constant domination, the terms within the square brackets [ ] still sum to zero. This means that for a wave-mode (of pressureless matter) that enters the Hubble radius after radiation domination, Eq. (164) holds for its entire history. For a wave-mode that enters the Hubble radius before matter domination, however, Eq. (164) does not hold in the intermediate period when the mode is within the Hubble radius during the radiation dominated phase.$^{40}$ As we see in §3.2, the fact that Eq. (164) holds for pressureless matter both inside and outside the Hubble radius (as long as the wave-mode of interest crosses the Hubble radius after radiation domination) enables us to have interesting consistency relations in the Newtonian limit. It is also worth noting that since $\partial_i \pi$ describes the dark matter velocity on all (linear) scales, including sub-Hubble ones, where we know the velocity scales with time as $D'$ ($D$ being the linear growth factor), we expect

$$D'' + 2HD' - c = 0 , \tag{165}$$

where $c$ is some constant whose normalization is arbitrary: it is tied to the normalization of the growth factor $D$. That this relation holds for the Newtonian growth factor in a matter dominated universe is easy to check: $D \propto a$ (with $a \propto \eta^2$ and $H = 2/\eta$, taking $D = \eta^2$ gives $D'' + 2HD' = 10$). That this is true for more general cases is less familiar. Let us check this for a universe with a cosmological constant.
For a flat universe with pressureless matter and a cosmological constant, the linear growth factor can be written in closed form [32]:

$^{40}$ It is worth pointing out that Eq. (164), when substituted into Eq. (161), gives $\Psi = -(\pi' + H\pi)$; this holds as long as the $wg$ term can be ignored, which can be justified either for super-Hubble scales, or for $w = 0$.
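The statement that the matter velocity scales as $D'$ while obeying Eq. (164) amounts to $D'' + 2HD'$ being constant in time, and this can be cross-checked numerically for a flat universe with matter and a cosmological constant. The sketch below is only illustrative. It assumes the standard integral form of the growing mode, $D(a) = \frac{5}{2}\Omega_m E(a)\int_0^a da'/[a'E(a')]^3$ with $E \equiv H_{\rm proper}/H_0$ (which may differ in form from the closed-form expression of [32]), units with $H_0 = 1$, and $\Omega_m = 0.3$ as an example value. It also takes $H$ in the text to be the conformal rate $a'/a = aE$, as the identity $H^2 - H' = 4\pi G a^2(\bar\rho + \bar P)$ suggests, so that $d/d\eta = a^2 E\, d/da$. All function names are hypothetical.

```python
import math

def hubble_ratio(a, omega_m):
    """E(a) = H_proper / H0 for a flat universe with matter and a cosmological constant."""
    return math.sqrt(omega_m / a**3 + (1.0 - omega_m))

def growth_factor(a, omega_m, n=2000):
    """Growing mode D(a) = (5/2) Omega_m E(a) * int_0^a da' / (a' E)^3 (assumed form).

    The substitution a' = u^2 makes the integrand smooth at u = 0, so a plain
    composite Simpson rule (n even) is adequate.
    """
    omega_l = 1.0 - omega_m

    def integrand(u):
        # da' / (a' E)^3 with a' = u^2 becomes 2 u^4 du / (Omega_m + Omega_L u^6)^(3/2)
        return 2.0 * u**4 / (omega_m + omega_l * u**6) ** 1.5

    upper = math.sqrt(a)
    h = upper / n
    total = integrand(0.0) + integrand(upper)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return 2.5 * omega_m * hubble_ratio(a, omega_m) * total * h / 3.0

def d_eta(func, a, omega_m, step=1e-3):
    """Conformal-time derivative of func(a): d/d(eta) = a^2 E d/da (H0 = 1).

    Uses a central finite difference in a.
    """
    slope = (func(a + step) - func(a - step)) / (2.0 * step)
    return a**2 * hubble_ratio(a, omega_m) * slope

def velocity_constant(a, omega_m):
    """Return D'' + 2 H D', with H = a E the conformal Hubble rate."""
    def d_prime(x):
        return d_eta(lambda y: growth_factor(y, omega_m), x, omega_m)

    d_double_prime = d_eta(d_prime, a, omega_m)
    return d_double_prime + 2.0 * a * hubble_ratio(a, omega_m) * d_prime(a)

if __name__ == "__main__":
    for a in (0.2, 0.5, 1.0):
        print(a, velocity_constant(a, omega_m=0.3))
```

With this normalization ($D \to a$ at early times) the combination should come out close to $5\Omega_m/2$ in units of $H_0^2$ at every scale factor, up to finite-difference error; setting `omega_m = 1.0` recovers the matter dominated case $D = a$.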
for ($n > -1$), where the $M$'s are constant and obey the usual transversality and adiabatic transversality conditions. Can we derive consistency relations for the decaying modes using these symmetries, using Eq. 22 or its generalization Eq. 105? We argue that the answer is no, though it is not enough to say that these modes simply decay away. Rather, keeping the decaying modes would correspond to a nonstandard choice of the initial vacuum state in the far past: if the decaying mode is not set to zero, the energy associated with these modes (the scalar part of the action scales like $\rho a^4 \sim a^2 (\nabla\Phi)^2 \sim H^2/a^2$) becomes divergent at early times. Had we chosen to ignore this problem and work within the putative vacuum containing only decaying modes, we could have derived such relations, in which case the lack of the time-dependent piece would make our consistency relations look slightly different from Eq. 111. Note that it is not true that the consistency relations should vanish in this case because the modes decay at late times: the consistency relation involves a ratio between the (N+1)-point function and the power spectrum, both of which decay, but the ratio on the right hand side does not.
For the sake of completeness, we discuss the case where the symmetries may involve vector modes; unlike the scalars and tensors, these have only a decaying solution. The condition on the spatial part of the diffeomorphism will still be obeyed, but the condition $\partial_0 \xi^i = \partial_i \xi^0$ will be violated and replaced by the weaker condition

where in the second equality we have made use of the second equation in Eq. 152. Using the vector adiabatic mode conditions (Eq. 150) we have

where $\bar\xi^i$ is transverse and time-independent. The second of these conditions is clearly incompatible with the first condition in Eq. 152: unless $(\partial_i \xi^0)_{\rm vec.}$ vanishes, and so the vector part of the symmetry will be $\bar\xi^i \int d\eta / a^2 \subset \xi^i$. We can Taylor expand

$$\bar\xi^i = \frac{1}{(n+1)!}\, \bar M_{i \ell_0 \ell_1 \cdots \ell_n}\, x^{\ell_0} \cdots x^{\ell_n} . \tag{178}$$

The $\bar M$'s are completely traceless, and they obey the usual tensor transversality conditions Eqs. 100, 101 as well, so these are vector-tensor symmetries. Note that they obey the tensor equation of motion Eq. 151 on superhorizon scales, though they correspond to the decaying mode solution. For $n > -1$ there will be 4 such symmetries at each level. For $n = 0$, there are additional symmetries for which $\bar M_{i \ell_0}$ is antisymmetric in its indices; these correspond to time-dependent rotations.$^{41}$ They will obey the further adiabatic transversality condition

$$\hat q^i \left( \bar M_{i \ell_0}(q) - \bar M_{\ell_0 i} \right) = 0 , \tag{179}$$

which will reduce the number of allowed polarizations from 3 to 2. Since a localized rotation necessarily involves shearing, we need this condition to enforce transversality in addition to the antisymmetric tensor structure.
To summarize, at $n = 0$ there are two purely vector symmetries, and for $n > -1$ there are four vector + tensor symmetries. Since vector modes always decay, for our choice of vacuum there are no consistency relations that involve vector modes.