Ostrogradsky in Theories with Multiple Fields

We review how the (absence of) Ostrogradsky instability manifests itself in theories with multiple fields. It has recently been appreciated that when multiple fields are present, the existence of higher derivatives may not automatically imply the existence of ghosts. We discuss the connection with gravitational theories like massive gravity and beyond Horndeski which manifest higher derivatives in some formulations and yet are free of the Ostrogradsky ghost. We also examine an interesting new class of Extended Scalar-Tensor Theories of gravity which has been recently proposed. We show that for a subclass of these theories, the tensor modes are either not dynamical or are infinitely strongly coupled. Among the remaining theories for which the tensor modes are well-defined one counts one new model that is not field-redefinable to Horndeski via a conformal and disformal transformation but that does require the vacuum to break Lorentz invariance. We discuss the implications for the effective field theory of dark energy and the stability of the theory.


Introduction
Given the wealth of data on gravitational physics coming from cosmology with surveys such as Euclid [1] and the Large Synoptic Survey Telescope (LSST) [2], and the new era of gravitational wave astronomy that has been opened up by LIGO [3], it is important to develop theoretically consistent alternative theories of gravity to test against observations. This is especially timely as some modified theories of gravity can help to resolve important open problems in cosmology such as the nature of dark energy or can have distinctive signatures in inflation or in astrophysical observations [4][5][6][7]. From a theoretical perspective, modifications of gravity are interesting to study in their own right as they lead to insights on what properties a consistent theory must have.
A major challenge in constructing consistent theories of gravity is avoiding the presence of an Ostrogradsky instability (or Ostrogradsky ghost) [8] (see [9] for a review of problems with the Ostrogradsky ghost and [10] for methods for constructing theories without ghosts). The Ostrogradsky instability is a kinetic instability with an arbitrarily fast time scale, which can only be avoided if one includes new operators at the same scale as that of the ghost which remove the instability (i.e. the interaction scale of a ghost should always be at least of the order of the cutoff of the theory). An Ostrogradsky instability may arise in generic modifications of gravity as they typically introduce higher derivative interactions which tend to excite the Ostrogradsky mode, unless great care is taken in choosing the form of these interactions.
Typically the Ostrogradsky instability is associated with the equations of motion involving third or higher order time derivatives. In the case of a single field, there is indeed a direct link between higher order equations of motion and the existence of an unstable, propagating Ostrogradsky mode. However when multiple fields are present, diagnosing an Ostrogradsky instability may be more subtle.
In the context of massive gravity this subtlety was for instance realized in the Stückelberg language [11] and the helicity language in [12], where the existence of higher derivatives is manifest beyond the decoupling limit. Historically, a problem with massive gravity has been the existence of the Boulware-Deser (BD) ghost mode [13]. In the decoupling limit, the BD ghost can be identified with higher order equations of motion for the helicity-0 mode [14,15]. However, beyond the decoupling limit, the equations of motion for the Stückelberg fields are higher order, even when the potential is chosen to avoid the BD ghost. To resolve this apparent paradox it is necessary to realize that the number of degrees of freedom is determined by the number of independent pieces of initial data that are needed to evolve the system. This cannot necessarily be read directly from the order of the equations of motion when multiple fields are present, because the equation of motion for one field can involve the derivative of the equation of motion of another field. This possibility is related to the existence of well-defined, invertible field redefinitions which can change the order of the equations of motion (but of course without changing the number of pieces of initial data that need to be specified).
Precisely this same subtlety can be used in constructing consistent scalar-tensor theories. The most general class of scalar-tensor theories in which both the metric and the scalar field have second order equations of motion is known as Horndeski theory [16], see also [17].
As scalar tensor theories, Horndeski theories have many important applications in cosmology ranging from inflation (see for example [18][19][20]) to late universe cosmology (see for instance [17] and references therein).
However just as is the case for massive gravity, demanding that the equation of motion for every field be second order may be overly restrictive. Indeed an explicit construction of such interactions has been performed and is known as 'Beyond Horndeski' [21,22]. Even though they were discovered later, Beyond Horndeski theories belong in the same class as the Horndeski theories in the sense that they are scalar-tensor theories avoiding an Ostrogradsky instability, and can be applied to phenomenology. It is then natural to wonder whether Beyond Horndeski is equivalent to Horndeski after a field redefinition. This interesting question was investigated in [21][22][23][24][25]. In [26] it was shown that the higher derivatives that appear in the equations of motion of some of these theories can be eliminated and a Hamiltonian analysis for a specific model showed the existence of a primary constraint. Then a fully systematic analysis for all Horndeski and Beyond Horndeski theories was performed in [27], proving that the number of propagating degrees of freedom in these theories is three.
Inspired by the existence of Beyond Horndeski, a very natural question in the context of scalar-tensor theories is: What is the broadest possible generalization of Horndeski which is free from the Ostrogradsky instability? Interesting progress along these lines was recently made by [28][29][30][31][32], leading to new classes of scalar-tensor theories, first found in [28], which are endowed with an additional constraint that eliminates the ghost. Those theories have been dubbed Extended-Scalar-Tensor theories of gravity (ESTs) [29] or DHOST (Degenerate Higher Order Scalar-Tensor Theories) [30].
In this paper we review some of the considerations that arise in analyzing the Ostrogradsky instability in theories with multiple fields. First we present a general discussion about avoiding Ostrogradsky instabilities that emphasizes similarities between developments in massive gravity and in scalar-tensor theories. The fact that these similarities exist is perhaps not surprising as massive gravity can be viewed as a theory of gravity coupled to scalar fields forming a non-linear sigma model as discussed by [33,34].
Then, we further analyze the nature of the degrees of freedom in these ESTs. While these theories have a primary constraint by construction and so the total number of propagating degrees of freedom is less than four (which would correspond to gravity in addition to a scalar field and its Ostrogradsky ghost), the next logical step is to determine exactly how these degrees of freedom are distributed among the scalar and tensor sectors. We find that in some cases, the tensor is either not dynamical (as also found in [28,30]) or infinitely strongly coupled.
Among the remaining possibilities, one counts Horndeski and its field redefinitions, as well as a specific EST model for which the tensor modes are well-behaved. For the latter we show how the time derivatives of the lapse can always be removed in unitary gauge. We discuss the implications for the construction of an effective field theory for dark energy [23,35,36,37,38] (or for cosmology in general).
The rest of this work is organized as follows. In section 2 we review techniques for diagnosing an Ostrogradsky ghost in theories with multiple fields and discuss their relevance for gravitational theories. In section 3 we then review the recently proposed EST theories and the existence of a primary constraint as well as the propagation of tensor modes. We then look at a special case of EST in section 4 which differs from Horndeski and beyond Horndeski. We show how the presence of time derivatives on the lapse in unitary gauge can always be absorbed and discuss the implications for the construction of an effective field theory for cosmology. We also analyse the stability of this class of models and show the existence of gradient instabilities where no other interactions are present. Finally we summarize our results in section 5.

Counting the number of Degrees of Freedom
For a single field theory, the existence of higher-time derivatives leads to the well-known Ostrogradsky instability [39] which manifests itself as an additional ghost degree of freedom (dof) which suffers from a kinetic instability and leads to an inconsistent theory (see Ref. [9] for the different consequences of this instability).
On the other hand, in theories with multiple fields, the existence of higher derivatives may not necessarily immediately lead to such an Ostrogradsky pathology. To our knowledge, one of the first explicit realizations of this case manifested itself within the context of ghost-free massive gravity [40,41], which was proven to be entirely free of ghosts in [42,43] and yet higher derivatives are manifestly present when considering interactions beyond the decoupling limit. The reason why higher derivatives are not necessarily fatal in theories involving multiple fields (independently of their exact nature) was highlighted in [11,12] and lies in the possibility to perform field redefinitions (rotating the field space variables) in a way that does not change the number of dof. Take for instance the two-scalar field toy-model (2.1) in flat spacetime proposed in [12], where later on h may symbolically play the role of the gravitational field and φ the scalar field in a tensor-scalar theory of gravity, and $X = (\partial\phi)^2$. As written, the theory (2.1) manifestly involves higher derivatives. Yet, as highlighted in [12], (2.1) satisfies unitarity and all its scattering amplitudes are trivial. Explicitly computing the scattering amplitudes for arbitrary N point functions and checking whether they satisfy unitarity is perhaps one of the most unambiguous ways to determine whether the theory exhibits a ghost, but there exists a multitude of other ways to establish whether or not the theory (2.1) is free of any Ostrogradsky instability. The most standard way to determine the number of dofs and the existence of ghosts is to perform a full Hamiltonian constraint analysis. In what follows we also present a few alternative tricks to establish the number of dofs.
1.-Equations of motion and number of initial conditions: One way to establish the absence of Ostrogradsky instability, for instance in (2.1), is to determine the number of initial conditions needed to solve the equations of motion. While higher derivatives are present in the equations of motion with respect to φ, no higher derivatives are present in the equations of motion with respect to h, and all the higher derivatives in the equation of motion for φ actually disappear once the equation of motion for h is solved. This means that one only needs to specify two initial conditions per field to solve for the system, and the theory is not genuinely higher derivative. This counting is similar to that emphasized in [11,44,45]. Notice however that in a theory with gauge symmetries one should also ensure that what should be auxiliary variables do not become dynamical in that process.
2.-Field Redefinition: Another way to see the absence of pathology is simply to perform the well-defined and fully invertible field redefinition (2.5). In terms of these new variables, the Lagrangian is manifestly healthy. We emphasize that the existence of multiple fields is crucial in avoiding ghosts associated with higher derivatives. Indeed the field redefinition (2.5) is only well-defined because it corresponds to shifting the field h by another field. On the other hand, another field redefinition of the form $\phi = \tilde\phi + \frac{1}{\Lambda^5}(\partial_\mu\partial_\nu\tilde\phi)^2$ would not be fully invertible and would hide non-perturbative dofs.
The existence of such a field redefinition is simple to establish in this scalar-field toy-model but in a gravitational theory, as in massive gravity or beyond Horndeski, it is highly non-trivial to perform these redefinitions and more systematic arguments can then be employed, including a full Hamiltonian analysis.
3.-Hessian: Another way to establish whether or not the theory (2.1) admits an Ostrogradsky ghost is to look at the Hessian of the dynamical variables as was argued in [11]. For simplicity and without loss of generality, we consider the ultra-local limit of the theory where both fields only depend on time. Since the field φ enters with up to three time-derivatives we define two new variables v and w which are set respectively to $\dot\phi$ and $\dot v$ with two Lagrange multipliers, and rewrite the Lagrangian accordingly (after appropriate integrations by parts). It is then obvious that h and w do not have an independent conjugate momentum, and this statement can be written more clearly by establishing the rank of the Hessian $H_{ab} = \partial^2 L/\partial\dot\Psi^a\partial\dot\Psi^b$, where the Lagrangian is written in first order form and the $\Psi^a$ represent all the fields involved, $\Psi^a = \{h, \phi, v, w\}$. One can easily check that in this case the rank of the Hessian is two, corresponding to two dynamical dof. We stress that the vanishing of the determinant of the Hessian only indicates the existence of a primary constraint. To remove a full dof, a secondary constraint should also be present. However without parity violation there cannot be a half-integer number of dof, and therefore the existence of a secondary constraint is guaranteed in any theory which for instance preserves Lorentz invariance. In some of the theories which we will be looking at below, Lorentz invariance is broken and in these cases the existence of a secondary constraint is no longer necessarily guaranteed.
4.-Hamiltonian analysis: Another direct and unambiguous way to establish whether or not the theory (2.1) admits an Ostrogradsky ghost is to perform a proper Hamiltonian analysis. However we emphasize that a Hamiltonian analysis is not the only way, as we have shown through the previous arguments. Once again, for simplicity and without loss of generality, we consider the ultra-local limit of the theory where both fields only depend on time. Since the field φ enters with up to three time-derivatives we define two new variables v and w which are set respectively to $\dot\phi$ and $\dot v$ with two Lagrange multipliers, and rewrite the Lagrangian accordingly (after appropriate integrations by parts). It is then obvious that h and w do not have an independent conjugate momentum. In this language, $\lambda_{1,2}$ are auxiliary variables (Lagrange multipliers) while h, φ, v and w are (in principle) dynamical variables, each with its own conjugate momentum. As we have seen, the Hamiltonian analysis is a very clean way to establish the number of dof but not the only way. Moreover we emphasize that there are other relevant questions besides the number of dof, such as the scale at which degrees of freedom may enter, that may be easier to see in a different language.
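The counting of initial data, the field-redefinition argument and the degeneracy of the Hessian can all be illustrated on an even simpler ultra-local model. The following is only a sketch: the Lagrangian below and the constant $c$ (standing in for an inverse power of Λ) are assumptions introduced for illustration and are not the model (2.1) of [12].

```latex
% Hypothetical ultra-local toy model (NOT the model (2.1)); c is an assumed constant.
% Start from two free fields and shift h by a derivative of the other field:
\begin{align}
  \tilde L &= \tfrac12\,\dot{\tilde h}^2 + \tfrac12\,\dot\phi^2 ,
  \qquad \tilde h = h + c\,\dot\phi
  \;\Longleftrightarrow\; h = \tilde h - c\,\dot\phi , \\
  L &= \tfrac12\,\big(\dot h + c\,\ddot\phi\big)^2 + \tfrac12\,\dot\phi^2 .
\end{align}
% Equations of motion of L (Euler-Lagrange with second derivatives of phi):
\begin{align}
  \delta h:    &\quad E_h \equiv \ddot h + c\,\dddot\phi = 0 , \\
  \delta\phi:  &\quad -\ddot\phi + c\,\frac{d}{dt}\big(\ddot h + c\,\dddot\phi\big)
                 = -\ddot\phi + c\,\dot E_h = 0 .
\end{align}
% On-shell E_h = 0, so \ddot\phi = 0 and then \ddot h = 0:
% the initial data are (h, \dot h, \phi, \dot\phi), i.e. two per field.
% Degeneracy: with v = \dot\phi, the kinetic matrix in (\dot h, \dot v) is
\begin{equation}
  \begin{pmatrix}
    \partial^2 L/\partial\dot h^2 & \partial^2 L/\partial\dot h\,\partial\dot v \\
    \partial^2 L/\partial\dot v\,\partial\dot h & \partial^2 L/\partial\dot v^2
  \end{pmatrix}
  = \begin{pmatrix} 1 & c \\ c & c^2 \end{pmatrix} ,
  \qquad \det = 0 ,
  \qquad p_v - c\,p_h = 0 .
\end{equation}
```

In this sketch the φ equation looks fourth order, but its higher-derivative part is the time derivative of the h equation, so only four pieces of initial data are needed. The vanishing determinant signals the primary constraint $p_v = c\,p_h$, in line with the Hessian argument of item 3.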

Setting vs Decoupling
As we have seen in the previous section, a theory which involves multiple fields can include higher derivatives without necessarily suffering from an Ostrogradsky instability. In what follows we shall emphasize the distinction between setting a variable to a given value and taking an appropriate decoupling limit. This distinction is important in the case of gravity where h may for instance symbolically play the role of the metric (or the metric fluctuation about flat spacetime).
First we point out that taking a healthy theory with auxiliary variables and making these auxiliary variables dynamical is not a consistent procedure and can very well change the number of dofs. This confusion is at the origin of the results of [46,47] where the terms proposed manifestly exhibit a ghost as shown explicitly in [45,48,49,50].
Similarly, the opposite procedure of 'ignoring' the kinetic term of a field and setting it as fixed or considering it as an auxiliary variable would not be a consistent procedure and typically changes the number of dofs. For instance, starting from the healthy theory (2.1) and simply setting h = 0 would lead to a sick theory which has a ghost at the scale Λ.
Rather, if one wants to decouple the two fields h and φ, it is instead possible to take a scaling limit which preserves the kinetic terms of the dynamical degrees of freedom while sending their interactions to zero. In the scalar field example (2.1), we see that the scalar fields are already canonically normalized and scaling the interactions between the two fields corresponds to sending Λ → ∞ so that the cubic term in (2.1) scales out. However in the same process we see that the last term also scales away and the resulting decoupling limit, where the scalar field sees no interactions with h (or what plays the role of the gravitational field in this example, i.e. the scalar field sees flat spacetime in that limit), is a theory which is manifestly well-defined. Notice that taking this decoupling limit does not 'kill' the interactions for φ itself that are present in P (φ, X).
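The difference between setting a field to zero and decoupling it can be made explicit on a simple ultra-local model. This is a sketch under stated assumptions: the Lagrangian and the constant $c$ (playing the role of an inverse power of Λ) are hypothetical and chosen for illustration; they are not the model (2.1).

```latex
% Hypothetical healthy two-field model, obtained from two free fields
% by the invertible shift \tilde h = h + c\,\dot\phi (c is an assumed constant):
\begin{equation}
  L = \tfrac12\,\big(\dot h + c\,\ddot\phi\big)^2 + \tfrac12\,\dot\phi^2 .
\end{equation}
% (a) "Setting" h = 0 by hand gives a genuinely higher-derivative single-field theory:
\begin{align}
  L\big|_{h=0} &= \tfrac12\,c^2\,\ddot\phi^2 + \tfrac12\,\dot\phi^2
  \quad\Longrightarrow\quad c^2\,\ddddot\phi - \ddot\phi = 0 , \\
  \phi(t) &= a + b\,t + C\,e^{t/c} + D\,e^{-t/c} .
\end{align}
% Four pieces of initial data for one field: the extra pair (C, D) is the Ostrogradsky mode.
% (b) Decoupling limit c -> 0 with the canonically normalized h and \phi held fixed:
\begin{equation}
  L \;\xrightarrow{\;c\to 0\;}\; \tfrac12\,\dot h^2 + \tfrac12\,\dot\phi^2 ,
\end{equation}
% two free fields, two pieces of initial data each.
```

In (a) the number of initial data per field has doubled, whereas in (b) the kinetic terms are preserved and only the mixing scales away, so the counting of the full theory is retained.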
Taking the decoupling limit argument from the other side, it means that if a theory is healthy, its decoupling limit ought to be healthy (i.e. exhibit the correct number of degrees of freedom), and since in the decoupling limit the different fields do not interact, the single field Ostrogradsky argument should be valid and the theory should exhibit no higher derivatives when the couplings between the different fields are scaled to vanish. In the context of scalar-tensor theories this implies that in the limit where gravity decouples the scalar theory should end up being a generalized Galileon [51], although taking that limit may not necessarily be trivial.
A potential loophole behind the previous decoupling limit argument is if the theory itself necessarily breaks Lorentz invariance (not necessarily in its formulation but in its allowed vacua). This is what happens in the new class of scalar-tensor theories presented in [28] and further in [29,30] which we review in what follows. One could then argue that by taking this road one could in principle allow the theory to have a preferred frame and hence allow theories which manifestly evade the Ostrogradsky ghost in a specific frame.
We follow the conventions of [29] and, where they overlap, our results agree with those of [30]. Following [28][29][30] we will consider the most general covariant action which is at most quadratic in second order derivatives on φ (and with no derivatives higher than two). The action takes the form (3.1), with the shorthand notation $\phi_\mu \equiv \partial_\mu\phi$ and $\phi_{\mu\nu} \equiv D_\mu D_\nu\phi$, as well as $X \equiv (\partial\phi)^2$. The functions G and $A_i$ ($i = 1, \dots, 5$) satisfy some relations which depend on the class of EST.
If G ≠ 0 to start with, then one can always go to Einstein frame and set G ≡ 1. However, G becomes relevant when coupling to matter. An important aspect of these types of theories is therefore their stability when matter coupling to the metric g is included.
All of the Lagrangians L i are quadratic in φ µν and each one of them leads to higher derivatives in the equations of motion. However there are special combinations of the L i for which the equations of motion are second order in derivatives and in flat spacetime this corresponds to the quartic Galileon [51,52]. In addition when interaction with gravity is included i.e. the theory has multiple fields, higher derivatives in the equations of motion are not necessarily fatal as illustrated in the previous examples and explained in [11,12].
This possibility was successfully exploited in [21] where a family of 'Beyond Horndeski' Lagrangians which have no ghost (at least without considering couplings to matter) was established. Very recently, this possibility was pushed even further in [28] where it was shown that, even besides 'Beyond Horndeski', there are other classes of EST for which the six functions $A_i$, G satisfy special relations which allow the theory to enjoy a primary constraint which potentially removes the Ostrogradsky instability. For these new ESTs, the functions P and Q can be freely chosen without affecting these conditions.
One important point to stress though, is that almost none of the new consistent EST/DHOST's proposed in [28] and [29] admit a Lorentz-invariant vacuum. We are therefore dealing with a new class of scalar-tensor theories for which the vacuum necessarily breaks Lorentz invariance and for which the scalar field gradient $\partial_\mu\phi$ is necessarily either spacelike or timelike and can never flip between the two. Since in the context of [29] and [21,36,53] these theories were originally developed with cosmological applications in mind, it does make sense to think of them in vacua where the scalar field is timelike.

Classes of EST/DHOST's
First let us give a broad overview of the set of EST theories. We will reproduce the exact relations defining these classes in Appendix A. There are several different classes of EST which were identified in [28]. We will follow the naming scheme of [29].
The functions P and Q can be specified arbitrarily without introducing an Ostrogradsky instability. The EST theories are therefore defined by the relationships between the functions $A_i$ and G. The classes are
• Minimal Cases: First there are theories where G = 0 (dubbed 'Class M'). As we will see in sections 3.2 and 3.3, the tensor modes have vanishing gradients in this case and are thus ill-defined. Further if $A_1 = 0$ the tensors are not dynamical fields.
• Non-Minimal Cases: The rest of the models all involve a non-vanishing G and are called 'Class N'. Among those,
-Class N-I: This includes Horndeski and Beyond Horndeski. As shown in [29], this entire class can be generated by a field redefinition from Horndeski (up to a few subtle cases where the field redefinition may not be well-defined).
-Class N-II: defined by $A_1 = -A_2 = G/X$. Just like in the minimal cases, this class is ill-defined as it has no propagating tensor modes, as we will discuss in section 3.3.
-Class N-III: Finally there are theories where $A_1 \neq -A_2$. There are two subclasses, N-III-i which is a rather special case for which $A_1 \neq G/X$ that we discuss in section 4, and N-III-ii for which $A_1 = G/X$ and which again has no propagating tensor modes as we discuss in section 3.3.
In [30] it was shown that each subclass below transforms into itself under a field redefinition of the form
$g_{\mu\nu} \to A(\phi, X)\, g_{\mu\nu} + B(\phi, X)\, \partial_\mu\phi\, \partial_\nu\phi$. (3.7)
Each class (except M-III) has 3 free functions. Therefore, we expect to be able to remove 2 of these 3 functions with the above field redefinition (except potentially for certain special cases where the field redefinition might fail to be invertible).

Non-dynamical tensors
To gain some insight on the behaviour of some of these theories, we consider the following example, which belongs to the class M-III considered in [28][29][30] (see (A.3) in Appendix A). For simplicity we may consider $A_2$ to be a constant although none of the arguments below are affected by this choice. Varying with respect to the metric and the scalar field we obtain strongly modified Einstein and Klein-Gordon equations, where we symmetrize the indices as $V_{(\mu\nu)} = \frac{1}{2}(V_{\mu\nu} + V_{\nu\mu})$.
From these equations of motion, it is now manifest that there are indeed fewer than four propagating dofs confirming the results of [28,29]. However those dofs are not split into two tensor modes and one scalar mode. Rather the theory only contains a scalar mode.
To see this explicitly, we can consider the following equation: As a result, in the vacuum, we need to have φ ≡ 0 and the rest of the dynamical equations are automatically satisfied. We emphasize that this is an entirely covariant statement and is hence background independent, the only assumption has been the absence of matter fields as well as setting the functions P (φ, X) = Q(φ, X) = 0. If on the other hand we had chosen to include the stress-energy tensor for other matter fields on the right-hand side of (3.10), those would have appeared on the right hand side of (3.11) and the rest of the equations of motion could be read as constraints for the first derivative of the metric, but there would still be no dynamical equations for the metric itself.
Having worked covariantly with a special example, we now turn to the more general classes of EST and perform an analysis of the tensor modes of FLRW to establish whether or not they are dynamical.

Propagating tensors
The EST theories were constructed to guarantee a primary constraint. Ideally this primary constraint will remove the Ostrogradsky ghost, leaving a healthy scalar sector, while leaving the dynamics of the tensor modes unchanged. However, a constraint analysis does not directly address this question. Therefore a natural first check is to understand how the tensor modes are affected by this constraint.
We now consider the general theory (3.1). So long as G ≠ 0, this theory has an explicit Einstein-Hilbert term and we would expect tensor modes to be propagating; however, as we shall see, this is not always the case. Consider for instance the tensor perturbations $h_{ij}$ on FLRW so that the metric is $g_{\mu\nu} = \bar g^{\rm FLRW}_{\mu\nu} + h_{\mu\nu}$. On this background solution, the quadratic Lagrangian for the tensor fluctuations contains a kinetic term proportional to $(G - XA_1)$, a gradient term proportional to G built from the three-dimensional spatial Laplacian $\nabla^2$, and an effective mass term $m_{\rm eff}$ that depends on the background profile (i.e. the background scalar field as well as the scale factor a and the lapse N). The exact expression for the effective mass term is not relevant to the discussion.
This result is easy to establish in FLRW but is actually much more general and background independent. Indeed so long as the scalar field is timelike we are always (at least locally) allowed to work in a gauge where φ = t (unitary gauge), and one can easily check that the previous results hold: i.e. the kinetic term of the tensors is proportional to $(G - XA_1)$ (where in unitary gauge $X = g^{00}$) and the gradient terms are always proportional to G. We therefore note two important cases:
• Case 1: $G = XA_1$. This applies to the classes M-III, N-II, and N-III-ii. In this case the tensor modes lose their kinetic terms. As a result, the tensor modes are not dynamical. This is in complete agreement with the other examples explored earlier where the EST considered lost the dynamical tensor modes and consistent with the results presented in [28].
• Case 2: $G = 0$, $A_1 \neq 0$. This applies to the classes M-I and M-II and in particular to the isolated quartic Beyond Horndeski model which is not field redefinable back to Horndeski [25]. In this case the tensor modes still have a kinetic term, but no gradient terms. This implies that the tensor modes are infinitely strongly coupled and hence are ill-defined.
Note that if φ was instead spacelike, for instance φ = φ(x), then the opposite would occur (the coefficient of the gradient along x and that of the kinetic term of the tensors would switch) and the two previous cases would remain pathological.
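The two cases above can be summarised with a schematic quadratic Lagrangian. This is only a sketch: the overall normalisation, the placement of the factors of $a$ and $N$, and the symbol $c_T$ are assumptions made for illustration; only the proportionality of the kinetic term to $(G - XA_1)$ and of the gradient term to $G$ is taken from the discussion above.

```latex
% Schematic structure (normalisation and factors of a, N are assumed, not derived here):
\begin{align}
  \mathcal L^{(2)}_{h} &\;\propto\; (G - X A_1)\,\frac{\dot h_{ij}^{\,2}}{N^2}
      \;-\; G\,\frac{(\partial_k h_{ij})^2}{a^2}
      \;-\; m_{\rm eff}^2\, h_{ij}^{\,2} , \\
  c_T^2 &\;\equiv\; \frac{\text{gradient coefficient}}{\text{kinetic coefficient}}
      \;=\; \frac{G}{G - X A_1} .
\end{align}
% Case 1: G = X A_1            -> kinetic coefficient vanishes, h_ij is not dynamical.
% Case 2: G = 0, A_1 \neq 0    -> c_T^2 = 0, kinetic term without gradient term.
% Healthy tensors require both G \neq 0 and G - X A_1 \neq 0 (with the same sign).
```

Read this way, a well-defined tensor sector needs both coefficients to be non-vanishing, which is the condition that singles out the classes kept in what follows.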
As a result, we can conclude that N-III-i is the only remaining new class of theories which is potentially well-defined and not field-redefinable to a Horndeski theory using a conformal and disformal transformation. We emphasize that N-III-i may still be field-redefinable to a more standard theory where the equations of motion are manifestly second order via a more general set of field transformations but finding the precise form of this more generic field transformation is in general difficult without more insight on how the theory is behaving. In what follows we shall explore this new theory N-III-i further and derive some implications for the construction of effective field theory for dark energy (or for cosmology in general).

Implications for the Effective Field Theory of Dark Energy
In the case of the new class of theories N-III-i, the vacuum necessarily breaks Lorentz invariance and the theory cannot make sense if $\partial_\mu\phi$ is null. So within the entire region where the theory is defined, $\partial_\mu\phi$ should either be time-like or space-like and either unitary gauge (i.e. φ = t) or the gauge $\phi = x^1$ can always be chosen everywhere (i.e. everywhere where the theory makes sense). Moreover since the vacua of these theories necessarily break Lorentz invariance it is natural that there will be a preferred frame where all the equations of motion will be manifestly second order. If, as in the case of the EST theories, we were primarily interested in theories relevant for cosmology (and potentially for dark energy), then focusing on backgrounds for which the scalar field is timelike is a justified restriction.
As a specific example of the new class EST N-III-i, we shall consider the theory (4.1).

Table 1. Different classes of Extended-Scalar-Tensor theories proposed in [28] and further in [29]. Only the two subclasses have well-behaved tensor modes. The first line of N-I corresponds to the models that are field-redefinable to Horndeski while the second line corresponds to the models for which such a field-redefinition would be singular. When isolated from Horndeski (i.e. with G = 0), Beyond Horndeski (BH) belongs to the first class M-I or the second line of N-I. When coupled to Horndeski, BH is field redefinable to Horndeski and belongs to the first line of N-I. Besides Horndeski and its field redefinitions, all the models that have well-defined tensors necessarily break Lorentz invariance.
While the equation of motion for the scalar field appears to be fourth order in derivatives, it is easy to see that there exists a specific combination of the Einstein equations which is only second order in derivatives on the scalar field and at most first order in derivatives on the metric, which is consistent with the arguments presented in section 2.1. Since we are dealing with a gauge theory, to fully prove the absence of ghost here, one should also check that the rest of the equations of motion can be solved for the metric without involving time derivatives on the lapse and the shift. However since the existence of a constraint has already been proven fully covariantly in [28] and [29], for the rest of this argument it is sufficient to show that the constraint is indeed removing the Ostrogradsky ghost, and for that we may work in unitary gauge. We also point out that since this model admits no Lorentz-invariant vacua, the existence of a secondary constraint is not necessarily guaranteed in principle; however this subtlety is not an issue in this case.

Unitary Gauge
To get more insight from this interesting new class of theories we work in unitary gauge where we can set the scalar field to be φ = t. We further perform a (3 + 1) ADM split [54] of the metric (4.3). The standard Einstein-Hilbert term is then expressed (after integration by parts) in terms of $\gamma = \det(\gamma_{ij})$, the three-dimensional curvature $R_3$ built out of $\gamma_{ij}$ (which only involves spatial derivatives) and the extrinsic curvature $K_{ij}$, with square brackets representing the three-dimensional trace with respect to $\gamma_{ij}$.

In terms of these ADM variables, a remarkable feature of the Lagrangian density of the special example of EST N-III-i given in (4.1) in unitary gauge is the emergence of $\dot N^2$ terms. Since this theory was shown to be free of the Ostrogradsky ghost, it has been proposed that such operators be allowed in the general effective field theory for the description of dark energy. We emphasize that this is not the correct logic for constructing appropriate effective field theories. The only reason why operators involving $\dot N$ are allowed in this description is because all such operators are removable with a field redefinition. In other words these operators all disappear after an appropriate change of variable and we are thus left with a theory in a much 'more conventional' form where no operators of the form $\dot N$ enter the field theory description in unitary gauge.
Indeed, by performing the following change of variable:
the Lagrangian density for (4.1) is simply
and involves no time derivative on either the lapse or the shift. The lapse is hence manifestly an auxiliary variable that can be integrated out, while the shifts are the Lagrange multipliers ensuring three-dimensional diffeomorphism invariance.
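For orientation, the ADM quantities referred to in this subsection can be written out in a standard textbook convention. The block below is only a sketch under that assumption: the overall normalisation M_Pl²/2, the sign conventions and the symbol ≃ (equality up to total derivatives) are the common ones and are not copied from the original equations; ∇_i denotes the covariant derivative compatible with γ_ij.

```latex
\begin{align}
  \mathrm{d}s^2 &= -N^2\,\mathrm{d}t^2
      + \gamma_{ij}\,(\mathrm{d}x^i + N^i\,\mathrm{d}t)(\mathrm{d}x^j + N^j\,\mathrm{d}t)\,, \\
  K_{ij} &= \frac{1}{2N}\left(\dot\gamma_{ij} - \nabla_i N_j - \nabla_j N_i\right)\,, \\
  \frac{M_{\rm Pl}^2}{2}\sqrt{-g}\,R &\simeq \frac{M_{\rm Pl}^2}{2}\,N\sqrt{\gamma}\,
      \left(R_3 + [K^2] - [K]^2\right)\,.
\end{align}
```

In this form the lapse N and the shift N^i enter without time derivatives, which is the property that the Ṅ² terms discussed above appear to violate before the change of variable.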

Field Redefinitions and Coupling to matter
The field redefinition (4.7) can be written covariantly as
where B is an arbitrary function (indeed, in unitary gauge the disformal transformation generated by B corresponds to a redefinition of the lapse). Setting B = 0 for simplicity, we have
Pl g̃_µν , (4.10)
with X̃ = g̃^µν ∂_µ φ ∂_ν φ, leading to
which is manifestly well-behaved in unitary gauge, as one can see from (4.8).
This result is neither new nor surprising. Indeed, it was already pointed out in [36] and [55] that the existence of Ṅ in unitary gauge does not necessarily imply the presence of ghosts, since such terms can be removable via conformal and disformal transformations of the metric. However, the points we would like to stress are the following:
1. Operators involving Ṅ are actually only acceptable if they can be removed with a field redefinition and are hence not genuine operators that should be included in the effective field theory. In other words, the existence of Ṅ in unitary gauge should not distract from the fact that N should still remain an auxiliary variable, and therefore arbitrary operators involving Ṅ cannot be introduced in the effective field theory.
2. The metric for which all Ṅ disappear in unitary gauge (i.e. g̃_µν = (−X)^(−1) g_µν) is the most natural metric for matter to couple to covariantly, and generic covariant couplings to the other conformally/disformally related metrics (for instance directly to the metric g_µν) could generically lead to ghosts, as will be shown below.
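To make point 2 more concrete, one can work out what the rescaling g̃_µν = (−X)^(−1) g_µν does to the ADM variables in unitary gauge. The following is a sketch only, assuming φ = t, the standard ADM parametrisation for both metrics, and ignoring any factors of M_Pl; the tilded lapse, shift and spatial metric are notation introduced here for illustration.

```latex
\begin{align}
  X &= g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi = g^{00} = -\frac{1}{N^2}
     \quad\Longrightarrow\quad \tilde g_{\mu\nu} = N^2\, g_{\mu\nu}\,, \\
  \tilde N &= N^2\,, \qquad \tilde N^i = N^i\,, \qquad
  \tilde\gamma_{ij} = N^2\,\gamma_{ij}\,, \\
  \dot{\tilde\gamma}_{ij} &= N^2\,\dot\gamma_{ij} + 2N\dot N\,\gamma_{ij}\,.
\end{align}
```

Under these assumptions, a Lagrangian that depends on the time derivative of γ̃_ij but on no time derivative of Ñ acquires Ṅ terms once it is rewritten in terms of g_µν, which is consistent with the statement that such terms are an artefact of the choice of variables.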
Interestingly, for the class of theories N-III-i, there are no theories for which the tensors maintain a standard kinetic term (i.e. G ∼ M_Pl², A_1 = 0) and which do not involve time derivatives of the lapse in unitary gauge. However, rather than reading this statement as opening up the possibility of new operators in the effective theory of dark energy, we see it as an indication that the tensor modes always manifest a peculiar kinetic structure (with a potentially time-dependent effective Planck scale).
Without coupling to matter, the two theories (4.1) and (4.11) are of course equivalent. However, when coupling to matter the distinction between the two frames takes on more significance. This is not so dissimilar to the distinction between Einstein frame and Jordan frame in standard scalar-tensor theories. While physics does not depend on the frame, much more insight on the stability of the theory can be gained from working in Einstein frame. This is because in Einstein frame the standard energy conditions on the matter sector can be used directly to imply the stability of the theory, while in Jordan frame those can take a modified shape.
To see this distinction for the ESTs, it is now instructive to couple the theory [28,29] to external matter, i.e. other fields ψ. For instance, let us couple the theory (4.1) to another scalar field χ which happens to have Galileon interactions,
Assuming for instance that there are solutions in unitary gauge for which χ = χ(t), the theory can then be written in unitary gauge as
After integrations by parts, the term going as χ̇²χ̈/N³ cancels that going as χ̇³Ṅ/N⁴, and the second line involves terms that go as [K]χ̇³ but no terms that would involve a time derivative on the lapse (which is of course precisely why Galileons lead to no ghost). So it is now clear that the Ṅ terms on the first line can no longer be removed by any field redefinition.
To convince ourselves, we can compute the determinant of the Hessian of the scalar modes, defined as in [11] (up to order-one coefficients),
It is straightforward to check that det H^EST_ab = 0, which is related to the existence of a constraint and is the reason why the time derivatives on the lapse can be removed via field redefinitions. However, as long as the Galileon interaction is present (i.e. finite Λ), the determinant of the full Hessian is now nonzero:
Since the determinant of H^EST_ab vanishes, the total determinant of H_IJ is necessarily nonzero, meaning that there are three scalar dynamical degrees of freedom, corresponding to the Galileon, the scalar field φ and its Ostrogradsky ghost. This is a simple consequence of the fact that general matter should not couple covariantly to the metric g_µν (for which time derivatives of the lapse enter in unitary gauge) but rather to the metric g̃_µν (for which no time derivatives of the lapse enter in unitary gauge).
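The counting argument can be illustrated with a toy kinetic matrix. The sketch below is not the Hessian of the text, whose entries are not reproduced here: it assumes a hypothetical symmetric 3×3 matrix whose upper-left 2×2 block is degenerate by construction, and checks that the full determinant equals −(ad − bc)², which is nonzero for generic mixing with the third field. All names (a, b, c, d, e, det2, det3, toy_hessian) are illustrative.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def toy_hessian(a, b, c, d, e):
    """Hypothetical kinetic matrix (an assumption, not the paper's H_IJ).

    Rows/columns 0 and 1 mimic a degenerate two-field sector (rank-1 block);
    row/column 2 is an extra field mixing with them through c and d.
    """
    return [[a * a, a * b, c],
            [a * b, b * b, d],
            [c,     d,     e]]


a, b, c, d, e = 2, 3, 1, 5, 7
H = toy_hessian(a, b, c, d, e)
block = [row[:2] for row in H[:2]]

det_block = det2(block)                            # degenerate sector alone
det_full = det3(H)                                 # extra field with mixing
det_decoupled = det3(toy_hessian(a, b, 0, 0, e))   # extra field, no mixing

print(det_block, det_full, det_decoupled)
```

In this toy setting the degenerate block contributes nothing to the full determinant, so what remains is the mixing term alone: without mixing the constraint survives, whereas generic mixing makes all three modes propagate. This loosely mirrors the role played by the covariant Galileon coupling in the argument above.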
In conclusion, out of a healthy theory it is always possible to generate an infinite number of new formulations via field redefinitions. These field redefinitions may be of the conformal and disformal form but could also take a much more complicated form, involving for instance higher derivatives of one field, as was presented in the example (2.5). For instance, one could generate a new infinite class of higher-derivative theories which would be free from the Ostrogradsky instability by starting from Horndeski and performing a change of variable where the metric is sent to a new metric which may involve many derivatives of the scalar field, so long as it does not involve more than one derivative on the spatial metric and no time derivative on the lapse or shift. In the vacuum this infinite class of theories would be fully equivalent to Horndeski. However, we stress that those field redefinitions matter when one couples to other fields. When seeking consistent scalar-tensor theories relevant for cosmology, it is hence 'advisable' to focus (when possible) on those theories which do not involve time derivatives on the lapse in unitary gauge, so that any covariant coupling to matter fields will preserve the constraint and the number of dofs.

Stability on flat FLRW
Before concluding, we quickly glance at the behaviour of this exciting new class of models (N-III-i) by studying the scalar fluctuations on flat FLRW. We start by looking at the theory (4.11) in the vacuum. One could in principle add an additional arbitrary function P(φ, X) as well as a generalized cubic Galileon; however, we start by assuming that the interactions in (4.11) are the dominant ones and then draw conclusions for the general theory. Then the Lagrangian for the background FLRW is simply
where a is the scale factor, N the lapse, and dots represent time derivatives. It is then clear that in the vacuum, despite the presence of the scalar field φ, the scale factor has to be constant, ȧ = 0, and the metric is Minkowski, while the profile for the scalar field remains undetermined, signaling the existence of an accidental symmetry on that background. Note however that this background does break Lorentz invariance since φ ought to depend on time (indeed, for φ̇ = 0 the tensors would lose their kinetic terms and that background would hence be infinitely strongly coupled).
Now turning to scalar fluctuations, we work in the gauge where N = 1 and where the metric fluctuations are of the form (4.17), with the scalar field φ = φ_0(t) + δϕ(t, x). The accidental symmetry present for the background manifests itself at the level of perturbations and the resulting perturbed Lagrangian is insensitive to Ψ:
where we have made the change of variables (1/a²)∇²δφ = (1/a²)∇²δϕ + |φ_0|Φ. After integrating out δφ we obtain the Lagrangian for a single scalar degree of freedom, as it should be. As engineered in [28,29], there is no Ostrogradsky instability for that field. However, in that specific case we see that the field itself has the wrong-sign kinetic term and is a ghost. This is in no contradiction with [28,29], and we confirm in this special example that the total number of dofs is less than four, as proved in [28,29] in all generality.
While the existence of a ghost instability on FLRW was shown for a simple example, the existence of instabilities, and particularly gradient instabilities, is actually generic to the whole class of N-III-i models proposed in [28,29] (which we recall is the only EST class of models for which the tensor modes are well-behaved and which is not field-redefinable to Horndeski). We show this generic statement in appendix B, where we look at all classes of N-III-i models in the vacuum and without any P(φ, X) or generalized Galileon Q(φ, X)□φ. If an instability occurs in the absence of matter and of a standard kinetic term or cubic Galileon, then the latter can 'save the day' and restore stability if they dominate over the new interactions of the EST models. However, if these EST interactions are never allowed to dominate, the fact that they are free of the Ostrogradsky instability is far less relevant, since in the effective theory approach one can add any other interaction which may or may not carry an Ostrogradsky instability so long as these interactions do not dominate (see Refs. [44,56] for related discussions).
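As a reminder of the terminology used here, the distinction between ghost and gradient instabilities can be stated for a generic quadratic Lagrangian of the form L = A χ̇² − B (∂_i χ)². The helper below is a minimal sketch under that assumed normalisation; the function name and the sample coefficients are hypothetical and are not the coefficients of (4.11) or of appendix B.

```python
def classify_quadratic_mode(A, B):
    """Classify L = A*chidot**2 - B*(grad chi)**2 from the signs of A and B.

    A < 0    : wrong-sign time kinetic term (ghost)
    B/A < 0  : negative squared sound speed (gradient instability)
    A == 0   : no time kinetic term, so the mode is not dynamical here
    The marginal case B == 0 is not treated separately in this sketch.
    """
    if A == 0:
        return "non-dynamical"
    ghost = A < 0
    gradient = (B / A) < 0
    if ghost and gradient:
        return "ghost and gradient instability"
    if ghost:
        return "ghost"
    if gradient:
        return "gradient instability"
    return "stable"


# Illustrative sign choices only (assumed values, not derived from the paper).
for coeffs in [(1, 1), (-1, -1), (1, -1), (-1, 1)]:
    print(coeffs, classify_quadratic_mode(*coeffs))
```

In this language, the flat-FLRW example above has A < 0, while the branches analysed in appendix B have B/A < 0 whenever the time kinetic term has the correct sign.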

Summary
While the existence of higher time derivatives automatically involves an Ostrogradsky instability, there has been a recent revived interest in how the Ostrogradsky instability manifests itself in theories with multiple fields, where the notion of higher derivatives can be more subtle as it can change under field redefinitions (or mixing between different fields). In this paper we have reviewed different methods by which one can diagnose the existence or absence of an Ostrogradsky instability in a theory with two scalar fields, and indicated that similar arguments can be applied to more involved theories such as those involving gravity and additional scalar fields, as in massive gravity or scalar-tensor theories of gravity.
With these general arguments in mind, we have analysed an exciting new class of Degenerate-Higher-Order-Scalar-Tensor theories (DHOST) proposed in [28], also known as Extended Scalar-Tensor Theories (ESTs) and investigated in [29] (see also [30]), and have discussed the implications for the effective field theory of dark energy. While it is true that in some frames some of these theories involve time derivatives on the lapse in unitary gauge, we emphasize that such operators are not free to enter independently in a consistent effective field theory. We also emphasize the importance of changes of variables when it comes to coupling to matter, and we study the stability of the new class of ESTs. We find that if we restrict ourselves to theories for which the tensors are well-behaved and the scalar is free from gradient or ghost instabilities on FLRW, then the resulting ESTs reduce to Horndeski or field redefinitions thereof. However, this is not to say that a subclass of the other theories could not provide an interesting phenomenology on more generic backgrounds, or when the standard kinetic terms, potential terms, or the generalized cubic Galileon interactions dominate.

A Explicit form of Extended Scalar Tensor Theories
In this appendix we review the class of DHOSTs/ESTs introduced in [28] and follow the conventions and notations of [29]. All the results presented here were derived in those papers and we simply present them here for completeness. Each class of EST has 3 different free functions (except for M-III which is special). In principle, we expect to be able to reduce two of those functions using a combination of conformal and disformal transformations.
First we recall that the general EST action is given by
where the L_i are given in Equation 3.2. We now look at special cases that amount to constraints we can impose on the A_i. In what follows, G_X denotes the partial derivative of G with respect to X.

A.1 Minimal Cases (G = 0)
M-I
The free functions are A_1, A_2, A_3 (with the restriction A_2 ≠ −A_1/3). The other functions are given in terms of these by

M-II
The free functions are A_1, A_4, A_5. The other functions are given by

M-III
This case is special and is defined by
and A_2, A_3, A_4, A_5 are free. Note that since G = 0, we have A_1 = G/X for this case, and therefore M-III is actually a special case of N-III-i.

A.2 Non-minimal Cases (G ≠ 0)
N-I
The free functions are G, A_1, A_3, subject to the condition that A_1 ≠ G/X. The other functions are given by

N-II
The free functions are G, A_4, A_5. The other functions are given by

N-III-i
The free functions are G, A_1, A_2, subject to the conditions that A_1 ≠ −A_2 and A_1 ≠ G/X. The other functions are given by

N-III-ii
The free functions are G, A_2, A_3, subject to the condition that A_2 ≠ −G/X. The other functions are
Note that the constraint for A_4 is the same in both N-III-i and N-III-ii (taking into account that A_1 = G/X in the latter case).

B Perturbations for N-III-i about flat FLRW
In this appendix we study the interesting class of N-III-i models. As shown in [29], as well as in section 4.1, those models may involve time derivatives of the lapse in unitary gauge, but those can always be absorbed via field redefinitions. In order to focus the analysis, it is therefore sufficient to restrict ourselves to the subclass of models which involves no time derivatives on the lapse in unitary gauge without needing to perform a field redefinition.
First we notice that any theory in the class N-III-i for which the function G is of the form
(where G̃ could be an arbitrary function of the field) involves no time derivatives on the lapse in unitary gauge (without needing to resort to any field redefinition). In order to focus on the stability of the theory and avoid unnecessary changes of variables, we therefore restrict ourselves to the sub-class of N-III-i EST theories for which we have
Indeed, even if we were dealing with a theory of the class N-III-i for which G is different to start with, we can always put G in the form of (B.2) via an appropriate conformal transformation, and as shown in [29] such a transformation would map an N-III-i theory into another N-III-i theory, so setting G as in (B.2) amounts to no loss of generality (unless we started coupling to matter, in which case those conformal transformations would matter, but as argued in section 4.13, when one couples to matter it is usually 'wiser' to do so in the frame which involves no time derivatives on the lapse in unitary gauge).
Notice that the function G̃(φ) is a priori arbitrary. However, such a function can also always be set to unity by an appropriate conformal transformation which only involves the scalar field and not its derivatives. Such a conformal transformation could generate a kinetic term for the scalar field, but as mentioned in section 4.3, for the purpose of this discussion we focus on the case where only the EST interactions are present and we can therefore simply set G̃(φ) = 1. If G̃′(φ) ≠ 0 is needed to ensure stability, this signals that the kinetic term for φ one would get after the conformal transformation needs to dominate over the EST interactions, and there is hence less motivation for ensuring that the EST interactions are ghost-free if they are never allowed to dominate.
Following the previous arguments and those of section B, in the rest of this section we consider the theory in the vacuum and in the absence of P(φ, X) terms and of the generalized cubic Galileon. We denote by A the following function:
A(φ, X) ≡ −2(−X)^(3/2) (A_1 + 3A_2) , (B.3)
as this function enters frequently in the analysis.
Starting with G = √(−X) and arbitrary functions A_1,2, we see that the resulting theory is very similar to what was derived in section 4.3. At the background level, the Lagrangian on flat FLRW is
The background equations of motion impose either:
(a) a = const,
(b) or in general admit another branch of solutions with the following constraint:
A_,X = 1 . (B.5)
In the special case of section 4, A was linear in X so we could never be in that branch of solutions since that equation could never be satisfied. However, for generic functions A, the constraint (B.5) could admit consistent solutions.
(c) Finally, in principle we could also have the solution φ̇_0 = 0, but the tensors would be ill-defined on that background, so we do not consider it any further.
In what follows, we analyse the first two cases one after the other.
(a) Starting with the branch where a = const in the background, we find that the resulting Lagrangian for the scalar field fluctuation (working in the same gauge as (4.17)) is (after integrating out the constraints), and that branch of solutions always exhibits a gradient instability (if G > 0, the instability is in the scalar mode and had we taken G < 0, the instability would have been in the tensor modes).
(b) Next we may look at the branch which satisfies (B.5) for the background. The constraints are slightly more intricate in that case but can be solved for
After integrating out Ψ, it is easier to perform the change of variable
δϕ = (ȧ/a)|φ_0|Φ + χ, (B.8)
(notice that if we had ȧ = 0, we could go back to the first case scenario). Finally we can integrate Φ out. Its exact expression is not particularly illuminating, but the resulting Lagrangian density for the scalar degree of freedom χ is
which is remarkably similar to that found in the previous branch of solutions (B.6). Just like in that case, as long as 3φ_0² − 2(G − XA_1)/(A_1 + A_2) > 0, the time kinetic term has the correct sign; however, the gradients always enter with the wrong sign and, just like for the previous branch, this class of theories suffers from gradient instabilities on flat FLRW.