Beyond Lovelock gravity: Higher derivative metric theories

We consider theories describing the dynamics of a four-dimensional metric, whose Lagrangian is diffeomorphism invariant and depends at most on second derivatives of the metric. Imposing degeneracy conditions we find a set of Lagrangians that, apart from the Einstein-Hilbert one, are either trivial or contain more than two degrees of freedom. Among the partially degenerate theories, we recover Chern-Simons gravity, endowed with constraints whose structure suggests the presence of instabilities. Then, we enlarge the class of parity violating theories of gravity by introducing new "chiral scalar-tensor theories". Although they all raise the same concern as Chern-Simons gravity, they can nevertheless make sense as low energy effective field theories or, by restricting them to the unitary gauge (where the scalar field is uniform), as Lorentz breaking theories with a parity violating sector.


I. INTRODUCTION
It is well known that the Einstein-Hilbert plus cosmological constant action is the unique diffeomorphism (diff) invariant action for a four dimensional metric, whose equations of motion (EOM) are at most of second order [1]. The metric field contains only two physical degrees of freedom, corresponding to a massless spin-2 field. Any other action leads to higher order EOM (or trivial ones).
According to Ostrogradsky's analysis [2,3] higher order EOM may signal, under certain hypotheses, the presence of instabilities which generically render the theory pathological. However, recent examples of theories (breaking the above hypotheses) show that having higher order EOM is not equivalent to having ghost(s) propagating in the theory. In other words, although it is clear that the presence of an Ostrogradsky mode necessarily implies (by definition) higher order Euler-Lagrange equations, the reverse is not true.
A prime example is that of scalar-tensor theories beyond Horndeski, which were introduced in [4,5] and also studied in [6-9]. Later, these theories were further understood and generalized under the degeneracy criterion [10]. Basically, a higher order scalar-tensor theory still propagates 3 degrees of freedom (DOF) if, in addition to the usual Hamiltonian and momentum constraints associated with diff invariance, it admits another primary constraint 1 . These theories, denoted as Degenerate Higher Order Scalar-Tensor (DHOST) theories (or also Extended Scalar-Tensor (EST) theories), were introduced in [10] and further analysed in [11,14-16]. A complete classification up to cubic order in second derivatives of the scalar field is given in [17]. Their cosmological perturbations, in the framework of the Effective Theory of Dark Energy (see e.g. [18]), are studied in [19]. Analogously, similar constructions for vector interactions were introduced in [20] and a classification for degenerate vector-tensor theories up to quadratic order was given in [21].
The Ostrogradsky problem and the notion of degeneracy (necessary to avoid such a problem) were systematically studied in the context of classical mechanics in [22,23] and later in the context of higher order field theories without gauge symmetries in [12]. A similarly rigorous analysis however is still missing for field theories that possess gauge symmetries, such as gravity theories enjoying diff invariance. In this paper we attempt a first step in this direction.

A. Ostrogradsky instabilities and constraints
Before presenting the content of our paper, let us briefly discuss our present understanding concerning the presence of Ostrogradsky modes in a field theory. We follow the results of [12,22,23] and underline some difficulties to extend them to diff invariant theories (see also [24] as an alternative way to deal with Ostrogradsky modes). In general there is a potential Ostrogradsky mode for each field in the action appearing with second time derivatives. In order to remove all of them, as a first requirement, we need a set of primary constraints equal in number to the fields that appear with second time derivatives. In case we have fewer primary constraints, then Ostrogradsky modes, at least as many as the number of missing primary constraints, propagate in the theory.
Although these modes lead to instabilities in absence of extra symmetries, in the case of diff invariance for instance, they can be healthy. A well known example is f (R) where the higher derivative mode described by the trace of the 3-dimensional metric is left unconstrained and leads to a propagating extra degree of freedom. In this case however this mode is perfectly healthy as can be seen by reformulating the theory as a standard scalar-tensor one with no higher-order derivatives at all.
Having the primary constraints, however, is not enough: each of them has to generate a secondary constraint, when evolved in time, in order to remove the associated Ostrogradsky mode (we do not discuss here the very special case where the primary constraints are first-class). It is indeed upon exploiting the secondary constraint that the linear dependence of the Hamiltonian on the momentum (the characteristic signature of Ostrogradsky instability) is removed. When a primary constraint does not generate a secondary one, the Hamiltonian is still left unbounded from below, rendering the theory unstable. Again, this point could also have loopholes when applied to gauge invariant theories, although we do not know of any explicit counter-example showing its failure. Therefore, bearing in mind all these subtleties, which certainly deserve a deeper investigation, in this paper we retain a conservative approach: we also consider theories with fewer primary constraints (as in the case of f(R)) but, if a secondary constraint is not generated by each primary one, we regard the theory as potentially unhealthy.
B. From degenerate metric theories to "Chiral Scalar-Tensor theories"
In this paper we begin exploring higher order, diff invariant, pure metric theories in a four dimensional space-time which are degenerate and discuss whether they appear to be free (or not) of Ostrogradsky modes. We restrict ourselves to the case where the Lagrangian depends at most on second derivatives of the metric. In this context we recover Chern-Simons gravity [25] as a partially degenerate theory 2 and analyse its number of degrees of freedom in full generality. Inspired by this parity violating theory of gravity, we extend our analysis and construct new scalar-tensor theories with the same feature. We dub these theories "Chiral Scalar-Tensor theories". Although they might be pathological in their covariant form, the Ostrogradsky modes disappear in the unitary gauge (where the scalar field depends on time only) and the restricted version of these theories therefore makes sense as Lorentz breaking theories similar to Horava-Lifshitz.
The paper is organised as follows. In Section II, we study four dimensional diff invariant pure metric theories that are degenerate. We start with fully degenerate Lagrangians and continue with a large class of partially degenerate theories. In Section III, we introduce the notion of chiral scalar-tensor theories and find new classes of theories which violate parity and propagate only three degrees of freedom in the unitary gauge. We draw our conclusions in Section IV.

A. Action and ADM decomposition
We consider the general action governing the dynamics of the four-dimensional metric $g_{\mu\nu}$. This action is assumed to depend at most on the second derivatives of the metric and, due to Thomas' replacement theorem [26] (see also [27] for a modern version), the derivatives of the metric enter the Lagrangian through the Riemann tensor $R_{\mu\nu\rho\sigma}$; this also guarantees diffeomorphism (diff) invariance. The action can thus be constructed by contracting the three following building blocks: the Riemann tensor, the metric and the Levi-Civita tensor 3 $\varepsilon^{\mu\nu\rho\sigma}$. In order to perform a Hamiltonian analysis of the system, we need to separate space and time. We therefore foliate the space-time manifold $M$ as $\Sigma \times \mathbb{R}$ and introduce the unit time-like vector $n^\mu$ orthogonal to $\Sigma$, thus satisfying the normalization condition $n_\mu n^\mu = -1$. This induces a three-dimensional metric on $\Sigma$ defined by $\gamma_{\mu\nu} \equiv g_{\mu\nu} + n_\mu n_\nu$. Let us then consider the time direction vector $t^\mu \partial/\partial x^\mu \equiv \partial/\partial t$ (i.e. $t^\mu = (1, 0, 0, 0)$) associated with a time coordinate $t$ that labels the slicing of spacelike hypersurfaces. One can always decompose such a vector as $t^\mu = N n^\mu + N^\mu$, thus defining the lapse function $N$ and the shift vector $N^\mu$ orthogonal to $n^\mu$. The time derivative (indicated with a dot) of spatial tensors is defined as the spatial projection of their Lie derivative with respect to $t^\mu$. In the following we will use latin indices ($i, j, k, \cdots$) to denote 3-dimensional objects living on the hypersurface $\Sigma$ endowed with the metric $\gamma_{\mu\nu}$.
The ADM decomposition of the metric gives
$$ ds^2 = -N^2\, dt^2 + \gamma_{ij}\,(dx^i + N^i dt)(dx^j + N^j dt) \qquad (2.2) $$
and the components of the Riemann tensor in terms of the ADM variables are given in equations (2.3), (2.4) and (2.5) (see for instance [28], where a Hamiltonian analysis of f(Riemann) was presented).
3 Note that the Levi-Civita tensor is defined by $\varepsilon^{\mu\nu\rho\sigma} = \epsilon^{\mu\nu\rho\sigma}/\sqrt{-g}$, where $\epsilon^{\mu\nu\rho\sigma}$ is the fully antisymmetric symbol which takes values in $\{-1, 0, +1\}$.
The components on the LHS of the above equations are the bulk curvature components projected onto the surface $\Sigma$. We have used the notation ${}^{(3)}R_{ij\ell m}$ for the three-dimensional Riemann tensor, $\mathcal{L}_N$ for the Lie derivative along $N^i$, $D_i$ for the covariant derivative compatible with $\gamma_{ij}$ and $K_{ij}$ for the components of the extrinsic curvature tensor. Second time derivatives appear only for the spatial metric components $\gamma_{ij}$, and only in $R_{ij}$ via the time derivative of the extrinsic curvature, $\dot K_{ij}$. Notice that the same term is also the only one which contains time derivatives of the lapse and shift. Therefore, according to the Ostrogradsky analysis, for a generic Lagrangian one could expect as many Ostrogradsky modes as the number of components of $\gamma_{ij}$.
A necessary (but clearly not sufficient) condition to get rid of all of them, or part of them, is that the theory has constraints in addition to the usual constraints associated with diff invariance. It means that the Hessian matrix of the Lagrangian with respect to the second time derivatives of the spatial metric is degenerate 4 . According to the rank of the above matrix we will have a different number of primary constraints. In this paper we study in detail only two cases: Lagrangians associated with a Hessian matrix of rank 0 (fully degenerate case) and of rank 1 (partially degenerate case). We will also briefly discuss in Appendix B the case of larger ranks, leaving the detailed analysis for future works. If, by contrast, the Hessian matrix A is invertible, then the theory propagates 8 degrees of freedom, some of them being necessarily ghosts (see [29] for the linear analysis).
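The degeneracy condition described above can be summarised in a short sketch. The symbol $A^{ij,\ell m}$ is the one used later in the text; the normalisation, and the choice of differentiating with respect to $\ddot\gamma_{ij}$ rather than $\dot K_{ij}$, are assumptions made here for illustration only.

```latex
% Hessian with respect to second time derivatives of the spatial metric
% (assumed normalisation; a symmetric 6 x 6 matrix in the pairs (ij), (lm))
A^{ij,\ell m} \equiv
  \frac{\partial^2 \mathcal{L}}{\partial \ddot\gamma_{ij}\,\partial \ddot\gamma_{\ell m}} ,
\qquad
\#\{\text{primary constraints from degeneracy}\} = 6 - \operatorname{rank} A .
% rank A = 0 : fully degenerate case, 6 extra primary constraints
% rank A = 1 : partially degenerate case, 5 extra primary constraints
```

With this counting, the two cases studied in detail correspond to 6 and 5 primary constraints in addition to those associated with diff invariance.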

B. Fully degenerate theories
In this section we study all the theories that satisfy $A^{ij,\ell m} = 0$, which implies that their Lagrangian is linear in $\ddot\gamma_{ij}$. Requiring the Lagrangian to be linear in second time derivatives means that the corresponding equations of motion can be at most of third order.

Degenerate Lagrangians
In a pioneering paper [30], Lovelock already classified all the possible Lagrangians satisfying this condition and showed that there are only 3 independent terms in addition to the usual Einstein-Hilbert Lagrangian R, denoted GB, P and C, where ⋆ stands for the Hodge dual. The Ricci scalar (R) gives second order field equations; the Gauss-Bonnet (GB) and the Pontryagin (P) terms are topological invariants (in 4 dimensions) and their variation contributes no term to the field equations [31]. The last curvature invariant (C) is the only one whose equations of motion are of third order 5 . Linearity in $\ddot\gamma_{ij}$ translates into linearity in $R_{ij}$, given in (2.3), and in this respect it is easy to understand Lovelock's result. Indeed the Ricci scalar is the only density which is linear in the Riemann tensor, while at the quadratic and cubic levels we need to make use of the ε tensor to avoid non-linearities in $R_{ij}$, which leads uniquely to GB, P and C. Finally, even using the ε tensor, it is not possible to avoid at least quadratic terms in $R_{ij}$ when one considers more than 3 powers of the Riemann tensor.
Therefore, in addition to GR, there is only one other non trivial fully degenerate Lagrangian, namely C. Since this term leads to third order EOM it has never attracted much attention in the literature and a canonical analysis to count its number of DOF, as well as their stability, is still missing, to the best of our knowledge. In the rest of this section we partially fill this gap and perform the Hamiltonian analysis of this theory.

Hamiltonian analysis of C
Using equations (2.3), (2.4) and (2.5), the action can be rewritten in the form (2.11), where the 3-dimensional rank-2 density $\Pi^{ij}$ is defined by (2.12) and the "potential" V contains all the other terms from the decomposition of (2.10) that do not involve $\dot K_{ij}$. The explicit form of V is quite long and we do not reproduce it here, as only some of its general properties will be useful in the following. An important property is that, after several integrations by parts, we can rewrite the potential V as in (2.13), where $V_0$ and $V_i$, like $\Pi^{ij}$, depend only on $\gamma_{ij}$, ${}^{(3)}R_{ij\ell m}$, $K_{ij}$ and their spatial derivatives and do not depend explicitly on the lapse function and shift vector, which enter only through the extrinsic curvature. This can be seen as a consequence of the diff invariance of the action (2.10).
Since the action involves second time derivatives of the spatial metric $\gamma_{ij}$ in $\dot K_{ij}$, it is convenient to consider the equivalent form (2.14), where we have introduced the new 3-dimensional symmetric tensors $Q_{ij}$ and $p^{ij}$ in order to make the Lagrangian depend explicitly on first time derivatives only. The equations of motion for $p^{ij}$ enforce the condition $Q_{ij} = K_{ij}$, recovering therefore the original action (2.11). Note that in (2.14), $\Pi^{ij}$, $V_0$ and $V_i$ now depend on $Q_{ij}$ and not on $K_{ij}$. In this form, the action has a linear dependence on the lapse and the shift, which clearly appear as Lagrange multipliers in this reformulation. Indeed, expanding the last term in (2.14) and integrating by parts, we get (2.15), where $H_0$ and $H_i$ are defined in (2.16). We are now ready to perform the Hamiltonian analysis, starting in a phase space endowed with 16 pairs of conjugate variables whose Poisson brackets involve the Dirac delta distribution on the space hypersurface. The fact that $\gamma_{ij}$ and $p^{ij}$ are conjugate variables is manifest from (2.15).
As the action does not involve time derivatives of the lapse and the shift, we recover the usual four primary constraints (2.18) which, in analogy with the Hamiltonian formulation of GR, are closely related to the diffeomorphism invariance of the theory. Furthermore, since the Lagrangian is fully degenerate, computing the conjugate momenta $P^{ij}$ leads to the 6 additional primary constraints (2.19). Hence, the total Hamiltonian contains the Lagrange multipliers $\xi^\mu$ and $\xi_{ij}$, which enforce the primary constraints (2.18) and (2.19).
Requiring the time conservation of (2.18) leads to the following secondary constraints These constraints are closely related to the usual Hamiltonian and momentum constraints, which generate space-time diffeomorphisms. More precisely, they are first class up to the addition of the other second class constraints. On the other hand, requiring the conservation in time of (2.19) leads to the equation Furthermore, the Dirac matrix between the constraints χ ij , defined by To conclude, let us notice that H 0 and H i in (2.16) are linear in p ij and therefore the Hamiltonian appears unbounded from below. This is the characteristic feature of Ostrogradsky instabilities, indicating that the extra 3 DOF are likely to be ghosts. These extra DOF could be eliminated if secondary constraints were present, thereby removing the linear dependence of the Hamiltonian on the momenta associated with the higher derivative modes. In the present case the absence of secondary constraints suggests that the extra modes are not stable. In Appendix A, we confirm the instability of (2.10) at linear order in perturbation theory.

C. Partially degenerate theories
In this section, we study theories with a Hessian matrix (2.7) of rank 1. A straightforward method to construct models of this type simply consists in considering generic functions of the fully degenerate Lagrangians 6 studied in the previous section, as in the action (2.24). Indeed, the linearity argument concerning $\ddot\gamma_{ij}$ ensures that, when $f'' \neq 0$, the kernel of A is of co-dimension 1, which means that the theory admits (6 − 1) primary constraints. Since we have already discussed the potential problems of the C term, we will not consider f(C) theories here and we will concentrate our attention on f(GB) and f(P), the case of f(R) being already well known. In Appendix B we also give a short discussion about theories with a Hessian matrix (2.7) of rank higher than 1.
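The rank-1 statement follows from the chain rule, as the following short sketch shows. The symbol $\ell^{ij}$ is a hypothetical shorthand introduced here (it is not the paper's notation), and $L$ stands for any of the fully degenerate Lagrangians, which are linear in $\ddot\gamma_{ij}$.

```latex
% L is linear in \ddot\gamma_{ij}, so \ell^{ij} does not depend on \ddot\gamma_{ij}
\ell^{ij} \equiv \frac{\partial L}{\partial \ddot\gamma_{ij}} ,
\qquad
\frac{\partial^2 L}{\partial \ddot\gamma_{ij}\,\partial \ddot\gamma_{\ell m}} = 0 .
% Hessian of f(L), by the chain rule:
\frac{\partial^2 f(L)}{\partial \ddot\gamma_{ij}\,\partial \ddot\gamma_{\ell m}}
  = f''(L)\,\ell^{ij}\,\ell^{\ell m}
  + f'(L)\,\frac{\partial^2 L}{\partial \ddot\gamma_{ij}\,\partial \ddot\gamma_{\ell m}}
  = f''(L)\,\ell^{ij}\,\ell^{\ell m} .
```

For $f'' \neq 0$ and $\ell^{ij} \neq 0$ this is an outer product, hence a matrix of rank 1 with a 5-dimensional kernel, which is the origin of the (6 − 1) primary constraints.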

General discussion
The Lagrangians f(R) and f(GB) are well known to define theories that propagate 3 DOF and are equivalent 7 to scalar-tensor theories within the class of Horndeski. This fact has been known for a long time for f(R) (see e.g. [32,33] for reviews on f(R) theories): the action can be rewritten as a Brans-Dicke-like theory. The equivalence of f(GB) with Horndeski is more recent and was established only at the level of the equations of motion 8 [34]. Finally, the last theory, f(P), can be related to Chern-Simons gravity [25], which has been much studied in the literature (see [35] for a review). Indeed, repeating the same procedure that transforms f(R) into a scalar-tensor theory (see for example [32,33]), action (2.24) can be rewritten as (2.25), where U(φ) is a potential given by (2.26). The reformulation (2.25) will be useful for our analysis of the various cases considered below.
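The procedure referred to above is the standard Legendre-transform trick, sketched here for a generic curvature scalar $L$ (R, GB or P). The auxiliary field $\chi$ and the sign and normalisation of $U$ are assumptions of this sketch and may differ from the conventions of (2.25) and (2.26).

```latex
% Step 1: introduce an auxiliary field \chi
S[g,\chi] = \int d^4x \sqrt{-g}\,\big[\, f(\chi) + f'(\chi)\,(L - \chi) \,\big] ,
% its equation of motion, f''(\chi)(L - \chi) = 0, gives \chi = L when f'' \neq 0,
% so that S[g,\chi] reduces to \int d^4x \sqrt{-g}\, f(L).
% Step 2: trade \chi for \phi \equiv f'(\chi), invertible when f'' \neq 0
S[g,\phi] = \int d^4x \sqrt{-g}\,\big[\, \phi\, L - U(\phi) \,\big] ,
\qquad
U(\phi) \equiv \phi\,\chi(\phi) - f\big(\chi(\phi)\big) .
```

In this form the scalar field has no kinetic term of its own, which is why a primary constraint on its conjugate momentum appears in the Hamiltonian analyses that follow.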

f (GB) theory
In order to exploit our previous analysis of fully degenerate theories, it is convenient to study f(GB) in the equivalent form (2.25), i.e. the action (2.27), where the potential U will be ignored in the following, as its presence does not modify the conclusion.
Using the ADM decomposition of section II A, we can apply the same strategy used for studying the C term in section II B 2. All we need to do is to construct, for the Gauss-Bonnet action, the analogs of $\Pi^{ij}$ and V defined previously, and introduce an extra pair of conjugate variables (2.28) to account for the scalar field φ. For the action (2.27) the total Hamiltonian takes the form (2.29), where $H_0$ and $H_i$ are given in (2.30) and now $V_0$, $V_i$ also involve the scalar field φ and its space derivatives. We do not report their explicit form here; it is, however, straightforward to compute. We have also introduced in the total Hamiltonian the Lagrange multiplier λ to enforce the new primary constraint $\pi_\phi \approx 0$, since the action (2.27) does not contain any kinetic term for the scalar field. Notice that it is the linearity in $p^{ij}$ of the Hamiltonian (2.30) which is potentially dangerous.
Concerning the study of the stability under time evolution of the primary constraints π µ ≈ 0, the same arguments of section II B 2 apply, and they generate the secondary constraints H µ ≈ 0. They are first class (up to adding second class constraints).
The degeneracy of the Hessian matrix leads to 6 primary constraints $\chi^{ij}$ given by (2.31), where $\Pi^{ij}$ is obtained from the GB term as in (2.32). Making use of the explicit form of $\Pi^{ij}$ in (2.32), it is easy to compute the Dirac matrix between the constraints $\chi^{ij}$ and show that it identically vanishes.
Let us now see whether secondary constraints arise. First, requiring the stability under time evolution of (2.31) leads to (2.34). Taking the trace of (2.34) enables one to determine the Lagrange multiplier λ in terms of the phase space variables. The traceless part gives 5 secondary constraints. We then consider the constraint $\pi_\phi \approx 0$, whose time evolution yields an equation which can be solved to write the trace of $\xi_{ij}$ in terms of the canonical variables and the remaining 5 components of the traceless part of $\xi_{ij}$. Finally, the evolution of the 5 secondary constraints given by the traceless part of (2.34) determines the traceless component of $\xi_{ij}$ and the analysis stops.
Therefore, starting with 32 (metric) + 2 (scalar) canonical variables, and having 8 first class and 12 (7 primary + 5 secondary) second class constraints, we end up with a total of 3 DOF, which is compatible with the equivalence of f (GB) with a scalar-tensor theory. However, here, the 6 primary constraints (2.31) coming from the higher derivative modes in the action, generate only 5 secondary constraints and the Hamiltonian still remains linear in the trace of the momentum p ij . This seems to indicate that the theory possesses one Ostrogradsky mode. Note that this Ostrogradsky mode could be removed by adding to the action (2.27) a kinetic term for the scalar field so that the primary constraint π φ ≈ 0 disappears from the total Hamiltonian and the 6 primary constraints (2.31) generate 6 secondary constraints. This does not change the total number of DOF, but makes the theory Ostrogradsky free by removing any linear momentum dependence. This suggests that only f (GB) supplemented with an explicit kinetic term for the scalar field is classically equivalent to some Horndeski theory 9 .
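The counting used in this paragraph can be checked with a few lines of Python. This is only a bookkeeping sketch of the standard Dirac counting (each first class constraint removes two phase-space dimensions, each second class constraint removes one); the function name `count_dof` is hypothetical and the input numbers are those quoted in the text.

```python
def count_dof(canonical_variables: int, first_class: int, second_class: int) -> int:
    """Dirac counting of physical degrees of freedom (hypothetical helper).

    Each first class constraint removes two phase-space dimensions and each
    second class constraint removes one; the remaining phase-space dimension
    is divided by two to count configuration-space degrees of freedom.
    """
    remaining = canonical_variables - 2 * first_class - second_class
    if remaining < 0 or remaining % 2 != 0:
        raise ValueError("inconsistent constraint counting")
    return remaining // 2


# f(GB) without kinetic term: 32 (metric) + 2 (scalar) canonical variables,
# 8 first class constraints, 7 primary + 5 secondary second class constraints.
f_gb_dof = count_dof(canonical_variables=32 + 2, first_class=8, second_class=7 + 5)

# With an explicit kinetic term for the scalar, the constraint on pi_phi is lost
# and the 6 primary constraints generate 6 secondary ones: 6 + 6 second class.
f_gb_kinetic_dof = count_dof(canonical_variables=32 + 2, first_class=8, second_class=6 + 6)

print(f_gb_dof, f_gb_kinetic_dof)
```

Both variants give the same number of DOF, in line with the statement that adding the kinetic term changes the constraint structure but not the total count.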
Notice that the same argument a priori seems to apply to f (R) too, suggesting the (erroneous) conclusion that f (R) needs an explicit kinetic term for the scalar field in order to be ghost free. This is obviously not the case and we believe the reason lies in the very special structure of this theory. Indeed a conformal transformation, performed on the equivalent formulation (2.25) of the theory, removes the coupling between the metric and the scalar field which acquires its own kinetic term. A similar transformation does not seem to exist for f (GB).

f (P ) -Chern-Simons gravity
The reformulation (2.25) shows that f(P) is related, up to a potential term, to non-dynamical Chern-Simons gravity, whose action is given in (2.36). The Chern-Simons modification of gravity is usually seen as an effective field theory (EFT), truncated at quadratic order in the curvature, in a low-energy expansion of a more fundamental theory [35]. Indeed, since it leads to equations of motion with higher-order derivatives, it is expected to contain Ostrogradsky modes if treated as a complete theory (i.e. not as a perturbative expansion).
For the so-called dynamical Chern-Simons gravity (where also an explicit kinetic term for φ is present), [36] showed that there is at least a ghost instability above a certain momentum cutoff and [37] provided evidence that the theory does not admit a well-posed initial value formulation (see also [38] for numerical simulations using the perturbative approach). However, to the best of our knowledge, a proper canonical analysis of this theory has never been performed in order to count the number of DOF at the nonlinear level. In the following, we present a canonical analysis of non-dynamical Chern-Simons gravity (2.36), then we add a potential U to study f (P ). Finally, we also add explicitly a kinetic term for φ in order to analyse the dynamical Chern-Simons gravity.

a. Non-Dynamical Chern-Simons gravity
Using the decomposition of the Riemann tensor given in (2.3), (2.4) and (2.5), and the equivalent first-order formulation of the action, the Pontryagin term gives $\Pi^{ij}$ and V as in (2.37) and (2.38), where we have used $\varepsilon^{ijk} = \epsilon^{ijk}/\sqrt{\gamma}$.
The above Π ij and V satisfy two important properties, related to the invariance of the action (2.36) under conformal transformations. First Π ij is traceless, meaning that the action does not contain time derivatives of the trace of Q ij , Q ≡ γ ij Q ij . Second, one can check that the dependence of the potential V on Q is at most linear, meaning that Q effectively plays the role of a Lagrange multiplier, similarly to the lapse and shift. It is therefore useful to explicitly decompose any tensor into its trace and traceless components: we drop the indices to indicate the trace and use a tilde to denote the traceless part.
The total Hamiltonian takes the form (2.39), where the various terms are defined in (2.40)-(2.42). The 6 primary constraints $\chi^{ij} \approx 0$ (2.31) can be divided into trace and traceless parts, as in (2.43). The time evolution of the primary constraint $P \approx 0$ leads to the secondary constraint $H_c \approx 0$. The evolution of the constraints $\pi_\mu \approx 0$ yields, as usual, the secondary constraints $H_\mu \approx 0$. Before investigating the time evolution of $\tilde\chi^{ij} \approx 0$, it is useful to compute the Dirac matrix $\tilde\Delta^{ij,k\ell}$ associated with these constraints, which is given by (2.46), where $\phi_m \equiv D_m \phi$. Assuming that $\phi_i$ is non zero, one sees that the symmetric matrix $(\phi_k \phi_\ell)$ is a null eigenvector of the Dirac matrix, i.e.
$$ \tilde\Delta^{ij,k\ell}\, \phi_k \phi_\ell = 0 . \qquad (2.47) $$
At this stage, it is useful to introduce the projector orthogonal to $\phi_i$; the projection orthogonal to $\phi_i$ of any 3-dimensional tensor will be denoted with a hat in the following.
Let us now return to the constraint analysis. Evolving the 5 primary constraints $\tilde\chi^{ij}$ and taking the projection along the direction $(\phi_i \phi_j)$, one gets equation (2.49), which can be solved in general to determine the Lagrange multiplier λ. The projection along $\hat\gamma_{ij}$ determines the 4 Lagrange multipliers $\hat{\tilde\xi}_{ij}$ in terms of the canonical variables, as the matrix $\hat{\tilde\Delta}^{ij,k\ell}$ is invertible. Finally, the time evolution of the constraint $\pi_\phi \approx 0$ yields the component $\tilde\xi_{ij} \phi^i \phi^j$ of the Lagrange multipliers $\tilde\xi_{ij}$. At this point we are left with the secondary constraints $H_0$, $H_i$ and $H_c$, and the Lagrange multipliers $\xi^\mu$ and $\xi$ are still undetermined. It is easy to see that the primary constraints $\pi_\mu \approx 0$ and $P \approx 0$ have vanishing Poisson brackets with all the other constraints, i.e. they are first class. By contrast, it is a non-trivial task to show that their associated secondary constraints $H_0$, $H_i$ and $H_c$ are also first class (up to the addition of second class constraints) and that the algebra closes. However, it is natural to expect that this is indeed the case since these constraints are associated with symmetries, namely the diffeomorphism and the conformal invariance of the action (2.36), and we will assume so in the following 10 . In summary, we thus have 32 (metric) + 2 (scalar) canonical variables constrained by 10 first class and 6 second class constraints, leading to [34 − (10 × 2) − 6]/2 = 4 degrees of freedom.
This counting applies only to the action (2.36). If one adds a potential U (φ), as is necessary for f (P ), or the standard Einstein-Hilbert (EH) term, as in the case of Chern-Simons modified gravity, the total Lagrangian is no longer conformally invariant. As a consequence, the constraints P and H c become second class. This gives one extra DOF in comparison with the above analysis, leading to a total of 5 degrees of freedom for f (P ) or for Chern-Simons modified gravity.
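The step from 4 to 5 degrees of freedom can be made explicit with the standard Dirac counting. The symbols $N_{\rm can}$, $N_1$ and $N_2$ (numbers of canonical variables, first class and second class constraints) are notation introduced here for illustration, and the split of the constraints in the second line is our reading of the text rather than a formula quoted from it.

```latex
N_{\rm DOF} = \tfrac{1}{2}\big[\, N_{\rm can} - 2 N_1 - N_2 \,\big]
% Conformally invariant action (2.36): N_1 = 10, N_2 = 6
N_{\rm DOF} = \tfrac{1}{2}\big[\, 34 - 2 \times 10 - 6 \,\big] = 4
% With U(\phi) or the EH term: P and H_c become second class,
% so N_1 = 10 - 2 = 8 and N_2 = 6 + 2 = 8
N_{\rm DOF} = \tfrac{1}{2}\big[\, 34 - 2 \times 8 - 8 \,\big] = 5
```

Moving two constraints from first class to second class frees two phase-space dimensions, i.e. exactly one extra degree of freedom.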
As in the case of the fully degenerate theory (2.10), the primary constraints $\tilde\chi^{ij}$, associated with the higher derivative modes in the Lagrangian, do not generate secondary constraints, leaving therefore the Hamiltonian linear in the momenta $\tilde p^{ij}$. According to our discussion in the introduction, this potentially signals that the extra 2 or 3 DOF (depending on whether there is a conformal invariance or not) are Ostrogradsky modes and the theory is likely to be unstable. Note however that these modes could be ignored if one considers the Chern-Simons term as a perturbative correction to General Relativity in the EFT spirit.
The absence of secondary constraints generated by $\tilde\chi^{ij}$ comes from the non-vanishing of the Dirac matrix (2.46), due to the presence of the spatial derivatives of $Q_{ij}$ in (2.37). In the so-called unitary gauge, i.e. where the scalar field is by construction uniform, the Dirac matrix (2.46) vanishes and the evolution of the 5 primary constraints $\tilde\chi^{ij}$ leads to 5 extra secondary constraints, removing all the Ostrogradsky modes. Considering the unitary gauge expression of CS as a (different) Lorentz breaking theory therefore avoids the problem, and represents a healthy parity violating extension of Horava-Lifshitz gravity that also involves $\dot K_{ij}$.

b. Dynamical Chern-Simons gravity
To conclude our analysis of CS gravity, let us briefly discuss the case of dynamical CS gravity, defined by the action (2.36) supplemented with a kinetic term for the scalar field φ, for instance of the k-essence form (2.51), where F is an arbitrary function with a non-trivial dependence on X (i.e. $F_X \neq 0$). In that case, the primary constraint $\pi_\phi \approx 0$ disappears from the Hamiltonian analysis; as a consequence we set λ = 0 in the total Hamiltonian (2.39), and equation (2.49) now becomes a secondary constraint (2.52). Remarkably, the Poisson brackets of this new constraint (2.52) with $P$ and $H_c$ do not vanish in general, making the latter second class rather than first class constraints. This is not surprising since the kinetic term of φ breaks in general the invariance under conformal transformations of the original action (2.36). From the evolution of $H_c$ and the evolution of (2.52) it is now possible to fix the component $\tilde\xi_{ij} \phi^i \phi^j$ of the multipliers $\tilde\xi_{ij}$ and the last multiplier ξ that remained undetermined in the non-dynamical case. The analysis therefore ends up with 6 (primary) + 2 (secondary) second class constraints, in addition to the 8 first class constraints due to the diff invariance, resulting in a total of [34 − 8 × 2 − 8]/2 = 5 DOF. We thus obtain in the dynamical case as many DOF as in non-dynamical Chern-Simons gravity plus the EH term 11 . In the present case, some of the primary constraints associated with the higher derivative modes in the Lagrangian do not lead to secondary constraints. This implies that the Hamiltonian is left linear in the components $\tilde p^{ij}$ of the momentum $p^{ij}$, making the theory probably unstable.
11 If the EH term is not included, in the very special case of a conformally invariant kinetic term $F(\phi, X) = f(\phi) X^2$ in (2.51), we still lose the primary constraint $\pi_\phi \approx 0$, but the constraints $P \approx 0$ and $H_c \approx 0$ remain first class due to the preserved conformal invariance of the action. In addition we have the 5 primary constraints $\tilde\chi^{ij}$ and the secondary constraint (2.52) that are second class. As a consequence, the total number of DOF is [34 − (10 × 2) − 6]/2 = 4.

III. CHIRAL SCALAR-TENSOR THEORIES
Inspired by the analysis of f (P ) and Chern-Simons gravity, in this last section we entertain the possibility of constructing healthy scalar-tensor theories, i.e. without Ostrogradsky modes, featuring parity violating effects. For this purpose it is essential that the action involves an odd number of Levi-Civita tensors ε µνρσ and, for simplicity, we will restrict our attention to the cases where there is only one. CS action (2.36) is the simplest scalar-tensor theory of this kind one can write down, but, given the structure of constraints revealed in the previous section, it is potentially unstable. It is possible however to generalise this action by including first and second derivatives of the scalar field: φ µ ≡ ∂ µ φ and φ µν ≡ ∇ µ φ ν . We will explore two types of extensions. In the first case, we consider Lagrangians involving only first order derivatives of φ, which implies that the Lagrangians must be at least quadratic in the Riemann tensor. In the second case, we consider terms that are linear in the Riemann tensor while linear or quadratic in second derivatives of φ.

A. First derivatives of the scalar field only
With only first derivatives of the scalar field, one cannot construct a Lagrangian that depends linearly on the Riemann tensor (and on the Levi-Civita tensor). With two Riemann tensors, one finds four independent terms of this type, listed in (3.1), where we recall that $X \equiv \phi_\mu \phi^\mu$ and $P \equiv \varepsilon^{\mu\nu\rho\sigma} R_{\rho\sigma\alpha\beta} R^{\alpha\beta}{}_{\mu\nu}$ is the Pontryagin term. In the following, we will analyse the linear combination (3.2), where $a_A(\phi, X)$ are a priori arbitrary functions of φ and X.

Brief Hamiltonian analysis
To perform the Hamiltonian analysis of the action (3.2) we can rely on the same tools used in the previous sections, i.e. the ADM decomposition of equations (2.3), (2.4) and (2.5) together with the first order reformulation of the action. The only new ingredient we need is the decomposition of φ µ , namely One must now take into account two velocity terms, i.e.Q ij andφ. Whereas the presence of the ε tensor prevents terms quadratic inQ ij , it allows mixed terms inQ ij andφ. Therefore, in order to have the 6 primary constraints of the form (2.19) that are the first necessary (but not sufficient) condition to remove the Ostrogradsky modes, the functions a A need to be tuned to avoid this coupling. This requirement leads to the conditions that the a A depend on φ only, and One is left with only one free function, say a 2 ≡ f (φ), and the action (3.2) becomes Thus, by construction, we get the primary constraints of the form Notice that Π ij in the above expression turns out to be traceless and it is therefore useful to decompose these six primary constraints into trace and traceless parts, where we used the same notations as in (2.43).
Remarkably, the tuning (3.4) leaves the action linear not only in $\dot Q_{ij}$, but also in $\dot\phi$. Therefore, we get one additional primary constraint: where $S$ is the action (3.5). In other words, $\varphi$ contains all the terms proportional to $\dot\phi$ in the action and its explicit form is not needed. Using the expression (3.6) one can compute the Dirac matrix $\tilde\Delta_{ij,k\ell}(x, y) = \{\tilde\chi_{ij}(x), \tilde\chi_{k\ell}(y)\}$ between the primary constraints $\tilde\chi_{ij}$ and find that it is not completely degenerate. This means that not all the 5 primary constraints $\tilde\chi_{ij}$, associated to the higher derivative modes in the Lagrangian, lead to secondary constraints. Hence, the action (3.5) is expected to contain Ostrogradsky modes.
In this case, even considering the restriction to the unitary gauge, where the scalar field is assumed to depend on time only, does not help since the action (3.5) identically vanishes. However, one can go back to the full action (3.2) and study it in the unitary gauge, as we do just below.

Unitary gauge
Let us analyse the action (3.2) in the unitary gauge. We express it in a first order formulation, and we still get primary constraints of the form (3.7), now with the following expression for the (traceless) $\Pi_{ij}$ tensor In contrast with the CS term in the unitary gauge, the five primary constraints $\tilde\chi_{ij} \approx 0$ do not generate 5 secondary constraints, because of the presence of the lapse function in the denominator of (3.9). As a consequence, the only way out is to tune the functions $a_A$ in order to eliminate the $\Pi_{ij}$ tensor (3.9) itself, namely removing any higher order derivative in the action. This requirement gives the condition (3.10). Solving, for instance, for the function $a_3$, we obtain the following action The action (3.11) does not involve any higher order time derivative of the metric and in this form represents a parity breaking extension of Horava-Lifshitz gravity. It is indeed clear that it propagates $[20 - (6 \times 2) - 2]/2 = 3$ DOF, exactly as does the Einstein-Hilbert action augmented with the CS term in the unitary gauge. However, its phenomenology should be completely different since, in the action (3.11), we do not have any higher order time derivative of the metric, but only higher order space derivatives.
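As a reading aid, the counting quoted above can be unpacked with Dirac's standard formula for constrained systems. The identification of the three numbers in the bracket (a 20-dimensional phase space $\Gamma$ for the lapse, shift and spatial metric with their momenta, 6 first class constraints and 2 second class constraints) is our assumption about how the bracket is organised; it is not spelled out in the text.

```latex
\#\,\mathrm{DOF}
  = \frac{1}{2}\Big( \dim\Gamma - 2\,N_{\text{first class}} - N_{\text{second class}} \Big)
  = \frac{1}{2}\Big[\, 20 - (6 \times 2) - 2 \,\Big]
  = 3 .
```

Each first class constraint removes two phase space dimensions (the constraint and its gauge direction), whereas each second class constraint removes only one.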

B. Including second derivatives of the scalar field
In this final part, we enlarge the scope of our exploration to include theories with second derivatives of the scalar field in the action. There is only a single Lagrangian that is linear in both the Riemann tensor and the second derivative of the scalar field $\phi$, namely At the next level, i.e. still linear in the Riemann tensor but quadratic in the second derivative of $\phi$ (up to quadratic order in first derivatives of $\phi$), we find 6 independent Lagrangians: We will not investigate here terms of higher order and thus simply consider the general linear combination of the above terms where $b_A(\phi, X)$ are functions of the scalar field and its kinetic term.

Brief Hamiltonian analysis
The action (3.14) now involves second time derivatives of the scalar field, in addition to the second time derivative of the spatial metric $\gamma_{ij}$. As a consequence, it is useful to perform a first order reformulation of the action that also takes $\ddot\phi$ into account, in the same way as we do for $\gamma_{ij}$ (see section II B 2). For this purpose, let us introduce a one-form $A_\mu$ that will replace $\phi_\mu$ in the Lagrangians (3.12)-(3.13) and add to the action (3.14) the following constraint through a Lagrange multiplier [10]: $A_\mu - \phi_\mu = 0$ (3.15). The one-form $A_\mu$ decomposes into its time ($A_*$) and spatial ($\hat A_\mu$) projections and, using the fact that $\nabla_\mu A_\nu = \nabla_\nu A_\mu$, we get the following ADM decomposition for the derivative of $A_\mu$ [10] Substituting this decomposition into the action (3.14), one obtains the first order form that is needed to start the Hamiltonian analysis. At this stage we have a priori 6 Ostrogradsky modes described by the $Q_{ij}$ variables and 1 additional Ostrogradsky mode described by $A_*$. In order to get rid of all of them, the generalization of the Hessian matrix (2.7) that includes $\dot A_*$ must be fully degenerate, which means that the action must not contain terms quadratic in $\dot Q_{ij}$ or $\dot A_*$. Because of the $\varepsilon$ tensor, the action is automatically devoid of terms quadratic in $\dot A_*$, but it does contain mixed terms $\dot A_* \dot Q_{ij}$ in general. The latter disappear if one imposes the conditions which we now assume. In this case, one gets 7 primary constraints of the form where $\alpha$ contains all the terms proportional to $\dot A_*$ in the action. The Dirac matrix $\Delta_{ij,k\ell}(x, y) = \{\chi_{ij}(x), \chi_{k\ell}(y)\}$ between the constraints $\chi_{ij}$ turns out to be non-degenerate and no choice of functions, except the trivial one, can make it vanish. Therefore, the time evolution of the 6 primary constraints $\chi_{ij}$ determines the Lagrange multipliers $\xi_{ij}$ and no secondary constraint is generated. In conclusion, the action (3.14) with conditions (3.18) contains 3 Ostrogradsky modes in the metric sector.

Unitary gauge
Let us now examine the restriction to the unitary gauge of the action (3.14) with the conditions (3.18). In the unitary gauge, the scalar field depends only on time and therefore the components of $A_\mu$ reduce to while the free functions $b_A$ now depend on $t$ and $N$ only. The $\Pi_{ij}$ tensor becomes traceless and the Dirac matrix $\tilde\Delta_{ij,k\ell}(x, y)$ simplifies to However, the above condition also removes all the $\dot Q_{ij}$ terms in the action, which then reduces, in the unitary gauge, to The theory defined by (3.23) propagates only $[20 - (6 \times 2) - 2]/2 = 3$ DOF. This is the same number of degrees of freedom as found for (3.11), but one can note that the present action also involves space derivatives of the lapse function, due to the higher order derivatives of the scalar field.
In principle, one could apply the same type of analysis to more complicated Lagrangians. Our results for the "simplest" Lagrangians do not lead us to believe that one would find a theory devoid of Ostrogradsky ghosts in its fully covariant version. So far, one can conclude from our exploration that the theories we have studied should be considered as low energy EFTs or as Lorentz breaking theories, on the same footing as Chern-Simons or Horava-Lifshitz gravity respectively. In that respect, we leave the phenomenological study of both the actions (3.11) and (3.23) for future work.

IV. CONCLUSIONS
In this paper, we have studied fully and partially degenerate metric theories in four dimensions whose action is invariant under diffeomorphisms and contains at most second derivatives of the metric. Apart from the Einstein-Hilbert action, which propagates two physical degrees of freedom, fully degenerate theories are either trivial (which is the case for the Gauss-Bonnet and the Pontryagin Lagrangians) with no degree of freedom, or contain Ostrogradsky modes (which is the case for the cubic $C$ Lagrangian). We have performed a complete Hamiltonian analysis of the $C$ Lagrangian which shows that the theory indeed contains 5 DOF, 3 of them being Ostrogradsky ghosts, as confirmed by the analysis of linear perturbations.
We have also considered partially degenerate theories whose Lagrangian is given by an arbitrary (non-linear) function of one of the fully degenerate Lagrangians, i.e. $f(Y)$ Lagrangians, with $Y = R, GB, P$. More general partially degenerate Lagrangians (depending on several of the $Y$'s) are discussed in an appendix. Following the conservative criterion we set in the Introduction, i.e. that each (second class) primary constraint needs to generate a secondary constraint in order to remove the Ostrogradsky ghost, we conclude that, apart from $f(R)$, partially degenerate theories seem to contain Ostrogradsky modes. $f(GB)$, after being reformulated as a scalar-tensor theory, can easily be cured by adding a kinetic term for the scalar field. $f(P)$ instead, which can be reformulated as non-dynamical Chern-Simons plus a potential, contains three extra modes, as does the dynamical case. However, when one restricts Chern-Simons modified gravity to the unitary gauge, where the scalar field is a function of time only, one obtains a Lorentz breaking theory where all the Ostrogradsky modes are removed.
Finally, we considered new parity breaking scalar-tensor theories constructed by combining the Riemann tensor and the (first or second) derivatives of the scalar field. Even though they contain Ostrogradsky modes in their covariant version, we have identified new classes of "chiral scalar-tensor theories" which propagate only three degrees of freedom in the unitary gauge. In this sense, they have to be considered as generalizations of Chern-Simons modified gravity, i.e. as low energy EFTs, or as Lorentz breaking theories with a parity violating sector.
Various phenomenological developments in these new theories are worth exploring, in particular the propagation of gravitational waves and black hole solutions. A preliminary study shows that these terms modify GR solutions only for metrics that break parity, such as rotating axisymmetric geometries; in the other cases the theories admit certain GR solutions, notably Schwarzschild (see [40] for similar behaviour in CS gravity).
where the tensors $A$, $B$, $C$ and $D$ are evaluated in the background, $\nabla$ denotes the covariant derivative compatible with the background metric $g_{\mu\nu}$ and we have included in $E$ all terms which do not involve second time derivatives of the perturbation. Integration by parts allows us to simplify the term in the action which is linear in $\ddot h_{ij}$ as follows with a redefinition of $E$. Now, if we make a Fourier transform in the space coordinates, we find that the dynamics of the Fourier components $\varphi_{ij}$ of $h_{ij}$ is governed by an action of the type where $K_{ijkl}$ is evaluated in the background but could depend on the wave number, and $\hat E$ is the spatial Fourier transform of $E$. Only the skew symmetric component of $K$ is relevant for us because the symmetric component leads (after an integration by parts) to a term which involves only first time derivatives. Hence, without loss of generality, we assume that $K$ is skew symmetric. It is well known that any skew symmetric matrix can be brought to a block diagonal form by a special orthogonal transformation. As $K$ is a $6 \times 6$ matrix, its block diagonal form is where the $\kappa_A$ depend on the explicit form of $K$. Therefore, a change of variable $(\varphi_{ij}) \to (\varphi_1, \cdots, \varphi_6)$ allows us to decouple the different components of $h_{ij}$ in such a way that the action reduces to $\int dt\, d^3k \left( \kappa_1 \dot\varphi_1 \ddot\varphi_2 + \kappa_3 \dot\varphi_3 \ddot\varphi_4 + \kappa_5 \dot\varphi_5 \ddot\varphi_6 \right) + \hat E$.
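For reference, the two linear algebra facts invoked above can be written out explicitly. This is only a sketch: we assume that the relevant term has the schematic form $K_{ab}\,\dot\varphi_a \ddot\varphi_b$, with $a, b = 1, \dots, 6$ a collective label for the symmetric pairs $(ij)$, and the labelling $\kappa_1, \kappa_3, \kappa_5$ of the three independent entries is chosen to match the reduced action; both are our assumptions.

```latex
% Symmetric part K^{(s)}_{ab} = K^{(s)}_{ba}: only first time derivatives survive
K^{(s)}_{ab}\,\dot\varphi_a \ddot\varphi_b
  = \frac{1}{2}\frac{d}{dt}\Big( K^{(s)}_{ab}\,\dot\varphi_a \dot\varphi_b \Big)
  - \frac{1}{2}\,\dot K^{(s)}_{ab}\,\dot\varphi_a \dot\varphi_b .

% Skew symmetric part: canonical form under some O in SO(6)
O^{T} K\, O =
\begin{pmatrix}
 0 & \kappa_1 & 0 & 0 & 0 & 0 \\
 -\kappa_1 & 0 & 0 & 0 & 0 & 0 \\
 0 & 0 & 0 & \kappa_3 & 0 & 0 \\
 0 & 0 & -\kappa_3 & 0 & 0 & 0 \\
 0 & 0 & 0 & 0 & 0 & \kappa_5 \\
 0 & 0 & 0 & 0 & -\kappa_5 & 0
\end{pmatrix} .
```

The first identity shows why the symmetric component can be absorbed into $E$; the second is the standard canonical form of a real skew symmetric matrix, whose $2 \times 2$ blocks pair the variables as $(\varphi_1, \varphi_2)$, $(\varphi_3, \varphi_4)$ and $(\varphi_5, \varphi_6)$.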
For simplicity, we have omitted the explicit space dependence. Thus, when $H$ is invertible, the kernel of $A_{ij,k\ell}$ is of dimension 2, corresponding to the two directions orthogonal to the four (6-dimensional) vectors $\Gamma_A$: in that case, $\mathrm{rank}(A) = 4$. In general, it is easy to see that $\mathrm{rank}(A) = 4 - \mathrm{corank}(H) = \mathrm{rank}(H)$, and, as expected, the theory (B1) is always (partially) degenerate. In particular, we recover the result of the previous subsection, namely $\mathrm{rank}(A) = 1$ when $L$ is a non-linear function of a single variable.

Constraint analysis
To count the number of DOF, we make a Hamiltonian analysis. We proceed as in Sect. II C, and we replace (B1) by the equivalent first order action whose Lagrangian no longer involves second derivatives of the metric. The corresponding phase space is defined by the same Poisson structure as in (2.17). Computing the momenta $P_{ij}$ gives This relation immediately shows that the momenta are constrained. Indeed, when $f_X \neq 0$, one can use four out of the six equations (B8) to express the four functions $f_X$ in terms of the phase space variables, and the remaining two equations are constraints. If only $r$ functions $f_X$ are different from zero, we only need $r$ equations to solve for them in terms of the phase space variables, and we get $c = 6 - r$ constraints denoted $\chi_c \approx 0$. This result is compatible with the formula (B6) as $\mathrm{rank}(A) = 6 - c = r$ and $\mathrm{rank}(H) = r$. Hence, we start with $c$ primary constraints $\chi_c \approx 0$ in addition to the usual 4 constraints $\pi_\mu \approx 0$. The latter lead to the usual 4 Hamiltonian and vectorial constraints which form, together with $\pi_\mu \approx 0$, a set of first class constraints (up to the addition of second class constraints). The analysis of the stability under time evolution of the constraints $\chi_c \approx 0$ is subtler. Even though we do not perform a complete analysis here (it goes beyond the scope of this paper), let us quickly describe generic cases.
When $L$ is a function of $R$ and $GB$ only, one obtains $c$ secondary constraints, and there are no tertiary constraints. Except if $L$ depends on $P$ only (in which case the theory admits a conformal invariance in addition to the diff invariance, as was shown in the previous section), all these $(2 \times c)$ constraints are second class. Thus, the theory propagates $[32 - (2 \times 8) - (2 \times c)]/2 = (8 - c)$ DOF. We recover the fact that $f(R)$ and $f(GB)$ propagate 3 DOF, whereas $f(R, GB)$ propagates 4 DOF. See [41] for an interesting class of $f(R, GB)$ theories.
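The counting in this paragraph can be checked with a few lines of Python. This is only an illustrative sketch: the function names `count_dof` and `dof_r_gb_theory` are hypothetical helpers introduced here, and the inputs (a 32-dimensional phase space, 8 first class constraints, $c = 6 - r$ primary constraints each generating one secondary constraint) are taken from the text above.

```python
def count_dof(phase_space_dim, n_first_class, n_second_class):
    """Dirac counting: a first class constraint removes two phase space
    dimensions, a second class constraint removes one."""
    twice_dof = phase_space_dim - 2 * n_first_class - n_second_class
    if twice_dof < 0 or twice_dof % 2:
        raise ValueError("inconsistent constraint counting")
    return twice_dof // 2


def dof_r_gb_theory(r):
    """DOF when L depends non-trivially on r of the invariants (R, GB).

    Assumes, as in the text, c = 6 - r primary constraints, each generating
    one secondary constraint, with all 2c constraints second class.
    """
    c = 6 - r
    return count_dof(phase_space_dim=32, n_first_class=8, n_second_class=2 * c)


print(dof_r_gb_theory(1))  # f(R) or f(GB): prints 3
print(dof_r_gb_theory(2))  # f(R, GB): prints 4
```

The same helper reproduces the unitary gauge counting quoted in the main text, since `count_dof(20, 6, 2)` also returns 3.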
When the Lagrangian $L$ depends on $P$ and/or $C$, the analysis is more complicated. The time evolution of the $c$ primary constraints does not generically produce secondary constraints. However, this is true only when $c$ is even, in which case the theory propagates $[32 - (2 \times 8) - c]/2 = (8 - c/2)$ DOF. When $c$ is odd, the Dirac matrix of the primary constraints is necessarily degenerate, which implies that there is (generically) one secondary constraint. In that case, the theory propagates