Non-local gravity with a Weyl-square term

Recent work has shown that modifications of General Relativity based on the addition to the action of a non-local term $R\,\Box^{-2}R$, or on the addition to the equations of motion of a term involving $(g_{\mu\nu}\Box^{-1} R)$, produce dynamical models of dark energy which are cosmologically viable both at the background level and at the level of cosmological perturbations. We explore a more general class of models based on the addition to the action of terms proportional to $R_{\mu\nu}\,\Box^{-2}R^{\mu\nu}$ and $C_{\mu\nu\rho\sigma}\, \Box^{-2}C^{\mu\nu\rho\sigma}$, where $C_{\mu\nu\rho\sigma}$ is the Weyl tensor. We find that the term $R_{\mu\nu}\,\Box^{-2}R^{\mu\nu}$ does not give a viable background evolution. The non-local Weyl-square term, in contrast, does not contribute to the background evolution but we find that, at the level of cosmological perturbations, it gives instabilities in the tensor sector. Thus, only non-local terms which depend just on the Ricci scalar $R$ appear to be cosmologically viable. We discuss how these results can provide a hint for the mechanism that might generate these effective non-local terms from a fundamental local theory.


Introduction
Locality is one of the fundamental principles of quantum field theory. Nevertheless, even in a fundamental local theory, non-localities in general appear at an effective level. This can already happen classically, when one integrates out some degree of freedom and obtains an effective theory for the remaining degrees of freedom. Cosmological perturbation theory provides a simple example. In this case, integrating out the short-wavelength modes, one finds effective non-local equations for the long-wavelength modes [1]. At the quantum level, non-local terms emerge when one computes the effective action that takes into account loops of massless or light particles.
Non-local terms featuring inverse powers of the d'Alembertian operator are relevant in the infrared (IR), and it is natural to ask whether such non-localities could play a role as IR modifications of general relativity (GR), and in particular whether they could provide a dynamical explanation for dark energy. An attempt in this direction has been done in [2][3][4], adding to the Ricci scalar R in the Einstein-Hilbert action a term of the form Rf (2 −1 R). In models of this form there is no explicit mass scale, since the function f is dimensionless. However, this class of models has been ruled out by the comparison with cosmological observations [5]. Indeed, after fixing the function f (2 −1 R) so that it gives a viable cosmological evolution at the background level, the corresponding cosmological perturbations fail to reproduce structure-formation data.
The situation is more promising for models where the non-local terms depend on an explicit mass scale (whose physical origin could be due to mechanisms such as those discussed recently in [6]). Models of this type have been explored in the context of the degravitation mechanism [7][8][9]. Partly inspired by these works, as well as by attempts at writing massive gravity in non-local form [10,11], in ref. [12] has been proposed a phenomenological modification of gravity, based on the non-local equation of motion where the superscript T denotes the operation of taking the transverse part of a tensor (which is itself a nonlocal operation), and m 2 is a mass parameter. The extraction of the transverse part ensures that energy-momentum conservation is automatically satisfied. A closed form for the action corresponding to eq. (1.1) is not known. A related non-local model, proposed in [13], is defined by the action where m Pl = 1/ √ 8πG is the reduced Planck mass, and m a mass parameter. These two models are related by the fact that, linearizing over flat space, they give the same equations of motion. However at the full non-linear level, or linearizing over a background different from Minkowski (such as FRW), they are different. Both in eq. (1.1) and in the equations of motion derived from (1.2), the inverse d'Alembertian is taken to be the one defined using the retarded Green's function, to ensure causality. A non-local action such as (1.2) must indeed be understood as an effective action which takes into account quantum loop effect. For the in-out expectation values of a quantum fields the variation of such a nonlocal effective action gives non-local equations of motion which depend on the Feynman propagator. However, the equations of motion for the in-in expectation values, which can be obtained with the Schwinger-Keldysh formalism, depend on the inverse d'Alembertian defined with the retarded Green's function, and are therefore causal.
As discussed in [12][13][14][15][16], in the comparison with cosmological observations the models (1.1) and (1.2) perform remarkably well. They dynamically generate a dark energy and have a realistic background FRW evolution. Their cosmological perturbations are wellbehaved and their quantitative effects are consistent with CMB, supernovae, BAO and structure formation data [15][16][17]. Implementing the cosmological perturbations of the non-local models into a Boltzmann code and performing parameter estimation and a fit to CMB, supernovae and BAO data, one finds that these models fit very well the data, with a χ 2 comparable to that of ΛCDM [18]. 1 It should be stressed that both the model (1.1) and the model (1.2) have the same number of parameters as ΛCDM, with the mass m replacing the cosmological constant, and in this sense they are the only existing models that can compete quantitatively with ΛCDM from the point of view of fitting the data, without being merely an extension of ΛCDM with extra free parameters. Further work on these models has been presented in [20][21][22][23][24][25][26][27][28].
The purpose of this paper is to investigate whether some possible generalizations of these non-local theories are cosmologically viable. We are not driven here by the aim of improving the agreement with the cosmological observations. From this point of view the one-parameter models (1.1) and (1.2) already work so well that, with present data, there is not much point in enlarging the parameter space. Rather, our question is of more conceptual nature. Indeed, the models (1.1) and (1.2) have been proposed on a purely phenomenological ground. The next step should be to understand how they could be generated from a fundamental local theory. From this point of view, it is important to understand whether the fundamental local theory should generate exactly a non-local structure such as that given in eq. (1.1) or in eq. (1.2), or whether more general non-local structures are phenomenologically acceptable. This would give a precious hint on the mechanism at work for the generation of the non-local terms. In this sense, the results of the present paper are meant to pave the way for future works aiming at understanding the origin of the non-local terms.
In this paper we study the generalization of non-local actions of the type (1.2), at the level quadratic in curvatures. As a basis for the curvature-square terms one could use R 2 µνρσ , R 2 µν and R 2 . However, for cosmological application it is more convenient to trade the Riemann tensor R µνρσ for the Weyl tensor C ρσµν , which in d = 3 spatial dimensions is given by where [µν] denotes the anti-symmetrization over the indices, e.g. T [µν] = (1/2)(T µν − T νµ ). We will then consider an action of the form where µ 1 , µ 2 and µ 3 are independent parameters with dimension of squared mass. Another notable combination is the one that corresponds to the Gauss-Bonnet term, However, the choice of basis in eq. (1.4) is more convenient for cosmological applications, because the Weyl tensor vanishes in FRW. It should be made clear that the space of all possible non-local theories is of course very large, and our choice by no means exhaust all possibilities. For instance, taking the transverse part of a tensor opens up new possibilities for writing non-local equations of motions, such as eq. (1.1), which are not easily expressed in terms of an action. However, while the addition to Einstein equations of a term proportional to g µν 2 −1 R T gives a viable cosmological model, the addition of term proportional to (2 −1 R µν ) T has been shown to be cosmologically non-viable, already at the background level [12,14]. Several other options are possible, which can be considered as non-linear extensions of eq. (1.2) or of eq. (1.4). A particularly well-motived extension of eq. (1.2) is obtained writing [29] S = m 2 is the so-called Paneitz operator. Indeed, this operator enters in the computation of the anomaly-induced effective action which, as discussed in [6], and as we will recall in the Conclusions, could be at the origin of these effective non-local IR modifications of GR. More generally, we could also replace 2 2 by ∆ 4 in eq. (1.4).
In the present paper we study the cosmological consequences of the action (1.4), both at the level of background evolution and at the level of cosmological perturbations. The plan of the paper is as follows. In Sect. 2 we discuss the effect of the term R µν 2 −2 R µν , and we will see that it is ruled out already at the background level. We then turn to the effect of adding the non-local Weyl term C µνρσ 2 −2 C µνρσ to the theory that already features the R2 −2 R term. In Sect. 3 we show how to write the non-local Weyl term in local form introducing auxiliary fields. The cosmological consequences of the model are studied in Sect. 4. The non-local Weyl term does not affect the background evolution, but only the cosmological perturbations. The background evolution is therefore the same induced by the R2 −2 R term, which is briefly reviewed in Sect. 4.1. The decomposition into scalars, vectors and tensors of the perturbations associated to non-local Weyl term is non-trivial, and we discuss it in Sect. 4.2. Scalar perturbations are then discussed in Sect. 4.3 and tensor perturbations in Sect. 4.4. We draw our conclusions in Sect. 5. In the appendixes we perform a more detailed comparison between analytic and numerical solutions of the equations in the scalar and tensor sector.
We use the MTW conventions for the metric, Riemann tensor, etc., so in particular our signature is (−, +, +, +), and we set = c = 1. A prime will denote the derivative with respect to x ≡ log a, where a is the scale factor in FRW.

The
The term proportional to µ 2 in the action (1.4) is quadratic in the Weyl tensor. Thus, in the equations of motion it gives a contribution linear in C µνρσ , which vanishes in FRW. Therefore this term does not contribute to the background evolution, but only at the level of cosmological perturbation. To study first of all whether the model (1.4) is viable at the level of background evolution, we then only need to consider the action (1.4) with µ 2 = 0. The corresponding equations of motion can be obtained in a standard way localizing the action through the introduction of auxiliary scalar fields, see e.g. [3,[11][12][13][30][31][32][33][34]. In particular, we introduce two auxiliary scalar fields and two auxiliary tensor fields, These relations can be implemented at the level of the action introducing four Lagrange multipliers ζ 1 , ζ 2 , ζ µν 1 and ζ µν 2 , and writing 3) The variation with respect to the Lagrange multipliers and with respect to the auxiliary fields enforces the constraints ζ 1 = µ 1 S, ζ 2 = µ 1 U , ζ µν 1 = µ 3 S µν and ζ µν 2 = µ 3 U µν . Varying the action (2.3) with respect to the metric, substituting these constraints and making use of eqs. (2.1) and (2.2) afterwards, we find the following covariant equation is the part coming from the R2 −2 R term, already found in [13], and We now specialize to a flat FRW metric, ds 2 = −dt 2 +a 2 (t)dx 2 in d = 3 spatial dimensions and in the presence of an energy-momentum tensor T µν = (ρ, a 2 (t) p δ ij ). Then, only the (0, 0) and the (i, i) component of the field equations are non-trivial. It is convenient to where the sum over i = 1, 2, 3 is understood). Then the non-trivial equations are where ∂ t is the derivative with respect to cosmic time. It is easy to see that the equation for V contains growing modes. To this purpose it is convenient to parametrize the time evolution in terms of the variable x ≡ log a. Then eq. (2.8) reads where f ≡ df /dx and ζ(x) = h (x)/h(x). During RD, MD and a De Sitter phase ζ(x) can be approximated by a constant ζ 0 , with ζ 0 = {−2, −3/2, 0}, respectively. The homogeneous equation associated to eq. (2.9) is the same encountered in the non-local model introduced in [11] and further analyzed in [14]. The solution in the approximation ζ( We see that β + > 0 both in MD and RD, which leads to a growing solution for V . This affects the whole cosmological evolution, since the function V is coupled to all other variables through its contribution to I 00 . So, similarly to the model discussed in [11] and in app. A of [14], based on a term (2 −1 G µν ) T in the equations of motion, there is no stable cosmological evolution. In particular, if we start close to a standard FRW solution at early times, the evolution quickly departs from it and explodes. Thus, the term µ 3 R µν 2 −2 R µν is not cosmologically viable (unless of course the parameter µ 3 is tuned to be extremely suppressed with respect to µ 1 ; imposing such a condition on a phenomenological model is quite unnatural so we do not further explore this possibility). This result is similar to the one found in [35], where it was shown that a term R µν 2 −1 R µν also produces instabilities in the cosmological evolution. Observe that the latter term is rather of the Deser-Woodard type, i.e. of the form R µν f (2 −1 R µν ), with a dimensionless function f and no explicit mass scale m. However, in both cases the instability is ultimately due to the form of the 2 −1 operator on the tensor R µν .

Non-local Weyl-square term and auxiliary fields
The study of the background evolution shows that the term R µν 2 −2 R µν cannot be present in a viable model, i.e. µ 3 = 0, but tells us nothing about the non-local Weyl-square term, since the latter does not contribute to the background evolution. To see whether this term is viable we must move one step forward, and see if the cosmological perturbations in the presence of this term are well-behaved. We then consider the model with action The equations of motion of this model can again be derived introducing auxiliary scalar fields. In this case we need again two auxiliary scalar fields as well as two auxiliary fields with the symmetry properties of the Weyl tensor, defined as Proceeding as in sect. 2, we find the following covariant equation where K µν is the same as in eq. (2.5), and The set of equations describing the model is therefore given by eqs. (3.4) and (3.5), together with and As discussed in detail in [3,21,[32][33][34], transforming a non-local equation such as U = −2 −1 R into the local equation 2U = −R introduces spurious solutions, given by the most general solution of the associated homogeneous equation 2U = 0. These spurious solutions are eliminated fixing once and for all the boundary condition of the equation 2U = 0 (which corresponds to choosing once and for all the definition of the 2 −1 operator). As in [12][13][14]18], we will fix the boundary conditions requiring that the auxiliary fields vanish deep in RD. 2 We also observe that, since the modified Einstein equations (3.4) are obtained varying an action invariant under diffeomorphisms, the energy-momentum tensor T µν is automatically covariantly conserved, ∇ µ T µ ν = 0, as can also explicitly checked taking the covariant derivative of the right-hand side of eq. (3.4) and using the equations of motion for the auxiliary fields.
4 Cosmology with the non-local Weyl-square term 4

.1 Background evolution
At the level of the background, the evolution is the same as in the model µ 2 = 0, already studied in [13]. For later use, let us recall the main results. We consider a flat FRW metric, and we use conformal time, ds 2 = a 2 (η)(−dη 2 + dx 2 ). Then, the Einstein equations and the evolution of the background configurations of the auxiliary fields U and S, eq. (3.6), can be recast in the form where the bar over U and V denotes their background value, the prime is the derivatives with respect to x = log a, and h = H/H 0 = H/H 0 , where H = (1/a)da/dt. As usual, Ω R and Ω M are the energy fractions of radiation and matter at present time normalized with respect to ρ 0 = 3H 2 0 / (8πG). We also introduced the notationV ≡ H 2 0S , as well as ζ ≡ h /h and Eqs. (4.1)-(4.3) is a closed system and can be solved numerically given the initial conditions on the auxiliary fieldsŪ ,V fields and their first derivative. In particular, we assumē U =V =Ū =V = 0, at an initial time deep in RD. It is also useful to define an effective dark energy density ρ DE , rewriting the modified Friedman equation (4.1) as Similarly, we can define an effective DE pressure p DE from the trace of the (ij) modified Einstein equations, and the corresponding equation-of-state (EOS) parameter In ΛCDM one typically uses, as independent cosmological parameters, the baryon density today ω b = Ω b h 2 0 , the cold dark matter density ω c = Ω c h 2 0 , the Hubble parameter today H 0 = h 0 × 100 km s −1 Mpc −1 , the amplitude of scalar perturbation A s , the scalar spectrum index n s and the redshift at which the Universe is half-reionized z re . The dark energy density Ω Λ is then a derived parameter, fixed by the flatness condition. In the nonlocal model with µ 2 = 0 the situation is similar. We can use the same set of independent cosmological parameter and then µ 1 , or equivalently γ 1 , is a derived parameter, fixed again from the condition Ω tot = 1. The corresponding comparison with data for the R2 −2 R model has been performed in [18], implementing the cosmological perturbations of the non-local models into a Boltzmann code and performing parameter estimation. The best-fit values for ω c , ω b and h 0 , using Planck 2013 CMB data, JLA SNe and BAO are ω c = 0.1204 (14), ω b = 2.197(25) × 10 −2 and h 0 = 0.709 (7), corresponding to Ω M = (ω c + ω b )/h 2 0 = 0.283 (9). The value of γ 1 is then fixed so to reproduce this best-fit value, which gives 3 γ 1 9.286 × 10 −3 .

Scalar-vector-tensor decomposition of the perturbations
We now turn to the study of the cosmological perturbations of this non-local model. We consider the following metric, in conformal time and longitudinal gauge where Φ and Ψ describe the scalar perturbations, the transverse vector w i describes vector perturbations, and the transverse-traceless tensor h ij describes tensor perturbations. Let us recall that, for a generic anisotropic fluid, at first order in perturbation theory we have where the density and the pressure have been perturbed around their background values, ρ =ρ + δρ and p ≡p + δp. The pressure perturbation is proportional to the density perturbation, δp = c 2 s δρ, where c 2 s is the speed of sound of the fluid. We define as usual δ ≡ δρ/ρ and θ ≡ δ ij ∂ i v j . The anisotropic stress Σ i j is a symmetric and traceless tensor, Σ i i = 0. We consider the universe filled with radiation and non-relativistic matter, and therefore in the following we will set Σ i j = 0. We expand the auxiliary fields U and S around some background configuration as U =Ū + δU and S =S + δS. In principle one can do the same with U µναβ and S µναβ , parametrizing perturbations around some background configurationŪ µναβ and S µναβ . However, as we will explicitly show later, from the definitions (3.3) it turns out that the background configurationsŪ µναβ andS µναβ are vanishing on a cosmological FRW background. Therefore, the auxiliary tensor fields U µναβ and S µναβ do not enter the background equations and only affect the perturbations. This is consistent with what already observed: at the background level the model (3.1) is indistinguishable from the one with µ 2 = 0. We also observe that, as long as we are interested in perturbations of first-order in the equations of motion, the term P µν in eq. (3.4) does not contributes, since it is quadratic in the tensor fields U µναβ and S µναβ , which have vanishing background configurations on a FRW background for the metric. Computing the Weyl tensor in terms of the metric perturbations we find, to first order and, for any tensor A µν , we define The other components of the Weyl tensor follow from the symmetry relations  16]. Observe also that this value of γ has been obtained using the Planck 2013 data. An updated analysis using Planck 2015 data will be presented in [19].
Note that, since we are retaining only terms of first-order in the metric perturbations, the components of the Weyl tensor may be raised and lowered using the unperturbed metric. For the independent components of the U µναβ tensor, i.e. U 0i0j and U 0ijk , eq. (3.6) on a FRW background is equivalent to the following set of coupled differential equations 12) where, in the first equation, we used the fact that the tensor U ijkl satisfies the same symmetry relation as the Weyl tensor, so in particular also the analogous of the relation (4.10). Analogously one has, for S µνρσ , (4.14) We observe that, as already mentioned in section 4, if we solve eqs. (4.12)-(4.13) for the background configurations of the auxiliary fields, the source term C 0ijk vanishes and we get as a solutionŪ µναβ = 0 andS µναβ = 0.
Note that all equations above involve the same differential operator in conformal time which may allow for unstable solutions. This can be seen explicitly by using x = log a as time variable. Then the differential operator becomes As in section 2, we can take ζ(x) ζ 0 as approximatively constant in the various cosmological epochs, with ζ 0 = {−2 , −3/2 , 0} in radiation, matter and de Sitter, respectively. One then finds that in RD there is a mode growing as e 6x = a 6 which might suggest that the model is badly unstable; however, we will see that the auxiliary fields suffering this instability enter in the Einstein equations with a suppression factor which is just a −6 , so an explicit inspection of the Einstein equations is necessary in order to understand the behavior of the system. We now parametrize the various components of U µναβ and S µναβ in terms of scalar, vector and tensor degrees of freedom. The five degrees of freedom characterizing the symmetric and traceless tensors U 0i0j and S 0i0j can be decomposed in the usual way into a scalar, a transverse vector, and a transverse traceless symetric tensor: with For the other components, one may notice that S 0ijk enters in the Einstein equations only through the combinations S 0(ij)k,k ≡ S ij , and therefore U 0ijk only enters through the combination U 0(ij)k,k ≡ U ij . These combinations are symmetric, traceless and with one vanishing longitudinal component (S ij,ij = 0), so they only carry four degrees of freedom (the scalar is missing). Then, they are naturally parametrized as Observe that ∇ i ∇ j S ij = 0 to first order in perturbation theory since, on FRW, the Christoffel symbol Γ k ij = 0. Similarly, we write with We now have all the elements for writing down the Einstein equations to first order in the cosmological perturbations, and decomposing them in the usual way into scalar, vector and tensor components. We also go in Fourier space, denoting the comoving momentum by k. We introducek ≡ k/(aH), and we use x = log a as time variable, with df /dx = f . It is also useful to introduce the dimensionless couplings 21) and to rescale the fields as follows,

Scalar sector
In the scalar sector, we can take as independent equations the trace and the traceless parts of the (ij) component of the perturbed Einstein equations. The former gives and the latter .
From the covariant conservation of the energy-momentum tensor, ∇ µ T µν = 0, we find where w is defined byp = wρ, while c 2 s from δp = c 2 s δρ. 4 These equations are independent of the specific dark energy content, since they simply express the conservation of T µν . For matter-radiation fluids with no energy exchange among them, we have as usual We set to zero the amplitude of the auxiliary fields and of their derivatives at an initial time deep into RD. For the matter and metric variables, we take the standard adiabatic initial conditions expanded to second order (see e.g. [36], or eqs. (6.1)-(6.3) of [16]). For the spectral index n s and for the amplitude of the gravitational potential δ H , we take the values, n s 0.96 , δ 2 H 3.2 · 10 −10 . We can now solve the set of coupled differential equations (4.23)-(4.28), (4.31)-(4.34), for different values of k.
Let us recall that the modes relevant for the comparison with CMB and with structure formation data correspond roughly to the range k = (10 −3 − 10) h/Mpc, and that in ΛCDM modes with k < For each of these wavelengths we show the standard ΛCDM result (black solid line), the result for the pure R2 −2 R model (blue, dashed line) with µ 1 = 0.014H 2 0 , which corresponds to the best fit value for γ given in Sect. 4.1, and the result for the model with also the Weyl-square term, where we set for definiteness µ 1 = µ 2 = 0.014H 2 0 (red, dotted). We see that, compared to the pure R2 −2 R model, and up to the present time, only the largest modes are affected by the addition of the Weyl-square term. Even for these very long wavelengths, which are the largest observable with CMB anisotropies, the difference with the R2 −2 R model is negligible during MD, and only shows up as a difference of at most (1-2)% percent near the recent epoch, x = 0 (although scalar perturbations in the Weyl model will start to show an instability in the cosmological future, x > 0). This means that, compared to the pure R2 −2 R model, the addition of the Weyl-square term has basically no effect on SNe, BAO, and structure formation data while, in the CMB, it only affects the late integrated Sachs-Wolfe (ISW) effect, just at the level of order (1-2)% percent. The late ISW effect only contributes to the very low multipoles, and even for these multipoles it is sub-leading with respect to the (non-integrated) Sachs-Wolf effect. Thus, among all cosmological observations testing scalar perturbations, the inclusion of the Weyl-square term will only affects the lowest CMB multipoles, and even for these it will only result in an overall deviation of the C l at the sub-percent level. Therefore, in the presence of the non-local Weyl-square term, scalar perturbations remain are well-behaved (at least up to the present time) and in principle consistent with the data.
Conceptually, it is however interesting to observe that, as we extend the time integration more into the future, we start to see more and more deviations in the scalar perturbations of the Weyl-square model, compared to ΛCDM or to the pure R2 −2 R model. The behavior of the long wavelength modes can be understood analytically, and we perform the corresponding analysis in App. A.

Tensor sector
We next turn to tensor perturbations, where we will see that the inclusion of the non-local Weyl-square term can be more problematic. In the tensor sector we have to solve 5 coupled second-order differential equations in the variables h ij , u ij , v ij , σ ij , s ij , coming from the TT part of the Einstein equations and of the equations for the auxiliary fields. With the field redefinition (4.22), the relevant equations arê where in the last equation we have neglected the contribution coming from the anisotropic stress.
As usual, we separate the Fourier components h ij (η, k) into its plus and cross polar- 40) and similarly for all other transverse-traceless tensorsŝ ij ,û ij , etc. The equations for the two polarizations are the same, and we denote generically their amplitudes as h g (not to be confused with the reduced Hubble parameter h = H/H 0 ), s g , u g , etc.
We solve numerically eqs. (4.35)-(4.39) starting from an initial redshift z i = 10 9 in radiation domination, with GR-like initial conditions for the metric tensor mode h g , i.e. h g (z i ) = 1 and h g (z i ) = 0, and vanishing initial conditions for the auxiliary fields u g , v g , s g and σ g and their first derivatives. The result of the numerical integration is shown in Fig. 2 where, for clarity, we show separately the result for the same three comoving wavelengths already used in Fig. 1 for scalar perturbations. For each wavelength we now display only the standard ΛCDM result (black solid line) and the result for the Weyl-square model with µ 1 = µ 2 = 0.014H 2 0 (red, dotted), since the result for the pure R2 −2 R model, on this scale, are hardly distinguishable from the result for ΛCDM. We now find the surprise that, in the Weyl-square model, tensor modes become unstable. The instability starts some time after the RD-MD transition and, as we will show analytically in App. B, is due to the fact that the auxiliary fields, which are zero during RD, gradually become non-vanishing in MD, and when they are sufficiently large they begin to source the instability. We also observe that the instability is stronger for the short wavelengths, i.e. large frequencies. This is due to the fact that it is actually the time derivative of the auxiliary fields that sources the instability, so the effect is larger for the modes oscillating with large frequency. Observe also that in Fig. 2 we only plot the result up to log a = −1.8. If we extended these plots up to the present time log a = 0, the instability would bring the amplitude well outside the vertical scale of the plots. The corresponding plots on a logarithmic vertical scale, extended up to the log a = 1 in the cosmological future, are shown in Fig. 3.
To assess whether this instability in the tensor sector rules out the model which features the non-local Weyl square term, we need to make assumptions on the initial spectrum of perturbation, h g,in (k), whose subsequent evolution is then amplified as in Fig. 3. This initial spectrum depends on the theory of the primordial Universe. Some alternatives to the standard slow-roll inflationary paradigm predict a negligible primordial tensor spectrum. For instance, the pre-big-bang model [37] predicts a primordial relic GW spectrum that, in some range of its parameter space, can be sizable, but for other values of the parameters can be totally negligible [38,39]. As another example, the ekpyrotic model predicts a negligible amount of primordial tensor perturbations [40,41]. However, to generate an adiabatic spectrum of scalar perturbations, the pre-big-bang must appeal to mechanisms such as the decay of massive axions, leading to the 'curvaton' mechanism [42] which, even if potentially viable, are less natural than the generation of adiabatic perturbations in standard slow-roll inflation.
If we stick to the standard inflationary paradigm, the amplitude of the primordial tensor perturbations is characterized by a tensor-to-scalar ratio r, with r < O(0.1) to comply with the joint BICEP2/Keck Array-Planck analysis [43]. Several single-field inflationary models give predictions close to this experimental bound (or in excess of it, in which case they are ruled out). In any case, even the inflationary models that give a small primordial GW spectrum, such as the Starobinsky model, still predict values of r at least of order 10 −3 .  In terms of the GW energy density per unit logarithmic interval of frequency, defined as usual as h 2 0 Ω gw (f ) = (1/ρ c )dρ GW /d log f (where the frequency f today is related to k by k = 2πf ), the standard functional dependence of the inflationary prediction is h 2 0 Ω gw (f ) ∝ f −2 for frequencies f in the range 3×10 −18 Hz < f < f eq , where the condition f > 3×10 −18 imposes that the GW is inside the horizon today, while f eq 1.6 × 10 −17 Hz is the value of the frequency of the mode that re-enters the horizon at the RD-MD transitions.
Modes with f > f eq re-entered the horizon already during RD, and for these frequencies h 2 0 Ω gw (f ) ∝ f n T , where n T = −r/8, so in this regime the spectrum is basically flat (see e.g. [44] for review).
In the presence of the non-local Weyl-square term, this GW spectrum is further amplified by its evolution during MD, as shown in Fig. 3. We can compute this extra amplification by comparing mode by mode the ΛCDM result with the result in the presence of the non-local Weyl-square term. We find that, because of this further amplification, in the regime 3 × 10 −18 Hz < f < f eq now h 2 0 Ω gw (f ) grows as f 3 , while in the regime f > f eq it grows as f 4 .
Let us denote by f * = k * /(2π) the value of the frequency corresponding to the pivot scale used by Planck. For the typical value k * = 0.002 Mpc −1 we have f * 3.09 × 10 −18 Hz. We denote by Ω * the corresponding value of Ω GW (f = f * ). Then, the form of the inflationary spectrum today, after the further amplification by the Weyl-square term, is  Even assuming a very low value of h 2 0 Ω * , such as h 2 0 Ω * = 10 −16 (which corresponds to a tensor-to-scalar ratio r ∼ 10 −4 ), such an extremely blue spectrum wildly violates all existing bounds on h 2 0 Ω gw (f ), coming e.g. from pulsar timing, extra radiation or bigbang nucleosynthesis, and even quickly becomes larger than one, as we see from Fig. 4. In principle such a behavior will continue until the intrinsic cutoff of the inflationary spectrum is met, e.g. up to f = O(10 9 ) Hz for typical GUT scale inflation. 5 In this figure we used the result obtained for µ 2 = µ 1 = 0.014H 2 0 and it is worth reminding that, contrarily to µ 1 which is fixed by the background evolution, µ 2 is in principle a free parameter. However it is clear that, unless µ 2 is tuned to ridiculously small values, the resulting GW spectrum is unacceptably large.
The conclusion from this analysis is that, in a cosmological model in which the primordial tensor perturbations are generated by inflation, as well as in any other model in which the primordial tensor perturbations are not completely negligible, the presence of the non-local Weyl-square term in the effective action at late time is ruled out by the resulting overproduction of tensor perturbations.

Conclusions
The aim of this paper was to study a non-local theory of the general form (1.4), which features non-local terms proportional to the mass-square parameters µ 1 , µ 2 , µ 3 , to see which terms are phenomenologically allowed. The term µ 1 R2 −2 R has already been studied in a number of papers, and has been shown to produce a viable cosmological model, both at the level of background evolution and of cosmological perturbations. Here we have found that the addition of a term µ 3 R µν 2 −2 R µν is quickly ruled out, since it induces instabilities in the background evolution. The case for the non-local Weyl-square term, µ 2 C µνρσ 2 −2 C µνρσ , turns out to be more delicate. This term does not contribute to the background evolution. Its scalar perturbations, at least up to the present epoch, are 5 Actually, our numerical integration only works up to momenta k 10 4 H0, i.e. frequencies of order f 3.5 × 10 −15 Hz. For larger frequencies it becomes numerically difficult to follow reliably the growth of the tensor perturbations. However, such frequencies are already well in the range of frequencies that entered the horizon during RD, so we do not expect a further change of regime until the inflationary cutoff is met.
sufficiently close to that of ΛCDM and of the µ 1 R2 −2 R model, so that they do not pose a threatens to the phenomenology of the model (although, for very large wavelengths, they become unstable in the cosmological future). However, the tensor perturbations of the model become unstable already during MD, with the instability being more important for the short wavelengths. This produces a strong amplification of any primordial background of gravitational waves. In particular, if we assume the typical primordial GW background generated by single-field slow-roll inflation, even with very small values of the tensor-toscalar ratio r, the resulting spectrum for h 2 0 Ω gw (f ) is extremely blue, growing as f 4 for modes that re-entered in RD, leading to a complete violation of all bounds on h 2 0 Ω gw (f ). Indeed, increasing the frequency, h 2 0 Ω gw (f ) even quickly becomes larger than one. Thus, at least in the standard scenario where the initial relic GW background is not totally negligible, also the term µ 2 C µνρσ 2 −2 C µνρσ is phenomenologically ruled out. 6 The crucial next step is to understand the mechanism that generates such non-localities from a fundamental local theory. The results that we have presented in this paper give a useful hint, since they indicate that we need a mechanism that generates precisely the R2 −2 R term (or the term in eq. (1.1)), rather than most general non-local structures involving R µν ot the Weyl tensor. A mechanism of this type has been recently suggested in [6]. The starting point was the observation in [45,46] that, when gravity is coupled to massless particles, the conformal mode acquires a kinetic term because of the conformal anomaly. In four dimensions the anomaly-induced effective action is where E is the Gauss-Bonnet combination, C 2 is the square of the Weyl tensor, b, b are coefficients that depend on the number and type of conformal massless fields, and ∆ 4 was defined in eq. (1.7). Restricting the theory to the conformal mode σ, i.e. writing the metric as g µν (x) = e 2σ(x) , the non-local action (5.1) becomes local, and reads [45,47] where again Q is a coefficient that depends on the number and type of conformal massless fields. The interesting aspect of this result is that, in momentum space, the propagator of the σ field goes as 1/k 4 , and therefore has strong infrared (IR) effects. In particular, the corresponding propagator in coordinate space grows logarithmically, The situation is quite similar to what happens in the twodimensional XY model. In this case the 1/k 2 propagator in two dimensions again produces a coordinate-space propagator which grows logarithmically. In the XY model this induces a vortex-vortex interaction that, depending on the value of Q, can disorder the system and dynamically generate a mass gap, leading to the well-known Berezinsky-Kosterlitz-Thouless (BKT) transition. It is then quite natural to expect that, in the four-dimensional case, the strong IR effects induced by the 1/k 4 propagator of the conformal mode might lead to a dynamical mass generation for the conformal mode, similarly to the dynamical mass generation in the BKT transition. In this case, it is precisely the term m 2 R2 −2 R that emerges [6]. Indeed, there is no local diff-invariant term that, when written in terms of the conformal mode and expanded in powers of σ, gives a mass term for σ, i.e. starts from a σ 2 term. However, the Ricci scalar computed from the metric g µν = e 2σ(x) η µν , to linear order in σ, is given by R = −62σ + O(σ 2 ). Thus, (upon integration by parts) The non local term m 2 R2 −2 R therefore just provides a diff-invariant way of giving a mass to the conformal mode, while the higher-order interaction terms on the right-hand side of eq. (5.3) (which are non-local even in σ) are required to reconstruct a diff-invariant quantity. The same is true for the model (1.1). Indeed, as we have mentioned, the models (1.1) and (1.2) have the same expansion to linear order over Minkowski space, so the corresponding actions are the same in an expansion to quadratic order over Minkowski, and they both reproduce the term ∝ σ 2 in the action, although with different non-linear completions.
It is quite interesting to see that the models (1.1) and (1.2) have a physical interpretation, as a mass term for the conformal mode, which is absent for the other non-local terms that we have considered, and which could justify the emergence of this specific non-local structure. Further work along these lines is of course needed to put this picture on firmer grounds. The results of the present paper however show that, among the possible nonlocal terms of the form (1.4), only the R2 −2 R term is phenomenologically acceptable, and this points toward a mechanism, such as the dynamical mass generation for the conformal mode, which would produce exactly this non-local structure.
given by the particular solution, u 0 ∝ a 2 /h 2 and s 0 ∝ a 2 /h 4 . The zeroth-order evolution of the auxiliary fields u, s acts as a source for the first-order correction to the gravitational potentials in eqs. (4.23)-(4.24), which to this order become The most relevant correction comes from the effective anisotropic stress generated by the effective dark energy in equation Note that we must however retain the term ink 2 in (4.24) since it may become relevant at late times: indeed, taking for definiteness a MD epoch (h ∝ a −3/2 ), and remembering that k = k/(ah), we see that the contribution to the effective anisotropic stress proportional to k on the right-hand side of A.3 is diluted by a factor a −4 , while the other source terms are diluted by h 2 /a 2 ∝ a −5 . We can solve algebraically (A.3) for Ψ 1 and use the solution to get a second-order differential equation for Φ 1 from (A.2). Then, to first order in γ 2 , the particular solution for the gravitational potential Φ in MD contains a growing mode, which, at late times, dominates over the constant solution. In the left panel of Fig. 5 we show the numerical evolution of the potentials Φ and Ψ in MD, in the approximation in which the effect of dark energy on the background evolution is neglected (an approximation that, as we know from Sect. 4.1, breaks down near the present epoch). In this case, at late times the universe is dominated by matter and the potentials switches from the constant behavior (A.1) at the beginning of the MD epoch, to the analytic behavior (A.4) (red, dashed line) induced by the growing mode at later times. Figure 5: Numerical evolution of the potentials Φ (blue, solid line) and Ψ (red, dashed line), with k = 10 −3 H 0 and γ 2 = γ 1 = 0.014H 2 0 . Left: the background evolution is driven by standard matter and radiation only. The analytical solution for Ψ in matter dominance is the green, dotted line. Right: the background evolution includes dark energy. When the latter becomes relevant at the background level, the solution follows the deSitter growing mode.
We next consider the dark-energy dominated regime. We mimick it with a de Sitter phase with ζ = 0, while h 2 and Ω DE are taken to be constant; we will see that this rather crude approximation (our nonlocal effective dark energy fluid does not behave as a cosmological constant) captures anyways the essential behaviour of superhorizon scales. Then, the growing modes of the auxiliary fields become dominated by the solutions u ∝ a λ + /h 2 , s ∝ a λ + /h 4 , with λ + = (5 + √ 33)/2 ≈ 5.37. We can then proceed as before and compute the correction to the scalar potentials, which now has a growing mode Φ ∝ (α + βx)a λ + x /h 2 . In the right panel of Fig. 5, we show the evolution of the potentials on the background given by the full non-local model. At late times, the growing mode of the solution in de Sitter becomes dominating. Because of this growing behavior, scalar perturbations of very large wavelength eventually leave the linear regime, even if this happens only in the cosmological future. We see that these analytic estimates allow us to correctly reproduce the qualitative features of the numerical results.

B Analytic study of the instabilities in the tensor sector
In this section we show how to understand analytically some aspects of the evolution of the tensor perturbations, studied numerically in Sect. 4.4.
We consider first superhorizon scales. Again, we solve first the equations to zero-th order in γ 1 , keeping only the leading growing terms, and we then plug the solution back into the equation for h g . To first order in γ 2 we find The tensor mode h g will experience the effects of the source term on the right-hand side of eq. (B.1) when the latter will have grown of order of thek 2 h g term present on the left-hand side. Substituting the solutions for u g ,v g s g and σ g we find that the leading-order growth of the source term is given by ∼ γ 2 (k/H 0 ) 4 Ω −3 R a 8 log a. Therefore, a mode with momentum k will show an instability only for values of the redshift that satisfy z 6 ≤ Ω −2 R γ 2 (k/H 0 ) 2 . On the other hand, the super-horizon approximation is valid for scales satisfying the condition z 6 (k/H 0 ) 6 Ω −3 R . These two conditions cannot be satisfied at the same time, unless an unnaturally large value of γ 2 is chosen. Therefore, modes that enter the horizon after the equality are stable during the entire radiation dominated epoch. For these modes the effect of the coupling in eq. (B.1) is negligible and their evolution is the standard one in term of a constant and a decaying mode. This agrees indeed with our numerical result for the mode with λ − = 1 Gpc, shown in Fig. 3.
We next consider the late-time instability of sub-horizon modes. A full analytic understanding of the growth of sub-horizon perturbations in the different epochs is rather complicated since several terms in the solution grow with different powers of the scale factors, but also with differentk-dependent pre-factors. Then, it is quite involved to see, as a function ofk, what is the term that dominates the growth at any given time. However, it is relatively straightforward to understand the behavior of sub-horizon modes at late times, i.e. deep in the DE dominated epoch (so, in particular, in the cosmological future), since in this case the result is dominated by the terms that grows with the highest power of a. As for the scalar case, we approximate the late phase of the evolution of the background as a pure de Sitter phase with ζ = 0 and h = Ω DE = constant. As one can check self-consistently, the fastest-growing modes are the ones associated to the homogeneous solutions of eqs. growth given by u g , v g , s g , σ g ∝ a 5 . Making use of this behavior for the auxiliary fields, to first order in γ 2 the eqaution for h g can schematically be written as where c 1 , c 2 and c 3 are some coefficients constant in time, but which depend onk, as well as on Ω R , Ω M , Ω DE and z i . At late times, the GW mode h g will then develop a growing mode, with leading order behavior h g ∝ a . (B.3) In Fig. 6 we plot the numerical evolution of a tensor mode with very short wavelength, k = 10 3 H 0 , and we compare it to the slope of the analytic estimate (B.3) in the DE dominated epoch (observe that we now plot the result as a function of log 10 (z + 1), where z is the red-shift, so time evolves from right to left). We see that the analytic estimate correctly reproduces the numerical result at late times.