Optimizing large-scale structure data analysis with the theoretical error likelihood

An important aspect of large-scale structure data analysis is the presence of non-negligible theoretical uncertainties, which become increasingly important on small scales. We show how to incorporate these uncertainties into realistic power spectrum likelihoods through an appropriate change of the fitting model and the covariance matrix. The inclusion of the theoretical error has several advantages over the standard practice of using the sharp momentum cut $k_{\rm max}$. First, the theoretical error covariance gradually suppresses the information from short scales as the employed theoretical model becomes less reliable. This allows one to avoid the laborious determination of $k_{\rm max}$, which is an essential part of the standard methods. Second, the theoretical error likelihood gives unbiased constraints with reliable error bars that are not artificially shrunk by over-fitting. Third, it even improves the parameter constraints by including information on the smoothness of the theoretical model, which is ignored in the standard analysis. We demonstrate these points using large-volume N-body data for the clustering of matter and galaxies in real and redshift space. In passing, we validate the effective field theory description of redshift space distortions and show that the use of over-constrained phenomenological models for fingers-of-God causes significant biases in parameter recovery.


Introduction
Galaxy clustering on large scales is becoming ever more important in modern cosmology. Measurements of baryon acoustic oscillations (BAO) and of the power spectrum shape in current data allow one to determine cosmological parameters with a precision that rivals the cosmic microwave background analyses [1][2][3][4][5][6]. Even more progress is expected in the era of upcoming high-precision surveys like Euclid [7,8] and DESI [9]; see e.g. [10][11][12][13].
One of the crucial ingredients for the measurement of cosmological parameters from large-scale structure (LSS) is an accurate covariance matrix for a given summary statistic. The question of how to estimate covariance matrices and how they impact cosmological constraints has stimulated a broad line of research over the past decades. This includes perturbative calculations [14][15][16][17][18][19], measurements from mock catalogs (e.g. [20][21][22]) and studies of the systematic biases that can arise due to uncertainties in the covariance matrix [23][24][25]. While these significant efforts focus mainly on statistical errors, there is another aspect of the covariance matrix treatment that has attracted attention only recently. This is the theoretical error (TE) covariance, which is as important as the statistical covariance for any realistic large-scale structure analysis [10,11,26,27]. The TE covariance originates from imperfect knowledge of the theory model that is used to fit the data. The LSS observables are sensitive to various nonlinear effects whose accurate description is quite challenging, both in analytical approaches to LSS clustering and in N-body or hydrodynamical simulations. The importance of these nonlinear effects grows on small scales and therefore impacts a large number of the Fourier modes that are used to constrain cosmological parameters.
Theoretical error becomes a leading source of uncertainty when the statistical errors become sufficiently small (either due to a large survey volume or the analysis being pushed to smaller scales), at which point it has to be included in the analysis to avoid biasing the output cosmology. The standard approach to this situation is to assume that the fitted theory model is perfect up to a certain scale (e.g. $k_{\rm max}$ in the power spectrum case) and perform the analysis with this data cut. This approach has a number of disadvantages. First, it is very time-consuming, since the only way to determine the optimal $k_{\rm max}$ is to run many Markov-Chain Monte-Carlo analyses of realistic mock data samples, see e.g. [1,[28][29][30][31]. Moreover, if a different model or set of priors is used, one has to re-validate the $k_{\rm max}$ choice, because the parameter variances and degeneracy orientations change in this case [28,32]. The second drawback of the $k_{\rm max}$ analysis is that biases for different parameters become sizeable at very different scales. This makes the choice of $k_{\rm max}$ ambiguous. The usual approach is to choose the cuts such that the biases on all cosmological parameters are significantly smaller than the statistical error. On the one hand, this choice is the most conservative, as it ignores possible improvements on cosmological parameters that are unbiased for larger $k_{\rm max}$. On the other hand, it still does not guarantee the absence of bias in nuisance parameters, which can be problematic when doing a joint analysis of different observables (for example, power spectrum and bispectrum) or combining different datasets. Finally, using a sharp data cut neglects two important properties of the power spectrum: (a) the broadband power spectrum is smooth, and (b) the shape of the common features, such as BAO wiggles, can be reliably calculated for any wavenumber [33][34][35][36][37][38].
These two facts reveal yet another important limitation of the standard $k_{\rm max}$ analysis: it neglects all information from scales beyond $k_{\rm max}$, part of which can be recovered by assuming some reasonable smoothness of the power spectrum and using the shape information from the BAO wiggles.
In this paper we show how all these issues can be resolved by including the theoretical error covariance in the analysis. As a first step, we re-derive the theoretical error covariance, emphasizing the similarity of this procedure to the exact marginalization over a nuisance parameter. This example naturally suggests that the theoretical error likelihood should include both the mean and the covariance of the theoretical uncertainty, with identical shapes. Then, we propose a new way of extracting these quantities directly from the mock data, and argue that this method gives a more reliable estimate than the original theoretical error approach relying exclusively on perturbation theory.
We apply the TE formalism to N-body simulation data with volume ∼ 100 (Gpc/h)$^3$, similar to the cumulative volume of upcoming spectroscopic galaxy surveys. We explicitly illustrate how the TE approach allows one to obtain accurate and optimal constraints that are independent of the choice of $k_{\rm max}$ and that are not affected by over-fitting. Moreover, the TE guarantees that both the principal components and their 1d projections onto particular parameter planes are unbiased, along with the variances of these parameters. Finally, we show that using the TE covariance indeed improves the constraints on cosmological parameters by including extra information from the BAO wiggles and exploiting the smoothness of the broadband power spectrum. It is important to stress that all these results can be obtained with a single MCMC analysis, in contrast with the standard $k_{\rm max}$ approach.
We scrutinize the effect of the TE covariance on the power spectrum analyses in four different setups: dark matter and galaxies, in real and in redshift space. We analyze the large-volume N-body simulation data using the power spectra calculated in the framework of the effective field theory of large-scale structure (see [1,28] and references therein) and explicitly demonstrate that the true input cosmology is extracted in an unbiased manner in all of these different examples. Our analyses also validate the implementation of the effective field theory for various tracers with the CLASS-PT code [39]. In this context, our work can be viewed as a companion paper to Ref. [39], as it proves that the accuracy of CLASS-PT meets the requirements of future high-precision surveys.
The paper is organized as follows. We give a theoretical background on the theoretical error in Section 2. Section 3 specifies our methodology, the theoretical model, and the N-body simulations. Then, we present the TE analysis and confront it with the standard $k_{\rm max}$ approach for dark matter in real space in Section 4, dark matter in redshift space in Section 5, galaxies in real space in Section 6, and galaxies in redshift space in Section 7. We conclude in Section 8. An additional discussion of the choice of the theoretical error is given in Appendix A.

Theoretical error likelihood
In this section we review the theoretical error formalism. Let us first repeat the derivation of Ref. [26], focusing on the Gaussian power spectrum likelihood

$-2\ln \mathcal{L} = \left(P^{\rm th}_i - P^{\rm d}_i\right) C^{-1}_{ij} \left(P^{\rm th}_j - P^{\rm d}_j\right)$ , (2.1)

where the sum over (i, j) is implicitly assumed, $C_{ij}$ is the data covariance matrix, $P^{\rm th}$ is the true theoretical model and $P^{\rm d}$ is the datavector. In practice, the true theoretical model is not known and we use an approximate model $P^{\rm approx}_i$, e.g. the one-loop perturbation theory, which becomes less and less accurate on small scales. The theoretical error is the difference between the (unknown) true theory and this approximation,

$P^{\rm TE}_i \equiv P^{\rm th}_i - P^{\rm approx}_i$ . (2.2)

Let us assume that $P^{\rm TE}_i$ is drawn from a Gaussian distribution with mean $\bar P^{\rm TE}_i$ and covariance $C^{(E)}$. Marginalizing our original likelihood (2.1) over the theoretical error, we get the following marginalized likelihood:

$-2\ln \mathcal{L} = \left(P^{\rm approx}_i + \bar P^{\rm TE}_i - P^{\rm d}_i\right) \left[C + C^{(E)}\right]^{-1}_{ij} \left(P^{\rm approx}_j + \bar P^{\rm TE}_j - P^{\rm d}_j\right)$ . (2.4)

It is instructive to consider a few illustrative examples. First, let us assume that the theoretical error covariance is diagonal, with entries that vanish for $k_i < k_{\rm max}$ and diverge for $k_i \geq k_{\rm max}$. With this choice the bins with $k \geq k_{\rm max}$ are thrown away from the likelihood and the bins with $k < k_{\rm max}$ are assumed to have no theoretical error. This limit corresponds to the standard analysis with fixed $k_{\rm max}$.

Let us now take a look at the opposite extreme, where the theoretical error is correlated across all k-bins. A simple example is given by

$P^{\rm TE}_i = \alpha E_i$ , (2.6)

whose coefficient $\alpha$ is expected to be Gaussian-distributed,

$\mathcal{P}(\alpha) \propto e^{-\alpha^2/2}$ . (2.7)

Such a form of the TE, as well as the prior on $\alpha$, can be motivated either from perturbation theory (as higher derivative counterterms) or from N-body simulations. Let us now marginalize over $\alpha$ just like one normally marginalizes over nuisance parameters in realistic cosmological analyses. The marginalized Gaussian likelihood can be easily obtained from (2.1),

$-2\ln \mathcal{L} = \left(P^{\rm approx}_i - P^{\rm d}_i\right) \left[C + E E^T\right]^{-1}_{ij} \left(P^{\rm approx}_j - P^{\rm d}_j\right)$ , with $\left[C + E E^T\right]^{-1} = C^{-1} - \dfrac{C^{-1} E E^T C^{-1}}{1 + E^T C^{-1} E}$ , (2.8)

where we have used the Sherman-Morrison identity. We can see that in this case, since the k-dependence of the TE is entirely fixed, the new covariance is fully correlated even if the data covariance $C_{ij}$ is diagonal. We stress that so far our calculation has been exact: up to marginalization over $\alpha$, the likelihood (2.8) contains the same information as the original likelihood (2.1).
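The equivalence between marginalizing over the amplitude of a fixed error shape and adding a rank-1 block to the covariance can be verified numerically. This is a minimal sketch with a toy covariance and shape (not the paper's pipeline):

```python
import numpy as np

# Toy check: marginalizing a Gaussian likelihood over the amplitude of a fixed
# error shape E (with a unit Gaussian prior) replaces C -> C + E E^T.
# The Sherman-Morrison identity gives the inverse of this rank-1 update.
rng = np.random.default_rng(0)
n = 8
A = rng.normal(size=(n, n))
C = A @ A.T + n * np.eye(n)      # toy positive-definite data covariance
E = rng.normal(size=n)           # toy theoretical-error shape

Cinv = np.linalg.inv(C)
# (C + E E^T)^{-1} = C^{-1} - (C^{-1} E)(C^{-1} E)^T / (1 + E^T C^{-1} E)
sm_inv = Cinv - np.outer(Cinv @ E, Cinv @ E) / (1.0 + E @ Cinv @ E)
assert np.allclose(sm_inv, np.linalg.inv(C + np.outer(E, E)))
```

Because the update is rank-1, the marginalized covariance is fully correlated across all bins even when C itself is diagonal, exactly as stated above.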
In reality, the exact k-dependence of the theoretical error is not known (otherwise it would be included in the theory model), but we want to marginalize over it just like we did with $\alpha$. In other words, we want to marginalize over all possible curves within some bounds given by the expected size of the theoretical error and with a sufficient degree of smoothness. This can be achieved by choosing an appropriate TE covariance matrix. Following [26], we use the ansatz

$C^{(E)}_{ij} = E_i E_j \exp\left(-\dfrac{(k_i - k_j)^2}{2\Delta k^2}\right)$ , (2.9)

where we have introduced a finite coherence scale $\Delta k$, whereas $E_i$ is a smooth, k-dependent envelope for the theoretical error. Note that we neglect the cosmology-dependence of $C^{(E)}$, in the same way as is customarily done for the data covariance [40] (see [41] for a justification of this practice for the CMB). The coherence scale ensures that neighboring bins at distances smaller than $\Delta k$ are almost fully correlated, while the allowed theoretical uncertainties can freely vary only on separations larger than $\Delta k$. Choosing the appropriate limits, we can recover the two previous examples. In the limit $\Delta k \to \infty$ all k-bins are correlated and the theoretical error corresponds to adding the shape E(k) to the theory model and marginalizing over its amplitude. In the opposite limit $\Delta k \to 0$ all bins can fluctuate independently and the theoretical error becomes diagonal, as in our first example. Note that this limit is unphysical, because it allows the theoretical uncertainty to have arbitrarily fast oscillations and it makes the result dependent on the binning. We will proceed with the coherence scale

$\Delta k = 0.1\ h{\rm Mpc}^{-1}$ . (2.10)

On the one hand, it is small enough to allow for typical variations of nonlinear corrections within the usual range of scales used in the data analysis. On the other hand, it is large enough to ensure that the broadband theoretical error is correlated on scales corresponding to the frequency of the BAO wiggles, $k_{\rm BAO} \sim 0.01\ h/{\rm Mpc}$.
This can help extract the information from the BAO even if the envelope $E_i$ is large. Indeed, the theoretical error with this coherence length will effectively discard the broadband information, but will retain oscillatory features like the BAO wiggles [5]. Therefore, while increasing the total covariance, the theoretical error can possibly improve the constraints on cosmological parameters by taking into account the information on the smoothness of $E_i$ and exploiting any additional features. Once the coherence length is fixed, the envelope and the mean of the theoretical error have to be chosen as well. This is the most difficult task, since it strongly depends on the exact model being used and the observable being analyzed. In the context of perturbation theory a reasonable guess is $\bar P^{\rm TE}_i = 0$, and the envelope can be estimated as the typical size of the higher loop corrections not included in the model [26]. This theory-inspired estimate does not use any input from simulations or data, but, strictly speaking, it is expected to be accurate only at the order-of-magnitude level. Indeed, we will see that the perturbation theory-inspired theoretical error can overestimate the actual uncertainty in real space by a factor of a few. The perturbation theory estimate is even less reliable in redshift space, and can significantly underestimate the true theoretical uncertainty, particularly for the higher order multipoles. We illustrate this issue in detail in Appendix A.
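A covariance of this type, a smooth envelope times a Gaussian correlation kernel in k, takes only a few lines to build. A sketch with an illustrative power-law envelope (the function names and envelope are ours, not the paper's):

```python
import numpy as np

def te_covariance(k, envelope, dk_coh=0.1):
    """TE covariance C^(E)_ij = E(k_i) E(k_j) exp(-(k_i - k_j)^2 / (2 dk_coh^2)).

    `envelope` maps k to the error amplitude E(k); `dk_coh` is the coherence
    scale in h/Mpc. Neighbouring bins are almost fully correlated, while bins
    separated by much more than dk_coh fluctuate nearly independently.
    """
    E = envelope(k)
    kernel = np.exp(-(k[:, None] - k[None, :])**2 / (2.0 * dk_coh**2))
    return np.outer(E, E) * kernel

k = np.arange(0.01, 0.3, 0.0025)            # dk = 0.0025 h/Mpc, as in the text
cov = te_covariance(k, lambda kk: kk**2)     # toy power-law envelope
corr = cov / np.sqrt(np.outer(np.diag(cov), np.diag(cov)))
```

By the Schur product theorem the result is positive semi-definite, since both the Gaussian kernel and the rank-1 factor $E_iE_j$ are.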
In order to avoid making overly optimistic or pessimistic choices in our analyses, we choose an alternative, simulation-driven strategy to estimate the mean and the envelope of the theoretical error. The main idea is to estimate the TE covariance from the difference between the data and the best-fit model inferred from an analysis based on large scales. We assume that the typical size of the envelope is equal to the TE mean,

$E_i = \bar P^{\rm TE}_i$ , (2.11)

in which case the theoretical error is fully characterized by a single shape $\bar P^{\rm TE}(k)$. Eq. (2.11) is motivated by the example of the counterterm marginalization (2.8), where the mean and the variance of the theoretical error have identical shapes. In practice, we use the following algorithm to estimate the theoretical error. Given the data, we compute the theory prediction for the cosmological parameters fixed to some fiducial values. Then we fit only the nuisance parameters at some fiducial $k^{\rm fid.}_{\rm max}$, which is chosen to be reasonably small, such that the theoretical error is negligible for this data cut. Note that this analysis is very fast, since it requires computing a single best-fit point involving only the nuisance parameters. After that we take the best-fit theory curve and compute

$\bar P^{\rm TE}_i = P^{\rm d}_i - P^{\rm best\text{-}fit}_i$ . (2.12)

We use the following procedure to select $k^{\rm fid.}_{\rm max}$. Having fixed some fiducial cosmology, we minimize the usual power spectrum likelihoods w.r.t. the nuisance parameters for a set of $k_{\rm max}$ values and plot the best-fit reduced $\chi^2$ statistic as a function of $k_{\rm max}$. The corresponding profiles are flat up to a certain scale $k_{\rm 2-loop}$, at which they start exhibiting significant scale-dependence. This is the scale at which the two-loop corrections become important. Any choice of $k^{\rm fid.}_{\rm max}$ lower than $k_{\rm 2-loop}$ is suitable for our purposes. In practice, we use $k^{\rm fid.}_{\rm max} = k_{\rm 2-loop} - \delta k$, where $\delta k = 0.04\ h{\rm Mpc}^{-1}$. We have tested our procedure by varying $k^{\rm fid.}_{\rm max}$ and found that it does not have any significant effect on the results.
This happens because any residual error made in choosing $\bar P^{\rm TE}_i$ is absorbed by the TE covariance matrix. The details of the choice of the data cuts $k^{\rm fid.}_{\rm max}$ can be found in Appendix A. Overall, our strategy of choosing $k^{\rm fid.}_{\rm max}$ turned out to be quite accurate post factum, and we find that it does not induce any bias in the cosmological measurements.
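The algorithm above can be illustrated on synthetic data. This is a toy sketch (the model, numbers, and noise are invented for illustration): the cosmology-dependent part is held fixed, a single nuisance parameter gamma is fitted on large scales by linear least squares, and the residual defines the TE mean and envelope:

```python
import numpy as np

rng = np.random.default_rng(1)
k = np.arange(0.01, 0.4, 0.0025)
p_fixed = 1e4 * np.exp(-5.0 * k)             # toy "fixed-cosmology" spectrum

# synthetic data: a nuisance response (true gamma = 15), an unmodelled k^4 term
# standing in for higher-loop corrections, and statistical noise
p_data = p_fixed * (1.0 - 30.0 * k**2) + 5e5 * k**4 + rng.normal(0.0, 5.0, k.size)

# step 1: fit only the nuisance parameter on large scales k < k_fid_max, using
# the model P(k) = p_fixed(k) - 2 gamma k^2 p_fixed(k), which is linear in gamma
k_fid_max = 0.1
fit = k < k_fid_max
design = (-2.0 * k[fit]**2 * p_fixed[fit])[:, None]
gamma = np.linalg.lstsq(design, p_data[fit] - p_fixed[fit], rcond=None)[0][0]

# step 2: the data minus the large-scale best fit estimates the TE shape,
# used as both the mean and the envelope of the theoretical error
p_bestfit = p_fixed * (1.0 - 2.0 * gamma * k**2)
p_te = p_data - p_bestfit
```

The residual is dominated by noise on large scales and grows like the unmodelled k^4 term on small scales, which is exactly the behaviour the TE envelope is meant to capture.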
It is important to point out that our algorithm does not exclude the situation where the initial best fit is accidentally close to the data points for k > k fid. max , which would make one conclude that the theoretical error is very small. This is certainly not a generic situation. Since we are using sufficiently low k fid. max in our analyses, the difference between the initial best-fit power spectrum and the data points at smaller scales is expected to be a good representation of the realistic uncertainty of the model. We have checked that our method works well in all analyses of different N-body data studied in this paper. Nevertheless, it should be used with care when the data from different redshifts or tracers is considered.
Realistically, one can calibrate the theoretical error from mock catalogs of a given survey and use it in all analyses of the real data. Unlike the statistical uncertainty, the theoretical error covariance does not scale with volume or shot noise and hence needs to be calibrated only once for a particular tracer and redshift bin.
So far all our formulae have been written for the simplest case of the real space power spectrum analysis. The treatment of the theoretical error becomes slightly more complicated in redshift space. In the plane-parallel (flat sky) approximation, the redshift space power spectrum is a function of two variables: the wavenumber k and the cosine $\mu$ of the angle between the wavevector ${\bf k}$ and the unit line-of-sight direction $\hat{\bf z}$,

$\mu \equiv ({\bf k}\cdot\hat{\bf z})/k$ . (2.13)
The redshift space power spectrum can be conveniently expanded in Legendre multipoles $L_\ell(\mu)$,

$P(k,\mu) = \sum_\ell P_\ell(k) L_\ell(\mu)\,, \qquad P_\ell(k) = \dfrac{2\ell+1}{2}\int_{-1}^{1} d\mu\, P(k,\mu)\, L_\ell(\mu)$ , (2.14)

where $\ell = 0, 2, 4$ correspond to the monopole, quadrupole, and hexadecapole moments, respectively.
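Numerically, the projection onto multipoles is a one-dimensional quadrature in mu. A self-contained sketch, checked against the linear-theory (Kaiser) multipoles, which are known in closed form:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss

def multipoles(p_kmu, k, ells=(0, 2, 4), n_mu=16):
    """P_ell(k) = (2 ell + 1)/2 * int_{-1}^{1} dmu P(k, mu) L_ell(mu),
    evaluated with Gauss-Legendre quadrature (exact for polynomial mu-dependence)."""
    mu, w = leggauss(n_mu)
    grid = p_kmu(k[:, None], mu[None, :])          # shape (n_k, n_mu)
    return {ell: (2 * ell + 1) / 2.0 * grid @ (w * Legendre.basis(ell)(mu))
            for ell in ells}

# check on the Kaiser spectrum P(k, mu) = (b1 + f mu^2)^2 P_lin(k)
b1, f = 2.0, 0.8
k = np.linspace(0.01, 0.2, 50)
p_lin = 1e4 * np.exp(-5.0 * k)
P = multipoles(lambda kk, mu: (b1 + f * mu**2)**2 * p_lin[:, None], k)
assert np.allclose(P[0], (b1**2 + 2*b1*f/3 + f**2/5) * p_lin)   # monopole
assert np.allclose(P[2], (4*b1*f/3 + 4*f**2/7) * p_lin)         # quadrupole
assert np.allclose(P[4], (8*f**2/35) * p_lin)                   # hexadecapole
```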
We need to implement the condition that the theoretical error is a smooth function of both k and $\mu$. If the theoretical error curve were a single fixed function of k and $\mu$, with only its amplitude unknown, an explicit marginalization over this amplitude (similar to Eq. (2.8)) would suggest that the theoretical error covariance is fully correlated across $\mu$ and k bins. At the level of the Legendre multipoles, this means that the theoretical error multipoles would be 100% correlated. However, since the precise shape of the theoretical error is not known by definition, we will impose a finite correlation between the TE multipoles using a model similar to the one describing the correlation of k-bins,

$C^{(E)}_{\ell\ell',\,ij} = E_{\ell,i}\, E_{\ell',j}\, \exp\left(-\dfrac{(k_i-k_j)^2}{2\Delta k^2}\right) \exp\left(-\dfrac{(\ell-\ell')^2}{2\Delta \ell^2}\right)$ , (2.15)

where i, j now run only over the k-bins. In what follows, we will use the minimal non-trivial multipole coherence length $\Delta\ell = 2$.
The appearance of $e^{-(\ell-\ell')^2/(2\Delta\ell^2)}$ in Eq. (2.15) can be understood from the following argument. If the theoretical error in $(k, \mu)$ space were characterized by a single parameter, i.e. in analogy with Eq. (2.6) it were given by a single fixed shape $E(k,\mu)$ with a Gaussian-distributed amplitude $\alpha$, then the theoretical error covariance between the multipoles, obtained after marginalization over $\alpha$, would be 100% correlated. Such a shape indeed appears at the next-to-next-to-leading order (NNLO) in the Taylor expansion of the redshift-space mapping [1]. However, if the theoretical error were characterized by an independent parameter for every single multipole, $P^{\rm TE}(k,\mu) = \sum_\ell \alpha_\ell\, E_\ell(k)\, L_\ell(\mu)$, with $L_\ell$ the Legendre polynomial of order $\ell$ and $\alpha_\ell$ independent amplitudes,
then marginalization over the nuisance parameters $\alpha_\ell$ would produce an effective theoretical error covariance that is diagonal in the multipole numbers. Thus, a finite coherence length in multipole space represents a sensible compromise between these two extreme situations.
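The construction just described, a k-space correlation kernel supplemented by a Gaussian suppression in multipole separation, can be sketched as follows (the envelopes are illustrative placeholders):

```python
import numpy as np

def te_covariance_multipoles(k, envelopes, dk_coh=0.1, dl_coh=2.0):
    """Block TE covariance: for multipoles ell, ell' and bins k_i, k_j,
    C = E_ell(k_i) E_ell'(k_j) * exp(-(k_i - k_j)^2 / (2 dk_coh^2))
                               * exp(-(ell - ell')^2 / (2 dl_coh^2))."""
    ells = sorted(envelopes)
    n = len(k)
    k_kernel = np.exp(-(k[:, None] - k[None, :])**2 / (2.0 * dk_coh**2))
    cov = np.zeros((len(ells) * n, len(ells) * n))
    for a, l1 in enumerate(ells):
        for b, l2 in enumerate(ells):
            l_kernel = np.exp(-(l1 - l2)**2 / (2.0 * dl_coh**2))
            block = l_kernel * np.outer(envelopes[l1](k), envelopes[l2](k)) * k_kernel
            cov[a*n:(a+1)*n, b*n:(b+1)*n] = block
    return cov

k = np.arange(0.01, 0.3, 0.0025)
env = {0: lambda kk: kk**2, 2: lambda kk: 2.0 * kk**2, 4: lambda kk: 4.0 * kk**2}
cov = te_covariance_multipoles(k, env)
```

With a coherence length of 2, the monopole-quadrupole blocks are suppressed by exp(-4/8) ≈ 0.61 relative to full correlation and the monopole-hexadecapole blocks by exp(-16/8) ≈ 0.14, interpolating between the fully correlated and diagonal extremes described above.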

Methodology, simulations and covariances
In this section we discuss the technical details of our analysis: the simulation data, the theoretical model, and the covariance matrices.

Theoretical model
We use one-loop cosmological perturbation theory to compute the power spectra of dark matter and galaxies in real and redshift space. The details of our theoretical model can be found in Ref. [39]; it includes IR resummation to capture the non-linear evolution of the BAO wiggles [33][34][35][36][37][38][42] and UV counterterms that are required to account for the effects of short-scale dynamics, whose description is impossible within perturbation theory itself [43][44][45].

Data
Throughout the paper we will use a suite of LasDamas Oriana simulations [46], which consists of 40 boxes with L = 2.4 Gpc/h on a side, totaling a volume of 553 (Gpc/h)$^3$. The cosmological parameters used to generate the mock catalogs include $n_s = 1$ and $m_\nu = 0$. The details of the LasDamas simulations can be found on the project's website. To reduce the stochastic scatter, we fit the mean of the power spectra extracted from the 40 independent simulation boxes. However, the statistical error corresponding to the total simulation volume is so small that two-loop corrections and inaccuracies of the N-body modeling can supersede cosmic variance already on very large scales. In order to be more realistic, we will use the covariance that corresponds to V = 100 (Gpc/h)$^3$ rather than to the actual LasDamas volume. This reduced volume is comparable to the total volume of future spectroscopic surveys like DESI [9] or Euclid [10], which allows one to interpret our results in the context of these surveys.
We will analyze the statistics of dark matter particles and galaxies in real and redshift space. The redshift snapshots available to us are z = 0 and z = 0.974 for dark matter, and z = 0.342 for galaxies. The galaxy distribution is generated using the HOD model that matches the luminous red galaxy (LRG) sample of the BOSS survey [47], with the shot noise level given in (3.1). The theoretical non-linear spectra are generated with the CLASS-PT code [39], which computes the one-loop perturbation theory integrals using the FFTLog method [48]. We run our MCMC chains with the Montepython sampler [49,50], and analyze these chains and produce triangle plots with the GetDist package [51]. In all analyses of this paper, we fit the cosmological parameters $\omega_{\rm cdm}$, $h$, $n_s$ and the primordial power spectrum amplitude, and also present the derived parameters $\Omega_m$ and $\sigma_8$. We fix $\omega_b$ to the fiducial value used in the simulations. This is done in order to mimic the $\omega_b$ prior that can be taken from BBN or Planck in a realistic analysis [1].

Statistical covariance
We use the covariance matrix in the Gaussian approximation to describe sample variance [14]. This approximation is very accurate for the power spectrum, at least on scales $k_{\rm max} \leq 0.4\ h{\rm Mpc}^{-1}$ [52]. The Gaussian covariance for the real space power spectrum takes the following form for the matter-matter and galaxy-galaxy power spectra ($P_{\rm mm}$ and $P_{\rm gg}$) and the galaxy-matter cross-spectrum ($P_{\rm gm}$), respectively:

$C^{\rm mm}_{kk'} = \dfrac{2\,\delta_{kk'}}{N_k} P^2_{\rm mm}(k)\,, \quad C^{\rm gg}_{kk'} = \dfrac{2\,\delta_{kk'}}{N_k} P^2_{\rm gg}(k)\,, \quad C^{\rm gm}_{kk'} = \dfrac{\delta_{kk'}}{N_k} \left[P^2_{\rm gm}(k) + P_{\rm gg}(k) P_{\rm mm}(k)\right]$ , (3.3)

where $N_k$ is the number of modes in a k-bin of size dk (dk = 0.0025 h/Mpc in all our analyses) and $\delta_{kk'}$ is the Kronecker delta. In practice, we use the measurements of $P_{\rm mm}$ and $P_{\rm gm}$ in the covariance matrix estimates (3.3). The Gaussian covariance (3.4) between the redshift space galaxy multipoles is given in [10]. Expressions (3.3), (3.4) are valid for both dark matter and galaxy statistics. It is worth noting that the galaxy power spectrum should be considered along with the shot noise term (3.1), which accounts for the discrete nature of galaxies. In practice, we evaluate (3.4) in the linear (Kaiser) approximation [53]. All analyses with sharp data cuts are performed with the usual statistical covariances. To run the analyses with the theoretical error, we supplement the statistical covariances with the theoretical ones: (2.9) in real space and (2.15) in redshift space.
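For reference, the diagonal Gaussian covariance of a real-space auto-spectrum can be written down directly. A sketch with a toy spectrum, using the standard thin-shell mode count $N_k = 4\pi k^2 dk \cdot V/(2\pi)^3$:

```python
import numpy as np

def gaussian_cov_auto(k, pk, volume, dk=0.0025):
    """Diagonal Gaussian covariance of an auto power spectrum:
    C_{kk'} = (2 / N_k) P^2(k) delta_{kk'}, with
    N_k = volume * 4 pi k^2 dk / (2 pi)^3 modes in a thin shell of width dk."""
    n_modes = volume * 4.0 * np.pi * k**2 * dk / (2.0 * np.pi)**3
    return np.diag(2.0 * pk**2 / n_modes)

k = np.arange(0.01, 0.3, 0.0025)
pk = 1e4 * np.exp(-5.0 * k)                          # toy spectrum, (Mpc/h)^3
cov = gaussian_cov_auto(k, pk, volume=100.0 * 1e9)   # V = 100 (Gpc/h)^3 in (Mpc/h)^3
sigma = np.sqrt(np.diag(cov))
```

The relative error sigma(k)/P(k) = sqrt(2/N_k) falls as 1/k, which is why small scales carry most of the raw statistical weight and why the TE covariance matters there.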

Dark matter in real space
We start by analyzing the dark matter power spectrum in real space. To that end we use the one-loop IR-resummed template of [36] with the addition of the effective dark matter sound speed counterterm $\gamma$ [43,44], which will be the only nuisance parameter in our analysis:

$P(k) = P_{\text{1-loop}}(k) - 2\gamma\, k^2 P_{\rm lin}(k)$ . (4.1)

As a first step, we fit the datavectors with sharp cuts $k_{\rm max}$ and the pure statistical covariance. As a second step, we add the theoretical error to the analysis. The results in this case do not depend on $k_{\rm max}$ provided that it is reasonably high; in practice, we cut the datavector at $k = k_{\rm NL} \approx 0.3\ h/{\rm Mpc}$ and $0.5\ h/{\rm Mpc}$ for z = 0 and z = 0.974, respectively. The results for the marginalized 1d variances of the cosmological parameters as functions of $k_{\rm max}$ are shown in Fig. 1. Note that we have run a complete MCMC analysis for every $k_{\rm max}$ in this plot; the rightmost points correspond to the theoretical error analysis.

Let us begin with the z = 0 case. One can see from the left panel of Fig. 1 that the measured optimal parameters start deviating from the fiducial ones at $k_{\rm max} = 0.12\ h/{\rm Mpc}$. But starting from $k_{\rm max} = 0.24\ h/{\rm Mpc}$, the estimates move in the opposite direction and accidentally become fully compatible with the fiducial cosmology at $k_{\rm max} = 0.3\ h/{\rm Mpc}$. Clearly, this behavior is caused by over-fitting, and the constraints at $k_{\rm max} = 0.3\ h/{\rm Mpc}$ cannot be trusted even though they enclose the fiducial values within 1$\sigma$. Importantly, the error bars obtained at $k_{\rm max} = 0.3\ h/{\rm Mpc}$ are much smaller than the ones obtained before the fit started deviating from the truth at $k_{\rm max} = 0.12\ h/{\rm Mpc}$. This illustrates the danger of the $k_{\rm max}$ approach, which can be significantly affected by over-fitting. In what follows, we choose $k_{\rm max} = 0.12\ h/{\rm Mpc}$ as the baseline cut at z = 0, since the estimated parameters match the fiducial cosmology within 1$\sigma$ there, while $\sigma_8$ becomes biased by more than 2$\sigma$ for larger $k_{\rm max}$. Now we focus on the z = 0.974 case, whose results are shown in the right panel of Fig. 1.
We see that σ 8 undergoes an excursion beyond 1σ when k max is varied between 0.14 hMpc −1 and 0.3 hMpc −1 , but accidentally crosses the fiducial value at k max = 0.28 hMpc −1 . The error bars at k max = 0.14 hMpc −1 and k max = 0.28 hMpc −1 are significantly different, but both measurements enclose the fiducial cosmology within 1σ, hence the results depend on whether one wants to be more conservative or aggressive. In practice, the second option is often more popular, thus we choose k max = 0.28 h/Mpc as a baseline data cut at z = 0.974.
Let us compare the baseline $k_{\rm max}$ results with the theoretical error analysis, focusing first on the z = 0 case. The values of the marginalized 1d variances are displayed in Table 1. The marginalized 2d triangle plots for the $k_{\rm max} = 0.12\ h/{\rm Mpc}$ and the theoretical error (TE) analyses are displayed in Fig. 2. The theoretical error envelope for the TE analysis was computed using the best-fit model at the fiducial $k^{\rm fid.}_{\rm max} = 0.10\ h/{\rm Mpc}$ (see Appendix A for more detail). We see that the TE covariance helps us measure all cosmological parameters without any significant bias. Moreover, there is a moderate improvement over the $k_{\rm max}$ analysis for all cosmological parameters of interest: the errors on $\omega_{\rm cdm}$, $h$, $n_s$ and $\sigma_8$ shrink by 20%, 30%, 20%, and 10%, respectively.

Now we turn to the z = 0.974 case, which is more relevant for upcoming surveys. The values of the marginalized 1d variances for the $k_{\rm max} = 0.28\ h/{\rm Mpc}$ and the TE analyses are displayed in Table 1. The envelope for the TE analysis was computed with $k^{\rm fid.}_{\rm max} = 0.26\ h/{\rm Mpc}$ (see Appendix A for more detail). We found that using the TE covariance matrix leads to unbiased estimates of all cosmological parameters. However, unlike the z = 0 case, it does not noticeably improve the error bars compared to the baseline $k_{\rm max}$ results. This happens mainly because the BAO wiggles are strongly suppressed for wavenumbers larger than $0.28\ h/{\rm Mpc}$ [36]; since the baseline $k_{\rm max}$ is already that large, going to higher wavenumbers with the TE does not yield as much new information as in the z = 0 case. Moreover, our TE constraints on $\sigma_8$ are somewhat weaker (by 13%) than those from the baseline $k_{\rm max}$ analysis, while the mean value is not shifted from the true cosmology. The marginalized 2d triangle plot shown in Fig. 3 clearly reveals the source of this degradation: the degeneracy between $\sigma_8$ and the counterterm $\gamma$ is broken less efficiently when the theoretical error is taken into account. Recall that measurements of the effective sound speed are very sensitive to the two-loop corrections [54]. If we ignore these corrections, the measurement of $\gamma$ will be biased and its error bars will be underestimated. Given the apparent $\gamma-\sigma_8$ degeneracy, this propagates into the posterior of $\sigma_8$, whose width is underestimated too. In contrast, including the theoretical error allows one to marginalize over the two-loop contributions and obtain a correct measurement of $\gamma$ and $\sigma_8$ with unbiased, albeit larger, error bars.
This shows that the TE covariance is imperative in order to get accurate estimates of the parameter variances.

Dark matter in redshift space
In this section we study the clustering of dark matter in redshift space. In a real survey one always observes dark matter tracers (like galaxies) in redshift space, where it is hard to disentangle the nonlinear effects of the redshift-space mapping from the nonlinear galaxy bias. The case of redshift-space dark matter therefore shows directly how well perturbation theory can describe the nonlinearities induced specifically by the redshift-space mapping. We will see that the theoretical error plays a crucial role in this analysis too, and allows for a noticeable improvement of the cosmological constraints.
Normally the analysis of the redshift space power spectrum is limited to the first three non-trivial moments $\ell = 0, 2, 4$, because only these moments are non-vanishing in linear theory. We have found that the hexadecapole signal is dominated by a systematic leakage from the lower moments due to discreteness effects. For this reason, we perform our analysis only for the monopole and quadrupole moments. We fit the IR-resummed one-loop power spectrum

$P(k,\mu) = P_{\text{1-loop}}(k,\mu) + P_{\rm ctr}(k,\mu)$ , (5.1)

where $P_{\rm ctr}$ denotes the counterterms, which fix the UV-dependence of the loops and parameterize our ignorance about short-scale dynamics. This term also addresses the "fingers-of-God" effect [55]. In what follows we describe this phenomenon perturbatively along the lines of the effective field theory. The section has the following outline. First, we validate the effective field theory treatment of fingers-of-God and compare it with popular phenomenological prescriptions in Sec. 5.1. Then, we present the measurements of cosmological parameters from the mock redshift-space power spectrum for both the theoretical error approach and the sharp momentum cuts in Sec. 5.2.

Fingers-of-God modeling
The biggest challenge in the analytic description of redshift-space distortions is the fingers-of-God effect, induced by the virialized motions of dark matter particles (or galaxies) in halos. The virialized velocity field is fully non-linear and there is very little hope that it can be described from first principles. However, the effect of fingers-of-God on long-wavelength fluctuations can be captured within effective field theory by a finite set of operators. In particular, the lowest order corrections are given by [56,57]

$P_{\rm ctr}(k,\mu) = -2 c_0\, k^2 P_{\rm lin}(k) - 2 c_2\, f \mu^2 k^2 P_{\rm lin}(k) - 2 c_4\, f^2 \mu^4 k^2 P_{\rm lin}(k)$ , (5.2)

where $c_0$, $c_2$ and $c_4$ are called "counterterms" and $f \equiv d\ln D/d\ln a$ is the logarithmic growth rate. The values and time-dependence of the counterterms are not known a priori, and hence we treat them as free parameters and marginalize over their amplitudes. It is quite natural to keep three different free coefficients, since they fix the UV-dependence of different loop diagrams and capture different physical effects. Namely, the monopole counterterm includes a contribution similar to the higher-derivative bias $\nabla^2\delta$ (along with the dark matter effective sound speed operator), which is absent for the higher order multipole moments. In contrast, the quadrupole counterterm captures the dispersion of the short-scale velocity field. In what follows, we will justify that keeping an independent coefficient in every counterterm is necessary for a proper description of dark matter clustering on large scales.
The expansion (5.2) assumes that the dimensionful coefficients $c_i$ are small. However, the peculiar velocities can be rather large, which can violate the condition of applicability of perturbation theory (5.3) even on large scales. For instance, the BOSS galaxies are expected to have a velocity dispersion $\sim 6$ Mpc/$h$, which gives the estimate $c_2 \sim 50~[\text{Mpc}/h]^2$ [1]. This implies that the characteristic momentum scale of the higher-order short-scale velocity cumulants is significantly lower than the nonlinear scale that controls gravitational nonlinearities in real space, $k_{\rm NL} \sim 0.5~h\text{Mpc}^{-1}$ [26].
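A quick order-of-magnitude check of this estimate (our own arithmetic; $k_{\rm FoG}$ is our shorthand for the scale where the counterterm correction becomes $\mathcal{O}(1)$):

```latex
c_2 \;\sim\; \sigma_v^2 \;\sim\; (6\,\text{Mpc}/h)^2 \;=\; 36\,[\text{Mpc}/h]^2 \;=\; \mathcal{O}(50)\,[\text{Mpc}/h]^2\,,
\qquad
k_{\rm FoG} \;\sim\; c_2^{-1/2} \;\simeq\; 0.14\, h\,\text{Mpc}^{-1} \;\ll\; k_{\rm NL} \sim 0.5\, h\,\text{Mpc}^{-1}\,.
```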
Hence, the usual one-loop power spectrum model (5.1), (5.2) can become insufficient for an accurate description of the data even on large scales. We proceed by introducing an additional counterterm to capture the redshift-space nonlinearities at next-to-leading order [1], where $b_4$ denotes the next-to-leading counterterm and $b_1 = 1$ for dark matter. The redshift-space mapping effectively generates an expansion in powers of $\nabla_z^2\delta$. From this point of view, the extra counterterm can be viewed as the $\nabla_z^4\delta$ contribution in this expansion. We choose the next-to-leading order contribution (5.4) to be universal for all multipole moments, as dictated by the redshift-space mapping. We assume that the contributions from other physical effects (higher-derivative counterterms etc.) are sub-dominant, since their nonlinear scale is similar to $k_{\rm NL}$ for real-space dark matter.
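The next-to-leading counterterm (5.4) is usually written in the following form (our reconstruction of the standard convention, cf. Ref. [1]; the sign and normalization of $b_4$ differ between papers):

```latex
P_{\nabla_z^4\delta}(k,\mu) \;=\; b_4\, k^4 \mu^4 f^4 \left(b_1 + f\mu^2\right)^2 P_{\rm lin}(k)\,.
```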
The full counterterm contribution is given by $P_{\rm ctr} = P_{\nabla^2\delta} + P_{\nabla_z^4\delta}$. Doing the integrals with the Legendre polynomials, we arrive at Eq. (5.5), where $c_0$, $c_2$ and $b_4$ are free fitting parameters and $b_1$ is the linear bias, which we explicitly inserted in Eq. (5.5) for future reference, when we study redshift-space galaxies. For dark matter $b_1 = 1$. Note that, unlike for galaxies, the lowest order stochastic contribution to the dark matter redshift-space power spectrum starts with the $k^4$ term [57], and hence it is a higher-order correction that can be ignored. The effects of these corrections are taken into account by the theoretical error. The first goal of this section is to validate the inclusion of the next-to-leading order counterterm (5.4) in the one-loop theoretical model (5.1), (5.2). Along the way, we will test the accuracy of the fingers-of-God modeling with phenomenological fitting functions that are often used in the literature, see e.g. [58]. These popular models approximate the fingers-of-God effect by a simple one-parameter Gaussian or Lorentzian damping, e.g.
where $\Sigma$ can be interpreted as a short-scale velocity dispersion. We stress that phenomenological models like (5.6) are not derived from first principles and introduce uncontrollable errors in parameter inference. To show this, we analyze the redshift-space dark matter power spectrum data using the model (5.6) instead of the full EFT description. In the following analyses, for simplicity, we fix all cosmological parameters to their fiducial values except for $\sigma_8$, which is allowed to float freely. All our conclusions hold true even if we vary all relevant cosmological parameters. We consider three models: (a) the fitting function (5.6) applied to the one-loop SPT power spectrum, i.e. $P_{\rm NL}(k,\mu) = P_{\rm SPT}(k,\mu)$, (b) the vanilla one-loop EFT with free $c_0$, $c_2$, but $b_4 = 0$, (c) the one-loop EFT model extended to NNLO with free $b_4$. Note that we implement the full IR resummation in case (a). In this setup the model (a) is a non-linear resummation of the perturbative models (b) and (c), which implies a definite relationship between the counterterms. Let us first discuss the $z = 0$ snapshot. The resulting posterior distributions for the nuisance parameters and the clustering amplitude for the three models at $k_{\rm max} = 0.1~h\text{Mpc}^{-1}$ are shown in Fig. 4. The first relevant observation is that the approximate model (5.6) biases $\sigma_8$ by more than $5\sigma$. However, varying $c_0$ and $c_2$ independently, one reduces the bias down to the $2\sigma$ level. This illustrates that the phenomenological model with a single parameter $\Sigma$ fails to reproduce the data because it does not have enough freedom to describe the monopole and quadrupole simultaneously. Thus, keeping free coefficients in different counterterms is a crucial part of any reliable analysis. The residual bias in $\sigma_8$ is removed by adding the next-to-leading order counterterm (5.4), which justifies its presence in the data analysis.
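For reference, one-parameter damping factors of the type (5.6) can be sketched as follows; the exact normalization conventions vary in the literature, so the forms below are illustrative, not necessarily the paper's Eq. (5.6). Their small-$k$ expansion makes the "over-constrained" nature explicit: a single $\Sigma$ fixes the $\mu^2$ counterterm and forbids independent $\mu^0$ and $\mu^4$ pieces.

```python
import numpy as np

def fog_gaussian(k, mu, f, Sigma):
    # Gaussian fingers-of-God suppression: D = exp(-(k mu f Sigma)^2)
    return np.exp(-((k * mu * f * Sigma) ** 2))

def fog_lorentzian(k, mu, f, Sigma):
    # One common Lorentzian variant: D = 1 / (1 + (k mu f Sigma)^2 / 2)^2
    return 1.0 / (1.0 + (k * mu * f * Sigma) ** 2 / 2.0) ** 2

# Both expand as D = 1 - (k mu f Sigma)^2 + O(k^4): the leading correction
# is a pure mu^2 * k^2 counterterm whose amplitude is tied to Sigma alone.
```

This is why a single-$\Sigma$ model cannot fit the monopole and quadrupole simultaneously once the counterterm amplitudes need to differ.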
This counterterm, however, does not allow us to increase the $k_{\rm max}$ range, which may be a signal of perturbation theory breakdown. To demonstrate this, we perform the following exercise. We fix all cosmological parameters to their fiducial values and extract the best-fit parameters $c_0$, $c_2$ and $b_4$ from the data at $k_{\rm max} = 0.1~h\text{Mpc}^{-1}$, where the fit is unbiased. Then we compare the contributions from the leading and next-to-leading order counterterms of the quadrupole moment to the tree-level dark matter power spectrum evaluated for the best-fit parameters. We chose the quadrupole because this moment is most sensitive to the fingers-of-God. The results are shown in the left panel of Fig. 5. One can see that the NLO contribution surpasses the LO counterterm at $k \approx 0.13~h\text{Mpc}^{-1}$, and the whole tree-level expression at $k \approx 0.18~h\text{Mpc}^{-1}$. This can be naturally interpreted as a breakdown of the perturbative description of dark matter in redshift space at $k \approx 0.15~h\text{Mpc}^{-1}$ at $z = 0$. Now let us consider the $z = 0.974$ snapshot. The relevant posterior distributions are shown in Fig. 6. One can observe the same trends as before: the fitting function (5.6) (model (a)) yields a highly biased estimate of $\sigma_8$ already at $k_{\rm max} = 0.1~h/\text{Mpc}$. The naive EFT model (b) is accurate at $k_{\rm max} = 0.1~h/\text{Mpc}$, but fails to give a good fit beyond $k_{\rm max} = 0.14~h/\text{Mpc}$. The situation improves after the addition of the next-to-leading order counterterm $b_4$. We can clearly see that the presence of $b_4$ increases $k_{\rm max}$ and yields constraints better than those of model (b) at $k_{\rm max} = 0.1~h\text{Mpc}^{-1}$. This illustrates the crucial importance of the fourth-order short-scale velocity cumulant (5.4) for robust parameter inference. In what follows, we always include (5.4) in our baseline theoretical model (5.1), (5.2). Finally, we confirm the validity of perturbation theory by plotting the LO and NLO quadrupole counterterms, normalized to the linear theory prediction, in the right panel of Fig. 5.
We can see that the typical data cuts of our analysis, $k_{\rm max} = (0.1 - 0.15)~h\text{Mpc}^{-1}$, are indeed lower than the nonlinear scale $k_{\rm NL,RSD} \approx 0.25~h\text{Mpc}^{-1}$ at $z = 0.974$. Let us summarize the results of this section so far. First, we have shown that the use of over-constrained fitting formulas for the fingers-of-God can generate significant biases in the inferred cosmological parameters. Second, we have justified the inclusion of the next-to-leading order $k^4 P_{\rm lin}$ counterterm by showing that it allows us to extend the regime of applicability of the EFT description and improve the parameter constraints.
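The Legendre projection that defines the multipole moments used throughout this section can be sketched numerically. The snippet below is illustrative only: it uses a toy linear spectrum and the linear Kaiser model rather than the paper's pipeline, and cross-checks the quadrature against the known analytic Kaiser multipoles.

```python
import numpy as np

def multipole(P_kmu, k, ell, n_mu=64):
    """Project P(k, mu) onto the ell-th Legendre multipole:
    P_ell(k) = (2 ell + 1)/2 * int_{-1}^{1} dmu P(k, mu) L_ell(mu),
    using Gauss-Legendre quadrature (exact for polynomial mu-dependence)."""
    mu, w = np.polynomial.legendre.leggauss(n_mu)
    L = np.polynomial.legendre.Legendre.basis(ell)(mu)
    integrand = P_kmu(k[:, None], mu[None, :]) * L[None, :]   # shape (n_k, n_mu)
    return 0.5 * (2 * ell + 1) * np.sum(integrand * w[None, :], axis=1)

# Toy linear spectrum and linear Kaiser model (illustrative numbers only)
b1, f = 2.0, 0.8
P_lin = lambda k: 1e4 * k / (1.0 + (k / 0.02) ** 2)          # arbitrary shape
kaiser = lambda k, mu: (b1 + f * mu**2) ** 2 * P_lin(k)

k = np.linspace(0.01, 0.3, 30)
P0 = multipole(kaiser, k, 0)
P2 = multipole(kaiser, k, 2)

# Analytic Kaiser multipoles for the cross-check
P0_exact = (b1**2 + 2 * b1 * f / 3 + f**2 / 5) * P_lin(k)
P2_exact = (4 * b1 * f / 3 + 4 * f**2 / 7) * P_lin(k)
```

The same projection applied to the counterterm contributions yields the multipole expressions of the text, with the $\mu^2$ and $\mu^4$ pieces feeding both the monopole and the quadrupole.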

Cosmological parameters
We now present the complete cosmological analysis of the redshift-space dark matter power spectrum in the $k_{\rm max}$ and TE approaches.
In the previous section we have seen that at $z = 0$ the perturbative description breaks down already on quite large scales, $k \sim 0.1~h\text{Mpc}^{-1}$, and hence this case may not be very illustrative. We focus on the $z = 0.974$ snapshot in what follows. The marginalized 1d constraints on the cosmological parameters for different momentum cuts and for the theoretical error are shown in Fig. 7. The theory prediction for the TE analysis was computed at the fiducial $k^{\rm fid.}_{\rm max} = 0.12~h/\text{Mpc}$. We found that the momentum cutoff $k_{\rm max} = 0.14~h/\text{Mpc}$ reproduces the true parameters within the $1\sigma$ interval. However, the clustering amplitude becomes biased by more than $2\sigma$ starting from $k_{\rm max} = 0.16~h/\text{Mpc}$. For this reason, we choose $k_{\rm max} = 0.14~h/\text{Mpc}$ as our baseline data cut. The final constraints on the cosmological and nuisance parameters for this $k_{\rm max}$ and for the theoretical error analysis are displayed in Tab. 2 and in Fig. 8. As in the previous section, we note three key features of the theoretical error covariance. First, we can obtain the cosmological constraints with a single MCMC analysis. This can be contrasted with the standard approach, which requires running many analyses with different choices of $k_{\rm max}$. Second, we get unbiased estimates for the cosmological parameters and reliable error bars. Third, these error bars turn out to be smaller than those obtained in the $k_{\rm max}$ analyses; the constraints on $\omega_{\rm cdm}$, $h$, $n_s$ and $\sigma_8$ improve by 30%, 40%, 25%, 10%, respectively. This happens mainly due to the additional BAO wiggles included beyond $k_{\rm max} = 0.14~h\text{Mpc}^{-1}$. Indeed, we can see that the 2d probability distribution in the $\omega_{\rm cdm} - h$ plane, which reflects the BAO signal, is significantly narrower in the TE analysis.

Galaxies in real space
In this section we focus on galaxy clustering in real space. Although this analysis is academic in nature, it will allow us to assess the validity of the perturbative bias model and to clearly see the implications of the theoretical error for biased tracers. We will use the effective field theory model characterized by the following set of nuisance parameters (see [39] for our notation), which includes the local quadratic bias $b_2$, the quadratic tidal bias $b_{\mathcal{G}_2}$, the higher-derivative counterterm $R_*^2$ and the constant shot noise contribution $P_{\rm shot}$. Even though the pair-counting shot noise prediction $\bar n^{-1}$ was subtracted from the power spectrum data, we still need to keep a constant nuisance parameter in the fit, because the value of the shot noise is, in general, expected to deviate from $\bar n^{-1}$ due to exclusion effects [59] (or fiber collisions in realistic surveys [60]). These deviations can be as large as $\sim 30\%$ of $\bar n^{-1}$ for the BOSS-like host halos [61]. For this reason, we marginalize over $P_{\rm shot}$ within a Gaussian prior. Note that the value of the residual shot noise contribution can be negative. We do not impose any priors on the other biases and counterterms, except for the cubic tidal bias $b_{\Gamma_3}$, which we found to be very degenerate with $b_{\mathcal{G}_2}$. We marginalize over $b_{\Gamma_3}$ assuming a physical prior centered at the prediction of the coevolution model [62,63], with unit Gaussian variance. Finally, we have checked that the data do not show any evidence for the scale-dependent stochastic contribution $a_0 k^2$ for $k \lesssim 0.3~h\text{Mpc}^{-1}$, and hence we did not use it in our model. This is consistent with the results of the N-body simulations of Ref. [61]. We analyzed the galaxy-galaxy power spectrum and the galaxy-matter cross-spectrum at $z = 0.342$ using the $k_{\rm max}$ approach and found that the posterior distributions are unbiased for all $k_{\rm max}$ up to $k_{\rm NL} \approx 0.3~h\text{Mpc}^{-1}$.
We do not push to higher $k$'s because the perturbative expansion is clearly not valid there. The resulting posterior distributions for the nuisance and cosmological parameters are shown in Fig. 9 (blue contours); the 1d marginalized parameter limits are presented in Table 3.
All in all, we do not see any evidence for the theoretical error up to very high $k_{\rm max}$ for the power spectrum of galaxies in real space. We believe that this is caused by the following two reasons. First, the statistical covariance for galaxies contains a large shot noise contribution, which slows down the reduction of the power spectrum errors on short scales compared to the dark matter case studied before. Thus, unlike in the dark matter case, the theoretical error for galaxies is always smaller than the statistical one on the scales of interest. Note that the situation can be different for other types of galaxies with lower shot noise, e.g. the DESI bright galaxy sample [9]. The second reason is the large number of fitting parameters: the real-space galaxy power spectrum data alone cannot efficiently break the degeneracies between these parameters. An example is the strong degeneracy $b_1 - \sigma_8$, which also generates the highly non-Gaussian posterior distribution of $\sigma_8$ seen in Fig. 9. Indeed, these two effects can be clearly assessed by analyzing the galaxy-matter cross-spectrum $P_{gm}$. The shot noise contribution to the covariance matrix is reduced for $P_{gm}$, see Eq. (3.3), and there is no constant shot noise contribution $P_{\rm shot}$ in the fit, which narrows the posterior distribution compared to the $P_{gg}$ case at the same $k_{\rm max}$.
Footnote 4: Since the degeneracy between $R_*^2$ and $c_s^2$ (in the notation of [39]) cannot be broken at the power spectrum level, we fix the dark matter counterterm $c_s^2 = 1~[h^{-1}\text{Mpc}]^2$ to avoid clutter.
Table 3. The marginalized 1d intervals for the cosmological parameters estimated from the LasDamas real-space galaxy power spectrum and galaxy-matter cross-spectrum at $z = 0.342$. Shown are the fitted parameters (first column), the fiducial values used in the simulations (second column), the results for $P_{gg}$ at $k_{\rm max} = 0.3~h\text{Mpc}^{-1}$ (third column), for $P_{gm}$ at $k_{\rm max} = 0.2~h\text{Mpc}^{-1}$ (fourth column), and for $P_{gm}$ with the theoretical error approach (fifth column).
The 1d marginalized limits on the cosmological parameters as a function of $k_{\rm max}$ are shown in Fig. 10. The theory prediction for the TE analysis was computed at the fiducial $k^{\rm fid.}_{\rm max} = 0.18~h/\text{Mpc}$. We see that the 1d posterior distributions for $P_{gm}$ are unbiased up to $k_{\rm max} \approx 0.22~h\text{Mpc}^{-1}$. However, a closer inspection of the MCMC results revealed a bias in the 2d posterior contours. This suggests that using the marginalized 1d distributions to select $k_{\rm max}$ can be misleading: one has to ensure not only that the 1d marginalized constraints are unbiased, but also that the 2d contours enclose the fiducial cosmology, e.g. within 68% CL. Indeed, for some data cuts the principal components of certain parameters can be biased, but their projections onto particular parameter planes can accidentally fall close to the fiducial values. Using such data cuts can lead to biases when different probes are combined, as in this case the resulting posterior distributions are driven by the degeneracy breaking between different principal components (PCs) of the combined data sets. This motivated us to choose $k_{\rm max} = 0.2~h\text{Mpc}^{-1}$ as the baseline data cut for $P_{gm}$.
Since the baseline $k_{\rm max}$ is quite low, we expect to gain some information with the theoretical error. The posterior distributions for the $k_{\rm max}$ and TE cases are shown in Fig. 9, whilst the parameter limits are presented in Table 3. Just like in the dark matter case, the TE covariance sharpens the constraints on the parameters related to the power spectrum shape and the BAO: $\omega_{\rm cdm}$, $h$, and $n_s$ improve by 20%, 20% and 25%, respectively. In contrast, the constraint on $\sigma_8$ is not significantly affected. Note also that the mean $\sigma_8$ shifts toward the true fiducial value. The constraint on $\sigma_8$ does not improve due to the notorious $b_1 - \sigma_8$ degeneracy. Indeed, the linear-theory degeneracy $b_1 - \sigma_8$ can be broken only by the one-loop corrections. This makes the result very sensitive to the largest wavenumber bins used in the analysis, because for these bins the amplitude of the loop corrections is large. However, when the one-loop correction becomes large, the two-loop contribution becomes non-negligible as well. The theoretical error covariance is introduced precisely to alleviate this problem and marginalize over all possible two-loop shapes. It does not sharpen the constraints on $\sigma_8$, but yields some improvement for other parameters related to the shape and the BAO.
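Schematically (in our notation), the degeneracy and its breaking can be seen from the $\sigma_8$ scaling of the loop expansion:

```latex
P_{gg}^{\rm lin}(k) \;\propto\; b_1^2\,\sigma_8^2\,, \qquad
P_{gg}^{\text{1-loop}}(k) \;\propto\; (\text{bias combinations})\times\sigma_8^4\,,
```

so linear theory constrains only the product $b_1\sigma_8$, while the ratio of the one-loop to the linear contribution scales as $\sigma_8^2$ and is what breaks the degeneracy; this ratio is largest, and hence most informative, in the highest wavenumber bins.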
One possible way to shrink the posterior distribution in the galaxy power- and cross-spectrum cases is to fix some bias parameters to the predictions of phenomenological models, e.g. the coevolution model for dark matter halos [62]. However, this can result in over-fitting and an incorrect estimation of the parameter error bars. We believe that a more appropriate approach is to vary all relevant nuisance parameters in the fit within physical priors, as we do in this paper. In a realistic survey one always observes galaxies in redshift space, which allows for the breaking of the $b_1 - \sigma_8$ degeneracy through redshift-space distortions. Let us now consider this case.

Galaxies in redshift space
In this section we scrutinize galaxy clustering in redshift space in the context of the theoretical error covariance. We will analyze the monopole and quadrupole moments of the redshift-space galaxy power spectrum at $z = 0.342$. The redshift of this sample is somewhat lower than the effective redshifts of future surveys, hence this represents a stringent test of our approach.
The EFT model with NLO fingers-of-God corrections is characterized by the following set of nuisance parameters (see Ref. [10] for our conventions). We use the same priors on $b_{\Gamma_3}$ and $P_{\rm shot}$ as in the previous section, and infinitely large flat priors for the other parameters. Note that on general grounds one can expect the presence of the additional stochastic contribution $P_{\rm stoch}(k,\mu) = a_2 \mu^2 k^2$. However, we have found that this contribution is completely degenerate with $b_4$ for the $P_0 + P_2$ datavector, which motivated us to fix $a_2 = 0$, as in Refs. [29,39]. As a first step, we run a sequence of analyses varying the data cut $k_{\rm max}$. The 1d marginalized constraints on the cosmological parameters as functions of $k_{\rm max}$ are presented in Fig. 11. The rightmost points correspond to the theoretical error analysis with the cutoff $k_{\rm max} = 0.32~h/\text{Mpc}$. One may observe that the standard analyses yield a biased estimate of $\sigma_8$ for $k_{\rm max} > 0.20~h/\text{Mpc}$. Shortly after, $\omega_{\rm cdm}$ and $h$ start deviating from the true values at $k_{\rm max} \approx 0.23~h/\text{Mpc}$. This motivates us to choose $k_{\rm max} = 0.18~h/\text{Mpc}$ for our baseline analysis with the sharp momentum cutoff, as in this case all marginalized posteriors contain the true values well within $1\sigma$.
Figure 11. Same as Fig. 10, but for the redshift-space galaxy multipoles (galaxy-galaxy, redshift space).
The values of the marginalized 1d constraints for the baseline $k_{\rm max}$ and the TE analyses are shown in Tab. 4. The 2d posteriors are presented in Fig. 12. Overall, we see that for the realistic example of redshift-space galaxies the theoretical error approach improves the parameter constraints over the $k_{\rm max}$ case quite modestly ($\lesssim 10\%$), unlike in the previous case of the galaxy-matter cross-spectrum. This happens because the BAO wiggles at $k > 0.18~h\text{Mpc}^{-1}$, neglected in the $k_{\rm max}$ analysis, are strongly suppressed due to redshift-space displacements [38] and the large shot noise covariance (cf. Eq. (3.3) and Eq. (3.4)). There is, however, some non-negligible improvement for the nuisance parameters, whose values are quite sensitive to the large-$k$ modes. As far as the cosmological parameters are concerned, qualitatively, we can conclude that the TE covariance, if properly chosen, automatically optimizes the choice of $k_{\rm max}$ and guarantees that the parameter limits are unbiased, but does not noticeably improve them. This implies that the application of the theoretical error approach will most likely not sharpen the cosmological constraints from current surveys like BOSS, where $k_{\rm max}$ has already been measured for various analysis settings [1,28]. However, we expect it to be useful for future deep surveys like DESI or Euclid.

Conclusions
We have validated the theoretical error approach on large-volume N-body simulation data for the clustering of dark matter and galaxies in real and redshift space. First, we introduced a new mock-based approach to extract the theoretical error covariance from the data, which allowed us to avoid uncertainties in the theoretical estimates of higher-order nonlinearities. Then, we demonstrated that the use of the TE covariance in the power spectrum likelihoods for these tracers allows us to recover the input cosmological parameters used in the simulations. Crucially, using the TE approach we can avoid lengthy measurements of $k_{\rm max}$ and perform the full parameter inference with a single MCMC run. We have also performed a detailed comparison of the TE results with the $k_{\rm max}$ analyses. For dark matter in real and redshift space, the cosmological constraints actually improve with the theoretical error covariance. This happens because it allows us to extract the information encoded in the BAO wiggles at wavenumbers larger than $k_{\rm max}$, while marginalizing over the broadband uncertainties.
For galaxies, the presence of the large shot noise term in the covariance and the proliferation of nuisance parameters inflate the error bars and force us to use more aggressive data cuts, where the BAO information is saturated. For this reason, the TE approach does not significantly improve the parameter limits, unlike in the dark matter case. For the galaxy auto-spectrum in real space, we have not found any evidence for the theoretical error up to the nonlinear scale, and hence the TE approach is fully equivalent to the standard $k_{\rm max}$ analysis. The situation is somewhat different for the galaxy-matter cross-spectrum, which has a smaller statistical covariance. In this case the linear-theory mass fluctuation amplitude $\sigma_8$ is extracted from the high-$k$ part of the spectrum, and hence the theoretical error is important to keep under control the sensitivity of the results to the higher-order nonlinearities omitted in our theory model. Moreover, the limits on the power spectrum shape and BAO parameters also shrink as a result of taking the TE into account. Finally, for galaxies in redshift space, the theoretical error is also non-vanishing and its inclusion in the covariance matrix effectively optimizes the choice of $k_{\rm max}$ at which the inferred parameters are unbiased. The cosmological parameter constraints improve quite marginally in this case, unlike the nuisance parameters, whose posterior volume shrinks appreciably.
As a by-product, we have validated the effective field theory implementation of the CLASS-PT code [39] by showing that it can accurately reproduce the true fiducial parameters from the various power spectrum datavectors used in this paper. In passing, we have shown that the use of over-constrained non-linear fingers-of-God models leads to biases in parameter inference. Besides, we have advocated the inclusion of the higher-order counterterm in the description of nonlinear redshift-space distortions. Indeed, it allows us to extend the range of scales where the modeling is accurate and eventually to improve the cosmological constraints as compared to the standard one-loop effective field theory model without the higher-order corrections. This justifies the analyses of Refs. [1,5,29] that included the higher-derivative fingers-of-God correction.
It is worth pointing out that in this paper we have studied mock galaxies that simulate the BOSS luminous red galaxy (LRG) sample [47]. This sample exhibits large fingers-of-God features, which do not allow us to extend the perturbative treatment to sufficiently short scales. The situation will be different for emission line galaxies (ELG), which are the main targets of upcoming surveys like DESI [9] or Euclid [8]. The clustering properties of ELGs have recently been measured for the first time by the eBOSS survey [64]. Crucially, these measurements already show that the ELG sample is less affected by the fingers-of-God, and hence one can expect some improvement in the cosmological constraints over the LRG-based analysis due to a larger effective $k_{\rm max}$.
Our analysis can be extended in various ways. First, the theoretical error approach can be applied to the bispectrum data [26]. Second, it would be curious to check to what extent it can be useful for the ELG sample. Third, our formalism can be extended to the case of the projected statistics like weak lensing, where it could play an important role to minimize systematic biases due to imperfect theoretical modeling, see e.g. [65]. We leave these research directions for future work.

Acknowledgments
We thank Roman Scoccimarro for sharing with us the power spectra from the LasDamas N-body simulation. We are grateful to Marcel Schmittfull.
Figure 13. Theoretical error envelopes for dark matter in real space, normalized to the tree-level spectra. We show the envelope suggested by Ref. [26] (envelope 1) and the envelope $\bar P^{\rm (TE)}$ measured directly from the data.

A.1 Comparison to perturbation theory
The key ingredients of the theoretical error approach are the envelope for the theoretical error covariance $E(k)$ and the theoretical error mean $\bar P^{\rm (TE)}(k)$. By definition, the theoretical error is the difference between the full model describing the data and an analytic approximation to that model. This suggests that a reasonable choice for the mean is the residual between some simulation result (or the data) and the approximate model fitted up to some sufficiently low $k^{\rm fid.}_{\rm max}$. The theoretical error covariance is then used for $k > k^{\rm fid.}_{\rm max}$. As a second step, we must define the relationship between $E$ and $\bar P^{\rm (TE)}$. The toy model of the theoretical error entirely given by a higher-order counterterm suggests that the mean and the envelope should have the same order of magnitude. We impose an even stronger relation, such that the theoretical error likelihood is characterized by a single shape. The original work [26] assumed zero mean and the envelope of Eq. (A.2), where $D(z)$ is the linear growth factor. This envelope is a smooth fit to the two-loop dark matter power spectrum computed in standard perturbation theory (SPT) [66], after subtracting the leading UV part. It should be stressed that this two-loop SPT correction still has a strong residual unphysical sensitivity to ultraviolet modes at the sub-leading order [67][68][69], and hence the actual effective field theory result is smaller than the SPT prediction. Nevertheless, we expect Eq. (A.2) to be order-of-magnitude accurate. In Fig. 13 we compare the prediction of Eq. (A.2) to the envelope that we extracted from the data. We used $k^{\rm fid.}_{\rm max} = 0.1~h\text{Mpc}^{-1}$ for $z = 0$ and $k^{\rm fid.}_{\rm max} = 0.26~h\text{Mpc}^{-1}$ for $z = 0.974$. Indeed, as expected, we can see that Eq. (A.2) and our envelope (A.1) agree by order of magnitude, but Eq. (A.2) systematically overestimates the difference between the data and the prediction of one-loop perturbation theory by a factor of $\sim 3$.
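The construction above can be summarized in a short sketch. Following the general logic of Ref. [26], we assume a Gaussian-correlated theoretical error with correlation length $\Delta k$; the envelope shape, the correlation length and all numbers below are illustrative assumptions, not the values used in the paper.

```python
import numpy as np

def te_covariance(k, envelope, dk_corr=0.1):
    """Correlated theoretical-error covariance:
    C^TE_ij = E(k_i) E(k_j) exp(-(k_i - k_j)^2 / (2 dk_corr^2)).
    The Gaussian kernel correlates neighboring bins so that the TE behaves
    like a smooth, marginalized-over residual rather than extra noise."""
    E = envelope(k)
    dk = k[:, None] - k[None, :]
    return np.outer(E, E) * np.exp(-dk**2 / (2 * dk_corr**2))

def total_covariance(k, C_stat, envelope, dk_corr=0.1):
    # Total covariance entering the Gaussian likelihood:
    # statistical part plus the theoretical-error part.
    return C_stat + te_covariance(k, envelope, dk_corr)
```

Since the Gaussian kernel is positive semi-definite and the envelope enters as a Schur product with a rank-one matrix, the TE term never spoils the positive-definiteness of the statistical covariance.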
It is instructive to compare the two different theoretical error prescriptions at the level of the cosmological constraints. To that end, we have run the cosmological analysis of the real-space dark matter power spectrum at $z = 0$ with $\bar P^{\rm (TE)} = 0$ and the envelope of Eq. (A.2). The resulting 2d posteriors are shown in Fig. 14, whereas the 1d marginalized constraints are listed in Tab. 5. One can see that only the measurements of $\sigma_8$ and the effective sound speed are significantly affected. This shows that the use of perturbation theory-inspired templates can overestimate the actual error bars on the amplitude parameters. However, the shape and distance parameters $\omega_{\rm cdm}$, $n_s$, $h$ are expected to be less affected by this choice. A similar situation was found in Ref. [5], which showed that the BAO measurements using the theoretical error covariance do not depend on the exact shape of the theoretical error envelope. The situation becomes more complicated for the redshift-space multipoles, for which the complete two-loop calculation has not yet been done. Thus, even the perturbation theory estimates can be very uncertain. One can consider two possible estimates. 'Envelope 2' is based on the fact that there is a disconnected two-loop diagram which involves a product of two one-loop diagrams, but without an extra propagator $P_{\rm tree}$. Comparing these two estimates with our baseline choice (at $k^{\rm fid.}_{\rm max} = 0.12~h\text{Mpc}^{-1}$) in Fig. 15, one sees that 'envelope 2' describes the actual difference between the one-loop PT model and the data surprisingly well. On the contrary, 'envelope 1' underestimates the theoretical error quite significantly, by more than one order of magnitude.
Finally, Fig. 16 shows the envelopes for the redshift-space galaxies. Our envelope is computed using $k^{\rm fid.}_{\rm max} = 0.16~h\text{Mpc}^{-1}$. We can see that 'envelope 1' agrees well with our theoretical error for the monopole, while 'envelope 2' largely overestimates it on short scales. In contrast, 'envelope 1' significantly underestimates the theoretical error for the quadrupole, whereas 'envelope 2' still overestimates it; our envelope lies in between these two perturbation theory estimates. All in all, we conclude that the redshift-space theoretical error envelopes that we use in this paper roughly agree with the perturbation theory estimates, but the latter are quite uncertain.
Figure 16. Same as Fig. 15 but for galaxies in redshift space.
A.2 Choice of $k^{\rm fid.}_{\rm max}$
In this section we motivate our choice of $k^{\rm fid.}_{\rm max}$ in the theoretical error analyses. To that end, we fix the cosmological parameters to their fiducial values and find the best-fitting values of the nuisance parameters while varying $k_{\rm max}$. Then, we extract the best-fit reduced statistic $\chi^2/N_{\rm dof}$ ($N_{\rm dof} = N_{\rm bins} - N_{\rm params}$) for each $k_{\rm max}$. The results for real-space dark matter are shown in the upper left panel of Fig. 17. Note that the typical values of $\chi^2/N_{\rm dof}$ are around 0.2, which is a result of using the reduced volume $V = 100~(\text{Gpc}/h)^3$ in the covariance instead of the true cumulative volume of the LasDamas Oriana simulation, $V = 553~(\text{Gpc}/h)^3$. We see that the $\chi^2$ profile blows up at $k_{\rm max} = 0.14~h\text{Mpc}^{-1}$ for $z = 0$ and at $k_{\rm max} = 0.30~h\text{Mpc}^{-1}$ at $z = 0.974$. This is an indication that the one-loop perturbation theory model becomes invalid on these scales. Thus, we choose smaller fiducial cuts $k^{\rm fid.}_{\rm max}$, where the $\chi^2$ profile is still flat: $k^{\rm fid.}_{\rm max} = 0.10~h\text{Mpc}^{-1}$ for $z = 0$ and $k^{\rm fid.}_{\rm max} = 0.26~h\text{Mpc}^{-1}$ at $z = 0.974$. Operationally, it is suggestive to choose $k^{\rm fid.}_{\rm max}$ somewhat below the scale at which the $\chi^2$ profile starts to grow. The fiducial $k_{\rm max}$ for the other cases are chosen in a completely similar fashion. The results for dark matter in redshift space are shown in the upper right panel of Fig. 17. We found that the reduced $\chi^2/N_{\rm dof}$ statistic remains flat up to $k_{\rm max} = 0.16~h\text{Mpc}^{-1}$. However, in order to be conservative, we choose $k^{\rm fid.}_{\rm max} = 0.12~h\text{Mpc}^{-1}$ in our analysis.
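A possible operational version of this flatness criterion can be sketched as follows (our sketch; the paper selects $k^{\rm fid.}_{\rm max}$ by inspecting the profiles in Fig. 17, not by this exact rule, and the tolerance is an assumption):

```python
import numpy as np

def choose_k_fid(kmax_grid, chi2_red, tol=0.2):
    """Pick the fiducial cut: the largest k_max for which the reduced chi^2
    profile is still flat, i.e. within a relative tolerance `tol` of its
    low-k plateau (estimated from the first few points)."""
    plateau = np.median(chi2_red[: max(3, len(chi2_red) // 3)])
    flat = chi2_red <= plateau * (1.0 + tol)
    # index of the first point that leaves the plateau (or the last point
    # if the whole profile is flat); step one bin back to stay conservative
    idx = np.argmin(flat) if not flat.all() else len(flat) - 1
    return kmax_grid[max(idx - 1, 0)]
```

For a profile that is flat at low $k_{\rm max}$ and then blows up, this returns the last bin safely inside the plateau, mimicking the by-eye choice described in the text.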
The results for the real-space galaxies are shown in the lower left panel of Fig. 17. In this case the picture is not so obvious due to the large shot noise contribution in the covariance, which can be larger than the theoretical error covariance. For this reason, one-loop perturbation theory provides an accurate description of the galaxy power spectrum up to the nonlinear scale $k_{\rm max} \approx 0.3~h\text{Mpc}^{-1}$. Since we do not see any sign of a bias up to $k_{\rm max} = 0.3~h\text{Mpc}^{-1}$, we conclude that the theoretical error is negligibly small compared to the statistical covariance dominated by shot noise. The situation is somewhat different for the galaxy-matter cross-spectrum, for which the $\chi^2$ profile shows some scale dependence for $k_{\rm max} > 0.22~h\text{Mpc}^{-1}$. In this case, we choose $k^{\rm fid.}_{\rm max} = 0.18~h\text{Mpc}^{-1}$ in our theoretical error analysis of $P_{gm}$. Finally, we display the best-fit $\chi^2/N_{\rm dof}$ profile for galaxies in redshift space in the lower right panel of Fig. 17. One can see that the $\chi^2$ profile blows up at $k_{\rm max} > 0.2~h\text{Mpc}^{-1}$. For this reason, we choose $k^{\rm fid.}_{\rm max} = 0.16~h\text{Mpc}^{-1}$. It should be stressed that the $\chi^2(k_{\rm max})$ profile is inappropriate for defining the baseline cut of the $k_{\rm max}$ analysis, because the fit can be biased at $k_{\rm max}$ lower than $k_{\rm 2-loop}$ when the cosmological parameters are varied. Indeed, we see that $k_{\rm 2-loop}$ is typically larger than the baseline cuts $k_{\rm max}$ of our main analyses. Hence, using the $\chi^2$ profile is appropriate only for fixing the fiducial scale of the theoretical error, not the baseline $k_{\rm max}$.
Figure 17. Best-fit reduced $\chi^2/N_{\rm dof}$ as a function of $k_{\rm max}$ for the various likelihoods considered in this paper.