Non-linear perturbation theory extension of the Boltzmann code CLASS

We present a new open-source code that calculates one-loop power autoand cross-power spectra for matter fields and biased tracers in real and redshift space. These spectra incorporate all ingredients required for a direct application to data: non-linear bias, redshift-space distortions, infra-red resummation, counterterms, and the Alcock-Paczynski effect. Our code is based on the Boltzmann solver CLASS and inherits its advantage: user friendliness, ease of modification, high speed, and simple interface with other software. We present detailed descriptions of the theoretical model, the code structure, approximations, and accuracy tests. A typical end-to-end run for one cosmology takes ∼ 0.3 seconds, which is sufficient for Markov Chain Monte Carlo parameter extraction. As an example, we apply the code to data from the Baryon Oscillation Spectroscopic Survey (BOSS) and infer cosmological parameters from the shape of the galaxy power spectrum. Our code and custom-built BOSS likelihoods are available at https://github.com/Michalychforever/CLASS-PT and https://github.com/Michalychforever/lss_montepython 1chudy@ms2.inr.ac.ru 2mi1271@nyu.edu 3ohep2@cantab.ac.uk 4marko.simonovic@cern.ch ar X iv :2 00 4. 10 60 7v 2 [ as tr oph .C O ] 2 9 A ug 2 02 0


Introduction
Observations of temperature and polarization fluctuations in the cosmic microwave background (CMB) are one of the main pillars of the ΛCDM model (see [1] and references therein). The most important tools connecting CMB data and cosmological parameters are Boltzmann codes, which allow one to compute various observables in a given cosmological model. Building upon years of development starting with CMBFast [2], the two most popular and independently designed Boltzmann solvers that have emerged are CAMB [3] and CLASS [4]. Both are very efficient and accurate, allowing for fast and robust extraction of CMB likelihoods. These two codes and their various extensions (see Refs. [5,6] for some reviews) have been widely used in the cosmology community.
Another source of cosmological information that is becoming increasingly important is the large-scale structure (LSS) clustering of galaxies in the late universe. This clustering is measured in redshift surveys such as the Baryon Oscillation Spectroscopic Survey (BOSS) [7]. Next generation surveys like Euclid [8,9] and DESI [10] will map a significant volume of the universe across a wide range of redshifts. In order to prepare for these future surveys and eventually harvest cosmological information encoded in the LSS data as efficiently as possible, it is imperative to build simple and robust extensions of the standard Boltzmann codes that can reevaluate LSS likelihoods as one scans over different cosmologies. 1 With this work we present one such tool, a modified CLASS code-CLASS-PT-that embodies an end-to-end calculation of various power spectra using the state-of-the-art perturbation theory models that incorporate all ingredients required for a direct application to data.
We provide a Jupyter notebook 2 that generates the spectra for galaxies and matter in real and redshift space. Additionally, we share a Mathematica notebook 3 that reads the spectra from output tables produced by CLASS-PT if it is run using a .ini file. Additionally, we publicly release custom-built BOSS galaxy power spectrum likelihoods written for the Markov Chain Monte Carlo sampler Montepython [6,12], review the structure of the code in Section 3. In Section 4 we discuss in detail the technical implementation of the non-linear model and test various approximations. A busy reader who is only interested in the final results, can skip directly to Sections 5 and 6. Section 5 contains some examples and important caveats that must be kept in mind when using our code. As an illustration, in Section 6 we apply CLASS-PT to the cosmological analysis of the BOSS galaxy clustering data. We release our BOSS likelihoods along with the code. In Section 7 we draw conclusions. Two short appendices contain some useful additional information: explicit expressions for the FFTLog redshift-space master integrals (Appendix A) and a quick installation manual (Appendix B). Appendix C discusses the pirors for the nuisance parameters used in our BOSS analysis.

The Power Spectrum Model
In this Section we describe the theoretical model used in CLASS-PT. We start with a brief summary of theoretical developments that have led to a complete and consistent description of large-scale clustering. A reader familiar with these results may wish to skip this Section. We will give details of all relevant ingredients needed for the description of the nonlinear power spectrum: the clustering of matter and biased tracers in real space, IR-resummation, the effects of redshift space and Alcock-Paczynski (AP) distortions.

Brief Overview of Perturbation Theory
Since Yakov Zel'dovich proposed a first model for nonlinear gravitational clustering of cosmological fluctuations in 1970 [26], there have been numerous attempts to build a consistent theoretical description of large-scale structure in the mildlynonlinear regime. Historically, the most popular approach was SPT ( [27][28][29], for a review see [30]), where dark matter is treated as a pressureless perfect fluid and the nonlinear equations of motion are solved perturbatively in Eulerian space. The major problem of SPT is that higher order perturbative corrections to the power spectrum do not lead to significant improvements on mildly-nonlinear scales [31,32]. This apparent breakdown of perturbation theory led to attempts to partially resum the diagrammatic expansion in order to improve convergence properties [31,33]. However, such resummation schemes were insufficient, as can be seen from a simple example of the one-dimensional universe [34]. In that case, the whole standard perturbation theory expansion can be explicitly and exactly resummed, but it does not lead to any notable improvement compared to the linear theory prediction.
Efforts to resolve this problem have led to the development of the Effective Field Theory of Large-Scale Structure [35]. The key insight was that the ideal fluid approximation is inconsistent even on large scales, and that the true equations of motion are those of an imperfect fluid with various contributions to the effective stress-tensor.
Starting from the Boltzmann equation (which is the true description of the dynamics for dark matter particles) and focusing on the dynamics of the long-wavelength fluctuations (averaging over the short modes), one can show that the imperfect fluid terms naturally arise and can be organized in a perturbative derivative expansion. Whilst the form of these terms is dictated by symmetry, their amplitudes are unknown free parameters which have to be measured from the data. Since these free parameters-the counterterms-capture the effects of the poorly-known short-scale physics, including them in the power spectrum significantly improves the performance of the theory [36,37]. The realization that the LSS theory must include unknown free parameters has finally resolved the long-standing problem of the consistent description of matter clustering on mildly non-linear scales. Another major advantage of the EFT approach is that it provides reliable estimates of theoretical errors, allowing theoretical uncertainties to be included in the total error budget and guaranteeing unbiased inference of cosmological parameters [38].
Another problem of Eulerian perturbation theory concerns the long-wavelength displacement of dark matter particles, which can be very large in our universe. Whilst the effects of these bulk flows are locally unobservable due to the Equivalence Principle [39][40][41], they still affect features in the power spectrum, such as the BAO wiggles. It is well-known that treating them perturbatively leads to significant errors in the description of the BAO peak [42][43][44], even though their effect on the broadband part of the correlation function (or the power spectrum) remains under perturbative control. Since the dominant dynamical effect of the bulk flows is a simple translation produced by the linear theory displacements, there exists a relatively straightforward way to take them into account non-perturbatively [45][46][47][48][49][50][51][52]. In other words, large contributions from these displacements at different orders in perturbation theory can be rigorously resummed. For this reason, this procedure is referred to as infrared (IR) resummation. It allows one to take advantage of simplicity of the Eulerian description, while keeping the impact of large displacements exact and hence significantly improving predictions for the shape of the BAO wiggles.
To make connection to observations, two additional ingredients are necessary; the first being the nonlinear description of biased tracers. Following the first attempts to build such a description in terms of a local-in-density bias expansion, an important milestone was the realization that various tidal and higher derivative bias operators must also be included [53,54]. Furthermore, since the formation of biased tracers is nonlocal in time [55], the expansion has to include additional terms that cannot be expressed in terms of local operators involving only two derivatives of gravitational and velocity potentials [55][56][57]. The perturbative bias model, (at least up to third order, which is needed for the one-loop power spectrum), is now well established and tested against various numerical simulations (for a review see [58]). The second important ingredient is the treatment of redshift space distortions (RSD). The standard perturbation theory kernels in the presence of RSD have been known for a long time [30], however, a consistent calculation of the one-loop power spectrum in redshift space requires additional counterterms related to the velocity field [59,60]. Below, we discuss these contributions in detail.
While CLASS-PT is entirely based on Eulerian perturbation theory, it is worth emphasizing that similar progress has been made in Lagrangian Perturbation Theory (LPT) as well [61][62][63][64][65][66]. An advantage of the LPT is that IR resummation is automatically incorporated at all orders, but this comes at the cost of larger computational complexity. Nevertheless, when all relevant biases and counterterms are included, the two approaches are fully consistent [65][66][67]. If both theories are well-defined, this is, in a sense, equivalent, since the Lagrangian and Eulerian schemes are just two different ways of solving the exact same equations of motion.

Dark Matter Power Spectrum
To begin, let us consider the model for the matter power spectrum in real space. On very large scales (or early times) the dark matter fluctuations evolve linearly. Thus, to a very good approximation, their power spectrum is given by where D(z) is the linear growth factor and P lin (k) is the linear power spectrum at redshift zero. In the mildly-nonlinear regime one can calculate perturbative corrections to this simple result, the first of which is the so-called one-loop contribution. For dark matter in real space, this is the sum of two terms; where P 1-loop, SPT (z, k) is the SPT contribution [30] and P ctr (z, k) is the counterterm needed for the consistency of the one-loop result [35,36]. The explicit expression for the counterterm at this order in perturbation theory is given by where c 2 s (z) is an effective parameter (sometimes refereed to as the effective sound speed), whose amplitude and time dependence are not known a priori. Thus, c 2 s (z) must be treated as a nuisance parameter in data analysis. The SPT one-loop term can be written as a sum of two well-known pieces, each of which is given by a particular convolution integral, (2.5) Here, and throughout the rest of the paper, we use the notation q ≡ d 3 q (2π) 3 . The convolution kernels F 2 and F 3 are the usual perturbation theory kernels [30,48]. Equations (2.3) and (2.5) give a complete description of the one-loop power spectrum of dark matter in real space. This model has been exhaustively tested against N-body simulations and found to predict the nonlinear matter power spectrum at mildlynonlinear scales quite well, see e.g. Ref. [68].
Strictly speaking, our Eq. (2.4) is correct only in the Einstein-de Sitter (EdS) universe, where the momentum and time-dependences of the loop integrals factorize. However, even in a more general case the common practice is to retain the EdS perturbation theory kernels but replace the the growth factor in EdS 4 with the linear growth factor computed in the true cosmology. We will use this approximation throughout the paper and we also implement it in the CLASS-PT code. 5 We make this choice for two reasons. First, this approximation is quite accurate. Indeed, the residual difference with respect to the full calculation is so small that it is irrelevant even for future galaxy surveys [69][70][71]. Second, it allows the nonlinear corrections to be easily calculated for any time by simply rescaling the result at redshift zero. 6 In other words, one can rewrite Eq. (2.4) as where P 13 (z, k) and P 22 (z, k) are obtained from Eq. (2.5) by performing the loop integrals with the linear power spectrum evaluated at the redshift of interest, P lin (z, k).

Power Spectrum of Biased Tracers
To calculate the one-loop power spectrum of biased tracers, we have to include all possible operators up to third order in the bias expansion: Here we have defined the Galileon operator where Φ g is gravitational potential. The only cubic operator that gives a nontrivial contribution to the one-loop power spectrum can be written as where Φ v is velocity potential. 7 For the definition of G 3 and relations of our operators to other equivalent choices of basis, see [58]. The term denotes the stochastic contribution which is uncorrelated with the large-scale density field. In the simplest approximation, we may treat the power spectrum of as Poissonian, and thus constant. In practice, it is more complicated and has scale-dependent corrections. Finally, the last term in Eq. (2.7) is the higher derivative bias which we keep for consistency and completeness. In Using the particular bias expansion given above, the one-loop auto-power spectrum of the bias tracers takes the following form [54,55,58], where P (z, k) is the power spectrum of the stochastic component. This uses the following definitions [54]: Three important comments are in order here. First, we define I δ 2 δ 2 (z, k) by subtracting the low-k constant contribution, to ensure it has an O(k 2 ) behavior on large scales. The constant contribution is reabsorbed in the stochastic power spectrum since it is perfectly degenerate with the shot noise. Second, the dark matter counterterm is combined with the higher derivative bias since they are perfectly degenerate for the galaxy power spectrum. Third, the contributions from operators δ 3 , δG 2 , G 3 disappeared after renormalization. This is the reason why b 3 , b δG 2 , b G 3 are absent in Eq. (2.10).
Using the same bias model we can also calculate the galaxy-matter cross-spectrum which is of relevance, for instance, for lensing surveys. It has the following form [54]: (2.12) Note that the matter counterterm and the higher-derivative bias enter the crossspectrum and the the auto-spectrum in different combinations. In principle, This allows one to break the degeneracy between them using the galaxy-lensing observations.

Power Spectrum of Biased Tracers in Redshift Space
The radial positions of galaxies in a survey are assigned using their redshifts, which are contaminated by the peculiar velocity field. This gives rise to the so-called redshift-space distortions RSD, which allow one to probe the velocity field along the line-of-sight directionẑ. We will work within the flat-sky plane-parallel approximation, where the redshift-space mapping can be fully characterized by the cosine of the angle between the line-of-sightẑ and the wavevector of a given Fourier mode k, µ ≡ (ẑ · k)/k. In this setup, the expression for the one-loop redshift-space power spectrum reads (see Refs. [59,60]): + P ctr,RSD (z, k, µ) + P ,RSD (z, k, µ) , where the redshift-space kernels are given by where k = k 1 + k 2 + k 3 and G n are the velocity divergence kernels [30]. Note that Z 3 (k 1 , k 2 , k 3 ) contains only bias parameters that give nontrivial contributions to the redshift-space one-loop power spectrum and that it must be symmetrized over its momentum arguments when used in Eq. (2.13). Furthermore, we have omitted the time dependence of f ≡ d log D/d log a and biases for clarity. Let us discuss the structure of the last two terms in Eq. (2.13) in some detail. The leading counterterm contributions in redshift space can be seen as a simple generalization of the dark matter sound speed [59,72], wherec 0 (z),c 2 (z) andc 4 (z) are quantities that are generically expected to have similar value to the real-space dark matter sound speed in units of [Mpc/h] 2 . However, due the presence of fingers-of-God [73] these counterterms can be more significant for some tracers than naïvely expected. Since the fingers-of-God are induced by the higher-derivative terms in the non-linear RSD mapping, one may include an additional counterterm proportional to k 4 µ 4 P lin (z, k) as a proxy of the higher-order contributions, where we have inserted the linear Kaiser factor (b 1 (z) + f (z)µ 2 ) 2 [74] for convenience. Whilst, we leave the systematic derivation of all corrections of this order for future work, we stress that addition of this term can be important in order to fit the data or results from N-body simulations, if fingers-of God effects are large [13,16]. The full counterterm contribution is then given by P ctr,RSD (z, k, µ) = P ctr,RSD,∇ 2 δ (z, k, µ) + P ctr,RSD, ∇ 4 z δ (z, k, µ) , (2.17) and depends on four free functions of timec 0 (z),c 2 (z),c 4 (z) andc(z). Finally, the stochastic power spectrum in redshift space has the following structure at next-to-leading order in derivative expansion: where P shot describes a constant shot noise and the additional two terms are scaledependent shot noise contributions for the monopole and the quadrupole. Note that the amplitude of the shot noise and the two coefficients a 0 and a 2 are functions of time only, while the k and µ dependence of the stochastic power spectrum is very simple. It is worth mentioning that the pair-counting Poissonian contribution 1/n is often subtracted from the power spectrum estimator. It is however important to keep the residual constant P shot in the model in order to capture deviations from the Poissonian prediction, which are expected on general grounds. Whilst all the terms presented above should be kept in a data analysis for consistency, some contributions are quite degenerate at the level of galaxy power spectra. For instance, the P ctr,RSD, ∇ 4 z δ counterterm is very degenerate with the a 2 µ 2 k 2 stochastic contribution, given the slope of the linear power spectrum on mildly nonlinear scales. Therefore, as far as galaxy clustering is concerned and depending on the required precision, one can opt to keep only one of the two terms. For example, the recent re-analyses of BOSS data kept only the higher derivative counterterm [13], whilst the analysis of Ref. [75] includes only the a 2 µ 2 k 2 contribution. As can be seen from these two papers, the particular choice does not impact the inference of cosmological parameters. Moreover, the authors of Ref. [67], have shown that the a 0 contribution can be neglected on scales with k 0.3 h/Mpc. Given these reasons, we will neglect the a 0 and a 2 terms henceforth.
Thus far we have presented the perturbation theory model for the redshift-space power spectrum, keeping the full k and µ dependence. However, it is generally more convenient to summarize the full angular information in a few multipoles, using the relation P gg,RSD (z, k, µ) = even L (µ)P (z, k) , (2.19) where L (µ) are Legendre polynomials. The galaxy power spectrum multipoles are thus (2.20) In both this paper and our code we will focus on the monopole ( = 0), quadrupole ( = 2) and hexadecapole ( = 4), since they contain the bulk of cosmological information. (Recall that these are the only moments that appear at zeroth (linear) order of perturbation theory.) To compute these at next-to-leading order, we will take into account all terms induced by the one-loop corrections up to O(µ 8 ).
The final expression for the galaxy power spectrum multipoles follows from Eq. (2.13) and can be written analogously to Eq. (2.10); where P δδ , P θδ , P θθ are the auto-and cross-spectra of the density field δ and the velocity divergence field θ. The different contributions I ,n and F ,n are redshiftspace generalizations of the real-space bias loop integrals (2.11a). Note that the basis of counterterms has been changed to have a single free coefficient for each multipole moment, and the new contributions are defined as The mapping between the old and new coefficients is given by 8 (2.23)

IR Resummation
As previously discussed, IR resummation is imperative to properly describe the spread of the BAO peak. In this Section we present our implementation of this effect, using two closely related, but distinct, approaches in the real and redshift space cases. Since the large bulk flows affect only the BAO wiggles, the common starting point is to split the linear power spectrum into the smooth P nw and wiggly component P w ; The details of the algorithm used to perform this splitting is given in Section 4.
In real space we follow the approach presented in Refs. [49], which was developed in the context of time-sliced Perturbation Theory (TSPT) [48]. Following the wigglysmooth decomposition one computes the damping factor 9 where k osc is the wavenumber corresponding to the BAO wavelength BAO ∼ 110 h/Mpc, j n (x) are spherical Bessel functions of order n, and k S is the scale separating the long and short modes. We use the value k S = 0.2 h/Mpc as advocated in Ref. [49], even though any other choice in the physically relevant range (0.05 − 0.1) h/Mpc produces a very similar result. When we perform the one-loop calculation, the residual dependence of the final result on k S is comparable to the two-loop wiggly contribution and hence should be treated as a small theoretical error. Once the damping factor Σ 2 (z) is obtained, one computes the tree-level IR-resummed dark matter power spectrum as The various one-loop IR-resummed power spectra for matter (XY=mm), galaxy (XY=gg), and the matter-galaxy cross spectrum (XY=gm) can be obtained from the usual one-loop integrals evaluated using P mm, LO (z, k) as an input instead of the linear power spectrum. Schematically, we can write where the various spectra P tree, XY are given by (2.28) Note that the additional term k 2 Σ 2 (z)e −k 2 Σ 2 (z) P w (z, k) prevents double-counting of the bulk flow contributions that are contained in the one-loop expression.
Let us now focus on the redshift-space power spectrum of galaxies. IR resummation becomes more complicated in this case, since the tree-level IR resummed matter power spectrum picks up non-trivial angular dependence from the anisotropic damping factor [51], where we have introduced the new damping function, which depends on the logarithmic growth factor, f (z); This is a function of the real-space damping (2.25) and on a new contribution, Due to the anisotropy of the BAO damping, the one-loop calculation strictly requires computation of anisotropic loop integrals, which in contrast to the real space case, cannot be reduced to one-dimension. However, these can be simplified by splitting the one-loop contribution itself into a smooth and wiggly part. More precisely, one first computes the usual redshift-space one-loop integrals with a smooth part only. Second, one evaluates the same integrals with one insertion of the unsuppressed wiggly power spectrum and applies the direction-dependent damping factor (2.30) to the output, giving [46] P gg (z, k, µ) Here P ...1-loop [P lin ] are treated as functionals of the input linear power spectrum; For simplicity we have neglected the one-loop contributions obtained from two insertions of the wiggly power spectrum (since these scale as P 2 w ). Once the two contributions P gg,w and P gg,nw are summed, the eventual IR-resummed anisotropic power spectrum can be used to compute the multipoles in Eq. (2.20).
It is important to stress that our implementation of IR resummation at one loop order contains four potential sources of error: • Imperfectness of the wiggly-non-wiggly decomposition; • Dependence of the damping factor on the separation cutoff; • Inaccuracy of the factorization prescription; • One-loop corrections of O(P 2 w ) from two insertions of P w .
In Refs. [49,51] is was shown that these effects are smaller than the two-loop contribution. Furthermore, it can be shown that these errors can be consistently subtracted and shifted to the next order at any given order of perturbation theory. This will be additionally discussed in Section 4.

Alcock-Paczynski Effect
The observed galaxy distribution is a function of angles and redshifts. However, it is more convenient to switch to the geodesic distances between galaxies and consider the "de-projected" 3-dimensional power spectrum instead of the 2-dimensional angular power spectrum (at redshift z) [76]. In practice, this change of coordinates is realized by means of assuming some trial fiducial cosmology [77][78][79][80]. Importantly, if the trial cosmology differs from the correct one, the reconstructed 3D power spectrum appears distorted, which known as the Alcock-Paczynski effect [81]. These effects are routinely used to constrain cosmological parameters from galaxy surveys, see e.g.
[7]. One does not need, of course, to assume a wrong cosmology to generate Alcock-Paczynski distortions. If the fiducial cosmology is correct, there are no distortions in the data, but they are present in the theoretical templates that are fitted to these data. After all, the Alcock-Paczynski coordinate conversion is only a technical tool to extract the distance information that is encoded in the angle-and redshift-dependence of the galaxy distribution. Mathematically, it does not change the information content of the galaxy power spectrum.
To account for the Alcock-Paczynski effect one has to compute the observable galaxy power spectrum using the following formula: where k true and µ true are the values that one would obtain in the true cosmology (and those used to evaluate the theory model), whereas k obs and µ obs refer to quantities in the fiducial cosmology that was used to build galaxy catalogs. The relation between the true and observed wavenumbers and angles is given by (suppressing the explicit time dependences) These formulas realize the map (k true , µ true ) → (k obs , µ obs ), which is used in our code.
During the likelihood analysis one samples cosmological parameters in an attempt to find the true vales H true and D A,true given the fiducial H fid and D A,fid used to create catalogs. Including the AP effect, the final galaxy multipoles are given by Note that the AP effect and IR resummation lead to the leakage of some bias contributions to higher order multipoles. For instance, in the absence of these effects the term I δ 2 δ 2 only contributes to the monopole moment, whilst including the AP effect produces some non-trivial angle-dependence and generates contributions into higher multipole moments. CLASS-PT explicitly computes these contributions, though we drop them in the python wrapper classy for memory optimization reasons, since they are found to be highly negligible. The plots with these contributions can be found in the Mathematica notebook in the code web folder.

Tree-level IR-resummed Bispectrum
The tree-level IR-resummed bispectrum in real space can be easily obtained form our code as well. This is easily formed by taking the usual expression for the treelevel matter bispectrum and replacing P lin (z, k) with the leading order IR-resummed spectrum given in (2.26). Note that this replacement is the exact result in real space.
In redshift space, one should use the anisotropic expression (2.29) and consistently average over the angular variables that include the AP effect;a procedure that will be implemented in future versions of CLASS-PT.

Structure of the Code
Our code is executed as a module, nonlinear_pt.c, in the standard CLASS code v2.6.3., which was the latest CLASS version when the work on the code started. The new module is implemented as a clone of the nonlinear.c module that evaluates HALOFT. A work cycle of our modifications can be schematically represented by a sequence of the following three steps: 1. The function nonlinear_pt_pk_l() takes the linear transfer functions from the module perturbations.c and convolves them with the primordial power spectrum from primordial.c to get the linear matter power spectra at redshifts specified by the user; 2. For each required redshift, various non-linear power spectra are evaluated by the function nonlinear_pt_loop(), which uses FFTLog with pre-computed cosmology-independent matrices M (see Eq. (4.9)); 3. These spectra are passed to subsequent modules similarly to the non-linear spectra computed by HALOFIT in nonlinear.c .
4. Alternatively, there is an option to use external linear P (k) instead of the one computed directly by CLASS.
The most important ingredients that made our FFTLog calculation possible are the CLASS realization of FFT developed in CLASS-matter 10 (see Ref. [82]), and the fast matrix multiplication algorithms included in the open-source C library OpenBLAS 11 . We stress that OpenBLAS is the only external library used in our code. It is free and its installation is fast and straightforward. A detailed installation manual for CLASS-PT is given in Appendix B.
Crucially, our alterations do not alter the way CLASS works. The module is written in C and it is wrapped as a python library classy. Compared to the usual classy, just one function is modified, pk(k,z), and several new functions added. Examples of working sessions of our code are given in the Jupyter notebook available at the code webpage.
Several important flags regulate our non-linear module: • non-linear = PT: If set, this flag executes the non-linear module. The syntax here is analogous to the one used to execute the HALOFIT module, "non-linear = Halofit".
• Bias tracers = Yes: Decide whether the loop integrals for biased tracers are computed.
• RSD = Yes: Decide whether the redshift-space loop integrals are computed.
• AP = Yes: If both "RSD" and "IR resummation" are switched on, this activates calculation of the AP effect.
• Omfid=0.31: Pass the fiducial value of Ω m that was used to create the catalogs with the AP effect. One must never scan over this parameter in an MCMC analysis. The value of Ω m, fid must be always fixed to the one used in the survey data production.
• FFTLog mode=Normal: Depending on a particular situation, the user can either run the code with high precision settings (which is a default choice), or in the fast mode, which is slightly less accurate but much faster. This regime can be activated by the flag "FFTLog mode=FAST" If the code is run in the default regime, the flag "FFTLog mode" does not need to be specified.
• output format=Normal: This flag speficies the size of the wavenumber grid used to compute and store the power spectra. By default, our module uses the standard CLASS array of wavenumber. However, if one computes both the CMB power spectra C and the non-linear power spectra P (k) in one CLASS call, one is encouraged to use the flag "output format=FAST." In this case the non-linear power spectra are stored on a reduced wavenumber grid, which leads to a notable gain in speed.
• cb=Yes. By default, CLASS-PT uses the linear power spectrum of the cold dark matter and baryon ("cb") fluid as an input for the non-linear calculations. This is motivated by the evidence that galaxies trace the cb-fluid and not the total matter density that includes massive neutrinos. If the user is willing to compute the non-linear corrections to the total matter density field, they should use the flag "cb=No." • External Pk = file_pk.dat. This option should be activated if the user wants to perform nonlinear calculations with an external linear matter power spectrum (tabulated in e.g. 'file_pk.dat').
A concrete understanding of our module's architecture is not essential for its running. Moreover, it will likely change in the future to match the most recent official version of CLASS. We warn the users that some functions that exist in the current version of CLASS-PT are redundant and will be optimized in the future. The current stable version of the code is available at https://github.com/Michalychforever/CLASS-PT, where modifications will be commented upon.

Technical Implementation and Approximations
Numerical algorithms and approximations are essential elements of our code. In this Section, we describe some of these technical details including the FFTLog algorithm used to evaluate loop integrals, the wiggly-non-wiggly decomposition, as well as their application to the evaluation of the IR-resummed redshift-space galaxy power spectrum multipole moments.

Basics of the FFTLog Method
The one-loop perturbation theory integrals involve convolution kernels that reduce to simple multiplications in position space. This inspired methods that evaluate these by using the Fast Fourier Transform (FFT) to switch between Fourier and position space to evaluate these integrals [21,22]. Alternatively, FFTs (with uniform binning in log k) can be used instead as a tool to decompose the linear power spectrum into complex power laws [83]. Whilst the loop integrals are then not deconvolved, they have a simple analytical solution for the power-law universes [24]. We refer to this particular approach as the FFTLog method. Importantly, it can be extended to the one-loop bispectrum and the two-loop power spectrum [24], which cannot be written as not simple convolution integrals. 12 Keeping in mind these statistics as our eventual goal, we choose FFTLog for the evaluation of loop integrals in CLASS-PT. Another advantage of this algorithm is that it is very easy to implement, since it boils down to simple multiplications of the cosmology-independent matrices with the cosmology-dependent vectors, which can be easily obtained from FFTs of the linear matter power spectrum.
The discrete approximation to the linear power spectrum in a finite momentum interval [k min , k max ], denoted asP (z, k), can be written as where the Fourier coefficients c m and exponents η m are given by The parameter ν is sometimes refered to as "bias". In principle, ν can be an arbitrary real number, however, the convergence properties of convolution integrals on different scales vary depending on the value ν. Thus, the freedom to choose ν can be used to boost the efficiency of numerical evaluation. As mentioned above, the perturbation theory loop integrals over each power-law function k ν+iηm can be done analytically, which allows one to reduce the evaluation of the whole loop integral to a matrix multiplication. Crucially, the elements of this matrix are cosmology-independent and can be pre-computed and saved as a table. All the cosmology-dependence resides in the coefficients c m , whose evaluation takes very little time by virtue of the FFT algorithm.
In order to find analytical solutions for the loop integrals with power-law power spectra, the integration has to be performed over the whole momentum range, i.e. for q ∈ [0, ∞]. This implies that perturbation theory loop integrals are evaluated with the same integration boundaries. One may be worried about this in the context of perturbation theory, since we are integrating over the small scales where the perturbative description breaks down. However, as already emphasized, the purpose of counterterms in the EFT approach is precisely to absorb all small scale dependence of the loop integrals. In this way, it is guaranteed that the final results are independent of the exact short-distance behavior of the power spectrum. 13

FFTLog in Redshift Space
In real space all one-loop integrals can be expressed in terms of the single 'master integral' where ν 12 ≡ ν 1 + ν 2 and for Gamma function Γ. However, in redshift space the loop integrals become more complicated due to the anisotropy introduced by the line-of-sight directionẑ. One can find up to four loop momenta multiplyingẑ in the one-loop integrands. To evaluate these integrals, we generalize Eq. (4.3) as follows: where A n , B n and C n are some functions of ν 1 and ν 2 , and we have introduced the following operators: (4.6) 13 Alternatively, one may introduce a UV cutoff Λ by simply padding the power spectrum with zeros for all wavenumbers k ≥ Λ. In this case the EFT counterterms absorb the cutoff dependence of the loops and ensure that the final result for the one-loop power spectrum does not depend on Λ. Thus, even though we use Λ = ∞, this choice is irrelevant for the cosmological constraints and can only affect the amplitudes of the counterterms.
By contracting the left hand sides of the integrals (4.5) with different powers of q and k, one can reduce these integrals to the form (4.3). The resulting formulas are a set of simple algebraic equations that can be solved to find the functions A n , B n and C n . The explicit solutions can be found in Appendix A. Plugging these expressions into (4.5), it is straightforward to obtain the following redshift-space master integrals: With these formulas in hand one can compute the one-loop redshift-space integrals in the discrete FFTLog representation just like in the real-space case [24]. Crucially, the dependence on µ is given by simple polynomials, e.g. the one-loop matter power spectrum takes the following form; and each P (n) can be computed via FFTLog in full analogy with the real-space case where M 13 are the standard real-space matrices [24] and M (n) 22 , M 13 with n > 0, are their redshift-space generalizations. The explicit expressions for these matrices are quite cumbersome and thus not quoted here. They can be found in the main body of the code.
Since the µ-dependence of basic perturbation theory one-loop integrals is known explicitly, one can easily do the Legendre integrals analytically at the level of the FFTLog matrices. This allows one to obtain master matrices M 22, and M 13, . Using these matrices each multipole can be computed with only two matrix multiplications, just like in the real-space case. This is not the case when IR resummation and the Alcock-Paczynski effect are present. To account for them, we evaluate each integral entering (4.8) separately, combine them into the full P (k, µ) and then do the µintegrals numerically. This procedure will be discussed in more detail shortly.

Practical Realization
To carry out the FFTLog, we first create a grid with N FFTLog = 256 (default mode) and N FFTLog = 128 (fast mode) harmonics spanning the range [5 · 10 −5 , 100] h/Mpc , and use two different values of the FFTLog "bias" exponent ν for the matter and bias tracer loop integrals (see [24] for details) ν = −0.3 (matter) , ν = −1.6 (biased tracers) . (4.10) It is important to stress that the choice ν = −0.3 leads to poor convergence for the matter one-loop integrals at small scales, k > 1 h/Mpc. To alleviate this issue, we apply an exponential cutoff for these high k's, which is justified because the oneloop predictions are not valid on these scales at the redshifts relevant for current and future galaxy surveys. If necessary, one can always choose a different value of the bias for which the FFTLog calculation will be better convergent for large wavenumbers.

Accuracy Tests
Our code, default Let us now discuss the accuracy of our code, using the one-loop real-space calculations as an example. The purpose of this comparison is to show that our FFTLog routine has comparable precision to that of direct numerical integration.
The residuals between the FFTLog-based calculation and the direct numerical evaluation of the one-loop matter power spectrum are shown in Fig. 1. The singularity at k ≈ 0.1 h/Mpc reflects the fact that the one-loop spectrum crosses zero in this region. We show the results both for the default precision with N FFTLog = 256 and for the fast mode with N FFTLog = 128. The default choice of N FFTLog = 256 is seen to provide a relative accuracy ∼ 0.1% over the range of wavenumbers k 0.2 h/Mpc, as relevant for future galaxy surveys. Note that the relative accuracy quoted here does not depend on time; the absolute accuracy actually increases at high redshifts.
However, the 0.1% accuracy of the one-loop correction can be somewhat excessive in many cases. For this reason, for all practical applications the user is encouraged to run the code in the fast mode. Whilst this provides us with a somewhat lower accuracy ∼ 1%, there is a significant speed gain. On the one hand, this numerical error is still smaller than the two-loop contribution omitted in our model. On the other hand, the one-loop contribution itself must be a small correction to the linear power spectrum in order for perturbation theory to make sense. Thus, the O(1%) accuracy on the one-loop correction translates into the O(0.1%) accuracy on the total power spectrum. Therefore, the fast mode seems to be sufficient for the bulk of practical applications in which the one-loop power spectrum is used as a model. To explicitly verify this, we have rerun the MCMC analysis of the BOSS data from Ref.
[13] and the analysis of the large N-body simulation data from Ref. [16]. In both cases, fast and default modes yielded indistinguishable results.

Wiggly-non-Wiggly Splitting
The algorithm for wiggly-non-wiggly splitting implemented in the code is based on the discrete spectral analysis method proposed in Ref. [86]. The main idea is to Fourier transform the power spectrum to position space, localize the BAO peak, remove it, and smoothly interpolate the correlation function in the previous location of the peak. For computational efficiency this is done by means of a discrete Fourier transform. In practice, we do the following: 6. FST back the new coefficients to recover ln(kP nw (k)).
The resulting wiggly power spectrum P w ≡ P lin − P nw is shown in the right panel of Fig. 2. It is important to stress that we work in units of Mpc, such that the splitting is insensitive to h. The only cosmology-sensitive part of our procedure is the location of the BAO peak, which corresponds to the comoving sound horizon r d . However, r d is a very weak function of cosmology. For instance, in ΛCDM r d ∝ ω −0.25 m ω −0.12 b [13]. Given this reason, we use the same frequency cuts in the wiggly-non-wiggly procedure during MCMC scans over different cosmologies. Alternatively, we have tried an algorithm which rescales the frequency cuts "on-the-fly" according to the value of r d which is being sampled by the code. The difference between the two procedures is negligibly small and does not affect parameter inference even from the large-volume PT challenge simulation data [16].

Error Budget of IR Resummation
In Section 2.5 we listed various sources of error in our implementation of IR resummation. It was argued that these errors are under control, i.e. their contributions can be minimized to arbitrary small values. This is a theoretical statement, which may not hold in reality due to choices made in practical implementation. In this subsection, we will explicitly show that the residual error of the one-loop power spectrum including IR resummation is smaller than the two-loop contributions. Let us discuss each problematic ingredient separately.
Wiggly-non-Wiggly decomposition. The error introduced by our splitting procedure is always smaller than the two-loop corrections. One can argue that this is a generic statement that can be generalized to higher orders. Indeed, imagine that the BAO were described by an analytic harmonic function such that one could find an exact analytic expression for P nw, true . Imagine now that instead of using this analytic expression we perform a numerical wiggly-non-wiggly decomposition that introduces some intrinsic error ∆P w−nw P lin , P nw = P nw, true + ∆P w−nw , P w = P w, true − ∆P w−nw . (4.11) Now let us perform a leading order tree-level calculation, We see that at small wavenumbers, the wiggly-non-wiggly error cancels when we sum up the wiggly and smooth parts. The residual error term on the r.h.s. can be Taylor expanded and compared to the one-loop contribution at low k's, where σ 2 v ≈ 36D 2 (z) [Mpc/h] 2 is the variance of the linear displacement field, which controls the size of the one loop correction at low k. A similar calculation can be repeated at one-loop order. Given this observation, one can argue that as long as the wiggly-non-wiggly splitting error ∆P w−nw P lin is much smaller than the power spectrum itself, the residual error of a n-loop calculation will be smaller than the n + 1 loop correction. In Fig. 3 we show the residuals between the two one-loop spectra produced by changing the cuts of the Fourier harmonics. The difference generated by the wiggly-smooth procedure is clearly much smaller than the two-loop contribution and can be safely neglected.
Dependence on k S . This ambiguity is intrinsic to the IR resummation procedure. However, it was shown in Refs. [49,51] that the effect reduces at higher loop orders. Thus, the error due to the separation scale choice is always under rigorous perturbative control. To estimate it, we compute the residuals between the two spectra evaluated with k S = 0.1 h/Mpc and k S = 0.2 h/Mpc, and display the result in Fig. 3.
Factorization. Another source of error can be the approximate treatment of IR resummation in redshift space. Recall that, in principle, one ought to compute anisotropic 3-dimensional integrals, where the FFTLog algorithm cannot be directly applied. However, it is possible to approximately factorize the BAO damping by neglecting terms which are formally either higher order or exponentially small. For that, one should, essentially, repeat the same arguments as for the wiggly-non-wiggly decomposition, see Ref. [49,51] for more detail.
In practice, we have checked that these terms are indeed smaller than the twoloop corrections at redshifts relevant for future surveys. In order to estimate this error we computed the difference between the full formula for the one-loop matter power spectrum (2.27) and its "factorized" version (2.32). Crucially, the residual generated by the factorization is a smooth function without a pronounced BAO feature. The reason behind this is that the factorization mostly affects the P 22 -like integrals, in which the oscillating residuals are integrated over and hence washed out. The absence of features suggests that even if we neglected the theoretical error associated with two loops completely, this residual could be absorbed by the counterterms without biasing cosmological parameters in a real data analysis.
One may wonder what happens if we approximate the IR-resummation of oneloop redshift-space integrals with a direction-independent damping exponent (as it is the case in real space). Naïvely, this prescription would guarantee the absence of smooth residuals in the loop integrals. However, we have found that this approximation leads to non-negligible oscillation residuals in the density and velocity spectra, motivating the factorization prescription. These residuals are mostly produced by P 13 -like integrals, for which the factorization procedure is exact and fast, thus there is no need to use the isotropic damping template. The most time-consuming process is the IR resummation of the P 22 -like integrals, for which the difference between the direction-independent and full anisotropic templates was found to be quite small. Approximating the BAO damping of the P 22 integrands with the direction-independent template notably reduces the computational cost of IR resummation. This introduces 0.1% error on the full power spectra, which is smaller than the neglected two-loop contributions on mildly-nonlinear scales. Even though this approximation seems promising, it has not yet been fully included in the current version of the code. Currently, we have implemented this method only for the P 22 -like integrals of biased tracers; we plan to run more thorough tests before implementing it for the matter loop integrals as well.
We stress that we have implemented the full factorization formula with the anisotropic damping factor for the P 13 -like bias integrals produced by the operator F G 2 . This formula is exact for these types of integrals.
Corrections of order O(P 2 w ). Finally, we have checked that the terms ∼ P 2 w omitted in the IR resummation procedure of Ref. [49] are indeed negligible. In principle, these corrections can be taken into account at zeroth order at no additional cost, but their contribution is so small that they are irrelevant for all practical applications.
All the sources of error related to IR resummation are shown in Fig. 3. We see that the biggest error is introduced by the factorization procedure, but its contribution is quite smooth and its slope matches the shape of the two-loop contribution.

Evaluation of Redshift Space Multipoles
When neither IR resummation nor the AP effect are present, the code can be greatly expedited by performing the µ-integrals analytically. In this case we use the explicit FFTLog matrices directly for the power spectrum multipoles. This calculation is initiated if the flag "IR resummation = No" is passed to the code. Note the AP effect is not implemented in this case. If the flag "IR resummation = Yes" is passed instead, a different routine is performed; we separately compute all Fourier integrals that multiply different powers of µ 2 (see Eq. (4.8)) and use the following algorithm:

Compute each loop integral separately for wiggly and non-wiggly components;
First, we evaluate it for the non-wiggly input power spectra only. Second, we compute the one-loop integrals with one entry of P w and one entry of P nw ; 2. Suppress the wiggly one-loop spectra with the anisotropic damping factor; 3. Combine these terms and add the tree-level IR-resummed part. This allows us to arrive at the final expression for P gg (z, k, µ) given in Eq. Note that there is only one numerical routine that performs the Legendre integrals over µ both for IR-resummation and the AP distortions. If one needs to take into account the AP-distortions but not IR resummation, one can setthe BAO damping factor, Σ tot , to zero. Moreover, the AP effect can be computed only if all three flags "RSD = Yes", "IR resummation = Yes", and "AP = Yes" are passed to the code. If necessary, one could compute the multipoles induced by the AP effect in real space by setting the growth factor f = 0 before the RSD module is executed.

Neutrino Masses
Massive neutrinos require some special treatment. Strictly speaking, our method is not applicable if the growth rate is scale-dependent. In this case, one cannot use the usual EdS perturbation theory kernels, and the full calculation of time-and-scale dependent Green's functions is needed [87,88]. However, these references showed that for dark matter in real space, the difference between the full calculation and the EdS approximation is very small for realistic neutrino masses. This suggests that it is safe to use our EdS-based FFTLog calculations in this case.
To approximately incorporate massive neutrinos in the calculation of biased tracers, we use the linear power spectrum for the "cold dark matter+baryons" ("cb") fluid as an input in all loop calculations. This prescription has been advocated on the basis of N-body simulations in [89,90,[90][91][92][93]; besides, Refs. [94,95] claimed its importance for the neutrino mass measurement. The "cb" power spectrum is a default input of our non-linear module. If necessary, one can use the total matter density via the flag "cb=No." The situation is more complicated in redshift space. Just like in the biased tracer case, N-body simulations (e.g. [93]) suggest that one has to use the linear logarithmic growth factor f cb of the "cb" fluid. Then the halo power spectra of N-body simulations approach the Kaiser prediction [74] evaluated with the "cb" quantities. Crucially, for observationally allowed neutrino masses, the scale-dependence of f cb is around 0.1% on large scales where the definition of f cb is meaningful. Strictly speaking, the presence of this scale-dependence invalidates our whole redshift-space one-loop calculation including IR resummation and calls for a computation of the appropriate Green's functions. However, given that this effect is very small, we will neglect it and use the EdS approximation with a scale-independent approximation for f cb . In principle, one can include the effect of appropriate Green's functions by perturbatively expanding around the EdS kernels. We leave this for future work.
Overall, in the presence of massive neutrinos, we use the same FFTLog-EdS formulas as before, but apply them to the linear "cb" power spectrum, including the suppression at short-scales by massive neutrinos' free-streaming. At any required redshift, the code takes the power spectrum at this exact time, such that the linear time-dependence of the neutrino suppression is taken into account. This approach is justified by N-body simulations of Ref. [93], which showed that the leading effect of massive neutrinos is always a suppression of the linear power spectrum, and any residual scale-dependence of this suppression is insignificant even for volumes as large as 100 (Gpc/h) 3 . This observation was also confirmed in various forecasts, e.g. [17,96]. Given these reasons, we expect that using the usual FFTLog formulas in the presence of massive neutrinos will be a good approximation even for future surveys like DESI or Euclid.

Non-Standard Extensions of ΛCDM
The code in its current form can be used without any limitations for all non-minimal cosmological models that do not require modification of the perturbation theory kernels. One such example is the Early Dark Energy (EDE) model, in which the standard ΛCDM early universe physics is significantly modified in an attempt to resolve the Hubble tension [97]. CLASS-PT has been already successfully used to put the strongest constraints to date on the EDE model from the combination of the CMB and LSS data [98] (see also [99]).
In principle, CLASS-PT can be extended even to those cases which require modifications in the mode-coupling kernels, e.g. modified gravity. If these models do not violate the Equivalence Principle, one has to simply recompute the perturbation theory matrices incorporating the new kernels from these extended models. In this case, the body of the code does not need to be modified. If the Equivalence Principle is violated, IR resummation must be altered accordingly, see Refs. [100,101].

Modified CMB Lensing Routine
In certain situations, it may be useful to have some alternative estimate for non-linear corrections that can be used instead of HALOFIT for the CMB lensing calculations. This is clearly the case for exploration of the non-standard cosmological models for which the HALOFIT fitting formula was not calibrated. Since the non-linear corrections relevant for CMB lensing are relatively small for angular multipoles < few×10 3 [102], one may expect that perturbation theory gives reasonably accurate results for current lensing data such as, e.g. the Planck measurements. One technical difficulty in applying perturbation theory to CMB lensing is the significant width of the lensing kernel. This requires non-linear corrections from many different redshifts, whose full calculation is very time-consuming. However, given that the perturbations of the lensing potential are only very mildly-nonlinear, and given the statistical errors of current lensing data, the accuracy of non-linear corrections around ∼ 1% is tolerable. In this case one can adopt the following simple approximation scheme: 1. Compute the full matter power spectrum P ref 1−loop at some fixed reference redshift z ref ; 2. Obtain the spectra at different redshifts z i by rescaling P ref 1−loop with scaleindependent linear growth factors, This procedure is exact in EdS if we neglect the time-dependence of IR-resummation and counterterms. The effect of both is around 1% on mildly non-linear scales and hence can be neglected for our purposes.
Our modified lensing module was tested on the Planck 2018 data in Ref.
[14], where it was found to give the same result as HALOFIT for νΛCDM and νΛCDM+N eff models. However, we would like to stress that its accuracy has not been extensively tested for the precision required for future experiments.

Results and Performance
In this Section we show some results and discuss the performance of our code. All plots are generated with the Jupyter notebook that can be downloaded from the GitHub page of the code. Our timing results were obtained on a MacBook Pro Retina Early 2015 laptop, with a 2.7 GHz Intel Core i5 processor and using OS X version 10.11.6. In all cases, CLASS was run with the C compiler gcc-6.1.0. Our classy is based on Python 2.7.10, numpy 1.14.5, and scipy 0.19.0. The results of this Section will be presented for the non-linear power spectrum using the We use the following nuisance parameters: These are consistent with the values extracted from high-resolution BOSS mock galaxy catalogs and the actual BOSS survey data [13]. We stress that these nuisance parameters should be fitted from the data in any realistic analysis. Fig. 4 shows the breakdown of different contributions to the matter power spectrum in redshift space without IR resummation. In Fig. 5 we show the effect of IR resummation. Without this procedure, the one-loop correction fails to capture the shape of the BAO wiggles and even their frequency. This result is well known in the literature [46,49] and it explicitly shows that IR resummation is a necessary ingredient of any realistic non-linear calculation. For comparison, we also display the linear theory power spectrum. In Fig. 6 we show the galaxy-galaxy, galaxy-matter and matter-matter spectra in real space (left panel) and the breakdown of different bias loop corrections (right panel). Fig. 7 displays redshift-space multipoles of dark matter (left panel) and BOSS galaxies (right panel). Note that in the latter case we have included thẽ c ∇ 4 z δ -counterterm. 14 Fig. 8 shows the lensed temperature (TT) and the CMB lensing potential power spectra computed in perturbation theory and with HALOFIT (divided by the linear theory prediction), as well as the relative difference between the two. One sees that the difference between the PT and HALOFIT predictions is less than 0.1% for the lensed TT spectrum and around ∼ 2% at the small-scale part of the C (φφ) spectrum.

Examples of Nonlinear Spectra
These differences can be taken as an estimate for the theoretical error associated   with the modeling of non-linear corrections. We believe that the residual between the PT and HALOFIT spectra can be reduced by an appropriate tuning of the dark matter effective sound speed c 2 s . Moreover, a better description can be wrought by combining the two methods; perturbation theory is very accurate on mildly nonlinear scale, whereas the N-body based fitting formulas capture the leading behavior in the fully non-linear regime. The exploration of the matter power spectrum on these short scales can be done with relatively cheap small-box simulations. A thorough study of this possibility is left for future work.

Performance
Let us discuss now the performance of our numerical routine. Table 1  default one. This is a consequence of the fact that the most time-consuming process is matrix multiplication, which very roughly scales as N 2 FFTLog . We see that basic runs for the real-space power spectrum without IR resummation are quite fast. Their speed is comparable to that of other methods, e.g. FAST-PT [22]. IR resummation and bias tracers increase the execution time by a factor of ∼ 5 separately. Since these two procedures are independent, this results in an overall speed loss by a factor of 10 compared to the basic run. Redshift space distortions affect the calculation in two ways. First, there are additional convolution integrals that appear in multipole moments. Second, redshift space requires a more sophisticated IR resummation procedure, which again increases the number of convolution integrals even further. When the two effects are combined, the execution time reaches the level of 1.3 second for high precision settings and 0.3 seconds in the fast mode. The inclusion of the AP effect does not notably affect the speed.

Cautionary Remarks
There are several caveats to be borne in mind when using our code.
First, at face value, the code can be used for any beyond-ΛCDM cosmology provided that the structure of the perturbation theory kernels is not modified. This is the case for the bulk of extended models explored by Planck [1]. However, the numerical implementation choices made in the code have not been extensively tested for cosmological models that are extremely different from the Planck best-fitting cosmology. Some of our choices, i.e. the frequency cuts in the wiggly-non-wiggly decomposition, would have to be reconsidered if someone wants to explore, say a model with a large numbers of neutrino species, e.g. N eff = 42.
Second, many implementation choices in our code were made to maximize precision on large scales k ≤ 1 h/Mpc. Our baseline realization of the non-linear calculation must not be used for k 3 h/Mpc. Therefore, our code is not suitable for small-scale galaxy clustering or some lensing calculations where a significant amount of signal comes from the highly nonlinear scales. One also must carefully choose k max as a function of redshift, recalling that the loop corrections reduce at high-z, allowing smaller scales to be probed. The maximal wavenumber to which our code can be used to extract information from the matter clustering corresponds to the scale where the loop expansion blows up, i.e. whence the two-loop correction becomes comparable to the tree-level prediction. Using the fit to the two-loop power spectrum from Refs. [17,38], this scale can be estimated as The use of non-linear corrections computed with our code is, strictly speaking, justified only for k < k NL . The corresponding validity domain is shown in Fig. 9 as a function of redshift. 15 One obvious caveat is that our code does not include relativistic corrections and wide-angle effects, and therefore it should be used with care on very large scales. Furthermore, it does not have corrections to the linear bias due to local primordial non-Gaussianities. All of these corrections do not require nonlinear calculations and can be easily added if necessary.

Application to the BOSS Data
In this Section, we illustrate an application of our code to the analysis of the final BOSS data release [7]. To this end, we interface CLASS-PT with the MCMC sampler montepython v3.0 [6,12]. We will analyze a full-shape likelihood built out of the publicly available BOSS data and products taken from 16 , see Refs. [103,104] for more detail. This likelihood has already been used in Refs. [13][14][15], where one can find all technical details. We repeat it here just as an illustration.
As an aside, we would like to mention that the shape of the galaxy power spectrum has been used for cosmological parameter measurements since the dawn of galaxy surveys, see e.g. [105][106][107][108][109][110]. This practice, however, has been abandoned in the recent full-shape analyses that are based on the methodology borrowed from the BAO measurements [7, 104,111]. These analyses infer distance information by studying how the AP effect distorts some fixed-shape power spectrum template. The fixed template method is also adopted in the measurement of rms velocity fluctu- ation f σ 8 . This method has a number of limitations which can compromise the cosmological analysis of future high-precision data [13,17]. 17 An alternative to the fixed shape approach is to return to the methodology of measuring the cosmological parameters of a given model from the full power spectrum. This is the standard method adopted in the analyses of the CMB data [1]. An important advantage of this method is its universality: it can be applied to any model including beyond ΛCDM cosmologies. In this Section, for illustration purposes, we present the constraints obtained in this way for the base ΛCDM model. It is straightforward to repeat this analysis for more complicated beyond-ΛCDM models, see e.g. [25] for the analysis within wCDM and [14, 15] for νΛCDM and νΛCDM+N eff . We stress that the key novelty of our analysis is the most advanced theoretical model for the nonlinear power spectrum. In other aspects our method closely follows the ones proposed and used decades ago.
Our likelihood covers the pre-reconstructed redshift-space power spectra of BOSS galaxies across two non-overlapping redshift bins, 0.2 < z < 0.5 and 0.5 < z < 0.75 from two patches of the sky (North Galactic Cap and South Galactic Cap, NGC and SGC). We use the momentum range [0.01, 0.25] h/Mpc, which is stable w.r.t. instrumental systematics and two-loop corrections, that are omitted in our theory model. We fit the BOSS galaxy power spectra assuming the base flat ΛCDM model, fixing the tilt of the primordial power spectrum of scalar fluctuations n s and the physical baryon density ω b to the Planck 2018 best-fit values [1]; In principle, we can also scan over these parameters in our chains, see e.g. [13, 15], but do not do it here for simplicity. The role of the priors (6.1) and their impact on parameter inference have been thoroughly investigated in Ref.
[13]. Following Ref. [1], we approximate the neutrino sector with only one massive eigenstate and fix its mass to the lowest value allowed by the oscillation experiments, m ν = 0.06 eV.
This choice is made purely for demonstration purposes. We believe that it is more appropriate to scan over this unknown parameter, as is done in Refs. [13][14][15]. Our MCMC chains sample the remaining cosmological parameters of the minimal ΛCDM model: the physical density of dark matter ω cdm ; the Hubble constant H 0 ; the amplitude of primordial scalar fluctuations A s . We do not assume any priors on these parameters. To be more precise, we scan over A 1/2 s normalized to the Planck best-fit value, This choice allows us to treat A s as a nuisance parameters and quickly scan over it, which leads to better convergence. 18 We have run our analysis both in the fast and default modes and obtained identical results.  As far as the nuisance parameters are concerned, we fit them for each galaxy sample separately, due to different calibration and selection functions. We have seven nuisance parameters in total (sampling all in the "fast" mode from Ref. [114] alongside A 1/2 ): linear bias b 1 , quadratic bias b 2 , tidal bias b G 2 , shot noise P shot and three counterterms c 0 , c 2 ,c. The cubic bias b Γ 3 is set to zero. The detailed description of these parameters can be found in Ref. [13]. We chose the following priors for the 18 Modulo IR resummation, the non-linear power spectra depend on A s through a simple rescaling, Once IR resummation is taken into account, this rescaling is, strictly speaking, inexact because A s also controls the amplitude of the BAO damping scale. However, variations of the damping scale are analogous to changes of the separation scale k S , which is a higher order effect. Thus, using the rescaling of A s is accurate up to two-loop contributions, which are omitted in our model anyway. bias parameters: 4) and the counterterms: 5) 19 We use the notation N (mean, variance) for the Gaussian prior.
which are selected such that the corresponding shapes do not exceed the linear theory spectra on the scales used for the fit. Furthermore, the priors for the bias parameters are motivated by the coevolution model and results of N-body simulations. Alternatively, one could fix the priors on the nuisance parameters using the method proposed in Ref. [16]. We discuss the treatment of nuisance parameters, including their priors and measurements, in Appendix C. All in all, the priors for nuisance parameters do not significantly affect the constraints on cosmological parameters. The results of our analysis 20 are shown in Fig. 10 and in Table 2, where we separated the directly sampled parameters (ω cdm , H 0 , A s ) from the derived ones (Ω m , σ 8 ). These constraints agree well with those reported in Ref.
[13], although the priors used in our present analysis are slightly different. Note that presented BOSS constraints should always be taken in conjunction with the priors on ω b , n s and m ν made in our analysis. For comparison, we also show the results of our analysis of the baseline Planck 2018 likelihood [116] for the same cosmological model. 21 We publicly release our BOSS Montepython likelihoods in a separate repository https://github.com/Michalychforever/lss_montepython. The likelihoods are available is the standard and optimized versions. The latter includes the Fourier-space window function treatment and analytic marginalization over the nuisance parameters {c 0 , c 2 ,c, P shot } along the lines of Ref. [75].

Conclusions
In this paper we have presented a new open-source extension of the Boltzmann solver CLASS that incorporates one-loop perturbation theory calculations. This module, called CLASS-PT, computes Fourier-space power spectra of matter and biased tracers in real and redshift space. It contains all ingredients required for the application to data: IR resummation to describe the non-linear evolution of the BAO wiggles; nonlinear bias prescriptions; UV counterterms that capture the effects of poorly known short-scale physics on large scales, such as fingers-of-God and baryonic feedback. We stress that the main advantage of perturbation theory over other approaches is that it guarantees high precision on wavenumbers smaller than the nonlinear scale k NL ∼ 0.5 h/Mpc at z = 0. Many complicated phenomena that operate on short scales drastically simplify in the long-wavelength limit, where they can be consistently and accurately taken into account.
The current execution time of CLASS-PT is fast enough to make the Markov Chain Monte Carlo analysis of redshift-space clustering data feasible. The code was already used for these purposes in References [13][14][15][16][17].
The realization of nonlinear perturbation theory as a module directly inside the Boltzmann code CLASS has many advantages. It is clearly structured, easy to modify, and designed to avoid hard-coding. Moreover, it can be readily interfaced with other software, e.g. conventional MCMC samplers such as Montepython [6,12] and cobaya 22 . The CLASS code is one of the standard tools established in cosmology. By writing our module directly as part of CLASS, it was our aim to make the nonlinear cosmological perturbation theory calculations more available to the broad community: now, all users familiar with CLASS can easily perform these calculations.
Like many things, CLASS-PT is ever-evolving. The first avenue of improvement is devoted to the improvement of efficiency and accuracy of our calculation. We believe that some implementation choices used in the current version of CLASS-PT may not be not optimal and will certainly be revisited in the future.
The second line of research is aimed at incorporating new non-linear effects. In particular, the FFTLog algorithm is convenient for the implementation of two-loop power spectrum and one-loop bispectrum calculations [24]. Moreover, we plan to implement the observer-dependent convolution integrals describing selection effects such as intrinsic alignment of galaxies (see Ref. [117] and references therein). Additionally, it is important to accurately take into account corrections due to the scale-dependent growth introduced, e.g. by massive neutrinos. We leave these research directions for future work. momenta read A 1 (ν 1 , ν 2 ) = 1 2 (I(ν 1 − 1, ν 2 ) − I(ν 1 , ν 2 − 1) + I(ν 1 , ν 2 )) , For the integrals with three insertions one finds Finally, the integrals with four insertions of the loop momentum yield (A.5)

B Brief Installation Manual
CLASS-PT is compatible with both python v2 and v3. It is installed and configured in 8 easy steps: Let us first discuss our choice of priors for the nuisance parameters. We assume a flat non-informative prior on b 1 A 1/2 ∈ (1, 4). Since the satellite fraction of the BOSS galaxy sample is quite small, most of the galaxies are centrals and hence should trace the properties of the host dark matter halos. The measurements of b 2 and b G 2 from N-body simulations [118] yield Note that the values for b G 2 are also consistent with the predictions of the coevolution model [58]. On general grounds, the bias parameters are expected to be O(1) in the EFT, which motivates the priors b 2 A 1/2 , b G 2 A 1/2 ∼ N (0, 1). Note that we have inserted A 1/2 ≈ 1 in the definition of our sample parameters because this choice leads to somewhat better convergence of the MCMC chains. As far as the higher-derivative counterterms c 0 and c 2 are concerned, they are, in general, also expected to be The non-linear scale in redshift space depends on the velocity dispersion of the BOSS galaxies, which can be quite large. Indeed, previous BOSS full-shape analyses report σ v ∼ 5 Mpc/h [104], which is several times larger than the real-space estimate k −1 NL ∼ 2 Mpc/h. It is important to stress that the quoted measurement of σ v from Ref. [104] results from an application of a simplified fitting function, and the actual velocity dispersion can be different if one is using the full EFT model. Nevertheless, we adopt the following priors for the counterterms that are wide enough to accommodate a large velocity dispersion, c 0 , c 2 ∼ N (0, 30 2 ) [Mpc/h] 2 . (C. 3) The prior for the next-to-leading order RSD countertermc is more subtle. Naïvely, this contribution has the order of the two-loop correction and hence has not been originally included in the one-loop EFT theory model [59,72]. However, due to the strong fingers-of-God found in the BOSS galaxy sample, the coefficientc turned out to be enhanced compared to the naive EFT estimates. Finally, as far as the constant shot noise contribution P shot is concerned, its true value is expected to deviate from the Poissonian predictionn −1 due to exclusion effects [119] and fiber collisions [120]. The latter are not possible to predict from first principles and are difficult to model even in mock catalogs [121]. Thus, we adopted a more practical data-driven approach; first finding the best-fit values for P shot from the data itself and then imposing a large conservative prior centered at this best-fit value. The Poissonian partn −1 has already been subtracted from the power spectrum estimator of our data. We found the residual shot noise contribution best-fits to be roughly P shot ∼ 5 · 10 3 [Mpc/h] 3 , (C.5) for all the data chunks studied in this paper. This motivated the prior P shot ∼ N (5, 5 2 ) · 10 3 [Mpc/h] 3 , which is wide enough to accommodate absolute deviations fromn −1 across all data samples, in particular, in the high-z SGC sample, whose Poissonian shot noise is quite large. Notice that the residual P shot can, in principle, be negative. This fact is reflected in our prior. We present the optimal values of the nuisance parameters found in our MCMC analysis in Tab. 3 and in Figs 11 (high-z NGC), 12 (high-z SGC), 13 (low-z NGC), 14 (low-z SGC). We differentiate between nuisance parameters for different BOSS data samples with the following superscripts: (1) = high-z NGC, (2) = high-z SGC, (3) = low-z NGC, (4) = low-z SGC (C. 6) We stress that these parameters are obtained from a joint fit, i.e. the cosmological parameters are assumed to be the same across all samples. One can see that the best-fit values are in good agreement with the ones expected from the BOSS galaxy sample.
The measured values of c 0 , c 2 andc are indeed consistent with the estimate for the velocity dispersion effects (C.3), (C.4). Moreover, the best-fitting values of b 2 and b G 2 agree with the values found in the N-body simulations for the host halos similar to those of the BOSS sample (C.1). It would be interesting to further investigate if these values are also compatible with biases inferred from the bispectrum [122,123].  Figure 11. The posterior distribution for ω cdm , H 0 , A 1/2 ≡ (A s /A s, Planck ) 1/2 and the high-z NGC nuisance parameters inferred from the joint BOSS DR12 full-shape likelihood. [