Primordial Features from Linear to Nonlinear Scales

Sharp features in the primordial power spectrum are a powerful window into the inflationary epoch. To date, the cosmic microwave background (CMB) has offered the most sensitive avenue to search for these signatures. In this paper, we demonstrate the power of large-scale structure observations to surpass the CMB as a probe of primordial features. We show that the signatures in galaxy surveys can be separated from the broadband power spectrum and are as robust to the nonlinear evolution of matter as the standard baryon acoustic oscillations. As a result, analyses can exploit a significant range of scales beyond the linear regime available in the datasets. We develop a feature search for large-scale structure, apply it to BOSS DR12 data and find new bounds on oscillatory features that exceed the sensitivity of Planck for a significant range of frequencies. Moreover, we forecast that the next generation of galaxy surveys, such as DESI and Euclid, will be able to improve current constraints by up to an order of magnitude over an expanded frequency range.


Introduction
Characterizing the nature of inflation is one of the major challenges in cosmology. While current data is compatible with the simplest incarnation of inflation, a single weakly-coupled scalar field on a very flat potential, the space of possibilities for inflation are vast and should ultimately be settled by data. An appealing aspect of the simplest models is that the observed (near) scale invariance of the power spectrum of fluctuations is easily explained by the flatness of this potential [1]. However, attempts to realize inflation from a more fundamental starting point can lead to much more complicated models where many interconnected pieces are needed to achieve scale invariance [2,3]. This dichotomy between the simplest models and ultraviolet-complete examples originates from quantum gravity itself: sufficiently flat potentials can be engineered using symmetries, but quantum gravity famously abhors them [4,5]. As a result, a wide variety of models gives rise to non-trivial deviations from canonical slow-roll and scale invariance (i.e. features) [3,6].
Other mechanisms avoid this picture altogether by invoking non-trivial interactions that can lead to non-Gaussian n-point correlation functions (primordial non-Gaussianity) [7].
Our ability to test these ideas directly with data relies on separating the primordial signatures of interest from a broad range of processes at late times. This challenge is particularly acute for constraints on inflation from large-scale structure (LSS) surveys since the observed objects, i.e. galaxies, owe their existence to the nonlinear gravitational evolution of matter fluctuations in the late universe. While both current and future galaxy surveys have the raw statistical power to compete with other cosmological probes, such as the cosmic microwave background (CMB), the useful information is greatly diminished if we restrict our analyses to modes that are sufficiently linear to use forward modeling in order to isolate the primordial information. There has been steady improvements in this direction, but eventually these techniques are expected to be limited by the complexity of (astro)physics at short distances [8,9].
An alternate and already successful approach is to look for special observables that are (at least partially) immune to the complications presented by LSS data. The best known example of this type are the baryon acoustic oscillations (BAO). Although they manifest themselves as an oscillation in the power spectrum on the scales most sensitive to nonlinearities, it is more usefully understood as a sharp peak in the two-point correlation function at the size of the sound horizon, which is a scale that is much larger than the scale where nonlinear evolution dominates [10]. More recently, a constant phase of the baryon acoustic oscillations was shown to be immune to nonlinear evolution [11]. These are useful examples as they show that smooth and oscillatory power spectra are not sensitive to the same nonlinear effects.
Having said this, the most common inflationary parameters (e.g. the scalar spectral index n s or its running α s ) have proven challenging to be constrained by LSS. Changes to these parameters typically lead to smooth variations in the power spectrum (as a function of wavenumber k) and are therefore degenerate with other contributions such as galaxy biasing and baryonic effects. Furthermore, most inflationary observables get their constraining power from the smallest physical scales accessible in a given survey where gravitational nonlinearities dominate. With current data, large-scale structure is most competitive with the CMB as a probe of inflation for constraints on local primordial non-Gaussanity [12][13][14]. In this case, the non-Gaussian signature in the initial where BOSS has the largest signal-to-noise ratio. We see that the effective frequency of the logarithmic oscillations decreases as we go to larger wavenumbers k.
conditions manifests itself in the biasing of galaxies on the largest scales where nonlinearities are negligible [15]. Future surveys will search these large scales with increasing sensitivity and have the potential to ultimately exceed the CMB [16,17].
In this paper, we demonstrate that features in the primordial spectra, much like the standard BAO signal itself, are immune to short-distance nonlinear processes of the late universe and the effects of large-scale bulk flows can be captured analytically. We can therefore test these models with the full statistical power of LSS surveys. Phenomenologically, primordial features are generally characterized by significant deviations from scale invariance over a narrow range of scales, usually in the power spectrum. The most canonical examples, shown in Fig. 1, are written as oscillations in either k or log k [6]: P ζ (k) = P s (k) 1 + A lin sin(ω lin k + ϕ lin ) + A log sin(ω log log(k/k ) + ϕ log ) , (1.1) where P ζ (k) is the primordial power spectrum of adiabatic density fluctuations (ζ) and P s (k) is a smooth function of k. For a linear oscillation (A lin ), the insensitivity to nonlinear effects is identical to the case of the baryon acoustic oscillations: if we Fourier transform the signal, the linear oscillation is a sharp peak at the scale ω lin in the two-point correlation function. For sufficiently large ω lin , one is effectively looking for a second BAO peak. For logarithmic oscillations (A log ), there is not a simple description in terms of scales, but we will show that local nonlinear evolution is incapable of producing the same oscillation for sufficiently large frequencies ω log . While the analogy with the BAO signal is useful, it is worth observing that the primordial features arise directly in the initial conditions of the dark matter and baryons, and are suppressed only by the amplitude of the oscillation. By contrast, the baryon acoustic oscillations themselves are a consequence of the physics of baryons. This means that their impact on LSS data is suppressed by ω b /ω m and the growth of dark matter density fluctuations prior to recombination. As a result, the amplitude of the BAO spectrum is suppressed relative to a primordial feature and is roughly equivalent to a linear feature amplitude of A lin = 0.05. The fact that we do not see an additional oscillation beyond the BAO signal by eye already suggests that A lin 0.1 without any analysis.
In addition to being protected quantities in LSS, primordial features of this form are a wellmotivated probe of the early universe in their own right [6,[18][19][20][21][22][23][24][25][26][27][28][29][30]. They notably arise in axion monodromy inflation [31] as a direct consequence of the fundamental symmetry structure needed to produce large-field inflation. The periodic, nonperturbative potential generated for axions gives rise to an oscillation in the inflationary potential and manifests itself as a logarithmic oscillation in the power spectrum [23]. In addition, there are a number of scenarios where particles are excited from the vacuum at a specific time through non-adiabatic evolution and can give rise to linear oscillations. Moreover, most features in the power spectrum can be efficiently decomposed in this basis of functions which therefore captures large parts of model space.
The outline of the paper is as follows. In Section 2, we show that features are robust to small-scale nonlinearities and compute the nonlinear damping effect due to long-wavelength modes. In Section 3, we introduce our new analysis to search for these features in LSS data and verify that these oscillatory signals can be reliably constrained. In Section 4, we apply this pipeline to BOSS DR12 data and present a new constraint that exceeds Planck over a significant range of frequencies. Moreover, we forecast the sensitivity of future observational surveys. We conclude in Section 5. Additional details on the theoretical calculation, the employed forecasts, and the performed LSS and CMB analyses are provided in a set of four appendices.

Primordial Features and Galaxy Surveys
In this section, we determine how a primordial feature will appear in the nonlinear (low-redshift) universe. We first characterize the signals in the linear matter power spectrum. Then, we will use a linear-response argument to show that nonlinear evolution on small scales does not change the amplitude of the feature in the nonlinear power spectrum. Finally, we will use infrared resummation to determine the nonlinear damping of the features from large-scale modes.

Oscillatory Features in the Primordial Spectrum
The physics of inflation determines the primordial power spectrum, P ζ (k). In the simplest versions of inflation, this power spectrum arises from the freeze-out of quantum mechanical fluctuations when the physical wavelength reaches the Hubble radius, k = aH(t ), where k is the constant comoving wavenumber, a(t) is the scale factor and H(t) is the Hubble parameter. (This equation defines the freeze-out time t .) As a result, the amplitude of fluctuations for a comoving wavenumber k is determined by the physics of inflation around the time t . Furthermore, the near scale invariance of the resulting spectrum is a consequence of the weak time dependence in the evolution of the perturbations during inflation.
Scale-dependent features in the primordial spectrum therefore arise from strongly timedependent physics during inflation. 1 This may be due to sharp features in the underlying potential for a scalar field, or special locations in the field space of the inflaton where other particles become light and can be excited from the vacuum. Despite the wide range of possibilities, the signatures do not significantly depend on the details of the model since the nature of the time dependence controls the deviations from scale invariance.
We will assume that the smooth spectrum P s (k) of (1.1) is the almost scale-invariant power spectrum of curvature perturbations in vanilla models of inflation, where A s and n s are the scalar amplitude and spectral index at the pivot scale k , which we generally take to be k = 0.05 Mpc −1 . We then write the full power spectrum P ζ (k) including the contribution from features, δP ζ (k), as As suggested in (1.1), we will consider oscillatory features with linearly-spaced oscillations, δP lin ζ , and logarithmically-spaced oscillations, δP log ζ . We phenomenologically parameterize the former, which we refer to as linear features, as follows: 2 with the feature frequency ω lin , and the amplitudes of the sine and cosine contributions A sin lin and A cos lin , respectively, or the overall feature amplitude A lin and corresponding phase ϕ lin . The so-called logarithmic features are similarly defined as with the feature frequency ω log , and the amplitudes of the sine and cosine contributions A sin log and A cos log , respectively, or the overall feature amplitude A log and corresponding phase ϕ log . We refer to the parameterization in terms of two amplitudes as 'amplitude parameterization' and the one in terms of the overall amplitude and a phase as 'phase parametrization'. We note that it has been customary in the literature to define the linear feature frequency ω lin as a dimensionful quantity in units of Mpc, whereas the logarithmic feature frequency ω log is dimensionless. In addition, we remark that the feature amplitudes give the contribution relative to the standard power spectrum P ζ,0 and the mean of δP ζ (k) is vanishing.
The information in the primordial power spectrum is transferred to the matter power spectrum, as illustrated in Fig. 2, through the usual linear evolution from initial conditions,  Figure 2: Imprint of primordial features in several large-scale structure observables at redshift z = 0: the matter power spectrum P (k) (top) and the relative wiggle spectrum P w (k)/P nw (k) (middle) in Fourier space, and the (rescaled) two-point correlation function ξ(r) in real space (bottom).
We compare a featureless model (gray) to scenarios involving a linear feature (left) and a logarithmic feature (right) with A sin X = 0.05 and A cos X = 0, X = lin, log. In addition to the predictions in linear theory (dashed), we also show the observables including nonlinear corrections (solid) from a theoretical calculation of the damping.
where D(z) is the linear growth rate and T (k) the transfer function. For large enough feature frequencies, these oscillations can be distinguished from the broadband shape of the power spectrum, similar to the baryon acoustic oscillations. It is therefore natural to constrain the feature models as contributions to the BAO spectrum which is why we split the power spectrum into a smooth ('no-wiggle') part and an oscillatory ('wiggle') part, (2.6) Since primordial features with large enough frequencies are only contained in the second term, it is useful to further decompose the wiggle spectrum as follows: where P w BAO (k) is the standard BAO spectrum in a featureless ΛCDM cosmology, the auto-spectrum of possible primordial features is with X = lin, log, and the third term is the BAO-feature cross-correlation power spectrum. Since the BAO signal itself is only a small (five-percent) contribution to P (k), we will be able to generally neglect this cross-spectrum term in our theoretical considerations. Given that (2.6) is the linear matter power spectrum, we will show in the two section that small-scale modes do not affect P w (k) for larger enough feature frequencies ( §2.2), but that each of its oscillatory components in (2.7) is affected by gravitational large-scale nonlinearities and exponentially damped ( §2.3).

Robustness of Features to Small-Scale Nonlinearities
The smallest scales in a LSS survey carry most of the statistical power, but are also the most prone to nonlinear corrections, including galaxy bias and baryonic effects. For this reason, a typical analysis might cut at k = (0.1 − 0.2) h Mpc −1 to avoid the complications of modeling and marginalizing over these effects. In the case of the baryon acoustic oscillations, it has long been known that they are robust to those effects which change the power spectrum only by a smooth window function and that it is possible to aggressively marginalize over these smooth corrections without losing any information. In the following, we show that high-frequency oscillatory features in the power spectrum are protected from small-scale nonlinearities in precisely the same fashion as the BAO signal. We first give an intuitive argument that this is indeed the case which we then confirm more rigorously.

Intuitive argument
The power spectrum of linear oscillations P w lin (k) as defined in (2.3) and (2.8) is the same as the approximate form of the BAO signal, where the linear feature frequency ω lin corresponds to the sound horizon r s and the feature phase ϕ lin is the phase shift due to free-streaming neutrinos [11,33,34]. 3 It is a long-established fact that the baryon acoustic oscillations are essentially immune from nonlinearities on small scales because the sound horizon is a large-distance scale. The same should therefore be true for the linear oscillations if ω lin 75 Mpc. More recently, it has been shown that the phase ϕ lin is also protected [11] which implies that this argument will hold for both the sine and cosine contributions.
A priori, it is less clear that logarithmic oscillations will share the same nice properties. Unlike the linear frequency ω lin , the frequency ω log of the logarithmically-spaced oscillations is a dimensionless parameter and, therefore, does not refer to any fixed physical scale in either Fourier or configuration space. As a result, the signal must appear on all scales in both descriptions and, consequently, is not clearly distinct from nonlinear effects. Fortunately, in practice, our  Figure 3: Separation of zeros (or, equivalently, peaks and troughs) in the spectrum of logarithmically-spaced oscillations (2.4) as a function of wavenumber k for a range of frequencies ω log . For comparison, the separations in the standard BAO spectrum and for linear oscillations with ω lin = 75 Mpc and ω lin = ω Ny ≈ 929 Mpc, which is the Nyquist frequency in our BOSS analysis below, are also shown. The dotted and dashed gray lines indicate the nonlinear scale k nl ≈ 0.15 h Mpc −1 and k ω = k, respectively. The former marks the approximate wavenumber where nonlinearities are expected to become large, k k nl , while these nonlinearities do not alter features with k ω k. The zero separations are denoted by k ω /2 since they are equivalent to half of a feature oscillation period.
information only comes from a limited range of scales which is given by k ∼ (0.1 − 0.3) h Mpc −1 for BOSS. We can always (Fourier) decompose any feature into a sum over linear oscillations. For sufficiently large ω log , the signal in this range of wavenumbers is reliably reproduced keeping only the linear oscillations with frequencies large enough to be protected by our previous argument. We illustrate this argument in Fig. 3, where we show the peak-trough separation for the logarithmicallyoscillating power spectrum P w log (k) defined in (2.4) and (2.8) as a function of k in our range of interest. For ω log 20, the peak-trough separation is at the level of the BAO spectrum, while nonlinearities are expected to affect the power spectrum at smaller scales. For the purposes of an analysis of BOSS data, 4 the logarithmic features therefore do not present any significant complications.

Rigorous argument
We can rigorously derive these results using the techniques developed in [11]. Since the amplitude is known to be small (percent level at best), we can find the nonlinear density fluctuations as the linear response in the feature amplitude (A lin or A log ). On general grounds, this response must take the form [11] where δ w in ( x ) is the contribution to the initial density contrast at linear order in A X , X = lin, log. The first index in red, x, arises from the underlying inhomogeneity of the universe and would be absent in a translation-invariant system. The second index in blue, x − x , characterizes the propagation of information from one point to another.
Unlike for the BAO spectrum, we are not concerned that nonlinear evolution will make a small change to the frequency or phase. Since nonlinear evolution is not expected to create such a frequency out of nothing, the primary concern is that nonlinear evolution will make a large change to the amplitude by some incalculable amount, making it impossible to relate bounds on the nonlinear spectrum to the amplitude in the initial conditions. To be dangerous, this change must specifically alter the amplitude of a high-frequency oscillation relative to the amplitude of the underlying smooth nonlinear matter power spectrum. To proceed, we Fourier transform (2.9) to arrive at Note that it is the wavevector k − q ≡ p that characterizes the scale of the inhomogeneities. Following [11], we will define a scale k ω k as the approximate period of oscillations in the power spectrum (e.g. k ω = 2π/ω lin ). We will separate the domains p > k ω and p < k ω to distinguish the "small-scale" and "large-scale" inhomogeneities, respectively.
Using an argument from [11], it is easy to see that small-scale inhomogeneities do not contribute an oscillatory signal in the nonlinear matter power spectrum. 5 Suppose now that p > k ω did contribute a high-frequency oscillatory signal. This would imply that we should see a large change in δ w (k) if we shifted k → k + αk ω where | α| ∼ O(1). However, recall that G( p, q ; τ ) is determined from the nonlinear density field in a universe without the oscillatory signals and should therefore be a smooth function of p and q. As a result, we can Taylor expand G( k + αk ω − q, q ; τ ) in α to find where we used ∇ k ∼ k −1 because G is a smooth function of k − q. From this argument, we see that δ w p>kω ( k, τ ) is a smooth function of k. This means that it does not change rapidly over one period of the initial oscillations and, therefore, does not contribute to the oscillatory signal. In other words, small-scale nonlinearities do not change the amplitude of features in the power spectrum provided that they have a large-enough frequency.
To conclude the discussion of small-scale nonlinearities, we note that this result is independent of the precise shape of the initial feature. It only requires that the feature, δ w (k), changes by order one over a very small range of k. We do not require that it is sinusoidal or that it is associated with a large physical scale in configuration space. Furthermore, the result that k ω /k 1 is not altered by small-scale nonlinearities is the same condition which requires that the oscillation is distinct from a smooth polynomial. The power spectrum on a scale k is "smooth" if ∂ log P ∂ log k 1. This implies that our features will be "sharp" if These conditions are illustrated in the left panel of Fig. 1 where the linear oscillations at small wavenumbers are smooth while the logarithmic feature is sharp on all scales. Of course, implicit in this discussion is that k is a scale where nonlinearities are important. In our universe, nonlinear effects are strongly k-dependent and, therefore, primarily affect modes near the nonlinear scale, k ≈ k nl .

Damping from Large-Scale Nonlinearities
In the previous section, we established that short-wavelength inhomogeneities cannot alter the appearance of an oscillatory signal in the initial conditions. We now turn to the long-wavelength modes. In general, it is hard to compute the consequences of nonlinearities on the matter power spectrum from first principles. Having said that, it has been shown that the nonlinear effects of large-scale modes on the BAO spectrum can be computed and resummed in perturbation theory, resulting in a damping of the amplitude and shape of the standard BAO signal (cf. e.g. [10,[35][36][37][38][39][40][41]). In the following, we generalize this calculation to a generic linear feature and further extend it to the case of logarithmically-spaced oscillations.

Perturbative Treatment
Our aim is to compute the damping of a generic oscillatory feature due to long-wavelength modes. It is well known that a simple perturbative treatment is not enough to capture the full damping effect in the case of the standard oscillatory features in the matter power spectrum, the BAO signal [35][36][37][38][39][40][41]. As in that case, it is however also useful to start with a perturbative treatment for generic features and then include nonperturbative effects in the calculation.
The full one-loop power spectrum is given by where F n are the usual perturbation theory kernels (see [42] for a review). For a generic oscillatory component P w (k) of the matter power spectrum, the effects of long modes q, with q < Λ ≡ k, 6 6 In this case, long modes q are defined as those modes with wavelengths much longer than the typical width σ of the feature, q < Λ 2π/σ. This implies ΛBAO 0.6 h Mpc −1 for the standard BAO signal. However, in practice, we want to predict the matter power spectrum at any wavenumber k, which implies that the prescription for long modes should also satisfy q k. Since the latter requirement is stronger than the former for the range of scales under consideration in this work, we choose the separation scale to be Λ = k, with 1 [37].
can be captured by (2.14) Since features break the scale invariance of the matter power spectrum, we cannot keep only the first few orders in the Taylor expansion P w (| k + q |) = P w (k) + q · ∇ k P w (k) + O(q 2 /k 2 ), but have to resum the entire series into the exponential We can therefore rewrite (2.14) as We stress that we did not assume any particular form of P w (k) in this expression. It is only based on (i ) calculating the contribution of long modes to the one-loop power spectrum (2.13) and (ii ) the matter power spectrum having an oscillatory component of any kind. Since the calculation proceeds by applying the operator cosh q · ∇ k to the wiggle power spectrum P w (k), we now consider the two oscillatory feature models separately.
Linear features. As a consequence of the baryon acoustic oscillations in the early universe, there is an enhanced probability to find pairs of galaxies at a separation given by the size of sound horizon at the drag epoch, r s ≈ 150 Mpc. We therefore find a peak in the galaxy twopoint correlation function and linearly-spaced oscillations in Fourier space, with the location and frequency given by r s , respectively. An enhanced probability of finding galaxy pairs at another distance scale ω lin would produce the same signatures. This is why we can compute the damping of such feature oscillations in exactly the same way as for the BAO spectrum. We will review them here for a generic scale ω lin 75 Mpc in order to not be affected by small-scale nonlinearities.
Since the BAO signal is itself a small contribution to the overall matter power spectrum, we neglect its contribution and simply use P w (k) ≡ P w lin (k) as defined in (2.8). Applying 2n gradients to P w lin (k) results in where we used the series expansion of the hyperbolic cosine and neglected small corrections that arise from acting with the derivative operators on the smooth envelope P nw (k). Plugging this result into (2.16), we get where we defined with the spherical Bessel function of the first kind j n (x). We note that Σ lin is independent of the wavenumber k unless an implicit dependence is introduced by taking Λ = Λ(k) as noted above.
Logarithmic features. In contrast to linear features, logarithmically-spaced oscillations in Fourier space do not have a simple intuitive interpretation in real space. However, we can proceed with the calculation of the damping without additional caveats because the derivation of (2.16) is valid for any oscillations in the matter power spectrum. In Appendix A, we show how the operator cosh q · ∇ k acts on the wiggle component of the linear matter power spectrum in the presence of logarithmic features, which we introduced in (2.4) and (2.8). In consequence, the one-loop wiggle spectrum can be computed to be where we introduced with µ =q ·k. In contrast to the damping scales of the BAO signal, Σ BAO , and of linear features, Σ lin , which are constant in k, the damping factors Σ log andΣ log are scale dependent. Moreover, the one-loop wiggle power spectrum is also no longer directly proportional to the oscillatory power spectrum at linear order. We however note that (2.21) and (2.22) are only valid if q k. In this limit and for large-enough values of ω log , these expressions can be simplified into a form similar to (2.19) for linear features. Since we can always decompose a logarithmic feature in a basis of linear oscillations, this also conforms with our expectation to recover this result in the appropriate limit. For now, we however choose to keep the calculation general and will discuss these limits in detail below.
The crucial aspect of both one-loop results (2.18) and (2.20) is that they correct the linear power spectrum by order O(1)-terms for a wide range of parameter space and wavenumbers, exactly as in the case of the standard BAO spectrum. In other words, we have P w 1-loop (k) ≈ O(1) P w tree-level (k) for k ∈ [0.1, 0.3] h Mpc −1 because k 2 Σ 2 ≈ O(1) in this range. This indicates that the perturbative treatment is insufficient. Fortunately, it is possible to compute the leading-order correction of long modes to the wiggle power spectrum at all orders in perturbation theory.

Infrared Resummation
The infrared (IR) resummation of the large-scale bulk flows that damp the BAO signal has been studied in various ways [35][36][37][38][39][40][41]. Here, we follow the approach of [38], in which the class of loop diagrams that are most IR-enhanced are first identified and then resummed into the nonperturbative effect, the well-known exponential BAO damping. Their L-loop diagram is given by where the subscript 'LO' indicates that the leading-order IR-enhanced loops are taken into account. Furthermore, we introduced the notation [dq] = d 3 q and defined It is easy to verify that we exactly recover (2.16) for L = 1. Since the rest of the calculation depends on the form of P w (k), we discuss the linearly-and logarithmically-spaced oscillations again in turn.
Linear features. By employing (2.18), it is straightforward to compute the wiggle power spectrum of linear features at L th order, Resumming these terms to all orders, we obtain This result is a generalization of the BAO expression with r s → ω lin (cf. e.g. [38]). Since the value of Σ lin however saturates for ω lin 75 Mpc, we can simply use the BAO damping scale, Σ lin ≈ Σ BAO . We can therefore factor out the damping and write the full matter power spectrum as which constitutes a simple generalization of the known result for the standard BAO signal.
Logarithmic features. It is slightly less trivial to compute the expression of the L th -order loop for logarithmic features with arbitrary frequency ω log . We proceed via induction by computing the first few orders and then deriving the general formula. In this way, the IR-resummed wiggle power spectrum is found to be (2.28) We refer to Appendix A for further details on this calculation.
While this expression provides the leading nonperturbative damping for logarithmic features, we find by explicit calculation that genericallyΣ log Σ log . This can be understood by noticing that the integrands in (2.21) and (2.22) can be expanded in ω log (qµ/k) and ω log (qµ/k) 2 , respectively. The fact that the integrals get their largest contributions from q k explains the hierarchy between Σ log andΣ log . As a result, when k 2Σ log is large enough to be important, the signal is already exponentially suppressed. It is therefore a good approximation to setΣ log ≈ 0 for our analysis choice of ω log ≥ 10 (see below), as confirmed by Fig. 11 in Appendix B. We therefore get Second, it is straightforward to show that in the limit q/k 1, Σ log (k) approaches the functional form of Σ lin with the substitution ω lin → ω log /k, 7 with Σ lin given by (2.19). Therefore, whenever ω log /k 75 Mpc, we can use the same approximation as in the linear case, Σ log (k) ≈ Σ BAO . As we also show in Appendix B, this approximation is good enough for the scales and frequencies of interest in an analysis of BOSS data. 8 We can therefore further approximate (2.29) and write the full nonlinear power spectrum as i.e. in the same way as for linear features in (2.27).
Before concluding the discussion of the theoretical damping calculation, a couple of remarks are in order regarding subleading corrections (2.27) and (2.31): • We computed the leading-order IR-resummed power spectrum. It has however been shown that there are subleading contributions which improve the fit to N -body simulations in the case of the featureless BAO signal [38][39][40][41]44]. Since we do not employ the theoretically computed results, but fit the damping scale in our data analysis below, we can neglect these corrections for both the BAO signal (as in the standard BAO analyses), and the primordial linear and logarithmic features.
• We have not computed the damping of the mixed BAO-feature term of (2.7), P w BAO (k) δP X ζ . The size of this contribution is of order A BAO × A X ≈ 0.05 × 0.01 and, therefore, contributes less than per mil to the matter power spectrum. For linearly-spaced oscillations, we checked that it is again a good approximation for ω lin 2r s ≈ 300 Mpc to also factor out the exponential BAO damping for the mixed term. We expect this to also be the case for logarithmic features. Consequently, we implement the mixed term in the nonlinear matter power spectrum as follows: Finally, from now on, we switch to the notation for the BAO damping scale that has been adopted in data analyses, Σ 2 BAO → Σ 2 nl /2. With this, we are ready to implement (2.32) and search for primordial feature models in observational LSS data.

Feature Search in Large-Scale Structure
In this section, we introduce and establish our search for features in the BOSS dataset. We propose an analysis in which the amplitude and frequency of the linear and logarithmic features can be constrained. Moreover, we check the validity of our approach on mock data and compare the results with the expected constraining power obtained in forecasts.

BOSS DR12 Dataset and Analysis Pipeline
The approach of our analysis, which we introduce in the following, is very general and we expect it to apply to a wide range of surveys. Having said this, some of its aspects are particular to BOSS, such as the validity of some of the employed approximations, and should therefore be revisited in future analyses.
Our analysis is based on the BAO pipeline of [45]. We use the commonly employed density field reconstruction procedure [46] to reduce the damping scale caused by gravitational evolution and move information from higher-order statistics back to the power spectrum [47]. 9 We then measure the galaxy power spectrum following the steps described in [45,48]. The corresponding covariance matrix is obtained by measuring the power spectrum monopole in 999 mock catalogs (see §3.3 for more details on these mock catalogs). To extract the BAO (and potential feature) signal, we marginalize over the smooth galaxy broadband power spectrum, with five polynomial terms Here, the bias parameter B is used to marginalize over the power spectrum amplitude, P nw (k) is the linear no-wiggle power spectrum model without any BAO signal and is the velocity damping term arising from the nonlinear velocity field. Finally, the standard BAO signal and the oscillatory features left after the marginalization described above are modeled as where O(k) ≡ P w BAO (k)/P nw (k) is the standard linear BAO spectrum, α is the associated isotropic scaling parameter and Σ nl is the nonlinear damping scale, which we keep as a free parameter. The fiducial ΛCDM cosmology is taken to be the same as in [45], with matter density Ω m = 0.31, physical baryon density ω b = 0.022, amplitude of linear matter flucations on 8 h Mpc −1 scales σ 8 = 0.824, scalar spectral index n s = 0.96 and Hubble constant H 0 = 67.6 km s −1 Mpc −1 . The linear and logarithmic features are contained in the relative primordial spectrum δP X ζ (k), with X = lin, log, that was introduced in (2.3) and (2.4). While the above model is described in terms of a continuous wavenumber k, cosmological experiments can only access a finite number of modes due to their finite survey volume. The associated fundamental mode k f is given by the largest scale included in the dataset and is naturally setting a resolution limit on the oscillation frequency that we can measure. In practice, this is realized through the survey window function which introduces couplings between modes separated by the fundamental mode or less [48][49][50][51]. In addition, we only measure the power spectrum in discrete bandpowers P i , which average wavenumbers k with k ∈ [k i −∆k/2, k i +∆k/2), where the bandwidth ∆k is a choice of the analysis. As a consequence, a signal with an (effective) linear frequency above the Nyquist frequency, ω Ny = π/∆k, 10 will be aliased and is therefore out of reach. Figure 4 highlights the effects that the finite bandwidth and the window function have on the power spectrum, and illustrates why they limit the range of frequencies that are accessible in an LSS analysis.
In the next sections, we apply this analysis pipeline to the Baryon Oscillation Spectroscopic Survey (BOSS), which was part of SDSS-III [52,53] and used the SDSS multi-fibre spectrographs [54,55] at the 2.5 m Sloan Telescope [56] of the Apache Point Observatory in New Mexico. We employ the final version of this dataset, known as data release 12 (DR12) [57], which contains spectroscopic redshifts of 1 198 006 galaxies. The survey covered 10 252 deg 2 of the sky, divided in two patches called the North Galactic Cap (NGC) and the South Galactic Cap (SGC), and a redshift range of 0.2 − 0.75. Following the main BOSS analysis [58], we split this redshift range into two (independent) redshift bins given by 0.2 < z < 0.5 ('low-z') and 0.5 < z < 0.75 ('high-z'). While the standard BOSS analysis uses ∆k = 0.01 h Mpc −1 [45], we employ a bandwidth of ∆k = 0.005 h Mpc −1 , which is close to the fundamental mode of BOSS, to maximize the feature frequency range accessible in this dataset. This limits our analysis to ω lin ≤ ω Ny ≈ 929 Mpc, but we conservatively take ω lin ≤ 900 Mpc.
We analyze this dataset by producing a Markov Chain Monte Carlo (MCMC) with a modified version of emcee [59] which includes the Gelman & Rubin convergence criterion [60] with scale parameter < 0.04. Since the inflationary signal under consideration is isotropic, we focus on the power spectrum monopole and perform our analysis with one isotropic BAO parameter α per redshift bin. Moreover, we treat the low-and high-redshift bins of BOSS DR12 independently. Since we use separate broadband marginalization parameters for the NGC and the SGC, we fit for a total of 18 free parameters per redshift bin: We impose flat priors on all parameters, including the feature frequencies which are sampled within [100,900] Mpc and [10,80] for linearly-and logarithmically-spaced oscillations, respectively. These ranges are motivated by our discussion of small-and large-scale nonlinearities in Section 2 (see also Fig. 3). Given that the primordial feature parameters ω X , A sin X and A cos X are independent of the redshift bin, we combine them when inferring bounds on these models while marginalizing over the other parameters (see Appendix C).

Forecasting Methodology for BOSS
To estimate the expected level of sensitivity, validate and cross-check the described analysis of BOSS data, we perform two types of forecasts: based on the Fisher information matrix and based on the likelihood itself. The Fisher forecasts have to be used with care, but provide useful guidelines over a large range of possible parameters and experimental configurations since they  Including an approximation of the window function (dark colors; see Appendix B for details) further smooths the primordial wiggles. In particular at the Nyquist frequency, which is given by ω Ny = 929 Mpc for the BOSS DR12 dataset used in this work, the feature oscillations are completely aliased with only the featureless BAO spectrum remaining. The light gray lines indicate the estimated noise curves for the high-redshift bin of BOSS including the nonlinear exponential damping which differs between the BOSS data (solid) and mock catalogs (dashed). Finally, we note that the comparison of the left-and right-hand panels together with these noise curves allows for another way to estimate the reliable frequency range for the logarithmic features that is complementary to Fig. 3.
are relatively fast to compute. On the other hand, the likelihood-based forecasts present a more direct picture of the sensitivity and, in particular, allow to study the effects of noisy data and injected signals. In addition, we checked a number of theoretical approximations using Fisher forecasts (see Appendix B).
Our general forecasting methodology and modeling is based on [33], with some modifications that are detailed in Appendix B. Since we are only interested in oscillatory features (and not broadband effects), our forecasts directly employ the relative wiggle spectrum O g (k) ≡ P w g (k)/P nw g (k) and not the galaxy power spectrum P g (k) as the observable. After marginalizing over the bias B and the polynomial coefficients a n of (3.2), we expect these two to be identical. Working with O g (k) removes much of the degeneracy with these broadband parameters and makes the forecasts more reliable (in particular for the Fisher matrix). The Fisher information matrix F ij is typically defined as the average curvature of the log-likelihood, logL( θ), around a fiducial point in parameter space spanned by θ. In our BAO forecasts, we will generally use θ ≡ {α, A sin X , A cos X }, with X = lin, log, where α is the standard BAO parameter, and A sin X and A cos X are the respective feature amplitudes. As in [33], we employ a conservative broadband marginalization scheme. Since the inverse Fisher matrix is the covariance matrix for a Gaussian likelihood, the Cramér-Rao bound, σ(θ i ) ≥ (F −1 ) ii provides a lower limit on the marginalized constraints, with equality commonly assumed for Fisher forecasts.
The likelihood-based forecasts are based on the same modeling and have previously been utilized successfully in [33,34]. In this type of forecast, we compute the likelihood function L( θ) on a grid in parameter space, given a specific fiducial ('data') spectrum computed for a fixed set of parameters θ fid . When specified, the fiducial model includes a random realization of the noise to mimic scatter of experimental data points due to shot noise and cosmic variance. In this case, we talk about "noisy forecasts" which will be useful in our estimates of the probability of experimental noise mimicking the presence of oscillatory features. This is in contrast to the "noiseless forecasts", for which the experimental effects are only captured by the covariance matrix, as commonly employed. (We emphasize that the latter forecasts are not cosmic variance limited.) Except where noted otherwise, all of the following BOSS forecasts are based on this likelihood-based approach.
Finally, we note that we produce two sets of forecasts as in [34]: one for the comparison to the mock catalogs and another to compare to the results from the actual BOSS data. This is due to the fact that the mock catalogs have a known problem of overdamping the BAO spectrum which results in an approximately 30% weaker signal for the traditional BAO analysis [45]. When comparing to mocks, we use a (post-reconstruction) nonlinear damping scale of Σ nl ≈ 7 h Mpc −1 , while we employ the standard (redshift-dependent) values otherwise (see [33]). The gray noise curves in Fig. 4, which include the nonlinear damping terms, indicate this difference and we can anticipate that the bounds on the feature amplitudes will be stronger on the data than in the mocks.

Validation on Mock Catalogs and in Forecasts
To validate our analysis pipeline, we first perform an analysis on the MultiDark Patchy mock catalogs [61], which mimic the galaxy clustering behavior observed in BOSS. These mock data have been produced using approximate gravity solvers and analytical-statistical biasing models. The catalogs have been calibrated to an N -body-based reference sample extracted from one of the BigMultiDark simulations [62], which was performed using gadget-2 [63] with 3840 3 particles in a volume of (2.5 h −1 Gpc) 3 assuming a ΛCDM cosmology with Ω m = 0.307115, Ω b = 0.048206, σ 8 = 0.8288, n s = 0.9611 and H 0 = 67.77 km s −1 Mpc −1 . The mock catalogs use halo abundance matching to reproduce the observed BOSS two-and three-point clustering measurements [64]. This technique is applied as a function of redshift to reproduce the BOSS DR12 redshift evolution.
Note that these are the same mock catalogs that we use to derive the covariance matrix of our analysis as mentioned above. In addition, we remark that the mock catalogs do not contain any inflationary features since they assume a featureless primordial power spectrum (see Appendix B for a check with injected signals).
We apply the described MCMC analysis pipeline to 100 NGC and SGC Patchy mock catalogs for the low-and high-redshift bins. Since nearby feature frequencies are correlated due to the finite range of wavenumbers used in the analysis, (0.01 − 0.3) h Mpc −1 , we bin the samples of the resulting Markov chains with widths of ∆ω lin = 10 Mpc and ∆ω log = 1.0. These values have been obtained from the Markov chains by estimating a scale-independent correlation length of the feature frequencies ω X . For a number of different purposes, we condense the Markov chains into the following three statistical quantities shown in Fig. 5: the mean valueθ i , the variance σ 2 (θ i ) and their ratio, the significanceθ i /σ(θ i ), for the parameters θ = {α, A sin X , A cos X } as a function of the frequency ω X . Apart from providing a validation of important parts of our analysis pipeline, the comparison of these quantities to those obtained in forecasts serves as a check of both the mock analysis leading to results within the expected sensitivity and the forecasts being suitable to compare to the data as well as to perform additional checks. Moreover, the significance provides a metric that helps to quantify any possible detections of features in the data analysis, for instance.
The results from the analysis of the low-redshift mocks are presented in Fig. 5, with the results from the high-redshift bin being similar. The middle panels of the mean values clearly show that our pipeline results in an unbiased estimation of both the BAO parameter α and the feature amplitudes given that the mocks are generated from a featureless primordial power spectrum. For linear features, the larger variance and non-zero mean values around ω lin = 150 Mpc indicate the expected degeneracy between the primordial features and the standard BAO spectrum with a sound horizon of r s ≈ 150 Mpc. (Note that the degeneracy is not perfect because the BAO spectrum is not a perfect sine oscillation, but contains a k-dependent amplitude and phase shift.) As expected, we also observe that the BAO parameter α is independent of the primordial parameters away from the scale of the sound horizon reproducing the constraints in the standard BOSS analysis [45]. The fact that the constraints on the feature amplitudes become (slightly) weaker with growing frequency can be attributed to the finite-survey effects of bandpowers and window function discussed above (cf. Appendix B). Finally, the right column shows that one typically finds a 2σ fluctuation in some frequency bins for any given mock catalog, or for a given frequency bin for some of the 100 mocks. This should not be surprising given the roughly 80 sampled (but partly correlated) frequencies. Furthermore, the fact that we do not find many > 3σ fluctuations is consistent with the statistical expectations.
While the fluctuations seen in individual mocks are consistent with the variance inferred from the posterior, we would also like to know if this variance is consistent with the expectations for this type of survey. For this purpose, we turn to the likelihood-based forecasts whose results are shown in direct comparison in Fig. 5. We see that the mean values of σ(θ i ) andθ i are in excellent agreement with the noiseless forecasts across the entire mock catalogs. Furthermore, when a specific realization of noise is added to these forecasts, one finds the fluctuations in both the mean and the variance are consistent with the fluctuations observed in the mocks. (As expected, the mean of the noisy forecasts approaches the noiseless forecasts in the limit of many We present the standard deviation σ(θ i ), meanθ i and significanceθ i /σ(θ i ) for the BAO parameter α and the amplitudes A sin X and A cos X as a function of the frequency ω X . We display these quantities for the individual mock catalogs together with their mean, showing very good agreement with the noisy and noiseless forecasts.
realizations.) This therefore further establishes that the observed fluctuations in the significances are entirely generated by and consistent with the experimental noise of cosmic variance and shot noise of BOSS. In other words, these fluctuations occur because we fit random noise. Together with the extensive checks using forecasts presented in Appendix B, this establishes our feature search in the clustering of galaxies and we can now turn to the BOSS DR12 data.

First Large-Scale Structure Constraints
In this section, we discuss the constraints on primordial feature models that we infer from the BOSS DR12 galaxy power spectrum. Furthermore, we compare and combine these novel bounds with those obtained from current Planck CMB data. We conclude this section with estimates of the future sensitivity of cosmological observations of the CMB and LSS.

Limits on Features from BOSS DR12
We now apply our analysis and forecasting pipeline to the BOSS DR12 dataset. Figure 6 shows the posterior distributions derived from the Monte Carlo Markov chains of the low-redshift bin in terms of the same characterizing statistical quantities as in Fig. 5 for the mock catalogs. When comparing these results inferred from the data chains with likelihood-based forecasts, we again find very good agreement for the low-redshift bin and similar results for the high-redshift bin. We reiterate that these forecasts differ from those in Fig. 5 especially in the value of the nonlinear damping scale. Since the smaller damping scale in the data leads to a larger signal-to-noise and considerably extends the range of wavenumbers contributing to the feature search, we observe a smaller variance, less scatter in the mean values and a smaller (but statistically consistent) number of > 2σ fluctuations than in the mocks. The fact that the inferred significances in the third column of Fig. 6 agree well with those found in the noisy forecasts indicates that we do not have any significant detection of a feature, but rather that the data analysis is consistent with fitting experimental noise. We note that the oscillations inĀ Y log , Y = sin, cos, that are visible towards smaller ω log in the noiseless forecasts, arise due to interference of the logarithmic feature spectrum with the BAO spectrum in the range k ∼ (0.1 − 0.2) h Mpc −1 . The noisy forecasts show however that this does not impact our BOSS analysis.
Having established the reliability and robustness of our data analysis in the amplitude parameterization of (A sin X , A cos X ), we want to infer the constraints on the overall feature amplitude A X while marginalizing over the phase ϕ X . Since we do not find any significant detections (see also Appendix C), we are mainly interested in deriving limits on the presence of primordial features which is why we take ϕ X ∈ [0, 2π) and the amplitude to be positive semi-definite: In this way, we can directly infer the upper limits on A X at 95% c.l. from the Markov chains of the low-and high-redshift bin, respectively. To derive constraints from the entire BOSS DR12 data, we combine the two sets of Markov chains by multiplying the binned posterior distributions. In this process, we neglect a possible correlation between the BAO parameter α and the feature amplitudes A sin X and A cos X . As previously noted, this correlation is however small away from feature frequencies around the BAO scale and taking this correlation into account would only strengthen the inferred bounds. We refer to Appendix C for an extensive discussion and further details. We present the resulting first constraints on linear and logarithmic features from large-scale structure data alone, i.e. without the inclusion of any other external datasets or information, in Fig. 7. Our analysis limits the amplitude of these primordial feature models, A lin and A log , to be less than one to two percent of the primordial scalar amplitude A s in the range of feature frequencies accessible with BOSS. Moreover, we do not find any significant detections of features as expected from Fig. 6 (see also Appendix C).

Comparison with Planck CMB Bounds
While we present the first limits on feature models from LSS alone, constraints have been inferred from CMB observations for more than a decade (cf. e.g. [1,[65][66][67][68][69][70][71][72][73][74][75][76][77]). It is therefore interesting to compare the inferred constraints. While the frequency coverage is wider in the CMB, our LSS-only bounds interestingly improve the limits inferred from current CMB data by up to a factor of 2.3 and 3.1 for ω lin 200 Mpc and ω log 20, respectively. This is illustrated in Fig. 8, which directly compares the constraints on the feature amplitudes from our BOSS analysis with those deduced from current CMB temperature (TT), and temperature and polarization (TTTEEE) data released by the Planck collaboration in 2015 [78,79] (see Appendix D for details on these limits). 11 Since the common focus of previous analyses was on the best-fit points or the likelihood improvement, we note that the limits on the feature amplitudes from the CMB have not been shown as a function of frequency before.
The improvements of our LSS bounds over those from Planck are primarily the consequence of two effects. First, the number of signal-dominated modes over the employed range of wavenumbers in BOSS and Planck are roughly comparable (approximately k 3 max V and 2 max , respectively). 11 We show the constraints from both TT and TTTEEE since the Planck collaboration had labeled the results employing high-multipole polarization data as preliminary in 2015. Having said that, the available information on feature models released by the collaboration seem to have remained fairly stable between their 2015 and 2018 releases [1,77]. We will therefore use the polarization data when deriving the joint constraints below.  Second, the imprint of high-frequency oscillations in the CMB power spectra is suppressed relative to that in the matter power spectrum, as shown in Fig. 19 of Appendix D. In combination, the signal-to-noise of a high-frequency feature is somewhat larger in BOSS than in Planck which leads to a more stringent constraint.
Finally, we can infer the best current limits on primordial linearly-and logarithmicallyoscillating feature models by combining the BOSS and Planck data. 12 These joint constraints are derived in Appendix D and shown in Fig. 9. As expected, we observe that these bounds are dominated by and, therefore, closely follow our limits from galaxy clustering data of BOSS except at smaller frequencies. Generally speaking, the bounds on features in the discussed range of frequencies ω X are now established at the one-percent level relative to the primordial power spectrum.

Future LSS and CMB Constraints
With the discussed improvements in the constraints on primordial features inferred from BOSS over those derived from Planck CMB data, it is timely to ask how these bounds will evolve with future CMB and LSS surveys. To this end, we performed Fisher matrix forecasts for upcoming, planned and futuristic experiments. We extend previous LSS forecasts (cf. e.g. [83][84][85][86][87][88][89]) in a number of ways, in particular by taking the effects of nonlinearities, bandpowers and window functions into account, and (conservatively) marginalizing over further uncertainties in the broadband power spectrum. Furthermore, we compare the reach of LSS surveys to that of future CMB missions. In this section, we focus on linear features since most other features can be easily decomposed into a  basis of linear oscillations. Before discussing the results of these forecasts, we briefly summarize our approach and refer to Appendices B and D for further details.
For our future LSS forecasts, we use the relative wiggle spectrum O g (k) ≡ P w g (k)/P nw g (k) as the observable up to k max = 0.5 h Mpc −1 based on [33] as outlined in Appendix B, including the effects of nonlinearities, bandpowers and window function. 13 To estimate the sensitivity of the CMB, we directly follow the methodology of [33], employing perfectly delensed temperature and polarization power spectra. The fiducial point in both cases is a featureless ΛCDM cosmology consistent with the Planck measurements [90,91]. After computing the Fisher matrices in the amplitude parameterization, we obtain the forecasted 95% upper limits on A lin by randomly sampling from the associated Gaussian distributions and applying the same procedure as in our BOSS analysis (see Appendices B and C).
The resulting forecasted sensitivity of several LSS and CMB experiments is illustrated in Fig. 10. Apart from BOSS and Planck, we included the planned surveys DESI [92], Euclid [93] and CMB-S3 as an umbrella for the multiple upcoming CMB missions [94][95][96]. In addition, we show the potential reach of CMB-S4 [97] (or, similarly, PICO [98]) and a 'Future' LSS experiment which is assumed to map about 10 8 objects up to redshift z = 3 over half of the sky. To get a sense for the theoretically possible limits, we also forecast a half-sky, cosmic variance-limited LSS survey with z ≤ 6 and k max = 0.75 h Mpc −1 ('LSS-CVL'), 14 and a CMB experiments that measures the temperature and polarization spectra to the cosmic variance limit up to multipoles of T max = 3000 and P max = 5000 on 75% of the sky ('CMB-CVL'). The possibly most notable aspect of these forecasts is that the coming generation of surveys, in particular DESI and Euclid, are projected to be more sensitive than a cosmic-variance limited CMB experiment over a substantial range of frequencies. In this precise sense, large-scale structure 13 In this way, we also find that our choice of kmax = 0.3 h Mpc −1 in the described analysis captures essentially all the information on features available in the BOSS DR12 dataset.
14 A similar performance could in principle be achieved by a 21 cm intensity mapping survey [99,100]. Large-scale structure surveys have the potential to improve over the CMB by more than one order of magnitude, while the CMB will always dominate the reach in feature frequency. As we discussed in Section 2, the LSS forecasts for ω lin 100 Mpc should be treated with care since these frequencies might be affected by the effects of nonlinear gravitational evolution and be generally more sensitive to the details of signal modeling.
will permanently surpass the CMB in sensitivity. Equally significant is the potential for future LSS observations to increase their constraining power on feature models. The new BOSS limit presented in this work and the forecasts for LSS-CVL leave approximately two orders of magnitude that could be achievable with a suitably designed survey. As mentioned previously, it will however be necessary to revisit some of the aspects of the analysis that we employed on BOSS data to credibly achieve such sensitivities. The improvements seen in future surveys come primarily from two factors: smaller shot noise and higher redshifts. The constraining power of a survey is dominated by the number of signal-dominated k-modes. Most of these modes are at large wavenumbers, but are limited by the shot noise of the survey. The significant increase in the number density of objects available in upcoming surveys substantially increases the number of modes and drives the improvements in sensitivity. In addition, the larger redshift range of these observations means that the nonlinear damping is reduced, increasing the size of the signal at higher wavenumbers. Furthermore, future surveys will also benefit from larger survey volumes which can be seen clearly in the larger range of feature frequencies ω lin that are accessible. This is because increasing the volume allows for finer k-bins, which results in a larger Nyquist frequency ω Ny .
In summary, LSS bounds on features are currently competitive with and will surpass those from the CMB (present and future) over an increasing range of frequencies. Large-scale structure observations have a significantly larger sensitivity over their available frequency range due to the large number of modes. Furthermore, the transfer of primordial power to the matter power spectrum is more efficient than for the CMB, which leads to a larger intrinsic signal (see Appendix D for a more detailed discussion). On the other hand, the CMB can cover a wider range of frequencies than will be accessible even with futuristic LSS surveys.

Conclusions
In this paper, we explored the impact of large-scale structure data on the search for primordial features in the power spectrum. We showed theoretically that such analyses are promising since they are not limited by the small-scale nonlinearities of structure formation and the exponential damping caused by large-scale bulk flows can be reliably computed (as we explicitly did at leading order for both linear and logarithmic features). We then applied these results to BOSS DR12 data and found constraints comparable (but somewhat stronger) to the best limits from Planck. The joint bounds on these models are therefore dominated by the galaxy clustering data. Moreover, we forecast that near-term surveys improve on this result by up to an order of magnitude and could out-perform a cosmic variance-limited CMB experiment over a substantial range of feature frequencies.
Large-scale structure surveys offer great promise for dramatically improving our understanding of the very early universe. However, to date, these hopes have been largely limited by the modeling uncertainties around the nonlinear scale. In this work, we have however shown that, for the right observables, the statistical power of current surveys is already sufficient to significantly impact our understanding of inflation and beyond.
While our emphasis was on primordial features, in particular from an inflationary origin, both the method and the results have significantly broader implications. Any sufficiently sharp feature in the matter power spectrum could be analyzed in this way and could even be decomposed in a basis of linear oscillations. We expect that constraints from LSS will be competitive with those derived from the CMB, provided that the signal appears directly in the (dark) matter and is not suppressed by the baryon fraction.
Finally, the statistical power of this approach is not limited to the power spectrum and ultimately could be extended to higher-point statistics. Primordial features are known to have associated non-Gaussian signatures (see e.g. [101][102][103][104][105][106]) which should similarly be robust to the complications presented by nonlinear evolution. This presents the unique opportunity to not only perform joint CMB power and bispectrum analyses [1,76,107,108], but to also include the respective LSS observables. Furthermore, the three-dimensionality of galaxy surveys may allow for entirely new types of analyses that exploit the full angular dependence of higher-point correlation functions. The universe has given us the unprecedented power of large-scale structure to answer the most basic questions of our cosmic origins. We contributed a small step towards this ultimate goal.

A Nonlinear Damping of Logarithmic Features
The effect of gravitational nonlinearities on the BAO signal has been considered in various ways and can easily be extended to linear features as we discuss in Section 2. The effects of large-scale gravitational nonlinearities on logarithmically-spaced oscillations has however not been considered previously. 15 In this appendix, we provide additional details on the computation of the resulting damping of these features, complementing the discussion in §2.3. As in the main text, we first detail the perturbative treatment and then resum the infrared contributions to all orders in perturbation theory.

A.1 Perturbative Treatment
We have found in (2.16) that the effect of long modes on a generic wiggle power spectrum P w (k) at one-loop order implies the action of the derivative operator cosh q · ∇ k on P w (k). For logarithmic features, we need to consider P w log (k) = P nw (k) A log sin [ω log log(κ) + ϕ log ], with κ = k/k . As in the case of linearly-spaced oscillations (2.17), we neglect small corrections that arise from applying the derivative operators to the smooth envelope P nw (k). Moreover, in order to avoid clutter, we set ϕ log = 0 in the following, but note that it is straightforward to include the phase in the calculation. The (2n) th derivative of the oscillatory part, sin [ω log log(κ)], is given by where we employed sin(x log y) = 1 2i e ix log y − e −ix log y = 1 2i We can then perform the sum over n and get cosh( q · ∇ k ) − 1 P w log (k) = cos ω log log 1 − q µ k − 1 P w log (k) where we defined µ ≡k ·q. It is also useful to consider since we need this expression in the following calculation of the IR-resummed damping. 15 We thank the authors of [43], who independently performed this calculation, for sharing a draft with us.

A.2 Infrared Resummation
In §2.3.1, we showed that P w 1-loop ≈ O(1)P w tree-level , which suggests that all higher-order terms might be equally important corrections to the linear wiggle power spectrum of logarithmic features. In order to resum these infrared contributions, we need to evaluate all the higher-loop diagrams of (2.23). In contrast to linear features, there is no straightforward way to write down the L-loop contribution based on the 1-loop result, which is why we proceed by induction. Let us start with the two-loop contribution, L = 2, to get some intuition: where we used (A.3) and (A.4). Here, we should note that the operator D q D − q does not act on Σ 2 log (k) orΣ 2 log (k) since it only acts on the wiggle power spectra [38]. The three-loop term can be derived along the same lines to be Considering the one-, two-and three-loop contributions, it becomes apparent that the structure at L th order is given by The IR-resummed wiggle power spectrum of (2.28) is then obtained by resumming all the loops.

B Large-Scale Structure Forecasts
We employ a suite of likelihood-and Fisher-based forecasts in particular to validate and cross-check our analysis pipeline, and investigate the potential reach of future surveys. In this appendix, we collect further details regarding these LSS forecasts ( §B.1) and collect the utilized experimental specifications ( §B.2). Furthermore, we provide additional checks of our feature search ( §B.3) as well as supplementary information for the forecasts of future experiments ( §B.4).

B.1 Forecasting with the Wiggle Spectrum
As previously stated, the wiggle spectrum is the main observable in our forecasting pipeline, which was developed in [33] for the standard BAO spectrum. In the following, we summarize its main aspects and introduce further advances which especially include the use of bandpowers and the convolution with a window function. These components are not required in a wide range of applications, such as light relics, but are important to reliably predict the sensitivity to (highly-)oscillating features.
We use two types of forecasts in this work, which are either based on the Fisher information matrix F ij or on the likelihood function L itself. The former are computationally efficient and are therefore very useful in particular to cover a large space of parameters and experimental specifications. However, they only allow to access the standard deviation around a fixed fiducial point assuming smooth noise and have to also be taken with care given the involved approximations. We therefore only employ these forecasts to estimate the sensitivity of future surveys and for a limited number of tests. The likelihood-based forecasts come with a larger computational cost, but are much more versatile. For instance, we can not only obtain the standard deviations, but can also extract the mean values which allows to estimate significances and provides more direct comparisons with MCMC analyses. In addition, it is possible to inject random noise realizations and/or artificial feature signals. For these reasons, the majority of forecasts in this work are of the latter type. In the following, we first discuss the Fisher methodology, since it is commonly employed, and especially highlight modifications to the standard approach. We then build on this pipeline and introduce the likelihood-based forecasts.

B.1.1 Fisher Matrix Forecasts
Focusing on the oscillatory part of the power spectrum, the Fisher matrix of a galaxy survey with multiple (independent) redshift bins z can generally be approximated by [33] 16 where µ is the cosine between the wavevector k and the line-of-sight, O z (k, µ) = P w z (k, µ)/P nw z (k, µ) is the (linear) relative anisotropic wiggle spectrum, D z (k, µ) is the nonlinear damping function and V eff (k, µ) is the effective volume. In a featureless universe, the wiggle spectrum is simply the BAO spectrum, while it may also contain a feature signal in our case. Since we assume isotropic clustering, 17 this quantity is given by with the linear BAO spectrum O(k; z) = P w BAO (k; z)/P nw (k; z) being evaluated at the rescaled wavenumber k/q = D fid V (z)/D V (z) k. This rescaling with the radial BAO dilation D V ∝ (D 2 A /H) 1/3 is necessary because the wavenumbers k are derived from the measured angles and redshifts in a survey using the angular diameter distance D fid A (z) and Hubble rate H fid (z) in a fiducial cosmology. 18 Moreover, we introduced the free functions B z (k) and A z (k) which are taken to be smooth polynomials in k and distinct in each redshift bin, m b m,z k 2m and m a n,z k n , with m = 0, . . . , 3 and n = 0, . . . , 4. By marginalizing over these functions with fiducial values a n,z = 0, b 0,z = 1 and b m =0,z = 0 in our forecasts, we effectively discard any information in the observable that might be affected by nonlinearities, biasing or observational systematics so that we only use a robust signal of the primordial features and the standard BAO imprint. Finally, the nonlinear damping and effective volume are implemented as where we assumed a constant nonlinear damping scale, Σ nl (k, µ; z) ≈ Σ nl (z), and position independence of the comoving number density of galaxies, n g ( r ) ≈n g = const, in each redshift bin. Furthermore, the survey volume in a given redshift bin with spherical geometry is denoted by V z and the fiducial galaxy power spectrum by P g (k) which in particular includes the linear galaxy bias. We note that we implicitly assumed in (B.1) that the feature spectrum is nonlinearly damped in the same way as the BAO spectrum (cf. §2.3). (We reiterate that this is a brief summary and all details can be found in [33], including the modeling of the galaxy power spectrum, the nonlinear damping scale and the effects of reconstruction.) We have already written the Fisher matrix (B.1) as a sum over discrete wavenumbers since the finite size of a galaxy survey introduces both a minimum accessible wavenumber 19 k min and a minimum binning width in Fourier space given by the fundamental mode, ∆k ≥ k min . For many current applications, the width ∆k has become small enough so that the power spectrum P (k) is smooth in a given band [k i − ∆k/2, k i + ∆k/2] and we can approximate it by bandpowers P i ≈ P (k i ). However, highly-oscillating primordial features introduce a significant variation within any such band so that we have to compute the finite-sized bandpowers according to In other words, we take the limit of spherically-averaged clustering measurement. This is motivated by the fact that the primordial information that we are interested in is strictly isotropic and most of the information in BOSS is contained in the monopole power spectrum. 18 In contrast to the data analysis, we do not additionally rescale by the fiducial ratio of the sound horizon in our forecasts since we recompute the BAO spectrum O(k) for different cosmologies. 19 The minimum wavenumber, or fundamental mode, which is available in a survey with a spherical geometry is, in principle, given by the survey volume V according to kmin = 2π[3V /(4π)] −1/3 .
The bandpower-averaged wiggle spectrum, which contains both the BAO and the feature spectra, is then given by O i = (P i − P nw i ) /P nw i , for instance. To illustrate the effect of this averaging procedure (see also Fig. 4), the bandpass-filtered primordial power spectrum (2.2) with linear features is given by P lin ζ,i ≈ P ζ,0 (k i ) 1 + sinc(ω lin ∆k/2) δP lin where sinc(x) = sin(x)/x and we assumed P ζ,0 (k) ≈ P ζ,0 (k i ) for k ∈ [k i − ∆k/2, k i + ∆k/2]. This implies that the oscillatory features are suppressed unless ω lin ∆k 2, or ω lin 2/∆k ≈ 600 Mpc for ∆k = 0.005 h Mpc −1 . For logarithmic features, we could decompose the oscillations into linear features in a given band and arrive at an analogous conclusion.
The second effect of a finite survey volume that we have to take into account is the convolution of the power spectrum with the window function. This is of course directly related to the bandpass filtering in reality although we separate them here for convenience. Whereas the former averages the power spectrum over the wavevectors k in a given band, the window function introduces a coupling between otherwise independent wavenumbers. For an all-sky survey with redshift range [z − , z + ] and effective redshiftz, the spherical top-hat window function is are the comoving distances to the edges of the survey (or, equivalently, redshift bin) and Θ(x) the Heaviside step function. In practice, we however do not have access to the full sky, but only to a fraction f sky < 1. For the purpose of our forecasts, we therefore include an incomplete sky by restricting the integration over the azimuthal angle φ: In this case, the Fourier transform of the window function is radial, 20 with the spherical Bessel function of the first kind j n (x). By restricting the power spectrum to finite-sized bandpowers P i and using the fact that the window function (B.8) is radial, we can rewrite the convolved power spectrum, which is generally given by in terms of a matrix equation: which we can evaluate numerically for all k i , k j of an LSS survey. As in the case of bandpowers, we again decompose the convolved spectrum P c i in its smooth and oscillatory components according to (2.6). To summarize, the main extensions to the Fisher forecasting methodology of [33] based on the wiggle spectrum are given in (B.4) and (B.10).

B.1.2 Likelihood-Based Forecasts
We also implemented forecasts based on the likelihood function L( θ) itself, as previously reported in [33,34]. While the modeling of the observables and covariances is the same as in the Fisher analyses, we directly evaluate the likelihood function L( θ) on a grid in the parameter space of θ = (α z , A sin X , A cos X ) as follows: Here, we used the theoretical ('model') wiggle spectrum O z (k i ; θ), the fiducial ('data') spec-trumÕ z (k i ) and the inverse covariance C −1 z (k i ) of the respective experiment. The latter is computed as in the Fisher matrix (B.1) and includes the (white) instrumental noise contribution, cosmic variance and the exponential nonlinear damping. We note that all spectra are generally bandpass-filtered and convolved with the window function as discussed above, , which we have however omitted in (B.11) for ease of notation. The model spectrum O z varies over the considered parameter space and is defined as where O fid is the linear BAO spectrum of the fiducial cosmology, α z = α(z) is the isotropic BAO parameter, and A z (k) = n i=0 a i,z k i and B z (k) = m j=0 b j,z k 2j are the same polynomial 'broadband' polynomials as above, where six terms with m = n = 2 turn out to be sufficient. We marginalize over these terms by minimizing χ 2 of (B.11) for these parameters, i.e. χ 2 ( θ) = min an,bm χ 2 ( θ; a n , b m ) .
The data spectrumÕ z is computed by evaluating (B.12) for a fiducial set of parameters θ fid (which can include non-zero feature amplitudes), with B z (k) = 1 and A z (k) = 0. In addition to the smooth data with the experimental uncertainties being simply captured by the covariance matrix, we also perform forecasts with 'noisy data'. In this case, we obtain the data spectrum by randomly picking the value ofÕ i from a one-dimensional Gaussian distribution function with meanÕ i and variance C(k i ). This therefore simulates the scatter of the actual measurement due to the expected noise of an experiment (including sample variance) as captured by the covariance matrix. We can include this in our forecasts in order to estimate how likely it might be that features are found in the noise instead of the data or, in other words, that the noise mimics the presence of oscillatory features. In the main text, this constitutes an important check of the mock and data analyses, and provides an estimate of the actual significance of possible feature signals.
Having computed the likelihood function L( θ) over all of parameter space in which it is non-negligible, we then infer the predicted posterior distribution p(θ l ) of a parameter θ l by marginalizing over all other parameters θ m =l . Since the one-dimensional posteriors for α, A sin X and A cos X are very close to Gaussian, we finally obtain the meanθ l and standard deviation σ(θ l ) through a Gaussian fit to p(θ l ).

B.2 Experimental Specifications
We do not only build on the signal modeling of [33], but also its characterization of the LSS surveys (which was derived from [119]). In general, we can characterize a cosmological galaxy survey by   [45] as detailed in [33]) with a sky area of Ω = 10 252 deg 2 resulting in roughly 1.2 × 10 6 objects in a volume of about 6.4 h −3 Gpc 3 . We separately list the characteristic quantities employed when comparing to (a) the BOSS DR12 data and (b) the corresponding mock catalogs since they differ in the linear bias b and the (post-reconstruction-equivalent) nonlinear damping scale Σ nl as discussed in the main text.  Table 2: Basic specifications for DESI (derived from [92] as explained in [33]), covering a sky area Ω = 14 000 deg 2 and resulting in roughly 2.7 × 10 7 objects in a volume of about 59 h −3 Gpc 3 .
the following quantities: redshift range, sky coverage, linear galaxy bias b per redshift bin and number (density) of objects N g (n g ) in each redshift bin. Here, we neglect the redshift error in spectroscopic surveys since it is usually small compared to the damping scales, but would need to take it into account for photometric observations. For planned experiments, such as DESI and Euclid, we use specific values (see Appendix B of [33]), with Tables 1 and 2 updating the employed parameterizations of BOSS and DESI. For more distant surveys, we assume a constant number densityn g for a given total number of objects N g and a linear bias of b(z = 0) = 1. Our 'Future' LSS survey contains N g = 10 8 objects distributed over half the sky up to redshift z max = 3. The experiment referred to as 'LSS-CVL' is cosmic variance limited on all employed scales and is designed to survey half of the sky for z ≤ 6. In our forecasts for BOSS, we generally take the maximum wavenumber to be k max = 0.3 h Mpc −1 to coincide with the choice in the data analysis.
All other (Fisher) forecasts use k max = 0.5 h Mpc −1 , except for 'LSS-CVL' for which we choose k max = 0.75 h Mpc −1 since further extending the range of wavenumbers would likely yield only minor improvements in sensitivity due to the exponential damping.

B.3 Additional Tests of the Pipeline
Given the described forecasting pipeline, we can provide additional insights into our primordial feature search and discuss some of the tests that we performed. In the following, we study the impact of the approximations in the theoretical damping calculation on the BOSS constraints, revisit the impact of the finite-volume effects and, in particular, test whether injected feature signals can be detected in the analysis.

B.3.1 Check of Damping Assumptions
When computing the nonlinear damping of the linear and logarithmic oscillations from large-scale bulk flows in §2.3, we made a number of simplifying approximations which allowed us to use a single damping scale, the standard BAO damping scale Σ BAO , in our data analysis. We can explicitly check the validity of these approximations in Fisher forecasts that generalize (B.1) to include the full resummed expressions for the linear and logarithmic spectra of (2.26) and (2.28), and compare with the approximate formulas of (2.27) and (2.31), respectively.
In order to perform this test, we need to numerically evaluate the three damping scales of (2.19), (2.21) and (2.22), while choosing an appropriate value of the cutoff scale Λ which separates long modes q from other wavenumbers. The crucial point of the approximations is the fact that all the computations are strictly valid in the regime of q/k 1, i.e. a separation of long and short modes. The cutoff Λ therefore needs to be smaller than the wavenumbers k of interest. At the same time, however, all long modes within the support of the feature also experience a damping effect. This is the reason why it is sensible to take Λ = k for some 1 (we employ = 0.5). 21 This choice leads to all damping scales, including Σ BAO , to be effectively k-dependent, Σ X → Σ X (k). Having said that, it is important to remark once again that any dependence of these quantities on the specific choice of the cutoff indicates that next-to-leading order effects should be taken into account (see e.g. [38] for the case of the standard BAO signal). Since we fit Σ BAO = const in the data analysis (as is standard), we also compute this damping scale for a k-independent cutoff. Motivated by the maximum wavenumber of k max = 0.3 h Mpc −1 , we take Λ = 0.15 h Mpc −1 in this case. Figure 11 shows the effect of the various approximations on the estimated constraints of BOSS. We note that we evaluate the damping scales at redshift z = 0 for simplicity, given that the redshift dependence is the same for all damping terms. This however also means that we effectively exaggerate the employed damping scales and the actual impact on the constraints is even smaller than shown. Even with this conservative choice, we can deduce that all of our assumptions are 21 We note that the logarithmic damping factors Σ log andΣ log are not well defined in the limit Λ → k because the argument of the logarithms in (2.21) and (2.22) approaches zero. This is precisely the limit in which the computation becomes invalid since it is based on the separation of long and short modes. Interestingly, this is not the case for the BAO damping factor ΣBAO, whose value asymptotes for Λ 0.5 h Mpc −1 and can be integrated to Λ → +∞ without significantly affecting the value of ΣBAO, even though the validity of the nonlinear damping calculation breaks down at Λ ∼ k [37].   δσ(A cos log ) Figure 11: Impact of the various approximations to the theoretical damping scales on the BOSS constraints for linear (top) and logarithmic features (bottom). We display the relative difference of the Fisher-forecasted standard deviation, δσ = σ/σ full , for the BAO parameter α and the feature amplitudes A sin X and A cos X , where σ full is obtained using the full theoretical result. In the considered parameter space, the constraints are essentially unaffected byΣ log (k). Here, we used the effective post-reconstruction damping scales inferred at z = 0. valid in the context of the BOSS DR12 dataset. To be more specific, assumingΣ log (k) ≈ 0 has basically no visible impact on the constraints in the displayed parameter space of interest in this work, as expected. Second, approximating Σ lin (k), Σ log (k) ≈ Σ BAO (k) only results in sub-percent variations to the constraints for ω lin away from the BAO scale and ω log 20, and differences at the few-percent level for ω log ∈ [10,20]. Finally, taking Σ BAO to be constant instead of computing it with a k-dependent cutoff penalizes the constraints by roughly 3% for all linear and logarithmic frequencies. This implies that all of the employed approximations are justified in the context of the BOSS DR12 dataset and the upper limits that we infer in Section 3 are in fact conservative. Having said that, the constraints inferred in future surveys will likely benefit from using the theoretically-computed forms of the damping scales Σ lin and Σ log .

B.3.2 Impact of Finite-Volume Effects
Our ability to search for highly-oscillating features is limited by the fact that we have only access to a finite cosmic volume, as we discussed in the main text. Apart from introducing a cutoff at the Nyquist frequency due to aliasing, the impact of finite-sized bandpowers and the window function has to be taken into account. We illustrate the consequences of these effects on the sensitivity of BOSS in Figure 12. While the constraints on the BAO parameter α are essentially unchanged, as expected given the BAO scale of 150 Mpc, we observe a gradual decrease in sensitivity to the feature amplitudes for larger frequencies ω X . In consequence, we would overestimate the . We compare the likelihood-based constraints on the BAO parameter α, and the feature amplitudes A sin X and A cos X when using continuous spectra, P (k), bandpass-filtered spectra, P i , and window function-convolved spectra, P c i .
constraining power of BOSS by up to a factor of two if we neglected the finite volume of the survey.
These results can be easily understood in the context of Fig. 4 which shows the impact of the finite-size effects on the spectra themselves. If we could employ continuous spectra P (k), a given primordial signal would have the same amplitude independent of the feature frequency in the analysis, resulting in the same sensitivity on all parameters (except for the interference with the BAO signal). Since the amplitude effectively decreases for larger ω X when bandpass-filtering the power spectrum [proportional to sinc(ω lin ∆k/2) according to (B.5) for linear features], the constraints gradually weaken and the feature model becomes essentially unconstrained at the Nyquist frequency. Convolving the bandpowers additionally with the window function of the survey couples otherwise independent modes which leads to an additional reduction in the amplitude and, consequently, the sensitivity. Finally, the frequency of the standard BAO signal (or equivalently the survey volume) is large enough so that the BAO spectrum and ultimately the constraints on α are barely affected.

B.3.3 Detection of Injected Signals
Our likelihood-based forecasts also allow us to test whether we would be able to detect a feature signal if it was present in the data. This is an important check of our analysis pipeline that we cannot perform on mock catalogs because their underlying primordial spectrum is featureless. Since the results of the forecasting pipeline are consistent with both the mock and data analysis, we can still reliably perform a search for injected signals.
We performed this test for a wide range of parameters. In Figure 13, we show the representative results for linearly-and logarithmically-spaced oscillations characterized by (α, ω lin , A sin lin , A cos lin ) = (1.01, 500 Mpc, 0.04, 0) and (α, ω log , A sin log , A cos log ) = (0.98, 45, 0, 0.05). These parameters were chosen to produce a roughly 5σ signal for a single redshift bin in the center of our frequency range. Here, we display the standard deviation and mean values inferred from the marginalized likelihood function (and the significance of any signal) of the low-z bin, as in Fig. 6 for the featureless cosmology, but note that the results are as consistent and positive in the high-redshift bin.
For the linear features in the top panel of Fig. 13, we first of all see that the posterior of the BAO parameter α is barely affected by the injected feature signal. In addition, the underlying value of α is correctly recovered within the noise-related scatter. While the standard deviations of the feature amplitudes are hardly affected, their mean values clearly show the characteristic signal around ω lin = 500 Mpc:Ā sin lin andĀ cos lin peak/vanish at the injected value and frequency, and approach zero away from it in an oscillatory fashion. This is due to the fact that features with neighboring frequencies interfere with the signal and can also be fit with different amplitudes since we only probe a limited range of wavenumbers. Having said this, the shape of the signal in the sine and cosine amplitudes clearly picks out the true value. Furthermore, the noise-induced scatter in the mean values is essentially absent around the injected signal, while it is consistent with the featureless case away from it. Given these observations, it is also evident that the significance of the signal is reproduced at the expected value (with some small variations in a given noise realization).
The injected logarithmic signal can be extracted with a similar level of confidence. We again observe the same characteristic behavior of the mean values around the injected feature frequency ω log = 45. Since we employed a primordial cosine instead of sine feature, the roles of A sin log and A cos log is naturally reversed and correctly captured. In contrast to the linear oscillations, however, the standard deviations show additional variations and the mean values exhibit a slightly more pronounced 'ringing' across the ω log -range. Given the noise levels of BOSS, this however does not have a significant impact on the detectability of a primordial signal with a large enough amplitude. For both types of feature models, we find similar results over a wide range of frequencies. As could be expected, it however becomes somewhat harder to extract signals with small values of ω X due to the interference with the standard BAO signal and associated effects. Nevertheless, we should be able to extract even these oscillations from the data due to their overall signature. We can therefore conclude that we should be able to detect any primordially-imprinted oscillatory feature with a large-enough amplitude in our analysis pipeline.

B.4 Forecasts for Future LSS Surveys
We do not only consider currently available data, but we also employ Fisher forecasts in §4.3 to estimate the sensitivity of future LSS surveys to primordial features. Since large classes of feature models can be expressed in a basis of linear oscillations, we focus on the "feature spectrometer". As in the rest of this work, we initially work in the parameter space spanned by the isotropic BAO parameter α and the feature amplitudes A sin lin and A cos lin , fiducially taken to be α = 1 and A sin lin = A cos lin = 0. Since we use a total of nine polynomial broadband parameters (a m≤4,z , b m≤3,z ) and compute the Fisher matrices for a given frequency ω lin , these forecasts contain twelve parameters per redshift bin. Summing the broadband-marginalized Fisher matrices, we obtain  the forecasted standard deviations A sin lin and A cos lin displayed in Fig. 14. Apart from the well-known degeneracy with the BAO scale, we observe that the constraints on A sin lin and A cos lin are basically identical for ω lin 250 Mpc, but oscillate around a common mean value for smaller frequencies. This is as expected and exemplifies again that the sine and cosine feature terms are essentially independent modes for large enough frequencies ω lin .
To turn these constraints into limits on the overall feature amplitude A lin while retaining the correlations between the parameters, we draw random samples from a Gaussian distribution whose covariance matrix is given by the inverse Fisher matrix. Since the amplitude A lin is positive semi-definite, which implies that the mean of A lin can only fluctuate upwards from zero, we also repeatedly take the mean values from Gaussian distributions with zero mean and covariance given by the same inverse Fisher matrix. Finally, we can compute the 95% confidence limits on A lin by similar means as in our BOSS analysis above (see Appendix C). In this way, we obtain the forecasted bounds of Fig. 10. To conclude, we remark that these constraints are likely conservative since we employed the same constant damping scale for both the BAO and the feature spectra (cf. §B.3.1).

C Further Details on the BOSS Analysis
We employ the amplitude parametrization of the feature models in our analysis and forecasting pipelines since the posterior distributions of A sin X and A cos X , X = lin, log are close to Gaussian (unlike the phase ϕ X ). Since the phase of the primordial features is not expected to carry much information about the inflationary epoch (at least in the pre-discovery era), we are ultimately interested in the constraints on the overall feature amplitude A X . In this appendix, we describe our method to combine the two BOSS redshift bins and infer the reported upper limits from the Monte Carlo Markov chains, including some checks ( §C.1). Moreover, we outline how we determine whether the data exhibits any statistically significant detections of features ( §C.2).

C.1 From Posteriors to Upper Limits
The analysis pipeline of §3.1 results in Markov chains that provide samples from the marginalized posterior distribution as a function of A sin X and A cos X in each feature frequency bin. It is useful to consider constraints on the two-dimensional parameter space of these feature amplitudes as constraints in the complex plane. From this perspective, we are interested in computing the upper limits on the absolute value of the complex amplitude, A X = (A sin X ) 2 + (A cos X ) 2 for which there is however no unique procedure. Since the feature phase is not an independent parameter, the upper limit is actually not a single number, but depends on the phase ϕ X . This is important because the maximum posterior point will in general not be at A sin X = A cos X = 0 in the presence of noise. Given that we marginalize over the feature phase, it is important to keep in mind that a uniform prior on A X and ϕ X corresponds to a non-uniform prior in the A sin X -A cos X plane and vice versa.
Our method of compressing the available information considers circles in the A sin X -A cos X plane centered at the origin that enclose a given probability or, equivalently, a fraction of all Monte Carlo samples. We therefore define the upper limit on A X at a given confidence level as the radius of the respective circle. For the separate MCMCs of the low-and high-redshift bins, this means that we compute the amplitude A X for each sample and rank-order the resulting values. The upper limit is then given by the value of A X at the desired confidence limit percentile.
We are however not only interested in the constraints from a single redshift bin, but want to compute joint limits from both BOSS redshift bins (or of LSS and the CMB). Although running a joint MCMC would be the formally correct statistical approach, it would result in the simultaneous variation of 33 parameters (and even more for a joint analysis with the CMB), which would be computationally more complex and expensive. This is why we proceed as follows: 1. We bin the samples of a single MCMC in the A sin X -A cos X plane. This results in a pixelated posterior distribution p i (A sin X , A cos X ).
2. We obtain the joint posterior by multiplying the pixelated posteriors, i p i (A sin X , A cos X ).
3. We measure the probability P (A X ) enclosed in a circle centered at the origin of the A sin X -A cos X plane as a function of its radius and obtain the inverse (interpolated) func-tionÃ X (P ).  Figure 15: Illustration of our method to infer 95% confidence limits on the feature amplitude A X from Markov chain samples of (A sin X , A cos X ). The top panels show the pixelated posteriors for the low-z (left) and the high-z bins (middle), and the joint pixelated BOSS posterior (right) for the frequency bin centered at ω lin = 700 Mpc. In the bottom panel, the pixelated posterior for the respective Planck TTTEEE samples (middle) and the joint posterior for BOSS and Planck (right) are displayed. The red contours enclose the pixelated 95% confidence region around the maximum posterior point. The solid (orange) and dashed (green) circles enclose 95% of the total probability around the origin A X = 0 (marked by the white dot) as obtained from the pixelated posterior and an ordered list, respectively. The agreement between these circles demonstrates that the error introduced by pixelization is negligible. For BOSS and the joint BOSS+Planck constraint, the dashed (purple) circle shows the constraint when combining the separate low-z, high-z and CMB confidence limits by adding inverse variances, which demonstrates that the non-Gaussianity of the likelihood has a non-negligible effect on the inferred upper limit.
In this way, the 95% confidence limit is then given byÃ X (P = 0.95), for instance. We illustrate this approach in Fig. 15 for one feature frequency bin. This figure shows that the phase-independent limits are necessarily less constraining than those centered at the maximum posterior value since they also enclose low-likelihood regions away from the maximum. Having said that, the described method allows us to correctly infer the quantity that we are interest in, the maximum value of the feature amplitude A X that is allowed by the data for any phase ϕ X . The comparison of the two circles for the joint posteriors also demonstrates that compressing the confidence limits into a single upper limit for a given single dataset and subsequently combining them by summing the inverse variances would result in a significant error on the inferred upper limits from joint probes.
Having outlined our procedure, a few comments are in order. As a consequence of working in the two-dimensional plane spanned by A sin X and A cos X , we assumed that the feature amplitudes are  Figure 16: Convergence test of the BOSS analyses for linear (left) and logarithmic features (right). The constraints inferred from splitting the Markov chains into two independent halves (light colors) are compared to those derived from all Markov chains (dark color). Note that the former bounds are barely visible under the latter due to the high level of convergence.
completely uncorrelated with any of the other parameters, in particular the BAO parameter α. We explicitly confirmed this assumption by computing the three-dimensional (Gaussian) covariance matrix in each frequency bin to estimate the correlation coefficient ρ in forecasts and on data.
For linear (logarithmic) features in BOSS, we find that |ρ| is consistent with zero, but approaches significant values (up to about 0.5) for ω lin 200 (ω log 30), as expected due to the interference with the standard BAO signal. Since this effect is minimal and including these correlations would only strengthen the bounds, the deduced bounds are conservative, albeit slightly suboptimal because we are effectively assuming a different set of non-amplitude parameters in each redshift bin.
We also check that the pixelation does not introduce numerical artifacts due to the choice of too small or too large pixel sizes. The former could lead to a biased estimate because the posterior distribution becomes noisy, whereas the latter might artificially smooth the posterior. To mitigate these possibilities, while including all samples in the analysis, we separately set the pixel size in each frequency bin. For this purpose, we sampled A sin X and A sin X in about 100 pixels over the range given by ±1.2 max{|A sin X |, |A cos X |}. For a single MCMC, we find that this choice results in virtually the same confidence limits as when inferring them from a rank-ordered list of samples (while combining the latter in a Gaussian way leads to suboptimal joint constraints).
Finally, it is important to check that the sampling noise due to the inherently finite length of the Markov chains does not affect the constraints. We therefore test the convergence of our analysis by splitting the chains into several independent parts. Figure 16 shows that the Markov chains are converged and do not show evidence for numerical noise. As a consequence, we can also report that the shape of the constraints as a function of frequency is robust and inherent to the data.

C.2 Upper Limits or Detections?
So far, we have only discussed the inference of upper limits from the data. Of course, any analysis should also allow for the possibility of detecting a signal. Our method of determining detections at a given confidence level is again based on the pixelated posterior distributions.
We start by drawing the two-dimensional contours that enclose the desired confidence limit. We then declare a detection if the origin is excluded from this contour, i.e. if the white dot in Fig. 15 is outside the red contour. This is determined as follows. First, we rank-order the pixelated likelihood values and sort them from the most to least likely pixel. For each value in this list, we then compute the cumulative probability and map the cumulative probability to the pixel likelihood by an interpolating spline. The value of the pixel likelihood at which the cumulative probability reaches P finally determines the contour level at which the total probability P will be enclosed (assuming a unimodal distribution that falls off monotonically away from the peak).
We calculated the number of 95% and 99.7% confidence limit (corresponding to 2σ and 3σ) detections on mocks and on data. We can confirm that detections at the 95% c.l. occur in roughly 5% of the mocks for each feature bin, except around the BAO scale, where we find a modest excess in the number of detections. At the 3σ-level, we find no detections in our data. We note that a small number of detections would have been consistent with the look-elsewhere effect since we sample many independent frequencies. Since we do not find any such detection, there is no need to quantify this.

D CMB Analysis and Forecasts
The focus of this work is the first analysis of primordial features in LSS data alone. Given the long history of searches in the CMB anisotropies, it is however natural to compare (and combine) our newly-inferred bounds from the BOSS DR12 dataset to those derived from current Planck data. In this appendix, we outline the performed CMB data analysis ( §D.1) and discuss the effects of the different transfer of primordial power onto the large-scattering surface and the LSS ( §D.2). Moreover, we provide details on our joint LSS and CMB bounds ( §D.3), and comment on the CMB Fisher forecasts ( §D.4).

D.1 Analysis of Planck Data
The phenomenological feature models of (2.3) and (2.4) have been searched for in CMB data for quite some time, including the Planck collaboration [1,71,77]. These analysis have however focused on reporting the best-fit points, and/or the likelihood improvements and significances of possible signals as a function of feature frequency ω X . Since any possible signals have not been significant to date (in particular after taking the look-elsewhere effect into account [1,76,107,108]), we are interested in studying the entire parameter space of features. We therefore want to report the frequency-dependent constraints on the feature amplitudes A X , as we did in the BOSS analysis.
Following the analyses by the Planck collaboration [1,77], we first run MultiNest [113,114] with a modified version of CAMB [109]. 22 Since we also fix the foreground and nuisance parameters to their best-fit values [90], we vary a total of nine parameters: the six standard ΛCDM parameters (physical baryon and cold dark matter fractions ω b and ω c , angular size of the sound horizon θ s , logarithm of the primordial scalar amplitude ln(10 10 A s ), scalar spectral index n s and optical depth τ ) and three feature parameters (ω X , A sin X and A cos X ). We employ wide flat priors on all parameters, including the feature frequencies, ω lin ∈ [0.5, 1005] and ω log ∈ [0. 1,101]. We note that the CMB is also sensitive to models with larger frequencies ω X , but we restricted ourselves to a range around the region available to BOSS.
From these MultiNest runs, we compute the mean values and covariance matrices of the nine parameters in bins of ∆ω lin = 100 and ∆ω log = 10. To effectively increase the number of samples, we then run standard MCMCs with four chains using CosmoMC [111] in these frequency bins starting from the computed covariance, with the priors chosen to enclose the one-dimensional 5σ ranges. Having acquired enough samples and a Gelman & Rubin convergence criterion [60] with scale parameter generally given by 0.01, we implicitly marginalize over the ΛCDM parameters and compute the 95% upper limits on A X as described in Appendix C for the BOSS analysis in one redshift bin. For convenience, we also use the same binning in the feature frequency, ∆ω lin = 10 Mpc and ∆ω log = 1, although the correlation length differs (e.g. ∆ω lin ≈ 26 Mpc was estimated in the Planck TT analysis of [76]).
Given the preliminary status of the Planck 2015 polarization data, 23 we run this pipeline on 22 Due to the highly-oscillatory nature of the primordial feature spectrum, in particular for logarithmic features at large scales, we have to run CAMB with increased accuracy settings which were checked to resolve all oscillations. 23 Having said that, the comparison of the published 2015 and 2018 results on primordial features suggests little changes. We therefore expect our results employing Planck 2015 polarization to be consistent with those derived by the Planck collaboration in [1]. • 'Planck TT': low-(2 ≤ ≤ 29) commander temperature and polarization data, and unbinned high-Plik half-mission temperature cross-spectra data with T max = 2508, • 'Planck TTTEEE': lowcommander and unbinned high-Plik half-mission temperature and polarization cross-spectra data with T max = 2508 and E max = 1996.
We emphasize that we use the unbinned likelihoods to have access to all measured multipoles without averaging over -bins. This way, we obtain the bounds on the feature amplitudes A X displayed in Fig. 17. We see that the constraints only degrade significantly for very small frequencies and are basically unaffected by the polarization data at small ω X . Over the rest of parameter space, the full dataset yields slightly stronger bounds. Finally, Figure 18 illustrates the excellent convergence of the CMB Markov chains for all frequencies and both sets of data.

D.2 Transfer of Feature Power
We have already discussed the experimental reasons for the better sensitivity of BOSS to features than Planck (or, more generally, future LSS surveys compared to CMB observations) in the main text. In the following, we shed additional light on this by studying the signal of primordial oscillations that is imprinted in the observables of the CMB and LSS.
A comparison of the size and shape of the features in these cosmological measurements is displayed in Fig. 19. We show both the lensed and unlensed auto-spectra of temperature and E-mode polarization for the CMB, and the matter power spectrum in linear and nonlinear theory, i.e. without and with the exponential damping caused by large-scale nonlinearities. In all cases, we can clearly see the primordial oscillations with the given frequencies. We however observe a few notable differences between the imprint of features in these quantities. For small frequencies ω X , the signature in the CMB is comparably similar to the signature in LSS, but with a sinusoidal oscillation that is slightly distorted. Having said this, the amplitude of the feature contribution decreases significantly in the CMB for larger frequencies. 24 Since this effect is additionally more pronounced in the temperature than in the polarization spectrum, we deduce that it is predominantly the CMB transfer functions, especially the projection from the three-dimensional cosmic volume to the two-dimensional CMB sky, that wash out the primordial oscillations.
In the temperature power spectrum, the primordial feature signal becomes suppressed by more than an order of magnitude towards larger frequencies and wavenumbers. Since the Planck measurement has to overcome this smaller signal in comparison to our BOSS observations, the constraints turn out to be somewhat worse for larger frequencies despite the more accurate measurement ( 1600 is cosmic variance limited [79]). We note that the slight difference in the employed range of scales in our BOSS measurement, k max = 0.3 h Mpc −1 , compared to T max = 2508 ≈ 0.27 h Mpc −1 can likely be neglected, but will become important for future surveys with a larger reach in wavenumbers.
Finally, it is also evident from Fig. 19 that future CMB missions will in particular benefit from improved polarization measurements. Apart from the larger signal that survives in the spectrum due to the sharper transfer function compared to temperature, this remaining signature is also partly complementary as can be in particular seen for the highly-oscillating logarithmic features.

D.3 Joint CMB and LSS Analysis
In the main text, we inferred the first LSS-only constraints on primordial features and compared them to the current bounds from the CMB as derived above. Having obtained Monte Carlo Markov chains for these observables, we can also consistently combine them to obtain the best current limits. In the following, we elaborate on our computation of these joint constraints. 24 As we illustrated in Fig. 4, the finite-volume effects present in galaxy surveys also lead to some suppression of the primordial signal in LSS observations (cf. Fig. 12 for the resulting impact on the constraints). This suppression is however not shown in Fig. 19 because it is a survey-dependent effect (similar to the beam in CMB measurements, for instance) that will be less and less important for future LSS measurements at these frequencies due to their much larger observational volume.  Figure 19: Imprint of primordial features in the CMB and LSS power spectra for a set of linear (top) and logarithmic (bottom) frequencies ω X , X = lin, log. We compare the relative contribution of features to the unlensed and lensed temperature (TT) and E-mode polarization (EE) power spectra C , with the contribution to the linear and nonlinear matter power spectrum, P (k). The fiducial spectra, which are denoted by the superscript 'fid', are computed in a standard featureless ΛCDM cosmology, which is then augmented by a feature with amplitudes A sin X = 0.1 and A cos X = 0 for illustrative comparison. We display the same range of scales for the observables, linking multipoles and wavenumbers k via the flat-sky approximation, = D A k, where D A is the angular diameter distance to the last-scattering surface. Finally, we note that we neglected survey-related effects for both the CMB and LSS.
We start by converting the CMB Markov chains into the same parameter space as the BOSS analysis. This means that we keep the three feature parameters ω X , A sin X and A cos X , but reduce the six ΛCDM parameters to the two isotropic BAO parameters α z evaluated at the effective redshifts of the two BOSS bins, z = 0. 38  with the fiducial BOSS cosmology (see §3.1). Ideally, we would combine the frequency-binned samples in the four-dimensional space of {α 0.38 , α 0.51 , A sin X , A cos X }. This is in principle possible by generalizing the approach discussed in Appendix C for the BOSS analysis, but a very large number of chain samples would be required to reliably cover this parameter space. Since we are not interested in constraints on the BAO parameters, we therefore proceed by independently marginalizing the low-z BOSS chains, the high-z BOSS chains and the Planck CMB chains over α z . Having reduced the parameter space to the two feature amplitudes, we can directly follow our procedure of combining the two BOSS redshift bins as outlined in Appendix C, but including the CMB data as a third pixelated likelihood. By repeating this for the TT and the TTTEEE Markov chains, we obtain the 95% confidence limits shown in Fig. 9.
As a consequence of marginalizing over the BAO parameters, we neglect any possible correlations between α z and the feature parameters. We already discussed in Appendix C that this assumption renders our limits overly conservative, but also checked its impact for the CMB data. By inferring the four-dimensional (Gaussian) covariance matrix in each frequency bin, we find that the TT-only analysis shows correlations of |ρ| < 0.5, while the addition of polarization data further reduces this correlation coefficient. We therefore expect our approximate joint analysis to result in the same bounds as the full analysis except around the frequencies that interfere with the BAO scale. This is also confirmed using Fisher forecasts that lead to essentially the same forecasted limits except around the scale of the sound horizon where our analysis is suboptimal at the ten-percent level.
Instead of neglecting the correlations with α z , we could have also assumed the (threedimensional) almost Gaussian posterior distributions inferred in the BOSS analysis to be exactly Gaussian. With this approximation, it would be possible to impose the low-z and high-z BOSS constraints as Gaussian priors on the CMB analysis by importance sampling its Markov chains. 25 We tested this possibility, but found that vanishing α z correlations are a better assumption than the Gaussian approximation.

D.4 Forecasts for Future CMB Surveys
In addition to the analysis of current CMB data from Planck, we also estimate the sensitivity of future CMB experiments to (linear) feature models in §4.3. (As explained, most other types of features can be decomposed in a basis of linear oscillations so that constraints can be deduced from our results.) These forecasts directly follow the Fisher methodology and the experimental specifications of [33]. The fiducial point is a featureless ΛCDM cosmology consistent with the Planck measurements [90,91], i.e. we in particular take A sin lin = A cos lin = 0. Since we compute the constraints as a function of feature frequency ω lin within a ΛCDM universe, the Fisher information matrices are 8-dimensional. By employing perfectly delensed temperature and polarization power spectra, we infer the most optimistic bounds on A sin lin and A cos lin which we present in Fig. 20. As can be understood from the additional smoothing of the oscillations in the lensed compared to the unlensed spectra in Fig. 19, the forecasted sensitivities are worse when using lensed spectra. The degradation of the constraints depends on the experiment and feature frequency, but may be up to about 20% and 50% for Planck and the CMB-S3 missions, respectively. However, not delensing the spectra could lead to constraints on the feature amplitudes σ(A Y lin ), Y = sin, cos, being worse by a factor of two for CMB-S4 and more for a cosmic variance-limited experiment. We also observe that the feature parameters are independent of the ΛCDM parameters (and of one another) for ω lin 300 Mpc. For smaller frequencies, the primordial oscillations interfere with the baryon acoustic oscillations which in particular leads to a degeneracy with the scale of the sound horizon, as has previously been pointed out in the CMB (see e.g. [72,76]) and was discussed in the main text for LSS. Finally, we note that the Nyquist frequency is much larger in the CMB since the effective cosmic volume extends all the way back to the last-scattering surface.