Squeezing $f_{\rm NL}$ out of the matter bispectrum with consistency relations

We show how consistency relations can be used to robustly extract the amplitude of local primordial non-Gaussianity ($f_{\rm NL}$) from the squeezed limit of the matter bispectrum, well into the non-linear regime. First, we derive a non-perturbative relation between primordial non-Gaussianity and the leading term in the squeezed bispectrum, revising some results present in the literature. This relation is then used to successfully measure $f_{\rm NL}$ from $N$-body simulations. We discuss the dependence of our results on different scale cuts and redshifts. Specifically, the analysis is strongly dependent on the choice of the smallest soft momentum, $q_{\rm min}$, which is the most sensitive to primordial bispectrum contributions, but is largely independent of the choice of the largest hard momentum, $k_{\rm max}$, due to the non-Gaussian nature of the covariance. We also show how the constraints on $f_{\rm NL}$ improve at higher redshift, due to a reduced off-diagonal covariance. In particular, for a simulation with $f_{\rm NL} = 100$ and a volume of $(2.4 \text{ Gpc}/h)^3$, we measure $f_{\rm NL} = 98 \pm 12$ at redshift $z=0$ and $f_{\rm NL} = 97 \pm 8$ at $z=0.97$. Finally, we compare our results with a Fisher forecast, showing that the current version of the analysis is satisfactorily close to the Fisher error. We regard this as a first step towards the realistic application of consistency relations to constrain primordial non-Gaussianity using observations.

We show how consistency relations can be used to robustly extract the amplitude of local primordial non-Gaussianity (fNL) from the squeezed limit of the matter bispectrum, well into the non-linear regime. First, we derive a non-perturbative relation between primordial non-Gaussianity and the leading term in the squeezed bispectrum, revising some results present in the literature. This relation is then used to successfully measure fNL from N -body simulations. We discuss the dependence of our results on different scale cuts and redshifts. Specifically, the analysis is strongly dependent on the choice of the smallest soft momentum, qmin, which is the most sensitive to primordial bispectrum contributions, but is largely independent of the choice of the largest hard momentum, kmax, due to the non-Gaussian nature of the covariance. We also show how the constraints on fNL improve at higher redshift, due to a reduced off-diagonal covariance. In particular, for a simulation with fNL = 100 and a volume of (2.4 Gpc/h) 3 , we measure fNL = 98 ± 12 at redshift z = 0 and fNL = 97 ± 8 at z = 0.97. Finally, we compare our results with a Fisher forecast, showing that the current version of the analysis is satisfactorily close to the Fisher error. We regard this as a first step towards the realistic application of consistency relations to constrain primordial non-Gaussianity using observations.

I. INTRODUCTION
One of the longest standing questions in modern cosmology concerns the mechanism that set the initial conditions for the classical evolution of matter in the Universe. In particular, one asks whether an initial inflationary period has taken place and, if so, what were the underlying physics. In this respect, a crucial aspect is to understand if the primordial fluctuations are consistent with those predicted by single-field inflation, or whether additional light degrees of freedom played a role. To answer this, one typically relies on data for fluctuations that are small enough to lie within the quasi-linear regime, whereupon perturbation theory provides reliable predictions. Such a theory can then be used to robustly extract the physical parameters. In contrast, modes at larger momentum (e.g., k 0.2 h/Mpc for large-scale structures at redshift zero) fall in the non-linear regime and are usually discarded due to modelling difficulties, despite being abundant and precisely measured. It would be ideal to have a tool able to take advantage of this data.
Crucially, the consistency relations for large-scale structure are modified depending on whether the initial conditions for the matter fluctuations are Gaussian, as verified non-perturbatively in [23]. More precisely, if the initial conditions correspond to those of single-field slow-roll inflation, we expect B(q, k)/P (q) to have no inverse-powersof-q pole. Here, B and P are the bispectrum and power spectrum respectively, and q, k are the momenta with the understanding that q k (the squeezed limit). Moreover, the symmetries from which the consistency relations originate characterize not only the evolution of matter distribution, but also that of astrophysically complex objects, such as galaxies (which can merge, form, annihilate, and so on) [22,[24][25][26]. Finally, consistency relations are expected to hold true in redshift-space [18,22,27], making them amenable for application to galaxy surveys, which include strong anisotropy around the line-of-sight direction due to redshift-space distortions.
In this work, we take an important step towards the practical implementation of consistency relations as a tool to obtain non-perturbative information on the initial conditions of the Universe, by studying the squeezed limit of the matter bispectrum. Using a suite of N -body simulations of matter in real-space, we show that the consistency relations can be used to precisely measure the local primordial non-Gaussianity parameter, f NL . The idea is that if the initial conditions were not that of singlefield slow-roll inflation (the curvaton [28,29] model being an example, characterized by a non-vanishing f NL ), the squeezed B/P is no longer protected from having poles in the soft momentum q. In what follows, we consider only local-type non-Gaussianities which correspond to integer order poles as we will show in Section II. Alternative inflation models, such as quasi-single field inflation [7,[30][31][32], can lead to non-integer order poles or oscillatory behavior in the squeezed B/P. We leave a discussion of such models to a future work.
We first derive the non-perturbative relation between the squeezed limit of the bispectrum and f NL in Section II, correcting and clarifying some results present in the literature. Section III then highlights and addresses a number of subtleties regarding the procedure of measuring and analyzing the bispectrum in this context, the union of which allows for robust constraints on primordial non-Gaussianity, as presented in Section IV. Finally, we conclude and discuss the ingredients necessary for a future, realistic application of consistency relations to large-scale structure data in Section V.
The appendices give useful theoretical results and methodology validation. Appendix A includes tree-level and one-loop standard Eulerian perturbation theory calculations of the squeezed bispectrum in the presence of local primordial non-Gaussianity. Appendix B validates our bispectrum parameterization. We derive our likelihood and Fisher matrix in Appendix C 1 and present a quasioptimal weighting scheme in Appendix D. In Appendix E we show the full posterior of our main analysis.

A. Derivation
Let us begin by deriving the squeezed limit of the bispectrum in the presence of local primordial non-Gaussianity. We will work non-perturbatively in gravitational evolution, and follow a similar approach to the derivation of scaledependent bias found in [33]. Additionally, we assume a single redshift, z, and generally suppress this argument for clarity. In the presence of a background (soft) potential, Φ L , the locally measured small-scale power spectrum is modulated, taking the form, 1 where we are assuming that q ≡ |q | k, such that we can treat the long mode as a background, in presence of which the power spectrum is evaluated. This induces a coupling between the hard mode power spectrum, P (k), and the soft mode density, δ q . The bispectrum is then obtained from the δ q P (k|Φ L ) correlator, and is given by where V is the volume. Here P (q) is soft and thus in the linear regime. This result uses the Poisson equation to relate the long-wavelength density and potential, δ q = α(q)Φ L,q , with where Ω m,0 and H 0 are the matter density and expansion rate today, T (q) is the transfer function (normalized to unity for q → 0), and D md (z) is the growth rate, normalized to a(z) = 1/(1 + z) in the matter-dominated era. 2 We are interested in the form of Eq. (2) in the presence of local primordial non-Gaussianity, which can be parametrized by a primordial gravitational potential at sub-horizon scales given by [e.g., 34,35], where φ(x) is a Gaussian random field with expectation value φ . In the separate Universe picture, the soft mode acts as a background rescaling, equivalent to varying the locally measured clustering amplitude, σ 8 , i.e.
To compute the potential derivative, we utilize the definition of σ 8 and f NL [e.g., [36][37][38]: 1 In our notation, the power spectrum and bispectrum are defined as δqδ k ≡ V δ K q+k P (q) and δqδ k δp ≡ V δ K q+k+p B(q, k) respectively, with V the volume and δ K the Kronecker delta. 2 Note that this differs from the normalization used in many cosmological codes, e.g., class.
where we have noted that Φ L (x) is a slowly varying field, thus we can neglect its spatial dependence. Combining Eqs. (3), (5), and (6), we find the leading term in the squeezed bispectrum to be, The result above differs from some former works in important ways. First, it may appear to differ from [e.g., 25,39], which present a rigorous tree-level definition and find a kdependence of the form P (k) rather than ∂P (k)/∂ log σ 2 8 . This discrepancy is resolved by noting that, at first (treelevel) order, P (k) ∝ σ 2 8 , and thus the two coincide. More importantly, a non-perturbative derivation of the squeezed bispectrum is also presented in [16], but neither the logarithmic derivative nor the D md (z) factor appear in the final result. The presence of these factors is also verified by a direct one-loop calculation, as we report in Appendix A. In what follows, we further confirm their necessity in order to recover the correct f NL when the latter is extracted from simulations including primordial non-Gaussianity.

B. Modeling of the bispectrum
To extract the primordial signal given in Eq. (7), we require a model for the squeezed bispectrum that also encodes the contributions coming from the non-primordial gravitational evolution. Beyond perturbative scales, the functional form of these is not well understood. Nonetheless, consistency relations tell us that the dependence on the soft mode is known, and that these late time contributions to the squeezed bispectrum scale with positive powers of the soft mode, q. In light of this, we write the squeezed bispectrum in the following form, where we choose a parametrization in terms of the magnitude of the softest momentum, q, the magnitude of one of the hard modes, k, and the angle between them, θ. Due to the consistency relations, a −2 and a −1 can only be sourced by primordial physics beyond single-field slow-roll inflation. 3 Our parametrization includes only integer order poles since we are assuming local primordial non-Gaussianity. For the range of soft momenta we consider here the transfer function cannot be represented as a simple power series, thus we must include it explicitly. Inspired by perturbation theory, we only include it in the first two terms, since the remaining pieces are dominated by the bispectrum due to gravitational evolution, rather than by the primordial one, see Eq. (7). We will validate this parametrization in Section IV. A comparison with Eq. (7) tells us that, at leading order in f NL , local primordial non-Gaussianity leads to a non-zero a −2 coefficient, while the other non-perturbative coefficients are a priori unknown. 4 Nonetheless, one can still use invariance under parity to argue that [42] a n (k, π − θ) = (−1) n a n (k, θ) .
This means that, if one properly averages the bispectrum over all possible relative angles, the contributions to the odd coefficients coming from triangles related by the above parity transformation cancel (see Figure 1). This reduces the set of unknown parameters to the even coefficients. We will return to this aspect in Section III A and Appendix B. Our plan of action is the following: use N -body simulations to measure the a n coefficients (properly averaged over angles and hard momenta, as explained in Section III A), then extract f NL from a −2 using Eq. (9).

III. METHODOLOGY
To validate our measurements of f NL , we use a suite of N -body simulations consisting of 40 realizations with Gaussian initial conditions and 12 realizations with local primordial non-Gaussianity (f NL = 100). The simulations were run with 1280 3 particles in a comoving cubic volume of L 3 = (2.4 Gpc/h) 3 . The cosmological parameters are Ω m,0 = 0.25 (of which Ω b = 0.04), Ω Λ = 0.75, h = 0.7, n s = 1, and σ 8 = 0.8. We analyze the z = 0 and z = 0.97 snapshots. For further details on the simulations, we refer the reader to [35].

A. Measuring the power spectrum and bispectrum
To obtain an estimate of f NL using the consistency relations, we will require both the matter power spectrum and bispectrum. These are measured using fast Fourier transforms, first assigning the simulation particles to a 1024 3 mesh using Triangular Shaped Cloud interpolation. The soft momentum is collected in bins of width 2k f , with k f ≡ 2π/L 2.6 × 10 −3 h/Mpc being the fundamental mode, starting from modes of magnitude q > k f up to q max = 25k f 0.065 h/Mpc-i.e. 12 soft momentum bins. The hard modes entering the bispectrum will, instead, be taken within a range that is variable, depending on our analysis choices (as discussed below).
Throughout our analysis we pay particular attention to avoid any artifacts induced by binning and the discrete nature of the momenta. Such effects are more prominent for the softest momenta which, at the same time, are the most sensitive to the signal arising from primordial non-Gaussianity. For this reason, as far as the power spectrum and its related quantities are concerned, we refrain from any modeling (i.e., using linear theory) and directly measure the following binned quantities, where by · · · i we represent the average over q modes whose magnitude falls within the i-th bin. By directly measuringM i,n from the simulations, we also take advantage of the sample variance cancellation arising from the fact that bispectrum and power spectrum are measured from the same density field. For the bispectrum, we measure the following quantitieŝ where k min and k max define the hard momentum scales of our analysis. We sum over all k's with magnitude between these values. More precisely, if the relative angle between q and k is allowed to cover the full range between 0 and 2π, then the hard modes contributing to the bispectrum in the soft momenta bin q i effectively range from k min − q i to k max + q i , and hence the hard momenta range varies with q. In order to avoid finite resolution effects from the simulation or our particle assignment scheme, we impose a maximum momentum in our analysis of K = 270k f 0.71 h/Mpc. This means that we must always make sure that k max + q max ≤ K. Once this is done, we are guaranteed that the average in Eq. (12) is done over all possible relative angles, θ, hence ensuring that the odd coefficients in the expansion in Eq. (8) average to zero, as discussed below Eq. (10). Moreover, we simply average over all k's between k min and k max with uniform weight, instead of treating different k-bins separately. This way we can remain agnostic about the k-dependence of all coefficients in Eq. (8) except for a −2 (k), otherwise we would need to introduce a new set of (unknown) coefficients for each k-bin. 5 In light of the discussion above, our theory model for the bispectrum is given bŷ where vectors run over the soft mode index, i, and theā n coefficients correspond to the a n 's in Eq. (8) averaged over angles and over hard modes with magnitude between k min and k max . In Appendix B we show that repeating the analysis with the inclusion of the odd coefficients (ā −1 ,ā 1 ) leads to values consistent with zero. This confirms that the average over triangles is done properly, and it is therefore sensible to only consider the subset of even coefficients. We also show thatā 4 is consistent with zero so that we can truncate the series atā 2 .
Finally, in order to extract f NL fromā −2 , we mea-sureF ≡ δ k δ −k ∂ log P (k)/∂ log σ 2 8 , averaged over k ∈ [k min , k max ], using the same bins as the bispectrum. A couple of subtleties arise when estimating this quantity. First, in principle, P (k) should be computed from a theoretical model as it is an average over realizations; however, since we do not a priori know P (k) in the non-linear regime, we instead measure it from the simulations. This does not significantly impact our results since there is little variance in P (k) at high-k. Second, ∂ log P (k)/∂ log σ 2 8 should be determined from simulations, for example, via a separate Universe prescription; however, this is a computationally expensive and time consuming process. For our fiducial analysis it suffices to estimate the function using halofit [43]. This represents the main source of theoretical uncertainty when constraining f NL from consistency relations using the real-space matter distribution. 6 For a measuredF andā −2 (averaged over the same hard momenta), we can compute f NL using the relationship B. Likelihood and covariance To formulate our likelihood for the non-perturbative coefficients,ā n , we must account for the stochastic nature of both the measured bispectrumB and the theory model, the latter of which depends on the measured binned power spectra,M n , which we also consider as a stochastic variable. As shown in Appendix C 1, this can be taken into account by a likelihood that, for a single realization, r, is given by where δB(ā n ) ≡B r − nā nM r n is the difference between the measured bispectrum and our theory model (Eq. 13), is the parameter dependent covariance of δB, and N R is the number of realizations used to estimate it. Note that we use a multivariate tdistribution instead of a Gaussian likelihood to account for the fact that our covariance is estimated from a finite number of realizations [45] as discussed below.
Since a real analysis would assume a fiducial value of f NL = 0 for the covariance and we only have twelve realizations with primordial non-Gaussianity, we always compute our covariance using the N R = 40 realizations with Gaussian initial conditions. Furthermore, to ease the computational burden associated with sampling a likelihood with a parameter dependent covariance, we evaluate the latter at a fixed set of fiducialā n 's. Evaluating the covariance at a fiducial set of parameters is also necessary so that we can use covariance from simulations without primordial non-Gaussianity when fitting measurements from simulations with f NL = 100. To determine the fiducial set of coefficients we use an iterative procedure (similar to a Newton-Raphson algorithm) to maximize the likelihood forB andM n , which are the mean measurements from the set of simulations with Gaussian initial conditions. Specifically, we first maximize the likelihood with a covariance evaluated atā n = 0. Then, we use the new coefficients to re-compute the covariance, which is used to re-maximize the likelihood, and so on. We stop the procedure once all parameters have converged with a fractional accuracy of 10 −9 , which takes around 10 iterations. We have checked that this procedure does not affect our results in an appreciable way. 7 Once the set of fiducial coefficients has been determined, and thus the covariance, C, we sample from the parameter posterior distribution. For our analysis, we vary theā n coefficients with uniform wide priors between −10 9 and 10 9 (Mpc/h) 3+n . The sampling is done using the Multinest package [46] with 500 live points and an evidence tolerance of 0.005. 7 Our fiducial results for the simulations with Gaussian initial conditions change from f NL = −0.6 ± 12 to f NL = −0.1 ± 12 when including the parameter dependent covariance.

IV. RESULTS
We first present a series of fiducial results carried out on measurements from a single realization of our simulations at two redshifts (z = 0 and z = 0.97) with soft momenta 0.005 < q < 0.06 h/Mpc and hard momenta 0.2 < k < 0.6 h/Mpc. The full posteriors for this analysis can be found in Appendix E. We will later validate this choice of scale cuts, and investigate the corresponding sensitivity of our f NL constraints.
In Figure 2 we plot the squeezed bispectrum and our maximum a posteriori model prediction for a single realization at redshift z = 0. The error bars are computed from the fiducial covariance introduced in Section III B. As expected, in the presence of local primordial non-Gaussianity, there is a sizable contribution from the 1/q 2 term in the squeezed bispectrum. Moreover, theā 2 contribution becomes significant at scales q 0.035 h/Mpc, matching the conclusions of Appendix B. For this particular realization and choice of scale cuts, we find f NL = 0±12 and f NL = 98 ± 12, for the measurement without and with local primordial non-Gaussianity, respectively. Our constraints improve considerably at redshift z = 0.97 for which we find f NL = −1 ± 8 and f NL = 97 ± 8.
The improvement at higher redshift can be understood as follows. On the one hand, the primordial contribution to the squeezed bispectrum, Eq. (7), scales as B ∼ P 2 /D(z) ∼ D 3 (z) for growth factor D(z). On the other hand, the bispectrum covariance scales as C B ∼ P 3 ∼ D 6 (z). It follows that the signal-to-noise ratio has the scaling B/ √ C B ∼ D 0 (z), which is redshiftindependent. Nonetheless, at higher redshift, gravitational non-Gaussianity is suppressed by a factor of D(z), which (a) reduces theā n signal for n > 0 (which must be disentangled from f NL ), and (b) reduces the non-Gaussian covariance. These effects increase the signal-to-noise with which primordial non-Gaussianity can be measured at high-redshift. This indicates a transfer of information on f NL into higher-point functions at late times.

A. Squeezed criterion
In this section we fit the squeezed bispectra for different ranges of minimum hard momenta, k min , and maximum soft momenta, q max , to address how squeezed the bispectrum must be in order for consistency relations to be applied reliably. Since our focus in this section is on method validation, all results use the mean bispectrum measurement obtained from 12 realizations with Gaussian initial conditions, and 12 realizations with local primordial non-Gaussianity. The covariance is, however, always evaluated using the 40 Gaussian simulations, but scaled to match the volume of 12 realizations.
In the top panel of Figure 3 we show the mean value for f NL and its 68% confidence error, for both Gaussian and non-Gaussian initial conditions, as well as for redshifts z = 0 and z = 0.97. We do this for differ- ent values of k min , but fixing k max = k min + ∆k, with ∆k = 5k f 0.013 h/Mpc. We fix the minimum and maximum soft momenta to be q min = k f 0.002 h/Mpc and q max = 23k f 0.06 h/Mpc. For sufficiently high k min , our fit closely matches the expected value of f NL . However, for k min 0.15 h/Mpc we obtain biased results, a signal of the fact that the triangles are not sufficiently squeezed for the relation derived in Eq. (7) to apply. The scale at which we recover unbiased results appears to be consistent between the two redshifts. In light of the above results, unless otherwise stated, for the remainder of the analyses we fix the minimum hard momentum to k min 0.2 h/Mpc.
Having determined a range of suitable hard momentum scale cuts, we now analyze the dependence of our results on the soft momenta scale cuts. To do this, we fix the maximum hard momentum to k max 0.6 h/Mpc and the minimum soft momentum again to q min = k f 0.002 h/Mpc, but vary the maximum soft momentum, q max . The scale cuts are such that each successive value of q max contains one more bin than the previous one. We report the results in the bottom panel of Figure 3. As one can see, we recover the correct value of f NL for both sets of initial conditions, essentially independently of the value chosen for q max . The error on f NL also decreases with increasing maximum soft momentum, albeit mildly. Both observations are explained by the fact that most of the signal is contained in the first few q-bins. For this reason, from now on we fix q max = 23k f 0.06 h/Mpc.
In summary, our constraints on f NL from the matter bispectrum are unbiased for scale cuts with k min /q max 3 at redshifts z = 0 and z = 0.97. Since most of the information is contained within the softest momentum bin, the more relevant scale is likely set by k min and q min . For our analysis, the associated ratio is then k min /q min ≈ 30 (100) for q min = 0.007 h/Mpc (0.002 h/Mpc). Further discussion on the parametrization of the bispectrum and on the scale cuts can be found in Appendix B.

B. Information content and Fisher forecasts
Having demonstrated that the consistency relations can be used to extract unbiased constraints on f NL in the nonlinear regime, we now analyze the information content on f NL in the consistency relations. Of particular interest is how our constraints on f NL change as we vary q min and k max , as well as how our results are limited by naïvely averaging over all hard momenta with uniform weight. To this end, we fit the squeezed bispectra for varying maximum hard momenta, k max , and minimum soft momenta, q min , and additionally perform a series of analytic Fisher forecasts. The derivation of the Fisher matrix used in these forecasts is presented in Appendix C 2.
In Figure 4 we plot the error bar on f NL as a function of k max for redshift z = 0 and z = 0.97. We show the mean and the 68% confidence limit for the error on f NL computed by fitting each of the 40 realizations with Gaussian initial conditions individually. All results assume a maximum soft momentum q max 0.06 h/Mpc and minimum hard momentum k min 0.2 h/Mpc. The bottom panels exclude the first soft momenta bin. Removing the softest momenta bin increases the error on f NL from 12±1 (7 ± 1) to 17 ± 1 (12 ± 1) at k max = 0.54 h/Mpc and z = 0 (z = 0.97), confirming that most of the information is contained within the softest momenta bin.
Turning our attention to the the small-scale information, we find that the error on f NL quickly saturates with increasing k max . This is due to the non-Gaussian contributions to the covariance and is consistent with recent results analyzing local primordial non-Gaussianity in the non-linear regime [47]. Thus, even though the consistency relations can be used to constrain f NL at scales well beyond 0.35 h/Mpc, we find essentially no benefit when including those modes. The situation may be less dire at high redshifts as was found in [48]. In particular, when we increase k max from 121k f 0.32 h/Mpc to 221k f 0.58 h/Mpc we find that σ fNL remains essentially constant at z = 0 (11.7 ± 1.6 to 11.6 ± 1.4), but decreases by 10% at z = 0.97 (8.0 ± 1.0 to 7.2 ± 0.9). Given that the non-Gaussian contributions to the covariance are particularly important for local shapes, even at high redshifts [48,49], the gains at higher redshift may be more modest than those found for other types of non-Gaussianities.
To ensure that the lack of small-scale information is not an artifact of our sub-optimal weighting scheme (i.e., summing all k-modes with uniform weights), we perform a series of Fisher forecasts. The black curves show the Constraints are shown with (top) and without (bottom) the softest momentum bin for redshift z = 0 (left) and z = 0.97 (right). The Fisher forecast is performed assuming all modes have been averaged in a single k-bin (black), matching our analysis, as well as splitting the data into 10 k-bins (grey). The improvement from treating the k bins separately is less than 10%, which suggests that there is little to gain from using a more complex hard momentum weighting scheme and/or introducing different values ofā0 andā2 for each k bin. The error bars quickly saturate as a function of kmax.
forecasts assuming that all hard momenta are averaged into a single k bin, as is the procedure used in our analysis. We find that our errors are within 20% of the Fisher error for all scale cuts, with the exception of the z = 0.97 snapshot with the first bin excluded, for which our error bars are 50% larger. Note that for simplicity the Fisher forecasts assumeā 2 = 0, which leads to an underestimate of σ fNL due to the degeneracy betweenā 2 and f NL , as discussed in Appendix B. 8 We also perform the Fisher forecast assuming the hard momenta between k min and k max are subdivided into 8 A more fair comparison between our measurement error and the Fisher error on f NL can be made by fixingā 2 = 0 and choosing qmax ≤ 0.035 h/Mpc as was verified in Appendix C 2. In this case we find that our measured error on f NL is 7% (14%) larger than the Fisher error for N k = 1 (N k = 10) at z = 0 assuming q min = 0.002 h/Mpc, qmax = 0.025 h/Mpc, k min = 76k f , and kmax = 231k f . N k = 10 linearly spaced hard momenta bins. This is equivalent to the improvement that one would expect if we accounted for the different values ofā 0 over 10 bins, as opposed to simply averaging over all modes. Interestingly, there is only a very modest (< 10%) improvement when treating the hard momenta bins separately, hence there is likely little to gain by implementing an optimal weighting scheme. We found that the error on f NL does start to decrease if we consider N k > 20; however, given that the covariance is estimated from only 40 realizations, we do not include those results and defer this study to a future work in which we have a more robust covariance.
Finally, using our forecasting pipeline, we present an extremely idealistic forecast for the error on local f NL that we can hope to achieve from the Dark Energy Spectroscopic Instrument (DESI) [50] using our method. Assuming DESI will measure a combined survey volume of 170 Gpc 3 58.4 (Gpc/h) 3 (Table 3 of [51]) and neglecting the (likely relevant) complications due to galaxy bias and redshift-space distortions, we find σ fNL 4.9 and 3.3 using the covariance at redshift 0 and 0.97, respectively, and scale cuts q min 0.002, q max 0.06, k min 0.2, k max 0.3 h/Mpc. Although these forecasts are highly optimistic, combining our analysis with existing techniques to constrain f NL from galaxy clustering, particularly those that use the scale-dependent bias, may considerably improve our ability to constrain local primordial non-Gaussianity.

V. CONCLUSIONS
Due to its ability to differentiate between single-field and multi-field inflation, measuring local primordial non-Gaussianity is a key target for current and upcoming cosmological surveys. Here we present a method to measure local f NL from squeezed configurations of the real-space matter bispectrum using the consistency relations. Crucially, our analysis is non-perturbative, allowing us to reliably use data deep into the non-linear regime. To our knowledge, this is the first result using the consistency relations to directly measure local f NL . Our main results are summarized as follows: • Using a suite of N -body simulations with local f NL = 100 and with covariance scaled to match a volume of (2.4 Mpc/h) 3 , we measure f NL = 98 ± 12 at redshift z = 0 and f NL = 97 ± 8 at z = 0.97 from the real-space matter bispectrum using the consistency relations.
• Due to the non-Gaussian bispectrum covariance at late times in the short-scale regime, our error on f NL is relatively insensitive to the choice of maximum hard momenta, k max . In particular, we find little improvement when including modes beyond k 0.35 h/Mpc at redshifts 0 and 0.97, even though our model has been verified up to k max 0.7 h/Mpc.
• Constraints on f NL from the consistency relations are extremely sensitive to the amount of large-scale information.
The error on f NL increases by 50% if we increase q min from 0.002 h/Mpc to 0.007 h/Mpc at redshifts z = 0 and z = 0.97.
In light of these findings, we conclude that even though the consistency relations provide a robust mechanism to constrain local f NL using measurements in the nonlinear regime, the utility of this information is severely limited by the non-Gaussian covariance at small scales. We expect this effect to be less severe at high redshifts, although a more dedicated analysis is needed to verify this. Even if the consistency relations alone are unable to place competitive constraints on local f NL , they provide a strong test of fundamental physics with minimal modeling assumptions and applying an analysis like the one presented in this work to observational data would be highly informative.
Given that the approach presented here used the realspace matter distribution, it would most easily be generalized to galaxy or CMB lensing surveys (which measure only the matter field), albeit with loss of radial information. Furthermore, recent and upcoming lensing surveys such as the Dark Energy Survey [52], the Legacy Survey of Space and Time [53], Atacama Cosmology Telescope [54], Simons Observatory [55], and CMB-S4 [56] will yield precise measurements of a wealth of non-linear modes, making them particularly useful for consistency relationbased methods.
On the other hand, since the consistency relations are expected to hold for biased tracers in redshift-space [18,22,[24][25][26][27], one could extend our method to the bispectrum of photometric galaxy surveys. The main ingredient necessary to apply our method to 2D projected galaxy clustering measurements is a treatment of galaxy bias. This may be non-trivial given that the hard momenta in our analysis are in the non-linear regime. Moreover, the relationship between the measured coefficientā −2 and f NL (Eq. 9) will need to be modified to account for the scale-dependent bias and the response of the non-linear galaxy power spectrum to a long-wavelength density perturbation, both of which could have significant theoretical uncertainties depending on the galaxy population [57].
One could also expand the method presented in this work to 3D galaxy clustering measurements from spectroscopic galaxy surveys, such as BOSS [58], DESI [50], Euclid [59], and SPHEREx [60]. In addition to accounting for galaxy bias, this would require modeling redshiftspace distortions and accounting for stochastic effects, i.e. shot noise. Such an analysis would complement and extend existing analyses which measured local f NL from the redshift-space galaxy bispectrum in the perturbative regime [61,62] and reveal if the consistency relations contain information beyond the scale-dependent bias.
The consistency relations relate correlators of all orders, hence there is likely information from the consistency relations that we are not capturing in our analysis. This could be probed by considering higher-order correlators such as the trispectrum, but estimating such quantities and their covariances is both computationally and theoretically challenging. An alternative would be to constrain primordial non-Gaussianity at the field level which, by design, incorporates information from higher-order correlation functions [38,63]. A comparison with field level approaches would be extremely useful to quantify the information content of the consistency relations.
Finally, one could generalize the method presented in this work to constrain other types of non-Gaussianities that violate the consistency relations. The most direct extension would be to consider non-integer poles as are predicted in quasi-single field inflation scenarios [7,[30][31][32]. This would require modifying our bispectrum parametrization to include such poles and their appropriate coefficients.

ACKNOWLEDGMENTS
We thank Kazuyuki Akitsu for sharing code used to compute the bispectrum. We are grateful to Matteo Biagetti, Will Coulton, Mikhail Ivanov, Kendrick Smith, Marko Simonović, and Beatriz Tucci for insightful discussions. We thank the referee for their helpful comments and suggestions. For most of the development of this work A.E. has been a Roger Dashen Member at the Institute for Advanced Study, whose work was also supported by the U.S. Department of Energy In this Appendix, we verify the form of the squeezed bispectrum (Eq. 7) using Eulerian perturbation theory, assuming q k, with k in the quasi-linear regime. We work in the infinite volume limit for clarity. Since we are interested in the leading 1/q 2 term, which is not generated by the non-linear gravitational evolution alone, we focus only on the contribution to the bispectrum coming from primordial non-Gaussianity.

Tree-level
In the presence of local primordial non-Gaussianity, a non-trivial bispectrum is generated in addition to that from gravitational evolution. In particular, neglecting gravitational corrections, it is given by [e.g., 64], where P L is the linear power spectrum, δ (n) the n-th order density field [65], and by · · · we indicate the ensembleaveraged correlator with overall momentum conserving δ-function removed. In the squeezed limit, recalling that α(q) ∼ q 2 (Eq. 3), this takes the form This matches Eq. (5) with ∂P (k)/∂ log σ 2 8 → P (k), which holds true at tree level.

One-loop
When including non-linear gravitational evolution, additional contributions are generated in the bispectrum. In particular, the one-loop contributions sourced by primordial non-Gaussianity are where the final term is an ultraviolet counterterm required by renormalization, with δ q , with c s a constant corresponding to the speed of sound in the hydrodynamical description of matter evolution. Each of these terms involve five linear density fields, which, after Wick contractions, involve the linear power spectrum, δ (1) δ (1) , and the primordial bispectrum, δ (1) δ (1) δ (1) (except for the counterterm, which involves a primordial bispectrum and a factor of q 2 ). The squeezed bispectrum is dominated by terms involving primordial bispectra of the form B 111 (q, k) ∼ P L (q)P L (k)/α(q), for some k satisfying q k; any other terms are suppressed by powers of q 2 . There are seven such contributions: utilizing momentum conservation, and denoting p ≡ d 3 p/(2π) 3 . The F n functions are the standard perturbative kernels [65]. Terms three, five, and seven are simply terms two, four, and six with the hard momenta switched.
Taking the soft limit, q k, p, and inserting Eq. (A2), these simplify to coefficients are consistent with zero as is expected from our angular averaging procedure. There is significant degeneracy between a−1 and fNL due to the difficulty in distinguishing between a 1/q and 1/q 2 pole in our measurements.
We recognize the term in parentheses as twice the oneloop power spectrum, P 1-loop (k) [66,67]. In combination with Eq. (A2), we find the full f NL contribution to the squeezed bispectrum up to one-loop order: which agrees with Eq. (7), noting that P L ∝ σ 2 8 and P 1-loop ∝ σ 4 8 . It is worth stressing that there is a factor of 2 multiplying the one-loop contribution to the power spectrum. Thus what fits inside the square brackets is not simply the non-linear power spectrum.

Appendix B: Validation of the parametrization
In this Appendix, we perform a series of tests validating our fiducial analysis choice in which we vary f NL ,ā 0 , and a 2 . In particular, we show that the odd coefficients in the power series expansion of the squeezed bispectrum are consistent with zero, that including the quadratic coefficientā 2 is necessary to model the squeezed bispectrum at the scale cuts used in this work, and that coefficients higher than quadratic order can be neglected in our model. Unless otherwise stated, all results in this section use the mean measurements at redshift z = 0 from 12 realizations with covariance scaled to match the volume of 12 realizations assuming a maximum soft momentum q max = 0.06 h/Mpc and hard momenta between 0.2 and 0.6 h/Mpc.
FIG. 6. Constraints on fNL for varying soft momenta scale cuts assumingā2 = 0 and averaging over hard momenta between 0.2 and 0.6 h/Mpc. Results are shown for z = 0 for the simulations with Gaussian initial conditions (red), as well as those with primordial non-Gaussianity (blue). The horizontal offset between the two simulations is purely for visual purposes. We are unable to recover the true value of fNL if we include soft momenta bins larger than 0.035 h/Mpc; therefore, we includeā2 as a free parameter in our fiducial analysis. eitherā −1 orā 1 as a free parameters. 9 In Figure 5 we plot the 2D marginalized posteriors of f NL − f true NL and the odd order coefficients. We include results with Gaussian initial conditions, as well as those assuming f NL = 100. The posteriors onā −1 andā 1 are consistent with zero which suggests that our angular averaging is working as expected. Furthermore, including the odd coefficients as free parameters in our model does not bias the constraints on f NL . We note, however, that the error on f NL increases significantly when we varyā −1 , suggesting that our measurements are unable to fully differentiate between a 1/q and 1/q 2 pole in the squeezed bispectrum. On account of these tests, we conclude that we can set the odd parameters in our model to zero.
In Figure 6 we instead plot the 68% confidence limits on f NL as a function of the maximum soft momentum, q max , assumingā 2 = 0. As one can see, for q max 0.035 h/Mpc, this model gives biased constraints on f NL . Indeed, at these scales the O (q/k) 2 correction to the squeezed bispectrum becomes comparable to the measurement error. Hence, we need to include higher-order terms if we wish to include these modes in our analysis without increasing k min . Includingā 2 as a free parameter provides unbiased constraints on f NL up to scales of at least q max 0.065 h/Mpc as shown in Figure 3. In practice one could either includeā 2 as a free parameter, or impose a stricter soft momentum scale cut and fixā 2 = 0. In our fiducial analysis we choose to varyā 2 , since this leads to an error on f NL that is approximately 20% smaller than what we find when fixingā 2 = 0 and decreasing q max . Alternatively, one could extend the validity of the two-parameter model to include more soft momenta bins by increasing k min ; however, since the maximum accessible value of k is often limited both in observations and simulations (e.g. by shot-noise or resolution) we do not utilize this method.
Finally, we rerun our analysis including terms up to quartic order in q by varying f NL ,ā 0 ,ā 2 , andā 4 . In Figure 7 we show the 2D marginalized posteriors of f NL −f true NL andā 4 . We find thatā 4 is completely consistent with zero and that we continue to recover unbiased constraints on f NL . Includingā 4 as a free parameter increases the error on f NL by 50%, thus we fixā 4 and all higher-order coefficients to zero in our fiducial analysis.

Appendix C: Likelihood and Fisher matrix
In this Appendix we derive the likelihood and Fisher matrix used in our analysis. We first start on general grounds, and then introduce simplifying approximations to ease the computational burden.
Consider the angular averaged bispectrum, B iJ , binned in both soft and hard momenta bins. Here i, j, . . . label the soft modes and I, J, . . . the hard ones. Consistency relations imply that, in the squeezed limit, the angleaveraged bispectrum is a stochastic variable that fluctuates around its true value. For sufficiently thin bins, the latter is given byB iI = s iIPi , whereP i is the true (soft) power spectrum, and s iI is defined as with Similarly, the power spectrum P i is a stochastic variable fluctuating aroundP i . Assuming that both B iJ and P i are Gaussian distributed, we interpret their probability distribution as the likelihood for the unknown parameters (f NL , a n ,P i ), which reads where, contrary to the main text, we explicitly let the vectors run over the hard mode indices, I, J. We have also used the fact that the covariance matrices are approximately diagonal in the soft mode, as verified in our simulations. Using the standard identities for the inverse of a block matrix, the Ψ's can be related to the covariance matrices by, where C BB iIJ ≡ δB iI δB iJ , C BP iI ≡ δB iI δP i , C P P i ≡ δP i δP i , and · · · denotes an ensemble average, which will be estimated from simulations. In particular, in the space of the hard momenta bins, Ψ BB i is a matrix, Ψ BP i a vector, and Ψ P P i a scalar.
1. The likelihood for the analysis As described in Section III A, in our main analysis we average over all hard modes collected in a single k-bin. The corresponding likelihood should then be obtained from Eq. (C3) when the vector index is trivial, e.g., a n → a n . Since we do not fit for the true value of the power spectrum, we also marginalize overP i , obtaining, . Note that the likelihood above is simply a Gaussian one, but with a parameter dependent covariance.
In the analysis described in Section III B we start from the likelihood above, but introduce two modifications for practical purposes. First, since we are estimating the covariance matrix from simulations, we promote it to a non-diagonal one. We have checked that, even assuming a diagonal covariance in our analysis does not change the results appreciably. Second, in principle the only independent stochastic quantity in the theory model is the soft power spectrum, P i . However, when accounting for finite size bins for which q n δ q δ −q = q n δ q δ −q , the different terms in s i will weight P i by different powers of q i , with effects that are hard to predict. We therefore find it more convenient to treat the M in -defined in Eq. (11)as different stochastic variables. In summary, we employ the likelihood Eq. (C5), but promoting the covariance matrix to which, for a finite number of realizations, results in Eq. (15).

The Fisher matrix
To determine the Fisher matrix we work with the original likelihood, Eq. (C3). To reduce the computational burden and reliably estimate the Fisher matrix from 40 mocks, we assume a 2 = 0 and hence the model parameters are (f NL , a 0 ,P i ). The Fisher matrix is then given by the Hessian of − log L, and can be written in a block form as, If we have N k hard mode bins, then F 1 is a (N k + 1) × (N k + 1) matrix with entries, If we have N q soft bins, F 2 is a (N k + 1) × N q matrix with entries, Finally, F 3 is a diagonal N q × N q matrix, By the Cramér-Rao theorem, the variance on a measurement of f NL will then satisfy the bound, which is what we use in our analysis of Section IV B. For the forecasts in Section IV B we assume f NL = 0 and that P i is equal the mean power spectrum as measured from the 40 realizations with Gaussian initial conditions. To determine a set of fiducial values for a 0 we must account for its k-dependence. To do this, we use the consistency relation from [41], a 0 (k) = 1 + 13 21 where D + is the linear growing mode. In particular, we estimate a 0 by computing Eq. (C12) using halofit. This prediction agrees within 5% of our measured value ofā 0 over all redshifts and scale cuts analyzed in this work, and is therefore sufficient for the forecasts.

Appendix D: Optimal weighting schemes
In this Appendix, we consider possible optimal weighting schemes, W (q, k), for the compressed bispectrum, B(q) ≡ k W (q, k)B(q, k). Note that in our main analysis we always fix W (q, k) = 1 as we find that the weights derived in this section do not noticeably improve our constraints on f NL ; nevertheless, we present our main findings below since they could be relevant for future works measuring f NL from consistency relations. To derive the weights, we will compute the maximum-likelihood estimator for theā n coefficients appearing in the bispectrum model of Eq. (13) for a given q-bin using a simplified treatment of our likelihood and covariance.
For simplicity, we assume a Gaussian likelihood for the bispectrum and work in the continuous limit of q and k. The likelihood of the model parametersā n is then, up to a constant term, log L(ā n ) ∝ − q q k k δB(ā n ; q, k) × C −1 B (q, k; q , k ) × δB(ā n ; q , k ) , where δB(ā n ; q, k) ≡ B(q, k) − nā nMn (q) withM n (q) defined via Eq. (11), C B is the squeezed bispectrum covariance, and the integral is carried out over all soft and hard modes allowed for a given choice of scale cuts. In the squeezed limit, the bispectrum covariance is approximately given by [49] C B (q, k; k , q ) (2π) 3 δ D (q − q ) (D2) × [P (q) C P (k, k ) + 2B(q , k )B(q, k)] , where C P is the (non-linear) power spectrum covariance.
Under null assumptions, i.e.ā n = 0 for all n, the bispectrum vanishes and only the first term survives, and the likelihood for a single q mode becomes log L(ā n , q) ∝ − 1 P (q) k k δB(ā n ; q, k) × C −1 P (k, k ) × δB(ā n ; q , k ) , (D3) up to a constant term. Given Eq. (D3), we may derive an optimal estimator forā n (for a single q-mode) by maximizing log L(ā n , q) with respect toā n . This yields with the coupling matrix The optimal weight is thus W n (q, k) ∝ k C −1 P (k, k )M n (q), with one weight per choice of n.
It is useful to consider two simplifying limits. First, if the covariance is dominated by its Gaussian contribution, V C P (k, k ) = 2(2π) 3 δ D (k − k )P 2 (k), yielding a n → (D6) this allows us to remove the k integral. Second, if the coupling betweenā n coefficients is weak (i.e. K mn is close to diagonal), we may apply the optimal weighting only for the m = −2 coefficient. Inserting the relevant template (Eq. 9), we find the quasi-optimal weighting W (q, k) = ∂ log P (k)/∂ log σ 2 8 /P (k) (dropping k-independent factors), and thus the compressed bispectrum B(q) ≡ k B(q, k) P (k) ∂ log P (k) ∂ log σ 2 8 . (D7) Strictly speaking, this weighting could be far from optimal in the high-k limit, whereupon the power spectrum covariance is far from diagonal. In this case, invoking only the second assumption, one might use This may be of greater use in practice, and is straightforward to implement if the weighting is estimated from simulations, e.g., Quijote. Finally, one may wish to retain the bispectra appearing in Eq. (D2), assuming non-zero fiducial values ofā n : this can be done similarly, though the covariance will become a function also of q.