Strong bound on canonical ultra-light axion dark matter from the Lyman-alpha forest

We present a new bound on the ultra-light axion (ULA) dark matter mass $m_\text{a}$, using the Lyman-alpha forest to look for suppressed cosmic structure growth: a 95% lower limit $m_\text{a}>2 \times 10^{-20}\,\text{eV}$. This strongly disfavors ($>99.7\%$ credibility) the canonical ULA with $10^{-22}\,\text{eV}<m_\text{a}<10^{-21}\,\text{eV}$, motivated by the string axiverse and solutions to possible tensions in the cold dark matter model. We strengthen previous equivalent bounds by about an order of magnitude. We demonstrate the robustness of our results using an optimized emulator of improved hydrodynamical simulations.

Introduction -The axion is a well-motivated dark matter particle candidate that can also explain the lack of observed CP violation in quantum chromodynamics [the "strong CP problem"; 1-6]. Ultra-light axions (ULAs) are axion-like particles with very small masses ($m_\text{a} \lesssim 10^{-10}\,\text{eV}$). These are generically produced in theories beyond the Standard Model, e.g., string theories, which can predict the existence of many different axions [the string "axiverse"; e.g., 7] that can comprise the dark matter [e.g., 8]. ULAs with masses $\sim 10^{-22}\,\text{eV}$ (also known as fuzzy dark matter) are of particular interest since this may be a preferred mass scale in the string axiverse [e.g., 9,10]. Further, these axions are sufficiently light that wave-like behavior would manifest on astrophysical scales [$\sim$ kpc to Mpc; 8]. This could explain possible tensions in the standard cold dark matter (CDM) model between observations and simulations on galactic scales [the so-called CDM "small-scale crisis"; 11]; although, e.g., it is now clear that accurately simulating the co-evolution of dark matter and baryons is vital in this context [12].
ULAs suppress the growth of cosmological structure below a certain scale ($\sim$ Mpc). This scale is set by the so-called "quantum pressure" of ULAs [e.g., 13,14]. It is a function of axion mass such that heavier axions have smaller cut-off scales (at larger wavenumbers $k$). Current bounds from the early Universe exclude ULAs being more than half of the dark matter with masses $m_\text{a} \leq 10^{-23}\,\text{eV}$ [15-19].
In order to probe the canonical mass scale of $10^{-22}\,\text{eV}$, we must exploit the smallest scales currently accessible in the linear matter power spectrum. This is possible using the Lyman-alpha forest, neutral hydrogen absorption seen in high-redshift quasar spectra ($2 \lesssim z \lesssim 6$) [20]. The absorption lines trace fluctuations in the intergalactic medium (IGM): the low-density (around mean cosmic density), largely primordial gas in-between galaxies. It follows that the flux power spectrum (correlations of the transmitted flux in the Lyman-alpha forest) is a powerful tracer of the linear matter power spectrum. By exploiting the highest resolution spectra available [21-23], we probe the matter power spectrum down to sub-Mpc scales [24-28] and hence power spectrum cut-offs from larger axion masses. The ULA smoothing "Jeans" length also mildly increases with redshift and so using higher-redshift Lyman-alpha forest measurements improves axion bounds.
In this work, we improve upon previous ULA bounds using the Lyman-alpha forest [29-31] by exploiting a robust method for modeling the data, which we introduced in Refs. [32,33]. This allows, for the first time, tests of the robustness of the bounds with respect to the fidelity of the theoretical modeling. "Emulation" of the flux power spectrum is necessary since "brute-force" sampling of the parameter space (as required by, e.g., Markov chain Monte Carlo (MCMC) methods) is computationally infeasible due to the many expensive hydrodynamical simulations needed. An emulator is a computationally-cheaper but accurate model for the power spectrum, which can be called within MCMC and is built from a small set of "training" simulations [34]. We optimize the construction of this training set by using Bayesian optimization [35], a form of adaptive machine learning. The emulator model we use makes fewer assumptions and is more robust in its statistical modeling than existing linear interpolation techniques [e.g., 29,36,37]. More details on our methodology and cross-validation and convergence tests are presented in Ref. [38].
Model -We model the effect of ULA dark matter on the IGM with suppressed initial conditions at $z = 99$. This captures the small-scale power spectrum suppression that propagates to the flux power spectrum at $z \sim 5$. The initial conditions are defined by a transfer function [39]: $T^2(k) = P_\text{ULA}(k)/P_\text{CDM}(k)$.
Here, $P_\text{ULA}(k)$ and $P_\text{CDM}(k)$ are respectively the linear matter power spectra for ULA dark matter and cold dark matter as a function of wavenumber $k$. $T(k)$ is characterized by three free functions [$\alpha(m_\text{a})$, $\beta(m_\text{a})$, $\gamma(m_\text{a})$], each a function of ULA mass $m_\text{a}$; $\alpha(m_\text{a})$ sets the scale of suppression, while $\beta(m_\text{a})$ and $\gamma(m_\text{a})$ set the shape of the power spectrum cut-off. We fit these functions using a polynomial model to transfer functions given by the modified Boltzmann code axionCAMB [40,41], which calculates cosmological evolution in the presence of a homogeneous ULA field. This parametric model accurately captures the key feature of a sharp small-scale cut-off in the power spectrum as part of our general emulator-inference framework for cosmological dark matter bounds [see 38, and also below]. Following Refs. [42,43], we model the effect of ULA quantum pressure only by modified initial conditions, as this is sufficient for the current sensitivity of data (see below).
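As an illustration, once [$\alpha$, $\beta$, $\gamma$] are specified, the suppression can be applied to any input CDM power spectrum. The sketch below assumes a simple cut-off form $T(k) = [1 + (\alpha k)^\beta]^\gamma$ consistent with the description above ($\alpha$ sets the scale; $\beta$ and $\gamma$ the shape), with illustrative, not fitted, parameter values; in the analysis itself, [$\alpha$, $\beta$, $\gamma$] are polynomial fits in $m_\text{a}$ to axionCAMB transfer functions.

```python
import numpy as np

def ula_transfer(k, alpha, beta, gamma):
    """Illustrative parametric ULA transfer function T(k) = [1 + (alpha*k)**beta]**gamma.
    With beta > 0 and gamma < 0, T -> 1 on large scales (small k) and
    produces a sharp cut-off on small scales (large k)."""
    return (1.0 + (alpha * k)**beta)**gamma

def suppress_power(k, p_cdm, alpha, beta, gamma):
    """P_ULA(k) = T^2(k) * P_CDM(k)."""
    return ula_transfer(k, alpha, beta, gamma)**2 * p_cdm

k = np.logspace(-2, 2, 500)   # wavenumbers, illustrative units
p_cdm = k**(-1.5)             # stand-in for a CDM power spectrum
p_ula = suppress_power(k, p_cdm, alpha=0.05, beta=5.0, gamma=-2.0)
```

Heavier axions correspond to a smaller $\alpha$ (the cut-off moves to larger $k$), which is the behavior the bound exploits.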
For robust ULA bounds, we marginalize over uncertainties in the thermal state of the IGM. This accounts for suppression in the flux power spectrum arising from pressure smoothing, non-linear peculiar gas velocities and the thermal broadening of absorption lines [44]. We model this with three free parameters per redshift bin of our data (see below; $z_i = [4.2, 4.6, 5.0]$). The vast majority of the IGM gas at about mean cosmic density (to which the forest is sensitive) at $z \sim 5$ is well-described by a power-law temperature $T(z)$ -(over-)density $\Delta$ relation [45]: $T(z) = T_0(z) \Delta^{\gamma(z)-1}$. This has two free parameters: the temperature at mean density $T_0(z)$ and a slope $\gamma(z)$. We track the heat deposited in the IGM owing to cosmic reionization by the integrated energy injected per unit mass at the mean density $u_0(z)$ [46]. This tracks the filtering scale in the IGM gas, which is the relevant pressure smoothing scale for an evolving thermal state in an expanding universe [47,48]. The uniform ultraviolet (UV) photo-ionization rate is degenerate in the flux power spectrum with the mean amount of absorption in quasar spectra. We therefore account for uncertainty in the ionization state of the IGM by marginalizing over the effective optical depth $\tau_\text{eff} = -\ln \bar{F}$, where $\bar{F}$ is the mean transmitted flux fraction. Our free parameter is a multiplicative factor $\tau_0(z = z_i)$ to the fiducial redshift dependence of $\tau_\text{eff}$ given by Ref. [28].
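For concreteness, the two relations above are one-liners. This is a minimal sketch (not the analysis code) of the temperature-density relation $T = T_0 \Delta^{\gamma-1}$ and the effective optical depth $\tau_\text{eff} = -\ln\bar{F}$; the values of $T_0$ and $\gamma$ in the example are illustrative.

```python
import numpy as np

def igm_temperature(delta, T0, gamma):
    """Power-law temperature-(over-)density relation: T = T0 * Delta**(gamma - 1)."""
    return T0 * delta**(gamma - 1.0)

def tau_eff(mean_flux):
    """Effective optical depth tau_eff = -ln(<F>), <F> the mean transmitted flux."""
    return -np.log(mean_flux)

# Gas at twice the mean density, for illustrative T0 = 8000 K, gamma = 1.3:
T = igm_temperature(2.0, 8000.0, 1.3)  # about 9850 K
```

Because $\gamma > 1$ at these redshifts, overdense gas is hotter than gas at the mean density.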
In order to accurately bound the ULA power spectrum suppression scale, we marginalize over the slope $n_\text{s} \in [0.9, 0.995]$ and amplitude $A_\text{s} \in [1.2 \times 10^{-9}, 2.5 \times 10^{-9}]$ of the primordial power spectrum, with a pivot scale $k_\text{p} = 2\,\text{Mpc}^{-1}$. Otherwise, we fix our cosmology to the baseline Planck 2018 parameters [49]: in particular, physical baryon energy density $\Omega_\text{b} h^2 = 0.022126$, physical dark matter energy density $\Omega_\text{c} h^2 = 0.12068$ and dimensionless Hubble parameter $h = 0.6686$.
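The marginalized primordial spectrum is the standard power law; a sketch, using the pivot scale $k_\text{p} = 2\,\text{Mpc}^{-1}$ quoted above:

```python
def primordial_power(k, A_s, n_s, k_p=2.0):
    """Primordial power spectrum P(k) = A_s * (k / k_p)**(n_s - 1),
    with k and the pivot scale k_p in Mpc^-1."""
    return A_s * (k / k_p)**(n_s - 1.0)
```

At the pivot scale the amplitude is exactly $A_\text{s}$; a red tilt ($n_\text{s} < 1$) means less power on scales smaller than the pivot, which is partially degenerate with a dark matter cut-off and hence must be marginalized.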
Simulations -The 1D flux power spectrum measures correlations along the line-of-sight only (i.e., integrated over transverse directions) in the transmitted flux $F$ normalized by the mean flux $\bar{F}$. In order to model this with sufficient accuracy, we run cosmological hydrodynamical simulations of the IGM using the publicly-available code MP-Gadget [50-52]. We evolve $512^3$ particles each of dark matter and gas in a $(10\,h^{-1}\,\text{Mpc})^3$ box from $z = 99$ to $z = 4.2$. For computational efficiency gain with a negligible effect on Lyman-alpha forest statistics, we concentrate computational resources on the colder lower-density IGM gas using the quick-lya flag [53]. At each redshift bin of our data $z = [4.2, 4.6, 5.0]$, we generate 32000 mock spectra (with pixel widths $\Delta v = 1\,\text{km}\,\text{s}^{-1}$) containing only the Lyman-alpha absorption line and measure the flux power spectrum using fake_spectra [54]. This uses a fast Fourier transform-based method which corrects for the pixel window function [55]. We evaluate the power spectrum at the discrete sample wavenumbers of the box in inverse-distance units and then linearly interpolate to the velocity wavenumbers of the data [see 38].
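Schematically, the FFT-based estimate for a single sightline looks as follows. This is a minimal numpy sketch, not the fake_spectra implementation, and it omits the pixel window-function correction mentioned above.

```python
import numpy as np

def flux_power_1d(flux, dv):
    """Estimate the 1D flux power spectrum of one mock sightline.
    flux: transmitted flux F per pixel; dv: pixel width [km/s].
    The flux contrast is delta_F = F/<F> - 1, and the power is
    P(k) = |FFT(delta_F)|^2 * L / N^2, with L the sightline length."""
    n = len(flux)
    length = n * dv                           # total velocity length [km/s]
    delta_f = flux / np.mean(flux) - 1.0
    dft = np.fft.rfft(delta_f)
    power = np.abs(dft)**2 * length / n**2
    k = 2.0 * np.pi * np.fft.rfftfreq(n, d=dv)  # velocity wavenumbers [s/km]
    return k, power

# A single Fourier mode in the flux appears as a spike at the matching k:
n, dv = 1024, 1.0
flux = 1.0 + 0.1 * np.cos(2.0 * np.pi * 8.0 * np.arange(n) / n)
k, power = flux_power_1d(flux, dv)
```

In the analysis the estimate is averaged over many sightlines per redshift bin.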
Our simulations are optically thin, and heated and ionized by a spatially-uniform set of UV background (UVB) rates [56]. In order to vary the output thermal IGM parameters (see above), we vary the simulation input. This entails varying the amplitude $H_\text{A} \in [0.05, 3.5]$ and slope $H_\text{S} \in [-1.3, 0.7]$ in an overdensity-dependent rescaling of the default heating rates: $\epsilon_i(z) = H_\text{A}\,\epsilon_{0,i}(z)\,\Delta^{H_\text{S}}$, for $i \in [\text{HI}, \text{HeI}, \text{HeII}]$. We also vary the mid-point redshift of hydrogen reionization $z_\text{rei} \in [6, 15]$ and the total heat injection during reionization $T_\text{rei} \in [1.5 \times 10^4, 4 \times 10^4]$ K according to the model of Ref. [57]. We allow for further variation beyond these inputs by varying each redshift bin separately (see above); this in particular allows for uncertainty in the timing of HeII reionization. We vary $\tau_0(z = z_i) \in [0.75, 1.25]$ (and hence the IGM ionization state; see above) via a computationally-cheap simulation post-processing step. Further discussion and tests of numerical simulation convergence and the effect of mis-modeling the mean flux (i.e., using a rolling mean) are presented in Ref. [38].
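The heating-rate rescaling is a simple reparametrization; a sketch, with illustrative values:

```python
def rescaled_heating(eps0, delta, H_A, H_S):
    """Overdensity-dependent rescaling of a default photoheating rate:
    eps_i(z) = H_A * eps0_i(z) * Delta**H_S, applied for i in [HI, HeI, HeII].
    H_A rescales the overall amplitude; H_S tilts the heating with overdensity
    (H_S > 0 preferentially heats overdense gas)."""
    return H_A * eps0 * delta**H_S

# H_A = 1, H_S = 0 recovers the default rates exactly.
```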
Data -We use the 1D Lyman-alpha forest flux power spectrum described in Ref. [28] and presented in Fig. 1. This constitutes a measurement of the flux power spectrum including smaller scales than previously accessed, leading us to anticipate improved bounds on dark matter models as a consequence. The flux power spectrum is measured from a sample of fifteen high-resolution (FWHM $\sim 6\,\text{km}\,\text{s}^{-1}$) quasar spectra with emission redshifts $4.89 \leq z \leq 5.42$ and Lyman-alpha forests for $3.99 \leq z \leq 5.31$; eleven were obtained with the HIRES spectrometer on the Keck Telescope [21] and four with the UVES instrument on the Very Large Telescope [23]. We use data which have been corrected (at the power spectrum level) for the spectrographic resolution [we test the impact of imperfect resolution modeling in 38], pixel window function, instrumental noise and metal absorption lines. The measured flux power spectra (including full covariance matrices; different redshift bins are modeled as independent) are publicly available at https://iopscience.iop.org/article/10.3847/1538-4357/aafee4.
Emulation and inference -In order to be able to sample the parameter space in a computationally-feasible manner, we "emulate" the flux power spectrum (at the discrete simulation sample wavenumbers) as a function of the simulation parameters $\theta$ (see above). We map from $m_\text{a}$ to $[\alpha, \beta, \gamma]$ using the parametric model defined above and fix $\Omega_\text{m} = 0.3209$ [49]. We follow the Bayesian emulator optimization method we presented in Refs. [32,33,38], which uses a Gaussian process as the emulator model [34]. This is a flexible probabilistic model, which can predict the flux power spectrum throughout the parameter space with, in general, more precise predictions closer to training points. For the Gaussian process, we use a linear combination of a squared exponential, linear and constant noise kernel.
We adaptively optimize the construction of the emulator training set, ensuring convergence in parameter estimation with respect to the accuracy of the emulator model. In total, we build an emulator with 93 training simulations, each with ten samples evenly distributed in the $\tau_0(z = z_i)$ dimension at each redshift (since this parameter can be post-processed computationally cheaply), i.e., three emulators each with 930 training points. This consists of an initial Latin hypercube [58] of fifty simulations evenly spanning the full $\theta$ volume (excluding $\tau_0(z = z_i)$) as part of the general dark matter emulator we present in Ref. [38]. We then add 43 further training simulations, the positions in parameter space of which are iteratively selected by Bayesian optimization [32]. We present comprehensive tests of the method using cross-validation and convergence checks in Ref. [38]; a summary is presented in the supplemental material.
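The initial space-filling design can be sketched as follows (a numpy-only Latin hypercube; the parameter bounds shown are those quoted above for $[H_\text{A}, H_\text{S}, z_\text{rei}, T_\text{rei}, n_\text{s}, A_\text{s}]$, used here purely to illustrate the scaling step):

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """n-point Latin hypercube in [0,1]^d: each dimension is split into n
    equal strata with exactly one point per stratum, shuffled independently
    per dimension, plus uniform jitter within each stratum."""
    strata = rng.permuted(np.tile(np.arange(n), (d, 1)), axis=1).T  # (n, d)
    return (strata + rng.uniform(size=(n, d))) / n

rng = np.random.default_rng(0)
unit = latin_hypercube(50, 6, rng)
lower = np.array([0.05, -1.3, 6.0, 1.5e4, 0.9, 1.2e-9])
upper = np.array([3.5, 0.7, 15.0, 4.0e4, 0.995, 2.5e-9])
design = lower + unit * (upper - lower)  # 50 simulation input vectors
```

Unlike random sampling, the stratification guarantees every 1D marginal of the design evenly covers its range, which is useful when only fifty expensive simulations are affordable.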
We sample the posterior distribution for parameters $\phi = [\log(m_\text{a}[\text{eV}]), \tau_0(z = z_i), T_0(z = z_i), \gamma(z = z_i), u_0(z = z_i), n_\text{s}, A_\text{s}]$, for $z_i = [4.2, 4.6, 5.0]$, using the MCMC ensemble sampler emcee [59]. We use a Gaussian likelihood function, with the data and their covariance as given above and the emulator covariance added in quadrature to propagate theoretical uncertainty. The theory flux power spectrum is predicted by the optimized emulator along with the modeling of the covariance at that position in parameter space. In our prior distribution, we exclude the edges of the $T_0(z = z_i)$ -$u_0(z = z_i)$ plane at each redshift not spanned by our training set, since these areas include unphysical IGMs, i.e., high temperatures (high $T_0$) with little previous heating (low $u_0$) and vice versa. Further, to prevent unphysical sudden changes in the IGM [e.g., 28,29] in adjacent redshift bins (which would be inconsistent with previous observations [e.g., 60]), we prevent changes in $T_0$ greater than 5000 K and changes in $u_0$ greater than 10 eV $m_\text{p}^{-1}$ ($m_\text{p}$ being the proton mass). We use Planck 2018-motivated [49] priors on $n_\text{s}$ and $A_\text{s}$ (translated to the pivot scale we use): Gaussian distributions with means 0.9635 and $1.8296 \times 10^{-9}$ and standard deviations 0.0057 and $0.030 \times 10^{-9}$, respectively. In order to disfavor very cold IGMs which are hard to motivate physically [e.g., 36] and following previous analyses [e.g., 29,31,61], we use a conservative Gaussian prior on $T_0(z = z_i)$ with means set to our fiducial model ([8022, 7651, 8673] K at $z = [5.0, 4.6, 4.2]$) and standard deviations of 3000 K. As the effective optical depth is otherwise poorly constrained by our data and following Ref. [61], we use a Gaussian prior on $\tau_0(z = z_i)$ with mean 1 and standard deviation 0.05.
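The likelihood evaluation can be sketched as a generic multivariate Gaussian log-likelihood with the emulator covariance added to the data covariance (illustrative, not the analysis code):

```python
import numpy as np

def log_likelihood(data, model, cov_data, cov_emu):
    """Gaussian log-likelihood of the measured flux power given the emulator
    prediction, with the emulator covariance added in quadrature to the data
    covariance to propagate theoretical (emulator) uncertainty."""
    cov = cov_data + cov_emu
    resid = data - model
    chol = np.linalg.cholesky(cov)          # cov = L L^T
    alpha = np.linalg.solve(chol, resid)    # so alpha.alpha = resid^T cov^-1 resid
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (alpha @ alpha + log_det + len(data) * np.log(2.0 * np.pi))
```

Within MCMC this is summed over the (independent) redshift bins and added to the log-prior terms described above.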
Our prior is uniform in the logarithm of the ULA mass $m_\text{a}$ [eV]. This differs from previous analyses [e.g., 29], which have usually been uniform in $m_\text{a}^{-1}$. We argue that our choice constitutes a less informative prior [62] on the axion mass, as it is more agnostic about the preferred mass scale. We bound the uniform prior on $\log(m_\text{a}[\text{eV}]) \in [-22, -19]$. This extends from the canonical ULA mass of $10^{-22}\,\text{eV}$ (which is already excluded in previous analyses) to the heaviest ULA mass that our data can probe ($10^{-19}\,\text{eV}$); the power spectrum cut-off from heavier axions manifests on smaller scales than those accessible in our data.
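The practical difference between the two priors is easy to check numerically: a prior uniform in $\log(m_\text{a})$ assigns equal mass to each decade, whereas a prior uniform in $m_\text{a}^{-1}$ concentrates almost all of its mass in the lightest decade (a quick Monte Carlo sketch):

```python
import numpy as np

# Uniform in log10(m_a [eV]) over [-22, -19]: equal mass per decade.
log_m = np.random.default_rng(1).uniform(-22.0, -19.0, size=100_000)
frac_per_decade = [np.mean((log_m >= lo) & (log_m < lo + 1.0))
                   for lo in (-22.0, -21.0, -20.0)]  # each ~1/3

# Uniform in 1/m_a over the same mass range: 1/m_a in [1e19, 1e22] eV^-1.
inv_m = np.random.default_rng(2).uniform(1e19, 1e22, size=100_000)
frac_light = np.mean(inv_m > 1e21)  # mass below 1e-21 eV: ~90% of prior mass
```

A prior uniform in $m_\text{a}^{-1}$ therefore strongly favors the lightest (already most constrained) masses a priori, which is what "more agnostic" refers to.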
Results -Our main result can be summarized by a 95% credible lower limit on the logarithm of the ULA dark matter mass (marginalized over the nuisance IGM and cosmological parameters described above): $\log(m_\text{a}[\text{eV}]) > -19.64$, which equates to $m_\text{a} > 2 \times 10^{-20}\,\text{eV}$. Table I gives the full set of 1D marginalized 95% credible intervals, while Fig. 3 in the supplemental material shows a summary of the marginalized posterior distributions. Even when marginalizing over parameters which themselves induce small-scale suppression in the flux power spectrum (see above), the lightest axions that we consider ($m_\text{a} \lesssim 10^{-20}\,\text{eV}$) are heavily disfavored relative to the cold dark matter limit. The IGM thermal state over which we marginalize is consistent with no statistically significant redshift evolution. This differs from some previous analyses, which have suggested that the temperature at mean density increases from the highest to the next redshift bin [e.g., 60]; although others have also found no significant evolution [e.g., 28] and this is consistent with fiducial UVB heating rates [e.g., 56,63]. Further, we find no significant degeneracy between $\log(m_\text{a}[\text{eV}])$ and $T_0(z = z_i)$ (see Fig. 3), owing to the wide range of scales and redshifts we exploit. The values of $\gamma$ and $u_0$ are otherwise consistent (within 95% limits) with previous observations [e.g., 60] and fiducial expectations [e.g., 45]. The limits on the effective optical depth are consistent with our fiducial model [28] except at $z = 5.0$, where a lower optical depth is preferred (the marginalized mean is 8% lower). The distributions of the cosmological parameters have not significantly updated from the prior, indicating, as expected, no constraining power on these parameters from our dataset. Figure 1 compares the data we use (see above) with the maximum posterior flux power spectrum. The fit between data and model is good and the modeling uncertainties on the theory power spectrum are too small to be seen.
This is because, due to the Bayesian emulator optimization (see above), the emulator uncertainty in the peak of the posterior (and its 95% credible region) is smaller than the data uncertainty [see 38].
Discussion -Figure 2 compares our bound to other competitive bounds (see caption for details). Our work closes a window of allowed ULA dark matter masses between the early-Universe constraints at the light end and the black hole super-radiance bounds at higher masses. Our new lower limit on the ULA dark matter mass of $2 \times 10^{-20}\,\text{eV}$ improves over previous equivalent bounds [29] by about an order of magnitude. These bounds (including our own) can be weakened when considering the case where ULAs do not make up all the dark matter [31], but we defer analysis of these mixed dark matter models to future work.
We attribute the strengthening of the Lyman-alpha forest bound to a number of key improvements in our analysis. First, we are able to exploit data to much smaller scales ($k_\text{f}^\text{max} = 0.2\,\text{s}\,\text{km}^{-1}$) than in previous analyses [$k_\text{f}^\text{max} = 0.08\,\text{s}\,\text{km}^{-1}$; 24]. Since the ULA dark matter suppression scale (at which the linear matter power spectrum drops by a half relative to CDM) scales as $k_{1/2} \propto m_\text{a}^{4/9}$ [8], by accessing wavenumbers 2.5 times larger, we can probe axion masses about eight times heavier. Second, we model the simulated flux power spectra using a Bayesian-optimized Gaussian process emulator, which explicitly tests for convergence in parameter estimation with respect to the accuracy of the emulator model [32,33,38]. This contrasts with previous simulation interpolation methods [e.g., 29,36,37], which have relied on a Taylor expansion around a fiducial point and which we have shown in prior work can bias power spectrum estimation and weaken parameter constraints [33].
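The quoted factor follows directly from inverting the scaling: $m_\text{a} \propto k_{1/2}^{9/4}$, so a factor 2.5 gain in wavenumber corresponds to $2.5^{9/4} \approx 8$ in mass:

```python
# Invert k_1/2 ∝ m_a**(4/9) to get m_a ∝ k_1/2**(9/4).
k_gain = 0.2 / 0.08                # gain in maximum wavenumber = 2.5
mass_gain = k_gain ** (9.0 / 4.0)  # ~7.9: about eight times heavier
```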
Third, we marginalize over a physically-consistent IGM model (see above), which allows for a wide range of heating and ionization histories. In previous analyses, the temperature-density relation was varied freely as a function of redshift along with a single redshift of reionization to trace the pressure smoothing in the IGM at all redshifts. This means that IGMs were included with instantaneous temperatures [$T_0(z)$, $\gamma(z)$] which are inconsistent with the single thermal history determined by a given reionization redshift. Our model allows physically-motivated flexibility by additionally varying the total heat input during reionization [57] and allowing for deviation from fiducial redshift dependencies in the integrated heating (by the $u_0(z = z_i)$ parameters). This may in fact partially weaken the bound on the ULA mass, by marginalizing over a wider range of physical IGM histories. Finally, our prior is uniform in $\log(m_\text{a}[\text{eV}])$, which we argue is less informative than in previous studies [e.g., 29], where the prior is usually uniform in $m_\text{a}^{-1}$. Nevertheless, the improved modeling and data have allowed us to obtain a strong bound even in the presence of these less restrictive analysis choices.

FIG. 2. [...] [15,16]. A combination of the high-redshift UV luminosity function [18] and the optical depth to reionization [19] excludes at $3\sigma$ ULA dark matter for $m_\text{a} = 10^{-22}\,\text{eV}$ (with some sensitivity to the reionization model) [17]. The non-detection of supermassive black hole super-radiance (BHSR) excludes $10^{-18}\,\text{eV} \lesssim m_\text{a} \lesssim 10^{-16}\,\text{eV}$ [66,67]. The sub-halo mass function excludes $m_\text{a} \lesssim 2.1 \times 10^{-21}\,\text{eV}$ [68], while the equivalent previous bound (see main text) from the Lyman-alpha forest excludes $m_\text{a} < 2 \times 10^{-21}\,\text{eV}$ [29]. In this work, we exclude $m_\text{a} < 2 \times 10^{-20}\,\text{eV}$ (at 95% credibility). We consider here only the case where ULAs form all the dark matter; these bounds can be partially weakened if ULAs are a sub-dominant component.
Conclusions -We present a new lower limit on the mass of an ultra-light axion dark matter particle at 95% credibility: $\log(m_\text{a}[\text{eV}]) > -19.64$ or $m_\text{a} > 2 \times 10^{-20}\,\text{eV}$. This heavily disfavors (at $> 99.7\%$ credibility) the canonical ultra-light axion with masses $10^{-22}\,\text{eV} < m_\text{a} < 10^{-21}\,\text{eV}$ as being the dark matter, motivated as a preferred mass scale in the string axiverse [9,10] and, additionally, as a solution to the so-called cold dark matter "small-scale crisis" [11]. We have obtained this dark matter bound using the general emulator-inference framework we present in Ref. [38]. In future work, we will exploit this framework to test other dark matter models, including mixed models where ULAs can be a subdominant component of the dark sector. There is further scope to extend the IGM model to include temperature and ionization fluctuations as a consequence of a spatially-inhomogeneous reionization [e.g., 69-71], to which current data may be marginally sensitive [e.g., 9,72]. Dark matter bounds can also benefit from upcoming Lyman-alpha forest observations, e.g., from the Dark Energy Spectroscopic Instrument [73,74], which can better determine the thermal and ionization state of the IGM.
Acknowledgments -KKR thanks Jose Oñorbe for sharing his code for the reionization model of Ref. [57] and for valuable discussions.

Supplemental material
We present here supplemental information on our results and methodology. Figure 3 shows a summary of the posterior distribution of the ULA dark matter mass and nuisance IGM parameters. Figure 4 shows a summary of the emulator convergence tests presented in the companion article [38]. It shows estimates of the 1D marginalized posterior distributions as optimization simulations are added to the training set; the estimates increasingly converge as more are added.
The full convergence test considers the marginalized posteriors of all parameters. Training data are added until successive estimates by MCMC of the posterior (marginalized mean and 1σ and 2σ constraints) converge with respect to each other. At each step, we maximize a modified GP-UCB acquisition function [75-78]. This has two terms which balance exploitation of approximate knowledge of the posterior (using the previous iteration of the emulator) to prioritize regions of high posterior probability, with exploration of the full parameter space, prioritizing regions where modeling (emulator) uncertainty is high. The position in parameter space for each optimization simulation is the maximum acquisition point plus a random displacement. The size of the displacement is tuned to the 95% credible region, in order to explore the peak of the posterior, which must be characterized most accurately. A second convergence criterion is also applied, checking that the exploration term of the acquisition function tends towards zero. This indicates that the acquisition is dominated by exploitation and so has converged towards the posterior peak.
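Schematically, a GP-UCB-style acquisition balances the two terms described above (a sketch; the modified acquisition function actually used is detailed in Refs. [32, 75-78]):

```python
import numpy as np

def gp_ucb_acquisition(post_mean, emu_sigma, nu=1.0):
    """GP-UCB-style acquisition: exploitation (current estimate of the
    log-posterior, from the previous emulator iteration) plus exploration
    (emulator uncertainty), weighted by nu. Training simulations are
    proposed near its maximum; when the exploration term tends to zero,
    the acquisition is dominated by exploitation (a convergence signal)."""
    return post_mean + np.sqrt(nu) * emu_sigma

# Example: exploration can promote an uncertain region over the current peak.
mean = np.array([0.0, 1.0, 0.5])   # estimated log-posterior at candidates
sigma = np.array([2.0, 0.0, 0.1])  # emulator uncertainty at candidates
best = int(np.argmax(gp_ucb_acquisition(mean, sigma)))  # index 0
```

With zero emulator uncertainty everywhere, the acquisition reduces to the posterior estimate itself, i.e., pure exploitation.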
We propose training simulations in the space of inference parameters $\phi$, i.e., we must then map from $m_\text{a}$ to $[\alpha, \beta, \gamma]$ and from $[T_0(z = z_i), \gamma(z = z_i), u_0(z = z_i)]$ to $[H_\text{A}, H_\text{S}, z_\text{rei}, T_\text{rei}]$ to determine the simulation input. For the first mapping, we use the model described in the main text and, for the second, we interpolate a model using existing training data. We do not optimize in the $\tau_0(z = z_i)$ dimensions as these are sampled densely by post-processing. We used the "batch" version of the Bayesian optimization presented in Ref. [32], which adds training data simultaneously in small batches of two to five simulations; this made more efficient use of our computational resources.