Unveiling Dark Matter free-streaming at the smallest scales with high redshift Lyman-alpha forest

This study introduces novel constraints on the free-streaming of thermal relic warm dark matter (WDM) from Lyman-𝛼 forest flux power spectra. Our analysis utilises a high-resolution, high-redshift sample of quasar spectra observed using the HIRES and UVES spectrographs ( 𝑧 = 4 . 2 − 5 . 0). We employ a Bayesian inference framework and a simulation-based likelihood that encompasses various parameters including the free-streaming of dark matter, cosmological parameters, the thermal history of the intergalactic medium, and inhomogeneous reionization, to establish lower limits on the mass of a thermal relic WDM particle of 5 . 7 keV (at 95% C.L.). This result surpasses previous limits from the Lyman-𝛼 forest through reduction of the measured uncertainties due to a larger statistical sample and by measuring clustering to smaller scales ( 𝑘 max = 0 . 2 km − 1 s). The approximately two-fold improvement due to the expanded statistical sample suggests that the effectiveness of Lyman-𝛼 forest constraints on WDM models at high redshifts are limited by the availability of high-quality quasar spectra. Restricting the analysis to comparable scales and thermal history priors as in prior studies ( 𝑘 max < 0 . 1 km − 1 s) lowers the bound on the WDM mass to 4 . 1 keV. As the precision of the measurements increases, it becomes crucial to examine the instrumental and modelling systematics. On the modelling front, we argue that the impact of the thermal history uncertainty on the WDM particle mass constraint has diminished due to improved independent observations. At the smallest scales, the primary source of modeling systematic arises from the structure in the peculiar velocity of the intergalactic medium and inhomogeneous reionization.


I. INTRODUCTION
The Lyman- forest is the main manifestation of the highredshift intergalactic cosmic-web.It is visible in the spectra of quasars (QSOs) and produced by the scattering of the background photons with the neutral hydrogen atoms along the line-of-sight [1,2].The Lyman- forest is a unique probe of geometry and the dynamical state of the Universe, probing diffuse matter around galaxies and in the intergalactic medium (IGM) in regimes which are not covered by other observables, both in terms of redshifts and scales.
In the last decade we have witnessed tremendous progress in the cosmological investigation of the Lyman- forest, mainly along two different directions which are connected to fundamental physics.For example, the discovery of Baryonic Acoustic Oscillations in the 3D correlation function of the transmitted flux has offered the possibility to constrain new physics beyond the standard cosmological model, in the context of allowing curvature or an evolution of the equation of state for dark energy [3].Another important research line, following the work of [4,5], has focussed on the 1D flux power spectrum used to probe the growth of structure down to the smallest scales to see to which extent dark matter free streaming could be constrained.
In this work we will investigate this second aspect and present new results based on a new set of simulations which incorporate the most important physical ingredients [6,7], and a new comprehensive analysis of high-resolution high-redshift data down to the smallest scales.A key goal is to disentangle the different roles of the physical processes able to affect the 1D flux power: the thermal broadening, which is a 1D effect acting along the line-of-sight and is sensitive to the instantaneous gas temperature, and two 3D effects, the gas pressure smoothing that depends on the whole thermal history of the IGM and the dark matter (DM) free streaming.
The possibility of constraining the nature of DM by using the Lyman- forest has motivated a series of works which were able to constrain the models further, explore different particle physics dark matter candidates, and combine likelihoods with other experiments able to constrain the nature of dark matter with strong lensing or flux ratio anomalies [8,9].One of the main reasons to explore warm dark matter (WDM) models was to solve or ease putative problems of cold dark matter at small scales [10,11].However most of these tensions must be discussed also in the context of baryonic physics [12], with processes like galactic feedback playing a major role.Moreover, it appears that minimal extensions of the standard model of particle physics could also accommodate particles like sterile neutrinos or a scalar field [13][14][15][16], which could suppress or erase power at small scales, effectively acting as WDM.
For thermal WDM masses in the keV range, the power suppression happens at the small non-linear scales sampled by the Lyman- forest.In particular, QSO data sets with different resolution and signal-to-noise properties have been used in order to tighten the constraints.The low resolution SDSS and BOSS data sets [5,17,18], the medium resolution X-Shooter sample [19] and the high resolution and high signal-to-noise Keck/HIRES and UVES/VLT QSO spectra [20][21][22][23][24] have all played a major role in the advancement of the field.For example, while the low and medium resolution data are not particularly effective in sampling the scales of the cutoff fully, they nevertheless are sensitive to the thermal history and can return very tight constraints especially when combined with data at smaller scales.The goal of this paper is to give a comprehensive state-of-the-art analysis focusing on high resolution data [25].In Section II we will describe the data set, while in Section III we will present the suite of hydrodynamical simulations used.Section IV will contain our new results which will be extensively discussed in terms of the thermal history of the IGM, the dependence on mass resolution, patchy reionization, instrumental effects (including modelling of the noise) and consistency with results in the literature.We will conclude in section V.

II. DATA
We apply our analysis to the measurements presented in [25].Their 1D flux power spectrum is estimated using 15 high signal-to-noise spectra observed by VLT/UVES [26] and Keck/HIRES [27].The measurements span the high redshift range of  = 4.2 − 5.0 in bins of Δ = 0.4.In each of the redshift bins the flux power spectrum is measured in 15 -bins equidistantly spaced in log 10  in the range of log 10 /[km −1 s] = −2.2 to log 10 /[km −1 s] = −0.7,with logarithmic spacing of Δ log 10 /[km −1 s] = 0.1.Unless specified otherwise we use the full extent of the data, resulting in 45 data points across three redshift bins.
The spectrograph resolution in these observations is very high, with  ∼ 50, 000 (FWHM of ∼ 6 km s −1 for HIRES and ∼ 7 km s −1 for UVES).As already pointed out in the study of [25] the effects of resolution uncertainty are very small, even for the highest wavenumber power spectrum bin measured.A conservative estimate of the 10% uncertainty on the resolution leads only to 1% (5%) uncertainty on the 1D flux power spectrum at scales of  = 0.1 (0.2) km −1 s.The power spectrum measurements of [25] were reported both with and without instrumental resolution correction.In this analysis we use the measurements with instrumental resolution corrected, and propagate this correction through the covariance matrix.The reported measurements are also corrected for power spectrum due to metal contaminants.
The typical flux noise estimated in these measurements is white noise, with its power spectrum amplitude of 0.1 − 0.2 km s −1 .This is comparable to the estimated level of the models at the highest wavenumbers.Characterizing and accounting for the noise levels is of key importance, and has been one of the factors restricting previous analyses to smaller wavenumbers.
The  range of [25] covers the smallest scales measured with the 1D flux power spectrum, extending to  ∼ 0.2 km −1 s, a factor of two higher wavenumber than in previous studies [21][22][23].These studies have shown that the constraining power on WDM models from the Ly forest is dominated by high redshifts and the smallest scales, making this an ideal data set to exploit.

III. SIMULATIONS
The absorption features of the Ly forest contain a wealth of information regarding cosmology and the nature of dark matter, as well as the thermal state of the intergalactic gas.Due to the high sensitivity of the spectrographic instruments it provides a unique window into clustering at the smallest scales.Accessing that information, however, is a non-trivial task.The standard approaches of clustering analysis that invoke biasing schemes typically rely on perturbation theory [28] or build an approximate clustering scheme [29].While very informative in a qualitative sense, these schemes cannot TABLE I: List of simulations used in this work (see also [6]).From left to right, the columns list the simulation name, the box size in ℎ −1 cMpc, the number of particles, the redshift of reionisation (defined as the redshift when the volume averaged ionised fraction 1 −  HI ≤ 10 −3 ), the gas temperature at the mean density,  0 , the cumulative energy input per proton mass at the mean density,  0 , for 4.6 ≤  ≤ 13 [cf.25], and the cosmological model described by ΛCDM parameters ( 8 ,  ) and a WDM parameter for the inverse of the WDM particle mass of a thermal relic ( −1 WDM ).The upper section of the table lists the models in the first set of simulations that we use for our MCMC analysis (see text for details).The lower section of the table lists our second set of simulations, which includes mass resolution (R10) and box size (B40) corrections to the predicted flux power spectrum.The dark matter and gas particle mass are 5.37 × 10 5 ℎ −1  ⊙ and 9.97 × 10 4 ℎ −1  ⊙ respectively for L20, B40 and a subset of R10 runs (2×512   capture the complexity of the data that is highly sensitive to non-linear structure evolution and gas physics, such as Doppler broadening and thermal pressure smoothing [30].Such a task requires simulating the expected Ly forest in different thermal and cosmological models, spanning a wide, multi-dimensional parameter space, and comparing it to the data.In this work we carry out the comparison within the framework of Bayesian inference analysis, which describesaccording to Bayes' theorem -the posterior probability (|) having parameters  given observed data  as: where L (|) is the likelihood and () is the prior on each parameter.
In this work we expand upon the Bayesian inference setup adopted in [7,31] to evaluate the likelihood and prior at each parameter combination in the sampler.The likelihood is evaluated jointly at all the observed data points.This is based on the Monte Carlo Markov Chain (MCMC) sampler, combined with the Metropolis-Hastings algorithm by dynam-ically learning the proposal matrix from the covariance that was introduced in [22].The precision of the thermal parameter recovery with this simulation based emulator was shown to be in good agreement with more advanced machine-learning augmented emulator models [31].
The priors () we adopt in our analysis are described in section IV.The likelihood is modelled as a Gaussian likelihood, determined by the data and its covariance, and a theoretical prediction for the flux power spectrum.The latter is estimated using hydro-dynamical numerical simulations.
We use simulations from the Sherwood-Relics project [6].These are a series of high-resolution cosmological hydrodynamical simulations that use a customized version of P-Gadget3 (see [32] for the original Gadget-2 reference).We use cosmological boxes of size 20 ℎ −1 Mpc with 2 × 1024 3 dark matter and gas particles.The box size and resolution have been chosen to adequately resolve the small scale structure that contributes to the flux power spectrum of the Ly forest, while still retaining a cosmologically relevant volume [33][34][35][36].We further correct the numerical convergence with both box size and resolution with a series of additional simulations summarized in Table I.In all models we use a simple, computationally efficient star-formation scheme -often called Quick_lya -where gas particles are converted into collisionless star particles if they reach overdensities Δ = 1 +  > 10 3 and temperatures  < 10 5 K [37].We assume a flat ΛCDM cosmology with Ω Λ = 0.692, Ω  = 0.308, Ω  = 0.0482,  8 = 0.829,   = 0.961, ℎ = 0.678, and a primordial helium mass abundance of   = 0.24 [38].The initial conditions for the CDM simulations are identical to those used in the earlier Sherwood simulation project [35].We use the WDM transfer function approximation of [21].
A set of simulations is constructed using modifications to the spatially uniform UV background synthesis model introduced by [39].These simulations are similar to models used in earlier works [16,22,40,41], with the main improvements being the larger dynamic range of the simulations, the use of a non-equilibrium thermo-chemistry solver [42], and improved treatment of the IGM opacity that consistently captures the transition between neutral and ionised IGM.
In order to construct a sufficiently well sampled grid of models spanning the entire multi-dimensional parameter range, we post-process the 144 simulations (12 simulations for each cosmology) to obtain different parameter combinations.We follow the method of [25,44] in order to interpolate in the temperature-density plane.Briefly, we rotate and translate the line-of-sight particles in the temperature-density plane to obtain models with different temperature at mean density  0 and temperature-density power-law indices  (the values of  0 and  are inferred from the line-of-sight gas properties; a power-law relation is fitted to points in the temperature density plane in tha range of gas overdensity (0.1 < Δ  < 1.0) and neutral fraction weighted gas temperature ( < 10 5 K)).This preserves the temperature-density cross-correlation coefficient, allowing for an inexpensive construction of models with different thermal parameters on a finely spaced grid.
In post-processing we also vary the redshift evolution of the mean transmission ⟨⟩, by rescaling the optical depth of Ly absorption ( Ly ) obtained from simulations to match observed values of the effective optical depth  eff = − ln⟨⟩.Uncertainties in the background photo-ionization rate mean a rescaling is commonly used to match the simulations to observations [34,45].Note that this step is only a good approximation after reionization, as it implicitly assumes that the gas in the low density IGM is in photo-ionization equilibrium, such that  Ly ∝  HI ∝ Γ −1 HI .The redshift evolution that we adopt for  eff is: taken from [25], and similar to the evolution reported in [40,46].Using the methods described above, we construct a 15 × 10 × 10 grid of parameter values on top of each of the 144 simulations (upper section of Table.I).This grid of models consists of 10 values of  0 spanning the range from 5,000 to 15,000 K in steps of 1,000 K; 10 values of  spanning the range from 0.9 to 1.8 in steps of 0.1; and 15 values of  eff in the range from 0.3 to 1.8 times the value in Eq. 2, in multiplicative steps of 0.1.This gives a total of 15 × 10 × 10 × 12 × (1 + 3 + 2 × 4) =216,000 models.Since we do not extrapolate outside of this grid of models, we have implicit priors on  0 between 5,000 and 15,000 K,  between 0.9 and 1.8, and for  0 between (4.03, 21.12), (3.65, 21.08) and (2.46, 18.73) eV/m p for redshifts 4.2, 4.6 and 5.0, respectively.

A. Flux power spectrum models
From the grid of models we extract 5,000 lines of sight in different orientations through the box.The flux field along each skewer is Fourier transformed, and the resulting power spectrum is averaged over all the lines of sight, resulting in the predicted 1D flux power spectrum for a given model.In order to compare the simulated models to the data we construct an emulator that interpolates the 1D flux power spectrum between the models, allowing us to explore the parameter space spanned by the simulated models.The emulator is based on linear interpolation [22].Since the grid of models fills the parameter space in a uniform fashion the interpolation error is small as demonstrated on the sub-set of the models in [7].Neglecting quadratic terms in the interpolation leads to at most 1.2% correction at high (, ) in the flux power spectrum, well below the statistical uncertainty on the data.
Fig. 1 shows the 1D flux power spectra when varying the parameters that govern the three main scales of suppression of the flux power.In the left panel, increasing the temperature of the gas at mean density increases the suppression on small scales (high-), while inducing a small increase in power at large scales (low-k).The latter is due to keeping  eff fixed, while the former can be understood in the context of thermal broadening of the lines -the transmission profile of the Lyman- scattering is determined by the random motion of the gas at a finite temperature.The higher the temperature the larger the velocity dispersion of the thermal motion, leading to more extended profiles that erase small-scale structure.
A related effect, shown in the central panel of Fig. 1, is the effect of pressure smoothing.As the gas is heated during reionization, it hydrodynamically responds to the resulting increase in its temperature and pressure by expanding [6,47].The more heat injected, the more the gas expands, erasing more small-scale structure.In our models we parametrised this effect with the cumulative heat injected per proton by FIG.1: The relative ratios of the 1D flux power spectra of the simulated models relative to a reference simulation run when varying one parameter at a time:  0 (left),  0 (center) and  WDM (right).In each panel all the other parameters are kept fixed.The scale-dependence of the flux power spectrum changes in response to changes in the input parameters.The left panel shows the effect of thermal broadening on the absorption features of the forest.The center and right panels show the emergence of a small-scale enhancement of the relative flux power spectrum in simulations with varying reionization history and WDM free-streaming.The cumulative heat injection values (center panel) correspond to reionization ending at 5.25, 6.0, 6.75 and 7.5 (top to bottom) for the ionizing UV background model of [39].a given redshift ( 0 ) [41].The exact redshift range of  0 parameters is the same as in [25].
The small-scale structure in the gas could further be affected by the free-streaming of non-standard dark matter models such as WDM.The lighter the mass of a thermal relic WDM particle, the longer the particles will free-stream, from when they decouple from the thermal bath until they become nonrelativistic.The longer this time the larger the scales affected, and the stronger the suppression in the small-scale power.This is shown in the right-hand side panel of Fig. 1, where the proxy for the free-streaming scale used is the inverse of the particle mass,  −1 WDM .

B. Mass resolution and boxsize
Since our models are built from the results of hydrodynamical simulations it is important to understand whether the results of these simulations are numerically converged.Two main factors limit this convergence [34][35][36] -the size of the simulated box limits the number of large-scale modes and affects the convergence on large scales; and the mass or particle resolution of the simulation limits the smallest resolved scale.
We have supplemented our simulation suite with additional calibration runs varying the size of the simulated box at fixed mass resolution.Our fiducial grid of simulations uses a box size of 20 ℎ −1 Mpc.We have applied the splicing correction, [48], using the 40 ℎ −1 Mpc box with the same resolution as L20-ref, which results in a correction of the level of ≤ 3% on the 1D flux power spectrum in the low- regime.We have further verified that at the scales of interest for the analysis of [25] data, further corrections using 80 and 160 ℎ −1 Mpc box sizes were negligible.This was not an unexpected result, and has been observed in several previous studies [21,22,25].
Of more importance for the studies of the small-scale 1D flux power spectrum, is the mass resolution of the simulations (  ).The grid of simulations was run with the fiducial gas mass resolution of 9.97 × 10 4 ℎ −1  ⊙ , corresponding to 2x1024 3 baryon and dark matter particles.These models are converged at the 5-10% at the smallest scales used in the analysis.We have complemented these models with additional simulations varying the number of simulated particles at different fixed box sizes.
Fig. 2 shows the 1D flux power spectrum decrements between different models.The poorer the mass resolution of the simulation the larger the suppression of the small-scale flux power spectrum relative to a higher resolution simulation.The mass resolution correction is larger at higher redshifts, and at smaller scales, in agreement with previous results in the literature (e.g.[35,36]).The mass resolution correction (  ) and the 1D flux power decrements shown in in Fig. 2 are connected as  −1  = 1 + Δ/.The grid of our simulations at the resolution of (20,1024) was corrected for the residual mass resolution with (10,1024) model (R-set; see Table I), corresponding to gas mass resolution of 1.25 × 10 4 ℎ −1  ⊙ .Additional correction due to higher resolution simulations (e.g.(5,1024)) adds less than a few percent to the total mass resolution correction.

IV. RESULTS
The new results on the free-streaming of warm dark matter are summarized in Fig. 3.The six panels show the 2D posteriors for three redshift bins of the data [25], with the redshift label referring to the label of the thermal parameters that are independent in each redshift bin.The bottom row shows the constraints in the thermal parameter space of (5,512)/(5,1024) (5,768)/(5,1024) The effect of mass resolution in the simulations, shown as a flux power spectrum decrement as a function of wavenumber for simulations of varying particle numbers.The mass resolution decrement of the flux power spectrum is largest for the lowest resolution simulations (bluesolid) and smallest for the highest resolution simulations (green-dashed).The decrement as a function of mass resolution decreases, indicating convergence.The fiducial grid of simulations (20,1024) used in this work is converged at the 5-10% level at  = 0.2 km −1 s.The default mass resolution correction uses models with higher mass resolution (10,1024) that are converged at 2-5% at  = 0.2 km −1 s.The shaded regions show the observational 1 uncertainty on the flux power spectrum from [21] (pink) and [25] (violet).The vertical dashed lines indicate the  max of different data sets.
gas temperature and pressure smoothing (through the proxy of cumulative injected heat), whereas the top row shows the constraints spanning the parameter space of pressure smoothing and free-streaming.The fiducial analysis choice assumes priors on the thermal history in the  0 −  0 plane as an envelope around our fiducial grid of simulations (see below).We also assume Planck [49] priors on CDM cosmology parameters ( 8 ,   ).For the default analysis we use mass resolution correction using a fiducial thermal history with CDM cosmology (R-set; see Table I).We also do not include any correction due to inhomogenous reionization.These assumptions were chosen as our reference analysis in order to facilitate better comparison with previous analysis.The additional work presented in this paper which includes patchy correction, thermal dependence of the mass resolution correction (  ( 0 )) and observationaly informed thermal priors ( 0 prior) is also shown in Fig. 3 (orange contours) and discussed in more detail in subsections below.
Our measurements of the thermal state of the gas largely agree with independent measurements in the literature [15,24,25] within 1-2.The data prefers a slightly colder temperature at mean density of  0 = 8, 000 (7,500; 7,800) K at redshift  = 4.2 (4.6; 5.0) as a best-fit (see Table II).At the same time the cumulative heat injected is constrainted to be  0 = 7.2 (6.8; 5.2) eV/m p between redshifts 4.2 and 12.0 (4.6 and 12.0; 6.0 and 13.0).The result is consistent with the analysis of [25].However, models with slightly hotter temperature consistent with [44,50] and less pressure smoothing [51] are within the 2 contours.
The measurements of effective optical depth,  eff , from the flux power spectrum are also consistent with direct observations of the transmitted flux [46,52].The derived measurement of the mean transmitted flux at  = 5.0 is ⟨ Ly ⟩ = 0.1764 +0.0177 −0.0171 .This is consistent at 1 − 2 with the measurement of [52] of ⟨ Ly ⟩ = 0.1581 +0.0082 −0.0089 which used almost four times the number of sightlines compared to [25].
This analysis also varies the power-law of the temperaturedensity relation () as a free parameter in each redshift bin.The data, however, are not constraining this parameter well and its posterior is dominated by the prior.This result was also found in previous studies of high redshift Lyman- forest data (e.g.[22,25]).
The panels at the bottom of Fig. 3  The 2D posterior distributions of the best-fit analysis for the 1D flux power spectrum measurements of [25] using UVES/HIRES quasar spectra.The blue contours show the default analysis, and the orange contours show the analysis that captures our best knowledge of the thermal history (Sec.IV D), inhomogenous reionization (Sec.IV G) and mass resolution corrections (Sec.IV F).The three columns correspond to the three different redshifts of  = 4.2, 4.6 and 5.0 (from left to right).The bottom row shows the contours in the thermal parameter space, with the violet band shows the envelope around the physically motivated simulations, shown as gray points (squares, circles and triangles).This band serves as a prior in the thermal parameter space in the default model.The coloured points correspond to the measurements in the literature from the same data set: from [25] (purple); [15] (in green); and [24] (in red).The top panels show the 1 and 2  contours in the parameter space of free-streaming and pressure smoothing (heat injection).The vertical dotted line and surrounding gray band indicate the best-fit measurements of [25].The intersecting gray dashed and dot-dashed lines show typical degeneracy axes between the parameters.The cutoff at small  0 and small  0 values comes from the implicit prior imposed by the extent of the grid of models (see text for details).
combinations of hydro-dynamical simulations as gray markers (L20; see Table I).The thermal and reionization histories were chosen to bracket the observed flux distribution of high redshift quasar spectra [53], as well as the electron optical depth inferred from the Cosmic Microwave Background (CMB) as reported by Planck [49,54].Through the post-processing technique described in Sec.III the likelihood is able to sample the full span of the  0 − 0 parameter space on a (non-uniform) grid, however in order to avoid unphysical parts of the  0 −  0 parameter space we consider a prior defined as an envelope around the simulations' results (indicated in Fig. 3 by the gray band).

A. Degeneracy axes
The simulated models exhibit a tight correlation between the IGM temperature at a given time, and the integrated injected heat up until that time.The positive correlation between the thermal parameters (dot-dashed lines in bottom panel of Fig. 3) can be well described by  0 ∝  1.7  0 , and the parameter anticorrelation (dashed lines in bottom panel of Fig. 3) is well described by  0 ∝ − 0 .The anti-correlation also indicates the degeneracy axis we would expect from the measurement of the 1D flux power spectrum -at a given observed redshift the flux power suppression can be explained by either higher injected heat, and therefore a larger pressure smoothing scale; or it can be explained by a higher temperature and therefore larger thermal broadening.The Lyman- forest provides constraints in the direction perpendicular to that degeneracy axis, along the direction of the positive correlation between  0 and  0 .
Similarly to the degeneracy between the thermal broadening and pressure scales, we observe a correlation between the pressure smoothing and the free-streaming scales, as shown in the top panels of Fig. 3.The vertical black dashed line and gray shaded region, indicate measurements of the cumulative injected heat,  0 , in a CDM analysis of [25].A negative correlation (dot-dashed lines in top panel of Fig. 3) between the two smoothing scales can be understood as a consequence of both physical mechanisms reducing the small-scale power of the 3D density field.The pressure smoothing scale is typically described as an exponential suppression of the power,  g ∼  m exp − 2  2  [6,55], at a typical filtering scale   .The larger the heat injected into the gas, the more the gas expands due to the pressure, resulting in a positive correlation between the filtering scale and the injected heat  0 .Such a relation was explored in the simulations of [25], where it was found that   ∼ 20 ckpc × √︁ 1 + 2 0 /(1/  ).Equivalently, the warm dark matter transfer function can be approximated by  WDM ∼ 1 + () 2 −5/ , with  = 1.12 and the typical free-streaming scale,  = 70 ckpc × ( WDM /(1keV)) −1.11 , given by [21].
The total power suppression in the 3D field on small scales, is a product of both the pressure smoothing and free-streaming transfer functions.Expanding the product in powers of , the lowest scale dependent coefficient scales as ∝  2 , with the amplitude of  2 2 =  2  +10 2 , where we have approximated  ∼ 1.The anti-correlation between pressure smoothing and the freestreaming that we observe in the data are driven by being sensitive to the total shape of the suppression, thus  2 2 = constant.This can be interpreted as the smoothing being driven by either higher pressure smoothing or larger free-streaming length, and is shown as dot-dashed gray lines in Fig. 3.
Whereas the shape of the power spectrum suppression is poorly constrained by the current data, the data is able to constrain the scale where the suppression occurs -shown as dashed lines in Fig. 3.We estimate this positive correlation between the parameters (dashed lines in top panel of Fig. 3) by matching the scale where the pressure smoothing and freestreaming transfer functions equal one half (e.g. WDM ( 1/2 = 1/2, or  WDM ( 1/2 ) = 1/4).The two scales are given by

B. Best-fit model
Fig. 4 shows 1D flux power spectrum corresponding to the best-fit model over-plotted on the data.The model fits the data reasonably well, with a total  2 of 40.7 and 34 degrees of freedom (see Table II).Furthermore, the model is in excellent agreement with the data up to  ∼ 0.1 km −1 s, and describes the position and shape of the flux power spectrum suppression on small scales.To illustrate this we can compare the model that is fit to all the data points and re-evaluate the  2 for the points up to  < 0.1 km −1 s.In this case the fit gives  2 of 20.4 with 20 degrees of freedom.All three redshift bins show an increase in the measured power relative to the model at  > 0.1 km −1 s.This indicates a possible shortcoming of the model on the smallest scales, or else a signal in the data that is not part of the model.
The best-fit model excludes  WDM < 5.7 keV (95% C.L.) and provides the tightest constraints on the thermal relics WDM particle mass to date (see Table II).The model constraints exclude masses of 3.73 keV and 3.18 keV at 3 and 5, effectively excluding the much discussed 3 keV WDM model (e.g.[56,57]) at more than a 5 confidence level.

C. Improvement on WDM constraints
In Fig. 5 we compare the results of this work to the existing constraints on the WDM mass from the literature.The main result of this work results in a WDM mass bound that excludes WDM masses below  WDM < 5.7 keV at 2 confidence level.It provides improved constraints on WDM mass coming from the matter power spectrum suppression in the Lyman- forest analyses [18,22] as well as non-Lyman- constraints such as the flux ratios of strong lensed systems [58] and stellar streams in the Milky-Way [59].
The new constraint is stronger than the studies using low- [60] or a combination of low- and high- [21,61] Lyman- data, especially when comparing to similar choices in the thermal history priors.The new data is in fact producing a strong enough constraint that, even when relaxing the prior on the astrophysical parameters, the WDM mass bound remains stronger or competitive with past studies that used strong priors on the e.g.temperature evolution with redshift [18,22].
In the regime of the high redshift Lyman- forest analysis, the current analysis tightens the constraint on the WDM particle mass compared to previous analyses.In comparison to older analyses using HIRES/MIKE data [21,22] we see an improvement in the number of the observed quasar spectra by almost a factor of 2 [25].For a factor of 2 improvement in the number of sightlines, we would expect the uncertainty on the flux power spectrum to improve by ∼ 1/ √ 2, at least in the limit that statistical uncertainty dominates the error budget.From Fig. 2 we see that this is indeed the case in the high- regime of the data that is most sensitive to the free-streaming effect of WDM.In fact, in linear theory the sensitivity to the the WDM mass scales as  L,wdm / L,CDM ∼  20  wdm  −20 in the limit of  ≫ 14 ( wdm /1 keV) Mpc −1 .
However, the non-linear mapping between the linear density field and the non-linear flux field is complex.For a range of redshifts (4.2 <  < 5.0) and scales (0.01 < /[km −1 s] < 0.2) considered, the flux power spectrum suppression in our simulations (L20-ref) approximately scales as wdm 4 with line-of-sight wavenumber  in units of [km −1 s] and  wdm in units of [keV].For higher WDM masses, the flux power suppression due to WDM increases rapidly with redshift, but only linearly with the WDM particle mass.The scaling changes at around the WDM mass of 3 keV, when the scaling with mass becomes stronger, and the redshift dependence slightly FIG.4: The best-fit model compared to the data [25].The three panels correspond to three redshift bins, with the bottom panels showing the residuals of the data over the model.The total  2 is 40.7 with 34 degrees of freedom.The data were compared to a simulation based model that varies three thermal parameters and mean transmission independently in each redshift bin ( eff ,  0 , ,  0 ) and three cosmology parameters ( 8 ,  ) and ( WDM ) (see text for details).
weaker.The wavenumber dependence is roughly the same, and not dominant in this range of scales.The scaling is only approximately valid at a fixed thermal history (L20-ref), and the transition between the two scales, as well as the power-law dependencies, can vary across thermal histories.However, at the higher WDM mass limit the sensitivity to the particle mass increases with redshift relatively quickly in the redshift range of the data, improving the linear sensitivity to the mass.As a result the constraining power on WDM mass improves by more than a factor of ∼ 1/ √ 2.

D. Thermal history
The signal at high- in the 1D Lyman- forest flux power spectrum depends on the thermal parameters.The fiducial priors on the thermal history limit the possible combinations in the  0 −  0 plane to the volume of physically motivated simulation results [6].
A different approach would be to instead apply a prior based on independent measurements of the thermal history, for example using the measurements of  0 () from different datasets and different statistical methods.To achieve that we use improved and precise measurements of  0 () that span the redshift range  < 3.8 [44] and  > 5.2 [50].In order to predict viable models in the redshift range covered by our data (4.2 <  < 5.0) we rescale and shift the photo-heating and photo-ionization rates of our fiducial thermal history model [39].A similar methodology was employed in [62] in order to fit the flux power spectra measurements over a range of redshifts.Fig. 6 shows the two models from the literature, as well as our new fit calibrated directly against  0 () measurements.The best-fit prefers slightly higher temperatures in the redshift range 4.2 <  < 5.0 than the measurements of [25] using the flux power spectra data used in this work.In order to construct informative priors on  0 () parameters in our model, we use the best-fit values of the new thermal history as central points of a Gaussian distribution at each redshift, with a fixed standard deviation of 1,000 K.While the standard deviation is somewhat arbitrarily chosen, it roughly matches the typical uncertainty found in more recent works [44,50].Even if a realistic uncertainty is slightly lower at lower redshift, and slightly higher at higher redshift, due to the decreasing numbers of quasar spectra available, this should not impact the main conclusion of this exercise, which is to highlight the effect of independent, observationally informed thermal priors.
The results of the analyses using different thermal prior choices are shown in Fig. 7.The fiducial model (green contours) uses thermal priors in the form of an envelope around the simulations (gray band).Replacing these priors by simpler priors on  0 at each redshift results in slightly more elongated constraints on the  0 −  0 plane (orange contours), with the posterior expanding along the degeneracy direction.The mean and best-fit of the posterior however, change only marginally compared to the standard analysis.Even though the posterior in the  0 direction expands slightly, the posterior on  −1 WDM remains roughly the same at the 2 level, resulting in a very similar constraint on the WDM mass of  WDM > 5.85 keV (2) compared to the default analysis choice of thermal priors.The main difference is that the thermal priors have now been informed by the measured  0 evolution with redshift from other observational studies, rather than by our suite of hydro-dynamical simulations.The model with the  0 prior (see Table II) excludes low WDM masses of 3.75 keV and 3.21 keV at 3 and 5, respectively.Fig. 7 also illustrates the effect of not imposing any thermal priors on the analysis.This resulting posterior (blue contours) is shifted to lower IGM temperatures, and relatively higher values of the cumulative injected heat.This part of the thermal parameter space is unphysical as we expect  0 and  0 to be correlated for physically reasonable IGM heating scenarios.Rather counter-intuitively the constraints on the WDM mass become much stronger if we impose no thermal prior.This TABLE II: List of different models used in the analysis with their corresponding best-fit warm dark matter constraints.The table shows the name of the model and the resulting 2 lower bound on the WDM particle mass ( WDM ), along with best-fit values of the thermal parameters at  = 4.6 for the effective optical depth ( eff ), gas temperature at mean density ( 0 ), the slope of the temperature-density relation () and the cumulative injected heat ( 0 ).For the model where extra instrumental noise in the data was modelled with a free parameter, the best-fit value is shown as well ( noise ).The last column displays the best-fit  2 value and the degrees of freedom.

Name
result can be understood by considering degeneracy axis along which the posterior distributions move.Colder temperatures and enhanced amount of pressure smoothing leave much less room for a WDM model to accommodate the amount of flux power spectrum suppression in the data.Therefore, models with strong WDM suppression are excluded more strongly.

E. Effect of small-scale peculiar velocity
The enhancement of the small-scale power in the models is associated with the enhancement of the small-scale structure in the peculiar velocities.Fig. 8 (left) shows the relative effect of the peculiar velocity field on the flux power spectrum ratios.The models of early ( rei = 7.5) and reference ( rei = 6.0) reionization (blue solid line) show a relative enhancement of power compared to the ratio of the two flux power spectra when the effects of peculiar velocities are not included in the calculation of the optical depth.This effect of setting  pec = 0 has also been seen in [24].Fig. 8 further illustrates that the amplitude of this feature at  > 0.1 km −1 s is sensitive to the amplitude of the peculiar velocity field changing with the amount of pressure smoothing.Furthermore, the feature in the flux power can be associated with the emerging feature in the 1D power spectra of the peculiar velocities (Fig. 8; right), with the strength of the feature exhibiting a positive correlation between its amplitude and the cumulative injected heat.That the feature is stronger for later ending reionization, and weaker for earlier reionization, suggests that this behaviour is due to the hydrodynamic response of the gas to the photo-heating.
In terms of the constraints on the WDM particle mass, this suggests that a certain caution has to be exercised when pushing the models to  > 0.1 km −1 s.While the peculiar velocity feature might be related and correlated with the existing thermal parameters, it is not a-priori obvious that this new scale in the model is properly covered within the range of exisiting simulations, and therefore not properly marginalized over.

F. Thermal dependence of the mass resolution
The results of Sec.IV E show that a small-scale peculiar velocity structure can modify the amount of small-scale flux power.However, this is also the regime where the mass resolution (  ) of our simulations has the biggest effect.
The mass resolution correction of the simulations should depend on the thermal history in this high- regime of the model, i.e.   =   ( 0 ).This is perhaps not surprising -the mass resolution corrections essentially describe how much smallscale structure is missing in the (power spectrum) statistics as a result of not resolving the structure at very small scales, and its non-linear coupling to larger scales.If the field in configuration space is smoothed out due to physical effects -such as higher pressure smoothing or larger free-streaming scale -the amount of missing small-scale flux power will also be smaller.To estimate this effect we repeated the resolution correction exercise for different models in our suite of simulations (R10-; see Table I).The results, in Fig. 9, show that indeed the mass resolution correction exhibits a strong dependence on the thermal history at  > 0.1 km −1 s.In particular, late reionization models with less pressure smoothing (or lower cumulative injected heat) can show up to 5% larger mass resolution corrections compared to the fiducial correction used in the analysis.This trend is more prominent at higher redshifts, and less important at  ≤ 4.2.On the other hand, models with larger pressure smoothing scales require consistently smaller resolution corrections at small-scales, by up to 2%.
Similarly to the effect of the thermal history, the smoothing of the density field due to free-streaming also decreases the required mass resolution correction.As shown in Fig. 9 a 2 keV WDM model on average requires a 5% lower mass resultion correction at  ∼ 0.1 km −1 s at  = 5.0.This effect is reduced at lower redshifts.
These results imply that applying a mass resolution corrections that depends on the thermal history widens the range of  1D at  > 0.1 km −1 s for models within a given section of the parameter space, ultimately resulting in higher sensitivity to thermal parameters and lower sensitivity to free-streaming.On ) that represents corrections due to patchy reionization, thermal dependence on the mass resolution and independent  0 prior, and  max < 0.1 km −1 s data scale cut analysis).The resulting lower bounds on the WDM particle mass are stronger or comparable to those previously published in the literature, including studies that combined low-and high- Lyman- forest data to increase the redshift lever arm (middle panels).The top panel shows a compilation of results from non-Lyman- studies.
the other hand, the mass resolution correction that depends on  WDM shows stronger sensitivity of the  1D at  > 0.1 km −1 s, which leads to stronger constraints on the lower bound WDM mass.Current bounds on the WDM mass lie in the range of ∼ 4 − 6 keV, however, and the effect of mass resolution dependence on WDM free-streaming is severely reduced.Thus most of the effect of the mass resolution that depends on thermal history and WDM mass comes from the thermal history dependence.
The results of applying free-streaming and thermal history dependent mass resolution correction are shown in Fig. 10.Compared to the analysis without any thermal priors shown in Fig. 7, the new mass resolution correction does not shift the posterior in the thermal parameters, suggesting that the effect of peculiar velocities is not completely explained by accounting for the thermal dependence in the mass resolution.On FIG. 6: The thermal evolution of  0 () for various models, compared to the independent measurements of [44] (high-), [50] (low-) and temperature measurements of [25].The models shown are that of [39] (in black; our reference simulation run), [62] (in blue) and a new fit to [44,50] data (in red).The new fit was obtained by rescaling the model of [39], and serves as a prior in the analysis shown in Fig. 7, with prior values shown as red circles and error bars at  = 4.2, 4.6, 5.0 with values of 9155.5, 8986.5 and 9286.5 K respectively.The uncertainty propagated in the prior is 1,000 K at each of the redshifts.the other hand, the WDM constraints are weakened, roughly to the same level as when a sensible thermal prior is applied to the model, resulting in  WDM > 4.44 keV at 2.Low WDM masses of 3.19 keV and 2.78 keV are excluded at 3 and 5 respectively.Applying both the physical thermal prior ( 0 prior) and the new mass resolution correction leads to  WDM > 4.24 keV at 2.

G. Patchy reionization
The original Sherwood suite of simulations used in this study evolves the reionization homogenously throughout the simulated volume.In reality the Universe reionizes in a more complex, inhomogenous manner, where local ionized bubbles first appear around the sources of ionizing photons [6].Observations of the Lyman- forest at higher redshifts can thus still be affected by relic fluctuations of the reionization persisting for a time after most of the Universe has been reionized.This topic has been a focus of several studies over the years [7,31,[63][64][65][66].The main effect of the patchy nature of reionization on the 1D flux power spectrum of the Lyman- forest has been found to be an enhancement of power on large scales, that traces the fluctuations in the temperature and ionized fraction of hydrogen gas.The conclusions of recent works [7,31] suggest that the enhancement of power appears at larger scales ( < 5 × 10 −3 km −1 s) than those observed in [25], i.e., the flux power spectrum measurements used in this study.
While the large scale effect of ionization fluctuations and their effect on the Lyman- forest has seen a certain agreement between different methods and simulations, the same is not true for the effect of inhomogenous reionization on small scales.Spatial fluctuations of the photo-ionization rate result in spatial In the thermal parameter space (left), the default analysis (blue contours) uses thermal priors that envelop the simulations (the envelope is shown as a violet shaded area).A similar result can be obtained by instead imposing a  0 () prior (orange contours) using independent temperature measurements [44,50].The warm dark matter particle mass constraints get slightly stronger if a temperature prior is used instead of the envelope prior in the  0 −  0 plane.As a reference we also show an analysis without imposing any thermal priors (green contours).The vertical shaded band on the left panel indicates measurement of  0 from [25].
fluctuations of the temperature density relation.Regions of the IGM that are ionized later heat up later as well, while regions that reionized and heated up earlier had time to cool down, due primarily to the expansion of the Universe and inverse Compton scattering [63,64,67].As a result regions ionizing later would exhibit stronger suppression of the flux power spectrum due to thermal Doppler broadening, than IGM regions that have ionized long ago.As pointed out by [6,65] a competing effect to the thermal fluctuations, is that the IGM regions that ionized earlier had more time to hydrodynamically respond to the injected heat, resulting in a larger pressure smoothing scale.More pressure smoothing also reduces the small-scale power.It has been suggested that these two effects might largely cancel each other out, leaving small-scale power unchanged compared to homogenous reionization models.
In this study we make use of the tabulated correction to the 1D Lyman- flux power spectrum from [7], that is based on the Sherwood Relics simulation suite [6].The effect of a patchy reionization correction in that study results in a ∼ 10% suppression of Lyman- flux power at  > 0.1 km −1 s.The amplitude and shape of the suppression are largely independent of the thermal history models used in that study.Aside from the temperature fluctuations, [7] found that the dominant effect of the small-scale suppression was due to the effect of the peculiar velocity field.
The results of this analysis are presented in Fig. 11.Since patchy reionization suppresses the small-scale power, one could expect that lower values of WDM masses might be even further excluded by the data.However the small-scale suppression induced by inhomogenous reionization affects all models equally, including the models with different thermal and pressure broadening.The main effect on the Lyman- data analysis is to move the peak of the  0 posterior to lower values.The reason for this is as follows: the higher the  0 value the stronger the suppression due to thermal broadening.The flux power spectrum models for low  0 values that on their own do not exhibit enough suppression to explain the data, now achieve enough suppression through patchy reionization correction.Therefore the first conclusion is that lower  0 models that were excluded before now fit the data, and the posterior of the  0 parameter expands towards lower  0 values.On the other hand, the models with high  0 values would now show too strong of a suppression, and models that fit the data without the correction due to patchy reionization are now in tension with the data.The posterior of  0 therefore shrinks for high  0 values.Due to the prior in the  0 −  0 plane, shifting the  0 posterior to lower values also shifts the  0 posterior to lower values at each redshift, resulting in data preferring less pressure smoothing.Since both the thermal and pressure smoothing effects are reduced, the posterior of the WDM mass expands to compensate for the fact that somewhat lower WDM mass models are now no longer in tension with the data.
While the specific result shown in Fig. 11 depends on the choice of thermal priors, the main conclusion would remain the same even in the light of less stringent priors.As the  0 posterior systematically shifts to lower values, the  0 −  0 anti-correlation direction is preserved as it depends on the fact that both parameters increase the small-scale suppression.The ratio of 1D flux power spectra of very early ( rei,end = 7.5) and reference ( rei,end = 6.0) reionization models (solid blue).The dashed lines show the effect of replacing the peculiar velocity fields of both simulations simultaneously.The dashed blue line shows the effect where no peculiar velocities are included.The coloured lines show the effect of using peculiar velocity fields corresponding to different thermal histories -in particular different heat injection values.All models show a relative increase of small-scale power ratio compared to the model with no peculiar velocities.The strength of this relative increase correlates with heat injected during reionization, with  pec coming from a late reionization run (low heat injection; red dashed line) giving the strongest signal.Right: The relative increase in the small-scale structure of the 1D flux power spectrum is related to small-scale structure in the peculiar velocity field.A feature is present in the power spectrum of the peculiar velocity gradient ( = ∇ pec ), where the peak shifts from  ∼ 0.15 km −1 s for early reionization models (with higher heat injection) to  ∼ 0.30 km −1 s for late reionization models (with lower heat injection).Models with cumulative injected heat of  0 = 8.14, 11.6, 7.11, 21.1 eV/m p correspond to the L20-ref, L20-very early, L20-late, L20-very early-hot models respectively.Similarly, the models from [6] with  0 = 6.04 eV/m p correspond to their late reionization model ( rei,end = 5.3).
The resulting posterior in a scenario with wider thermal priors would therefore only extend further along the  0 −  0 anticorrelation direction, but still resulting in reduced sensitivity to  WDM .
From Fig. 11 we also observe that a ∼ 10% suppression of power in all the models results in only ∼ 0.5 shift of the posterior in the  0 −  0 plane, along the positive degeneracy axis (shift between the blue and orange countours).The constraints on the WDM mass are thus slightly weaker, with  WDM > 5.10 keV at (2).However, as was highlighted in [7], the 10% small-scale suppression is mainly driven by the peculiar velocity field differences between the inhomogenous and homogenous reionization models.As we have shown in previous sections, the exact nature of the peculiar velocity structure on small-scales has implications for the WDM mass inference, and can affect both the mass resolution correction of the simulations as well as parametrisation of the thermal history on the smallest scales probed by the Lyman- forest (∼ 50 − 100 ckpc/h).While the nature of the peculiar velocity field structure requires further study, it is reassuring that the effect on the WDM constraints is small (∼ 10%).
Combining the corrections due to inhomogenous reionization and the thermal history dependence of the mass resolution (  ( 0 )), with the thermal priors coming from independent  0 () observations ( 0 prior) we get a combined constraint on the WDM particle mass of > 5.9 keV (95% C.L.).Even though individually both the patchy reionization and the   ( 0 ) correction reduce the WDM constraining power, together with the  0 prior they are pushed in the parameter space of higher  0 values and a lower pressure smoothing scale (or late reionization), which leaves little room for additional suppression due to WDM free-streaming.While these constraints are the strongest presented in this paper, they rely on our first attempt at both a patchy reionization and   ( 0 ) corrections.With their impact on the WDM particle mass, these results provide additional incentive to improve on the modelling of the small-scale thermal history in the inhomogenous reionization models.

H. Instrumental effects
Several instrumental and observational effects can potentially systematically alter the small-scale flux power spectrum: mis-estimation of the observed flux noise, contamination due to metal lines or the instrument resolution.Of the three, the instrument resolution has been one of the more studied effects, as it has a large impact on large Lyman- Viel+13 σ P FIG.9: The effect of mass in the simulations, shown as a flux power spectrum decrement as a function of wavenumber for different models with varying the thermal history and WDM particle mass.The more heat is injected into the IGM, the more the gas is smoothed due to pressure effects.Larger pressure smoothing results in less in the flux power spectrum at small scales, and depends less on the mass resolution.The same happens if the matter density is smoother due to the presence of free-streaming, which results in a smaller required mass resolution correction.
surveys that observe spectra at lower spectral resolution [19,61,68,69].For a typical line-spread function shape the correction of the instrument resolution on the flux power spectrum is well described by a Gaussian kernel   →   / 2  =   exp  2  2  , with the Gaussian width of the resolution   = FWHM  /(2 √ 2 ln 2) given as function of the FWHM resolution element (FWHM  = /) or resolving power ().For the scales of  ≳  −1  the correction due to resolution becomes ∼ 1, dominating the total signal in the instrument.While of significant concern for lower resolution instruments such as X-Shooter ( ∼ 8, 000), the resolving power of Keck/HIRES and VLT/UVES is high enough ( ∼ 50, 000 − 80, 000) to not play a major role in flux power spectrum measurements for scale cuts  < 0.1 km −1 s [21,25,70].Indeed assuming the fiducial value of the resolution FWHM  = 6 km/s ( ∼ 50, 000) of the data [25], this translates into   = 2.55 km/s.In order to improve the fit to the data at small scales, the resolution width would have to be overestimated by 30-40%.Typically the resolution is estimated to ∼ 10%, and a factor three to four seems unlikely to be an explanation for excess small scale power.
Similarly, the contribution from contaminating metal absorption in the Lyman- forest has been studied in both low-resolution [61,71] and high resolution data [19,25,69].The contamination can be split into two main groups: (a) metals that have a rest-frame wavelength transition close to the Lyman- line (e.g.SiIII) [48,61]; and (b) metals situated at a lower redshift and associated with either IGM or circumgalactic medium (CGM) contributions [48,72].The first group (a), imprints an oscillatory feature on the flux power spectrum.The frequency of this feature increases with scale, and is typically averaged over many periods in measurements of the high- flux power spectra, leaving distinct features observable only at low k,  < 0.01 km −1 s.The second group (b) is important at all redshifts and scales, and due to large differences in redshift can be subtracted statistically by measuring the flux power spectrum on the red side of the Lyman- emission line.The metal flux power spectra are typically dominated by CIV and SiIV doublets [73], and are smaller than the Lyman- flux power spectrum by one or two orders of magnitude.The amplitude of the metal power spectrum would need to be larger by a factor of 5-10, in order to have an impact on Lyman- flux power spectrum parameter estimation.While some studies suggest that the redside metal power spectrum captures only about half of the contaminated metal content in the Lyman- forest [74], it is difficult to argue on observational grounds that the small- 10: Effect using a mass resolution correction that depends on the IGM thermal history1 (  ( 0 )).The two panels show 2D posterior distributions for redshift  = 4.2 in the plane of temperature and heat injection (left) and warm dark matter mass and heat injection (right).In the thermal parameter space (left), the mass resolution correction adds more power for models with less pressure smoothing, which makes such models easier to fit the data.As a result the posterior still lies along the  0 −  0 degeneracy axis, but the amplitude of this axis moves to lower  0 values.Similarly, the degeneracy between free-streaming and pressure smoothing (right) opens up along the degeneracy axis, because the models with low  0 require larger mass resolution corrections that increase the power, even for lower warm dark matter masses.
Default -u 0 − T 0 prior patchy + u 0 − T 0 prior patchy + R s (u 0 ) + T 0 prior FIG.11: Effect of including a correction due to inhomogenous reionization.The two panels show 2D posterior distributions for redshift  = 4.2 in the plane of temperature and heat injection (left) and warm dark matter mass and heat injection (right).The effect on the analysis mostly comes from the small scale suppression of power due to the peculiar velocity structure of the gas, rather than the large scale enhancement of power due to photo-ionization and temperature fluctuations.Different thermal history models have an almost identical suppression of power due to inhomogenous reionization compared to the equivalent homogenous reionization model.The net effect is a systematic shift along the (positive)  0 −  0 degeneracy axis in the thermal parameter space (left), in the direction of lower temperature and lower pressure smoothing.
Similarly, the degeneracy between free-streaming and pressure smoothing (right) opens up along the degeneracy axis, because the models with lower thermal and pressure smoothing leave more freedom for WDM suppression to accommodate the data.
scale enhancement of the small-scale power spectrum due to metals could significantly affect our analysis.The flux noise estimation has received somewhat less attention as a source of systematic uncertainty in high resolution and high signal-to-noise quasar spectra.It plays a crucial role in the low signal-to-noise spectra of large surveys (e.g.[61]).The flux power spectrum of the noise is 2-3 orders of magnitude lower than the Lyman- forest flux power spectrum signal in medium (/ > 20) and high (/ > 40) quality data (e.g.[19]).The signal decays exponentially towards higher wave numbers, suggesting that the noise flux power quickly becomes a bigger contribution to the signal as the analysis is pushed towards higher k.
The analysis of [25] estimated the noise power on a per quasar sightline basis in 20 ℎ −1 Mpc sections.This was achieved by measuring the raw or total flux power spectrum in each section of the Lyman- forest and estimating the asymptote level at high k.This method relies on the assumption that the noise power is white -an assumption that is largely validated in other studies (e.g.[19,60,61]) -and that it dominates at high wavenumbers.The method contends with several challenges, from a noisy estimation of the measured noise power spectrum in individual 20 cMpc/h sections, to the fact that the asymptote levels at high k will also include the metal contamination, as well as the very signal that one wishes to measure.
A careful analysis of uncertainty propagation is warranted, especially for a signal dominated by the highest wavenumbers such as is the case in this WDM study.Fig. 12 shows the probability distribution of the noise power estimates from the individual 20 cMpc/h sightline sections, in each of the three redshift bins.The vertical black lines indicate the effective average  noise assumed in the analysis of [25].This is simply a result of averaging the difference of raw and noise power per section over all the sightlines in a given redshift bin.As the average was not weighted by the signal-to-noise, the estimated average  noise is simply the mean of the distribution.One immediate conclusion of Fig. 12 is that the distribution of the noise power is not Gaussian around the mean, with the bulk of the distribution typically peaking at lower than average  noise values.The distributions at each redshift are also relatively broad.The mean of the distributions, ⟨ noise ⟩ are 0.08, 0.1 and 0.12 for  = 4.2, 4.6 and 5.0 respectively.This corresponds to roughly 5% of the total power at the highest wavenumber in the data.We approximate the  noise distributions with a log-normal model with the min/max range of the measured values (solid black lines in each of the redshift bins in Fig. 12).
The noise power spectrum distribution in Fig. 12 is dominated primarily by the distribution of signal-to-noise in the data, as well as the mean transmission variations among the 20ℎ −1 Mpc segments of the Lyman- forest.This has been verified in mock data with pathlength and redshift ranges of observed quasars reported in [25].The methodology of estimating the noise power asymptote within each 20ℎ −1 Mpc segment is noisy it is therefore unlikely that the uncertainty on the noise power estimation exceeds the width of the distributions in Fig. 12.As such we use the distribution of the noise power as a conservative prior.
In order to asses the potential impact of noise mis-estimation in the data, we add a constant term  noise () to the model of the 1D flux power spectrum.This term is scale independent, but is modelled separately for each redshift bin.Since the mean of the noise flux power spectra distributions were already subtracted from the data, this constant term measures the deviation of the noise flux power from this mean value.In the data analysis step, the noise is subtracted from the raw power before the resolution correction of the instrument is deconvolved.The theoretical 1D flux power spectrum model is modified as follows: where Ly ,1 is the Lyman- flux power spectrum as given by the emulator, ⟨ noise ⟩() are the means of the  noise distribution in each of the redshift bins, and  2 () is the instrumental correction due to resolution and pixel size (Following [25] we use a pixel size of 2.5 km/s and a FWHM resolution of 6.0 km/s using top-hat and Gaussian kernels for the two corrections, respectively.).Fig. 13 shows the results of the analysis where three  noise parameters were added to the theoretical model (one for each redshift bin), and the parameters' priors were assumed to be given by the approximate log-normal model of the  noise distribution.The resulting WDM mass constraint is slightly weakened, and the lower WDM mass bound is  WDM > 3.91 keV.The thermal constraints are significantly degraded along the  0 −  0 degeneracy axis.This is because at every individual redshift, the noise parameter  noise strongly correlates (anticorrelates) with the IGM temperature (cumulative heat injection), whereas the correlation with the WDM mass parameter is weaker.The marginalized mean of the posteriors (and their best-fit) values of  noise are 0.74 +0.49−0.49(0.24), 1.12 +0.49−0.29 (1.73), 0.87 +0.56  −0.15 (1.38).The data prefers values of  noise > 0, implying noise was underestimated.The best-fit values also show a slight increase with redshift, suggesting that the effect was larger for higher-redshift quasar spectra.The typical values of  noise are of the order of unity, suggesting that the noise subtraction of the data performed by [25] may be incomplete.The sensitivity of thermal parameters to this relatively small noise contribution is quite large, possibly implying that measurements of the IGM temperature and reionization are very sensitive to noise subtraction in the data.There is also sensitivity of the WDM mass constraints to this effect, although somewhat reduced compared to the thermal parameters.
The sampling of the  noise distribution is relatively sparse, measured in 20 ℎ −1 Mpc sections in only 15 quasar sightlines in each redhift bin.This marks a significant improvement on previous measurements, but is nonetheless sensitive to sample variance.To understand the sensitivity of the conclusions of this analysis step we modify the prior choice to be a Gaussian distribution, with the mean and the standard deviation estimated from the average and variance of the samples in each redshift bin.Since the posteriors of  noise for the highest two redshifts are dominated by the upper limit on the prior range, we further allow this Gaussian prior to have no min/max limits other than the physical requirement that noise is larger or FIG.12: The probability distribution of the measured noise power spectra from [25] in each of the redshift bins.The measured distribution is reasonably well approximated by a log-normal distribution at each redshift, shown as a solid black line.For comparison we also show a normal distribution with mean and variance computed from the first two moments of the measured distribution.The distributions are fairly broad, however the inclusion or removal of tails beyond the range of measured  noise has a negligible effect.As the noise affects the amount of small-scale power it effectively removes the information from those scales, which leads to poorer constraints on thermal parameter as well as warm dark matter mass.The noise distribution is marginalized over the measured distribution from [25].However, the results remain largely unchanged if the shape of the distribution was changed to a normal distribution with a standard deviation of 10% of the measured power.This can also be included at the level of the covariance matrix.
equal to zero  noise ≥ 0. This allows for a tail of the  noise distribution to arbitrarily large values.The mean of the posterior distributions of  noise parameters however do not move significantly.The main difference is the tail of the posteriors towards higher  values .Note that the WDM constraint changes less than 1%.
If the signal at the high- end of [25] data is indeed due to under-subtracted noise power, then the situation should improve with better and more data.If the flux uncertainties are dominated by the read-out noise component, then increasing the signal-to-noise (/) ratio of individual quasar sightlines quickly reduces the level of noise power ( noise ∝ (/) −2 ) [75].Future studies should thus suffer less from the impact of instrumental effects on the flux power spectrum measurements, allowing for exploration of data to high  max .In the thermal parameter space (left) limiting the analysis to the value of  max < 0.1 km −1 s -similar to previous analyses using Ly forest data -has a similar effect to marginalizing over noise or thermal dependence of the resolution correction.The posterior stretches in the degeneracy direction of  0 −  0 .Similarly the warm dark matter mass constraints relax as more thermal support can accommodate the data.

I. Small-scale data cuts
In order to facilite a more direct comparison between the new analysis using [25] data, and previous analyses using high redshift HIRES/MIKE data [21,22] a consistency check can be performed by limiting the new analysis to the same scale cuts ( < 0.1 km −1 s).We further compare such an analysis to [21,22] that used  max = 0.088 km −1 s, and a similar redshift range.The previous analysis extended to  = 5.4, however the flux power spectrum uncertainty at this highest redshift was considerably larger, and most of the constraining power came from the  = 4.2, 4.6, 5.0 redshift bins, which are also the ones used in this study.Furthermore, we limit the comparison to the thermal history priors where  0 is varied independently in each redshift bin.In [22] the resulting lower bound on the WDM mass was ∼ 2.1 keV (2) (MIKE/HIRES Iršič+17 + wide thermal prior (Fig. 5); [22]).
A similar test was performed in [24], where the reported value on the WDM mass bound sits at 3.6 keV.The same scale cuts were used, using the quasar spectra data of [25].However somewhat different thermal history priors were applied.Fig. 14 shows the result of small scale data cuts in this analysis.Using the same scale cuts and treatment of the thermal history with the new data improves the constraint to  WDM > 4.09 keV (2) ( max < 0.1 km −1 s -this work (Fig. 5)).The right hand side panel of Fig. 14 illustrates that imposing conservative scale cuts reduces the sensitivity to  −1 WDM and pressure smoothing scale as probed by the injected heat  0 .This reduced sensitivity to the pressure smoothing can be understood in the thermal parameter space (left panel of Fig. 14) as expanding of the posterior along the  0 −  0 degeneracy axis.The posterior in this parameter space also shifts by ∼ 0.2 along the positive  0 −  0 relation that exists in hydro-dynamical simulations.The shift, however, is small, and can at least in part be attributed to reaching the corner of the priors at low  0 and low  0 values.
In the case of conservative scale cuts ( max < 0.1 km −1 s) the best-fit improves over the fit to all the data, with the  2 /d.o.f.= 10.2/20(see Table II).The fit prefers slightly warmer temperatures of the IGM ( 0 ( = 4.6) ∼ 8, 400 K) and slightly higher value of cumulative injected heat.The posterior of the  0 parameter is very wide, however, suggesting that with conservative scale cuts the data are not sensitive to this parameter anymore.This has been observed in previous analysis using older data sets that did not extend beyond  ∼ 0.1 km −1 s.

V. DISCUSSION AND CONCLUSIONS
This study presents new constraints on the free-streaming of WDM using a simulation based likelihood and Bayesian analysis of the VLT/UVES and Keck/HIRES Lyman- forest flux power spectrum measurements of [25].The new constraints of our fiducial analysis on the mass of a thermal relic WDM particle mass,  WDM > 5.7 keV, are the strongest to date.For the fixed shape of the WDM transfer function used in this study, the bound on the WDM particle mass translates into a wavenumber scale below which the matter power spectrum cannot drop by more than 5%,  0.05 = 14.35 ℎ −1 Mpc.
Comparing to the previous high-redshift Lyman- forest data from HIRES/MIKE [21], the new data comprises of a larger number of quasar spectra in the range 4.2 <  < 5.0, and is probing small scales up to a wavenumber of  max = 0.19 km −1 s -almost a factor of two improvement.Limiting the analysis to the same  max cuts, and thermal state priors, as in previous HIRES/MIKE analyses we find the constraint to be  WDM > 4.1 keV (this work), compared to  WDM > 2.0 keV (e.g.[22]).This factor two improvement on the bound on the WDM particle mass is consistent with the expected improvement of the statistical power of the high-redshift Lyman- forest data, and WDM mass sensitivity at 0.1 km −1 s dominated by statistical uncertainty.This result is qualitatively similar to the recent analysis of [24] using the same scale cuts, and a different thermal state parametrisation.
Recent studies of [24,76] have found a preference for nonzero  −1 WDM in their default analysis, indicating a preference for a WDM cosmology.This warrants further study and rigorous tests on both the data and theory side.Our findings here lead us to suggest that one possibility is that the non-zero preference in the WDM parameter space is a result of the restricted variations in the thermal parameters.E.g., the analysis of [76] assumes a thermal history with very low cumulative injected heat, and therefore a small amount of pressure smoothing.Limiting the amount of pressure smoothing can be of interest in specific applications, but in terms of a WDM particle mass constraint a thorough marginalization over the parameter space should be more robust.The analysis of [24] follow a similar simulation setup and parameter space variation as in this work, except for two main differences: (a) the variations in the redshift of hydrogen reionization were much narrower than explored in this work, and (b) parametrisation in the  rei − 0 as opposed to  0 − 0 plane only allowed for more restricted thermal histories.
The HIRES/UVES data of [25] has also recently been used to provide constraints on a slightly different class of dark matter models -ultra-light axion dark matter [15].While the transfer functions of the two dark matter models are different enough that a direct comparison is non-trivial, the results of [15] suggest a strong bound on the thermal WDM mass, while at the same time recovering a hotter thermal history compared to both results in the literature [24,25] and the results of this study.A more thorough investigation is required but a major difference in the simulation setup of [15] is the initial condition generation with MP-Gadget [77] which uses glass initial conditions for the gas component.This introduces spurious small-scale power [78].
Additionally, [15] uses a Lyman- spectral extraction code fake_spectra [79] that uses a different optical depth assignment scheme that leads to additional enhancement of smallscale power in the 1D Lyman- forest flux power spectrum [80].These differences in the simulated flux power spectrum suggest further investigation is required for the comparison with the ultra-light axion constraints of [15].
Further improvement in the WDM constraint comes from the smallest scales,  > 0.1 km −1 s.The sensitivity to the WDM mass is increased at smaller scales, resulting in potentially stronger constraining power.However, the regime of  > 0.1 km −1 s is also more sensitive to observational and modelling systematics.In this study we have reviewed several aspects of the observational and instrumental systematics, of which the observational flux noise subtraction in the Lyman- forest flux power spectrum is potentially the most likely to affect the results.An average of a factor of two increase in the noise power (or 40% increase in the level of noise), would on its own explain the small-scale signal observed, with no additional cosmological information beyond  max > 0.1 km −1 s.While this appears not very likely, it illustrates the need for improved treatment of the noise power at smallest scales in future Lyman- forest data analyses.
The thermal history priors have previously been identified as the dominant source of modelling systematics in the Lyman- forest flux power spectrum.In this study we revisited this, by exploring thermal priors motivated by different assumptions: a prior in the plane of IGM temperature and cumulative injected heat that envelopes physically motivated simulations consistent with the still rather weak constraints on the evolution of the neutral hydrogen fraction during the epoch of reionization, or a simple prior on the IGM temperature as interpolated from the measurements of the IGM temperature at  < 4.2 and  > 5.0.This was possible due to the improved range of simulations as well as post-processing techniques to expand on the number of models.While the posterior distributions are indeed sensitive to the choice of these priors, the WDM mass only changes by 2%.This suggests that reasonable thermal priors lead to a stable and robust constraint on the WDM particle mass.We further point out that not imposing any thermal priors leads to a stronger and not weaker bound on the WDM particle mass.
With the data extending to  > 0.1 km −1 s, we have also identified a new source of modelling systematics -the gas peculiar velocity field as modified by inhomogenous reionization.The effect of peculiar velocities is present, although different in amplitude, in both homogeneous and inhomogeneous models of reionization.The peculiar velocity field induces a knee in the flux power spectrum, that appears sensitive to the cumulative injected heat, indicating that the timing and process of reionization are an important factor in the peculiar velocity structure.This high- regime of the models is also sensitive to the numerical mass resolution of the simulations (at a level of up to 20%).The peculiar velocity field structure is sensitive to the mass resolution at the level of as much as 50%, resulting in thermal history dependent mass resolution corrections.All of these statements result in a similar effect on the WDM mass inference, weakening the constraint to  WDM > 4.44 (5.10) keV at 95% C.L. for including the mass resolution and inhomogeneous reionization respectively.Combining inhomogenous reionzation and mass resulution corrections together with observationally motivated thermal prior results in a WDM constraint that is not very different from our fiducial analysis, while at the same time preferring a slightly hotter IGM and reionization histories that end later.These statements are, however, somewhat model and simulation dependent, and indicate that further work into the origin and impact of the small-scale peculiar velocity structure is required.However, an important result for the particle astrophysics modelling is that WDM particle masses of 2.5 keV are ruled out at more than 5, and 3 keV at more than 3, for any of the analysis choices presented in this paper.In fact, the 3 keV is ruled out at 5 for any of the reasonable choices of thermal priors (e.g. a  0 prior).
We summarize the main conclusion points as follows: • Our fiducial analysis leads to improved WDM mass constraints from high-redshift quasar spectra of  WDM > 5.7 keV at 95% C.L.
• Using small scale data cuts, limiting the analysis to  max < 0.1 km −1 s, results in a WDM constrain of  WDM > 4.1 keV at 95% C.L., a factor of two stronger constraint than previously published for the same choice of thermal priors and redshift range of the data [21,22].
• The 50% higher WDM constraint coming from smallscales  > 0.1 km −1 s has been explored with a variety of checks for instrumental systematics.We find that the flux noise may be underestimated by 40% in the data, reducing the constraining power at  > 0.1 km −1 s.It should be possible to mitigate this in future surveys by a careful study of the instrumental noise, as well as by obtaining higher signal-to-noise spectra.
• The modelling uncertainties on the small-scale peculiar velocity structure can weaken the constraining power on the WDM mass by as much as 25%.The effect of thermal history and inhomogenous nature of reionization on the peculiar velocity fields of the baryonic gas is still poorly explored and needs further study.
The Lyman- forest data continue to push the frontier on astrophysical constraints on the CDM paradigm.The new constraints on the free-streaming of dark matter in this study both improve on exisiting constraints, and demonstrates that a larger number of quasar sightlines should translate into strong improvements on the WDM particle mass bound.As has been done in previous studies [18,21,22], the high-redshift Lyman- forest data can be further combined with the lowredshift ( < 4.0) flux power spectrum measurements in order to increase the redshift leverage arm, and further push the constraints on the free-streaming of dark matter.We leave such a study for future work.

6 ,
τ eff = 1.40 u 0 = 7.69 eV/m p z = 4.6, τ eff = 1.40 u 0 = 7.69 eV/m p z = 4.6, τ eff = 1.40 u 0 = 7.69 eV/m p z = 4.6, τ eff = 1.40 u 0 = 7.69 eV/m p z = 4.6, τ eff = 1.40 u 0 = 7.69 eV/m p T 0 = 0.84 × 10 4 K T 0 = 0.92 × 10 4 K T 0 = 1.01 × 10 4 K T 0 = 1.06 × 10 4 K T 0 = 1.12 × 10 4 1.01 × 10 4 K T 0 = 1.01 × 10 4 K T 0 = 1.01 × 10 4 K T 0 = 1.01 × 10 4 K u 0 = 6.59 eV/m p u 0 = 7.69 eV/m p u 0 = 9.59 eV/m p u 0 = 11.43 eV/m p m WDM = 4.0 keV m WDM = 3.0 keV m WDM = 2.0 keV 0 −1.8 −1.6 −1.4 −1.2 −1.0 −0.8 log 10 k/[km −1 s] FIG.3:The 2D posterior distributions of the best-fit analysis for the 1D flux power spectrum measurements of[25] using UVES/HIRES quasar spectra.The blue contours show the default analysis, and the orange contours show the analysis that captures our best knowledge of the thermal history (Sec.IV D), inhomogenous reionization (Sec.IV G) and mass resolution corrections (Sec.IV F).The three columns correspond to the three different redshifts of  = 4.2, 4.6 and 5.0 (from left to right).The bottom row shows the contours in the thermal parameter space, with the violet band shows the envelope around the physically motivated simulations, shown as gray points (squares, circles and triangles).This band serves as a prior in the thermal parameter space in the default model.The coloured points correspond to the measurements in the literature from the same data set: from[25] (purple);[15] (in green); and[24] (in red).The top panels show the 1 and 2  contours in the parameter space of free-streaming and pressure smoothing (heat injection).The vertical dotted line and surrounding gray band indicate the best-fit measurements of[25].The intersecting gray dashed and dot-dashed lines show typical degeneracy axes between the parameters.The cutoff at small  0 and small  0 values comes from the implicit prior imposed by the extent of the grid of models (see text for details).

FIG. 5 :
FIG.5:The 2 constraints on the thermal relic warm dark matter mass.The arrows indicate the exclusion limits on the WDM particle mass in keV.The bottom panel shows a compilation of constraints from high redshift Lyman- forest 1D flux power spectrum.The black arrows at the very bottom indicate the results of this study, for three different analysis choices pertaining the measured flux power at highest wavenumbers (default, (,   ,  0 ) that represents corrections due to patchy reionization, thermal dependence on the mass resolution and independent  0 prior, and  max < 0.1 km −1 s data scale cut analysis).The resulting lower bounds on the WDM particle mass are stronger or comparable to those previously published in the literature, including studies that combined low-and high- Lyman- forest data to increase the redshift lever arm (middle panels).The top panel shows a compilation of results from non-Lyman- studies.

FIG. 7 :
FIG.7: Effect of thermal priors on the posterior.The panels show 2D posterior distributions for redshift  = 4.2 in the plane of temperature and heat injection (left) and warm dark mass and heat injection (right).In the thermal parameter space (left), the default analysis (blue contours) uses thermal priors that envelop the simulations (the envelope is shown as a violet shaded area).A similar result can be obtained by instead imposing a  0 () prior (orange contours) using independent temperature measurements[44,50].The warm dark matter particle mass constraints get slightly stronger if a temperature prior is used instead of the envelope prior in the  0 −  0 plane.As a reference we also show an analysis without imposing any thermal priors (green contours).The vertical shaded band on the left panel indicates measurement of  0 from[25].

v 5 PFIG. 8 :
FIG.8: Effect of peculiar velocities on the small scale ( > 0.1 km −1 s) suppression of power.Left: The ratio of 1D flux power spectra of very early ( rei,end = 7.5) and reference ( rei,end = 6.0) reionization models (solid blue).The dashed lines show the effect of replacing the peculiar velocity fields of both simulations simultaneously.The dashed blue line shows the effect where no peculiar velocities are included.The coloured lines show the effect of using peculiar velocity fields corresponding to different thermal histories -in particular different heat injection values.All models show a relative increase of small-scale power ratio compared to the model with no peculiar velocities.The strength of this relative increase correlates with heat injected during reionization, with  pec coming from a late reionization run (low heat injection; red dashed line) giving the strongest signal.Right: The relative increase in the small-scale structure of the 1D flux power spectrum is related to small-scale structure in the peculiar velocity field.A feature is present in the power spectrum of the peculiar velocity gradient ( = ∇ pec ), where the peak shifts from  ∼ 0.15 km −1 s for early reionization models (with higher heat injection) to  ∼ 0.30 km −1 s for late reionization models (with lower heat injection).Models with cumulative injected heat of  0 = 8.14, 11.6, 7.11, 21.1 eV/m p correspond to the L20-ref, L20-very early, L20-late, L20-very early-hot models respectively.Similarly, the models from[6] with  0 = 6.04 eV/m p correspond to their late reionization model ( rei,end = 5.3).

FIG. 13 :
FIG.13: Effect of marginalizing over the noise uncertainty distribution.The two panels show 2D posterior distributions for redshift  = 4.2 in the plane of temperature and heat injection (left) and warm dark matter mass and heat injection (right).As the noise affects the amount of small-scale power it effectively removes the information from those scales, which leads to poorer constraints on thermal parameter as well as warm dark matter mass.The noise distribution is marginalized over the measured distribution from[25].However, the results remain largely unchanged if the shape of the distribution was changed to a normal distribution with a standard deviation of 10% of the measured power.This can also be included at the level of the covariance matrix. 3 ).The cosmology parameter ranges for  8 include five runs [0.754, 0.804, 0.829, 0.854, 0.904] and similarly 5x runs for   [0.921, 0.941, 0.961, 0.981, 1.001].The WDM mass in keV −1 of 0 indicates a CDM run.The other WDM runs are for 2, 3 and 4 keV WDM particle mass.