How blind are underground and surface detectors to strongly interacting Dark Matter?

Above a critical dark matter-nucleus scattering cross section any terrestrial direct detection experiment loses sensitivity to dark matter, since the Earth crust, atmosphere, and potential shielding layers start to block off the dark matter particles. This critical cross section is commonly determined by describing the average energy loss of the dark matter particles analytically. However, this treatment overestimates the stopping power of the Earth crust. Therefore the obtained bounds should be considered as conservative. We perform Monte Carlo simulations to determine the precise value of the critical cross section for various direct detection experiments and compare them to other dark matter constraints in the low mass regime. In this region we find parameter space where typical underground and surface detectors are completely blind to dark matter. This"hole"in the parameter space can hardly be closed with an increase in the detector exposure. Dedicated surface or high-altitude experiments may be the only way to directly probe this part of the parameter space.


I. INTRODUCTION
The existence of large quantities of dark matter (DM) in the universe is backed by strong astrophysical evidence [1,2]. Direct detection experiments are consequently trying to observe nongravitational interactions between DM particles from the galactic halo and target atoms inside a terrestrial detector [3,4]. So far no conclusive signal has been reported, and detectors continue probing weaker and weaker DM-nucleon interactions by increasing their exposure [5] or extend their search to lower DM masses by decreasing their recoil energy threshold [6,7]. Other approaches focus on experimental signals other than nuclear recoils to probe light DM, as e.g. DMelectron scatterings [8] or inelastic DM-atom scatterings that produce photons [9,10].
These experiments are typically located underground, beneath ∼ 1 km of rock, and equipped with additional shielding layers in order to reduce background signals. This comes at the price of impairing the experiment's sensitivity to strongly interacting DM, since elastic DM-nucleus collisions occur not only in the target material but also inside the Earth crust prior to reaching the detector. These collisions slow down and deflect the DM, attenuating the flux of DM particles capable of triggering the detector, such that the Earth crust could shield off DM. Above a critical cross section the shielding effect of the crust or the atmosphere is effective enough that any detector on Earth becomes blind.
The scenario of strongly interacting DM 1 is highly constrained by astrophysical observations. Strong interactions between DM and baryons in the early universe lead to momentum transfer between them and alter the anisotropy of the Cosmic Microwave background (CMB) [11][12][13]. Strong interactions between DM and cosmic ray nuclei lead to the production of neutral pions and consequently γ-rays [14][15][16]. They also introduce a collisional damping effect during cosmological structure formation [17]. Further constraints on strongly interacting DM are set by satellite experiments such as IMP7/8 [18,19], early experiments on the Skylab space station [19,20], searches for new forces between nucleons [21], the rocket-based X-ray Quantum Calorimeter experiment (XQC) [22], as well as balloon-based experiments such as IMAX [19,23] and the one by Rich et al. [24]. Considering that terrestrial detectors lose sensitivity above a certain cross section, an important question is whether or not there exists allowed parameter space, a "window" between the constraints mentioned above and the ones from direct detection [25].
Indeed there have been several allowed windows in the constraints on strongly interacting DM over a wide range of DM masses [25,26]. In order to close these windows, new constraints were derived based on DM capture and heat flow inside the Earth [27], on observations by the IceCube experiment [28] and the Fermi Gamma-ray Space Telescope [15]. For light DM a window in the constraints was closed by reanalyzing old results from DAMIC and XQC [29], and using the new results of the CRESST 2017 surface run [30]. Most recently several holes in the constraints on super-heavy DM were closed using direct detection experiments only, namely by a reanalysis of CDMS-I [31].
As we mentioned, predetection DM-nuclei scatterings can reduce and effectively eliminate the flux of detectable DM particles in underground detectors. Consequently terrestrial experiments constrain DM-nucleon cross section within a band only. Below the lower limit, there is simply not enough exposure for an experiment to probe DM. Above the upper critical cross section, no constraint can be imposed, because DM interacts too strongly with nuclei and loses enough energy or gets deflected, resulting in a non-detectable flux at the location of the detector.
In this paper, we use Monte Carlo (MC) techniques to precisely determine this upper critical cross section, above which each experiment fails to probe DM. Previous works on this matter treated the stopping of DM in an overburden with analytic formulas which capture the average energy loss of DM particles traversing through the Earth crust [25,[30][31][32][33]. However, this analytic treatment does not suffice for a precise determination of the critical cross section. Since the analytic description overestimates the stopping power of the Earth crust by capturing the average energy loss, the bounds obtained this way turn out to be conservative, and may be extended towards higher cross sections. For this purpose dedicated MC simulations of DM particles propagating and scattering in the shielding layers are necessary, as first applied in this context in [26], to DM-electron scattering experiments in [34], and further developed, motivated and applied in [29,35].
Similar simulations have been used to study diurnal signal modulations for intermediate cross sections in [36][37][38] and more recently with our DAMASCUS code in the context of light DM [39,40]. Built upon this tool we developed a dedicated simulation code for the determination of the critical cross section of strongly interacting DM. In this paper we present results for CRESST-II [41], the CRESST 2017 surface run [7], and XENON1T [5], and put the direct detection constraints into context with constraints from other sources. We also compare the MC methods to analytic treatments.
In the first section of this paper we review the commonly applied analytic methods to describe nuclear stopping of DM inside matter, its application to direct detection, and motivate the use of MC simulations. Then we move on to describe our simulations and discuss the method to obtain the critical DMnucleon cross section. In section IV we present our findings, before we conclude in section V. Furthermore we provide the interested reader with a set of appendices, which go into more detail about computational methods, the considered detectors and the implemented model of the Earth crust and atmosphere.
The code DAMASCUS-CRUST developed for this paper is publicly available [42].

II. REVIEW OF ANALYTIC METHODS
Different analytic methods have been applied to set limits on the sensitivity of DM detectors on strongly interacting DM. In this section we review these methods, pointing out their shortcomings and motivate the use of MC simulations.
We start by considering a DM particle of mass m χ and energy E χ moving through matter interacting with the nuclei via spin-independent contact interactions. The average energy loss due to elastic scatterings is described [25,32] by where i runs over the abundant nuclei species, n i is their corresponding number density, and E R is the nuclear recoil energy with its maximum value E max mi . From this point on µ ij denotes the reduced mass of particle i and j, and m i is the nuclear mass of isotope i with mass number A i . The differential cross section 2 is given by (2a) 2 To avoid notation clutter we omit the "SI" from now on. with σ tot Here we introduced the spin-independent DM-nucleon scattering cross section σ χn , and the nuclear form factor F i (q 2 ), a function of the momentum transfer q, which is related to the recoil energy via q 2 = 2m i E R . For light DM setting F i (q 2 ) ≈ 1 is an excellent approximation, and the differential cross section no longer depends on E R . Hence in this case eq. (1) reads and becomes solvable for a homogenous medium with constant density and composition. Then DM particles of initial energy E ini χ travelling a distance d through this medium will end up with the average final energy In the last expression ρ is the (constant) mass density of the medium and f i the mass fraction of nuclei with mass number A i . In an earlier work we numerically confirmed, that this equation indeed describes the average energy loss accurately [34].

Method a:
We can use the analytic description of nuclear stopping to give a first estimate of the critical cross section. Every direct detection experiment has a recoil energy threshold E thr R . For a given DM mass m χ this corresponds to a minimal energy E min χ , such that less energetic DM particle can no longer trigger the detector, Here m T is the mass of the detector's lightest target nuclei. Above a certain cross section even the most energetic DM particles of the halo travelling towards the detector on the most direct path get slowed down below this value via nuclear stopping. It may serve as a first estimate of the critical cross section. Therefore we solve for σ χn using the initial energy E ini χ = mχ 2 (v esc + v ⊕ ) 2 with the maximum DM speed of the halo, the sum of the galactic escape velocity v esc ≈ 544 km/sec and the Earth's velocity in the galactic frame, v ⊕ ≈ 230 km/sec. The detector under consideration is assumed to be located at an underground depth d. We therefore assumed that the particles take the shortest way from the Earth surface to the detector. The resulting estimate of the critical cross section is given by This method has been developed and applied in [25,32,34]. It turned out to be a reasonably good first estimate of the critical cross section for strongly interacting DM, but has a number of shortcomings.
1. Unlike the usual lower limit on the cross section set by a direct detection experiment, the critical cross section obtained this way is completely independent of the experimental exposure. In order to set the upper and lower constraint limits on the cross section on equal footing, we need to find the critical cross section using equivalent methods.
2. To set v ini χ = v esc + v ⊕ seems rather arbitrary and ignores the knowledge of the halo's speed distribution. However, this makes this approach more conservative. 3. We assumed F i (q 2 ) ≈ 1 for the nuclear form factor, which holds for light dark matter only.
4. The actual particles do not travel on a straight line toward the detector. This approach ignores deflections of the particles. However, one can argue that also this makes (7) more conservative since longer distances travelled underground increase the energy loss.
5. The analytic stopping equation overestimates the stopping power in finding the precise constraint, as pointed out in [29,35]. We emphasized that the analytic stopping equation accurately describes the average energy loss of DM particles travelling through matter. It does not describe the few, rare particles from the distribution's tails, which scatter on much fewer nuclei than the average particle. The number of these particles is naturally suppressed, which is compensated by the large probability to trigger the detector, once a DM particle actually reaches it. The high cross section both increases the energy loss in the overburden as well as the detection event rates.
These points make it hard to evaluate precisely the validity of eq. (7), since point 2 and 4 underestimate the crust's shielding power, while point 5 overestimates it. With analytic means problem 5 can hardly be solved, while point 4 can be addressed in the single scattering regime only [43], where the cross section is too low to impair the overall sensitivity. However, point 1, 2 and 3 can be addressed directly by implementing the analytic stopping description into the computation of detection event rates, as we will do next.

Method b:
The differential event rate of a generic direct detection experiment is given [44] by where f d (v) is the speed distribution at the detector's location 3 . It is straightforward to include the effect on nuclear stopping by altering the speed distribution. For this purpose we rewrite eq. (4) in terms of speed, v fin Next we need to find the DM speed distribution function f d (v) at depth d based on our knowledge of the distribution f (v) at the surface. Since we assume that all particle move on a straight line from the surface to the detector while getting decelerated, the particle flux is conserved, We can read off the underground distribution, The speed distribution f d (v) can be seen in fig. 1. This ex- Note the assumption that all particles move directly from the surface straight down to the given depth. Because the particle flux is conserved, the density grows as the particles get slower. This is reflected by the fact that the distribution function is no longer normalized.
pression can be substituted into (8). The resulting spectrum can be written in terms of the surface distribution f (v), The result can then be used to compute recoil spectra and event counts in the usual way, and the shielding effect of the overburden is included automatically, i.e. the signal weakens rapidly with increasing cross section above some critical value. At this point we see that the critical cross section obtained from method a is exactly the value, above which we obtain zero using eq. (11). Hence method a will always slightly overestimate the upper bound of the excluded band compared to method b. This method was further refined and applied [30], taking into account the geometry of the whole Earth, not just the overburden above the laboratory as well the atmosphere. However, close to the critical cross section the DM-matter interactions are so strong that virtually all DM particles reach the detector from above. For intermediate cross sections additional attenuation occurs due to DM particles passing through the bulk of the Earth before reaching the detector from below, which is not included in eq. (10).
Finally the nuclear form factor can also be included, as done in the context of constraining super-heavy DM [31,45], fixing problem 3 of our previous list. This leaves us with point 4 and 5, both of which can be solved by MC simulations of DM trajectories undergoing scatterings on nuclei of the shielding layers above a detector.

III. MONTE CARLO SIMULATIONS
The general principles of our MC simulations are laid out in detail in [39]. We simulate DM particles that travel underground on straight lines until they scatter elastically on a nucleus, where the particle gets decelerated and deflected. As opposed to the DAMASCUS code we model the overburden, i.e. the Earth crust, the atmosphere and shielding layers as planar layers instead of a spherical Earth model. Close to the critical cross section the DM-baryon interaction is so strong that particles reach the detector virtually exclusively from above. For the details of the crust and atmosphere models we refer to appendix C.
The number of shielding layers in our simulations is variable, each layer is characterized by its density and nuclear composition. The mean free path λ inside layer i is given by Here ρ i denotes the ith layer's mass density, and n ik , f ik the corresponding number density and elemental abundance of the isotope k. Before a DM particle scatters inside a shielding layer, it travels freely for a distance with ξ being a uniformly distributed random number. Where a particle passes a boundary into another shielding layer before the next scattering, the mean free path changes along the way.
To account for this discontinuity, we implement a recursive algorithm to find the location of the next scattering regardless of how many layer boundaries are crossed. A flow chart can be found in figure 2.
Once the location of scattering is determined, the collision on a nucleus is simulated, where the probability of colliding on nucleus species k of layer i is given by The particle's velocity changes from v χ to The unit vector n yields the final direction of the DM particle in the center-of-mass frame. Here we implicitly assumed that the target nuclei are particles at rest. The relative velocity is dominated by the incoming DM particle. We denote the angle between n and the incoming velocity v χ as α. For light DM the distribution of n is isotropic. For more details on the distribution function of α, or rather cos α, we refer to the appendix A 2.
The main goal of the MC simulations is to obtain a precise estimate for the DM speed distribution f (v) at the detector's location over the interval of interest The minimum speed depends on the experiment's threshold E thr , target atoms, and the energy resolution where m T is again the mass of the lightest target isotope. For light DM the interval of interest typically lies in the high velocity tail. The above steps repeat until either the particle gets reflected into space, gets decelerated below the minimum speed, or reaches the depth of the detector. In the last case we record its speed.
To make the connection from the data to direct detection rates, we need to estimate the speed distribution f d (v) at depth d. Our estimate is given by the product of an attenuation factor and the normalized speed distribution, The attenuation factor accounts for the fraction of particles making it to detector depth, while still being detectable, Note that we only sample initial conditions within the inter- Solve eq. (A12) for l.
Are both x0 and xnew in the same layer?
No. Did the particle reach the detector or space? No.
FIG. 2. Recursive algorithm to find the location of the next scattering with multiple layers and Importance Sampling (hence the presence of δ λ , see app. A 2). The algorithm concludes when either the location of the next scattering is found, the particle gets reflected into space, or the particle reaches the detector depth. Note that λ refers to the local mean free path of a particle at x0 with speed v0. It is furthermore important that xexit is uniquely identified as element of the new layer.
Hence, if we simulate N sim particles, this corresponds to a total particle number of Here we assumed to have simulated N sim trajectories to get N data points, associated with each is a statistical weight w i . For a given point in parameter space (m χ , σ χn ) and a given experiment at depth d, the simulations provide us with a sample of speed data and their corresponding weights {(v 1 , w 1 ), ..., (v N , w N )} as well as the attenuation factor a. We follow the following steps to extract the recoil spectrum.
3. Depending on the experiment of interest compute the recoil spectrum dR dE including the detector's energy resolution, efficiency, and quenching factors.

Compute detection signal counts and likelihoods.
For a given DM mass m χ we repeat the simulations for different cross sections. Starting at the usual upper bound we systematically increase the cross section until the likelihood grows above (1-CL), and the experiment no longer constrains the cross section. The exact value of the critical cross section is finally determined by interpolating the likelihood as a function of the cross section.

A. Computational details
The simulation code DAMASCUS-CRUST is written in C++ loosely based on our previous DAMASCUS tool [40]. It is publicly available [42]. All simulations for this work were run on the Abacus 2.0, a 14.016 core supercomputer of the DeIC National HPC Center, SDU. At various places we use interpolation with Steffen splines, which has the great advantage of avoiding spurious extrema [46]. All numerical integrations in the simulation and data analysis are performed with the adaptive Simpson method [47]. For the non-parametric, data-based estimate of the speed distribution function we use univariate Kernel Density Estimation (KDE). To speed up the simulations we implement Importance Sampling (IS) as proposed by Mahdawi and Farrar [29,35]. The details on both KDE and IS are presented in appendix A.

IV. RESULTS
We compare the different methods reviewed in section II with results from the MC simulations. As an illustrative example we show the number of events at XENON1T for a DM mass of m χ = 10 GeV as a function of the DM-nucleon scattering cross section in figure 3. Without the stopping effect of the 1400m of rock overburden taken into account the number of events naturally keeps growing with the cross section. If, instead of the unmodified Maxwell-Boltzmann distribution, we use eq. (10), we successfully reproduce the characteristic behaviour. While the overburden's shielding effect has virtually no effect for intermediate cross sections, since the mean free path is much longer than the underground depth, above a critical cross section the event number in the detector drops drastically. By looking for the cross section for which the likelihood drops below 0.1, we obtain the 90%CL bound.
It should be mentioned at this point that the assumption that all particles reach the detector from above is of course only valid for high cross sections. In the intermediate regime par- arriving at the detector from below will also contribute and attenuate event counts, giving rise to diurnal modulations. This is not taken into account here, where all particles are assumed to reach the detector from above and the focus lies on finding the critical cross section of strongly interacting DM.
ticles entering the detector from below will travel through the bulk of the Earth and undergo scatterings. This leads to additional attenuations and diurnal modulations, which will not show up in fig. 3. For the investigation of the intermediate regime, we refer to [39]. Comparing the blue curve to the MC results, the advantage of the simulation approach becomes very clear. The analytic stopping equation obviously overestimates the stopping power of an overburden and makes the event number drop too fast with increasing cross section. In reality, particles which scatter fewer times than the average still reach the detector capable of triggering it. Therefore MC simulations make constraints on strongly interacting DM more stringent, extending to higher cross sections. The resulting limits are not just more restrictive, but also more accurate, robust and consistent, since upper and lower bounds are on equal footing.
In a recent paper [35] the authors claim that the analytic description fails in deriving the critical cross section of strongly interacting DM, quoting a discrepancy in the number of events of multiple orders of magnitude. However, looking at fig. 3 it is clear that any method which conservatively underestimates the critical cross section, will lead to much higher event numbers compared to the corresponding MC simulations. Yet, this discrepancy says very little about the accuracy of the critical cross section estimate as the actual quantity of interest, since the event number drops very steeply. The limits obtained with the analytic descriptions may be conservative and improvable, but they are still valid. They typically underestimate the critical cross section just by a factor of a few.
For completeness we also include the corresponding bound obtained with method a, i.e. the simple speed cutoff criterion. In this case it gives a reasonable and conservative estimate, which is more restrictive than the limit of method b as expected. However, without the MC results a quality assessment Our results for the 90% CL constraints on light DM for CRESST-II [41], the CRESST 2017 surface run [7], DAMIC(2011) [48], and XENON1T [5]. Also included are constraints from XQC [22], and the CMB [13]. At the bottom of the plot we included the neutrino background [49], and in black dashed lines we indicate the new constraints from CRESST-III [6].
would not have been possible, as discussed in sec. II.
We show the main results of this study in fig. 4, the constraints on DM with masses between 100 MeV and 20 GeV from CRESST-II, XENON1T, DAMIC(2011), and the CRESST 2017 surface run, together with constraints from the XQC experiment and the CMB. For each mass and detector 4 , we obtain an excluded band of cross sections, from a lower limit to the upper critical cross section due to shielding of strongly interacting DM.
The DAMIC(2011) constraints are fully covered by the two experiments of the CRESST collaboration. The purpose of including these result is to compare them to limits obtained with the DMATIS code [29] as an independent and valuable cross check of our simulation. For the masses between 1 and 100 GeV we find an average relative deviation between the two limits of about 15% with slightly higher deviations for masses of order O(1 GeV). But overall the two limits seem to agree to a reasonable precision. Further cross checks and comparisons might be desirable, though the DMATIS code has not been released at the time of submission of this paper.
Both CRESST-II and XENON1T are located deep underground at LNGS. Hence it comes to no surprise that they turn out to be rather insensitive to strongly interacting DM. In the low-mass regime they constrain cross sections up to ∼ 10 −30 cm 2 and ∼ 10 −31 cm 2 respectively. Most interesting is last year's CRESST 2017 surface run of a prototype detector developed for the ν-cleus experiment. As opposed to the vast majority of DM detectors it was not placed underground and is therefore ideal to constrain strongly interacting DM. It probes and constraints cross section around three orders of magnitude higher than XENON1T and CRESST-II despite its small exposure. The resulting constraint close all allowed windows between other terrestrial detectors and the XQC experiments. However, a window between the new constraints and the CMB limits remains, which might get narrowed by taking constraints from cosmic rays into account [14].
In fig. 5 we compare our CRESST 2017 surface run MC constraints with analytic results from recent works. Depending on the DM mass the constraints reported in [30,31], whose analytic methods are based on method b of this paper, underestimate the critical cross section by up to a factor of 2. It should be mentioned again that this method assumes that a DM particle simply moves along a straight path through the Earth while continuously losing energy, an approximation that breaks down for light DM, as discussed in section II. Most recently, the authors of [16] presented a new, and simple analytic method to establish very conservative limits, which circumvents this assumption. It is based on rescaling the number of events while filtering out DM particles that scatter at least once in the overburden. The constraints are therefore considering only the unscattered DM population to set a constraint. But a single scattering is not sufficient to render the flux of the remaining, scattered DM particles undetectable, which is why their constraints are so conservative. The constraints from MC simulations improve upon these results by at least one order of magnitude.
We should emphasize that the exposure has only little effect on the critical cross section as opposed to the lower limit. Despite an exposure larger by a factor of ∼700, the bounds for masses below 20 GeV by XENON1T exceed the ones by CRESST-II by ∼10% only. Comparing to DAMIC(2011) and the CRESST surface run(2017), it becomes clear that the underground depth is by far the dominating factor. This means that in order to close the window between the CMB and CRESST 2017 surface run constraints by means of direct detection, a dedicated surface or high-altitude run of a sensitive portable detector is needed.

V. CONCLUSIONS
In this paper we determine for the first time the precise limits that different underground detectors set on the DM cross section-mass parameter space focusing in the interesting and to great extent unconstrained low DM mass region. We determine the precise critical DM-nucleon scattering cross section with MC techniques above which a given detector with a certain depth becomes blind to DM. This is important because it is essential to know if there are blind spots in the DM parameter space not covered by terrestrial detectors. In such a case surface or high-altitude runs of detectors may close remaining windows in the parameter space.
We presented a brief review of the analytic methods, described the details of our MC simulations and presented updated limits for CRESST-II, XENON1T, and the CRESST surface run(2017). In addition we compared the MC results with the analytic methods confirming that the critical cross section estimated via the analytic methods presented in [32] is typically within a factor of a few below the precise MC one.
We found that the recent surface run of the prototype detector by the CRESST collaboration constrains strongly interacting DM with cross sections up to about 2 × 10 −27 cm 2 . This closes the allowed window for light DM between the constraints by terrestrial detectors and the XQC experiment, as also shown in [30]. However there is still an open window between the CMB constraints and the ones of the CRESST 2017 surface run. It is open for masses between 140 and about 300 MeV and extends over more than an order of magnitude in cross section. If DM lies within this window, we find that all underground detectors are completely blind to DM almost regardless of how much they can improve in exposure. Dedicated surface or preferably high-altitude experiments could close this window. Furthermore, a number of complementary astrophysical constraints to direct detection exist as listed in the introduction, which could exclude the detectors' blind spots. Constraints from cosmic rays narrow the allowed window [14]. Under the assumption that DM can annihilate into Standard Model particles, the constraints from the anomalous heat flow in the Earth rule out the upper left corner of figure 4 and close the small window in parameter space [27]. Naturally, these constraints do not apply to e.g. asymmetric DM. The DAMASCUS-CRUST code we developed for this pur-pose can be used for any future experiment of a given target, depth, exposure and energy threshold, providing a definite and precise answer on what parameter space can be excluded.
Finally, we should also emphasize that DM with mass beneath the reach of CRESST could also have large enough cross sections, such that underground scatterings have to be taken into account. Hence, future efforts should be put towards performing a similar analysis for DM-electron scattering experiments. For models with a dark-photon portal a large hierarchy between the DM-electron and the DM-nucleon scattering cross section follows, such that nuclear stopping becomes very relevant for detectors probing DM-electron interactions. Furthermore, it would be desirable in this context to weaken the assumption of a heavy mediator, since scenarios with light mediators are much less constrained. In conclusion we hope that this paper and the corresponding code will serve as a tool to specify the extend to which direct detection experiments constrain models with strong DM-nucleus interactions.

ACKNOWLEDGMENTS
The CP 3 -Origins centre is partially funded by the Danish National Research Foundation, grant number DNRF90, and by the Danish Council for Independent Research, grant number DFF 4181-00055. Computation/simulation for the work described in this paper was supported by the DeIC National HPC Centre, SDU.
Appendix A: Computational Methods

Univariate Kernel Density Estimation
A central problem for the data treatment in this study is the question how to estimate an unknown probability density function (PDF) f (x) with domain I, i.e. x ∈ I, based on a data set {x 1 , ..., x N } and possibly weights {w 1 , ..., w N } in a non-parametric way. Histograms are the first straight-forward idea that comes to mind, but there is a more sophisticated method, which has the advantage of producing a continuous and smooth estimate of the true PDF, Kernel density estimation (KDE) [50,51]. The PDF can be estimated viâ The parameter h is called the bandwidth. An applicable kernel K(x) satisfies Commonly used kernels include uniform, triangular, Epanechnikov (parabolic), cosine, and many more. We usually choose a simple Gaussian kernel, We also define the scaled kernel, such thatf The bandwidth h is the only free parameter and its choice is therefore crucial, similarly to the bin width for histograms. Choose h too large and the KDEf h (x) might oversmooth crucial features of the true PDF f (x). Choose it too small and we might resolve statistical fluctuations. One estimate is Silverman's rule of thumb [52], which takes the sample's size N and standard deviationσ into account. The Gaussian Kernel (A3) might be problematic since the kernel domain is (−∞, ∞). It is a well-known and extensively studied problem of KDE that this can introduce a significant error in cases, where the domain I is bounded. The estimate given by (A5) underestimates the true PDF close to the boundary. This is due to the fact, that the Kernel does not contain information of the boundary and assigns weight to the region beyond the boundary, where naturally no data is found, as visible in the green line of fig. 6.
The issue is most severe for large probability mass close to the boundary, i.e. if the true PDF does not vanish at edges. This is typically the case for the PDF we estimate in the MC simulations of this work. Many methods of boundary bias removal have been proposed, see e.g. [53] and references therein. An effective and easy to implement example is to simply reflect the data around the boundary. Say we have a PDF f (x) with support [x min , x max ] and f (x min ) > 0, and a data sample {x 1 , ..., x N } with x min ≤ x i ≤ x max . We could now adjust eq. (A5) viâ with x refl i ≡ 2x min −x i . This illustrates how the generation of pseudo-data beyond the boundary can reduce boundary bias. This specific example however has the drawback that it leads tof (x min ) = 0, which might not be a feature of the true PDF. Cowling and Hall proposed a procedure to generate pseudodata without this disadvantage [54]. There are ways to linearly combine data points from within the support into pseudodata points (x (−1) , ..., x (−m) ) beyond the boundary, in a way that the correct edge behaviour of f (x) is reproduced. The corrected KDE is then One way of combining data into pseudo data is the following three-point-rule, Since the normalization of (A8a) on its support is no longer guaranteed, it might be necessary to renormalize the PDF. The result can be seen in fig. 6.

Importance sampling
A challenge to MC simulations arises, if the desired sample is composed of rare events. In our scenario it might be the case, that in order to get a single DM speed data point at detector depth we would have to simulate millions and millions of particles. This is, in the best case, inconvenient and inefficient, since it requires a tremendous amount of computing power. In the worst case MC simulations might not be applicable at all. This happens typically if the regions of interest of the involved PDFs fall into their tails. Introducing Importance Sampling (IS), a standard technique in rare event MC simulations, can soften this problem considerably 5 . The basic idea is to artificially increase the number of rare events in a controlled manner, compensating by a proper weighting factor.
Assuming for simplicity that the simulations involve a single probability density function f (x), which determines the behavior of every single particle trajectory. However, our final data set consists exclusively of particles that make it to the detector depth while still being detectable. Hence, we aggressively filtered out trajectories or events based on a given criterion. These events can be very rare, and their statistical properties are expected to differ from the 'typical' particle, most of which had been filtered out. Consequently, the underlying PDF g(x) of the data set differs from f (x).
A MC simulation with IS makes use of this, in order to increase the chance of these rare events. Instead of f (x), we change the simulation PDF to some new PDFĝ(x), which should approximate g(x), mimicking the PDF of the successful events during the simulations, and therefore rare events occur more frequently. The introduced bias has to be compensated by a data weight function, as we describe below. If g(x) is chosen poorly, the method becomes unstable. It is crucial that the biased PDFs approximate the distributions of the final data set. This is why we ran consistency checks and compared "brute force" simulations with IS simulations for several examples.
The weighting factor can be understood based on a simple observation. Assume we have a random variable X with PDF f X (x) and we are interested in the expectation value of a quantity Y (X) on a given interval I, We can trivially rewrite this as The functionĝ(x) can be regarded as the new PDF and the factor f (x) g(x) as the weighting function. Consequently, we sample fromĝ(x) instead of f (x) during the simulations. Provided that we choseĝ(x) as described above, the desired rare events will be favoured, and we can significantly reduce the amount of MC runs necessary to gather the data sample we need.
In many cases this method makes MC simulations practically feasible, where normal MC simulations would fail due to lack of time and resources. This makes the method particularly useful to simulate strongly interacting DM particles moving through the Earth crust and shielding layers [29,35]. Detectable DM particles which make it to the detector depth turn out to differ in their statistical properties from the average DM particle in two ways: 1. They scatter fewer times, and travel freely further than the mean free path.
2. They scatter more in the forward direction.
Both points make the survival to the detector depth more likely. This also indicates how we should alter the PDFs of our MC simulations. In the following, we describe these modifications to the PDFs and the corresponding weight functions in detail.
The distance x a DM particle travels through matter without scattering on one of its constituents has the underlying PDF with the mean free path λ. By increasing the mean free path, we introduce a bias that favors DM trajectories that can make it to the detector with enough energy to get detected. We therefore chose the altered PDF with δ λ > 0. Every time we sample a distance l i from this PDF by solving where ξ ∈ (0, 1) is a uniform random number, we keep track of the weighting factor w λ,i = f λ (l i ) g λ (l i ) = (1 + δ λ ) exp − δ λ l i (1 + δ λ )λ . (A13) If the trajectory consists of n S scattering events, the overall weight of the trajectory is Especially when the particle freely passes different shielding layers a more useful expression for the weight is Next we turn to the scattering angle, where we want to favour forward scattering. For light DM we can approximate the nuclear form factor F A (q 2 ) ≈ 1 and the scattering is isotropic in the center of mass frame. In other words cos α is a uniform random quantity with PDF and domain [−1, 1]. Since forward scattering favors detectable particles at detector depth, we introduce the following IS biased PDF g α (cos α) = 1 + δ α cos α 2 , with δ α ∈ [0, 1] .
At each scattering we sample from g α , i.e. we obtain cos α i by generating a uniform random number ξ ∈ (0, 1) and solving cos αi −1 d cos α g α (cos α) = ξ , (A18) Note that lim δα→0 cos α i = 2ξ − 1 as expected. We keep track of the weighting factor, For heavier DM the approximation for the nuclear form factor F A (q 2 ) ≈ 1 no longer holds. Then the scattering is not isotropic in the center-of-mass frame anymore. Instead the PDF for the scattering angle cos α of a DM particle of speed v χ scattering on a nucleus with mass number A is given by where q 2 = 2µ 2 χA v 2 χ (1 − cos α). We implemented the Helm form factor [56] in our code, which is given by with r n = c 2 + 7 3 π 2 a 2 − 5s 2 , (A22b) c = 1.23A 1/3 − 0.6 fm , In analogy to eq. (A17), the IS biased PDF is chosen to be g α (cos α; v χ ) = f α (cos α; v χ ) + δ α 2 cos α , in order to favour forward scattering. In this case a slight subtlety enters, since the inclusion of the loss of coherence already favors forward scattering naturally. Therefore we have to be careful that g α (cos α; v χ ) does not yield negative values for backwards scattering. This occurs if If this is the case for a particular scattering we adjust δ α to zero for this scattering, and set the corresponding weight w α,i to 1 correspondingly.
The overall weight factor of a successful trajectory with n S scatterings is then If we use IS for both involved PDFs, for l and cos α, the overall weight of a data point is w = w λ w α .  fig. 7. For the sake of easy implementation we divide the atmosphere in a set of layers with constant density, such that the integral ρ(x) dx is the same for each layer. As a cross-check we varied the number of layers, to ensure that our results are stable and do not depend on this discretization.