First Dark Matter Search with Nuclear Recoils from the XENONnT Experiment

We report on the first search for nuclear recoils from dark matter in the form of weakly interacting massive particles (WIMPs) with the XENONnT experiment which is based on a two-phase time projection chamber with a sensitive liquid xenon mass of $5.9$ t. During the approximately 1.1 tonne-year exposure used for this search, the intrinsic $^{85}$Kr and $^{222}$Rn concentrations in the liquid target were reduced to unprecedentedly low levels, giving an electronic recoil background rate of $(15.8\pm1.3)~\mathrm{events}/(\mathrm{t\cdot y \cdot keV})$ in the region of interest. A blind analysis of nuclear recoil events with energies between $3.3$ keV and $60.5$ keV finds no significant excess. This leads to a minimum upper limit on the spin-independent WIMP-nucleon cross section of $2.58\times 10^{-47}~\mathrm{cm}^2$ for a WIMP mass of $28~\mathrm{GeV}/c^2$ at $90\%$ confidence level. Limits for spin-dependent interactions are also provided. Both the limit and the sensitivity for the full range of WIMP masses analyzed here improve on previous results obtained with the XENON1T experiment for the same exposure.

Astrophysical and cosmological observations indicate the existence of a massive, non-luminous, non-relativistic and non-baryonic dark matter (DM) component of the Universe [1]. One well-motivated class of DM candidates is weakly interacting massive particles (WIMPs), which arise naturally in several beyond-Standard-Model theories [2]. Direct detection searches for WIMPs with masses of a few GeV/$c^2$ to tens of TeV/$c^2$ using liquid xenon (LXe) time projection chambers (TPCs) have produced the most stringent limits to date on elastic spin-independent WIMP-nucleon cross sections [3][4][5].
The XENON Dark Matter project currently operates the XENONnT experiment at the INFN Laboratori Nazionali del Gran Sasso (LNGS) underground laboratory. It is an upgrade of its predecessor, XENON1T [6], with a new, larger dual-phase TPC featuring a sensitive LXe mass of 5.9 t. The XENON1T cryogenics, gaseous purification and krypton distillation systems, as well as the 700 t water Cherenkov muon veto (MV) tank [7,8], are reused to operate XENONnT. Inside the water tank, a new neutron veto (NV) detector encloses the TPC cryostat. For the exposure used in this analysis, the NV was operated as a water Cherenkov detector, tagging neutrons through their capture on hydrogen, which releases a 2.22 MeV γ-ray.
The sensitive LXe detector volume, enclosed by a polytetrafluoroethylene (PTFE) cylinder with a height of 1.49 m and a diameter of 1.33 m, is viewed by 494 Hamamatsu R11410-21 3-inch photomultiplier tubes (PMTs) [9] distributed in a top and a bottom array. To fill the vessel housing the TPC, a total of 8.5 t of liquefied xenon is required, which is continuously purified by a new liquid-phase purification system [10]. Together with a high-flow radon distillation system [11], a careful selection of detector construction materials [12] and a specialized assembly procedure, this led to an unprecedentedly low electronic recoil (ER) background of (15.8 ± 1.3) events/(t·y·keV) below recoil energies of 30 keV [13].
Particles depositing energy in the LXe produce a prompt scintillation signal (S1) as well as ionization electrons, which drift upwards and are extracted into the gas above the liquid due to applied electric fields. Here a second scintillation signal (S2), proportional to the number of extracted electrons, is produced. WIMPs are expected to primarily produce nuclear recoils (NRs), where a xenon nucleus recoils, while the background is dominated by ER interactions, where an electron recoils. A higher scintillation-to-ionisation ratio is expected for NRs, but unlike ERs, a fraction of the total recoil energy is also lost as unobservable heat.
Three parallel-wire electrodes (cathode, gate and anode) are used to establish the drift and extraction fields. The gate and anode electrodes are reinforced with two and four transverse wires, respectively, to minimize wire sagging. Two additional parallel-wire screening electrodes are used to shield the PMT arrays from the electric fields. After two months of commissioning at a drift field of 100 V/cm, a short between the bottom screening and cathode electrodes limited the applied drift field to 23 V/cm, corresponding to a maximum drift time of 2.2 ms. The extraction field was set to 2.9 kV/cm in LXe to reduce localized, intermittent bursts of single-electron S2 signals. Despite the lower-than-designed drift and extraction fields, the energy and position resolution, as well as the energy threshold, are comparable to those achieved with XENON1T.
The TPC and veto detectors are integrated into a single data acquisition system [14]. The MV data are acquired with the same hardware event trigger as in XENON1T [15], whereas data from the TPC and NV are acquired in a "triggerless" mode, with each individual PMT channel recording all signals above a channel-specific threshold of 0.13 photoelectrons (PE).
The recorded signals are processed using custom-developed open source software packages [16,17]. Each PMT signal is scanned for PMT "hits" above threshold, and hits found in the TPC channels are clustered and classified into S1, S2 or "unclassified" peaks based on pulse shape and PMT hit-pattern. At least three PMTs must contribute to an S1 within ±50 ns around the center of the integrated peak waveform. Events are built in time intervals between 2.45 ms before and 0.25 ms after S2s, and overlapping events are merged. The event S2 is required to be greater than 100 PE, and to have fewer than eight other peaks larger than half of the S2 peak area within ±10 ms.
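The event-building step described above can be illustrated with a minimal sketch. It assumes that S2 times are given in milliseconds and that windows which touch or overlap are merged into one event; the function name `build_event_windows` is hypothetical and is not taken from the processing software of [16,17].

```python
from typing import List, Tuple

# Event window around each S2, in milliseconds, as quoted in the text.
PRE_S2_MS = 2.45
POST_S2_MS = 0.25


def build_event_windows(s2_times_ms: List[float]) -> List[Tuple[float, float]]:
    """Return merged (start, end) event windows for a list of S2 times.

    Assumption: windows that touch or overlap are merged into one event.
    """
    windows = sorted((t - PRE_S2_MS, t + POST_S2_MS) for t in s2_times_ms)
    merged: List[Tuple[float, float]] = []
    for start, end in windows:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous window: extend it instead of opening a new event.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Two S2s 1 ms apart share one event; the third is isolated.
events = build_event_windows([10.0, 11.0, 20.0])
```

In this toy input the first two S2s produce overlapping windows and are merged, so two events result.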
The PMT hit patterns of S2 signals are used to reconstruct the horizontal position (X, Y) of an event using neural network models [18,19]. Each model was trained on the S2 light distribution on the top PMT array generated through optical simulations with Geant4 [8], corrected for the number of excluded PMTs and the per-PMT electronics response with the XENONnT waveform simulator (WFSim) [20]. The horizontal interaction position resolution for simulated events close to the PTFE detector walls is 1 cm, and 0.75 cm within the fiducial volume (FV), for a 1000 PE S2 (about 30 extracted electrons). The depth, Z, of an interaction is reconstructed from the measured drift time between S1 and S2 and the electron drift velocity with a resolution < 1 %. The 50 % S2 width of a single electron signal is about 600 ns, and the width of S2s within the FV of the detector typically ranges from 2 µs to 9 µs. The drift field has a radial component that shifts ionization electrons originating deeper in the detector inwards when they are observed at the liquid surface. This inward shift is corrected with a data-driven approach, assuming a uniform distribution of $^{83\mathrm{m}}$Kr calibration events in radius squared ($R^2$) as in [18].

The position and time information of the detected S1 and S2 signals is used to correct for the inhomogeneous detector response due to quanta generation and collection effects, and corresponds to corrections of up to 30 % for either signal. Scintillation photons are affected by a position-dependent optical light collection efficiency which reduces the S1 peak area. A light yield (LY) map normalized to the mean response in the FV is generated using $^{83\mathrm{m}}$Kr signals. The electric field dependence of the LY is removed using a drift field map constructed by matching the spatial distribution of $^{83\mathrm{m}}$Kr to a COMSOL [21] simulation, accounting for potential charge accumulations on the PTFE surfaces. This drift field map was validated with data using the measured S1 ratio of the two $^{83\mathrm{m}}$Kr decays [22]. The resulting LY map is valid over the full energy range of this analysis and is used to correct S1 signals; the corrected signals are referred to as cS1.
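The depth reconstruction from the drift time mentioned earlier amounts to a single multiplication. The sketch below is illustrative only: the drift velocity is not quoted in the text, so the value used here is a rough assumption obtained by dividing the 1.49 m PTFE cylinder height by the 2.2 ms maximum drift time, and the sign convention (Z negative below the gate) is also an assumption.

```python
def depth_from_drift_time(drift_time_us: float, drift_velocity_mm_per_us: float) -> float:
    """Return the depth Z in cm for a drift time in microseconds.

    Assumed convention: Z = 0 at the gate, negative values deeper in the TPC.
    """
    return -drift_velocity_mm_per_us * drift_time_us / 10.0


# Rough, illustrative velocity: 1490 mm drift length over 2200 us (assumption).
ILLUSTRATIVE_VELOCITY_MM_PER_US = 1490.0 / 2200.0

# An event with half the maximum drift time sits at roughly mid-height.
z_cm = depth_from_drift_time(1100.0, ILLUSTRATIVE_VELOCITY_MM_PER_US)
```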
The S2 peak area reduces exponentially for signals deeper in the detector, as drifting electrons can be captured by electronegative impurities. This effect leads to a time-dependent lifetime of the free electrons which is corrected using data from $^{83\mathrm{m}}$Kr and $^{222}$Rn decays, and monitored with a new purity monitor system [23]. The charge yield of the respective sources was corrected by the drift field map using low-field data from [24]. An electron lifetime better than 10 ms was reached throughout the science run with a liquid purification flow of 8.3 t/d [10]. The spatial variation in the S2 response is dominated by the position-dependent optical light collection efficiency and inhomogeneous electroluminescence amplification. $^{83\mathrm{m}}$Kr events are used to obtain a normalized horizontal S2 peak area correction map. Time-dependent variations of the single electron gain and extraction efficiency following each ramping up of the electric field are corrected by their respective data-driven trends. S2 signals summed over the top and bottom arrays and corrected for the above effects are referred to as cS2.
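The exponential attenuation described above implies a simple multiplicative correction. The following sketch assumes a single, constant electron lifetime (in the analysis it is time dependent) and uses the hypothetical function name `correct_s2_for_electron_lifetime`; it is meant only to show the size of the effect.

```python
import math


def correct_s2_for_electron_lifetime(
    s2_pe: float, drift_time_ms: float, electron_lifetime_ms: float
) -> float:
    """Undo the exponential charge loss exp(-t_drift / tau) along the drift path."""
    return s2_pe * math.exp(drift_time_ms / electron_lifetime_ms)


# Fraction of electrons surviving the maximum drift time of 2.2 ms
# for an electron lifetime of 10 ms (the lower bound quoted in the text).
surviving_fraction = math.exp(-2.2 / 10.0)

# A 1000 PE S2 observed from the bottom of the TPC under these conditions.
corrected_s2 = correct_s2_for_electron_lifetime(1000.0, 2.2, 10.0)
```

Under these assumptions about 80 % of the charge survives the full drift, so the correction stays below roughly 25 % of the observed S2 area.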
The method to convert the cS1 and cS2 signals of NRs and ERs into a combined energy scale is described in [25]. The photon and electron gains are found to be $g_1$ = (0.151 ± 0.001) PE/photon and $g_2$ = (16.5 ± 0.6) PE/electron, assuming the mean energy to produce a charge or light quantum to be 13.7 eV/quantum [26]. Reconstructed energies using this scale directly give the ER-equivalent energy (keV$_\mathrm{ER}$), while the NR-equivalent energy (keV$_\mathrm{NR}$) requires a model for energy lost to heat, and uses the full NR detector model, described later.
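As a worked illustration of the combined energy scale, the sketch below uses the commonly used form $E = W\,(\mathrm{cS1}/g_1 + \mathrm{cS2}/g_2)$ with the central values quoted above. The exact procedure is the one in [25]; the functional form and the example signal sizes here are assumptions for illustration.

```python
# Central values quoted in the text.
G1_PE_PER_PHOTON = 0.151
G2_PE_PER_ELECTRON = 16.5
W_KEV_PER_QUANTUM = 13.7e-3  # 13.7 eV per quantum


def reconstructed_energy_kev_er(cs1_pe: float, cs2_pe: float) -> float:
    """ER-equivalent energy from corrected signals, E = W * (cS1/g1 + cS2/g2)."""
    n_photons = cs1_pe / G1_PE_PER_PHOTON
    n_electrons = cs2_pe / G2_PE_PER_ELECTRON
    return W_KEV_PER_QUANTUM * (n_photons + n_electrons)


# Hypothetical event with cS1 = 30 PE and cS2 = 3000 PE, about 5.2 keV_ER.
energy_kev = reconstructed_energy_kev_er(30.0, 3000.0)
```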
The science search data was collected from July 6th to November 10th, 2021. This period, named Science Run 0 (SR0), contains a total of 97.1 d of data, which corresponds to a deadtime- and veto-corrected livetime of 95.1 d. The length of SR0 was primarily chosen to investigate the XENON1T ER excess [25], leading to a WIMP search exposure of (1.09 ± 0.03) t·y. The detector conditions were stable throughout SR0 with an average LXe temperature of (176.8 ± 0.4) K and pressure of (1.890 ± 0.004) bar, where the uncertainties represent the corresponding RMS over SR0. PMT gains were monitored by weekly calibrations with a pulsed low-intensity light source, and voltages were adjusted at the beginning of SR0 to achieve gains of $2\times10^{6}$ for all PMTs. The time dependence of the PMT gains was modeled and the signals were corrected, resulting in a gain variation < 3 %. In total 17 PMTs were excluded from analysis due to internal vacuum degradation, instability, light emission or noise. Five of these PMTs are distributed evenly in the top PMT array. Periods of data taken with an intermittent and localized high rate of S2 emission from single or few electrons are not included in calibration and search data. Calibrations with $^{83\mathrm{m}}$Kr were performed every second week to correct the detector response for position- and time-dependent effects, and to monitor the stability of cS1 and cS2.
The NR response of XENONnT and the NV tagging efficiency were calibrated using an external $^{241}$AmBe source which was placed in three positions close to the TPC cryostat. $^{241}$AmBe emits neutrons via the alpha-capture reaction $^{9}$Be($\alpha$, n)$^{12}$C, which has a probability of about 60 % of emitting an additional 4.44 MeV γ-ray [27]. This γ-ray, well above the NV threshold, is used to select NR S1 signals in a 400 ns window. After applying the same data-quality cuts as used in the main analysis, 1986 events remain in the region of interest (ROI), shown in Figure 1. Only (1.8 ± 0.6) events are expected from random coincidences between the two detectors, determined through a sideband study. The tagging efficiency of the NV is estimated from the number of delayed neutron capture signals following the NR S1 signals. This data-driven tagging efficiency is corrected for position-dependent effects using Geant4 [28] simulations which account for the full spatial distribution of neutrons emitted by detector materials [8]. The length of the veto window was set to 250 µs with a 5-fold PMT coincidence and a 5 PE event area threshold in the NV. This gives a neutron tagging efficiency of (53 ± 3) %, and a livetime reduction of 1.6 %.
The ER response model is calibrated with 2051 $^{212}$Pb β events from a $^{220}$Rn calibration source [29] collected before SR0, and with events from an $^{37}$Ar source [30] collected after SR0, as discussed in [13]. NR and ER calibration datasets were fitted using the LXe response model and fast detector simulation described in [31]. For both datasets, a Markov-Chain Monte Carlo (MCMC) sampling of the parameter space gives the best-fit point and posterior distribution. The goodness-of-fit (GOF) was assessed by partitioning the (cS1, cS2) space into equiprobable bins according to both best-fit models and then computing a Poisson $\chi^2$ likelihood, as well as with one-dimensional projections on cS2. Neither test rejects the best-fit model, with two-dimensional p-values of 0.18 and 0.39 for ER and NR, respectively, and no significant p-values for the one-dimensional projections. The calibration data and contours of the best-fit model are shown in Figure 1. The leakage fraction of the $^{220}$Rn ER events below the NR median is $1.1^{+0.2}_{-0.3}$ %. The full ER model has too many parameters to be tractable in the inference toy MC simulations. Using linear combinations of the original parameters identified with a principal component analysis reduces parameter redundancies, and these parameter directions are then ranked according to their impact on the background expectation in a signal-like region in cS1 and cS2. The two parameters with the highest impact are included as nuisance parameters in the ER model used in the WIMP search likelihood.
The ROI is defined by cS1 between 0 PE and 100 PE and cS2 between 126 PE and 12589 PE. Together with detection and selection efficiencies, this gives an energy range with at least 10 % total efficiency from 3.3 keV$_\mathrm{NR}$ to 60.5 keV$_\mathrm{NR}$. All events reconstructed with an ER energy below 20 keV$_\mathrm{ER}$ and found in the cS1 and cS2 contours of the ER and NR band were blinded. For the study of the ER data presented in [13], all events above the −2σ quantile of the ER band or with a reconstructed ER energy larger than 10 keV$_\mathrm{ER}$ were unblinded. The remaining region was unblinded only after finalizing the analysis procedure presented here.
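The ROI definition above translates into a simple box selection in (cS1, cS2). The sketch below assumes inclusive boundaries, which the text does not specify, and the function name `in_roi` is hypothetical. The cS2 bounds are numerically close to $10^{2.1}$ PE and $10^{4.1}$ PE.

```python
# ROI boundaries in PE, as quoted in the text.
CS1_RANGE_PE = (0.0, 100.0)
CS2_RANGE_PE = (126.0, 12589.0)


def in_roi(cs1_pe: float, cs2_pe: float) -> bool:
    """Return True if an event lies inside the (cS1, cS2) region of interest.

    Assumption: both boundaries are treated as inclusive.
    """
    return (
        CS1_RANGE_PE[0] <= cs1_pe <= CS1_RANGE_PE[1]
        and CS2_RANGE_PE[0] <= cs2_pe <= CS2_RANGE_PE[1]
    )


# Hypothetical events: one inside the ROI, one with too small an S2.
inside = in_roi(30.0, 3000.0)
outside = in_roi(30.0, 100.0)
```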
The event selection criteria from [18] were optimized for the ROI in this analysis. Data quality cuts are applied in order to include only well-reconstructed events and to suppress backgrounds. All cuts were optimized based on calibration data and simulations using WFSim. Each valid event is required to have a valid S1-S2 pair. Events tagged by the MV or NV are removed from the data selection, as are multiple-scatter (MS) events, since WIMPs are expected to induce only single-scatter (SS) NRs. The MV uses a veto window of 1 ms with a 5-fold PMT coincidence and a 10 PE MV event area threshold.
A dedicated cut similar to that in [32], using a gradient boosted decision tree (GBDT), was developed to reduce the background due to randomly paired S1-S2 signals called accidental coincidences (ACs). This cut uses S2 area and shape, as well as interaction depth, and reduces the AC background by 65 % at 95 % signal acceptance. Due to an insufficient model of the S2 pulse shape near the transverse wires, caused by local variations of the drift and extraction field with respect to the rest of the TPC, an optimization of the GBDT and other S2 shape-based cuts was not possible with WFSim. Consequently, the LXe target is split into two parts in the modeling for the WIMP search. A less strict data-driven model for the S2 width cut and no GBDT selection is used in an 8.9 cm wide band around the transverse wires, leading to a lower signal-to-background ratio, but with a 10 % higher selection efficiency. The total selection efficiency for these "near"- and "far"-wire regions is estimated following the procedure in [18,25]. Efficiency losses due to the event building are also taken into account in the selection efficiency.
The detection efficiency of the TPC, dominated by the S1 detection efficiency, is evaluated using WFSim and validated with a data-driven method [31,33]. Both methods agree within 1 %. Efficiency losses at small energies are dominated by the detection efficiency. The efficiencies are shown together with the recoil spectra of three different WIMP masses in Figure 2.
In order to mitigate background events from detector radioactivity as well as "surface events" produced by ERs from $^{210}$Pb plate-out [3], only events reconstructed in a central FV (illustrated in Figure A.2 in the supplementary material) are considered in the analysis. The FV shape is optimised based on the background distributions, and is constrained to exclude regions where the detector is not sensitive or models are incomplete. The total LXe mass of the FV after considering the systematic uncertainty of the field distortion correction is (4.18 ± 0.13) t.
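As a consistency check, the quoted exposure of (1.09 ± 0.03) t·y can be recovered from the FV mass above and the 95.1 d livetime of SR0. The sketch assumes a year of 365.25 days and that the exposure uncertainty comes from the FV mass alone; both are assumptions made for this illustration.

```python
DAYS_PER_YEAR = 365.25  # assumed year length for the unit conversion

fv_mass_t = 4.18       # fiducial mass in tonnes
fv_mass_err_t = 0.13   # uncertainty on the fiducial mass in tonnes
livetime_d = 95.1      # deadtime- and veto-corrected livetime in days

# Exposure = mass x livetime, converted to tonne-years.
exposure_ty = fv_mass_t * livetime_d / DAYS_PER_YEAR

# Assumption: only the mass uncertainty is propagated.
exposure_err_ty = fv_mass_err_t * livetime_d / DAYS_PER_YEAR
```

Rounded to two decimal places, this reproduces the exposure and uncertainty stated in the text.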
Five different background components make up the total background model: radiogenic neutrons, coherent elastic neutrino-nucleus scattering (CEνNS), ERs, surface events and ACs. The expectation values for each are summarized in Table I. In addition to the full expectation values, we include for illustration expectation values in a signal-like region defined to contain the half of a 200 GeV/$c^2$ WIMP signal with the best signal-to-background ratio.
The NR background in XENONnT is dominated by radiogenic neutrons from spontaneous fission and ($\alpha$, n) reactions. Neutron yields and energies originating from various detector materials are evaluated as in [8,31]. A custom interface based on the fitted NR model accepts Geant4 simulation inputs, and provides observable quanta processed by WFSim to construct the neutron background model [34]. The neutron rate was estimated based on this full detector simulation and compared against a data-driven method. The data-driven estimate uses a combined Poisson likelihood for MS and SS events tagged by the NV, together with a simulation-driven MS/SS ratio which was validated with $^{241}$AmBe data. The maximum deviation of the MS/SS ratio estimated as a function of radius between data and simulation was found to be less than 20 %. However, a wrong sign in the NV tagging window, discovered only after unblinding of the main data, meant that the simulation and data-driven estimates found before were no longer in agreement. This error arose from the premise that the tagging efficiency was determined in a forward coincidence, counting the number of NV tags for a given set of NR SS events, while the tagging is done by a backwards veto triggered when an NV event satisfies the threshold criteria. In accordance with the analysis plan, the data-driven rate estimate is used. Four events in the WIMP blinding region are tagged by the NV and cut; three of them also fail the SS cut, compatible with the MS/SS ratio from simulations. This gives a total neutron expectation of $1.1^{+0.6}_{-0.5}$ events, which is a factor of 6 higher than predicted by simulations. Analysis choices such as the NV tagging window and the FV were not re-optimized after this correction.

TABLE I. Expected number of events for each model component and observed events. The "nominal" column shows expectation values and uncertainties, if applicable, before unblinding. The nominal ER value is the observed number of ER events before unblinding. Other columns show best-fit expectation values and uncertainties for a free fit including a 200 GeV/$c^2$ WIMP signal component. The best-fit signal cross-section is $3.22\times10^{-47}$ cm$^2$. In addition to the expectation values in the full ROI, we include the expectation values in a signal-like (cS1, cS2) region containing the 50 % of the signal with the best signal-to-background ratio. This region is indicated in Figure 3 with an orange dashed contour. The best-fit and pre-unblinding values agree within uncertainties for all components which include an ancillary constraint term.
The remaining contribution to the NR background is predominantly due to CEνNS from $^{8}$B solar neutrinos. The rate is constrained by measurements of the $^{8}$B flux [35], but the total uncertainty of the expectation value is dominated by the detector response model uncertainties. The number of cosmogenic neutrons is conservatively estimated to be less than 0.01 events after MV tagging [7], not including the additional suppression by the NV. Thus, this background is considered to be negligible.
The ER background is dominated by β-decays of $^{214}$Pb originating from the decay of $^{222}$Rn in the LXe. Solar neutrino-electron scattering, $^{85}$Kr and γ-rays emitted by detector materials also contribute to the ER background [13]. The ER response model fit was updated after unblinding of the main data to use the data quality selections of this study rather than those of [13]. Prior to unblinding, 134 events are found in the ER band of the ROI.
Data-driven models are constructed for AC events and surface background events. The AC background is concentrated at low S1 and S2, and is therefore a particular challenge for low-mass WIMP searches. The model is constructed from a synthetic dataset made from isolated S1s and S2s using the method in [32]. Looser cuts in the near-wire region give a 6 times larger AC rate for this region compared to the rest of the TPC. Background sidebands and $^{220}$Rn and $^{37}$Ar calibration data were used to validate the AC model, and the rate is estimated with an uncertainty of better than 5 %. The surface background model is constructed from $^{210}$Po events originating from the TPC walls, using a similar method as in [31]. The data is described in radius using a parametric likelihood fit based on events found below the blinded region. cS1 and cS2 are modeled using a kernel density estimation derived from events reconstructed outside of the TPC. The wall model is validated using the unblinded WIMP region outside of the FV as a sideband. The expected values for both backgrounds are summarized in Table I and their distributions in the (cS1, cS2) space are shown in Figure 3. In addition, an extended version of Table I differentiating the near- and far-wire regions can be found in Table II in the supplemental materials.

The statistical analysis of the WIMP search data uses toy MC simulations of the experiment to calibrate the distribution of a log-likelihood-ratio test statistic as in [31,36]. Four terms make up the likelihood: two search-data terms for events near and far from the transverse wires, an ER calibration term and a term representing ancillary measurements of parameters. The first three are extended unbinned likelihoods in cS1, cS2, as well as R for the first term. All three terms have the same form as equation (21) in [31]. The two search-data likelihoods include components for the ER, AC, surface, CEνNS and radiogenic neutron backgrounds, as well as the WIMP signal. The $^{220}$Rn calibration term includes the ER model as well as an AC component. The expected number of events for each component is a nuisance parameter in the likelihood. In addition, two shape parameters for the ER model are included, and a parameter representing the uncertainty of the expected number of signal events given the NR response model. The ER shape parameters mainly modify the signal-like ER tail below S1 = 10 PE, where they allow the signal-like ER tail below the median S2 expected from a 200 GeV/$c^2$ WIMP to vary between 0.009 and 0.017 at 60 % confidence level. The signal shape is fixed, as even a large signal excess would be small enough that the calibration constraints would dominate. The signal expectation value for a certain cross-section is included as a nuisance parameter. The ancillary measurement term includes Gaussians representing the measurements constraining the AC, radiogenic, surface and CEνNS rates, and the uncertain signal expectation.
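As a reading aid, the general structure of an extended unbinned likelihood term of the kind described above can be written as follows. This is the generic textbook form, given here as an assumption; it is not a reproduction of equation (21) in [31]. The symbols $\mu_c$ (expected number of events of component $c$), $f_c$ (its normalized probability density), $\vec{x}_i$ (the observables of event $i$, for example cS1, cS2 and R) and $\vec{\theta}$ (the nuisance parameters) are introduced only for this illustration.

```latex
\mathcal{L}(\sigma, \vec{\theta})
  = \mathrm{Pois}\bigl(N \mid \mu_\mathrm{tot}(\sigma, \vec{\theta})\bigr)
    \prod_{i=1}^{N} \sum_{c}
    \frac{\mu_c(\sigma, \vec{\theta})}{\mu_\mathrm{tot}(\sigma, \vec{\theta})}\,
    f_c(\vec{x}_i \mid \vec{\theta}),
\qquad
\mu_\mathrm{tot}(\sigma, \vec{\theta}) = \sum_{c} \mu_c(\sigma, \vec{\theta})
```

Here $N$ is the number of observed events and $\sigma$ the WIMP-nucleon cross section, which enters only through the expected number of signal events.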
The signal NR spectrum is modeled with the Helm form factor for the nuclear cross section [37], and a standard halo model with parameters fixed to the recommendations of [36]. The main change from previous XENON publications is an updated local standard of rest velocity of 238 km/s [38,39]. The NR model fit to calibration data is used to construct a model for the signal in cS1 and cS2.
After unblinding, the ROI contains 152 events, 16 of which were in the blinded WIMP region. The data is shown in Figure 3, and the best-fit expectation values are in Table I. The binned GOF test indicates no large-scale mismodelling (p = 0.63). At high cS1, ⪆ 50 PE, we observe more events which are consistent with ER events than our model or calibration data predicts, in particular between cS1s of 50 PE and 75 PE. Of the 16 formerly blinded events, 13 are found in the upper right half of the horizontal event distribution, with no correlation with the transverse wires observed (see Figure A.3). The $^{220}$Rn, $^{83\mathrm{m}}$Kr and $^{37}$Ar calibration datasets do not exhibit any asymmetry, nor is any seen in the acceptances evaluated in the (X, Y) plane for any of the applied cuts. The WIMP discovery p-value indicates no significant excess (p ≥ 0.20, with the minimum for masses above 100 GeV/$c^2$), and the resulting limits on spin-independent interactions are shown in Figure 4, with spin-dependent limits included in Figures A.1a and A.1b of the supplementary material. To constrain large downwards fluctuations, the limit is subjected to a power-constraint following [40]. We choose a very conservative power threshold of 50 %, higher than that advocated in [36], as that paper mistakenly defined the power-constraint in terms of discovery power when settling on a threshold of 15 %. See the supplementary materials for further discussion. For spin-independent interactions the lowest upper limit is $2.58\times10^{-47}$ cm$^2$ at 28 GeV/$c^2$ and 90 % confidence level (CL). At masses above 100 GeV/$c^2$, the limit is $6.08\times10^{-47}$ cm$^2$ × ($M_\mathrm{DM}$/(100 GeV/$c^2$)).

FIG. 4. Upper limit on spin-independent WIMP-nucleon cross section at 90 % confidence level (full black line) as a function of the WIMP mass. A power-constraint is applied to the limit to restrict it at or above the median unconstrained upper limit. The dashed lines show the upper limit without a power-constraint applied. The 1σ (green) and 2σ (yellow) sensitivity bands are shown as shaded regions, with lighter colors indicating the range of possible downwards fluctuations. The result from XENON1T [3] is shown in blue with the same power-constraint applied. At masses above 100 GeV/$c^2$, the limit scales with mass as indicated with the extrapolation formula.
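The power-constraint at a 50 % threshold and the high-mass scaling quoted above can be sketched in a few lines. The function names are hypothetical, and the example limit values in the usage lines are made up for illustration; only the constant $6.08\times10^{-47}$ cm$^2$ and the 100 GeV/$c^2$ reference mass come from the text.

```python
def power_constrained_limit(observed_limit_cm2: float, median_expected_limit_cm2: float) -> float:
    """Apply a power-constraint with a 50 % threshold.

    The reported limit is never allowed to fall below the median
    unconstrained upper limit.
    """
    return max(observed_limit_cm2, median_expected_limit_cm2)


def high_mass_limit_cm2(wimp_mass_gev: float) -> float:
    """Linear extrapolation of the limit, quoted for masses above 100 GeV/c^2."""
    if wimp_mass_gev < 100.0:
        raise ValueError("extrapolation is only quoted above 100 GeV/c^2")
    return 6.08e-47 * (wimp_mass_gev / 100.0)


# Made-up numbers: an observed limit below the median is raised to the median.
constrained = power_constrained_limit(1.0e-47, 2.0e-47)

# Extrapolated limit at 200 GeV/c^2, twice the value at 100 GeV/c^2.
limit_200 = high_mass_limit_cm2(200.0)
```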
In conclusion, a blind analysis of 95.1 d of science data with a total exposure of (1.09 ± 0.03) t·y has been performed. The best fit to the data is compatible with the background-only hypothesis. The experiment achieved an ER background level of (15.8 ± 1.3) events/(t·y·keV), 5 times lower than XENON1T, with comparable detector resolutions and energy threshold. This results in a sensitivity improvement with respect to XENON1T by a factor of 1.7 at a WIMP mass of 100 GeV/$c^2$. Currently, XENONnT continues to take data, with a further reduced $^{222}$Rn ER background, using the radon distillation system with combined gaseous and liquid xenon flow. Subsequent data-taking is planned with the NV operating as designed, with Gd-sulphate-octahydrate loaded into the water [41,42] to increase the neutron tagging efficiency to 87 % with a lower overall livetime reduction [8].

Spatial event distribution
The distribution of events in the horizontal plane enters the likelihood only as radius R or in the near-wire/far-wire partition. Therefore, no dedicated GOF tests were defined in this plane prior to unblinding, and results are expected to be robust against a potential mismodelling that does not affect these variables.
Figures A.2 and A.3 show the spatial distribution of all events found in the ROI after unblinding. The lower number of events next to the TPC wall at low Z in Figure A.2 is due to a charge-insensitive region caused by an inhomogeneity of the drift field. Field inhomogeneities also cause a variation in drift speed as a function of R, leading to a small bias in the reconstructed Z position, as can be seen from the slight bending of the near-cathode events in Figure A.2. The bias is accounted for in the estimation of the fiducial volume and mass. The reconstruction bias is about 2 cm at the bottom outer edge of the fiducial volume and changes to roughly −0.4 cm at a radius of 40 cm.
Figure A.3 shows clusters with a higher density of events near the transverse wires, as well as localised overdensities in a periodic pattern close to the TPC wall outside of the fiducial volume. The former is caused by the higher rate of AC events and overall less strict requirements on the S2 width near the transverse wires, as explained in the main body of the paper. The latter is an artefact due to the structure of the TPC PTFE cylinder, which is composed of pillars and panels. The overdensities are localised near the TPC pillars.
In total 13 of the 16 formerly blinded events are found in the upper right half of the horizontal event distribution. The events are found neither near the transverse wires nor near the position of the single- to few-electron S2 bursts, which were localized within a 10 cm radius around (5 cm, −20 cm) in X and Y. Additional tests were carried out after unblinding to check whether any systematic bias was overlooked during the development of corrections or data quality selections. The $^{220}$Rn ER band calibration data taken before SR0 was tested for a similar asymmetry for all calibration data points found in the lowest 5 % of the ER band in cS2. Only a weak correlation with the transverse wires was found, as expected due to the 10 % higher relative selection efficiency, but no asymmetry in the horizontal event distribution. The $^{37}$Ar calibration data taken after SR0 and the $^{83\mathrm{m}}$Kr data taken every second week during SR0 were also tested for non-uniformity. In both cases no indication is found of an asymmetric bias in the event reconstruction. The impact of each data quality cut on the unblinded data was tested in several parameter spaces. The selections showed only expected correlations, e.g. the impact of the radius of the FV on the surface background, but none of the selections showed any behavior which could explain the observed asymmetry.

Comparison of upper limits
Figure A.4 compares this work to other recent results, both with and without a power-constraint applied consistent with the original PCL recommendation. In order to not place limits on models for which an experiment has low sensitivity, a set of recommendations adopted by LXe dark matter experiments [36] recommends using a power-constraint [40]. The set of recommendations erroneously defines sensitivity in terms of discovery power, while it should be in terms of "rejection power": the probability for a certain signal to be excluded given the no-signal hypothesis. This rejection power corresponds to the quantile of upper limits for that signal, as used to produce the conventional sensitivity bands. The power-constrained limit is defined by setting a signal size threshold corresponding to a certain rejection power, and only placing upper limits at or above this threshold. This aims both to prevent arbitrarily low limits being set by a systematic fluctuation and to moderate the effect of mis-modelling, in particular overestimated backgrounds, on the upper limit. The choice of threshold rejection power is a fiducial one, and previous publications and the community recommendations (using discovery rather than rejection power) set it to correspond to the −1σ quantile of the limit distribution. Given the need to amend the recommendations, we choose a very conservative rejection power threshold of 0.5 for this work, corresponding to the median unconstrained limit.

FIG. A.4. Upper limits on spin-independent WIMP-nucleon cross section at 90 % confidence level for this work (black lines), LZ [5] (purple lines, preprint), PandaX-4T [4] (red lines) and XENON1T [3] (blue lines). For PandaX and LZ, dashed lines represent their published result; for XENON results the dashed lines represent limits without a power-constraint applied. Full lines for each experiment represent a limit that is power-constrained to always lie at or above the median unconstrained limit.

FIG. 1. NR and ER calibration data from $^{241}$AmBe (orange), $^{220}$Rn (blue) and $^{37}$Ar (black). The median and the $\pm2\sigma$ contours of the NR and ER model are shown in blue and red respectively. The gray dash-dotted contour lines show the reconstructed NR energy (keV$_\mathrm{NR}$). Only unshaded events up to a cS1 of 100 PE are considered in the response model fits.

FIG. 2. Detection and selection efficiency for NR events in this search as a function of the NR energy. The total efficiency in the WIMP search region (black) is dominated by the detection efficiency (green) at low energies and by the event selections (blue) at higher energies until the edge of the ROI. Normalized recoil spectra for WIMPs with masses of 10 GeV/$c^2$, 50 GeV/$c^2$, and 200 GeV/$c^2$ are shown with orange dashed lines for reference.

FIG. 3. DM search data in the cS1-cS2 space. Each event is represented with a pie chart, showing the fraction of the best-fit model, including the expected number of 200 GeV/$c^2$ WIMPs (orange), evaluated at the position of the event. The size of the pie charts is proportional to the signal model at that position. Background probability density distributions are shown as $1\sigma$ (dark) and $2\sigma$ (light) regions as indicated in the legend for ER (blue), AC (purple) and surface (green, "wall"). The neutron background (yellow in pies) has a similar distribution to the WIMP (orange filled area showing the $2\sigma$ region). The orange dashed contour contains a signal-like region which is constructed to contain 50 % of a 200 GeV/$c^2$ WIMP signal with the highest possible signal-to-noise ratio.

FIG. A.2. Spatial distribution of the search data in the 4.18 t fiducial volume (blue line). Each event is represented with a pie chart, showing the fraction of the best-fit PDF, including a 200 GeV/$c^2$ WIMP, evaluated at the position of the event, color-coded as in Figure 3. Events reconstructed outside of the fiducial volume are colored in gray. Black dashed lines depict the boundaries of the sensitive volume given by the cathode and gate positions. The TPC radius is indicated by a vertical black line.
FIG. A.4. Upper limits on spin-independent WIMP-nucleon cross section at 90 % confidence level for this work (black lines), LZ [5] (purple lines, preprint), PandaX-4T [4] (red lines) and XENON1T [3] (blue lines). For PandaX and LZ, dashed lines represent their published result; for XENON results, the dashed lines represent limits without a power constraint applied. Full lines for each experiment represent a limit that is power-constrained to always lie at or above the median unconstrained limit.
We gratefully acknowledge support from the National Science Foundation, Swiss National Science Foundation, German Ministry for Education and Research, Max Planck Gesellschaft, Deutsche Forschungsgemeinschaft, Helmholtz Association, Dutch Research Council (NWO), Weizmann Institute of Science, Israeli Science Foundation, Binational Science Foundation, Fundacao para a Ciencia e a Tecnologia, Région des Pays de la Loire, Knut and Alice Wallenberg Foundation, Kavli Foundation, JSPS Kakenhi and JST FOREST Program in Japan, Tsinghua University Initiative Scientific Research Program and Istituto Nazionale di Fisica Nucleare. This project has received funding/support from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 860881-HIDDeN. Data processing is performed using infrastructures from the Open Science Grid, the European Grid Initiative and the Dutch national e-infrastructure with the support of SURF Cooperative. We are grateful to Laboratori Nazionali del Gran Sasso for hosting and supporting the XENON project.

* l.althueser@uni-muenster.de