Improved Short-Baseline Neutrino Oscillation Search and Energy Spectrum Measurement with the PROSPECT Experiment at HFIR

We present a detailed report on sterile neutrino oscillation and U-235 antineutrino energy spectrum measurement results from the PROSPECT experiment at the highly enriched High Flux Isotope Reactor (HFIR) at Oak Ridge National Laboratory. In 96 calendar days of data taken at an average baseline distance of 7.9 m from the center of the 85 MW HFIR core, the PROSPECT detector has observed more than 50,000 interactions of antineutrinos produced in beta decays of U-235 fission products. New limits on the oscillation of antineutrinos to light sterile neutrinos have been set by comparing the detected energy spectra of ten reactor-detector baselines between 6.7 and 9.2 meters. Measured differences in energy spectra between baselines show no statistically significant indication of antineutrinos to sterile neutrino oscillation and disfavor the Reactor Antineutrino Anomaly best-fit point at the 2.5$\sigma$ confidence level. The reported U-235 antineutrino energy spectrum measurement shows excellent agreement with energy spectrum models generated via conversion of the measured U-235 beta spectrum, with a $\chi^2$/DOF of 31/31. PROSPECT is able to disfavor at 2.4$\sigma$ confidence level the hypothesis that U-235 antineutrinos are solely responsible for spectrum discrepancies between model and data obtained at commercial reactor cores. A data-model deviation in PROSPECT similar to that observed by commercial core experiments is preferred with respect to no observed deviation, at a 2.2$\sigma$ confidence level.

I. INTRODUCTION
Neutrinos arguably remain the least well-understood fundamental particles in the Standard Model: their absolute masses are only constrained within a few orders of magnitude, properties of their right-handed versions and differences between matter and antimatter versions are undetermined, and many of their flavor mixing parameters remain uncertain at the 10% level or greater [1]. Further improvement in understanding of these properties requires new high-precision measurements using high-intensity neutrino sources. Nuclear reactors are the highest intensity artificial neutrino sources on Earth, producing MeV-scale energy electron-type antineutrinos ($\bar{\nu}_e$) predominantly via β-particle decay of neutron-rich fission daughter products of $^{235}$U, $^{238}$U, $^{239}$Pu, and $^{241}$Pu [2].

* prospect.collaboration@gmail.com
These prodigious reactor $\bar{\nu}_e$ emissions have been used in past experiments to substantially expand our understanding of neutrino properties. Early reactor $\bar{\nu}_e$ experiments provided the first direct evidence of the particle's existence [3] and measured its rate of charged and neutral current interaction [4][5][6]. More recently, the KamLAND experiment used fluxes from many reactors at 180 km average distance to measure distortion of the predicted reactor $\bar{\nu}_e$ energy spectrum due to $\bar{\nu}_e$ disappearance [7,8], confirming large-amplitude lepton flavor mixing as the solution to the long-standing 'solar neutrino problem' [9]. Subsequently, three reactor neutrino experiments at km-scale baselines (Daya Bay, Double Chooz, and RENO) also measured substantial $\bar{\nu}_e$ disappearance and energy spectrum distortion [10][11][12][13][14]. These results produced the first confirmation of a non-zero value for the neutrino mixing parameter $\theta_{13}$, opening the door to future accelerator-based measurements of leptonic CP violation and determination of the ordering of the three Standard Model neutrino masses [15]. Reactor neutrino experiments provide leading or competitive precision in the determination of three of the six parameters describing Standard Model neutrino mixing: $\theta_{13}$, $\Delta m^2_{21}$, and $|\Delta m^2_{31}|$ [1]. The observed discrepancies between measured and modelled reactor $\bar{\nu}_e$ fluxes [16] have motivated new experiments and analyses that focus on probing the active-sterile mixing parameters $\Delta m^2_{41}$ and $\theta_{14}$ [17]. As these measurements have improved in precision, they have also enabled a more detailed understanding of reactors as a source of $\bar{\nu}_e$.
The production of $\bar{\nu}_e$ in a nuclear reactor core per second at time $t$, given in terms of neutrinos per unit energy, can be described as follows:

$$\frac{d^2\phi(E_\nu, t)}{dE_\nu\,dt} = \frac{W_{th}(t)}{\bar{E}(t)} \sum_i f_i(t)\, s_i(E_\nu), \qquad (1)$$

where $W_{th}$ is the core's thermal power output, $f_i$ and $s_i(E_\nu)$ are respectively the fission fraction and antineutrino flux from isotope $i$, and $\bar{E}(t) = \sum_i f_i(t)\, e_i$ is the average energy release per fission, with individual fission isotope energy releases $e_i$. Some of these inputs are computed directly from measurements of the core or its fuel: $W_{th}$ is derived from real-time in-reactor measurements [18], while $f_i$ are determined by reactor simulations benchmarked to measurements of spent fuel content [19,20]. Other inputs are based on theoretical calculations. The energy released per fission $\bar{E}$ is primarily dependent on mass differences between fission isotopes and their products, and can be calculated with relatively little uncertainty based on existing nuclear databases [21]. On the other hand, calculations of $s_i(E_\nu)$ suffer from a variety of systematic uncertainties. The favored method performs conversion of measured aggregate β-particle spectra from each fission isotope [22][23][24] into $\bar{\nu}_e$ spectra using energy conservation and various spectrum shape assumptions and corrections [25,26]. These spectrum inputs have sizable systematic uncertainties; nonetheless, this method serves as the standard method for calculating $s_i(E_\nu)$. An alternate summation method calculates $s_i(E_\nu)$ by adding the $\bar{\nu}_e$ produced by each fission daughter using their evaluated nuclear data (e.g. fission yields and β decay properties) [16,27]; here, uncertainty is contributed by the incomplete and sometimes inaccurate nature of the inputs [27][28][29][30][31]. To gain further insight into the potential deficiencies of these methods, recent reactor $\bar{\nu}_e$ measurements have been used to directly determine $\bar{\nu}_e$ production by reactors and individual fission isotopes.
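As a rough numerical illustration of the thermal-power and energy-per-fission prefactor described above, the following sketch converts a thermal power into a fission rate and scales a per-fission spectrum by it. The 202 MeV energy release assumed for $^{235}$U, the flat toy spectrum, and all function names are illustrative assumptions rather than inputs used in this analysis.

```python
# Minimal sketch: fission rate and emitted spectrum from thermal power.
# All numerical inputs below other than the 85 MW power are assumptions.

MEV_TO_JOULE = 1.602176634e-13  # exact, from the SI definition of the eV


def mean_energy_per_fission(fractions, energies_mev):
    """Average energy per fission: sum over isotopes of f_i * e_i (MeV)."""
    return sum(fractions[iso] * energies_mev[iso] for iso in fractions)


def fission_rate(w_th_watts, fractions, energies_mev):
    """Fissions per second: thermal power divided by mean energy per fission."""
    e_bar_joule = mean_energy_per_fission(fractions, energies_mev) * MEV_TO_JOULE
    return w_th_watts / e_bar_joule


def emitted_spectrum(w_th_watts, fractions, energies_mev, spectra, e_nu_mev):
    """Antineutrinos per second per MeV emitted at energy e_nu_mev."""
    rate = fission_rate(w_th_watts, fractions, energies_mev)
    weighted = sum(fractions[iso] * spectra[iso](e_nu_mev) for iso in fractions)
    return rate * weighted


# Highly enriched core approximated as pure 235U (assumption).
fractions = {"U235": 1.0}
energies = {"U235": 202.0}              # MeV per fission, assumed round number
toy_spectra = {"U235": lambda e: 1.0}   # flat placeholder, not a real spectrum

rate = fission_rate(85e6, fractions, energies)
print(f"{rate:.2e} fissions/s")
print(f"{emitted_spectrum(85e6, fractions, energies, toy_spectra, 3.0):.2e} per s per MeV")
```

With these assumed inputs the fission rate comes out at a few times $10^{18}$ per second; the real calculation would substitute evaluated energy releases and measured or modelled $s_i(E_\nu)$.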
Direct $\bar{\nu}_e$ measurements are usually reported in terms of the inverse beta decay (IBD) yield and energy spectrum per fission, $\sigma_i(E_\nu) = \sigma_{IBD}(E_\nu)\, s_i(E_\nu)$, where $\sigma_{IBD}$ is the well-known cross-section for the inverse beta decay interaction used for detection in most reactor $\bar{\nu}_e$ experiments, $\bar{\nu}_e + p \to e^+ + n$ [32]. Using results from cores of differing fuel composition, direct determination of isotopic IBD yields and spectra now approaches or exceeds the precision of the theoretically calculated counterparts. The IBD yield for $^{235}$U has been measured to better than 1.5% precision via historical measurements at highly $^{235}$U enriched reactor cores [33], while measurements from commercial cores during periods of differing fuel content at Daya Bay [34] produce better than 2.5% and 6% precision in $^{235}$U and $^{239}$Pu yields, respectively [35]. Global fits of all IBD yield measurements produce 1.5%, 14%, and 4.5% precision in production by $^{235}$U, $^{238}$U, and $^{239}$Pu, respectively [36]. Daya Bay has also provided measurements of IBD energy spectra from $^{235}$U and $^{239}$Pu fission below 9 MeV, with precision better than 5% and 12% over most of the relevant energy range [35]. The PROSPECT experiment has recently performed the first high-statistics measurement of IBD energy spectra at a highly $^{235}$U enriched reactor, with precision better than 10% between 1 and 6 MeV [37]. These measurements confirm differing rates and energies of $\bar{\nu}_e$ production for the different fission isotopes, and provide improved justifications for and demonstration of capabilities in monitoring the status, power, and fuel content of nuclear reactors using their $\bar{\nu}_e$ emissions [38][39][40][41][42].
Comparison of theoretical conversion predictions and direct $\bar{\nu}_e$ flux and spectrum measurements yields numerous inconsistencies. An overall deficit in measured IBD yields with respect to predictions of approximately 6% is observed [43,44]; this deficit is referred to throughout the rest of this paper as the 'Reactor Antineutrino Anomaly.' In addition, the size of this discrepancy is partially dependent on the fuel content of the reactors producing the observed $\bar{\nu}_e$ [34]. Measured IBD energy spectra from numerous experiments are found to be in clear disagreement with conversion-based predictions [13,45,46,47]. Recently improved summation models predict a smaller IBD yield excess and the correct dependence of IBD yields on fuel content, but also cannot reproduce the measured IBD spectrum per fission [48]. These discrepancies are indicative of a lack of understanding of neutrino production in nuclear reactor cores and/or of fundamental neutrino properties.
As observed in previous experiments, reactor $\bar{\nu}_e$ undergo flavor transformations, or oscillation, as they travel from source to detection point, a quantum mechanical phenomenon resulting from the interacting flavor states being a superposition of underlying mass eigenstates [49][50][51]. When only one neutrino mass difference is involved, this oscillation probability $P_{dis}$ can be described as

$$P_{dis} = \sin^2 2\theta \, \sin^2\!\left(1.27\, \Delta m^2\,[\mathrm{eV}^2]\, \frac{L\,[\mathrm{m}]}{E_\nu\,[\mathrm{MeV}]}\right), \qquad (2)$$

where $\Delta m^2$ is the squared mass difference, $\theta$ is the mixing angle between the mass and flavor states, and $E_\nu$ and $L$ are the energy and travel distance (baseline) of the neutrino, respectively. For a reactor $\bar{\nu}_e$ experiment detecting neutrinos via IBD, this transformation manifests itself as a deficit in detection rates that varies with baseline and neutrino energy. According to Eq. 2, a mass splitting on the order of 1 eV$^2$ or larger will manifest as an observed average deficit in energy-integrated reactor $\bar{\nu}_e$ detection rates with respect to predictions for reactor experiments with $L$ of order 10 m and larger [44]. This mass splitting is much larger than those associated with the three Standard Model neutrinos [1], requiring the existence of new neutrino mass states; to maintain consistency with existing collider physics measurements, these new states must be 'sterile,' or incapable of interacting via the weak force [43]. Demonstration of the existence and properties of such a particle would have far-reaching implications in particle physics and cosmology.
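To make the baseline and energy dependence concrete, the sketch below evaluates the standard two-flavor disappearance probability, $\sin^2 2\theta \, \sin^2(1.27\, \Delta m^2 L / E_\nu)$ with $\Delta m^2$ in eV$^2$, $L$ in meters, and $E_\nu$ in MeV. The oscillation parameters in the usage lines are arbitrary illustrative values, not fit results, and the function name is hypothetical.

```python
import math


def disappearance_probability(sin2_2theta, dm2_ev2, baseline_m, energy_mev):
    """Two-flavor disappearance probability.

    Units: dm2 in eV^2, baseline in m, energy in MeV.
    """
    phase = 1.27 * dm2_ev2 * baseline_m / energy_mev
    return sin2_2theta * math.sin(phase) ** 2


# Illustrative parameters only (assumed, not measured values).
sin2_2theta = 0.1
dm2 = 1.0  # eV^2

for baseline in (6.7, 7.9, 9.2):  # approximate near, mean, and far baselines in m
    p = disappearance_probability(sin2_2theta, dm2, baseline, 4.0)
    print(f"L = {baseline:4.1f} m, E = 4.0 MeV: P_dis = {p:.4f}")
```

Because the probability differs between baselines at fixed energy, ratios of spectra measured at different baselines are sensitive to oscillation without relying on an absolute spectrum prediction.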
To unambiguously investigate whether these neutrino propagation effects contribute to the observed discrepancies between reactor $\bar{\nu}_e$ measurements and predictions, experiments must directly probe the baseline and neutrino energy dependence of reactor $\bar{\nu}_e$ signals. Any deviation from $1/r^2$ behaviour as a function of baseline and energy would indicate an oscillation effect and provide the ability to infer the parameters describing such oscillation. This investigation can be performed using $\bar{\nu}_e$ energy spectrum measurements at multiple short (O(10 m)) reactor-detector baselines [52]. Historical and more recent measurements of short-baseline IBD energy spectrum ratios have excluded large regions of sterile oscillation parameter space [47,53,54,55]. Using 33 days of reactor-on data-taking, the PROSPECT experiment has recently placed limits on sterile neutrino oscillations through relative comparison of measured IBD spectra between multiple baselines within a single stationary detector [56].
The observed deviation between measured and predicted IBD energy spectra, as well as the dependence of measured IBD yield deficits on reactor fuel content, cannot be caused by neutrino oscillations, and are likely the result of incorrect modelling of the $\bar{\nu}_e$ flux [57,58]. New, more precise $\bar{\nu}_e$ measurements from reactors of differing fuel content will enable further study of the nature of this mis-modelling [2,59] and facilitate improved understanding of $\bar{\nu}_e$ production by fission daughters. Of particular importance is understanding whether or not $\bar{\nu}_e$ spectrum and flux predictions are similarly incorrect for all four primary fission isotopes. Inconsistencies present in individual isotopes could direct additional scrutiny towards specific fission β spectrum measurements [60], corrections to be applied during the beta-conversion process [61][62][63], or nuclear data for fission daughters [31,64]. Considering isotopic IBD yields, global fits currently favor incorrect prediction of only $^{235}$U and $^{238}$U $\bar{\nu}_e$ fluxes, but inclusion of sterile neutrino oscillation effects clouds this picture [36,65]. For isotopic IBD energy spectra, Daya Bay and RENO both appear to show disagreement between measurement and prediction for $^{235}$U, but they cannot presently determine whether other isotopes exhibit similar discrepancies [35,66]. Meanwhile, the first PROSPECT measurement of the pure $^{235}$U IBD energy spectrum at the highly-enriched HFIR reactor core is consistent with Daya Bay's $^{235}$U result, while also slightly disfavoring $^{235}$U as being the sole isotope exhibiting a spectrum discrepant with its prediction [37]. This paper will present new results from the PROSPECT experiment using an enlarged dataset including 96 (73) days of reactor-on (-off) data.
Improved sterile neutrino oscillation search results and an improved measurement of the reactor $\bar{\nu}_e$ spectrum produced by $^{235}$U fission will be described, in addition to reviewing in detail how the inputs and systematic uncertainties for these two different analyses are determined. Section II will describe the experimental layout and detector design. Sections III and IV will describe the detector calibrations and subsequent event and physics metric reconstruction, respectively. Section V will then describe the selection and Monte Carlo-based modelling of IBD candidates, with background to this selection described in Section VI; selected IBD candidate datasets are then described in Section VII. New oscillation and $^{235}$U $\bar{\nu}_e$ energy spectrum measurements will be presented in Sections VIII and IX, respectively, with concluding remarks given in Section X.

II. EXPERIMENT DESCRIPTION
The PROSPECT experiment is located at the High Flux Isotope Reactor (HFIR) facility at Oak Ridge National Laboratory in Oak Ridge, Tennessee. Among other factors, the high power and compact core of the highly $^{235}$U enriched HFIR research reactor, the availability of unoccupied near-reactor floor space [67], and the status of HFIR as a DOE user facility make it a favorable site for the PROSPECT detector. To probe disappearance caused by the existence of an eV-scale sterile neutrino, the PROSPECT detector must be located in close proximity to the HFIR core and without substantial overburden, necessitating an IBD measurement in an intrinsically high-background environment. Moreover, demonstration of the $L/E_\nu$ nature of this disappearance requires the reconstruction of neutrino interaction locations and energies within the PROSPECT detector target [52]. These requirements served as the primary drivers behind the PROSPECT experimental layout and detector design. A detailed description of these aspects of PROSPECT is provided in Ref. [68]. The aspects of the experimental layout and detector design most relevant to performing a sterile neutrino search and absolute $\bar{\nu}_e$ spectrum measurement with PROSPECT are outlined below.

A. Experimental Layout
The HFIR reactor is located at an elevation of roughly 250 meters above sea level in the HFIR building. The HFIR core contains two concentric cylindrical rings of $^{235}$U fuel plates (93% enrichment) with an outer diameter of 0.435 m and height of 0.508 m. The fuel is surrounded by an aluminum cladding and structural environment, which is in turn surrounded by a thick concentric cylindrical beryllium reflector. Each reactor cycle starts with fresh fuel and lasts for ∼24 days running at a nominal thermal power of 85 MW$_{th}$. The reactor core and pressure vessel are operated inside a large water pool whose surface is nominally eight meters above the midplane of the core. To enable more direct access to the reactor vessel during reactor-off operations, reactor pool water levels are occasionally reduced by 5 m for time periods no longer than a few days. Spent fuel elements are stored in an adjacent water pool O(10 m) from the reactor core. A more detailed description of the HFIR core and facility is given in Ref. [69].

FIG. 1. A top and side view of the PROSPECT experimental layout in the HFIR building. The HFIR reactor and PROSPECT detector package are illustrated, along with the reactor pool containment wall (gray) and the concrete monolith located underneath much of the detector package (dashed line). Horizontal and vertical distances from the reactor core center to various detector locations are shown, as well as coordinate axes used to describe the orientation of the reactor and detector. The floor on which the detector sits is parallel to the x-z plane, and the zenith is in the +y direction.
The PROSPECT detector is located one floor above the HFIR core in a ground-level hallway running along the outer side of the pool walls, as illustrated in Figure 1. The detector package, consisting of inner detector, liquid containment vessels, γ-ray and neutron shielding, and detector movement elements, is partially sited above a thick concrete monolith that significantly attenuates γ-ray and neutron backgrounds associated with neutron scattering experiments located one floor below. The operational cycles of these instruments are a source of non-negligible time-varying γ-ray backgrounds. Lead walls built between the detector package and the reactor pool wall provide targeted shielding of γ radiation emanating from the reactor environment and unused neutron beam tubes.
The PROSPECT inner detector, which serves as the antineutrino target, is illustrated in Figure 1, including size and orientation with respect to the HFIR core. The x, y, and z coordinate system used to describe the orientation of detector and reactor throughout this work is also indicated in Figure 1, with the center of the inner detector serving as the system's origin. The PROSPECT inner detector approximates a rectangular prism with dimensions of 2.045 m long (in x), 1.607 m tall (in y), and 1.176 m wide (in z). The inner detector center is displaced from the center of the reactor core by -1.19 m in z and +5.09 m in y. The distance from the front-most (back-most) midpoint edge of the inner detector to the reactor center is 6.65 m (9.22 m), with a core-detector center-to-center distance of 7.93 m [69]. Distances between the inner detector edges and the detector package exterior range from 40 cm on the detector sides and bottom to 100 cm on top of the detector. Detector distance from the reactor was determined with respect to a reference location on the detector package exterior with ±10 cm estimated precision using HFIR facility mechanical drawings and a measuring tape. Negligible additional baseline uncertainty is contributed from the knowledge of relative inner detector placement with respect to this detector-external reference point.
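The quoted distances can be cross-checked with simple geometry. The sketch below infers the horizontal x displacement implied by the center-to-center distance and the quoted y and z offsets, and evaluates the inverse-square flux ratio between the nearest and farthest quoted detector edges. The x displacement is not stated in the text above; treating the three offsets as orthogonal components of the center-to-center vector is an assumption of this sketch.

```python
import math


def implied_x_offset(center_distance_m, dy_m, dz_m):
    """Solve d^2 = dx^2 + dy^2 + dz^2 for |dx| (assumes orthogonal offsets)."""
    return math.sqrt(center_distance_m**2 - dy_m**2 - dz_m**2)


def inverse_square_ratio(near_m, far_m):
    """Ratio of unoscillated flux at the near baseline to that at the far one."""
    return (far_m / near_m) ** 2


dx = implied_x_offset(7.93, 5.09, -1.19)
ratio = inverse_square_ratio(6.65, 9.22)

print(f"implied |dx| = {dx:.2f} m")
print(f"near/far 1/r^2 flux ratio = {ratio:.2f}")
```

Under these assumptions the detector center sits roughly 6 m from the core horizontally along x, and the unoscillated flux falls by nearly a factor of two across the detector, which is the geometric trend an oscillation search must account for.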

B. Antineutrino Detection Strategy
PROSPECT detects IBD $\bar{\nu}_e$ interactions with hydrogen nuclei in liquid organic scintillator comprising most of the volume of the inner detector. The IBD signal consists of time- and position-correlated energy depositions produced by an IBD positron and the capture of the IBD neutron on $^6$Li doped into the liquid scintillator. The IBD positron produces a signal with low ionization density and extended (tens of centimeters) topology due to the production of positron annihilation γ-rays. The energy deposited by the IBD positron, $E_p$, is related to the energy of the incoming $\bar{\nu}_e$:

$$E_p = E_\nu - 0.78\ \mathrm{MeV} - E_n, \qquad (3)$$

with outgoing IBD neutron kinetic energy, $E_n$, of order 10 keV or less. The IBD neutron preferentially captures on $^6$Li within a few tens of centimeters and roughly 50 µs, producing $^3$H and $^4$He ions with 4.78 MeV of total kinetic energy due to the mass difference between parent and daughter nuclei. These products generate a compact (µm-range) monoenergetic energy deposit with high ionization density. Ionization signals from the liquid scintillator produce visible light, which can be collected and converted to an electronic signal by photomultiplier tubes. The PROSPECT inner detector is designed to capture the unique energy, energy density, spatial, and temporal signatures specific to IBD interaction products.
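The relation between antineutrino energy and the prompt positron deposit follows from energy conservation in the IBD reaction: the positron total energy plus one electron mass from annihilation equals the antineutrino energy minus the neutron-proton mass difference, plus one electron mass, minus the neutron recoil. A minimal sketch of this bookkeeping is given below; the rounded mass values are standard constants supplied here as assumptions, and detector effects such as energy leakage and scintillator nonlinearity are deliberately ignored.

```python
# Rounded particle-physics constants (assumed here, in MeV).
NEUTRON_PROTON_MASS_DIFF_MEV = 1.2933
ELECTRON_MASS_MEV = 0.5110


def prompt_energy_mev(e_nu_mev, e_n_mev=0.0):
    """Ideal prompt energy deposit: positron kinetic energy plus both
    annihilation gamma rays, neglecting all detector response effects."""
    shift = NEUTRON_PROTON_MASS_DIFF_MEV - ELECTRON_MASS_MEV  # about 0.78 MeV
    return e_nu_mev - shift - e_n_mev


for e_nu in (2.0, 4.0, 6.0):
    print(f"E_nu = {e_nu:.1f} MeV -> E_p = {prompt_energy_mev(e_nu):.2f} MeV")
```

At the IBD threshold of about 1.8 MeV the ideal prompt deposit reduces to the 1.022 MeV carried by the two annihilation γ-rays.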

C. PROSPECT Inner Detector
The PROSPECT inner detector is pictured in Figure 2. It contains four tons of pulse shape discriminating (PSD) liquid scintillator loaded to a mass fraction of 0.08% with $^6$Li [70,71]. This scintillator is sub-divided into 154 optically isolated segments of rectangular cross-section, each 14.5 cm × 14.5 cm × 117.6 cm, by an optical grid composed of thin (1.18 mm) specularly reflecting panels held together by white, diffusely reflecting, hollow 3D-printed polylactic acid (PLA) plastic support rods. The grid permits the liquid scintillator to fill the entire volume of all segments. One subset of hollow support rod axes is instrumented to allow the use of removable radioactive calibration sources, while another is equipped with stationary optical sources, as indicated in Figure 2. Uninstrumented axes are filled with square acrylic rods. The inactive materials composing the optical grid and calibration sub-systems comprise 3.5% of the mass of the scintillator contained in the 154 active segments.
The long axis of each segment is oriented along z, running parallel to the front reactor-facing side of the detector, and is bounded on either end by a mineral oil-filled acrylic box containing a 5" photomultiplier tube (PMT), a magnetic shield, a light reflector, and a support structure. 240 housings contain one Hamamatsu R6594 PMT, while 68 housings on the inner detector top and side edges contain one ET 9372KB PMT. Individual PMT housings are mechanically connected to one another and to acrylic supports running along the inner detector z axis outside the outer rows of segments; this rigid support structure ensures mechanical integrity of the inner detector and maintains consistent target and segment dimensions. To verify dimensional uniformity, segment dimensions were measured to mm-scale precision with metrological surveys during detector dry assembly. The inner detector and support structure are contained within a rectangular acrylic vessel under continuously flushed nitrogen cover gas inside a water-filled aluminum tank providing secondary containment of detector liquids. Ultrasonic sensors above two corners of the detector target monitor the scintillator liquid level above the top row of detector segments to sub-millimeter precision. Additional sensors monitor temperatures in the scintillator and surrounding detector regions, as well as humidity and pressure in the cover gas region.

D. Readout, Triggering, Data Acquisition, and Storage
Scintillation light produced in a detector segment via interaction of ionizing particles is efficiently transported by the reflecting walls to the corresponding PMTs, whose analog responses are individually processed by CAEN V1725 250 MHz 14-bit waveform digitizer (WFD) modules [72]. The shape of digitized waveforms is primarily determined by the timing characteristics of the PROSPECT scintillator, photon transit time dispersion in the segment, photoelectron transit times in the PMTs, and impedance mismatch in the connections and cabling en route to the detector-external digitizing electronics. Scintillator light output is from a combination of processes characterized by different exponential decay times. Ionization density affects the relative fractions of these processes, causing differences in pulse shape between light and heavy ionizing particles. Lower-ionization-density events such as electron tracks are dominated by a 16 ns scintillator decay time, while a 38 ns component increasingly affects higher-ionization-density (e.g., proton recoil) events, with a small contribution from a 225 ns tail. Averaged PROSPECT waveforms representative of electron and proton interactions are illustrated in Figure 3.

FIG. 3. Averaged waveforms typical for low-energy-density (electron) tracks (lower, blue), and high-energy-density (proton recoil) tracks (upper, red). Electron-like pulse shapes are independent of energy, while recoil pulses have varying proportions of the longer-time tail component (asymptotically approaching the electron track shape in the high energy, minimum-ionizing-density limit). The inset panel shows the same waveforms on a linear scale.
The PROSPECT detector implements a trigger configuration and zero-suppression scheme that enables unbiased readout of all energy depositions above ∼200 keV in energy despite operating in a challenging background environment. A data acquisition (DAQ) trigger starts with a pair of PMTs on the same segment producing a signal 50 ADC channels (∼5 photoelectrons) above baseline within 16 ns of each other. The pairwise logical AND for every segment is combined via a logical OR operation at the WFD board level (up to 8 pairs). A resulting logic output signal from each of 21 WFD boards is further combined via logical OR by a Phillips Scientific 757D Fan-in/Fan-out NIM module, modified by the manufacturer for 32-in-to-32-out operation. This logical OR of all segment pairs defines the DAQ global acquisition trigger, which is fanned out to the acquisition trigger input of each WFD. The global trigger rate is $\sim 5 \times 10^3$/s when the HFIR reactor is not operating ('reactor-off') and $\sim 2 \times 10^4$/s when it is on ('reactor-on'), with the latter dominated by γ-ray backgrounds related to the reactor's operation.
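The trigger decision described above is a pairwise AND within each segment followed by an OR across all segments. A minimal software sketch of that logic is shown below; the real decision is made in digitizer firmware and NIM hardware, and the data layout and function names here are hypothetical, chosen only for illustration.

```python
TRIGGER_THRESHOLD_ADC = 50   # per-PMT amplitude above baseline
COINCIDENCE_WINDOW_NS = 16   # maximum time difference between the PMT pair


def segment_fires(t0_ns, amp0_adc, t1_ns, amp1_adc):
    """Pairwise AND: both PMTs of a segment exceed threshold within the window."""
    both_above = (amp0_adc > TRIGGER_THRESHOLD_ADC
                  and amp1_adc > TRIGGER_THRESHOLD_ADC)
    in_time = abs(t0_ns - t1_ns) <= COINCIDENCE_WINDOW_NS
    return both_above and in_time


def global_trigger(segment_hits):
    """Logical OR over all segments; each hit is (t0, amp0, t1, amp1)."""
    return any(segment_fires(*hit) for hit in segment_hits)


# Hypothetical event: one segment with a single-ended pulse, one with a
# valid pair coincidence.
event = [
    (100.0, 80, 100.0, 10),   # second PMT below threshold: no pair coincidence
    (104.0, 120, 112.0, 95),  # both above threshold, 8 ns apart: fires
]
print(global_trigger(event))
```

Requiring both ends of a segment suppresses triggers from single-PMT noise, while the OR across segments keeps the acquisition sensitive to deposits anywhere in the detector.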
On receipt of the global trigger signal, the WFD records a 592 ns (148-sample) data sequence for each channel, including ∼200 ns preceding the trigger signal. New events arriving within 592 ns of the initial trigger do not re-trigger the DAQ, resulting in a deadtime after each trigger during which light pulses may be recorded in the waveform but are truncated at the end of the sampling sequence, an O(1%) deadtime effect, depending on total trigger rate.
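The order of magnitude of this deadtime follows directly from the quoted numbers: each recorded trigger blocks re-triggering for one 592 ns acquisition window, so the blocked fraction of live time is approximately the recorded trigger rate multiplied by the window length. The sketch below evaluates this for the approximate reactor-off and reactor-on trigger rates; it is a simple non-extendable deadtime estimate, not the experiment's actual livetime calculation.

```python
SAMPLE_PERIOD_NS = 4.0        # 250 MHz digitizer
SAMPLES_PER_WINDOW = 148
WINDOW_S = SAMPLE_PERIOD_NS * SAMPLES_PER_WINDOW * 1e-9  # 592 ns


def deadtime_fraction(recorded_trigger_rate_hz):
    """Fraction of time blocked by acquisition windows, assuming each
    recorded trigger blocks exactly one window (non-extendable deadtime)."""
    return recorded_trigger_rate_hz * WINDOW_S


for label, rate_hz in (("reactor-off", 5e3), ("reactor-on", 2e4)):
    print(f"{label}: {100 * deadtime_fraction(rate_hz):.2f}% deadtime")
```

This gives roughly 0.3% with the reactor off and roughly 1.2% with it on.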
Events may produce light in one or more segments, with a typical multiplicity of <5% of all segments. To greatly reduce the data volume transferred, the WFDs' firmware applies "Zero Length Encoding" (ZLE) to suppress empty signals. A ZLE threshold of 20 ADC channels above baseline (2 photoelectrons) is applied to the waveform data, and sections more than 24 samples before or 20 samples after the nearest above-threshold value are suppressed (possibly including all 148 samples). This reduced stored data volume is illustrated in Figure 4.

FIG. 4. Example DAQ trigger readout (y axis offset for clarity). Pictured waveforms are inverted and baseline-subtracted with respect to the raw DAQ output; see Section III A for details on low-level waveform processing. Blue (red) waveform datapoints correspond to PMT channels at high (low) z. Pink and green highlighted waveform regions are those above the 50 ADC and 20 ADC trigger and ZLE thresholds, respectively. The global trigger is generated by the first segment pair coincidence above trigger threshold (top two waveforms). All of the pictured PMT channels would have portions of their waveforms read out.
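The zero-suppression rule described above can be sketched in software as a keep-mask over the samples of one waveform: a sample survives if it lies within 24 samples before or 20 samples after any above-threshold sample. This is an illustrative reimplementation operating on an already inverted, positive-going waveform with a known baseline; the actual firmware works on the raw digitizer output, and the function name and interface are assumptions.

```python
def zle_keep_mask(samples, baseline, threshold_adc=20, pre=24, post=20):
    """Return a list of booleans marking samples retained after suppression.

    Assumes an inverted (positive-going) waveform and a known baseline.
    """
    n = len(samples)
    keep = [False] * n
    for i, value in enumerate(samples):
        if value - baseline > threshold_adc:
            # Retain a window around every above-threshold sample.
            for j in range(max(0, i - pre), min(n, i + post + 1)):
                keep[j] = True
    return keep


# Hypothetical 148-sample waveform with a single above-threshold sample.
waveform = [0] * 148
waveform[60] = 100
mask = zle_keep_mask(waveform, baseline=0)
print(sum(mask), "of", len(mask), "samples kept")
```

A channel with no above-threshold sample yields an all-false mask, corresponding to the case where all 148 samples are suppressed.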
The post-ZLE waveforms are transferred by CAEN CONET2 optical fiber from the WFDs to two four-link CAEN A3818 optical fiber cards in separate DAQ computers (with two or three WFDs daisy-chained for readout on each fiber link). Each link is handled by an independent readout process for maximum parallel throughput. Without ZLE, the 85 MB/s bandwidth of each fiber link would be the main data-rate-limiting bottleneck. With ZLE, data rates are 10% of the fiber capacity, with the limiting factor being a fast readout cycle before the maximum event buffer size of the WFDs (1023 triggers) overflows. Testing indicates that the DAQ falls behind on readout (resulting in data loss) at trigger rates of $9 \times 10^4$/s or higher. The DAQ computers send the data streams to a 20 TB RAID-6 disk array over the local 10 Gb/s ethernet network, and to a "real-time" analysis process generating monitoring plots for a detector status webpage. Data are recorded in the binary format produced by the boards, slightly modified with extra header blocks and removal of fully-ZLE-suppressed waveform headers, with gzip compression. The data are transferred from the RAID array for analysis and archiving on Oak Ridge and Livermore National Laboratory computer clusters, with the RAID array providing 2 weeks of storage buffer capacity in the event of network outages to the remote facilities.

E. Physics Datasets
The analysis described in this paper uses data taken with the PROSPECT detector from March 5, 2018 to October 6, 2018. During this time period, which spanned five HFIR fuel cycles, the PROSPECT detector was in physics data-taking mode for 183 days; the HFIR reactor was on (off) for 105 (78) of these days. Calibration data-taking accounted for an additional eight calendar days of data-taking. Physics data for eight (five) of these reactor-on (-off) calendar days were not used for physics analysis due to PMT HV or other data quality issues not identified until after data-taking. In total, 95.6 (73.1) calendar-days of reactor-on (-off) data passing all quality checks were used for the physics analyses described in this paper.
To provide improved background rejection, a 106-segment inner fiducial volume is defined. IBD events reconstructed in all outer segments and two inner segments in the bottom back corner of the detector (high x and low y) are not included in the IBD candidate dataset. PMTs in 64 of 154 detector segments ultimately exhibited current instabilities during physics data-taking. These segments, comprising 42% of the total active detector volume, were not used in the physics analyses described in this paper. Of these, 36 were among the fiducial segments considered in the IBD selection criteria (described in Section V). This corresponds to 34% of the fiducial volume. The impact on the data analysis is described in detail in Section V C.
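The quoted volume fractions follow from the segment counts if every segment is taken to have the same active volume, which is an assumption of this short check rather than a statement from the text:

```python
TOTAL_SEGMENTS = 154
FIDUCIAL_SEGMENTS = 106
EXCLUDED_SEGMENTS = 64            # segments with unstable PMT currents
EXCLUDED_FIDUCIAL_SEGMENTS = 36   # subset of the above inside the fiducial volume

# Equal-volume segments assumed, so volume fractions equal count fractions.
excluded_fraction = EXCLUDED_SEGMENTS / TOTAL_SEGMENTS
excluded_fiducial_fraction = EXCLUDED_FIDUCIAL_SEGMENTS / FIDUCIAL_SEGMENTS
active_fiducial_segments = FIDUCIAL_SEGMENTS - EXCLUDED_FIDUCIAL_SEGMENTS

print(f"excluded: {100 * excluded_fraction:.0f}% of all segments")
print(f"excluded: {100 * excluded_fiducial_fraction:.0f}% of fiducial segments")
print(f"remaining fiducial segments: {active_fiducial_segments}")
```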

III. LOW-LEVEL DATA PROCESSING, CALIBRATION, AND EVENT RECONSTRUCTION
PROSPECT data is analyzed to select antineutrinos interacting via inverse beta decay in and around the inner detector volume. This selection involves analysis criteria on the reconstructed timing, position, energy, and pulse shapes of signals collected from the active segments of the detector. For the PROSPECT sterile neutrino oscillation analysis, establishing consistent energy scales between segments is essential. For the $^{235}$U spectrum measurement, accurate determination of the relationship between true antineutrino energy and reconstructed energy is important. The following section describes how raw digitized waveforms are processed to reconstruct and calibrate each of these quantities for PROSPECT physics analyses.

A. Pulse Definition and Low-Level Metrics
Raw waveform data is initially stored to disk by the DAQ in the proprietary binary format produced by the CAEN V1725 digitizer cards, slightly modified to include additional run header information and strip out data blocks containing no waveforms. This process produces eight separate files, written to disk in parallel, each containing the output of two or three V1725 cards sharing a common optical fiber link to the DAQ readout. The parallel readout scheme facilitates uninterrupted data throughput, necessary to prevent data loss from buffer overflows of the on-board memory of each digitizer card. The separate raw readout files are later collated in time sequence into a single ROOT file [73], with hardware board/channel numbers mapped to a channel numbering scheme by segment number. The waveform file is then analyzed to locate and characterize pulses. Each waveform is represented by a sequence of 14-bit integer ADC samples for contiguous 4 ns digitization intervals. The negative-polarity waveform is inverted so higher sample values indicate larger charge signals. One global maximum sample and any number of local maxima (separated by at least 20 samples from any higher point) are identified as initial pulse candidates. The waveform baseline is calculated from the average of the median 8 samples in the range of 5 to 30 samples before the global maximum. This baseline value is subtracted from all samples in the waveform for subsequent analysis. The global maximum, and any local maxima at least 30 ADC units above the baseline, are considered as pulses for further analysis. One such pulse is shown in Figure 5 to illustrate the quantities of interest for each selected pulse.
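The baseline and pulse-candidate steps described above can be illustrated with a minimal Python sketch. It is not the PROSPECT analysis code: the function names `baseline_subtract` and `find_pulses`, the NumPy array layout, the handling of ties between equal samples, and the reading of "median 8 samples" as the central 8 values of the sorted pre-peak window are all assumptions made for illustration.

```python
import numpy as np

def baseline_subtract(adc, pre_lo=5, pre_hi=30, n_mid=8):
    """Invert a negative-polarity waveform and subtract a median-window baseline."""
    w = -np.asarray(adc, dtype=float)        # invert: larger value = more charge
    i_max = int(np.argmax(w))                # global maximum sample
    lo, hi = i_max - pre_hi, i_max - pre_lo  # window 5..30 samples before the peak
    if lo < 0:
        raise ValueError("not enough pre-peak samples for the baseline window")
    window = np.sort(w[lo:hi + 1])
    start = (len(window) - n_mid) // 2       # assumption: central 8 sorted values
    baseline = window[start:start + n_mid].mean()
    return w - baseline, i_max, baseline

def find_pulses(w, i_max, min_sep=20, threshold=30.0):
    """Global maximum plus local maxima with no higher sample within min_sep."""
    peaks = {i_max}
    for i in range(len(w)):
        left = w[max(0, i - min_sep):i]
        right = w[i + 1:min(len(w), i + min_sep + 1)]
        isolated = (left.size == 0 or w[i] > left.max()) and \
                   (right.size == 0 or w[i] >= right.max())
        if isolated and w[i] >= threshold:   # 30 ADC units above baseline
            peaks.add(i)
    return sorted(peaks)
```

Taking the central values of the sorted window, rather than a plain mean, makes the baseline estimate insensitive to a small number of outlying samples such as the tail of an earlier pulse.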
Each pulse's area S is calculated from the sum of samples in the range from 3 samples before to 25 samples after the maximum location. The pulse's arrival time t is determined by scanning backwards from the pulse's maximum sample location to the first level-crossing at 50% of the maximum. The arrival time is linearly interpolated to the 50% point between the two samples bracketing the level crossing.
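As a hedged illustration of the area and arrival-time definitions above, the sketch below operates on a baseline-subtracted waveform `w` with the peak at sample index `i_peak`. The function names and the choice to raise an error when no crossing is found are assumptions, not details taken from the PROSPECT software.

```python
import numpy as np

def pulse_area(w, i_peak, pre=3, post=25):
    """Sum of baseline-subtracted samples from 3 before to 25 after the peak."""
    lo = max(0, i_peak - pre)
    return float(np.sum(w[lo:i_peak + post + 1]))

def arrival_time(w, i_peak, fraction=0.5, dt_ns=4.0):
    """Leading-edge time (ns) at 50% of the peak height, linearly interpolated."""
    level = fraction * w[i_peak]
    i = i_peak
    while i > 0 and w[i - 1] > level:    # scan backwards to the first crossing
        i -= 1
    if i == 0:
        raise ValueError("no level crossing found before the peak")
    # Here w[i-1] <= level < w[i]; interpolate between the bracketing samples.
    frac = (level - w[i - 1]) / (w[i] - w[i - 1])
    return (i - 1 + frac) * dt_ns
```

For a toy waveform `[0, 0, 0, 2, 10, 6, 2, 0, 0, 0]` with the peak at index 4, the 50% level of 5 falls between samples 3 and 4, giving an interpolated arrival time of 13.5 ns for 4 ns sampling.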
A pulse shape discrimination (PSD) value is calculated for each pulse as the "tail-over-total" ratio of pulse areas between 11 and 50 samples after t to the area between 3 samples before and 50 samples after t, integrated assuming trapezoidal interpolation between samples. This choice of integration windows was selected to maximize the PSD figure-of-merit for discriminating neutron captures from similar-energy γ-ray interactions.
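The tail-over-total ratio can be sketched as follows, assuming the arrival time is expressed in (fractional) sample units and that integration limits falling outside the recorded waveform are simply clipped. The helper `interp_integral` and both function names are hypothetical.

```python
import numpy as np

def interp_integral(w, a, b):
    """Integrate the piecewise-linear waveform between sample positions a and b."""
    n = len(w)
    a, b = max(a, 0.0), min(b, n - 1.0)              # clip to the recorded range
    inner = np.arange(np.ceil(a), np.floor(b) + 1)   # whole samples inside [a, b]
    x = np.unique(np.concatenate(([a], inner, [b])))
    y = np.interp(x, np.arange(n), w)                # linear interpolation
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))  # trapezoid rule

def psd_tail_over_total(w, t_samples, tail_start=11, pre=3, end=50):
    """Tail (t+11 .. t+50) over total (t-3 .. t+50) area ratio."""
    total = interp_integral(w, t_samples - pre, t_samples + end)
    tail = interp_integral(w, t_samples + tail_start, t_samples + end)
    return tail / total
```

Because both windows are anchored to the interpolated arrival time rather than to a whole sample, the ratio does not jump when a pulse shifts by a fraction of the 4 ns sampling interval.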
The time-ordered list of analyzed pulses found in each waveform (arrival time t, area S, and PSD, along with baseline b and peak height h) is written to an HDF5 table format file.

B. Pulse Clustering and Pairing
The next stage of analysis uses the HDF5-format pulse data to extract low-level calibration constants from ambient background events. These calibration constants are stored to a calibrations database, to be used in a later pass converting the pulse data to calibrated physics metrics involving ionization energy, time, and positions.
Prior to performing calibration procedures, however, pulse data are grouped into "clusters" of pulses nearby in time, defined as having arrival times between subsequent pulses separated by no more than 20 ns. Within the cluster, pulses are paired between PMTs on opposite sides of the same segment. Segment pulses without a matching pair, either because the other channel was turned off or because the signal fell below acquisition thresholds on the opposite side, are retained by the data processing infrastructure, but are excluded from subsequent calibrated data analysis for results shown in this paper. Paired pulses are processed and combined to produce calibrated physics values describing the interaction producing the collected waveform in that segment: its time, position, visible energy deposition, and PSD.
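A minimal sketch of the clustering and pairing logic follows. The dictionary keys `t`, `seg`, and `pmt`, the function names, and the rule of keeping only the earliest pulse per channel within a cluster are assumptions for illustration only.

```python
from collections import defaultdict

def cluster_pulses(pulses, max_gap_ns=20.0):
    """Group pulses in time; a gap above max_gap_ns starts a new cluster."""
    clusters, current = [], []
    for p in sorted(pulses, key=lambda p: p["t"]):
        if current and p["t"] - current[-1]["t"] > max_gap_ns:
            clusters.append(current)
            current = []
        current.append(p)
    if current:
        clusters.append(current)
    return clusters

def pair_pulses(cluster):
    """Pair pulses from the two PMTs (pmt 0 and pmt 1) of the same segment."""
    by_seg = defaultdict(dict)
    for p in cluster:                        # cluster is already time-ordered
        by_seg[p["seg"]].setdefault(p["pmt"], p)   # assumption: keep earliest
    paired = [(d[0], d[1]) for d in by_seg.values() if 0 in d and 1 in d]
    unpaired = [p for d in by_seg.values() if len(d) == 1 for p in d.values()]
    return paired, unpaired
```

Since the gap is measured between subsequent pulses, a cluster can span more than 20 ns in total as long as no two neighbouring pulses are further apart than that.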

C. Timing Calibration
The pulse arrival time variables $t_i^0$ and $t_i^1$ for the two PMTs on segment $i$ are transformed into a conjugate pair of variables: a segment hit time $t_i = \frac{1}{2}(t_i^0 + t_i^1)$, and a timing difference $\delta t_i \equiv t_i^1 - t_i^0$. The segment hit time is, to first order, independent of ionization position along the segment, as increased light transport time to one end cancels decreased transport time to the other. The differential time is independent of absolute event time in the run, and strongly correlated with hit position along the segment.
Relative timing offsets between channels arising from electronics effects and cable length variations are calibrated out using through-going cosmogenic muon tracks. Candidate muon tracks are identified by a pulse ADC area ($S$) sum above $10^5$ and at least 4 paired segments. Muon tracks crossing the full width of a segment produce signals which exceed the dynamic range of the digitizer, resulting in saturated waveforms with nonlinear degraded energy and timing information for energy depositions above ∼15 MeV. However, shorter "corner-clipping" track sections produce a range of well-formed waveform signals. Muon statistics are sufficient to calibrate timing on a run-by-run basis: typically one hour, but sufficient even for five-minute calibration source runs.
Muon tracks provide signals across multiple segments with approximately simultaneous origin times, up to the muon transit speed through the detector. Muon transit time is estimated from a Principal Components Analysis (PCA) trajectory fit to the pulse pair data. For each pair $i, j$ of segments in the event with "corner-clipping"-range signals, the mean and variance of the segment-to-segment distributions $T_{ij} \equiv t_i - t_j - t_{ij}^{\mu}$ and $\delta T_{ij} \equiv \delta t_i - \delta t_j$ are tallied, where $t_{ij}^{\mu}$ is the estimated muon transit time between segments.
The collection of averaged $T_{ij}$ and $\delta T_{ij}$ values defines an overdetermined linear system of equations for segment-to-segment timing offsets, up to a common-mode offset. This system is solved using the least-squares method to determine average timing offsets $\bar{t}_i$ (with common-mode constraint $\sum_i \bar{t}_i = 0$) and differential offsets $\overline{\delta t}_i$ for each segment.
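The least-squares step can be sketched with NumPy under the assumption that each averaged pairwise value approximates the difference of two per-segment offsets. The function name `solve_offsets` is hypothetical, and this unweighted version ignores the tallied variances, which a fuller treatment would presumably use as weights.

```python
import numpy as np

def solve_offsets(n_seg, pairs, mean_T):
    """Least-squares offsets o_i from pairwise means <T_ij> ~ o_i - o_j."""
    A = np.zeros((len(pairs) + 1, n_seg))
    b = np.zeros(len(pairs) + 1)
    for row, ((i, j), T) in enumerate(zip(pairs, mean_T)):
        A[row, i], A[row, j] = 1.0, -1.0   # one equation per segment pair
        b[row] = T
    A[-1, :] = 1.0                         # common-mode constraint: sum(o) = 0
    offsets, *_ = np.linalg.lstsq(A, b, rcond=None)
    return offsets
```

The difference equations alone determine the offsets only up to a common constant. The extra row of ones is orthogonal to every difference row, so it fixes that constant at zero sum without disturbing the fit to the pairwise data.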
These timing values, saved to the calibration database, are subtracted from the raw $t_i$ and $\delta t_i$ values for a pulse pair to yield the reconstructed event time $t_i^{\mathrm{rec}} \equiv t_i - \bar{t}_i$ and the position-dependent timing difference $\Delta t_i \equiv \delta t_i - \overline{\delta t}_i$. Figure 6 shows the segment timing calibrations extracted from a typical one-hour run. Timing differences at the 10 ns scale arise mainly from differences in PMT transit times, with systematic offsets between the ET and Hamamatsu PMT models, plus board-to-board clock $t_0$ offsets in discrete 8 ns intervals. Board-to-board $t_0$ offsets are prone to vary run to run; modulo this effect, the extracted timing calibration offsets have a run-to-run scatter at the 20 ps level, and long-term drifts < 1 ns over months.

D. Combined PSD Parameter
To produce a single pulse PSD value, the PSD values from the two channels in a segment pair are corrected to remove residual position dependence and then statistically combined.
Position variation of the PSD observed by each PMT for minimum-ionizing event tracks is mapped out using the corner-clipping muon hits also used for timing. The PSD tail fraction is observed to increase with distance from the PMT, explicable by a wider spread in photon transit distances to the photocathode delaying light further from the shortest-path arrival edge. The observed distribution is empirically fit as a function of $\Delta t$, for each segment in each run, with a three-parameter curve $p \cdot (1 + d \cdot [1 - e^{k\Delta t}])$.
The measured position-dependent component $p \cdot d \cdot [1 - e^{k\Delta t}]$ is subtracted from the PSD of each pulse, leaving a distribution centered around $p$ for electron-like events, and a higher but still position-independent distribution of high-ionization-density events. The two position-corrected PSD values for each pulse are averaged together, weighted by the estimated number of photoelectrons in each pulse, into a single PSD value. Figure 7 shows a calibrated PSD distribution for pulses occurring after candidate muon tracks, which include a large population of $^6$Li-captured spallation neutrons. Calibrated pulse PSD values are plotted versus uncalibrated signal amplitude, defined as the product of pulse areas $S_0$ and $S_1$ for that segment's low-z and high-z PMT channel, respectively.
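The position correction and photoelectron-weighted combination can be written compactly, as in the sketch below. The function names are hypothetical, and the sign convention of $\Delta t$ for each PMT as well as the photoelectron estimates `npe0`, `npe1` are assumed to be supplied by the caller.

```python
import math

def position_corrected_psd(psd_raw, dt, p, d, k):
    """Subtract the fitted position-dependent term p*d*(1 - exp(k*dt))."""
    return psd_raw - p * d * (1.0 - math.exp(k * dt))

def combined_psd(psd0, psd1, npe0, npe1):
    """Photoelectron-weighted average of the two corrected channel PSD values."""
    return (npe0 * psd0 + npe1 * psd1) / (npe0 + npe1)
```

A pulse lying exactly on the fitted curve $p \cdot (1 + d \cdot [1 - e^{k\Delta t}])$ is mapped to $p$ at any position, which is why the corrected electron-like band is centered on $p$.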
The $p$, $d$, $k$ PSD values track the long-term changes in detector light transport. While the position-dependent terms $d$, $k$ are calibrated out, the long-term variation in $p$, trending towards lower values as increased attenuation filters out longer light paths, remains. Rather than calibrating it out, the time-varying value of $p$ is used for defining PSD cuts. The center of the PSD distribution for n+$^6$Li capture events is also tracked for use in neutron capture identification cuts.

FIG. 7. Amplitude is defined as the product of pulse areas $S_0$ and $S_1$ for that segment. Neutron capture signals on $^6$Li are clearly visible in a localized region of amplitude and high PSD, above a band of γ-produced signals of low PSD.

E. Position Calibration
Both the relative timing and relative signal amplitude between PMTs provide information about the position of events along the segment length. A position estimate is calculated both from timing $\Delta t$ and from the log ratio of pulse areas, $R$. The $\Delta t$ distribution for the previously described corner-clipping muon hits is recorded in each segment, and is plotted in Figure 8. The distribution is not broadly uniform across the segment due to geometric selection efficiencies for this event type. However, the edges of the distribution provide well-defined markers for the ends of the active scintillator volume. Additional high-frequency variations are also present across the distribution, corresponding to light transport perturbations caused by the diffusely-reflecting plastic support rod clips holding the edges of the specularly-reflecting optical grid panels (described in Section II and Ref. [74]). The corner-clipping muon event class is more sensitive to these than events uniformly distributed over the detector bulk, since scintillation occurs in the segment corners near these clips.
The $\Delta t$ distribution shown in Figure 8 is fit to extract the distribution edges and the fine-structure wiggle positions across the segment. A position model $z = a\Delta t + b(\Delta t)^3$ is used, combined with empirical parameters for large-scale resolution and shape. This two-term position model (linear and cubic components) produces agreement to better than 1 cm with dedicated calibration source position scans.
To estimate position from relative light collection, the log signal ratio $R$ is fit against $\Delta t$, which is in turn linked to $z$ by the procedure above. For this step, a linear fit plus cubic correction $R = a + b\Delta t + c(\Delta t)^3$ is employed. Parameters for both timing-based and amplitude-based calibration curves $z(\Delta t)$ and $z(R)$ are stored to the calibration database for later numerical evaluation. A final reconstructed position $z_{\mathrm{rec}}$ for each pulse is formed from a statistically-weighted average of its timing- and amplitude-based $z$ estimates. It is found that removal of either the time or the amplitude based information from $z_{\mathrm{rec}}$ produces noticeable degradation in reconstructed position resolution.
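The combination of the two position estimates can be sketched as below, assuming that "statistically-weighted" means inverse-variance weighting and that per-estimate uncertainties `sigma_t` and `sigma_r` are available. Both function names and these inputs are illustrative assumptions.

```python
def z_from_dt(dt, a, b):
    """Timing-based position model z = a*dt + b*dt**3."""
    return a * dt + b * dt ** 3

def combine_z(z_t, sigma_t, z_r, sigma_r):
    """Inverse-variance weighted average of timing- and amplitude-based z."""
    w_t, w_r = 1.0 / sigma_t ** 2, 1.0 / sigma_r ** 2
    z = (w_t * z_t + w_r * z_r) / (w_t + w_r)
    sigma = (w_t + w_r) ** -0.5        # always smaller than either input sigma
    return z, sigma
```

Under this weighting the combined uncertainty is below that of either input, which is consistent with the observation that dropping either the timing or the amplitude information degrades the position resolution.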
The general features of reconstructed pulse positions $z_{\mathrm{rec}}$ are illustrated for a single detector segment in Figure 9 using a high-purity selected set of polonium α decay events in the PROSPECT scintillator, which arise from the presence of added $^{227}$Ac [68] and naturally occurring $^{238}$U and $^{232}$Th decay chain isotopes. The energy, position, and time coincidence requirements for these datasets are described in Section III H. For all three polonium isotopes, uniform $z_{\mathrm{rec}}$ distributions are centered on $z_{\mathrm{rec}} = 0$ with a width consistent with expectation based on the 117.6 cm active segment length. The resolution of $z_{\mathrm{rec}}$ is illustrated by the gradual reduction in rates at segment ends (high $|z_{\mathrm{rec}}|$), and by the $z_{\mathrm{rec}}$ coincidence observed between $^{215}$Po and its α decay parent $^{219}$Rn, which decays at an effectively identical location.
As scintillator optical properties slowly evolve with time, so do both z(R) and z(∆t). Collecting sufficient statistics to resolve the fine structure in the ∆t distribution for each segment requires combining data over week-timescale periods. The whole dataset is thus subdivided into 11 calibration periods for measuring and applying position calibrations.

F. Energy Calibration
The PROSPECT detector's segmented construction, coupled with scintillator nonlinearity (quenching) and trigger acquisition thresholds, complicates event-by-event extraction of interaction energies. Rather than attempt reconstructing the initial energy of each interaction, the PROSPECT calibration effort is divided into two components: extracting a consistent measure of the visible energy, $E_{\mathrm{vis}}$ (light production after scintillator nonlinearity, but before light transport and PMT gain effects), and adjusting parameters in a Monte Carlo (MC) based detector response model to accurately reproduce data observables in $E_{\mathrm{vis}}$ space. This section describes the first component, calibration of position- and time-dependent variations in light collection. Adjusting the response model to match absolute energy scale is discussed in Section IV.
Inputs for reconstructing the $E_{\mathrm{vis}}$ of a segment interaction are the two pulse area signals $S_0$, $S_1$ from each segment end, and the reconstructed longitudinal position in the segment, $z_{\mathrm{rec}}$. The statistically optimal way to combine this information into a single $E_{\mathrm{vis}}$ number, given the dominant uncertainty of photoelectron (PE) counting statistics fluctuations on the pulse area values, is to sum the estimated total number of PE counted by both PMTs, and divide out a position-dependent light collection factor,
$$E_{\mathrm{vis}} = \frac{S_0 n_0/g_0 + S_1 n_1/g_1}{n_0\,\eta_0(z_{\mathrm{rec}}) + n_1\,\eta_1(z_{\mathrm{rec}})},$$
where $g_i$ is the pulse area signal per $E_{\mathrm{vis}}$ deposited at segment center (combining effects of light production, light transport, and PMT/readout gain), $n_i$ is the estimated number of photoelectrons collected per $E_{\mathrm{vis}}$ at segment center, and $\eta_i(z)$ is the position-dependent light transport efficiency to each PMT, normalized to 1 at segment center.
Neutron capture signals on $^6$Li provide a monoenergetic reference continuously available from natural backgrounds throughout the scintillator volume, cleanly separable from dominant γ-ray backgrounds by both PSD and time correlations. The high-ionization-density tracks of the $^4$He-$^3$H products are well into the scintillator's nonlinear quenching range, so the $E_{\mathrm{vis}}$ produced cannot be accurately predicted from first principles. From the absolute energy scale calibration described in Section IV, γ-ray calibration source spectra are reconstructed to the correct (i.e. MC-matching) $E_{\mathrm{vis}}$ when the neutron capture peak is scaled to fall at $E_{\mathrm{vis}} = 0.526$ MeV.
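The visible-energy estimator is a single expression, sketched here with a hypothetical function name `e_vis`. The values of $\eta_0(z_{\mathrm{rec}})$ and $\eta_1(z_{\mathrm{rec}})$ are assumed to have been evaluated beforehand from the calibrated light transport curves.

```python
def e_vis(S0, S1, g0, g1, n0, n1, eta0, eta1):
    """PE-weighted visible energy from two pulse areas at one position.

    S_i / g_i is each channel's energy estimate before position correction;
    multiplying by n_i converts it to an estimated photoelectron count.
    """
    total_pe = S0 * n0 / g0 + S1 * n1 / g1   # summed estimated PE
    pe_per_energy = n0 * eta0 + n1 * eta1    # expected PE per unit E_vis at z
    return total_pe / pe_per_energy
```

For noise-free signals $S_i = g_i\,\eta_i\,E$, the expression returns $E$ for any choice of $n_0$ and $n_1$, so the photoelectron yields act only as weights and do not set the energy scale.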
The neutron capture signal is measured for each PMT for each run to determine the gain-stabilizing $g_i$ calibration constants. The neutron capture peak is also used to map out the light transport curves $\eta_i(z)$, summed over approximately two-week-long periods for sufficient statistics. The accuracy of $n_0$ and $n_1$ is not critical to the result, since these only define weightings that cancel out: sub-optimal estimates of $n_i$ would inflate statistical scatter in the result without shifting the mean, insofar as $g_i$ and $\eta_i$ are accurate. The value of $n_i$ is determined from the width of the n+$^6$Li capture peak. Gain-stabilizing constants are recorded into calibration databases and applied on a run-by-run basis, while light transport and photoelectron collection constants are recorded and applied in two-week intervals. Figure 10 illustrates the position dependence of light transport and the magnitude of its time variation, both of which must be taken into account to achieve stable $E_{\mathrm{vis}}$ calibration. For a single channel, the overall level of light collection varies by a factor of 3-5 along $z_{\mathrm{rec}}$, with substantial variation between segments. Variation in light collection as a function of $z_{\mathrm{rec}}$ is reduced to roughly 50% when information from both channels is combined. A substantial reduction in light collection is also clearly visible between the beginning and end of the dataset used for this analysis: at the segment center, a light reduction of 30% is observed over the 7 calendar month data-taking period.
Degradation of scintillator optical properties with time causes a continuous gradual degradation of $E_{\mathrm{vis}}$ resolution. For constructing energy spectra in a uniform manner across different time periods, which permits straightforward reactor-off data subtraction and simpler interpretation of spectrum results, a "smeared" energy $E_{\mathrm{smear}}$ is produced by adding random fluctuations to $E_{\mathrm{vis}}$ to reduce the resolution to the equivalent of 325 photoelectrons/MeV in all segments at all times. Figure 11 shows the long-term stability of $E_{\mathrm{smear}}$ energy resolution for the $^{215}$Po peak.

FIG. 11. $E_{\mathrm{rec}}$ resolution of the $^{215}$Po peak from $^{219}$Rn-$^{215}$Po α-α decays before (black) and after (blue) applying $E_{\mathrm{rec}}$ energy smearing.
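One way to realize such smearing is sketched below. It assumes purely Poisson-like photostatistics, so that the variance is $E/N$ for $N$ photoelectrons per MeV, and adds a Gaussian fluctuation in quadrature to reach the 325 PE/MeV target. This is an illustrative assumption and not necessarily the procedure used in the PROSPECT analysis; the function name and arguments are hypothetical.

```python
import numpy as np

def smear_energy(e_vis, n_pe_per_mev, target=325.0, rng=None):
    """Degrade resolution from n_pe_per_mev to `target` photoelectrons/MeV."""
    rng = np.random.default_rng() if rng is None else rng
    e_vis = np.asarray(e_vis, dtype=float)
    # Assumed model: variance E/N. Add the missing variance in quadrature.
    extra_var = e_vis * (1.0 / target - 1.0 / n_pe_per_mev)
    extra_var = np.clip(extra_var, 0.0, None)   # never try to sharpen resolution
    return e_vis + rng.normal(0.0, 1.0, e_vis.shape) * np.sqrt(extra_var)
```

Segments or periods already at or below the target photoelectron yield are left unchanged by this sketch, since resolution can be degraded but not recovered.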

G. Event Reconstruction
As interactions of $\bar{\nu}_e$ and other particles in the PROSPECT inner detector will often produce pulses in multiple detector segments, it is necessary to analyze physics events at the cluster level. Thus, reconstructed cluster physics metrics are primary inputs to the higher-level PROSPECT oscillation and $^{235}$U physics analyses. Cluster formation was described previously in Section III B.
To ensure consistency in cluster energy and multiplicity definitions despite time variations in per-segment energy response coupled with hardware thresholds, only reconstructed pulses with $E_{\mathrm{smear}} > 90$ keV are considered for analysis in reconstructed clusters. This threshold was estimated to be above the ZLE ADC threshold at all positions in all segments for the entire dataset by examining each segment's pulse energy spectrum shape in the vicinity of the trigger threshold at different times. To account for unexpected biases in the analysis method, the 90 keV energy cut threshold is treated with a ±5 keV uncertainty when comparing predicted and measured IBD datasets. This uncertainty allows for small variations in the multiplicity of predicted events, which naturally propagates to an uncertainty in predicted reconstructed energy.
Reconstructed physics quantities for individual clusters are formed using the reconstructed quantities of the included pulses. Cluster time, $T_{\mathrm{rec}}$, is defined as the median $t_{\mathrm{rec}}$ of the individual included pulse times. Cluster energy, $E_{\mathrm{rec}}$, is defined as the sum of the reconstructed smeared energies $E_{\mathrm{smear}}$ of all contained pulses. Cluster z-position and segment number, $Z_{\mathrm{rec}}$ and $S_{\mathrm{rec}}$, are defined as the $z_{\mathrm{rec}}$ and segment number of the highest-energy contained pulse. Cluster segment multiplicity, as well as the energies, PSD values, and z-positions of each segment pulse, are also stored for use in later steps of the analysis. All of these cluster-related variables are used in the IBD signal selection process. Cluster $E_{\mathrm{rec}}$ and $S_{\mathrm{rec}}$ are used as primary inputs to the sterile neutrino oscillation analysis, while $E_{\mathrm{rec}}$ is also a primary input to the $^{235}$U $\bar{\nu}_e$ spectrum analysis.
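The cluster-level definitions above translate directly into a short sketch. The dictionary keys, the function name, and the choice to return `None` for a cluster with no pulse above threshold are assumptions for illustration; energies are taken to be in MeV.

```python
from statistics import median

def reconstruct_cluster(pulses, e_min=0.090):
    """Cluster quantities from calibrated segment pulses (energies in MeV)."""
    kept = [p for p in pulses if p["e_smear"] > e_min]   # 90 keV pulse threshold
    if not kept:
        return None
    lead = max(kept, key=lambda p: p["e_smear"])         # highest-energy pulse
    return {
        "T_rec": median(p["t_rec"] for p in kept),
        "E_rec": sum(p["e_smear"] for p in kept),
        "Z_rec": lead["z_rec"],
        "S_rec": lead["seg"],
        "multiplicity": len(kept),
    }
```

Because pulses below threshold are dropped before any quantity is formed, a shift of the threshold changes both the multiplicity and the summed energy of a cluster.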

H. Calibration Performance
The stability of the energy, position and PSD metrics as a function of time and segment can be characterized using a variety of background categories present in normal physics data-taking runs, encompassing a range of particle types, energies, and spatial topologies. The most versatile event category is a high-purity sample of detector-intrinsic ($^{219}$Rn, $^{215}$Po) correlated α decays produced by $^{227}$Ac deliberately dissolved into the scintillator. The selection criteria and time-integrated rate for this signal are summarized in Table I. The total rate of this signal in the detector, 0.4 Hz, enables daily characterization of energy- and z-related performance metrics, as well as time-integrated comparisons between datasets from differing detector segments. Notably, the compact topology of these α coincidences also enables characterization of the stability of z-position reconstruction resolution with time. A similar high-purity sample of correlated ($^{214}$Bi, $^{214}$Po) and ($^{212}$Bi, $^{212}$Po) (β+γ, α) decays from the $^{238}$U and $^{232}$Th decay chains can also be found in the dataset due to natural radioactive contamination in the inner detector. Selection criteria and rates for these events are also summarized in Table I. Due to the presence of γ-rays in the prompt signal and the significant path length of the β-particles, they are not ideal for characterizing the z-resolution of the detector. Various clean γ-ray signals can also be identified for use in stability studies. A sample of mono-energetic 2.2 MeV γ-rays produced by n-H capture in the detector can be obtained from a 10-200 µs window following cosmogenic muon signals in the detector. Cosmogenic muon signals are defined as events with summed pulse energies greater than 15 MeV, while the purity of the n-H sample can be further improved with tight cuts on the electron-like PSD band.
Finally, prominent γ-ray peaks are visible in the low-PSD single trigger energy spectrum originating from intrinsic $^{208}$Tl contamination in the detector and from capture of reactor-generated neutrons on metals in the HFIR complex and the PROSPECT shielding package. The time-stability of energy and z-related reconstruction metrics is summarized for these various sources in Figure 12. For each metric and event type, stability in time is expressed in reference to the mean value over the full dataset for that metric/event type; stability between reactor-on and reactor-off periods is expressed with respect to the mean of reactor-on and reactor-off values. $E_{\mathrm{rec}}$ values for all sources are stable within ±0.5% over the full dataset, and to within 0.2% between reactor-on and reactor-off periods. $E_{\mathrm{rec}}$ resolutions are stable within ±5% over the full dataset, and within 2% between reactor-on and reactor-off periods.

FIG. 12. Stability of pulse-level reconstructed physics metrics related to energy and longitudinal position (z). Stability is pictured over time, as well as between reactor-on and reactor-off periods. Metrics are calculated for $^{215}$Po (black) and $^{214}$Po (blue) α decays uniformly distributed throughout the detector, for nH captures (green), for γ-ray full-energy peaks from single $^{208}$Tl decay (red), and for the highest-energy prominent reactor neutron capture peak edge (pink) during reactor-on and -off periods, respectively. Reconstructed metrics are described in more detail in the text. All quantities are shown relative to the average of all points in the dataset. Light grey bands indicate reactor-on periods. Right panel shows relative changes between reactor on and off datasets. All error bars represent statistical uncertainties.
Given the expected stability and uniformity of the ($^{214}$Bi, $^{214}$Po) and ($^{219}$Rn, $^{215}$Po) distributions throughout the detector with time (Fig. 9), the root mean square (RMS) of all coincidences' delayed reconstructed z position, $Z_{\mathrm{RMS}}$, should exhibit time-stability; any change in this quantity would indicate an alteration in the resolution of pulse z reconstruction. This quantity is found to be time-stable within ±1.5%, corresponding to roughly 2 cm with respect to the 1.2 m segment length. A more precise probe of z resolution is provided by the width of the distribution of distances between prompt and delayed ($^{219}$Rn, $^{215}$Po) signals, $\sigma_{\Delta z}$. This metric exhibits a 7% variation over time, corresponding to roughly 3.5 mm with respect to the 50 mm ($^{219}$Rn, $^{215}$Po) time-averaged $\sigma_{\Delta z}$. This variation in z reconstruction capabilities is caused by the reduction in photon counting statistics due to decreased light collection over time, as described in the previous sections. Time variation in z-resolution for events with higher energies and larger spatial extent, such as IBD prompt positron signals, is likely to be less significant, due to higher photostatistics and to the finite cm-scale spatial extent of ionization tracks. Due to the interleaved nature of reactor-on and reactor-off datasets, this time variation results in <0.5% difference in $\sigma_{\Delta z}$ between reactor-on and reactor-off periods. The minor impact of z-resolution time-dependence on the selection of IBD events will be discussed in more detail in Section V C.

FIG. 13. Quantities are calculated for $^{215}$Po (black), $^{214}$Po (blue), and $^{212}$Po (red) α decays uniformly distributed throughout the detector. Reconstructed quantities are described in more detail in the text. All quantities are shown relative to the average of all points in the dataset with the exception of mean $z_{\mathrm{rec}}$, which is plotted in mm.

Figure 13 provides similar reconstruction stability characterizations for the ensemble of detector segments.
Reconstructed quantities for the γ-ray event classes are excluded because the segment multiplicity is greater than unity. Energy scales and resolutions are found to be identical to within ±0.5% and ±7% between all detector segments, respectively. To gauge the common alignment of z between all segments, the mean, rather than the RMS, of the reconstructed z-position distribution for each segment is also plotted. The mean $z_{\mathrm{rec}}$ values for all segments are found to be aligned within ±0.5 cm for $^{215}$Po events and within ±2.0 cm for $^{212}$Po and $^{214}$Po events. Prompt-delayed position coincidence distributions for ($^{219}$Rn, $^{215}$Po) events are found to have variations in width ($\sigma_{\Delta z}$) of order 10% or less, corresponding to a segment-to-segment variation of 5 mm or less with respect to the 50 mm time-averaged $\sigma_{\Delta z}$.
Variations in pulse-level reconstructed metrics with time and segment are propagated as systematic uncertainties in higher-level PROSPECT analyses. The treatment of these uncertainties is discussed in further detail in Sections VIII and IX.

IV. ABSOLUTE ANTINEUTRINO ENERGY AND ENERGY RESOLUTION
For higher-level analyses, it is essential to define the relationship between reconstructed cluster energy, $E_{\mathrm{rec}}$, and incoming antineutrino energy, $E_{\nu}$. This relationship is complex, given the presence of dead material throughout the antineutrino target, the segmented detector geometry, the small target size, and the non-linearity of light production in the scintillator. For $\bar{\nu}_e$-related energy depositions, this relationship is defined using PG4, a GEANT-4 based [75] MC simulation of the PROSPECT detector, which is adjusted to reproduce the observed PROSPECT response to a wide variety of radioactive calibration source and intrinsic background energy depositions. This approach is in contrast to that recently presented by other reactor $\bar{\nu}_e$ experiments such as Daya Bay, where geometric, scintillator, and electronics effects are independently modelled and parameterized, with energy non-linearities then matched to empirical fits of calibration and background energy spectra [76].

A. Monte Carlo Simulation Description
The PG4 MC simulation incorporates the essential aspects of the realized PROSPECT detector geometry described in Section II. The modelled dimensions of the scintillator volume accurately reflect dimensions measured during detector assembly and scintillator preparation [71]. The modelled optical grid features the as-measured average reflector and support rod dimensions, materials, and densities reported in Ref. [74]. The most important aspects of both instrumented and un-instrumented segment support rods are also modeled, including radioactive source capsule materials and geometries as well as accurate air, acrylic, Teflon, PLA, and scintillator volumes.
The simulation includes the geometries and materials of the PMT housings, the acrylic support structure, the acrylic and aluminum tanks, and the inner and outer shielding packages. To simplify the simulation, all segments are given identical geometric and material properties. Modest simplifications are also applied to the support rod axis and calibration deployment system geometries. These simplifications are expected to have minimal impact on the PG4-determined relationship between true and reconstructed $\bar{\nu}_e$ energies.
The non-linear optical response of the PROSPECT scintillator to energy depositions is not directly simulated via the computational-resource-heavy process of optical photon production and propagation. Instead, the fractional rate of conversion of true deposited energy to scintillation light is calculated step-by-step during Geant4 propagation of the particle using parameterizations of these physics processes:
$$E_{MC} = A \sum_i \left(E_{\mathrm{scint},i} + E_{c,i}\right). \qquad (5)$$
The energy converted directly into scintillation light, $dE_{\mathrm{scint}}$, during simulation step $i$ is parameterized using Birks' law quenching [77]:
$$\frac{dE_{\mathrm{scint}}}{dx} = \frac{dE/dx}{1 + k_{B1}\,\dfrac{dE}{dx} + k_{B2}\left(\dfrac{dE}{dx}\right)^2},$$
where $k_{B1}$ and $k_{B2}$ are first- and second-order Birks constants and $dE/dx$ is the true deposited energy in that step. Cerenkov light production and absorption and subsequent scintillation photon re-emission in simulation step $i$ is modelled as
$$E_c = k_c \sum_{\lambda} N_{\lambda} E_{\lambda},$$
where $N_{\lambda}$ is the number of Cerenkov photons emitted per unit wavelength, $E_{\lambda}$ is the energy of those Cerenkov photons, and $k_c$ is a normalization parameter that scales Cerenkov light production with respect to a default estimate based on simple scintillator refractive index assumptions. In Equation 5, an overall scaling factor $A$ enables variation of the overall fractional rate of conversion of deposited energy into detected energy. We note that scintillation light from nuclear recoil signatures is modelled with two independent Birks parameters tuned to properly place the n-Li $E_{\mathrm{rec}}$ peak location with respect to the γ-ray and β+γ signatures used for calibration; recoil signatures in the energy range of interest for this analysis produce no Cerenkov light. During the simulation, each step in deposited energy $E_{MC,i}$ is assigned to a running total for the appropriate segment. $E_{MC}$ for each segment following particle propagation is used to build synthetic waveforms based on measured shape templates and low-level detector calibration parameters.
Waveform shape for each channel is assigned according to the magnitude of simulated scintillation light quenching for the relevant energy depositions, while waveform amplitude is determined by the magnitude of $E_{MC}$ and the position of deposited energy in z. Low-level pulse processing, cluster formation, and timing, PSD, position and energy reconstruction then proceed identically to that described above for real PROSPECT data.
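The step-wise quenching calculation described in this subsection can be sketched as follows, assuming the standard second-order Birks form, in which the deposited energy of each step is divided by $1 + k_{B1}\,dE/dx + k_{B2}\,(dE/dx)^2$. The function names and the representation of a track as a list of `(dE, dx)` steps are hypothetical, and the Cerenkov term is passed in as a precomputed number rather than modelled.

```python
def quenched_step(dE, dx, kB1, kB2):
    """Scintillation energy for one step (dE in MeV, dx in cm)."""
    dedx = dE / dx
    return dE / (1.0 + kB1 * dedx + kB2 * dedx ** 2)

def detected_energy(steps, A, kB1, kB2, e_cherenkov=0.0):
    """Overall scale A times summed quenched scintillation plus Cerenkov energy."""
    e_scint = sum(quenched_step(dE, dx, kB1, kB2) for dE, dx in steps)
    return A * (e_scint + e_cherenkov)
```

With $k_{B1}$ in cm/MeV and $k_{B2}$ in cm$^2$/MeV$^2$, the denominator is dimensionless, and steps with a larger $dE/dx$ convert a smaller fraction of their energy into light.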

B. Absolute Energy Response Determination
PG4 MC simulations, run through PROSPECT's analysis infrastructure, can be used to generate simulated cluster $E_{\mathrm{rec}}$ distributions and pulse multiplicities in response to any energy deposition given any combination of absolute energy response parameters $(A, k_{B1}, k_{B2}, k_c)$. Data and PG4 cluster $E_{\mathrm{rec}}$ and pulse multiplicity distributions can then be compared for a variety of radioactive sources, both deployed and intrinsic. For $E_{\mathrm{rec}}$ spectra, datasets include γ-ray sources $^{60}$Co (1.17+1.33 MeV), $^{137}$Cs (0.66 MeV), and $^{22}$Na (2×0.511+1.27 MeV and 2×0.511 MeV) deployed at the detector z midpoint along a calibration axis near the (x,y) center of the detector, n-H capture γ-rays from a similarly deployed $^{252}$Cf spontaneous fission source (2.22 MeV), and β-dominated energy spectra from cosmogenically produced $^{12}$B (3 MeV to 13.4 MeV). Pulse multiplicity distributions are included in the fit for all of the γ-ray sources listed above. All γ-ray datasets and the $^{252}$Cf dataset were obtained during special calibration campaigns in April and May 2018, respectively; the high-purity $^{12}$B dataset derives from special analysis cuts applied to the full physics dataset.

To determine the nominal PROSPECT detector energy response model, cluster $E_{\mathrm{rec}}$ and multiplicity distributions described above were simulated in PG4 for each grid point in a 4-dimensional detector response parameter space $(A, k_{B1}, k_{B2}, k_c)$, and compared to the corresponding calibration datasets using the $\chi^2$ function:
$$\chi^2 = \sum_{\gamma} \chi^2_{\gamma} + \sum_{\mathrm{multi}} \chi^2_{\mathrm{multi}} + \chi^2_{^{12}\mathrm{B}}.$$
In this comparison, $\chi^2_{\gamma}$ is the $\chi^2$ value for each γ-ray $E_{\mathrm{rec}}$ distribution, $\chi^2_{\mathrm{multi}}$ is the $\chi^2$ value for each of the two included γ-ray multiplicity distributions, and $\chi^2_{^{12}\mathrm{B}}$ is the $\chi^2$ value of the $^{12}$B $E_{\mathrm{rec}}$ distribution. The grid point providing the lowest $\chi^2$ value was chosen as the nominal energy model. Reconstructed energy and multiplicity distributions for the data and best-fit PG4 are shown in Figure 14.
Both the shape and scale of these distributions show good agreement between data and the best-fit Monte Carlo. The best-fit parameters for this model are (A, k B1 , k B2 , k c ) = (1.0026±0.004, 0.132±0.004 cm/MeV, 0.023±0.004 cm 2 /MeV 2 , 37±2%), with a χ 2 /DOF (degrees of freedom) of 581.5/420. For the best-fit model, light is overwhelmingly contributed by direct scintillation from excitation and ionization: as an example, for the 2.22 MeV n-H capture de-excitation γ-ray, only 3.5% of E MC is contributed by the Cerenkov process (E c ).
Uncertainties on each of the four energy response parameters are assigned by identifying the maximum variation in each parameter value among all grid points with χ 2 values within 1σ of the best-fit model. For the 235 U spectrum and oscillation physics analyses, an energy scale uncertainty covariance matrix reflecting these energy model parameters is then generated using these parameter variation ranges. This scintillator-associated uncertainty is assumed to be correlated between all segments.
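A minimal sketch of this uncertainty assignment, assuming the grid results are available as a mapping from parameter tuples to χ 2 values. The acceptance threshold `delta_chi2` is left as an input because the numerical value corresponding to "within 1σ" is not stated here; the function name and data layout are illustrative assumptions.

```python
def parameter_ranges(results, delta_chi2):
    """Largest deviation of each parameter from its best-fit value
    among grid points whose chi-square lies within delta_chi2 of the minimum.

    results: dict mapping parameter tuples -> chi-square value
    delta_chi2: acceptance threshold above the minimum (assumed input)
    """
    best = min(results, key=results.get)
    chi2_min = results[best]
    accepted = [p for p, c in results.items() if c - chi2_min <= delta_chi2]
    ranges = tuple(max(abs(p[i] - best[i]) for p in accepted)
                   for i in range(len(best)))
    return best, ranges
```

The resulting per-parameter ranges are the kind of input from which an energy scale covariance matrix could then be built, for example by re-simulating spectra with each parameter varied over its range.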
To reduce the required parameter space dimension and computing time, the detector energy resolution smearing, per-pulse 90 keV analysis threshold, and PG4 geometry are held constant for all simulated grid points. These features and their uncertainties are determined using separate information, such as QA/QC studies and detector surveys, or data analyses that are unaffected by PG4 energy response parameters. The per-segment 5 keV threshold uncertainty is defined as given in Section III G, and is propagated as both a segment-correlated and a segment-uncorrelated uncertainty. Energy resolution uncertainty is described in the following section. Finally, PG4 studies indicate that the limited precision in measurements of the optical reflector panel masses can cause modest variations in detector energy response. The size of this uncertainty was estimated using PG4, along with the mass measurement precision of 1.7 kg reported in Ref. [74]; this dead mass uncertainty is propagated in PG4 MC simulations as a 0.03 mm segment-correlated uncertainty in reflector thickness.
The overall agreement in measured and predicted response across the E rec energy distribution is further illustrated in Figure 15, which shows the ratio of the reconstructed γ-ray energy between data and the best-fit PG4 calibration dataset. This ratio is found to be unity within ±1% for all γ-ray calibration datasets used in the fit, with residuals all lying within the error bands defined by the energy model and per-segment energy threshold uncertainties. For the 12 B … data-PG4 β-particle E rec at higher energies. Data and PG4 12 B spectra are found to be most consistent when a relative shift of 0.38±0.41% is applied; given the close correspondence between the properties of 12 B electron and IBD positron kinetic energy depositions, this 0.41% precision in verifying predicted and measured 12 B energy scale agreement is also propagated as a segment-correlated energy scale uncertainty in the full detector response uncertainty covariance matrix.

FIG. 15. Ratios of γ calibration source reconstructed energy peak locations between data and PG4 MC simulations utilizing best-fit energy response modeling, plotted versus reconstructed γ-ray energy. Error bars indicate statistical and systematic uncertainties. Top: ratios for calibration source datasets used in the determination of the best-fit PG4 response model. Bottom: ratios for calibration source datasets taken during different run periods. Ratios for all datasets are within 1% of unity.
Similar data-PG4 comparisons are also shown in Figure 15 for γ-ray and 252 Cf calibration datasets acquired in August and December 2018, which were not used in the energy calibration procedure described above. Ratios are similarly statistically consistent with unity for these later datasets, demonstrating the stability of non-linearity effects and calibrated energy scales over time.
While not included in the energy response model fitting, a special December 2018 detector-center deployment of an 241 Am-9 Be source yielded a dataset containing 4.43 MeV γ-rays from de-excitation of the first excited state of 12 C following α-particle capture on 9 Be. These signals were measured preceding neutron capture signals by requiring prompt-delayed time and position coincidence criteria identical to the IBD selection. Cuts rejecting high-PSD pulses within the prompt cluster enabled reduction of proton recoil contamination of the 12 C de-excitation signature and more direct data-PG4 comparison of the monoenergetic γ-ray's energy deposition. As illustrated in Figure 15, and in more detail in Figure 16, the energy scale of this feature is also accurately predicted by the best-fit PG4 MC to within 0.5%, providing further confidence in PG4 modeling of response at high IBD energies.

FIG. 16. Data-PG4 comparison for 4.43 MeV γ-rays from de-excitation of the first excited state of 12 C following α-particle capture on 9 Be. This signal was extracted from data featuring detector-center deployment of an 241 Am-9 Be source.

C. Energy Resolution
The resolution in reconstructed energy distributions was also characterized for calibration γ-ray events. The PG4 energy model was smeared with a Gaussian distribution whose σ value was fitted with the resolution function

$$\frac{\sigma_E}{E_{rec}} = \sqrt{a^2 + \frac{b^2}{E_{rec}} + \frac{c^2}{E_{rec}^2}}$$

where the first term is dependent on light collection inefficiency variations, the second term represents energy-dependent photostatistics, and the third term is related to PMT and electronics noise. The best-fit energy resolution as a function of energy deposition is shown in Figure 17; best-fit resolution parameters are found to be (a,b,c) = (1.15%±0.47%, 4.61%±0.24%, 0.0+1.3%). The determined 1σ spread in best-fit parameters is assigned as a correlated energy resolution uncertainty between all segments. We note that since both data and MC include inherent energy smearing due to loss of energy in non-scintillating regions, this contribution is not reflected in the fit parameters or in Figure 17. This effective resolution contribution is characterized in Section V B.
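As a minimal sketch, a three-term resolution function of the kind described above can be evaluated as follows. The quadrature form σ/E = sqrt(a² + b²/E + c²/E²), with E in MeV, is the common parametrization and is assumed here; the function name is illustrative.

```python
import math

def fractional_resolution(e_mev, a, b, c):
    """Fractional energy resolution sigma/E at energy e_mev (MeV).

    a: constant term (light collection variations)
    b: photostatistics term, scaling as 1/sqrt(E)
    c: noise term, scaling as 1/E
    Assumes the three terms add in quadrature.
    """
    return math.sqrt(a ** 2 + b ** 2 / e_mev + c ** 2 / e_mev ** 2)

# Central values of the best-fit parameters quoted in the text
A_FIT, B_FIT, C_FIT = 0.0115, 0.0461, 0.0

for energy in (1.0, 2.22, 4.0):
    sigma_over_e = fractional_resolution(energy, A_FIT, B_FIT, C_FIT)
    print(f"E = {energy:4.2f} MeV: sigma/E = {100 * sigma_over_e:.2f}%")
```

With c at its central value of zero, the resolution is dominated by the photostatistics term at low energy and approaches the constant term a only well above the reactor antineutrino energy range.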

D. Determination of Position-Dependent Energy Response Variation
In addition to modeling absolute energy response effects in the PROSPECT detector center, PG4 MC simulations must also properly take into account position variations in IBD prompt E rec response due to proximity to the target boundary and to non-active segments. PG4 IBD MC simulations show that, to first order, variations in leakage of annihilation γ-ray energy into these regions result in a consistent shift in reconstructed IBD prompt E rec . Proper modeling of these leakage effects was verified by performing segment-by-segment E rec comparisons between data and Monte Carlo for multiple 22 Na source deployment locations. As a positron emitter, the 22 Na source reflects the change in IBD energy scales resulting from variations in annihilation γ-ray energy leakage with detector position, as well as the distribution of IBD positron annihilation γ-ray energies among different detector segments. The latter effect is reflected in the top panel of Figure 18, which shows the reconstructed spectrum from a 22 Na source deployed at z = 0 within a single ring of segments (four total segments) surrounding the 22 Na source calibration axis, and within three rings of segments (36 total segments). The best-fit PG4 energy response model is also included for comparison. Incorrect modeling of the topology of annihilation γ-ray energy deposition would produce data-PG4 deviations that vary between one-ring and three-ring distributions. On the contrary, both the shape and scale of the PG4 and data distributions show good general agreement for both the one-ring and three-ring cases. By minimizing the χ 2 between data and energy-shifted PG4 spectra, the relative data-PG4 scale shift for the one- and three-ring topologies is determined to be 4±1 keV and 5±1 keV respectively.
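The shift determination by χ 2 minimization can be illustrated with a simple scan. This is a sketch under stated assumptions: data and MC spectra share the same bin centers, the MC spectrum is shifted by linear interpolation, and the trial shifts are supplied as an explicit list. The function names are hypothetical.

```python
def interp(x, xs, ys):
    """Linear interpolation of (xs, ys) at x; clamps to the end values outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    lo, hi = 0, len(xs) - 1
    while hi - lo > 1:  # binary search for the bracketing bin centers
        mid = (lo + hi) // 2
        if xs[mid] <= x:
            lo = mid
        else:
            hi = mid
    t = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + t * (ys[hi] - ys[lo])

def best_shift(centers, data, sigma, mc, shifts):
    """Return the trial shift (same units as centers) that minimizes the
    chi-square between data and the MC spectrum moved up in energy by that shift."""
    def chi2(shift):
        return sum(((d - interp(c - shift, centers, mc)) / s) ** 2
                   for c, d, s in zip(centers, data, sigma))
    return min(shifts, key=chi2)
```

A grid scan is used here for transparency; an uncertainty on the shift could be estimated from the range of trial shifts whose χ 2 lies within a chosen tolerance of the minimum.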
Gamma energy leakage effects can also be demonstrated by comparing data and PG4 energy distributions for detector-center and detector-corner 22 Na deployments. Figure 18 also shows three-ring reconstructed energy distributions for a 22 Na deployment at z = 30 cm along the same calibration axis as above, and at z = 30 cm along a calibration axis bordering the corner of the detector's fiducial volume. Again, good general agreement is found between the shape of data and PG4 distributions. Relative data-PG4 scale shifts are found to be 8±1 keV and 7±1 keV for these two detector positions respectively. This study indicates that PG4 IBD MC simulations reproduce variations in prompt energy scale arising from annihilation γ-ray energy leakage with keV-level precision. A conservative ±8 keV uncertainty in prompt IBD energy scale due to modeling of annihilation γ-ray energy leakage is included as both a segment-correlated and segment-uncorrelated uncertainty in subsequent physics analyses.

FIG. 18. Reconstructed energy distributions for calibration and best-fit PG4 MC 22 Na source deployment datasets. Image insets depict the geometry of the source deployment axis and ring definitions. 'X' indicates an inactive segment; as this calibration run was taken earlier in the data-taking period, fewer dead segments are present in this analysis than in the IBD selection. Top: Detector-center source deployment segment-integrated energy distributions when either the nearest one or nearest three rings of detector segments are included in the integral. Bottom: Three-ring energy distributions for source deployments at z = 30 cm along detector center and detector corner calibration axes.

V. IBD EVENT SELECTION
Fewer than 1000 IBD positron-neutron pairs are expected to be produced per day in the PROSPECT inner detector by reactor antineutrinos during reactor-on data-taking periods. These IBD events exist amidst an intense background of reactor- and cosmogenically-produced γ-ray and neutron fluxes. To uncover this IBD signal, a highly effective selection based on pulse- and cluster-level reconstructed physics metrics must be performed. In the following section, we outline these selection criteria and discuss the expected stability of the resulting IBD detection efficiency.

A. Antineutrino Selection
The positron produced by a reactor antineutrino interaction in the PROSPECT scintillator will deposit up to about 8 MeV of kinetic energy in a small number of segments (usually 1, 2, or 3), with the highest energy deposition usually present in the segment in which the IBD interaction took place. The positron annihilates, almost always producing two 511 keV γ-rays, which will deposit energy in segments within a few tens of centimeters of the IBD interaction point. These positron-related low-density energy depositions occur on nanosecond timescales. Thus, the IBD selection requires an initial cluster with E rec between 0.8 and 7.2 MeV and individual reconstructed pulse PSD values all within 2.0 standard deviations of the calibrated electron-like PSD mean. No further cuts are made on the temporal or topological characteristics of the prompt cluster.
The IBD neutron is produced with less than a few tens of keV of kinetic energy and produces negligible scintillation light as it thermalizes. It then captures within a few tens of centimeters of the IBD interaction point with a roughly 50 µs time constant. Approximately 75 % of IBD neutrons capture on 6 Li, producing a 3 H-4 He pair with 0.526 MeV of total visible energy. The high ionization density tracks of the capture products terminate within micrometers of the neutron capture point, producing scintillation light in a single segment. Thus, the IBD selection requires a single-pulse cluster within an (E rec ,PSD) phase space consistent with n-Li capture. That phase space is defined using the Gaussian-shaped feature corresponding to cosmogenic n-Li capture events in this space (Figure 7), with energy required to be within ±3σ of the mean value of 0.526 MeV and PSD within ±2σ of the PSD mean value. This delayed cluster is required to occur within 120 µs of the prompt cluster; its segment S rec must be the same as or vertically/horizontally adjacent to that of the prompt cluster. If S rec are identical, the prompt-delayed Z rec difference is required to be less than 140 mm; if S rec are adjacent, Z rec spacing must be less than 100 mm.
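The prompt and delayed requirements above can be collected into a single predicate. This is a minimal sketch, not the PROSPECT analysis code: the `Cluster` record and its field names are hypothetical, PSD and n-Li energy offsets are assumed to be already expressed in units of their calibrated widths, cut boundaries are treated as inclusive, and the delayed cluster is assumed to follow the prompt cluster by a strictly positive time.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    t_us: float                 # cluster time, microseconds
    e_rec: float                # reconstructed energy, MeV
    seg: tuple                  # (column, row) index of the reconstructed segment S_rec
    z_mm: float                 # reconstructed position along the segment, mm
    pulse_psd_e: list           # per-pulse PSD offset from the electron-like mean, in sigma
    e_nli_sigma: float = 0.0    # energy offset from the n-Li capture peak, in sigma
    psd_nli_sigma: float = 0.0  # PSD offset from the n-Li capture mean, in sigma

def is_prompt(c):
    """Prompt cut: 0.8-7.2 MeV and every pulse within 2 sigma of electron-like PSD."""
    return 0.8 <= c.e_rec <= 7.2 and all(abs(p) <= 2.0 for p in c.pulse_psd_e)

def is_delayed(c):
    """Delayed cut: single-pulse cluster inside the n-Li (E_rec, PSD) window."""
    return (len(c.pulse_psd_e) == 1
            and abs(c.e_nli_sigma) <= 3.0
            and abs(c.psd_nli_sigma) <= 2.0)

def is_ibd_pair(prompt, delayed):
    """Apply prompt, delayed, time-coincidence and spatial-proximity cuts."""
    dt = delayed.t_us - prompt.t_us
    if not (is_prompt(prompt) and is_delayed(delayed) and 0.0 < dt <= 120.0):
        return False
    dx = abs(prompt.seg[0] - delayed.seg[0])
    dy = abs(prompt.seg[1] - delayed.seg[1])
    dz = abs(prompt.z_mm - delayed.z_mm)
    if dx + dy == 0:   # same segment
        return dz < 140.0
    if dx + dy == 1:   # vertically or horizontally adjacent segment
        return dz < 100.0
    return False       # diagonal or more distant segments are rejected
```

Expressing segment adjacency through the sum of column and row differences excludes diagonal neighbors, matching the vertical/horizontal adjacency requirement in the text.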
To remove activity associated with cosmogenic muons and other high-energy events capable of creating significant numbers of delayed secondaries, IBD candidates are rejected if their delayed capture times are within 200 µs of a preceding cluster with E rec > 15 MeV; this cut is referred to as a 'muon veto.' To similarly reject cosmogenic neutron-related activity, IBD candidates are rejected if their delayed capture occurs within 400 µs of another n-6 Li candidate, or within 250 µs of a preceding cluster with E rec > 0.25 MeV and at least one pulse with a PSD larger than 2σ above the peak of the electron-like PSD band. These cuts are referred to as the 'neutron veto' and 'recoil veto,' respectively. These three cuts are also referred to collectively as a 'cosmic veto.' IBD candidates are also rejected if either cluster occurs within 0.8 µs of a previous cluster; this cut, referred to as the 'pile-up veto,' reduces ambiguities in the calculation of trigger-related dead times.
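The veto logic above amounts to a set of time-window checks. The following is a minimal sketch, assuming the times of vetoing clusters (in µs) have already been sorted into lists by class. The neutron veto is applied in both time directions, since the text says "another n-6 Li candidate" rather than a preceding one; that reading, and the function names, are assumptions.

```python
MUON_WINDOW_US = 200.0     # after clusters with E_rec > 15 MeV
NEUTRON_WINDOW_US = 400.0  # around other n-Li capture candidates
RECOIL_WINDOW_US = 250.0   # after high-PSD clusters with E_rec > 0.25 MeV
PILEUP_WINDOW_US = 0.8     # after any previous cluster

def _preceded_within(t, times, window):
    """True if any time in `times` precedes t by no more than `window`."""
    return any(0.0 < t - other <= window for other in times)

def _near_within(t, times, window):
    """True if any other time lies within `window` of t, before or after."""
    return any(0.0 < abs(t - other) <= window for other in times)

def passes_cosmic_veto(t_delayed, muon_times, nli_times, recoil_times):
    """Muon, neutron and recoil vetoes, all evaluated at the delayed capture time."""
    return not (_preceded_within(t_delayed, muon_times, MUON_WINDOW_US)
                or _near_within(t_delayed, nli_times, NEUTRON_WINDOW_US)
                or _preceded_within(t_delayed, recoil_times, RECOIL_WINDOW_US))

def passes_pileup_veto(t_prompt, t_delayed, all_cluster_times):
    """Reject the candidate if either cluster closely follows a previous cluster."""
    return not (_preceded_within(t_prompt, all_cluster_times, PILEUP_WINDOW_US)
                or _preceded_within(t_delayed, all_cluster_times, PILEUP_WINDOW_US))
```

Requiring a strictly positive time separation keeps a candidate's own delayed capture from vetoing itself when it also appears in the n-Li list.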
PG4 MC simulations of cosmogenic processes also indicate that neutron-related backgrounds are concentrated on the edges of the active region [68]; for this reason, IBD candidates are rejected if their prompt or delayed S rec is within the outermost layer of segments on the detector top and sides. Signals in two segments in the bottom back corner of the detector are similarly rejected due to high reactor-on trigger rates in these segments from reactor γ-ray backgrounds. IBD candidates are also rejected if prompt or delayed Z rec values are within 140 mm of the segment ends. These segment and z-end exclusion cuts are referred to as 'fiducialization' in following sections.

Figure 19 illustrates the reduction in IBD candidates upon sequential application of the IBD selection cuts described above during reactor-on data-taking; distributions include subtraction of accidentally time-coincident backgrounds, which is described in Section VI. A two to three order of magnitude reduction in IBD candidates is observed after all cuts are applied. The reactor-on prompt E rec distribution in Figure 19 exhibits a smooth event distribution peaking between 2 and 3 MeV and falling at higher energies, consistent with the expected energy distribution of reactor ν e IBD interactions; however, peak-like features also appear in this distribution, indicating the residual presence of background IBD candidates. The PSD distribution in Figure 19 exhibits a double-humped structure matching that expected from prompt IBD positrons (low PSD) and prompt nuclear recoils (high PSD), gamma interactions from inelastic scatters (low PSD), and captures (high or low PSD for captures on 6 Li and hydrogen, respectively) produced by cosmogenic neutrons. We note that due to integration over a broad energy and time range, the high and low PSD distributions observed in Figure 19 are smeared out and provide an incomplete representation of the detector's true PSD separation capability.
IBD candidates are also investigated in Figure 20 by simultaneously plotting the PSD and energy distributions for the most restrictive selection given in Figure 19. Pictured are the total summed prompt E rec , as well as the PSD value for the pulse of highest reconstructed energy within the prompt cluster. The elongated band at low PSD value represents the area containing all selected IBD candidates, as well as a subset of non-IBD events containing sub-dominant prompt cluster pulses with high PSD values, e.g. due to the recoil from inelastic scattering.

FIG. 20. Pictured are the total prompt E rec as well as the PSD value for the pulse of highest reconstructed energy within the prompt cluster. The labelled regions contain IBD candidates (red), coincident (n-Li,n-Li) captures (blue), (n-p,n-Li) scattering and capturing fast neutrons on protons (magenta) and other heavier nuclei (black). We note that a subset of prompt clusters inside the IBD candidate labelled band will also contain high-PSD pulses, and will not be selected as IBD candidates.
Two other regions of potential IBD-like backgrounds are also highlighted in Figure 20. One isolated region at low energy and high PSD is produced by the time-coincident captures of two neutrons on 6 Li, which are a signature of multi-neutron cosmogenic showers. Another region inhabiting a broad energy range at high PSD is produced by the scattering and subsequent 6 Li capture of a single energetic cosmogenic neutron. These event classes, designated (n-Li,n-Li) and (n-p,n-Li), will be used to further investigate the impact of multi-neutron showers and high-energy cosmogenic neutrons on PROSPECT signals. In these investigations, the latter (n-p,n-Li) class will also include rejected events in the IBD-like band of Figure 20 that contain a sub-dominant high-PSD prompt cluster pulse. The prompt parameter distribution in Figure 20 clearly demonstrates the highly effective reduction in copious multi-neutron and fast-neutron backgrounds made possible by PROSPECT's prompt PSD capabilities. Interestingly, an additional band visible at higher prompt PSD than the (n-p,n-Li) events is likely produced by fast neutron recoils on other heavier nuclei.

B. IBD Monte Carlo Simulation
After the parameter optimization described in the previous sections, PG4 IBD MC simulation datasets can be produced that include effects of energy response non-linearity, IBD positron energy loss and energy leakage, and energy resolution smearing. At the same time, the IBD MC must also accurately model the position distribution of IBD interactions within the PROSPECT detector, the behavior of IBD neutrons as they propagate through the detector, and detection efficiency variations associated with the IBD selection criteria. All of these aspects of the simulation are required to fully characterize the relationship between true ν e energy spectra and prompt IBD E rec spectra, which is essential when comparing predicted oscillated and unoscillated reactor ν e flux models to selected IBD candidates.
In the PG4 IBD MC simulation, an IBD vertex positioner module is first used to ensure proper placement of IBD interactions throughout the inner detector. To first order, IBD vertices are distributed according to a 1/r 2 distribution in the inner detector given the known reactor-detector center-to-center baseline reported in Section II. Vertices are generated in all detector materials, including the scintillator, optical grid components, PMT housing faces, and acrylic supports; vertex densities are varied to properly account for relative proton density differences between the materials in these different components. Vertex locations can be generated using either a point-like core geometry, or one incorporating the finite cylindrical shape of the reactor core as described in Section II. For the purpose of generating descriptions of detector IBD energy response, the point-like and cylindrical core geometries yield nearly identical results; given its quicker processing time, the point-like geometry is used. For the purpose of generating realistic distributions of ν e production-interaction baselines for the oscillation analysis, the cylindrical reactor geometry is used.
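To first order, the vertex placement described above weights each volume element by its proton density divided by the squared distance to the core. The following is a minimal sketch for the point-like core case; the function name, the coordinate convention, and the use of volume-element centers are illustrative assumptions rather than the PG4 vertex positioner itself.

```python
def vertex_weights(centers_m, core_m=(0.0, 0.0, 0.0), proton_density=None):
    """Relative IBD vertex probability per volume element for a point-like core.

    centers_m: list of (x, y, z) centers of equal-volume elements, in meters
    core_m: position of the point-like core, in meters
    proton_density: optional relative proton density per element (defaults to 1)
    Returns weights proportional to density / r^2, normalized to sum to 1.
    """
    if proton_density is None:
        proton_density = [1.0] * len(centers_m)
    raw = []
    for (x, y, z), rho in zip(centers_m, proton_density):
        r2 = (x - core_m[0]) ** 2 + (y - core_m[1]) ** 2 + (z - core_m[2]) ** 2
        raw.append(rho / r2)
    total = sum(raw)
    return [w / total for w in raw]
```

For example, two identical elements at the 6.7 m and 9.2 m ends of the baseline range quoted in the abstract would receive weights in the ratio (9.2/6.7)², roughly 1.9. A finite cylindrical core would replace the single distance with an average of 1/r² over sampled production points in the core.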
Final state positrons and neutrons are generated at each simulated IBD interaction vertex with kinetic energy and momentum distributions defined by the IBD cross-section [32] given the incoming neutrino direction and energy. At reactor ν e energies, this will produce IBD positrons (neutrons) with momenta preferentially directed back towards (away from) the reactor core. IBD neutrons, produced with O(keV) energies, will thermalize and scatter prior to capture. The Geant4 libraries "G4NeutronHPElastic" and "G4NeutronHPThermalScattering" are implemented to model the propagation above and below 4 eV, respectively; the latter is modelled assuming thermal scattering by unbound hydrogen atoms. IBD positrons are propagated using the default Geant4 "emstandard" libraries. The simulated detector geometry, translation from scintillator-deposited true energy to quenched energy, and PMT waveform simulation are as described in Section IV A.
The position-integrated relationship between E ν and E rec for the full IBD MC is illustrated in Figure 21; this detector response matrix is directly used in the PROSPECT 235 U spectrum analysis, and is included in tabulated form in the attached supplementary materials. The matrix is generated using only output from the active detector segments used in these analyses. For the oscillation analysis, similar E ν to E rec translation matrices are also generated separately for all individual PROSPECT segments. To simplify the generation of these per-segment matrices and address ambiguities related to true ν e baselines, only MC IBD events with prompt S rec containing the true IBD vertex are considered. While this choice reduces the IBD MC sample by 3% for each active detector segment and ignores signal candidates from IBD interactions in inactive segments, these exclusions are found to produce negligible bias in the oscillation fit.

Figure 21 also includes an illustration of the E ν -E rec relationship for 4.0 MeV of monoenergetic ν e energy, corresponding to a vertical slice of the full detector response matrix. This distribution is accompanied by the true full-energy prompt positron signature expected from a 4.0 MeV neutrino as described by Equation 3, smeared by the 5.5% photo-statistics energy resolution realized in the reconstructed IBD dataset. The added resolution smearing contributed by positron kinetic energy loss in non-active materials and annihilation γ-ray energy leakage is obvious here, and dominates the smaller photo-statistics resolution effect. A large off-diagonal contribution can be seen at low E rec arising largely from positron kinetic energy deposition in non-active detector regions. A relative offset between full and reconstructed energy peaks is also visible; this feature is a byproduct of both a mean per-event energy loss in non-active materials, as well as scintillator non-linearity effects which categorically reduce reconstructed energies below that of the true deposited energy.

FIG. 21. Top: PROSPECT detector response matrix describing the relationship between true ν e and reconstructed energies, as modelled by the best-fit PG4 detector simulation. The matrix is generated using only output from the active detector segments used in the oscillation and spectrum analyses. Bottom: PG4-modelled E rec distribution in response to mono-energetic 4.0 MeV ν e evenly distributed throughout the detector. A photostatistics-smeared, full-energy peak from this source is also plotted; see the text for detailed description of these distributions.
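Applying a detector response matrix of this kind to a predicted ν e spectrum is a matrix-vector product. The sketch below assumes the matrix is stored as a list of rows indexed by E rec bin, with columns indexed by true E ν bin; that layout, the function names, and the toy values are illustrative only and are not taken from the tabulated supplementary matrix.

```python
def fold(response, true_spectrum):
    """Predicted E_rec spectrum from a true-energy spectrum.

    response[i][j]: expected fraction of IBD events from true-energy bin j
                    that are reconstructed in E_rec bin i
    true_spectrum[j]: predicted IBD counts in true-energy bin j
    """
    return [sum(r_ij * s_j for r_ij, s_j in zip(row, true_spectrum))
            for row in response]

def monoenergetic_slice(response, j):
    """E_rec distribution for unit input in true-energy bin j (one matrix column)."""
    return [row[j] for row in response]

# Toy 3 x 2 matrix: off-diagonal entries mimic energy lost to non-active material
toy_response = [
    [0.2, 0.0],
    [0.5, 0.3],
    [0.0, 0.6],
]
toy_prediction = fold(toy_response, [100.0, 50.0])
```

In this convention a mono-energetic input selects a single column, which plays the role of the 4.0 MeV slice discussed above. Columns need not sum to one, since events can be lost to selection inefficiency.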

C. IBD Detection Efficiency Variations
The efficiency of analysis cuts in selecting IBD interactions in active fiducial segments is estimated to be 30-40% using PG4 IBD MC simulations. Some cuts are highly efficient: requirements on prompt E rec and PSD, prompt-delayed time coincidence, and segment and z prompt-delayed spatial proximity cuts each remove less than 10% of IBD events. Delayed cluster cuts are ∼70% efficient, largely due to IBD neutron captures on nuclei other than 6 Li. Cosmogenic and closely-spaced cluster veto cuts remove ∼12% (10%) of the total detector live time during reactor-on (off) periods. An additional ∼25% inefficiency is introduced by z-fiducialization of each segment. Precise quantitative demonstration of these detector-wide efficiencies is not elaborated upon further as this quantity is not a necessary input for the spectrum or oscillation measurements presented in this paper.
In contrast, relative variations in efficiency between segments, and between time periods, are important for both reported measurements, and must be characterized. Due to edge effects and non-active detector segments, the efficiency of IBD detection is expected to be position-dependent in PROSPECT. Relative efficiency variations between segments, if not correctly characterized, can mimic baseline-dependent ν e disappearance effects for low-∆m 2 scenarios. Segments with relatively high detection efficiencies also play an outsized role in determining baseline-integrated detector energy response; thus, an understanding of the fractional signal contribution of each segment is a necessary input to comparing predicted and detected 235 U ν e spectra. Variations in detector performance exhibited by PROSPECT also result in time-varying IBD detection efficiency, which complicates the subtraction of backgrounds from the IBD signal. The remainder of this section will characterize IBD efficiency variations observed or expected in the PROSPECT detector, and describe any uncertainties or biases in the IBD signal associated with these variations.

Position-Dependent Efficiency Variations
The primary driver of IBD selection efficiency non-uniformity with position is neutron mobility. Thermalizing IBD neutrons can migrate out of the active detector region, or into nearby inactive segments, where they are not detected. The magnitude of this effect is well-demonstrated in Figure 22, which shows the simulated efficiency of detecting IBDs generated in each active fiducial segment, relative to the segment of highest efficiency. Efficiencies are found to be as much as 25% lower in segments adjacent to larger numbers of inactive or non-fiducial segments.

FIG. 22. Simulated IBD detection efficiency for each PROSPECT segment, given relative to the segment of highest efficiency. The uncertainty for the relative efficiency of each segment is 0.5%. Also pictured is the default PROSPECT segment numbering scheme.
Neutrons produced by a 252 Cf source deployed for 1 hour along a calibration axis near the detector center were used to verify the modelling of neutron mobility by PG4. 252 Cf neutron signals were selected by requiring time- and position-coincident clusters from prompt low-PSD fission γ-rays and delayed high-PSD fission neutron-6 Li captures. Figure 23 demonstrates the fractional contribution of 252 Cf neutron captures in the different regions surrounding the source deployment axis. PG4-simulated fractional contributions using a custom 252 Cf generator are also pictured. Good agreement is exhibited between predicted and measured capture locations.
The mobility of the IBD positron and its annihilation γ-rays will also generate a segment-dependent variation in IBD cut selection efficiency. However, this effect is substantially smaller than that of neutron mobility: as an example, PG4 IBD MC predicts that the S rec for a selected IBD will differ from the IBD interaction segment only 3% of the time, compared to a 25% migration fraction for delayed neutrons. The small mobility effect, combined with relatively loose cuts applied to prompt cluster energies and the absence of cuts on prompt signal topology, results in a percent-level efficiency variation associated with the prompt signal.
Since prompt and delayed mobility effects are modelled in PG4, their impact on IBD selection efficiency is taken into account in the oscillation and spectrum analyses.

FIG. 23. The inset image defines which segments are assigned to which region bin. In this inset, 'X' indicates an inactive segment; as this calibration run was taken earlier in the data-taking period, fewer dead segments are present in this analysis than in the IBD selection. Blue dots represent data, while red lines represent PG4 MC simulations.
Minor IBD segment-to-segment signal rate variations from a variety of other sources were also investigated. Given their small size, the following effects were not included in PG4 MC simulations. Instead, they were encapsulated in segment-uncorrelated signal rate systematic uncertainty estimates, along with uncertainties in the PG4-modelled efficiency variations.
Detected IBD rate variations can arise from differing segment masses. Owing to the mm-level survey of the optical grid during detector assembly and the rigid optical grid mechanical structure, realized segment geometries are expected to have volumes identical to within 1%. Differences in fiducialization efficiencies can arise from inconsistent z rec between segments. As described in Section III H, z offsets between segments are less than 1 cm, while z resolutions for ( 219 Rn, 215 Po) events vary between segments by less than 1 cm. Given the 89 cm fiducial segment length, this per-segment resolution variation corresponds to less than 2% variation in z fiducialization efficiency per segment. Characterization of the combined effects of variation in segment volumes and z-fiducialization can be performed by comparing rates of detection of uniformly-distributed correlated ( 219 Rn, 215 Po) decays between fiducial segments, which can be selected with extremely high efficiency (>99.9%) and purity. As demonstrated in Figure 24, rates are found to be similar in all fiducial segments to within ±2%.
Given the comparatively high PSD cut efficiencies and relatively consistent segment-to-segment PSD response, PSD cuts are expected to introduce negligible segment-to-segment variation in detected IBD signal rates. Cosmogenic and other IBD veto cuts are applied at the full-detector level and are also expected to have negligible impacts on relative IBD signal rates. Since none of the possible sources of IBD rate variation between segments for the oscillation analysis described above exceeds 2%, a conservative 5% segment-uncorrelated efficiency uncertainty is applied.

FIG. 24. Relative rate of detected correlated ( 219 Rn, 215 Po) decays from 227 Ac, as calculated for each fiducialized segment. Segment numbers increase from detector bottom rows to top rows (increasing y), as illustrated in Figure 22; within a row, segment numbers increase with increasing x. All values are given relative to the mean of 3.3 mHz. Error bars represent statistical uncertainties.

Time-Dependent Efficiency Variations
A variety of time-dependent aspects of the IBD selection have been identified. Many, such as variations in the optical and PSD performance of the detector, have been effectively mitigated during the process of low-level detector calibration, as described in Section III. Remaining time-dependent aspects of the selection after calibration must be quantified and either corrected or taken into account in uncertainty estimates. The primary source of time-dependence in detected IBD-like rates arises from changes in dead time fractions from muon, neutron, recoil, and pile-up veto cuts, which were described in Section V A. These effects are illustrated in Figure 25, which shows, as a function of time, the fractional detector-wide dead time associated with these cuts. Veto fractions are systematically higher while the reactor is running. In addition, veto dead time fractions vary within individual reactor-on periods, while also increasing systematically with time during both reactor-on and reactor-off periods. Clearly, dead time differences must be precisely corrected in order to compare IBD-like rates between different time periods.
To understand these veto fraction time variations, rates of the various vetoing event classes are investigated. Table II summarizes the rates of these and other event classes. Inclusive trigger rates, which determine the pile-up veto dead time, are the highest shown in Table II, and exhibit substantial on-off differences. However, given the short veto window length for this veto (0.8 µs), this cut produces less than 1% dead time during both reactor-on and reactor-off periods.

TABLE II. Rates, barometric coefficients, and on-off scaling coefficients for different types of single (top) and correlated (bottom) event categories; barometric and scaling coefficients are used for cosmic background estimation in Section VI B. When relevant, the IBD veto cut associated with the listed event type is specified. Time-integrated rates, as well as rate differences between reactor-on and -off periods, are provided. Given the large non-atmospheric time-dependent changes in single n-p and single cluster detection rates, atmospheric coefficients and reactor-off background scaling coefficients are not calculated for these signals.
Muons represent the second most common veto event class but exhibit relatively little variation between reactor-on and reactor-off periods. Thus, while the comparatively longer veto window length of this class (200 µs) produces the largest overall dead time contribution, it is relatively constant between reactor-on and reactor-off data periods. Recoil vetoes, the next most common class, exhibit relatively high rates as well as substantial on-off variation. This class largely arises not from true neutron-proton recoils, but from the small fraction of γ-ray flux in the high tail of the γ-like PSD distribution. Gamma fluxes vary substantially between reactor-on and -off periods and within individual reactor-on periods; see Section VI A for an in-depth description of these variations. Moreover, the slow reduction in light yield described in Section III expands the overlap between high-PSD and low-PSD bands over time.
For these reasons, this event class contributes the majority of time-dependence in total veto dead times. Neutron vetoes exhibit the lowest rate of any veto class, and contribute less than 1% to total dead time.
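The relation between veto rate, veto window length, and dead time discussed above can be illustrated with a minimal sketch. It assumes that veto-inducing events arrive as a Poisson process and that overlapping veto windows are counted once, so that the dead fraction is 1 − exp(−rate × window). The window lengths (0.8 µs and 200 µs) are taken from the text; the rates are hypothetical placeholders, not the Table II values, and the function names are illustrative only.

```python
import math

def veto_dead_fraction(veto_rate_hz, window_s):
    """Fraction of time covered by fixed-length veto windows opened by
    Poisson-distributed veto events (overlapping windows counted once)."""
    return 1.0 - math.exp(-veto_rate_hz * window_s)

def combined_dead_fraction(fractions):
    """Combine veto classes assumed independent: live fractions multiply."""
    live = 1.0
    for f in fractions:
        live *= (1.0 - f)
    return 1.0 - live

def live_time_corrected_rate(counts, wall_time_s, dead_fraction):
    """Event rate per unit live time rather than per unit wall-clock time."""
    return counts / (wall_time_s * (1.0 - dead_fraction))

# Window lengths from the text; rates are HYPOTHETICAL placeholders.
pileup = veto_dead_fraction(veto_rate_hz=1.0e3, window_s=0.8e-6)
muon = veto_dead_fraction(veto_rate_hz=5.0e2, window_s=200e-6)
total = combined_dead_fraction([pileup, muon])

print(f"pile-up dead fraction: {pileup:.4%}")
print(f"muon dead fraction:    {muon:.4%}")
print(f"combined:              {total:.4%}")
```

Even with a much higher rate, the short pile-up window yields a far smaller dead fraction than the 200 µs muon window, which is the qualitative behavior described in the text.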
A sub-dominant additional source of time-dependence in the IBD selection is the reduction in the fraction of neutrons capturing on 6 Li with time. Figure 26 shows the increase in average capture time and the n-H capture fraction for cosmogenic neutrons. Capture times are obtained by fitting coincidence time distributions between prompt recoil and delayed capture signals with the same coincidence and veto requirements as for IBD-like events; this event class, called (n-p, n-Li), was previously described in Section V A. For 6 Li capture fractions, n-Li cuts are identical to those applied in the IBD analysis, while n-H captures are delayed clusters with energies within 2.0σ of the γ-like PSD band and 3.0σ of the n-H peak energy. The ratio of n-H to n-Li captures increases from 12.6% to 13.2% over the course of the physics dataset. As IBD cuts select only 6 Li captures, this change will translate to a ∼0.7% reduction in IBD detection efficiency over the course of the physics dataset. A decrease in 6 Li capture rates resulting from a small reduction in 6 Li concentration in the scintillator should be accompanied by an increase in capture times towards that expected in a pure hydrocarbon environment (∼200 µs). Figure 26 shows such an increase for the same (n-p, n-Li) dataset, from 49.1 µs to 50.2 µs. Using PG4, this 1 µs change in capture time for IBD events is found to produce a 1-2% reduction in coincidence time cut efficiency. PG4 MC simulations also verify that both the change in capture time and n-H capture fraction are consistent with a fractional reduction of approximately 3% in the scintillator's 6 Li content. Capture time variations of generally similar magnitude appear to be present in all regions of the fiducial volume for this dataset within ± 1 µs, with lower (higher) increases observed in the bottom-most (top-most) row of fiducial detector segments. These changes are found to have negligible impact on PG4-predicted prompt energy spectra.
If these two sub-dominant aspects of time dependence (increase in capture time and increase in n-H capture fraction) observed in various non-IBD event samples are combined, a position-integrated reduction in IBD detection efficiency of 2-3% should be expected over the course of the physics dataset. Interestingly, a reduction in ( 219 Rn, 215 Po) event rates 3% greater than that expected based on the 21.8 y 227 Ac half-life is also observed during the same physics dataset [78]. The general correspondence between IBD and ( 219 Rn, 215 Po) rate variations, common doping chemistry for 227 Ac and 6 Li, and neutron capture time and n-H fraction variations all appear to be consistent with a reduction in dopant concentration in the PROSPECT scintillator bulk; further dedicated chemical measurements of PROSPECT scintillator samples must be performed to verify this explanation.
Finally, as mentioned in Section III H, modest degradation has been observed in the resolution of Z rec for ( 219 Rn, 215 Po) events. Using PG4 IBD MC simulations, a similar broadening of the prompt-delayed z coincidence distribution is estimated to produce less than 0.5% reduction in IBD detection efficiency. This variation is also found to have no impact on PG4-predicted prompt energy spectra.
The impact of these sub-dominant time-dependent IBD efficiency variations on high-level spectrum and oscillation analyses is expected to be negligible. For both analyses, variations in detection efficiency can complicate the scaling of reactor-off IBD candidate datasets to subtract cosmogenic backgrounds during reactor-on periods. This background-subtraction procedure, described in more detail in Section VI, is relatively insensitive to monotonically decreasing efficiency due to the interleaved nature of reactor-on and reactor-off datasets. As demonstrated in Figure 12, linearly time-dependent quantities, such as the z-coincidence width for ( 219 Rn, 215 Po) events, exhibit reactor on-off variations more than an order of magnitude smaller than variations between the beginning and end of the physics dataset. Any residual reactor on-off background scaling ambiguities or biases arising from detection efficiency variations are smaller than other sources of background scaling uncertainty; these additional uncertainties are discussed in more detail in the following section.
For both the spectrum and oscillation analyses, any impact of efficiency time-dependence is minimized by the lack of substantial energy-dependence related to the effect. Regarding baseline dependence, which is most relevant to the oscillation analysis, statistical uncertainties on the baseline-uniformity of efficiency variations are smaller than the 5% per-segment normalization uncertainties defined above.

VI. BACKGROUNDS
An array of backgrounds related to the reactor and cosmogenic activity accompany the IBD signal after the selection cuts in Section V are applied. The following section describes these various background sources.

A. Accidental Backgrounds
Single γ-rays and single neutron captures from uncorrelated physics events can deposit energy in the PROSPECT detector in close enough spatial and temporal proximity to pass all IBD selection cuts. This category of background event is more common during reactor-on periods because of increased γ-ray fluxes from the reactor and nearby neutron scattering experiments. This variation is illustrated in Figure 27. IBD prompt-like singles span a broad energy range during reactor-on periods, with a substantial high-energy contribution from reactor neutron capture on structural materials in the reactor building; prompt-like energy spectra soften substantially during reactor-off periods. Also visible in Figure 27 is an increasing rate of single IBD delayed-like events, with a noticeable difference in rates between reactor-on and reactor-off periods. This effect can be explained by noting, as discussed in Section V C, that a substantial proportion of high-PSD signals, including delayed-like events, are contributed by a small proportion of the plentiful γ-related activity with statistically high PSD values.
The spatial distributions of prompt-like and delayed-like signal rates in the detector are shown in Figure 28. During reactor-on periods, prompt-like singles rates are found to be 2-10 times higher in segments in the bottom back (high-x, low-y) corner of the detector. This region of the detector receives comparatively less protection from the under-detector concrete monolith and from lead shielding lining the detector movement chassis. During reactor-off periods, prompt-like singles rates are found to exhibit substantially less segment dependence, with rates roughly two times lower in detector-interior segments. Delayed-like singles rates per segment are also found to be comparatively uniform, with roughly a factor of two variation across the detector during both reactor-on and reactor-off periods. The rate and physics properties of accidental backgrounds for this analysis were determined by collecting cluster pairs that pass all IBD cuts, with the exception of an altered (-12,-2) ms requirement on prompt-like cluster time with respect to the delayed-like cluster. This time separation window excludes all relevant physics-correlated events, giving a pure, high-statistics accidental background sample identical to that in the IBD-like time coincidence window. After scaling this sample to account for the relative difference in coincidence time window lengths, it is directly subtracted from the IBD candidate sample with negligible associated uncertainty.
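The off-window accidental estimate described above amounts to a simple rescaling by the ratio of window lengths. The sketch below illustrates this under stated assumptions: the 10 ms off-window length follows from the (-12,-2) ms requirement in the text, while the 119 µs IBD coincidence window length and the off-window pair count are assumed values introduced only for illustration, since neither is given in this section.

```python
import math

def accidental_estimate(n_off_window, off_window_s, coincidence_window_s):
    """Scale accidental pairs counted in a long off-time window down to the
    IBD coincidence window. Returns (estimate, statistical uncertainty)."""
    scale = coincidence_window_s / off_window_s
    estimate = scale * n_off_window
    # Poisson uncertainty on the off-window count, scaled by the same factor.
    sigma = scale * math.sqrt(n_off_window)
    return estimate, sigma

OFF_WINDOW_S = 10e-3            # length of the (-12, -2) ms window from the text
COINCIDENCE_WINDOW_S = 119e-6   # ASSUMED IBD coincidence window length
N_OFF_WINDOW = 2_400_000        # HYPOTHETICAL number of off-window pairs

est, sig = accidental_estimate(N_OFF_WINDOW, OFF_WINDOW_S, COINCIDENCE_WINDOW_S)
print(f"accidentals in IBD window: {est:.0f} +/- {sig:.1f}")
```

Because the off-window is far longer than the coincidence window, the scaled uncertainty is much smaller than the square root of the estimate itself, which is why the subtraction carries a negligible associated uncertainty.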

B. Cosmogenic Time-Correlated Backgrounds
As the PROSPECT detector is situated underneath minimal (<1 meter water equivalent) overburden, substantial contributions of time-correlated prompt-like and delayed-like cluster pairs are expected from cosmogenic muon and neutron fluxes. Some are included in the IBD candidate sample despite the dedicated cosmogenic veto cuts described in Section V A. To estimate the contribution of these backgrounds to the IBD candidate sample collected during reactor-on data-taking, identical IBD selection cuts are also applied to the reactor-off dataset. Accidental backgrounds in the reactor-off IBD candidate dataset are similarly calculated and subtracted as described in the previous sub-section. The prompt E rec spectrum of the reactor-off IBD candidate dataset, pictured in Figure 29, exhibits contributions from three primary event categories. A peak in the spectrum centered at 2 MeV is characteristic of an n-H capture; this feature can be caused by multi-neutron cosmogenic showers in which two neutrons of low incident energy capture within the inner detector. A peak in the spectrum centered at 4.5 MeV is characteristic of the 4.43 MeV γ-ray line of the first excited state of 12 C; this feature is caused by the inelastic scatter and subsequent capture of one high-energy cosmogenic neutron in the detector. Finally, the continuum component of the spectrum encapsulates a combination of neutron-related processes, dominated by neutron-proton elastic scatters, inelastic neutron scatters, or a combination of these effects; both of these dominant continuum-contributing categories are produced by high-energy neutrons. PG4 MC simulations of pure cosmogenic neutron fluxes following the 'Goldhagen' spectrum of Ref. [79] are found to reproduce these primary features of the reactor-off IBD candidate spectrum. Simulations of primary cosmogenic neutrons and muons using the CRY cosmogenic simulator [80] indicate that neutrons are by far the dominant background source of these two.
We note that these cosmogenic PG4 MC simulations are not used in any aspect of the cosmogenic background estimation and subtraction process for PROSPECT physics analyses.

FIG. 30. Change in the rate of IBD-like events versus atmospheric pressure during reactor-off run periods. Each point represents one day of reactor-off data. The fitted trend is used to scale for the difference in average pressure between reactor-on and reactor-off data periods. The average pressure difference between reactor-on and reactor-off periods is much smaller than the range of pressures depicted in this Figure.

Reactor-off cosmogenic backgrounds are subtracted from the reactor-on dataset after appropriately scaling the reactor-off dataset's normalization for relative differences in detector live-time and relative differences in absolute cosmogenic flux due to variations in atmospheric pressure. Rate corrections for atmospheric pressure are calculated using procedures similar to those documented in Refs. [56,81]. Figure 30 demonstrates the correlation between cosmogenically-produced IBD-like event rates and atmospheric pressure during reactor-off periods. Using the fitted correlation coefficient also given in Table II, (-0.70±0.01) %/mbar, along with the small average atmospheric pressure difference between interleaved reactor-on and reactor-off periods, a nominal reactor-off cosmogenic normalization scaling factor of 1.00±0.03% is obtained.
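As a rough illustration of the pressure correction, the sketch below applies a linear barometric coefficient to the difference in average pressure between two periods. The coefficient of -0.70 %/mbar is the value quoted in the text; the linear form of the correction, the function name, and the two average pressures are assumptions made for this example, not values or procedures taken from the analysis.

```python
def pressure_scale_factor(coeff_pct_per_mbar, p_on_mbar, p_off_mbar):
    """Factor by which reactor-off cosmogenic rates would be multiplied to
    match reactor-on atmospheric conditions, assuming a linear dependence
    of the rate on pressure."""
    delta_p = p_on_mbar - p_off_mbar
    return 1.0 + (coeff_pct_per_mbar / 100.0) * delta_p

COEFF = -0.70            # %/mbar, fitted value quoted in the text
P_ON_MBAR = 1000.10      # HYPOTHETICAL average reactor-on pressure
P_OFF_MBAR = 1000.00     # HYPOTHETICAL average reactor-off pressure

factor = pressure_scale_factor(COEFF, P_ON_MBAR, P_OFF_MBAR)
print(f"pressure scale factor: {factor:.4f}")
```

With a pressure difference of a fraction of a millibar, the factor stays within a fraction of a percent of unity, in line with the near-unity scaling described above.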
Similar correlation coefficients were also determined for different cosmogenic physics event categories, including single muons, single n-Li captures, and time-coincident (n-p, n-Li) and (n-Li, n-Li) events. Associated correlation coefficients and on-off scaling factors for the various datasets are given in Table II. The scaling factors for all event classes are found to be within 0.1% of unity. Nonetheless, a conservatively assigned 0.5% uncertainty is used for the subsequent oscillation and spectrum analyses. We stress that atmospheric scaling factors are consistent between datasets in spite of relative offsets in absolute rates between reactor-on and reactor-off data periods, which were also given in Table II and discussed in Section V C.
During approximately 3 calendar days of reactor-off data-taking, the water level in the pool surrounding the reactor core was lowered from a nominal height of 3 m above the PROSPECT target volume y-center to 2 m below it. Water level changes, performed to enable direct access to regions of the core vessel, were documented in paper logs taken by reactor operations staff, and shared with the PROSPECT collaboration. If this effective reduction in nearby shielding material has a substantial impact on the rate of cosmogenic IBD-like backgrounds in PROSPECT, a background scaling factor similar to that generated for atmospheric pressure variations must be calculated and applied to these data periods.
The general accuracy of the water pool level documentation was verified with PROSPECT data by monitoring incident through-going muon rates at zenith angles corresponding to the location of displaced pool water. This analysis was enabled by a dedicated PROSPECT 3D muon tracking algorithm that exploits the relative charge and timing information from each PROSPECT segment. During periods of low pool water level, muon rates from these specific incident angles were found to increase by 2% relative to adjacent data periods; a comparison of these same periods integrating over all zenith angles yielded negligible relative increases.
Previously-discussed single n-p and single n-Li cosmogenic neutron event classes, whose average rates are given in Table II, were used to estimate variations in IBD-like backgrounds during low pool level periods. Comparing low pool level periods to nearby nominal pool level periods, rates of these two event classes are found to be unchanged within a conservative 2% envelope. Rates of IBD-like events during these two time period groups are also found to be identical within 2%. This 1.00 water pool scaling coefficient and its 2% uncertainty applies only to the 5% of reactor-off data experiencing low water pool levels. Thus, we apply no correction to account for this effect; this choice contributes negligibly (0.05%) to the previously-described correlated background atmospheric scaling uncertainty (0.5%).

C. Other Time-Correlated Backgrounds
A direct subtraction of the reactor-off backgrounds using the scale factor described above will not properly remove or account for any background component that scales differently in time than the cosmogenic flux. We have investigated three such background categories: neutrinos from spent nuclear fuel, time-coincident backgrounds from reactor γ-rays and neutrons, and time-correlated signals produced by radiogenic α-particles in the PROSPECT detector.
HFIR's spent nuclear fuel cores are stored in a pool directly adjacent to that housing the burning core, within 15 m of the PROSPECT detector. Due to the short cycle length for each HFIR core, the build-up of the long-lived fission products, such as 144 Ce, 106 Ru, and 90 Y, is low compared to commercial reactor fuel. Using HFIR's mean cycle length and thermal power, the energy released per 235 U fission from Ref. [82], and standard nuclear databases [83,84], daily spent nuclear fuel ν e contributions for each of the long-lived 235 U fission products were individually calculated for one HFIR core [69]. With conservative assumptions regarding spent core storage in the HFIR spent fuel pool, total spent fuel IBD contributions are found to be less than 0.1 per day, providing a negligible overall contribution to the IBD candidate dataset.
Fast neutrons are produced in the matrix of the nuclear reactor core, but are very efficiently thermalized and attenuated by the light water pool surrounding the nuclear reactor core. Nonetheless, it is possible to generate reactor-produced, physics-correlated cluster pairs in the PROSPECT detector, either through travel of multiple neutrons from the same fission event to the inner detector, or through inelastic scattering of fast reactor neutrons or high-energy reactor-related γ-rays. The former process is highly unlikely: with 10^19 HFIR-produced neutrons per second at 85 MWth, and a reactor-on trigger rate of 2 × 10^4 Hz, the probability of PROSPECT detecting one (two) HFIR neutron(s) per fission is certainly less than 10^−15 (10^−30). This estimate is highly conservative, considering the limited dependence of single n-Li and single n-p rates on reactor status as shown in Figure 27 and Table II. Nonetheless, such a probability indicates far less than 0.1 daily IBD candidates produced via this process.
Given the high observed rate of single high-energy prompt-like clusters shown in Figure 27, we also investigated the possibility of production of reactor-related correlated triggers from (γ,nγ) photo-neutron interactions in various PROSPECT detector materials, including lithium (scintillator), carbon (all components), deuterium (all components), boron (inner shielding), oxygen (water shielding), aluminum (inner tank), and lead (shielding). Considering incident γ-ray energies, relevant cross-sections, and relative abundances within the detector, photonuclear interactions in PROSPECT are vastly more abundant in its lead shielding than in any other detector component. The contribution of photonuclear interactions in lead to IBD-like signatures in PROSPECT was estimated by performing PG4 MC simulations of high-energy γ-rays outside the detector shielding package with a flux normalization and spectrum tuned to reproduce rates of high E rec prompt-like clusters in PROSPECT during reactor-on periods (as in Figure 27). These simulations estimate an IBD candidate rate of much less than one per day from this process.
A similar consideration of reactor-produced neutron inelastic scattering processes in PROSPECT again reveals its lead shielding as the primary site of these interactions. With γ-ray fluxes expected to be significantly higher than reactor neutron fluxes in the lead shield, and comparable (n,nγ) and (γ,nγ) cross-sections in the relevant energy ranges, the former process is unlikely to dominate the latter in producing IBD backgrounds in the PROSPECT target. If inelastic reactor neutron interactions closer to the detector target are non-negligible, we would also expect an observed increase in detected (n-p,n-Li) events in PROSPECT during reactor-on periods; as noted in Table II, we see no evidence of such an increase.
Time-correlated IBD-like background contributions from radiogenically-produced (α,n) interactions in organic scintillator detectors have been estimated by previous MeV-scale neutrino experiments [85,86]. The primary process considered in these experiments is the 13 C(α, n) 16 O* interaction, which produces time-coincident signals from a prompt high-energy de-excitation γ-ray and the delayed neutron capture. Daya Bay estimates IBD-like signal rates of roughly 0.005 per ton of scintillator per day from α-particle rates of roughly 0.5 Hz/ton [86]. As described in Table I, α-particles are primarily expected to be generated through decay products of 227 Ac deliberately doped into the PROSPECT scintillator, which has an observed 0.3 Hz rate in the fiducial volume.
Considering the Daya Bay α-induced IBD backgrounds per ton given above, 227 Ac chain products will generate much less than 0.1 IBD events per day in PROSPECT. Backgrounds from α processes on fluorine present in the PROSPECT optical grid's FEP linings were also considered and estimated to be negligible IBD contributors. It should also be noted that any time-stable radiogenic IBD background contribution would be identical between reactor-on and reactor-off periods, and would thus be properly removed during the subtraction of other correlated backgrounds.
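The order-of-magnitude argument for (α,n) backgrounds can be written out as a short calculation. This sketch assumes that the IBD-like yield scales linearly with the α-particle rate, using the Daya Bay figures and the 0.3 Hz 227 Ac rate quoted in the text. The optional multiplier for the five α decays in the 227 Ac chain is an added assumption about how α-particles might be counted, not a step taken from the analysis.

```python
# Figures quoted in the text for Daya Bay [86]
DAYA_BAY_IBD_PER_TON_DAY = 0.005    # IBD-like events per ton per day
DAYA_BAY_ALPHA_HZ_PER_TON = 0.5     # alpha rate per ton

def alpha_n_background_per_day(decay_rate_hz, alphas_per_decay=1):
    """IBD-like (alpha,n) events per day, ASSUMING the yield scales linearly
    with the alpha rate and ignoring differences in alpha energy spectra."""
    events_per_day_per_hz = DAYA_BAY_IBD_PER_TON_DAY / DAYA_BAY_ALPHA_HZ_PER_TON
    return events_per_day_per_hz * decay_rate_hz * alphas_per_decay

AC227_RATE_HZ = 0.3  # observed fiducial-volume rate quoted in the text

single = alpha_n_background_per_day(AC227_RATE_HZ)
# ASSUMPTION: count all five alpha decays in the 227Ac chain per parent decay.
chain = alpha_n_background_per_day(AC227_RATE_HZ, alphas_per_decay=5)

print(f"one alpha per decay:   {single:.4f} events/day")
print(f"five alphas per decay: {chain:.4f} events/day")
```

Under either counting, the estimate stays well below the 0.1 events per day threshold stated above.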

D. Background Subtraction Validation
In the following section we present analyses to demonstrate the reliability and accuracy of reactor-on background estimates. Consistency between IBD-like datasets from different time periods demonstrates proper understanding of the level of time-stability of the detector's energy scale and IBD-like background contamination. This comparison for two different reactor-off time periods is shown in Figure 31. To more closely mimic the distribution of reactor-on and reactor-off periods in time due to the short HFIR cycle length, the two periods chosen for comparison in Figure 31 are interleaved in time as shown in the figure inset; any systematic variation in efficiency or energy response occurring over extended timescales will have a reduced impact in this scenario. In addition, datasets are scaled to account for relative differences in atmospheric pressure between the two time period definitions; as in the comparison of reactor-on and reactor-off datasets, the scaling factor for this off-off comparison is much less than 1%. The reactor-off datasets show consistency with one another: comparison in the 0.8-7.2 MeV E rec range yields a χ 2 /DOF of 47.68/31. If the normalization is allowed to float between datasets, the best-fit offset in the 0.8-7.2 MeV energy range is found to be consistent with unity to 1% statistical precision.
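A comparison of two binned spectra of the kind quoted above can be sketched as a bin-by-bin χ² with uncorrelated uncertainties. This is a simplified illustration, not the analysis code: it assumes independent bins and independent datasets, and the example spectra are hypothetical.

```python
def chi2_between_spectra(a, sigma_a, b, sigma_b):
    """Chi-square between two binned spectra, ASSUMING uncorrelated bins
    and statistically independent datasets."""
    chi2 = 0.0
    for ai, sai, bi, sbi in zip(a, sigma_a, b, sigma_b):
        variance = sai ** 2 + sbi ** 2
        if variance > 0.0:
            chi2 += (ai - bi) ** 2 / variance
    return chi2

# HYPOTHETICAL two-bin spectra with their statistical uncertainties
spec_1, err_1 = [10.0, 20.0], [1.0, 2.0]
spec_2, err_2 = [12.0, 20.0], [1.0, 2.0]

chi2 = chi2_between_spectra(spec_1, err_1, spec_2, err_2)
dof = len(spec_1)  # no fitted parameters when normalizations are fixed
print(f"chi2/DOF = {chi2:.2f}/{dof}")
```

If a relative normalization is allowed to float, one degree of freedom is removed; where bins are statistically correlated, a full covariance matrix would be needed instead of the per-bin variances used here.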
PROSPECT IBD analyses rely on the correspondence of correlated IBD-like background rates and spectra between reactor-on and reactor-off periods. An explicit verification of this correspondence for IBD-like backgrounds is not possible, due to the presence of real IBD events during reactor-on periods. Instead, we have examined rates and spectra of correlated background event classes similar in appearance to IBD-like candidates in PROSPECT. Figure 32 shows the correspondence for two specific event classes for reactor-on and reactor-off periods. The first is IBD candidates rejected by a muon veto; muon cut definitions are outlined in Section V. This class is overwhelmingly produced by neutronic signatures related to the initial vetoing cluster, particularly coincident captures of multiple neutrons. Similar events in which the vetoing particle does not traverse the PROSPECT target represent one source of expected IBD-like background. This event class contains a small expected contamination from true IBD events accidentally appearing in the muon veto window; this contribution is removed by appropriately scaling and subtracting the observed background-subtracted IBD signal described in Section VII. The second event class is the (n-p,n-Li) dataset described in Table II and Figure 20: IBD candidates failing the prompt PSD requirement. These events are overwhelmingly produced by interaction of fast cosmogenic neutrons in the PROSPECT target. In addition, many of the aforementioned potential sources of reactor-related correlated IBD-like backgrounds would also produce events in this category. This event class contains negligible contamination from true IBD interactions. Figure 32 demonstrates on-off correspondence by plotting the prompt energy spectrum of each event class during reactor-off periods, as well as the residual prompt spectrum during reactor-on periods after properly scaling and subtracting out this reactor-off signal.
If reactor-off periods provide an accurate description of correlated backgrounds during reactor-on periods, the background-subtracted reactor-on signal for these event classes should be statistically consistent with no signal at all prompt energies. We note that in calculating statistical consistency for the muon-vetoed IBD-like event class, one must propagate statistical correlations between events and between prompt energy bins generated by the fact that many IBD-like candidates are often produced by the same cosmogenic interaction.
For muon-vetoed IBDs, we find that reactor-on residuals are statistically inconsistent with zero at 2.9σ confidence level in the vicinity of the n-H peak at 1.6-2.6 MeV prompt E rec . The amplitude of this deficit in reactor-on signal is -2% of the total reactor-off event class size in the n-H peak region, and shows no statistically significant variation with detector position. No statistically significant residual deficit or excess is observed in this event class outside the n-H peak region. IBD candidate events vetoed by preceding n-p recoil signatures also exhibit a similar -3% offset in the n-H peak region during reactor-on periods.
Meanwhile, the (n-p,n-Li) event class pictured in Figure 32 shows a substantially smaller reactor-on residual excess: in the 0.8-7.2 MeV IBD prompt E rec range, the offset is (+0.31 ± 0.13)% of the reactor-off rate. This offset is similar in size to the current 0.5% correlated cosmogenic background normalization uncertainty envelope described earlier in this section. A variety of other statistically independent non-signal cosmogenic event classes were also investigated and showed no meaningful excess in reactor-on data. Most notably, as will be described in the following section, no residual reactor-on excess or deficit is observed within 2% statistical uncertainty in IBD candidates above 8 MeV prompt E rec , where negligible contributions from reactor ν e are expected.
The observation of a residual reactor-on deficit for some event classes during PROSPECT reactor-on data periods is suggestive of unidentified time-variations in selection cut efficiencies, dead times, or accidental/cosmogenic background estimates, rather than the presence of unidentified reactor-produced correlated backgrounds [87]. Issues related to detector response may also produce percent-level excesses in other event classes, depending on the cuts applied. Given the negligible estimated contributions from reactor-related correlated background in Section VI C, we suspect that response-related issues are responsible for both the deficits and excesses observed. As PROSPECT has been unable to precisely determine the underlying cause of this percent-level imperfection in its background subtraction procedure, the observed residuals are used to define additional uncertainty contributions to be applied to the ν e oscillation and spectrum analysis. First, an additional 1% energy- and baseline-correlated reactor-off background normalization uncertainty is introduced to account for the small observed on-off excess in (n-p,n-Li) IBD-like events. An added 3% uncertainty in the amplitude of the reactor-off n-H peak in the 1.6-2.6 MeV region is also instituted to reflect the residual on-off deficit exhibited in muon- and recoil-vetoed IBD-like events; this uncertainty is treated as baseline-correlated, but is uncorrelated with respect to the reactor-off background normalization uncertainty. These additional uncertainties produce minimal degradation in oscillation and spectrum sensitivity; this conclusion remains unchanged when adjusting the level of assumed baseline correlation.
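One common way to carry fractional uncertainties of this kind into a fit is through covariance matrices. The sketch below is only an illustration of that general technique under stated assumptions, since this section does not specify how the uncertainties enter the analysis. It builds a fully bin-correlated covariance for a 1% background normalization uncertainty and a separate one for a 3% n-H peak amplitude uncertainty, then sums them because the two are treated as mutually uncorrelated. All bin contents are hypothetical.

```python
def correlated_covariance(counts, frac_unc):
    """Covariance matrix for a fractional uncertainty that is fully
    correlated across bins: V_ij = f^2 * c_i * c_j."""
    return [[(frac_unc ** 2) * ci * cj for cj in counts] for ci in counts]

def add_matrices(a, b):
    """Element-wise sum; valid for mutually uncorrelated uncertainty sources."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

# HYPOTHETICAL scaled reactor-off background counts per prompt-energy bin
background = [100.0, 400.0, 300.0, 200.0]
# HYPOTHETICAL n-H peak component, nonzero only in bins assumed to lie in
# the 1.6-2.6 MeV region
nh_peak = [0.0, 150.0, 50.0, 0.0]

cov_norm = correlated_covariance(background, 0.01)  # 1% normalization
cov_peak = correlated_covariance(nh_peak, 0.03)     # 3% peak amplitude
cov_total = add_matrices(cov_norm, cov_peak)

for row in cov_total:
    print(["%.2f" % v for v in row])
```

Bins outside the assumed peak region receive only the normalization term, while peak bins receive both contributions.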

FIG. 32. Prompt Erec spectra for two classes of vetoed IBD-like events: IBD candidates rejected by a muon veto (left) and IBD candidates failing the prompt PSD requirement (right). These IBD-like event classes are primarily produced by multi-neutron and fast neutron cosmogenic events, respectively. Pictured curves represent reactor-off correlated vetoed IBD candidates (blue), and reactor-on correlated residuals following reactor-off background subtraction (red). Accidentals (between the IBD candidate pairs and with the veto-inducing event) have been subtracted out from all distributions. Due to the presence of high-multiplicity showers in the dataset, substantial correlations are present between bins in the left-hand plot.

VII. MEASURED IBD SIGNAL SAMPLE
Following the application of cosmogenic and re-triggering vetoes to the 95.7 (73.1) calendar days of reactor-on (off) data described in Section II E, IBD candidates were selected from 82.2 (65.2) days of reactor-on (off) live-time. During the reactor-on data-taking period, a total of 115852 IBD candidates are selected. Of these candidates 28358±18 are calculated to be contributed by accidental backgrounds, yielding a total of 87494±341 correlated IBD candidates. Following subtraction of 1309±4 accidental background events from the reactor-off IBD candidate dataset, a total of 29258±175 correlated IBD-like candidates are selected in the reactor-off dataset. Following application of live-time and atmospheric pressure scalings, this reactor-off IBD candidate tally corresponds to a total reactor-on cosmogenic background estimate of 36934±221. After subtraction of this background, a total of 50560±406 signal IBD events remain in the reactor-on dataset. The ratio of signal IBD to cosmogenic background (accidental background) events is estimated to be 1.37 (1.78). A summary of IBD candidate accounting is provided in Table III.
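The candidate accounting above can be checked with a few lines of arithmetic. The counts and quoted uncertainties are taken from the text; treating the raw candidate count as Poisson-distributed and combining uncertainties in quadrature is an assumption of this sketch, although it reproduces the quoted values after rounding.

```python
import math

# Values quoted in the text
n_on_candidates = 115852            # reactor-on IBD candidates
acc_on, acc_on_err = 28358, 18      # reactor-on accidental estimate
cosmo, cosmo_err = 36934, 221       # scaled reactor-off cosmogenic estimate

# Correlated reactor-on candidates after accidental subtraction
corr_on = n_on_candidates - acc_on
# ASSUMPTION: Poisson error on the raw count, combined in quadrature
corr_on_err = math.sqrt(n_on_candidates + acc_on_err ** 2)

# IBD signal after cosmogenic subtraction
signal = corr_on - cosmo
signal_err = math.hypot(corr_on_err, cosmo_err)

print(f"correlated candidates: {corr_on} +/- {corr_on_err:.0f}")
print(f"IBD signal:            {signal} +/- {signal_err:.0f}")
print(f"signal/cosmogenic:     {signal / cosmo:.2f}")
print(f"signal/accidental:     {signal / acc_on:.2f}")
```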
The rate of correlated IBD candidates and accidental backgrounds per live-day is shown in Figure 33. As described in Section VI, accidental backgrounds exhibit marked time-dependence, largely due to variations in prompt-like rates during reactor-on data-taking periods. The correlated IBD candidate rate is clearly dependent on reactor status; given the lack of reactor-correlated backgrounds (Section VI C), this dependence provides clear indication of observation of reactor antineutrinos. Smaller-amplitude deviations in these rates during reactor-on and reactor-off periods are caused by previously-described variations in cosmogenic fluxes, and thus cosmogenic IBD backgrounds, due to variations in atmospheric pressure. After applying subtraction of both accidental and correlated cosmogenic backgrounds, relative rates of IBD signals are shown in Figure 33 for each active fiducial segment; rates are normalized with respect to the shortest baseline, and are corrected for PG4-predicted relative variations in efficiency between segments. Efficiency-corrected IBD signal rates decrease with segment baseline, following the 1/r^2 distribution expected when sampling an isotropically-emitting compact ν e source. The best-fit inverse-square function (with only an amplitude parameter) pictured in Figure 33 provides a χ 2 /DOF of 72.4/69.
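An amplitude-only inverse-square fit has a closed-form weighted least-squares solution, which the following sketch illustrates. The baselines, rates, and uncertainties are hypothetical and serve only to show the technique; the function name and the assumption of independent Gaussian uncertainties per segment are not taken from the analysis.

```python
def fit_inverse_square(baselines_m, rates, sigmas):
    """Weighted least-squares fit of rate = A / r^2 with A as the only free
    parameter. Returns (best-fit A, chi-square, degrees of freedom)."""
    num = 0.0
    den = 0.0
    for r, y, s in zip(baselines_m, rates, sigmas):
        f = 1.0 / r ** 2
        w = 1.0 / s ** 2
        num += w * y * f
        den += w * f * f
    amplitude = num / den
    chi2 = sum(((y - amplitude / r ** 2) / s) ** 2
               for r, y, s in zip(baselines_m, rates, sigmas))
    return amplitude, chi2, len(rates) - 1

# HYPOTHETICAL segment baselines (m), relative rates, and uncertainties
baselines = [6.7, 7.5, 8.3, 9.2]
rates = [50.0 / r ** 2 for r in baselines]   # exact 1/r^2 behavior
sigmas = [0.05, 0.05, 0.05, 0.05]

amp, chi2, dof = fit_inverse_square(baselines, rates, sigmas)
print(f"A = {amp:.3f}, chi2/DOF = {chi2:.3f}/{dof}")
```

With one fitted parameter, the number of degrees of freedom is the number of segments minus one.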
The prompt E rec spectrum of the accidentals-subtracted reactor-on IBD-like dataset is pictured in Figure 34, along with that of the cosmogenic background expected from the reactor-off dataset and the fully-background-subtracted IBD signal. After subtracting cosmogenic backgrounds, the IBD signal's prompt energy distribution matches the general expected shape of reactor ν e interacting via IBD: count rates are highest in the 1-7 MeV range with a generally continuous appearance versus energy in this range despite the presence of peak-like features in the subtracted cosmogenic spectrum. Above 7 MeV, where reactor IBD signal contributions are expected to be minimal, background-subtracted IBD-like count rates are consistent with zero, indicating proper scaling of reactor-off data during reactor-on cosmogenic background subtraction. A quantitative comparison of the background-subtracted IBD signal distribution to zero from 8 MeV to 12 MeV yields a χ 2 /DOF of 20.9/20.

A. Signal Validation
To demonstrate a proper understanding of the background-subtracted IBD signal dataset, it is valuable to perform comparisons of IBD-like event distributions between different time periods and detector locations. Given the stability in reactor thermal power during HFIR operation, a demonstration of time stability of the IBD selection can be provided by comparison of different reactor-on time periods. This comparison for two different reactor-on time periods is shown in Figure 35. As in Figure 31, the two time periods are interleaved in time as shown in the figure inset. These datasets show consistency with one another: quantitative comparison between 0.8 MeV and 7.2 MeV yields a χ 2 /DOF of 26.2/31. If the normalization is allowed to float between datasets, the best-fit offset in the 0.8-7.2 MeV energy range is found to be less than 2%, consistent with a hypothesis of equal normalizations within ∼2σ statistical confidence level.
Due to the compact size of PROSPECT's inner detector, IBD interactions taking place in the inner-most and outer-most segments of its fiducial volume should exhibit differing levels of annihilation γ-ray energy leakage, leading to differences in prompt E rec spectra between these two regions. In addition, the presence of larger numbers of inactive segments near the detector bottom should lead to enhanced energy leakage for IBD interactions taking place in the bottom of the fiducial volume. These relative variations in response with position in the detector must be properly accounted for in predicted IBD signal distributions. To verify proper modelling of these effects in the PROSPECT detector response model, background-subtracted IBD signal prompt E rec distributions are compared between these different detector regions in Figure 36. Figure insets illustrate which active detector segments are assigned to which category. Also pictured are the spectrum ratios between these regions, together with those predicted by PG4 MC simulations of IBD interactions. Energy spectra and normalizations per segment should not be expected to be identical between regions due to the uneven distribution of dead and non-fiducial segments in the detector. However, deviations between regions should be correctly predicted by the PG4 IBD MC. Indeed, data-PG4 spectrum ratios between regions are generally consistent within the data's statistical limitations: a quantitative comparison of the data and PG4-predicted inner-outer (upper-lower) ratios gives a χ 2 /DOF of 56.6/31 (54.4/31).
The segmented nature of the PROSPECT target enables a variety of other cross-checks of the background-subtracted IBD dataset and modelling of these events. Whether due to IBD positrons traversing optical grid separators or migration of annihilation γ-rays, an IBD interaction in the PROSPECT detector more often than not produces reconstructed clusters spanning multiple segments. This effect is illustrated in Figure 37, which shows the segment multiplicity of prompt clusters for the background-subtracted IBD signal. Both data and PG4 IBD MC simulations exhibit identical multiplicity distributions within systematic uncertainties, which are dominated by the ±5 keV per-pulse analysis threshold uncertainty. This agreement is particularly reassuring, given the importance of pulse thresholding effects in determining event energy scales. Accurate modelling of IBD event topology is also demonstrated in Figure 37 by plotting the summed energy of all pulses (E rec ) excluding that with the highest reconstructed energy (E max ). This energy distribution is expected to be dominated by annihilation γ-ray energy depositions. Excellent agreement is found for this distribution between data and PG4 MC simulations, indicating accurate modelling of annihilation γ-ray energy depositions in the detector.
Detector segmentation also enables comparison of the relative positioning of prompt and delayed IBD signals with respect to one another in the detector target, as illustrated in Figure 38. Approximately 77.4%±0.5% of IBD neutrons and positrons are found to have identical S rec , indicating that IBD neutrons tend to capture in the same segment as their associated IBD interaction. This ratio is found to be 78.3% for PG4 IBD MC, 0.9%±0.5% from the observed value. The data's marginally reduced IBD neutron mobility will result in smaller relative contributions from IBD interactions in inactive and non-fiducial segments. The impact of this added contribution on expected prompt E rec distributions is found to be small compared to those of other more dominant energy scale systematic uncertainties.
When examining IBD signal events with different prompt and delayed S rec , both data and PG4 show an outsized contribution from events with longer-baseline delayed S rec . Events where the delayed S rec is 'downstream' from the prompt S rec contribute 15.0%±0.3% of all IBD signal data events, while events with 'upstream' neutrons contribute only 7.6%±0.3%. This difference in PG4 MC simulation is attributable to the non-negligible downstream kinetic energy of the final-state IBD neutron. The observation of this effect in PROSPECT provides an intriguing demonstration of the capabilities of segmented IBD detectors to statistically reconstruct the incoming direction of reactor ν e .

VIII. STERILE NEUTRINO SEARCH RESULTS
Sterile neutrino oscillations are probed with the PROSPECT dataset by comparing prompt E rec spectra between different detector baselines. The following section will describe the appearance of the PROSPECT datasets in different baseline bins, introduce the statistical methods used to search for unexpected relative variations in E rec spectra between baselines, and present new sterile neutrino oscillation results based on the dataset described in Section VII.

A. Datasets and Predictions
To perform the oscillation analysis, active detector segments are assigned to one of ten defined baseline ranges, or l, as illustrated in Figure 39. IBD events are then assigned to baseline bin l according to their prompt S rec . Segment l assignments are chosen to produce roughly similar IBD signal statistics in each baseline bin. Given the 1/r 2 reduction of IBD signal events with baseline demonstrated in Fig 33, this choice results in uneven baseline bin widths. This method differs from that described in the previous PROSPECT oscillation analysis [56], where the IBD dataset was separated into six bins of equal width; the new binning method provides better statistical coverage over a wider range of baselines and delivers better overall oscillation sensitivity. Roughly 5000 events are contained in each baseline bin l, with per-bin relative variations of 10% illustrated in Figure 39.
Prompt E rec spectra for background-subtracted IBD signal events in each l bin, called M l,e , are pictured in Figure 40. Also pictured are the unoscillated IBD prompt E rec predictions P l,e for each baseline bin. P l,e are formed by applying the best-fit PG4-derived segment response matrices described in Sections IV A and V B to an IBD interaction generator following the 235 U ν e energy spectrum calculated by Huber [25] and the IBD cross-section of Ref. [32]. IBD vertex distributions for P l,e are generated assuming a finite cylindrical HFIR core geometry, as described in Section V B. To remain consistent with procedures used for generating detector response matrices, the segment hosting each generated IBD's interaction is assigned as that reconstructed IBD event's prompt S rec . While this choice effectively ignores a source of worsened resolution in knowledge of true ν e baselines, this contribution is negligible compared to the position resolution smearing related to the finite reactor core geometry and l bin width.
Prior to application of detector response to produce IBD prompt E rec distributions, true ν e energy distributions for each segment's IBD interactions can be distorted to account for the possible presence of sterile neutrino oscillations. This distortion is dictated by the parameters (∆m 2 41 ,sin 2 2θ 14 ) as defined in Eq. 2, as well as by the ν e energies and true baselines corresponding to these IBD interactions. To accelerate the generation of oscillated predictions, each segment's zcenter midpoint is used as the true generated ν e interaction location for each IBD event. This choice serves to ignore the ν e baseline (and oscillation) smearing provided by the O(10 cm) range of ν e production-interaction baselines within a segment; however, this contribution is once again negligible compared to that of the finite HFIR core size.
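Eq. 2 is not reproduced in this section. The sketch below therefore assumes the standard single-mass-splitting survival probability, P = 1 - sin²2θ₁₄ sin²(1.267 ∆m²₄₁ L / E), with L in meters, E in MeV and ∆m²₄₁ in eV². The function name and the choice of a 4 MeV test energy are illustrative assumptions; the oscillation parameters are the RAA best-fit values quoted in Section VIII C.

```python
import math


def survival_probability(sin2_2theta14, dm2_41_ev2, baseline_m, energy_mev):
    """Electron antineutrino survival probability for a single mass splitting."""
    # 1.267 converts dm^2 [eV^2] * L [m] / E [MeV] into the oscillation phase in radians.
    phase = 1.267 * dm2_41_ev2 * baseline_m / energy_mev
    return 1.0 - sin2_2theta14 * math.sin(phase) ** 2


# RAA best-fit point quoted in the text, evaluated at the two extreme baselines.
for baseline in (6.7, 9.2):
    p = survival_probability(0.165, 2.39, baseline, energy_mev=4.0)
    print(f"L = {baseline} m, E = 4 MeV: P = {p:.3f}")
```

In the procedure described above, this factor would reweight each segment's true ν e energy distribution before the detector response is applied.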
To ensure minimal dependence of the oscillation result on uncertainties in the shape and normalization associated with the Huber 235 U reactor flux prediction, relative comparisons between measured prompt E rec are used to perform PROSPECT's oscillation measurement. These comparisons are based on the per-baseline measured and PG4-predicted content of each bin in baseline l and energy e, M l,e and P l,e , and on the detector-wide measured and predicted content of bin e, M e and P e , respectively:
$$M_e = \sum_{l} M_{l,e}, \qquad P_e = \sum_{l} P_{l,e}. \qquad (10)$$
A detailed description of M e and P e will be given in Section IX. For the oscillation analysis, M l,e are compared to the predicted per-baseline spectra M e P l,e /P e . The latter quantity reduces the dependence on the underlying reactor ν e model, while also correcting for relative energy response variations between baseline bins predicted by the PG4 simulation. The ratios between these two quantities for each baseline are shown in Figure 41. An absence of short-baseline oscillation effects in M will produce a flat ratio at unity; meanwhile, the presence of oscillation effects in M l,e and M e will alter this ratio in a manner also depicted in Figure 41. Visual examination of the measured ratios in Figure 41 yields no immediate indication of non-flat trends similar to that produced by large-amplitude sterile neutrino oscillations.
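The cancellation of the reactor model in the relative prediction M e P l,e /P e can be checked with a short sketch. The arrays below are hypothetical (two baseline bins, three energy bins), and the function name is an assumption; the "measured" spectra are built from the prediction multiplied by an energy-dependent mismodelling factor that is identical at all baselines.

```python
def relative_prediction(measured, predicted):
    """Return M_e * P_{l,e} / P_e for each baseline bin l and energy bin e."""
    n_l, n_e = len(measured), len(measured[0])
    # Detector-wide spectra: sum over baseline bins for each energy bin.
    m_e = [sum(measured[l][e] for l in range(n_l)) for e in range(n_e)]
    p_e = [sum(predicted[l][e] for l in range(n_l)) for e in range(n_e)]
    return [[m_e[e] * predicted[l][e] / p_e[e] for e in range(n_e)] for l in range(n_l)]


# Hypothetical prediction: 2 baseline bins x 3 energy bins.
predicted = [[120.0, 200.0, 90.0], [100.0, 180.0, 70.0]]
# Hypothetical data: a baseline-independent spectral shape error, no oscillation.
shape_error = [0.9, 1.0, 1.15]
measured = [[f * p for f, p in zip(shape_error, row)] for row in predicted]

reference = relative_prediction(measured, predicted)
ratios = [[m / r for m, r in zip(m_row, r_row)]
          for m_row, r_row in zip(measured, reference)]
print(ratios)
```

All ratios come out at unity even though the assumed spectrum shape is wrong by up to 15%, which is the property that makes the comparison insensitive to the underlying flux model. Only baseline-dependent distortions survive in the ratio.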
As each baseline bin l is composed of segments of varying proximity to the detector edge and to inactive segments, some variations in M l,e are expected between different baseline bins even in the absence of oscillation effects. As mentioned above, PG4 is used to characterize these relative response variations, which are taken into account in P l,e predictions. To demonstrate the behavior of these relative response variations, Figure 42 shows the relative differences between unoscillated predicted spectra P 1,e and P 5,e along with the impact of sterile neutrino oscillations on these ratios for differing mass-splitting values. High mass-splitting oscillations produce relative spectrum differences between baselines that are characteristically different than those produced by expected energy response variations. Thus, in the mass splitting region above 1 eV 2 , statistical uncertainties are expected to dominate PROSPECT's sterile neutrino oscillation sensitivity. Below ∼0.5 eV 2 , relative energy response variations and efficiency differences between baselines can mimic to an extent the behavior of oscillations; thus, uncertainties in these variations will also limit oscillation sensitivity in this mass-splitting range.

B. Statistical Method
To test for the possible existence of sterile neutrino oscillations, measured per-baseline prompt E rec spectra M l,e are quantitatively compared to predicted per-baseline prompt E rec spectra M e P l,e /P e in the presence of oscillation effects in P l,e and P e dictated by the parameters ∆m 2 41 and sin 2 2θ 14 . For this purpose, a χ 2 is defined as:
$$\chi^2 = \Delta^{T} V_{tot}^{-1} \Delta, \qquad (11)$$
where ∆ is a 160-element vector that represents the relative agreement between measurement and prediction in 10 l bins and 16 e bins:
$$\Delta_{l,e} = M_{l,e} - M_e \frac{P_{l,e}}{P_e}. \qquad (12)$$
The 160 ∆ entries are grouped by baseline, running from shortest to longest baseline. Within each baseline group, ∆ elements run from lowest to highest E rec . Statistical and systematic uncertainties and their correlation between energy bins are incorporated into Eq. 11 using the covariance matrix V tot . This matrix is composed of the sum of individual statistical and systematic matrices V stat and V sys . To highlight the relative magnitude of uncertainty contribution of different elements, the total uncertainty reduced covariance matrix is pictured in Figure 43. Each entry V i,j tot is obtained by multiplying the corresponding reduced covariance matrix entry by M i · M j . As mentioned above, the 160 i and j values in V tot are grouped by baseline, running from lowest to highest baseline with increasing i and j. For example, the 10 sub-matrices appearing along the diagonal of V tot represent uncorrelated statistical uncertainties for each individual baseline.
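A covariance-weighted χ 2 of the form ∆ᵀ V⁻¹ ∆ can be evaluated without explicitly inverting V. The sketch below, which uses only the Python standard library, factorizes V = L Lᵀ and solves L y = ∆, so that χ 2 = |y|². The helper names, the 2×2 example matrix, and the flattening convention in flat_index (baseline-major, 16 energy bins per baseline, as described in the text) are illustrative assumptions rather than the collaboration's implementation.

```python
import math


def cholesky(V):
    """Lower-triangular L with V = L L^T for a symmetric positive-definite V."""
    n = len(V)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(V[i][i] - s)
            else:
                L[i][j] = (V[i][j] - s) / L[j][j]
    return L


def chi_square(delta, V):
    """Evaluate delta^T V^{-1} delta by forward substitution on L y = delta."""
    L = cholesky(V)
    y = []
    for i, d in enumerate(delta):
        y.append((d - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return sum(v * v for v in y)


def flat_index(l, e, n_energy=16):
    """Position of baseline bin l, energy bin e in the baseline-grouped vector."""
    return l * n_energy + e


# Hypothetical two-bin example with a positive correlation between bins.
V = [[4.0, 2.0], [2.0, 3.0]]
delta = [2.0, 1.0]
value = chi_square(delta, V)
print(value, flat_index(9, 15))
```

For a diagonal V this reduces to the familiar sum of squared pulls; the off-diagonal terms are what encode the baseline-to-baseline and bin-to-bin correlations.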
Statistical uncertainties V stat are dominated by reactor-on IBD candidates and the subtracted cosmogenic background estimate based on the reactor-off IBD candidate dataset. Subtracted accidental backgrounds during reactor-on and reactor-off periods contribute little statistical uncertainty, owing to the large offset time window used to determine them. Uncorrelated statistical uncertainties from each dataset, which compose the diagonal of V stat , are primarily determined by the Poisson error of each l, e bin after properly scaling for relative live time and environmental differences between datasets and data periods. As each M l,e is a subset of the detector-integrated spectrum M e , correlations in ∆ l,e statistical uncertainties will exist between different l, resulting in off-diagonal contributions to V stat .
[Figure 41 caption, opening truncated: ratio to M e P l,e /P e for the ten baseline bins defined in Figure 39. PG4-predicted ratios in the presence of sterile neutrino oscillations matching those of the best-fit point of (sin 2 2θ 14 ,∆m 2 41 ) = (0.11,1.78 eV 2 ) and the 'Reactor Antineutrino Anomaly' (RAA) best-fit point of Ref. [44] are also pictured as solid purple and blue lines, respectively. In the absence of oscillations, the predicted ratio is unity for all energy-position bins. Error bars represent statistical uncertainties.]
Systematic uncertainties in ∆ l,e , as well as systematic correlations between different l and e, are taken into account in the covariance matrix V sys . Various sources of systematic uncertainty related to detector response, response stability with time and with detector position, and background estimates, have been described throughout previous sections in this paper. These sources of systematic uncertainty are summarized in Table IV and described briefly below:
• Absolute background normalization and n-H peak uncertainty: accounts for unexpected background variations between reactor-off and reactor-on periods, and for uncertainty in the atmospheric scaling factor. Each is included as a baseline- and energy-correlated uncertainty within its relevant energy range; the two effects are treated as uncorrelated.
• Relative signal normalization: accounts for relative volume and efficiency variations between different baseline bins. Included as an energy-correlated uncertainty.
• Baseline: accounts for uncertainty in the detector-reactor baseline, as described in Section II. Included as an energy-correlated and baseline-correlated uncertainty.
• Energy non-linearity model uncertainties: accounts for uncertainty in best-fit Birks scintillator non-linearity parameters k b1 and k b2 and the Cerenkov light contribution k c . As all segments contain the same scintillator, these uncertainties are treated as baseline-correlated.
• Energy scale uncertainties: accounts for linear energy scale uncertainties. These are included as both a baseline-correlated and a baseline-uncorrelated uncertainty, reflecting the validations provided in Sections IV B and III H, respectively.
• Energy loss and leakage uncertainties: accounts for uncertainties in PG4 MC modelling of energy scale offsets between different detector regions/locations, which arise from loss of energy in inactive detector materials. Energy loss in optical grid reflectors is treated separately from energy losses due to leakage of γ-ray energy out of active detector regions. These are included as both baseline-correlated and baseline-uncorrelated uncertainties.
• Energy threshold uncertainties: accounts for uncertainties in reconstructed pulse energy thresholds, which play a key role in equalizing pulse multiplicities between different segments and different time periods. These are included as both a baseline-correlated and a baseline-uncorrelated uncertainty.
• Photostatistics resolution uncertainties: accounts for uncertainties in photostatistics resolution in Eq. 9. These are included as both a baseline-correlated and a baseline-uncorrelated uncertainty, reflecting the validations provided in Sections IV C and III H, respectively.
For each systematic uncertainty parameter described in Table IV, a covariance matrix V x is produced through generation and characterization of systematically fluctuated MC datasets. This process proceeds by first generating 10³ MC datasets and unoscillated P l,e datasets including variations of a single systematic uncertainty parameter following a Gaussian distribution with a 1σ width as indicated in Table IV. Toy MC P l,e distributions for baseline, signal normalization, and energy resolution, leakage and linear scale variations are generated via analytical adjustment of the default null oscillation PG4 IBD dataset; for the background normalization uncertainty, similar analytical adjustment is applied to the reactor-on cosmogenic background prediction. P l,e distributions for energy threshold systematic variations also use this default PG4 dataset, while applying a variety of reconstructed pulse energy threshold requirements in the analysis chain. For reflector panel thickness and scintillator non-linearity parameter uncertainties, P l,e distributions are obtained via generation of PG4 IBD MC datasets containing adjusted input simulation parameters; sample sizes are sufficiently large to ensure negligible MC-related statistical uncertainty contribution. For the purposes of covariance matrix generation, we subsequently refer to systematically fluctuated P l,e distributions as P̃ i and the un-fluctuated P l,e as P i . With systematically fluctuated datasets P̃ i in hand, covariance matrix elements for each uncertainty parameter can be calculated as the average product of the differences between fluctuated and un-fluctuated datasets, for any two entries i and j in P:
$$V_x^{ij} = \frac{1}{N_{toys}} \sum_{toys} (\tilde{P}_i - P_i)(\tilde{P}_j - P_j). \qquad (13)$$
It is clear from the large size of on-diagonal elements in V tot from Figure 43 that uncorrelated statistical uncertainty contributions are of substantially larger size than systematic uncertainty contributions.
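The toy-based covariance construction described above can be sketched as follows. This is a minimal illustration, not the PROSPECT code: the function names, the three-bin nominal spectrum, and the 5% fully correlated normalization systematic are hypothetical. Each toy draws one Gaussian value of the systematic parameter, builds the fluctuated prediction, and accumulates the products of deviations from the nominal prediction.

```python
import random


def toy_covariance(nominal, fluctuate, sigma, n_toys=1000, seed=12345):
    """Average product of deviations from the nominal prediction over toy datasets."""
    rng = random.Random(seed)
    n = len(nominal)
    cov = [[0.0] * n for _ in range(n)]
    for _ in range(n_toys):
        shift = rng.gauss(0.0, sigma)  # one draw of the systematic parameter
        toy = fluctuate(nominal, shift)
        diff = [t - p for t, p in zip(toy, nominal)]
        for i in range(n):
            for j in range(n):
                cov[i][j] += diff[i] * diff[j] / n_toys
    return cov


def scale_normalization(spectrum, shift):
    """Hypothetical systematic: a common fractional normalization shift."""
    return [p * (1.0 + shift) for p in spectrum]


nominal = [120.0, 200.0, 90.0]
cov = toy_covariance(nominal, scale_normalization, sigma=0.05)
correlation_01 = cov[0][1] / (cov[0][0] * cov[1][1]) ** 0.5
print(cov[0][0], correlation_01)
```

A pure normalization shift moves all bins together, so the resulting matrix is fully correlated between bins; systematics that distort the spectrum shape would populate the off-diagonal elements less uniformly.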

C. Oscillation Results
Using the PROSPECT IBD candidate E rec spectra M l,e described in Section VII, the covariance matrices V sys and V stat described in the previous section, and PG4-generated oscillated P l,e spectra, the χ 2 of Equation 11 can be calculated for each point in the tested sterile neutrino parameter space. Calculated ∆χ 2 with respect to the best-fit point in parameter space are pictured in Figure 44. The minimum value (χ 2 min /DOF) of 119.3/142 was identified at the grid point (sin 2 2θ 14 ,∆m 2 41 ) = (0.11,1.78 eV 2 ). This χ 2 min /DOF of 0.84 is slightly higher than the previous minimum, 0.74, reported at (sin 2 2θ 14 ,∆m 2 41 ) = (0.35,0.5 eV 2 ) by PROSPECT in Ref. [56]. This new χ 2 min value should also be contrasted with that obtained in the case of null oscillations (θ 14 =0), where the χ 2 /DOF is 123.3/144; while this ∆χ 2 of 4.0 indicates that the null oscillation case does not provide the best match to the data, further statistical analysis must be done to quantify the level of preference for non-zero oscillations. These two χ 2 can also be compared to 135.1, the χ 2 value obtained at the 'Reactor Antineutrino Anomaly' (RAA) best-fit point of Ref. [44], (sin 2 2θ 14 ,∆m 2 41 ) = (0.165, 2.39 eV 2 ). This emphasizes that the dataset also contains a preference for the null oscillation hypothesis over this suggested region of oscillation parameter space.
Based on the χ 2 values in Figure 44, two distinct statistical approaches were used to define oscillation parameter space regions allowed and excluded by the data. The first method, called the Gaussian CL s method [88], is based on testing multiple pairs of hypotheses. To assign the exclusion confidence level, three values are needed for each point in (sin 2 2θ 14 ,∆m 2 41 ) parameter space (Equation 14), where the ∆χ 2 in all cases are calculated using Equation 11. ∆χ 2 min (x) 0 and ∆χ 2 min (x) 1 are calculated with the PROSPECT data against the null oscillation hypothesis and the oscillation hypothesis with parameters (∆m 2 41 ,sin 2 2θ 14 ), respectively. ∆χ 2 min (x Asimov 0 ) 1 is calculated with the unoscillated Asimov dataset [88] tested against the oscillation hypothesis given by the parameters (∆m 2 41 ,sin 2 2θ 14 ). ∆χ 2 min (x Asimov 1 ) 0 is its converse, calculated for the oscillated Asimov dataset with parameters (sin 2 2θ 14 ,∆m 2 41 ) tested against the null oscillation hypothesis.
Once the values from Equation 14 are known, the value of CL s can be computed using Equation 15.
The point (sin 2 2θ 14 ,∆m 2 41 ) is said to be excluded by the given data at 2σ confidence level if CL s < 0.05. The resulting 95% confidence level CL s exclusion contour is shown in Figure 45. The RAA best fit is clearly excluded at better than 95% CL.
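Equations 14 and 15 are not reproduced in this text. The sketch below therefore assumes the usual Gaussian approximation underlying the CL s method of Ref. [88]: the test statistic ∆χ 2 = χ 2 (oscillation) - χ 2 (null) is taken to be Gaussian-distributed under each hypothesis, with mean equal to the corresponding Asimov value and standard deviation 2√|mean|. The function name, argument names, sign convention, and all numerical inputs are illustrative assumptions, not values from the PROSPECT analysis.

```python
import math


def gaussian_cls(dchi2_obs, dchi2_asimov_null, dchi2_asimov_osc):
    """CLs = (1 - p1) / (1 - p0) under the Gaussian approximation.

    dchi2_* are chi2(oscillation) - chi2(null): positive for the unoscillated
    Asimov dataset, negative for the oscillated Asimov dataset.
    """
    def upper_tail(mean):
        # P(dchi2 >= dchi2_obs) for a Gaussian with this mean and sigma = 2*sqrt(|mean|).
        sigma = 2.0 * math.sqrt(abs(mean))
        return 0.5 * (1.0 + math.erf((mean - dchi2_obs) / (math.sqrt(2.0) * sigma)))

    return upper_tail(dchi2_asimov_osc) / upper_tail(dchi2_asimov_null)


# Hypothetical grid point whose Asimov values are +12 (null true) and -12 (oscillation true).
cls_null_like = gaussian_cls(12.0, 12.0, -12.0)   # data look unoscillated
cls_osc_like = gaussian_cls(-12.0, 12.0, -12.0)   # data look oscillated
print(cls_null_like, cls_osc_like)
```

With these placeholder inputs, data resembling the unoscillated expectation give CL s well below 0.05, so the grid point would be excluded, while data resembling the oscillated expectation give CL s near 0.5 and the point would be retained.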
[Figure 45 caption, opening truncated: ... Figure 44. Both contours are obtained using the Gaussian CLs method. Also pictured is the RAA preferred parameter space and best-fit point from Ref. [16]; the best-fit point is excluded at >95% confidence level.]
The Gaussian CL s method provides a conservative excluded region that allows for easy combination with other experimental results, but it does not address the consistency of the data with respect to the null oscillation hypothesis. To remedy this, an examination of excluded sterile neutrino oscillation parameter space based on the input χ 2 map in Figure 44 was performed using a Feldman-Cousins frequentist approach [89], similar to that described in Ref. [56]. This approach was first used to determine the level of preference observed in PROSPECT data for the best-fit point described above with respect to the null hypothesis, and with respect to the RAA best-fit point. For the null hypothesis, 10³ individual toy datasets were generated by taking an unoscillated model spectrum at each baseline and adding a vector of independent random variables multiplied by a Cholesky decomposition of the full covariance matrix. This ensures that all toy results include the proper correlated and uncorrelated variations across baselines and energies. These toy PROSPECT datasets represent the range of expected measurements likely to be delivered by PROSPECT in the absence of sterile neutrino oscillations given the range of expected statistical and systematic variations described above. Each toy PROSPECT dataset was then fit in a manner similar to that described above for the observed PROSPECT data. The ∆χ 2 = χ 2 null − χ 2 min values calculated for all toys then form a distribution of expected ∆χ 2 values, as shown in Figure 46. The ∆χ 2 value obtained by a fit to the PROSPECT dataset was then compared to this distribution; the observed ∆χ 2 value, 123.3 - 119.3 = 4.0, is found to be smaller than 57% of ∆χ 2 generated by the toy null oscillation datasets, indicating little incompatibility with the no-oscillation hypothesis.
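The toy-generation step, adding independent Gaussian variables multiplied by a Cholesky factor of the covariance matrix to a model spectrum, can be sketched with the standard library alone. The two-bin prediction, the covariance matrix, the seed, and the helper names are hypothetical; the check at the end only confirms that the generated toys reproduce the input covariance within sampling error.

```python
import math
import random


def cholesky(V):
    """Lower-triangular L with V = L L^T for a symmetric positive-definite V."""
    n = len(V)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(V[i][i] - s)
            else:
                L[i][j] = (V[i][j] - s) / L[j][j]
    return L


def make_toy(prediction, chol, rng):
    """Prediction plus correlated Gaussian fluctuations drawn as L z."""
    z = [rng.gauss(0.0, 1.0) for _ in prediction]
    return [p + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, p in enumerate(prediction)]


prediction = [100.0, 80.0]          # hypothetical unoscillated model spectrum
V = [[4.0, 2.0], [2.0, 3.0]]        # hypothetical total covariance matrix
chol = cholesky(V)
rng = random.Random(2021)
toys = [make_toy(prediction, chol, rng) for _ in range(20000)]

sample_var_0 = sum((t[0] - 100.0) ** 2 for t in toys) / len(toys)
sample_cov_01 = sum((t[0] - 100.0) * (t[1] - 80.0) for t in toys) / len(toys)
print(sample_var_0, sample_cov_01)
```

In an analysis of the kind described above, each toy would then be passed through the same fit as the data to build the expected ∆χ 2 distribution.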
The same test was performed on the RAA best-fit point using 10³ oscillated toy MC datasets. For the measured data, the best-fit χ 2 mentioned above forms a ∆χ 2 value of 16.1 with respect to the χ 2 obtained at the RAA best-fit point. When compared to the distribution of ∆χ 2 values from the RAA-oscillated toy datasets described above, we find that the observed ∆χ 2 value corresponds to a p-value of 1.5%, as shown in Figure 46. This indicates that the RAA best-fit point is excluded by the PROSPECT data at the 2.5σ confidence level.
Similar ∆χ 2 profiles were generated for each point in an examined grid of (∆m 2 41 , sin 2 2θ 14 ) values. At each grid point, a critical value, ∆χ 2 crit , is identified below which 95% (2σ) of all 10³ toy dataset-derived ∆χ 2 fall. The map of ∆χ 2 crit values for each grid point in oscillation parameter space is shown in Figure 47.
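Given a set of toy-derived ∆χ 2 values at one grid point, the critical value and the p-value of an observed ∆χ 2 follow from simple counting. The sketch below uses a synthetic, evenly spaced stand-in for the toy distribution; the function names and the list of toy values are assumptions for illustration only and do not reproduce the distributions of Figure 46.

```python
import math


def critical_value(toy_dchi2, cl=0.95):
    """Smallest toy value at or below which at least a fraction cl of the toys fall."""
    ordered = sorted(toy_dchi2)
    index = min(len(ordered) - 1, max(0, math.ceil(cl * len(ordered)) - 1))
    return ordered[index]


def p_value(toy_dchi2, observed):
    """Fraction of toys with a delta-chi2 at least as large as the observed one."""
    return sum(1 for t in toy_dchi2 if t >= observed) / len(toy_dchi2)


# Synthetic stand-in for 1000 toy delta-chi2 values at one grid point.
toys = [k / 100 for k in range(1, 1001)]
crit = critical_value(toys)
p_obs = p_value(toys, 4.0)
print(crit, p_obs)
```

A grid point is excluded at the chosen confidence level when the data-derived ∆χ 2 at that point exceeds its critical value; repeating the counting at every grid point yields a map of critical values.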
It is worth noting that assuming these ∆χ 2 distributions follow a χ 2 distribution with two degrees of freedom, as might be naively done when fitting two oscillation parameters, ∆m 2 41 and sin 2 2θ 14 , would be inappropriate. In particular, the incorrect χ 2 crit value associated with this inappropriate statistical treatment, for the case of the null hypothesis, would yield a p-value of 0.17, smaller than the p-value of 0.57 reported by the Feldman-Cousins approach. For the RAA best-fit point, this treatment leads to a p-value of 0.0004, smaller than the correct 0.015 p-value. Thus, it appears that this incorrect statistical interpretation of observed ∆χ 2 values will lead to overstatement of levels of statistical disagreement between data and the no-oscillation hypothesis, as well as understatement of the level of compatibility between the data and some regions of non-zero oscillation parameter space. This observation is consistent with discussions in a variety of other publications [89][90][91], and underscores the importance of using correct statistical treatments, such as the Gaussian CL s or Feldman-Cousins approaches.
Using the Feldman-Cousins approach, an oscillation parameter space exclusion contour was assigned in (sin 2 2θ 14 ,∆m 2 41 ) space to the observed χ 2 values pictured in Figure 44. A 95% confidence level exclusion contour, shown in Figure 48, can be drawn by identifying all oscillation parameter space grid points whose data-derived ∆χ 2 between that grid point and the best-fit exceeds the ∆χ 2 crit value given in Figure 47. The present dataset excludes significant portions of the Reactor Antineutrino Anomaly allowed region [44]. This exclusion shows good agreement with that derived using the Gaussian CL s method.
[Figure 48 caption: Oscillation exclusion contours derived using the Gaussian CLs and Feldman-Cousins (FC) methods. Also pictured are the 1σ and 2σ (green and yellow) exclusion ranges produced by PROSPECT toy MC datasets, as well as the RAA preferred parameter space and best-fit point from Ref. [16].]
The colored bands included in Figure 48 indicate, for each ∆m 2 value, the range of sin 2 2θ 14 values at which the 95% CL exclusion boundary appears for unoscillated toy MC datasets; green and yellow ranges contain 1σ and 2σ of all toys' 95% CL exclusion boundaries. By comparing the observed exclusion region to these bands, one can assess the compatibility of the spectral ratio data in Figure 41 with the range of expected unoscillated PROSPECT spectral ratios. The exclusion region formed by the PROSPECT data sits within the green 1σ region for most ∆m 2 values, indicating that the observed spectral ratios are typical of those expected based on the systematic and

IX. SPECTRUM ANALYSIS
Using the data and detector response model described in the previous sections, the detected E rec spectrum of IBD interactions can be compared to theoretical predictions. A total of 50560 ± 406(stat) IBD events have been detected, with a cosmogenic (accidental) signal-to-background ratio of 1.4 (1.8). This is the highest statistics measurement of the 235 U ν e spectrum to date.
Since 235 U is the only primary fissile isotope that can be studied in isolation, this measurement enables improved interpretation of measurements from low-enriched uranium (LEU) power reactors such as those used by the θ 13 experiments. These experiments have observed discrepancies between predicted and detected ν e energy spectra [13,45,46]. In this section we present an updated PROSPECT measurement of the 235 U ν e spectrum from HFIR, compare it to theoretical predictions, and perform further analysis to gauge the source of the deviation from predictions at high energy observed by LEU experiments.

A. Modelling the HFIR νe Spectrum
More than 99% of the ν e produced by the High Flux Isotope Reactor are due to 235 U fission. However, small fluxes of neutrinos are produced from neutron activation of the surrounding material. The two dominant non-235 U sources of ν e are 28 Al from the fuel cladding and 6 He generated in the beryllium neutron reflector that surrounds the core [92]. Each of these contributes less than 1% of the total observed ν e flux, and both are limited to the low-energy region of the spectrum (<4 MeV true neutrino energy). The predicted contribution to the detected spectrum for each of these is shown in Fig 49.
The leading theoretical model of 235 U ν e emission is the Huber beta conversion model from Ref. [25]. This model converts a measured electron spectrum from neutron irradiation of fissile material into a ν e energy spectrum using 'virtual beta-branches'. Since the irradiation time in these measurements is relatively short compared to HFIR's 24-day cycle, corrections are needed to account for the production of non-equilibrium isotopes. The procedure laid out in Ref. [16] is followed to determine the correction needed to match the exposure in this measurement. This correction is also shown in Fig 49. The inverse beta decay cross-section from Ref. [93] is used to convert the ν e flux to a predicted spectrum. These components are summed to produce the model of the HFIR ν e spectrum that the PROSPECT detector is exposed to. The total ν e spectrum is passed through the detector response model to produce a predicted E rec spectrum which can be compared to the PROSPECT measurement. Further details of the HFIR prediction can be found in the Supplemental Material.

B. Statistical Treatment
A χ 2 metric is used to quantify the comparison between the measured spectrum and the beta conversion 235 U model prediction:
$$\chi^2 = \Delta^{T} V^{-1} \Delta,$$
where ∆ i is the difference between the measured and predicted events in the ith E rec bin including a free-floating nuisance parameter η to account for the normalization. The total uncertainty covariance matrix (V) is used to determine the minimum χ 2 for the measurement, including all uncertainties from signal and background statistics, detector, background, and reactor-related systematics, and from the theoretical model for the 235 U ν e spectrum. Statistical uncertainties from signal and background datasets are determined using methods similar to those for the oscillation analysis. For reactor-related spectrum uncertainties, a 100% uncertainty is assumed for all non-235 U corrections and for the non-equilibrium correction. For theoretical model uncertainties, the Huber model's published covariance matrix is converted into PROSPECT E rec space via Cholesky decomposition.
For detector and background systematic uncertainties, a covariance matrix was generated for each contribution by either varying parameters in simulated data, or by analytically varying the Huber spectrum [25] passed through the full detector response. Values used for each uncertainty were chosen as the result of a dedicated study of each effect. These effects include the physical properties of the detector, such as nonlinearity, energy loss, Cherenkov contributions, and wall thickness, as well as components of analysis cuts or signal definition, such as fiducial volume, energy threshold, or background subtraction. The origin of each of these systematic uncertainties has been provided throughout the previous sections of this paper.
To provide an illustration of the relative contribution from different uncertainty sources for the spectrum analysis, Fig 50 shows the diagonal elements of the various categories included in the full uncertainty covariance matrix. Statistics clearly serve as the dominant source of uncertainty for the current 235 U spectrum measurement, with detector-related systematic uncertainties as the largest sub-dominant uncertainty contributor. Reactor and model-related uncertainties provide the smallest overall uncertainty contribution. Figure 50 also provides a breakdown of the largest detector-related contributors. The dominant sources of detector systematic uncertainty are the limitations of understanding of the detector's E rec scale and non-linearity, as well as the uncertainty in the total dead mass contributed by the reflecting walls of the optical grid.

C. Results
The comparison of the Huber model to the measured spectrum is shown in Fig 51. The normalization of the model is determined by a minimization of the χ 2 in the [0.8,7.2] MeV region. A χ 2 /DOF of 30.79/31 is observed, corresponding to a one-sided p-value of 0.48. To quantify further whether any specific region of the spectrum contributes significantly to this total χ 2 , additional nuisance parameters are added in 200 keV and 1 MeV-wide windows and a new χ 2 min is determined for each. The resulting change in the minimum, ∆χ 2 , can be interpreted as the local contribution to the total χ 2 . The corresponding one-sided p-values are determined from the ∆χ 2 and plotted in Fig 51. Small excursions are observed in the 2.5 MeV and 5 MeV regions using this method. However, no region shows more than a 2σ deviation within the 1 MeV model prediction windows used.
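The quoted p-value can be cross-checked from the χ 2 and the number of degrees of freedom alone. The sketch below assumes 200 keV bins across [0.8,7.2] MeV, which gives 32 bins and, with one floating normalization, the 31 degrees of freedom quoted above. It integrates the χ 2 probability density numerically using only the Python standard library; the function names are illustrative.

```python
import math

def chi2_pdf(x, dof):
    """Chi-square probability density; the x <= 0 branch is valid for dof > 2."""
    if x <= 0.0:
        return 0.0
    half = 0.5 * dof
    return math.exp((half - 1.0) * math.log(x) - 0.5 * x
                    - half * math.log(2.0) - math.lgamma(half))

def chi2_p_value(chi2, dof, steps=20000):
    """Upper-tail probability P(X >= chi2), via Simpson's rule on [0, chi2]."""
    if dof <= 2 or steps % 2:
        raise ValueError("this sketch assumes dof > 2 and an even number of steps")
    h = chi2 / steps
    total = chi2_pdf(0.0, dof) + chi2_pdf(chi2, dof)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * chi2_pdf(k * h, dof)
    return 1.0 - total * h / 3.0

n_bins = round((7.2 - 0.8) / 0.2)  # assumed 200 keV binning
dof = n_bins - 1                   # one free normalization parameter
p = chi2_p_value(30.79, dof)
print(f"dof = {dof}, p-value = {p:.2f}")
```

The result is close to the quoted value of 0.48, as expected for a χ 2 that lies near its number of degrees of freedom.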
Precision measurements at nuclear power reactors have observed discrepancies between predicted and detected ν e energy spectra. Most notably, a wide excess of events between 4-6 MeV E rec has generated much interest in the community. As these LEU reactors burn a time-evolving mixture of fuel, it is difficult to disentangle the isotopic origin of this distortion. To test whether PROSPECT observes such a feature, a Gaussian with mean 5.678 MeV and sigma 0.562 MeV is added to the HFIR model in true neutrino energy prior to applying the detector response. The mean and sigma of the Gaussian are obtained from fitting the unfolded Daya Bay spectrum [18]. The amplitude (A) of this addition, in units where a Daya Bay-sized distortion is equal to one, is varied, yielding the single-parameter χ 2 curve shown in Fig 52. A best-fit distortion of 0.84±0.39 is observed. Fig 51b shows a comparison of the data to both the best-fit distortion and the unmodified HFIR predicted spectrum.
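The order of operations described above, adding the Gaussian in true neutrino energy and only then applying the detector response, can be sketched as follows. Only the Gaussian mean and sigma come from the text. The model values, the response matrix, and `unit_scale` (the size of a Daya Bay-sized distortion in the model's units) are hypothetical placeholders.

```python
import math

BUMP_MEAN_MEV = 5.678   # from the text
BUMP_SIGMA_MEV = 0.562  # from the text

def gaussian_shape(e):
    """Unit-height Gaussian in true neutrino energy."""
    return math.exp(-0.5 * ((e - BUMP_MEAN_MEV) / BUMP_SIGMA_MEV) ** 2)

def distorted_model(model, centers, amplitude, unit_scale):
    """Add amplitude * unit_scale * Gaussian to the true-energy model, bin by bin."""
    return [m + amplitude * unit_scale * gaussian_shape(e)
            for m, e in zip(model, centers)]

def apply_response(response, true_spectrum):
    """Fold into reconstructed-energy bins: rec_i = sum_j R_ij * true_j."""
    return [sum(r * t for r, t in zip(row, true_spectrum)) for row in response]

# Hypothetical toy inputs: five true-energy bins and a simple smearing matrix.
centers = [4.5, 5.0, 5.5, 6.0, 6.5]
model = [50.0, 40.0, 30.0, 20.0, 10.0]
response = [[0.8, 0.1, 0.0, 0.0, 0.0],
            [0.2, 0.8, 0.1, 0.0, 0.0],
            [0.0, 0.1, 0.8, 0.1, 0.0],
            [0.0, 0.0, 0.1, 0.8, 0.2],
            [0.0, 0.0, 0.0, 0.1, 0.8]]
unit_scale = 3.0  # placeholder value

rec_by_amplitude = {
    a: apply_response(response, distorted_model(model, centers, a, unit_scale))
    for a in (0.0, 1.0, 2.0)
}
for a, rec in rec_by_amplitude.items():
    print(a, [round(x, 2) for x in rec])
```

Because both the addition and the folding are linear, the reconstructed-energy prediction is linear in A, so a scan over A only requires folding the unmodified model and the unit-amplitude Gaussian once each.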
The data are consistent with a distortion of equal size to that observed by the θ 13 experiments (A = 1). However, the data disfavor the null hypothesis of no distortion in the 235 U spectrum (A = 0) at 2.17σ, and disfavor at 2.44σ a 235 U spectral distortion of the size (A = 1.78) required for 235 U to be the sole source of the distortion seen in the θ 13 measurements.
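A rough consistency check of these significances is possible from the best-fit value and its uncertainty alone. The sketch below assumes a parabolic single-parameter χ 2 curve, ∆χ 2 (A) = ((A − 0.84)/0.39)^2, and converts it to a significance as the square root of ∆χ 2 . This is an approximation to the curve in Fig 52, not a reproduction of it.

```python
import math

BEST_FIT = 0.84  # best-fit amplitude from the text
SIGMA = 0.39     # quoted 1-sigma uncertainty from the text

def delta_chi2(amplitude):
    """Parabolic approximation to the single-parameter chi2 curve."""
    return ((amplitude - BEST_FIT) / SIGMA) ** 2

def n_sigma(amplitude):
    """Significance for one fitted parameter: sqrt(delta chi2)."""
    return math.sqrt(delta_chi2(amplitude))

for a in (0.0, 1.0, 1.78):
    print(f"A = {a:4.2f}: delta chi2 = {delta_chi2(a):5.2f}, {n_sigma(a):4.2f} sigma")
```

The approximation gives about 2.15σ for A = 0 and about 2.41σ for A = 1.78, close to the quoted 2.17σ and 2.44σ, while A = 1 lies well within 1σ of the best fit. Small differences are expected, since the inputs are rounded and the actual curve need not be exactly parabolic.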

X. SUMMARY
During 96 calendar days of reactor-on data-taking between March and October 2018, the PROSPECT experiment observed over 50,000 inverse beta decay interactions of ν e produced by 235 U fission product decays in the highly enriched 85 MW HFIR reactor. Despite deployment on the Earth's surface in a high-background reactor facility environment, the PROSPECT IBD analysis is capable of selecting more signal IBD events than either cosmogenic-induced backgrounds (signal-to-background ratio of 1.4) or accidental backgrounds (signal-to-background ratio of 1.8). In reviewing the signal and background modelling, estimation, and validation processes, a number of unexpected but useful PROSPECT capabilities were also demonstrated, such as cosmic muon tomography of the HFIR water pool and the ability to determine the direction of propagation of an observed flux of reactor ν e .
In order to probe short-baseline reactor antineutrino disappearance with PROSPECT, reconstructed prompt energy spectra at ten different reactor-detector baselines were compared. In particular, baseline-dependent variations in detected energy spectra would indicate disappearance produced by oscillation between active and sterile neutrino sectors. In this paper, it was shown using two different statistical techniques that these relative baseline comparisons give no significant indication of sterile neutrino oscillations. While a best fit to the data in the sterile neutrino parameter space is found at (sin 2 2θ 14 ,∆m 2 ) = (0.11,1.78 eV 2 ), this preference is very mild: the no-oscillation hypothesis is compatible with the data, with a p-value of 0.57. By contrast, the canonical Reactor Antineutrino Anomaly best-fit point given in [44] is disfavored at the 2.5σ confidence level. Other regions of parameter space in the ∼0.1-15 eV 2 mass splitting range are disfavored at more than 95% confidence level by PROSPECT's data.
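To give a sense of the size of the baseline-dependent effect at the quoted best-fit point, the sketch below evaluates the standard two-flavor survival probability, P = 1 − sin 2 2θ 14 sin 2 (1.267 ∆m 2 L/E), with ∆m 2 in eV 2 , L in meters, and E in MeV. The choice of a 4 MeV antineutrino energy and the three sample baselines are illustrative assumptions; detector resolution and the finite core and segment sizes, which smear the pattern in practice, are ignored.

```python
import math

def survival_probability(sin2_2theta, dm2_ev2, baseline_m, energy_mev):
    """Two-flavor electron antineutrino survival probability."""
    phase = 1.267 * dm2_ev2 * baseline_m / energy_mev
    return 1.0 - sin2_2theta * math.sin(phase) ** 2

SIN2_2THETA14 = 0.11  # best-fit value from the text
DM2_EV2 = 1.78        # best-fit value from the text

p_mid = survival_probability(SIN2_2THETA14, DM2_EV2, 7.9, 4.0)
for baseline in (6.7, 7.9, 9.2):  # metres, spanning the quoted baseline range
    p = survival_probability(SIN2_2THETA14, DM2_EV2, baseline, 4.0)
    print(f"L = {baseline} m: P = {p:.3f}")
```

At 7.9 m and 4 MeV the survival probability is roughly 0.90 for these parameters, and it can never fall below 1 − sin 2 2θ 14 = 0.89, which indicates the percent-level spectral differences between baselines that the relative comparison must resolve.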
By integrating the measured prompt energy spectra over all baseline ranges, PROSPECT has also reported on a new measurement of the 235 U ν e energy spectrum. PROSPECT's updated 235 U spectrum result shows good agreement with the beta conversion ν e prediction of Huber [25], with a χ 2 /DOF of 30.79/31. By measuring a nearly pure sample of ν e resulting from 235 U fission, PROSPECT is able to assess hypotheses regarding the origin of differences between modelled