Low exposure long-baseline neutrino oscillation sensitivity of the DUNE experiment

The Deep Underground Neutrino Experiment (DUNE) will produce world-leading neutrino oscillation measurements over the lifetime of the experiment. In this work, we explore DUNE's sensitivity to observe charge-parity violation (CPV) in the neutrino sector, and to resolve the mass ordering, for exposures of up to 100 kiloton-megawatt-years (kt-MW-yr). The analysis includes detailed uncertainties on the flux prediction, the neutrino interaction model, and detector effects. We demonstrate that DUNE will be able to unambiguously resolve the neutrino mass ordering at a 3$\sigma$ (5$\sigma$) level, with a 66 (100) kt-MW-yr far detector exposure, and has the ability to make strong statements at significantly shorter exposures depending on the true value of other oscillation parameters. We also show that DUNE has the potential to make a robust measurement of CPV at a 3$\sigma$ level with a 100 kt-MW-yr exposure for the maximally CP-violating values $\delta_{\rm CP}} = \pm\pi/2$. Additionally, the dependence of DUNE's sensitivity on the exposure taken in neutrino-enhanced and antineutrino-enhanced running is discussed. An equal fraction of exposure taken in each beam mode is found to be close to optimal when considered over the entire space of interest.

F. Xie, 185 E. Yandel, 33 G. Yang, 183 K. Yang,152 66 (The DUNE Collaboration) The Deep Underground Neutrino Experiment (DUNE) will produce world-leading neutrino oscillation measurements over the lifetime of the experiment. In this work, we explore DUNE's sensitivity to observe charge-parity violation (CPV) in the neutrino sector, and to resolve the mass ordering, for exposures of up to 100 kiloton-megawatt-years (kt-MW-yr). The analysis includes detailed uncertainties on the flux prediction, the neutrino interaction model, and detector effects. We demonstrate that DUNE will be able to unambiguously resolve the neutrino mass ordering at a 3σ (5σ) level, with a 66 (100) kt-MW-yr far detector exposure, and has the ability to make strong statements at significantly shorter exposures depending on the true value of other oscillation parameters. We also show that DUNE has the potential to make a robust measurement of CPV at a 3σ level with a 100 kt-MW-yr exposure for the maximally CP-violating values δCP = ±π/2. Additionally, the dependence of DUNE's sensitivity on the exposure taken in neutrino-enhanced and antineutrino-enhanced running is discussed. An equal fraction of exposure taken in each beam mode is found to be close to optimal when considered over the entire space of interest.

I. INTRODUCTION
The Deep Underground Neutrino Experiment (DUNE) [1] is a next-generation long-baseline neutrino oscillation experiment which will utilize high-intensity ν µ andν µ beams with peak neutrino energies of ≈2.5 GeV over a 1285 km baseline to carry out a detailed study of neutrino mixing. Some of DUNE's key scientific goals are the definitive determination of the neutrino mass ordering, the definitive observation of charge-parity symmetry violation (CPV) for more than 50% of possible true values of the charge-parity violating phase, δ CP , and the precise measurement of other three-neutrino oscillation parameters. These measurements will help guide theory in understanding if there are new symmetries in the neutrino sector and whether there is * Corresponding author: cwilkinson@lbl.gov a relationship between the generational structure of quarks and leptons [2]. Observation of CPV in neutrinos would be an important step in understanding the origin of the baryon asymmetry of the universe [3,4]. DUNE has a rich physics program beyond the three-neutrino oscillation accelerator neutrino program described here. These include beyond standard model searches [5], supernova neutrino detection [6], and solar neutrino detection [7].
Neutrino oscillation experiments have so far measured five of the three-neutrino mixing parameters [10][11][12]: the three mixing angles θ 12 , θ 23 , and θ 13 ; and the two squared-mass differences ∆m 2 21 and |∆m 2 31 |, where ∆m 2 ij = m 2 i − m 2 j is the difference between the squares of the neutrino masses. The neutrino mass ordering (the sign of ∆m 2 31 ) is not currently known, though recent results show a weak preference for the normal ordering (NO), where ∆m 2 31 > 0, over the inverted ordering (IO) [13][14][15]. The value of δ CP is not well known, though neutrino oscillation data are beginning to provide some information on its value [13,16]. The oscillation probability of ν ( -) µ → ν ( -) e through matter in the standard three-flavor model and a constant matter density approximation can be written as [17]: where a = ± G F N e √ 2 ≈ ± 1 3500 km ρ 3.0 g/cm 3 , aL ≈ 0.367, G F is the Fermi constant, N e is the number density of electrons in the Earth's crust, ∆ ij = 1.267∆m 2 ij L/E ν , ∆m 2 ij is in eV 2 , L is the baseline in km, and E ν is the neutrino energy in GeV. Both δ CP and a terms are positive (negative) for ν µ → ν e (ν µ →ν e ) oscillations. The matter effect asymmetry encapsulated in the a terms arises from the presence of electrons and absence of positrons in the Earth's crust [18,19]. In the analysis described here, the oscillation probabilities are calculated exactly [20].
DUNE has published sensitivity estimates [21] to CPV and the neutrino mass ordering, as well as other oscillation parameters, for large exposures of up to 1104 kiloton-megawatt-years (kt-MW-yr), which show the ultimate sensitivity of the experiment. Sophisticated studies with a detailed treatment of systematic uncertainties were carried out only at large exposures. In this work, DUNE's sensitivity at low exposures is explored further, with a detailed systematics treatment, including an investigation into how the run plan may be optimized to enhance sensitivity to CPV and/or mass ordering. It is shown that DUNE will produce world-leading results at relatively short exposures, which highlights the need for a high-performance near detector complex from the beginning of the experiment.
The DUNE far detector (FD) will ultimately consist of four modules, each with a 17 kt total mass. The neutrino beam-line has an initial design intensity of 1.2 MW, with a planned upgrade to 2.4 MW. We assume a combined yearly Fermilab accelerator and neutrino beam-line uptime of 57% [8], and include this in our calculation of calendar years in this work. As the FD deployment schedule and beam power scenarios are both subject to change, the results shown in this work are consistently given in terms of exposure in units of kt-MW-yr, which is agnostic to the exact staging scenario, but can easily be expressed in terms of experiment years for any desired scenario. For example, with two FD modules, assuming a fiducial mass of 10 kt and a beam intensity of 1.2 MW, exposure would accumulate at a rate of 24 kt-MW-yr per calendar year, although a ramp up in beam power is expected before reaching the design intensity in early running. In this work, the single-phase horizontal drift technology is assumed for all FD modules (see Section II D), which is a necessary simplification, but alternative technologies which may have slightly different performance are under investigation for some FD modules.
The analysis framework is described in Section II, including a description of the flux, neutrino interaction and detector models and associated uncertainties. A study on the dependence of the sensitivity to CPV and mass ordering on the fraction of data collected in neutrino-enhanced or antineutrino-enhanced running is given in Section III. A detailed study on the CPV and mass ordering sensitivities at low exposures are described in Sections IV and V, respectively. Finally, conclusions are presented in Section VI.

II. ANALYSIS FRAMEWORK
This work uses the flux, neutrino interaction and detector model described in detail in Ref. [21], implemented in the CAFAna framework [22]. This section provides an overview of the key analysis features. Further details on all aspects can be found in Ref. [21].
A. Neutrino flux DUNE will operate with two different beam modes, which depend on the polarity of the electromagnetic horns used to focus secondary particles produced after protons from the primary beam line interact in the target. Forward horn current (FHC) corresponds to neutrinoenhanced running, and reverse horn current (RHC) corresponds to antineutrino-enhanced running. In both FHC and RHC there are significant contributions from neutrinos with energies between 0.5-6 GeV, with a flux peak at ≈2.5 GeV. The neutrino flux prediction is generated with G4LBNF [8,23], using the LBNF optimized beam design [8]. Flux uncertainties are due to the production rates and kinematic distributions of hadrons produced in the target and the parameters of the beam line, such as horn currents and horn and target positioning ("focusing uncertainties") [8]. They are evaluated using current measurements of hadron production and estimates of alignment tolerances, giving flux uncertainties of approximately 8% at the first oscillation maximum, which are highly correlated across energy bins and neutrino flavors. A flux covariance as a function of neutrino energy, beam mode, detector, and neutrino species is generated with a "toy throw" approach, which is built using variations ("throws") of the systematics propagated through the full beam line simulation. To reduce the number of parameters used in the fit, the covariance matrix is diagonalized, and each principal component is treated as an uncorrelated nuisance parameter. Only the first ≈30 principal components (out of 108) were found to have a significant effect in the analysis and were included. The shapes of the unoscillated fluxes at the ND and FD are similar, and the differences between them are understood at the percent level.

B. Neutrino interaction model
The interaction model used is based on GENIE v2.12.10 [24,25], although the combination of models used is much closer to some of the physics tunes available with GENIE v3.00.06, including a number of uncertainties beyond those provided by either GENIE version. These are motivated by data, although the available (anti)neutrino data taken on argon targets is sparse, leading to an uncertainty model that relies in a number of places on light target (mostly hydrocarbon) data. Variations in the cross sections are implemented either using GENIE reweighting parameters, or with ad hoc weights of events designed to parameterize uncertainties or cross-section corrections currently not implemented within GENIE. The latter were developed using alternative generators or GENIE configurations, or custom weightings using the NUISANCE package [26]. Further details about the uncertainties used can be found in Ref. [21] (Section 3).
The nuclear model which describes the initial state of nucleons in the nucleus is the Bodek-Ritchie global Fermi gas model [27], which includes empirical modifications to the nucleon momentum distribution to account for short-range correlation effects. The quasi-elastic model uses the Llewelyn-Smith formalism [28] with a simple dipole axial form factor, and BBBA05 vector form factors [29]. Nuclear screening effects and uncertainties are included based on the T2K 2017/8 parameterization [30] of the Valencia group's [31,32] Random Phase Approximation model. The Valencia model of the multinucleon, 2p2h, contribution to the cross section [31,32] is used, as described in Ref. [33]. Both MINERvA [34] and NOvA [35] have shown that this model underpredicts observed event rates on carbon at relevant neutrino energies for DUNE. Modifications to the model are constructed to produce agreement with MINERvA CCinclusive data [34], which are used in the analysis to introduce additional uncertainties on the 2p2h contribution, with energy dependent uncertainties, and extra freedom between neutrinos and antineutrinos. Uncertainties are added on scaling the 2p2h prediction from carbon to argon on electron-scattering measurements of short-range correlated (SRC) pairs taken on multiple targets [36], separately for neutrinos and antineutrinos. GENIE uses a modified version of the Rein-Sehgal (R-S) model for pion production [37]. A data-driven modification to the GENIE model is included based on reanalyzed neutrinodeuterium bubble chamber data [38,39]. The Deep Inelastic Scattering (DIS) model implemented in GENIE uses the Bodek-Yang parametrization [40], and GRV98 parton distribution functions [41]. Hadronization is described by the AKGY model [42], which uses the KNO scaling model [43] for invariant masses W ≤ 2.3 GeV and PYTHIA6 [44] for invariant masses W ≥ 3 GeV, with a smooth transition between the two models for intermediate invariant masses. Additional uncertainties developed by the NOvA Collaboration [45] to describe their resonance to DIS transition region data are also included. The final state interaction model and uncertainties available in GENIE are retained [46][47][48].
The cross sections include terms proportional to the lepton mass, which are significant at low energies where quasielastic processes dominate. Some of the form factors in these terms have significant uncertainties in the nuclear environment. Separate (and anticorrelated) uncertainties on the cross section ratio σ µ /σ e for neutrinos and antineutrinos are adopted from Ref. [49]. Additionally, some ν e charged-current (CC) interactions occur at four-momentum transfers where ν µ CC interactions are kinematically forbidden, and so cannot be constrained by ν µ cross-section measurements. To reflect this, a 100% uncertainty is applied in the phase space present for ν e but absent for ν µ .

C. Near detector simulation and reconstruction
The near detector (ND) hall will be located 574 m downstream of the proton target and ≈60 m underground. The reference design for the DUNE ND system is fully described in Ref. [9], and consists of a liquid argon (LAr) time projection chamber (TPC) referred to as ND-LAr, a magnetized high-pressure gaseous argon TPC (ND-GAr), and an on-axis beam monitor (SAND). Additionally, ND-LAr and ND-GAr are designed to move perpendicular to the beam axis in order to take data at various off-axis angles (the DUNE-PRISM technique). ND-LAr is a modular detector based on the ArgonCube design [50][51][52], with a total active LAr volume of 105 m 3 (a LAr mass of 147 tons). ND-GAr is implemented in this analysis as a cylindrical TPC filled with a 90/10 mixture of argon and CH 4 at 10 bar, surrounded by a granular, high-performance electromagnetic calorimeter (ECal). ND-GAr sits immediately downstream of the LAr cryostat and serves as a muon spectrometer for ND-LAr [9].
Neutrino interactions are simulated in the active volume of ND-LAr. The propagation of neutrino interaction products through the ND-LAr and ND-GAr detector volumes is simulated using a Geant4-based program [53]. As pattern recognition and reconstruction software has not yet been fully developed for the ND, this analysis uses a parameterized reconstruction based on the Geant4 simulated energy deposits in active detector volumes.
Only CC-inclusive interactions originating in the LAr are considered in this analysis, with a fiducial volume (FV) which excludes 50 cm from the sides and upstream edge, and 150 cm from the downstream edge of the active region, containing a total fiducial mass of ≈50 t. Most muons with kinetic energies greater than 1 GeV exit ND-LAr. Energetic forward-going muons pass into ND-GAr, where their momentum and charge are reconstructed by curvature. Muon energy is reconstructed by range for tracks that stop in the LAr, and the charge cannot be determined event-by-event. Events with muons that exit the LAr active volume and do not match to a track in ND-GAr are rejected, as the muon momentum is not well reconstructed. For FHC beam running, the wrong-sign background is small and the charge is assumed to be negative for all LAr-contained muons. For RHC beam running, a Michel electron is required at the end of these stopped tracks to suppress the wrong-sign µ − by a factor of four.
All generated muons and charged pions are evaluated as potential muon candidates. Tracks are classified as muons if their length is at least 1 m, and their mean energy deposit is less than 3 MeV/cm. The minimum length requirement imposes an effective threshold on the true muon kinetic energy of about 200 MeV, but greatly suppresses potential neutral current (NC) backgrounds with low-energy, non-interacting charged pions. Charged-current events are required to have exactly one muon candidate, and if the charge is reconstructed by curvature, it must be of the appropriate sign. Hadronic energy in the ND is reconstructed by summing all charge deposits in the LAr active volume that are not associated with the muon. To remove events where the hadronic energy is badly reconstructed due to charged particles exiting the detector, a veto region is defined as the outer 30 cm of the active volume on all sides, and events with more than 30 MeV total energy deposited in the veto region are rejected. Only a fraction of neutron kinetic energy is typically observed (24% on average with large fluctuations), resulting in poor energy reconstruction of events with energetic neutrons. The reconstructed neutrino energy, E rec ν = E rec µ + E rec had , is the sum of the reconstructed hadronic energy, E rec had , and the reconstructed muon energy, E rec µ . The reconstructed inelasticity, y rec = 1 − E rec µ /E rec ν , is the fraction of the neutrino energy that is carried by hadrons.

D. Far detector simulation and reconstruction
The DUNE FD design consists of four separate LArTPC detector modules, each with a total LAr mass of 17 kt, installed ≈1.5 km underground at the Sanford Underground Research Facility (SURF) [54]. The technologies to be deployed for the four modules and their order of construction are still under investigation, so in this analysis, only the single-phase design with a horizontal drift [55] is used. In this design, signals from drift electrons in the 13.3 × 12.0 × 57.5 m 3 active volume are read out by ≈5 mm spaced wires in anode readout planes. Scintillation light produced at the time of the neutrino interaction is detected and used to reconstruct the start time of the electron drift. We have developed a full simulation chain, which generates neutrino events in a Geant4 model of the FD geometry and simulates the electronics readout. We have developed a reconstruction package to calculate efficiencies and reconstructed neutrino energy estimators for the four CC-inclusive FD samples used in the analysis (ν µ -like FHC, ν e -like FHC,ν µ -like RHC and ν e -like RHC).
The electronics response to the ionization electrons and scintillation light is simulated in the wire planes and photon detectors, respectively. Algorithms are applied to remove the impact of the LArTPC electric field and electronics response from the raw detector signal to identify hits, and to cluster hits that may be grouped together due to proximity in time and space. Clusters from different wire planes are matched to form high-level objects such as tracks and showers using the Pandora toolkit [56,57]. Event classification is carried out through image recognition techniques using a convolutional neural network [58] which classifies events as ν events is estimated by the sum of the energy of the longest reconstructed track (highest energy reconstructed electromagnetic shower) and the hadronic energy. For both event types, the hadronic energy is estimated from the charge of reconstructed hits that are not in the primary track or shower, and corrections are applied to each hit charge for recombination and electron lifetime effects. For ν ( -) µ -CC events, the energy of the longest track is estimated by range if the track is contained or by multiple Coulomb scattering if it is exiting. For 0.5-4 GeV neutrino energies, the observed neutrino energy resolution is 15-20%. The muon energy resolution is 4% for contained tracks and 18% for exiting tracks. The electron energy resolution is approximately 4%⊕9%/ √ E. The hadronic energy resolution is 34%.

E. Detector systematics
Detector effects impact the event selection efficiency as well as the reconstruction of neutrino energy and inelasticity (the variables used in the oscillation fits). The main sources of detector systematic uncertainties are limitations of the expected calibration and modeling of particles in the detector. Important differences between the ND and FD LArTPC design, size, detector environment, and calibration strategy, lead to uncertainties that do not fully correlate between the two detectors. The degree of correlation is under active study, but in this analysis they are treated as being completely uncorrelated. Detailed simulations of detector effects are under devel-opment. In this analysis, uncertainties on the energy scale, energy resolution, particle responses, and detector acceptance are included to encapsulate these effects. The absolute scale uncertainties shift the reconstructed energy distributions, while the resolution uncertainties narrow or broaden them.
An uncertainty on the overall energy scale is included in the analysis presented here, as well as particle energy scale and resolution uncertainties that are separate and uncorrelated between four particle classes: muons, charged hadrons, neutrons, and electromagnetic showers. In the ND, muons reconstructed by range in LAr and by curvature in the ND-GAr are treated separately and assigned uncorrelated uncertainties. For each class of particle, uncertainties on the energy scale are introduced as a function of the reconstructed particle energy, E, with a constant term, a term proportional to √ E, and a term proportional to 1/ √ E. A 10% uncertainty on the energy resolution is also included, and treated as uncorrelated between the four particle classes. The parameters produce a shift to the kinematic variables in an event, as opposed to simply assigning a weight to each simulated event. The scale of the uncertainties is motivated by what has been achieved in recent experiments, including calorimetric based approaches (NOvA, MINERvA) and LArTPCs (LArIAT, MicroBooNE, ArgoNeut).
In addition to impacting energy reconstruction, the Efield model also affects the definition of the FD FV, which is sensitive to electron drift. An additional 1% uncertainty is therefore included on the total fiducial mass, which is conservatively treated as uncorrelated between the ν ( -) µ and ν ( -) e samples due to the potential distortion caused by large electromagnetic showers in the electron sample.
The FD is sufficiently large that acceptance is not expected to vary significantly as a function of event kinematics. However, the ND acceptance does vary as a function of both muon and hadronic kinematics due to various containment criteria. Uncertainties are evaluated on the muon and hadron acceptance at the ND based on the change in the acceptance as a function of muon kinematics and true hadronic energy.

F. Sensitivity Methods
Systematics are implemented in the analysis using onedimensional response functions for each analysis bin, and oscillation weights are calculated exactly, in fine (50 MeV) bins of true neutrino energy. For a given set of inputs -flux, oscillation parameters, cross sections, detector energy response matrices, and detector efficiency -an expected event rate can be produced. Minimization is performed using the minuit [59] package.
Oscillation sensitivities are obtained by simultaneously fitting the ν µ -like FHC, ν e -like FHC,ν µ -like RHC and ν e -like RHC FD spectra along with the ν µ FHC andν µ RHC samples from the ND. In the studies, all oscillation   [60,61] to neutrino oscillation data. The matter density is taken from Ref. [62]. Because the probability distributions are somewhat non-Gaussian (particularly for θ23), the relative uncertainty is computed using 1/6 of the 3σ allowed range from the fit, rather than 1/2 of the 1σ range. For θ23, θ13, and ∆m 2 31 , the best-fit values and uncertainties depend on whether NO or IO is assumed. The best fit for δCP is used as a test point in the analysis, but no uncertainty is assigned. Table I are allowed to vary. Gaussian penalty terms (taken from Table I) are applied to θ 12 , ∆m 2 21 , and the matter density, ρ, of the Earth along the DUNE baseline [62]. Some studies presented in this work include a Gaussian penalty term on θ 13 (also taken from NuFIT 4.0, given in Table I), which is precisely measured by experiments sensitive to reactor antineutrino disappearance [63][64][65]. The remaining parameters, sin 2 θ 23 , ∆m 2 32 , and δ CP are allowed to vary freely, with no penalty terms. The penalty terms are treated as uncorrelated with each other, and uncorrelated with other parameters.

parameters shown in
Flux, cross-section, and FD detector parameters are allowed to vary in the fit, but are constrained by a penalty term corresponding to the prior uncertainty. ND detector uncertainties are included via a covariance matrix based on the shape difference between ND prediction and the "data" (which comes from the simulation in this sensitivity study). The covariance matrix is constructed with a throwing technique. For each "throw", all ND energy scale, resolution, and acceptance parameters are simultaneously thrown according to their respective uncertainties, and the modified prediction is produced by varying the relevant quantities away from the nominal prediction according to the thrown parameter values. The bin-tobin covariance is determined by comparing the resulting spectra with the nominal prediction, in the same binning as is used in the oscillation sensitivity analysis.
The compatibility of a particular oscillation hypothesis with both ND and FD data is evaluated using the standard Poisson log-likelihood ratio [66]: where ϑ and x are the vector of oscillation parameter and nuisance parameter values, respectively; M i ( ϑ, x) and D i are the MC expectation and fake data in the ith reconstructed bin (summed over all selected samples), with the oscillation parameters neglected for the ND; ∆x j and σ j are the difference between the nominal and current value, and the prior uncertainty on the jth nuisance parameter; and V kl is the covariance matrix between ND bins described previously. To protect against false minima, all fits are repeated starting at four different δ CP values (-π, -π/2, 0, π/2), in both mass orderings, and in both sin 2 θ 23 octants, and the lowest obtained χ 2 value is taken as the true minimum.
Two approaches are used for the sensitivity studies presented in this work. Asimov studies [67] are carried out (in Section III) in which the fake (Asimov) dataset is the same as the nominal MC. In these, the true value of all systematic uncertainties and oscillation parameters are set to their nominal value (see Table I) except the parameters of interest, which are set to a test point. Then a fit is carried out in which all parameters can vary, constrained by their prior uncertainty where applicable. For the smallest exposures investigated with an Asimov study in this work, all samples have at least 100 events, satisfying the Gaussian approximation inherent in the Asimov method. Toy throw studies are performed (in Sections IV and V) in which an ensemble of systematic, oscillation parameter and statistical throws are made. Systematic throws are made according to their prior Gaussian uncertainties, oscillation parameters are randomly chosen as described in Table II, and Poisson fluctuations are then applied to all analysis bins, based on the mean event count for each bin after the systematic adjustments have been applied. For each throw in the ensemble, the test statistic is minimized, and the best-fit value of all parameters is determined. The expected resolution for parameters of interest are then determined from the spread in the distribution of their post-fit values.
Asimov studies are computationally efficient, and for Gaussian parameters and uncertainties, give a good sense of the median sensitivity of an experiment. Toy throwing studies are computationally expensive, fully explore the parameter space, and make fewer assumptions about the behavior of parameters and uncertainties.

G. Near and far detector samples and statistics
In this work, the sensitivity as a function of FD exposure is explored and results are reported in terms of kt-MW-yr, which does not assume any specific FD or beam intensity staging scenario. However, the ND used in this analysis (ND-LAr with a dowstream muon spectrometer) is assumed not to be staged, and as such the ND sample size corresponding to a particular FD exposure will vary based on the staging scenario. The nominal staging scenario from Ref. [21] is therefore retained for the purpose of normalizing the ND samples at each FD exposure. In that scenario, a 7 year exposure corresponds to 336 kt-MW-yr at the FD, and 480 t-MW-yr at the ND, summed over both beam modes. The ND statistics used in this analysis are scaled assuming this ratio throughout, using the same fraction of exposure in each beam mode as used at the FD. The ND samples used in this analysis are relatively quickly systematics limited in both beam modes, and so these approximations are unlikely to have a significant impact on the results.
The oscillation analysis presented here includes two CC-inclusive samples originating in the ND-LAr FV, an FHC ν µ and an RHCν µ sample. These samples are both binned in two dimensions, as a function of reconstructed neutrino energy and inelasticity, y rec = 1 − E rec µ /E rec ν . The sample distributions for both FHC and RHC are shown in Figure 1 for an exposure of 105 t-MW-yr, corresponding to 100 kt-MW-yr at the far detector with the assumptions stated above. The size of the systematic uncertainty bands from all of the flux, cross-section and ND detector systematics used in the analysis and described above are shown, as well as the postfit uncertainty bands obtained by performing an Asimov fit to the ND data. It is clear that even after a relatively small exposure of 105 t-MW-yr, the ND samples are very high statistics, and are systematics limited in the binning used in the analysis. Backgrounds in the ν ( -) µ -CC samples are also shown in Figure 1. NC backgrounds are predominantly from NC π ± production where the pion leaves a long track and does not shower. Wrong-sign contamination in the beam is a background where the charge of the muon is not reconstructed, which particularly affects low reconstructed neutrino energies in RHC. The wrong-sign background is also larger at high reconstructed inelasticity, y rec , due to the kinematics of neutrino and antineutrino scattering.  Table I).
The size of the systematic uncertainty bands from all of the flux, cross-section and FD detector systematics used in the analysis are shown, as well as the postfit uncertainty bands with parameters constrained by ND data. Backgrounds are also shown, the largest contribution comes from intrinsic ν ( -) e contamination in the beam, although NC and other flavors, The expected FD FHC ν e and RHCν e samples are shown in Figure 2 for a 100 kt-MW-yr total FD exposure, split equally between FHC and RHC beam modes. The systematic uncertainty bands with and without the ND constraint applied are shown, as well as the back-ground contributions. There are contributions from both ν e andν e in both beam modes. The NC, intrinsic beam ν ( -) e , and wrong flavor contamination is also shown; the largest background comes from the intrinsic ν ( -) e beam contribution in both modes. After a 50 kt-MW-yr exposure in FHC, the ν e sample statistical uncertainty is close to the systematic uncertainty before the ND constraint, although is still clearly statistics limited when the ND constraint is applied. Theν e sample is still strongly statistics limited after 50 kt-MW-yr exposure in RHC. The difference is largely due to the difference in the ν e andν e cross sections.
The expected FD FHC ν µ and RHCν µ samples are shown in Figure 3 for a 100 kt-MW-yr total FD exposure, split equally between FHC and RHC beam modes. The systematic uncertainty bands with and without the ND constraint applied are shown, as well as the background contributions. Although the wrong-sign ν µ contribution to the RHCν µ sample is shown separately, it still provides useful information for constraining the oscillation parameters and is included in the analysis. The statistics are much higher than in Figure 2; the statistical uncertainty on the ν µ FHC sample is smaller than the systematic uncertainty band for some regions of phase space, even after the ND constraint is applied, although the statistical uncertainty is larger than the constrained systematic uncertainty in the "dip" region, around 2.5 GeV, which is likely to have the most impact on the analysis. The statistical uncertainty on theν µ RHC sample is larger, again due to the smallerν µ (than ν µ ) cross section and lower fluxes in RHC running. The statistical uncertainty around the 2.5 GeV dip region is significantly larger than the systematic uncertainty band, although as for the FHC ν µ sample, the statistical uncertainty is smaller than the systematics for some regions of the parameter space.
Events with a reconstructed neutrino energy of less than 0.5 GeV (which are shown in Figures 1, 2 and 3) or a reconstructed neutrino energy greater than 10 GeV are not included in the analysis for any of the FD or ND samples.

III. RUN PLAN OPTIMIZATION
In previous DUNE sensitivity studies [21], equal running times in FHC and RHC were assumed, based on early sensitivity estimates for different scenarios. In this section, the dependence of the median CPV and mass ordering significances are studied, for different fractions of time spent in each beam mode, using the full analysis framework described in Section II. Figure 4 shows DUNE's Asimov sensitivity to CPV for a total 100 kt-MW-yr far detector exposure, with different fractions of FHC and RHC running, at the Nu-FIT 4.0 best fit value in both NO and IO (see Table I), shown with and without a penalty on θ 13 applied. For each point tested, all oscillation and nuisance parame-  Table I). The size of the systematic uncertainty bands from all of the flux, cross-section and ND detector systematics used in the analysis are shown, as well as the postfit uncertainty bands with parameters constrained by ND data. NC backgrounds and wrong-sign contributions to the event rate are also shown.
ters are allowed to vary, and three fits are carried out, two where δ CP is set to the CP-conserving values δ CP = 0 and δ CP = ±π, the minimum of which is the CP-conserving best-fit value, and another where δ CP is allowed to vary. The difference in the best-fit χ 2 values is calculated: and the square root of the difference is used as the figure of merit on the y-axis in Figure 4. There are some caveats associated with this figure of merit, which are discussed in Section IV. A 100 kt-MW-yr exposure is shown as it was identified in Ref [21] as the exposure at which DUNE's median CPV significance exceeds 3σ at δ CP = ±π/2, an important milestone in DUNE's physics program (with equal beam mode running). Figure 4 shows that when the reactor constraint on θ 13 is included, the sensitivity to CPV can be increased in some regions of δ CP parameter space with more FHC than RHC running. However, this degrades the sensitivity in other regions, most notably for δ CP > 0 regardless of the true mass ordering. This is due to a degeneracy between δ CP and the octant of sin 2 2θ 23 because both parameters impact the rate of ν e appearance. The degeneracy is resolved by including antineutrino data; the octant of sin 2 2θ 23 affects the rate of ν e andν e appearance in the same way, but the effect of δ CP is reversed for antineutrinos.
For regions of phase space where the octant degeneracy does not affect the result (e.g., sin 2 θ 23 ≈ 0.5), there is no degradation in the sensitivity, and enhanced FHC running increases the sensitivity for all values of δ CP . Increasing the fraction of RHC decreases the sensitivity for the entire δ CP range when the reactor θ 13 constraint is included, relative to equal beam mode running. This is due to the lower statistics of theν e sample (see Figure 2) because of the reduced antineutrino flux and cross section. For short exposures, DUNE will not have a competitive independent measurement of θ 13 , so the main analysis will include the reactor θ 13 constraint. Nonetheless, it is instructive to look at the results without the penalty applied. In this case, the sensitivity is severely degraded (as expected) for 100% running in either beam mode. Figure 5 shows DUNE's Asimov sensitivity to the mass ordering for a total 24 kt-MW-yr far detector exposure, with different fractions of FHC and RHC running, and the same four true oscillation parameter sets. A 24 kt-MW-yr exposure is used in Figure 5 as it is around the exposure at which DUNE's median mass ordering significance exceeds 5σ for some vales of δ CP [21]. For each point tested, all oscillation and nuisance parameters are allowed to vary, and two fits are carried out, one using each ordering. The difference in the best-fit χ 2 values is calculated: and the square root of the difference is used as the figure of merit on the y-axis in Figure 5. There are some caveats associated with this figure of merit, which are discussed in Section V. It is clear from Figure 5 that the mass ordering sensitivity has a strong dependence on the fraction of running in each beam mode. As in the CPV case, the effect is very different with and without the reactor θ 13 constraint included. If the true ordering is normal and the reactor θ 13 penalty is applied, the sensitivity increases significantly with increasing FHC running, with a full 1σ increase in the sensitivity between equal beam running and 100%    Table I).
is inverted, 100% FHC running would degrade the sensitivity by ≥1σ for all values of δ CP at the NuFIT 4.0 best fit point. Overall, the sensitivity to the inverted ordering is improved by a more equal split between the beam modes. It is clear that 100% RHC running gives poor sensitivity for all values tested.
Without the reactor θ 13 constraint, the greatest sensitivity is obtained with close to an equal split of FHC and RHC running, and the sensitivity is significantly reduced with 100% FHC running. This is because of a degeneracy between the effect of θ 13 and the mass ordering on the rate of ν e appearance in FHC mode. If the mass ordering is normal, the ν e rate in FHC will be enhanced; without the reactor constraint, this excess can be accommodated by increasing the value of θ 13 .
For comparison, Figure 6 shows the Asimov CPV and mass ordering sensitivities, with and without the reactor θ 13 constraint included, for true normal ordering only, for a large exposure of 336 kt-MW-yr, with different fractions of FHC and RHC running. At large exposures, running with strongly enhanced FHC no longer improves the sensitivity over equal beam mode running, with or without the θ 13 penalty applied, for either CPV or mass ordering determination. This can be understood because the enhancement to the statistics that enhanced FHC brings is no longer as important to the sensitivity, and   Table I).
DUNE is able to place a constraint on the value of θ 13 with its own data.
Overall, the sensitivity to CPV and the mass ordering is dependent on the division of running time between FHC and RHC, but a choice that increases the sensitivity in some region of parameter space can severely decrease the sensitivity in other regions. If there is strong reason to favor, for example, normal over inverted ordering when DUNE starts to take data, Figure 5 shows that this could be more rapidly verified by running with more FHC data than RHC data, as the reactor θ 13 constraint will be used in the main low exposure analysis. However, if this choice is wrong, this might cause DUNE to take longer to reach the same significance. Clearly this is an important consideration which should be revisited shortly before DUNE begins to collect data. Similarly, the CPV sensitivity shown in Figure 4 might be optimized if there is a strong reason to favor gaining sensitivity for δ CP > 0 or δ CP < 0, at a cost of reducing the sensitivity to CPV if δ CP has the other sign. But, it is clear from Figures 4 and 5 that equal running in FHC and RHC gives a close to optimal sensitivity across all of the parameter space, and as such is a reasonable a priori choice of run plan for studies of the DUNE sensitivity. Additionally, it is clear from Figure 6 that the improvement in the sensitivity with unequal beam running is a feature at low exposures,   Table I).
but not at high exposures, particularly because at high exposures when DUNE is able to constrain all the oscillation parameters with precision [21], there is a stronger motivation to run a DUNE-only analysis, without relying on the reactor θ 13 measurement.

IV. CP VIOLATION SENSITIVITY
In this section, CPV sensitivity results are presented. For simplicity, only true NO will be shown unless explicitly stated. In all cases, a joint ND+FD fit is performed, and a θ 13 penalty is always applied to incorporate the re-actor measurement, as described in Section II. An equal split between FHC and RHC running is assumed based on the results obtained in Section III. Asimov sensitivities, as shown in Section III, are instructive but do not give information on how the expected sensitivity may vary with statistical or systematic uncertainties, or for variations in the other oscillation parameters of interest. The width of the transparent bands cover 68% of fits in which random throws are used to simulate systematic, oscillation parameter and statistical variations, with independent fits performed for each throw constrained by prior uncertainties. The solid lines show the median significance. tion 3, which is calculated for each throw of the systematic, other oscillation parameters and statistics.
The sensitivity shown in Figure 7 has a characteristic double peak structure because the significance of a CPV measurement decreases around CP-conserving values. The systematic and statistical variations mean that all throws have ∆χ 2 CPV ≥ 0, and therefore neither the median significance nor the band showing the central 68% of throws reach exactly 0 at CP-conserving values. This is entirely expected, it simply means that random variations in the data will cause us to obtain a 1σ measurement of CPV ≈32% of the time for CP-conserving values. Median significances are slightly higher for IO than for NO, and by exposures of 100 kt-MW-yr, the median significance exceeds 3σ for the maximal CP-violating values of ±π/2. This presentation of the CPV sensitivity was followed in Ref. [21], and is very informative at high exposures. Around CP-conserving values (δ CP = {0, ±π}), the distribution of the sensitivity metric ∆χ 2 CPV is non-Gaussian for all exposures. Additionally, at lower exposures, as shown in Figure 7, the distribution of ∆χ 2 CPV around maximally CP-violating values of δ CP = ±π/2 is increasingly non-Gaussian, making the spread in sensitivity harder to interpret with this presentation.
The CPV significance in Figure 7 (and previously in Ref. [21]) is calculated using constant ∆χ 2 critical values, where ∆χ 2 CPV ≤ 1, 4, 9 corresponds to a significance of 1, 2 and 3σ for one degree of freedom. This assumption holds when Wilks' theorem can be applied [68], but can lead to incorrect coverage where it cannot. It is known to break down for low-statistics samples, around physical boundaries, in the case of cyclic parameters, and where there are significant degeneracies. It is likely that a constant ∆χ 2 treatment will break down for δ CP , where all of these issues apply, as has indeed been shown by the T2K Collaboration [13].
The Feldman-Cousins method [69] is a brute force numerical method to calculate confidence intervals with correct coverage. A large number of toy experiments are produced, where the parameter(s) of interest (here δ CP ) is set to a desired true value, all other systematic and oscillation parameters are thrown, as described in Section II, and a statistical throw is made, for the two ND samples and four FD sample used in the analysis. Then two fits are performed, one where the parameter(s) of interest are fixed to the true value, and another where the test statistic is minimized with respect to the parameter(s) of interest. In both fits, all other parameters are allowed to vary. For each throw, the profile likelihood ratio ∆χ 2 FC is calculated using the minimum χ 2 values for those two fits, as in Equation 5.
The distribution of these throws is used to calculate the ∆χ 2 FC value that gives the desired coverage, with the appropriate fraction of toys above/below the calculated value. These are labelled critical values, and are denoted ∆χ 2 c . A distribution of ∆χ 2 FC values is shown in Figure 8 for an example ND+FD analysis with a 100kt-MW-yr exposure at the far detector, equal FHC and RHC run fractions, and the reactor θ 13 constraint applied. In Figure 8, the ∆χ 2 c values corresponding to for 68.27% (1σ), 90%, 95.45% (2σ) and 99.73% (3σ) of the throws are indicated. The ∆χ 2 c values were only calculated up to the 3σ level due to the very large number of throws required for higher confidence levels. An uncertainty on the value of ∆χ 2 c obtained from the toy throw distribution (e.g., Figure 8), is obtained using a bootstrap rethrowing method [70]. The empirical PDF obtained from the throws is treated as the true PDF, and B independent samples of size n are drawn from it, where n is the total number of throws used to build the empirical PDF. Each throw can be drawn multiple times in this method, so the ensemble of throws is different in each sample. Then, the standard deviation sθ, on the ∆χ 2 c values of interest, ϑ, are calculated for each of the B samples using: where ϑ * i denotes the calculated ∆χ 2 c value of interest for each of the samples, andθ * is their average value. Figure 9 shows the evolution of the ∆χ 2 c values as a function of exposure for δ CP = 0, the relevant value for CPV sensitivity, for an ND+FD analysis with equal FHC and RHC running and the reactor θ 13 constraint applied. For all significance levels tested, the ∆χ 2 c rise quickly as a function of exposure, and stabilize at values slightly higher than those suggested by the constant ∆χ 2 method by exposures of ≈100 kt-MW-yr. The initial rise in the ∆χ 2 c values is due to the low statistics at those exposures. Overall, this implies that the CPV significance is slightly weaker than what would be inferred from σ = ∆χ 2 CPV , as used for example in Figure 7. Crucially, there is no constant increase in the ∆χ 2 c values over time as has been reported by the T2K Collaboration [13]. Details on the number of toy throws used at each point of Figure 9 are , and is indicated as the shaded line. To guide the eye, horizontal dashed lines are included which indicate the 1σ, 90%, 2σ and 3σ ∆χ 2 values assumed using the constant-∆χ 2 method, with one degree of freedom. The distribution of throws used produced to calculate the ∆χ 2 c values shown are given in Figure 19.
given in the Appendix, and the toy throw distributions from which the ∆χ 2 c values and their uncertainties were calculated are shown in Figure 19.
As δ CP is a cyclical parameter, with physical boundaries at ±π, it is interesting to see how the ∆χ 2 c values evolve as a function of it. Figure 10 shows the ∆χ 2 c as a function of true δ CP , for an ND+FD analysis with equal FHC and RHC running including the reactor θ 13 constraint, for both 100 kt-MW-yr and 336 kt-MW-yr exposures. There is a noticeable, although not large, depression in the ∆χ 2 c values at δ CP = ±π/2 for all significance levels considered. This effect is larger at the lower, 100 kt-MW-yr, exposure, and is larger at higher significance levels. It is also clear from Figure 10 that the ∆χ 2 c behaviour is very similar at δ CP = ±π/2 as at δ CP = 0. Although the ∆χ 2 c values are relevant for CPV sensitivity, this evolution of the ∆χ 2 c values with δ CP will be important for estimating DUNE's δ CP resolution. Details on the number of toy throws used at each point of Figure 10 are given in the Appendix, and the toy throw distributions used to calculate the ∆χ 2 c values and uncertainties are shown for the 100 kt-MW-yr (336 kt-MW-yr) test points in Figure 20 (Figure 21). DUNE's CPV sensitivity is calculated using Equation 3 from an ensemble of throws of all systematic, other oscillation parameters and statistics. Figure 11 shows the fraction of throws for which DUNE would observe a CPV significance above a discrete threshold, as a function of the true value of δ CP , for 1-3σ significances and for a variety of exposures. The shaded histograms show the complete treatment including FC, while the dashed histograms show the constant-∆χ 2 treatment, to show the deviation from Wilks' theorem, and to facilitate comparison at higher significances where the FC treatment becomes computationally prohibitive.
The point at which the median significance (50% of throws) passes different significance thresholds can be easily read from the figures, and can be compared with those shown in Figure 7. The same double peak structure seen in Figure 7 can be observed. The median significance for measuring CPV exceeds 3σ after ≈100 kt-MW-yr, but a significant fraction of throws exceed 3σ at shorter exposures. For a 336 kt-MW-yr exposure, the fraction of throws for which the significance is less than 3σ at maximal values of δ CP is very small. In general, the effect of the Feldman-Cousins correction is to reduce the fraction of toy throws that cross each significance threshold (with respect to the constant-∆χ 2 result), by a maximum of ≈10%, but the exact fraction changes as a function of true δ CP value and exposure. An exception to this general trend is the 3σ behaviour at 24 kt-MWyr, the lowest exposure shown, where the significance increases. This is due to fall in the 3σ ∆χ 2 c value towards the lowest exposures observed in Figure 9. The number of throws carried out at each exposure is indicated on each plot. The number of throws decreases as a function of exposure because fixed computing resources were used for each configuration, and the time for the ensemble of fits carried out for each throw to complete increases slightly with exposure. The final 336 kt-MW-yr exposure has more throws because it was generated for the analysis presented in Ref. [21], where more than one projection was considered -requiring more throws to sample the space. Figure 12 shows the fraction of throws which exceed different significance thresholds at the maximal δ CP violation value of δ CP = −π/2, and for 50% of δ CP values as a function of exposure, with and without FC corrections, for 1-3σ significance values. Figure 12 was produced using the same throws used for Figure 11, with additional points from higher exposures used in Ref. [21], but not shown in Figure 11 (646 kt-MW-yr and 1104 kt-MW-yr). After ≈200 kt-MW-yr, the median significance (including FC correction) for 50% of the δ CP range is greater than 3σ. It is clear from Figure 12 that the effect of the FC correction is not large, and ≈10% longer exposures are required for the median expected significance to cross each threshold than without correction, at both δ CP = −π/2 and for the 50% range of δ CP values.
Calculating ∆χ 2 c values above 3σ using the FC method is challenging due to the large number of throws to explore the tails of the ∆χ 2 FC distribution and prohibitive computational cost. In Figure 13, the fraction of throws that exceed 1-5σ significance, calculated only with the constant-∆χ 2 method are shown in order to explore DUNE's sensitivity at higher significance levels. All the caveats described above relating to the constant-∆χ 2 method still apply. Figure 13 shows that although the median significance to CPV does not exceed 5σ until ≈336 kt-MW-yr, there are significant fractions of throws at lower exposures which reach 5σ significance. Figure 14 shows the fraction of throws which exceed different significance thresholds at the maximal CP-violating value of δ CP = −π/2, and for 50% of all δ CP values, as a function of exposure. By ≈200 kt-MW-yr, where the median  significance for 50% of the δ CP range is greater than 3σ, the sensitivity at δ CP = −π/2 exceeds 4σ.

V. NEUTRINO MASS ORDERING SENSITIVITY
In this section, the toy-throwing approach described in Section II is used to explore the neutrino mass ordering sensitivity as a function of exposure in detail. In all cases, a joint ND+FD fit is performed, and the reactor θ 13 constraint is always applied, as described in Section II. An equal split between FHC and RHC running is assumed based on the results obtained in Section III. Figure 15 shows the significance with which the neutrino mass ordering can be determined for both true NO and IO, for exposures of 66 and 100 kt-MW-yr. The sensitivity metric used is the square root of the difference between the best fit χ 2 value obtained using each ordering, as shown in Equation 4, which is calculated for each throw of the systematics, other oscillation parameters and statistics. The characteristic shape of the MH sensitivity in Figure 15 results from near degeneracy between matter and CPV effects that occurs near δ CP = π/2 (δ CP = −π/2) for true normal (inverted) ordering. Dedicated studies have shown that special attention must be paid to the statistical interpretation of neutrino mass ordering sensitivities [71][72][73] because the ∆χ 2 MO metric does not follow the expected chi-square distribution for one degree of freedom, so the interpretation of the ∆χ 2 MO as the sensitivity is complicated. Given the complications with the interpretation of significance for mass ordering determination, it is instructive to look at the distribution of the test-statistic (Equation 4), which gives more information than the 68% central band and median throw shown in Figure 15. Figure 16 shows the distribution of ∆χ 2 MO obtained for a large ensemble of throws, for both true normal and inverted orderings, for a number of different exposures. There is a uniform distribution of true δ CP used in the throws at each exposure. The change in shape at higher exposures in Figure 16 is due to the degeneracy between δ CP and the effect of the mass ordering, and as might be expected from Figure 15, the separation between hierarchies is greater for some true values of δ CP than others. This additional structure starts to become obvious from a ≈66 kt-MW-yr exposure, at which point the CPV sensitivity is not very strong (see Section IV). For all exposures, the shape of the throw distribution is highly non-Gaussian, which makes it difficult to apply simple corrections to the sensitivity of the sort described in Ref. [73]. As a result alternatives to ∆χ 2 MO as a sensitivity metric are not explored, the full information is given in Figure 16. Figure 16 also indicates the probability for the test statistic ∆χ 2 MO to be less (more) than zero from the toy throws for true normal (inverted) orderings at each exposure. This information is summarized in Figure 17. This marks the proportion of toys which appear more like the incorrect ordering than the true ordering for the toy, and gives a sense of the ambiguity between the hierarchies, although it is not easily converted to a single number sensitivity. It is clear from Figures 16 and 17 that DUNE is sensitive to the mass ordering even from very low (≈12 kt-MW-yr) exposures, with a small probability for preferring the incorrect ordering. By exposures of 66 kt-MW-yr, the overlap between the orderings is very small with ≈1% of toy throws which appear more like NO values shown for both true normal (red) and true inverted (blue) hierarchies built using random throws of the systematic parameters, the oscillation parameters and with statistical variations. In each case, the χ 2 values are separately minimized with respect to all variable parameters before calculating the test statistic. The fraction of throws for which the value of ∆χ 2 MO is greater than (less than) 0 is also given for inverted (normal) hierarchies. For each ordering and exposure, approximately 100,000 throws were used. the incorrect ordering than the true ordering. Figure 18 shows an alternative way to present the result of the throws as a function of δ CP , which is complementary to Figure 15. The fraction of throws for which the simple figure of merit (the square-root of Equation 4) exceeds different confidence levels are shown, for 1-5σ significances, and a variety of exposures, all for true NO. The same throws are used as in Figures 16. Despite the caveats regarding the intepretation of ∆χ 2 MO as units of σ, the general trend is clear, and provides more information about the expected DUNE sensitivity at low exposures. As with Figures 11 and 13, the point at which the median significance (50% of throws) passes different significance thresholds can be easily read from the figures, and can be compared with those shown in Figure 15. The same general shape as a function of δ CP as was observed in Figure 15 can be seen. The general trend would be very similar in IO, reflected in the line δ CP = 0, although a slightly longer exposure is required to reach the same sensitivity. The median significance for δ CP = −π/2 exceeds 5σ for 24 kt-MW-yr, at which point the fraction of throws for which the significance is 3σ or smaller is only ≈2%. By 66 kt-MW-yr, 100% of the throws exceed 5σ at δ CP = −π/2. By 100 kt-MW-yr exposures, the median significance approaches 5 σ for all true values of δ CP . At long exposures of 336 kt-MW-yr, almost 100% of the throws exceed 5σ for all values of δ CP .

VI. CONCLUSION
In this work a detailed exploration of DUNE's sensitivity to CPV and the mass ordering at low exposures has been presented. The analysis uses the same framework, flux, cross section and detector models and selections as were used in Ref. [21], which showed the ultimate DUNE Exposure (kt-MW-yr) sensitivity to CPV, the neutrino mass ordering and other oscillation parameters, with large statistics samples after long exposures.
The effect of operating with different run plans, involving different ratios of FHC and RHC beam modes, on the mass ordering and CPV Asimov sensitivities was explored. It was found that for low exposures, the sensitivity to both CPV and the mass ordering can be increased for certain regions of parameter space, but at a cost to the sensitivity in other regions. This sensitivity increase is in part produced by leveraging the strong θ 13 constraint available from reactor experiments. If there is a strong reason to favor the exploration of a given region of parameter space when DUNE begins to take data, this issue should be re-visited. However, with no strong motivation to focus on a given ordering or region of δ CP parameter space, equal FHC and RHC beam running provides a close to optimal sensitivity across all of the parameter space, so was used for the subsequent detailed sensitivity studies. The increase in sensitivity for unequal beam running is also a feature of low exposure running, and degrades the sensitivity almost uniformly across the parameter space investigated for large exposures, with and without a θ 13 constraint applied.
The studies presented here demonstrate that a full treatment of DUNE's sensitivity at low exposures supports the conclusions made in Refs. [21] and [8] using Asimov studies. In particular, the median CPV sensitivity is ≈3σ for δ CP = ±π/2 after approximately a 100 kt-MW-yr FD exposure. Variations in the expected sensitivity around the median value were also explored. Additionally, it was shown that the CPV sensitivity is not significantly degraded when Feldman-Cousins corrections are included, leading to ≈10% longer exposures to reach a given significance level. Crucially, it was found that after an initial low-exposure rise, the Feldman-Cousins ∆χ 2 c do not change as a function of exposure, unlike the rise with exposure which has been observed by the T2K experiment [13].
It has also been shown that strong statements on the mass ordering can be expected with very short exposures of ≈12 kt-MW-yr, which supports the results shown in Refs. [21] and [8] with a more complete treatment of the systematic uncertainty.
Although the analysis used here makes no assumptions about the FD staging scenario, and results are given as a function of exposure only, the results are dependent on having a performant ND complex from the start of the experiment. In particular, the low-exposures necessary to make world-leading statements about the mass ordering can only be given with confidence with ND samples included in the fit. Additional samples of events from detectors other than ND-LAr in the DUNE ND complex are not explicitly included in this analysis, but there is an assumption that it will be possible to control the uncertainties to the level used in the analysis, and it should be understood that that implicitly relies on having a highly capable ND.

Appendix: Feldman-Cousins throw distributions
The distribution of throws used to calculate the ∆χ 2 c values for Figure 9 for nine different exposures with δ CP = 0 are shown in Figure 19; for Figure 10a for nine different values of δ CP with an exposure of 100 kt-MW-yr in Figure 20; and for Figure 10b for nine different values of δ CP with an exposure of 336 kt-MW-yr in Figure 21. For each distribution shown in Figures 19, 20 and 21, the calculated ∆χ 2 c values corresponding to for 68.27% (1σ), 90%, 95.45% (2σ) and 99.73% (3σ) of the throws are given and indicated with a vertical line. The number of throws used is also given. The ∆χ 2 c values were only calculated up to the 3σ level due to the very large number of throws required for higher confidence levels.