Atmospheric neutrino oscillation analysis with neutron tagging and an expanded fiducial volume in Super-Kamiokande I–V

We present a measurement of neutrino oscillation parameters with the Super-Kamiokande detector using atmospheric neutrinos from the complete pure-water SK I–V (April 1996–July 2020) data set, including events from an expanded fiducial volume. The data set corresponds to 6511 . 3 live days and an exposure of 484 . 2 kiloton-years. Measurements of the neutrino oscillation parameters ∆ m 232 , sin 2 θ 23 , sin 2 θ 13 , δ CP , and the preference for the neutrino mass ordering are presented with atmospheric neutrino data alone, and with constraints on sin 2 θ 13 from reactor neutrino experiments. Our analysis including constraints on sin 2 θ 13 favors the normal mass ordering at the 92 . 3 % level.


I. INTRODUCTION
Neutrino oscillations in the Pontecorvo-Maki-Nakagawa-Sakata (PMNS) paradigm are parameterized by three mixing angles, two squared-mass differences and a CP-violating phase [1,2].Experiments measuring neutrinos of different flavors, energies, and baselines have constrained many of the PMNS parameters with increasing levels of precision.However, the octant of the mixing angle θ 23 , the phase δ CP , and the sign of the larger of the two squared-mass differences, ∆m 2  32 , which determines the neutrino mass ordering, are all presently unknown.To date, the long-baseline accelerator neu-trino experiments T2K [3] and NOvA [4] have made the world's most precise measurements of θ 23 , ∆m 2  32 , and δ CP , but have yet to definitively resolve the remaining questions.
Atmospheric neutrinos are an independent and natural counterpart to accelerator neutrinos for studying neutrino oscillations.Neutrinos created in the earth's atmosphere span a range of energies and baselines that make their oscillations sensitive to the θ 23 mixing angle and the magnitude of the ∆m 2  32 squared-mass difference.Additionally, atmospheric neutrinos which pass near or through the dense core of the earth experience matter effects which alter their oscillation probabilities.An observation of these modified oscillation probabilities in either atmospheric neutrino or anti-neutrino data would provide important information towards resolving the neutrino mass ordering.
In this work, we analyze 6511.3 live days of atmospheric neutrino data from the Super-Kamiokande (SK) detector.This analysis improves upon the previous work [5] in three major ways: We use the number of tagged neutrons to enhance the separation of neutrino events from anti-neutrino events, we enhance the efficiency of classifying multi-ring events using a boosted decision tree (BDT), and we add 48 % exposure by analyzing events from an expanded fiducial volume and from 1186 additional live-days, including data collected after a major detector refurbishment in 2018.In addition to the atmospheric-only analysis, we present an analyses of SK data with an external constraint on the mixing angle θ 13 from the average measurements of the reactor neutrino experiments Daya Bay [6], RENO [7] and Double-Chooz [8].
The paper is organized as follows: Section I presents an overview of neutrino oscillation phenomenology relevant to atmospheric neutrino oscillations.Section II provides a description of the Super-Kamiokande detector and its capabilities for reconstructing neutrino interactions.Section III describes the simulation used to model atmospheric neutrinos interactions at SK. Section IV describes the analysis methodology and presents the results of the analyses without external constraints and with constraints on sin 2 θ 13 .We provide an interpretation and summary of the results in Section V.

A. Neutrino Oscillations
Neutrinos are produced as flavor eigenstates of the weak interaction, which may be treated as superpositions of mass eigenstates via the PMNS matrix: where α is a label for each lepton flavor, one of e, µ, or τ , and U αi is an element of the PMNS matrix.The PMNS matrix is parameterized by three mixing angles and a phase, and factorizes into three sub-matrices which describe rotations by each mixing angle from the neutrino mass basis into the neutrino flavor basis: In Eq. ( 2), sines and cosines of the mixing angles are written as cos θ ij ≡ c ij and sin θ ij ≡ s ij respectively, and the phase δ CP changes sign for the anti-neutrino case.The probability of a neutrino of one flavor |ν α ⟩ oscillating to a different flavor |ν β ⟩ after some time t, or, equivalently, along a baseline L, is found by computing the amplitude |⟨ν β |ν α ⟩| 2 .The probability is nonzero for the case α ̸ = β if the mass states have nonzero mass differences given by the signed quantity ∆m 2 ij = m 2 i − m 2 j .In the simplest case, neutrinos oscillate in a vacuum and oscillation probabilities may be computed by propagating neutrino states according to their vacuum Hamiltonian, written here in the mass basis: This leads to oscillation probabilities of the form: where ∆ ij = 1.27∆m 2 ij L/E.Here, ∆m 2 ij is expressed in units of eV 2 , L is the oscillation baseline in kilometers, and E is in the neutrino energy in GeV.Experiments have measured all mixing angles and squared mass differences to be significantly different from zero, while the value of the phase δ CP is still unknown 1 .In addition, solar neutrino oscillation experiments observe evidence for matter effects in the sun which imply the ∆m 2  21 squared mass difference is positive, establishing an ordering for two of the neutrino masses, m 2 > m 1 [10][11][12][13].However, current experiments are consistent with either the normal ordering, m 3 ≫ m 2 , m 1 , or the inverted ordering, m 2 , m 1 ≫ m 3 .Consequently, the sign of the squared mass difference between m 3 and the next-most-massive neutrino, given by either ∆m 2  32 or ∆m 2  31 , is not known.We use the notation ∆m 2  32,31 or simply ∆m 2 for this squared mass difference where the ordering is not explicitly specified.
Numerically, ∆m 2  32,31 has been measured to be approximately 30 times larger than ∆m 2  21 , such that ∆m 2 32 ≈ ∆m 2  31 .The difference in magnitude between ∆m 2 21 and ∆m 2  32,31 also implies that the terms in Eq. ( 4) containing one or the other squared mass differences dominate for different ranges of L/E.For long-baseline beam and atmospheric neutrinos, where neutrino baselines range from tens of kilometers to several thousand kilometers, and typical neutrino energies range from MeV to several GeV, the ∆m 2  21 terms are sub-dominant, leading to 1 Recent results from T2K favor maximal CP violation, δ CP ≈ −π/2 [9], while recent measurements from NOvA disfavor CPviolating scenarios [4].
approximate flavor oscillation probabilities of the form The approximate oscillation probabilities in Eq. ( 5) are primarily functions of the mixing angles and the absolute value of the squared-mass difference ∆m 2 .The phase δ CP , and the neutrino mass ordering, i.e., the sign of the squared-mass difference, are sub-leading effects which make them challenging experimental signatures.
Neutrino oscillations in matter enhance the dependence of oscillation probabilities on the neutrino mass ordering.In matter, due to an increased forward scattering amplitude, electron-flavor neutrinos experience a larger potential relative to µ and τ flavors which modifies the vacuum Hamiltonian via an additional term, where Here, G F is the Fermi constant, N e is the electron density, and U is the PMNS matrix.The sign of a is positive for neutrinos and negative for anti-neutrinos.Propagating the neutrino states according to the matter Hamiltonian leads to an effective squared-mass difference and mixing angle: where Γ ≡ 2aE/∆m 2 .Equation (7) shows that the effective quantities depend on the sign of ∆m 2 .In particular, for neutrinos in the normal ordering, Γ ≈ cos 2θ 13 maximizes the effective mixing angle sin 2 θ 13,M .A maximum also occurs for anti-neutrinos in the inverted ordering.This maximum effective mixing angle predicts a resonant enhancement of muon-to-electron flavor conversions for either neutrinos or anti-neutrinos according to the neutrino mass ordering.

B. Atmospheric Neutrinos
Atmospheric neutrinos are produced when cosmic rays interact with nuclei in the earth's atmosphere.These interactions result in hadronic showers of primarily pions and kaons which decay into neutrinos.The atmospheric neutrino energy spectrum extends from a few MeV to several TeV and has an approximate flavor ratio in the few GeV range of (ν µ + νµ )/(ν e + νe ) ≈ 2 : 1.While present, tau neutrinos intrinsic to the atmospheric neutrino flux are suppressed by many orders of magnitude relative to electron-and muon-flavor neutrinos due to kinematic restrictions on their production.The zenith angle θ z describes atmospheric neutrino baselines.Neutrinos produced directly above a detector are downward going, θ z = 0, and are produced at an average distance of 15 km above earth's surface.Neutrinos produced on the other side of the earth from a detector are upward-going, θ z = π, and travel an approximate distance of 13 000 km through the earth.Oscillation signatures are most evident in upward-going atmospheric neutrinos due to the longer baselines.
A general atmospheric neutrino baseline begins at a production point in the atmosphere and passes through the earth before ending at a detector near the surface.We model the matter effects induced by passage through the earth assuming a simplified version of the preliminary reference Earth model (PREM) [14], where the earth is treated as a sphere with radius R Earth = 6371 km and contains concentric spherical shells of decreasing densities.Table I lists the earth layers and corresponding densities assumed in this work.
To compute neutrino oscillation amplitudes through layers of different matter densities, amplitudes along steps through matter of fixed densities are multiplied together [15].The general matrix form of the propagated mass eigenvectors X for neutrinos passing through a fixed matter density is where M 2 i /2E are the eigenvalues of H Matter .This definition allows the neutrino probability along a baseline of changing matter density to be written as where L i and ρ i are the baseline and density of the i th step respectively.In the case of a spherically symmetric earth, as is assumed in this work, the neutrino oscillation baseline L only depends on the zenith angle and production height and does not depend on the azimuth.
Figure 1 shows the calculated oscillation probabilities for atmospheric neutrinos as a function of the cosine of the zenith angle and the neutrino energy E ν .The two sets of figures show the cases for muon-to electronflavor neutrino and anti-neutrino oscillation probabilities in each mass ordering scenario, and assuming the current global average neutrino oscillation parameters [16].The resonance in ν µ → ν e or νµ → νe oscillations due to matter effects is exclusively visible in the normal and inverted scenarios respectively, and occurs at baselines of several thousand kilometers and neutrino energies around a few GeV.This resonance in atmospheric neutrinos is the experimental signature of the unknown neutrino mass ordering.
In addition to mass ordering sensitivity via earth matter effects, atmospheric neutrino oscillations also provide sensitivity to other oscillation parameters.Muon-to-tau flavor conversions provide sensitivity to |∆m 2  32,31 | and sin 2 θ 23 .While the tau neutrinos are often too low energy to produce CC interactions, the P (ν µ → ν µ ) survival probability manifests as a disappearance of upwardgoing muon neutrinos.Atmospheric neutrinos also provide modest sensitivity to the combined effects of δ CP and sin 2 θ 13 through electron neutrino or anti-neutrino appearance for neutrinos of all energies.

II. THE SUPER-KAMIOKANDE DETECTOR
Super-Kamiokande (SK) is a 50 kiloton cylindrical water-Cherenkov detector located within the Kamioka mine in Gifu, Japan [17,18].The detector consists of two optically separated regions: an inner detector (ID) which contains 32 kilotons of water and is viewed by over 11 000 inward-facing 20-inch photomultiplier tubes (PMTs), and a 2 m thick outer detector (OD) with over 1800 outward-facing 8-inch PMTs for vetoing cosmic backgrounds.To increase the light collected in the OD, the OD walls are covered with reflective Tyvek, and wavelength-shifting plates are mounted to the OD PMTs.
SK has been operational since its construction in 1996, and, until July 2020, has operated with pure water.During the this period, there were five distinct data-taking phases, SK I-V, which had similar operating conditions with a few notable exceptions: At the end of the SK I phase (1996)(1997)(1998)(1999)(2000)(2001), an accident2 resulted in the loss of roughly half of the experiment's PMTs.During the SK II phase (2002)(2003)(2004)(2005), the remaining PMTs were rearranged to provide uniform but reduced (19 %) photocoverage.New PMTs were installed to restore photocoverage to original conditions starting with the SK III phase (2006)(2007)(2008).In 2008, the experiment's electronics were upgraded [19], marking the start of the SK IV phase (2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018).The SK IV electronics upgrade extended the window of hit times recorded following neutrino-like events which improved the efficiency for detecting decay electrons and enabled the detection of the 2.2 MeV gamma emission following neutron captures on hydrogen [20].
During 2018, the SK detector was drained to conduct work in preparation of loading gadolinium sulfate, a compound with a high neutron capture cross section, into the detector's water.The work comprised of installing a new water circulation system capable of continuously purifying gadolinium sulfate, sealing the welding joints of the tank walls to repair and prevent leaks, and replacing several hundred PMTs that had failed during the SK III and SK IV phases.The subsequent SK V (2019-2020) phase resumed data taking with pure water and reaffirmed the detector's stability and performance after the refurbishment work.In July 2020, gadolinium was dissolved into the detector's water for the first time [21], marking the start of the SK Gd phase.This work includes data from the pure water, SK I-V phases only.Future analyses using data from the SK Gd phase will feature enhanced neutron tagging efficiency due to the high neutron capture cross section of the gadolinium and its subsequent 8 MeV γ cascade.The operating conditions of the SK phases are summarized in Tab.II.
The SK detector observes Cherenkov light from charged particles with sufficient momentum produced following neutrino interactions.The light projected onto the PMT-lined walls of the detector forms ring patterns of hit PMTs.The ring patterns are reconstructed using the APFit [22] algorithm: The timing information of hits establishes an event vertex, and a fit considering the spatial distribution and observed charges of hit PMTs determines the particle's direction, momentum, and par- Electrons and photons tend to scatter and produce electromagnetic showers, resulting in many overlapping rings which appear as a single ring with blurred edges.Heavier particles such as muons and charged pions do not create showers and therefore have sharp ring edges.Highermomentum particles produce more light, so the charge contained within the ring provides an estimate of the particle's momentum.Figure 2 shows an example of the Cherenkov light patterns observed in SK following a neu-trino candidate interaction, and their fitted properties.
The event contains multiple ring patterns, each corresponding to a different particle.
In addition to Cherenkov rings, SK identifies electrons from muon decays.Decay electrons are found by scanning for time-clustered hits above background following a primary neutrino interaction trigger.A hit-time-based fitter estimates the decay electron vertex for each candidate hit cluster, and candidates are accepted if there are 50 or more hits within a 50 ns time window.The overall decay electron tagging efficiency is estimated to be 96 % for µ + and 80 % for µ − in the SK IV and SK V periods.The reduced efficiency for µ − is due to µ − capture in the water, in which no decay electron is produced.
Neutrons in the SK detector are captured on hydrogen, producing deuterium in an excited state.The decay of the excited deuterium produces a 2.2 MeV γ which results in a few time-coincident and spatially clustered PMT hits.These γ emissions from neutron captures are identified using a two-step process [23]: In the first step, a sliding 10 ns hit-time window finds candidate neutron captures from clusters of 7-50 hits with fewer than 200 hits in a surrounding 200 ns window.In the second step, hits within each candidate window are used to form variables which quantify the isotropy, likelihood of single-vertex origin, and the time spread of the hits.A neural network classifies candidates as either signal or background based on these variables.The average neutron tagging efficiency for neutron capture on hydrogen is 26 % in the SK IV and SK V periods.
In the SK detector, the charge of particles, and therefore neutrino and anti-neutrino interactions, cannot be differentiated on an event-by-event basis.However, statistical separation is possible.For example, in the process νµ + p → p + µ + + π − , in which an anti-neutrino interacts with a proton p, the outgoing negatively-charged pion is more likely to be captured by an 16 O nucleus before decaying than is a positively-charged pion produced in the equivalent ν µ interaction.Captured pions do not produce decay electrons, so requiring one or more decay electrons preferentially selects more neutrino than anti-neutrino events for this process.The statistical separation can be further improved by also considering the number of neutrons, which will be described in Sec.II B 2.

A. Calibration
Calibration ensures an accurate and consistent response of the detector to particle interactions.Calibration studies based on the detector geometry, PMT responses, and properties of the SK water are documented in Ref. [18].
We assess APFit's energy determination using calibration sources at multiple energies which span as much of the atmospheric neutrino energy spectrum as possible.Cosmic ray muons which stop within the detector and produce a decay electron provide a way to measure photoproduction as a function of the muon's track length.The track length of these muons is determined by their ID entrance point and decay electron vertex.The expected momentum of each muon assuming minimum-ionization along the track length may be compared to the fitted Cherenkov ring momentum.The energy spectrum of decay electrons from these muons also provides a lowenergy calibration source.We use the momenta of the two e-like rings from neutral pion decay, π 0 → γγ, which add to form an invariant mass distribution of neutral pion events, as a third calibration source.Figure 3 shows the difference between data and Monte Carlo (MC) for these calibration sources during each SK phase.In this analysis, the decay electron and π 0 mass measurements were also performed separately using events with vertices greater than 200 cm (conventional fiducial volume) and between 100 cm-200 cm (additional fiducial volume, see Sec.II B 4) from the detector walls.

B. Neutrino Sample Selection
Neutrino events at SK are broadly categorized as fullycontained (FC), partially-contained (PC) or upwardgoing muons (Up-µ).FC and PC events have a reconstructed event vertex within the ID and are differentiated based on the number of hits detected in the OD: FC events have minimal OD activity, while PC events have OD activity following the primary event trigger.Up-µ events are neutrino interactions within the rock below the SK tank or in the OD water which produce muons travelling upward.Figure 4 shows the average number of FC, PC, and Up-µ events observed per day during the SK I-V phases.
We separate neutrino candidate interactions from each category into analysis samples to enhance the different oscillation signals present in the atmospheric neutrino data.The data taken during the SK IV and SK V periods, 3705 days, or 57 % of the total SK I-V exposure, uses the observed number of neutron captures on hydrogen as an additional classification handle to enhance the purity of neutrino and anti-neutrino samples.

Analysis Samples
Fully contained events span the energy range of 100 MeV to 100 GeV and are a mixture of chargedcurrent (CC) and neutral current (NC) neutrino interactions of all flavors.Because the normal and inverted neutrino mass ordering scenarios predict an enhancement to either the number of ν µ → ν e or νµ → νe events, and SK cannot distinguish the sign of neutrino interactions on an event-by-event basis, the FC sample definitions are designed to increase the statistical purity of ν e and νe events.
Fully contained events with a single Cherenkov ring are first separated by the ring's particle identification (PID) score, either e-like or µ-like.Events are further divided based on the visible energy E vis.into sub-GeV, E vis.< 1330 MeV, and multi-GeV, E vis.> 1330 MeV.Next, events are separated by the number of decay elec-trons.For sub-GeV events, we use the number of decay electrons to separate events by likely interaction processes.Samples enhanced with quasi-elastic interactions are formed by requiring no decay electrons for e-like events and either zero or one decay electron for µ-like events.Sub-GeV e-like and µ-like events with one or more, or two or more decay electrons, respectively, are separated into additional samples enhanced in interaction processes which produce a pion below Cherenkov threshold.For multi-GeV events, the number of decay electrons, either zero, or one or more, is used to separate e-like events into anti-neutrino-and neutrino-enhanced samples.Fully contained single ring events in the SK IV and SK V phases have been revised in this analysis to additionally incorporate the number of tagged neutrons, and are discussed in Sec.II B 2.
Multi-ring events can contain mixtures of e-like and µlike rings, making the neutrino flavor ambiguous.However, multi-ring events also provide extra information which is useful for separating ν e events from νe events.For multi-GeV multi-ring events, we use a boosted decision tree (BDT) to classify these events as ν e -like, νe -like, µ-like, or "other."The "other" sample primarily selects NC events.More details on the BDT are presented in Sec.II B 3.
Sub-GeV multi-ring events are not included in the Zenith angle or momentum distributions for the 19 analysis samples without neutron tagging.The first column shows the momentum bins for samples which are not also binned by zenith angle, while the second through fifth columns show the 1D zenith angle distributions for samples which use 2D momentum-and-zenith-angle binning.The MC distributions are shown at the best-fit point in the normal ordering, cf.Tab.IV, with best-fit systematic pulls applied.Different colors in the MC histograms correspond to the true neutrino flavors present in each sample.Panels beneath each distribution show the data-MC ratio, and all error bars are statistical.FC (non-PC and non-Up-µ) sub-GeV and multi-GeV single-ring samples, marked with an asterisk (*), contain events from only SK I-III, while the corresponding events from SK IV-V are separated into samples using tagged neutron information, shown in Fig. 6.
analysis due to their poor direction resolution and consequently minimal sensitivity to oscillation effects, with two exceptions: First, multi-ring events with E vis.> 600 MeV where the most energetic ring is µ-like and has a reconstructed momentum of at least 600 MeV/c are included in the multi-ring µ-like sample.Second, FC NC interactions which produce a π 0 are an oscillationinsensitive background to the other samples, but are included in the analysis to constrain NC interactions.These neutral current π 0 events are identified from sub-GeV events using a dedicated fitter which assumes there are two rings present, regardless of the number of reconstructed rings.Events are classified as π 0 -like based on the likelihood that the two fitted rings originate from a π 0 decay.Events which are classified as π 0 -like are separated into two samples based on the number of reconstructed rings without the two-ring assumption, either one or two.
Partially contained events have typical energies between 1 GeV to 1 TeV, and are nearly all ν µ CC interactions, as the muon produced in the interaction often exits the ID.The muon momentum in PC events can only be estimated using the portion of the track within the detector.Partially contained events which exit the ID and stop within the OD, determined by comparing the amount of light in the OD to simulated PC events, are classified as stopping, while PC events that completely exit the detector are classified as through-going.
Upward-going muon events are the highest-energy events, up to ∼ 10 TeV, observed at SK and are classified into three samples: "stopping" if the muon stops within the ID, otherwise "showering" or "non-showering," depending on whether or not the exiting muon's charge deposition is consistent with radiative losses.The earth shields the Up-µ sample from cosmic ray backgrounds from below the horizon.However, upward-scattered muons from downward-going cosmic rays are an irreducible background near the horizon.We estimate the background rate in each Up-µ sample at the horizon by comparing the cosmic muon rate at the azimuthal directions with the smallest and largest mountain overburdens.The estimated number of background events is subtracted from the number of Up-µ data events with reconstructed directions near the horizon.The estimated Up-µ background rate is approximately 2 − 3 %.
The event classification outlined in this section corresponds to 19 distinct analysis samples.There are eight FC CC-enhanced single ring samples which are used for data collected during the SK I-III phases, while the remaining 11 FC multi-ring, NC π 0 , PC, and Up-µ samples are used for data taken during all phases.The data and MC counts for the 19 standard analysis samples are presented in Fig. 5 as a function of reconstructed lepton momentum or zenith angle.

SK IV-V Neutron-Tagged Samples
This analysis modifies the FC single-ring event selection during the SK IV and SK V phases based on the number of observed neutron captures on hydrogen.The modification is motivated by the greater average neutron production in anti-neutrino interactions relative to neutrino interactions: Additional neutrons are expected in anti-neutrino events from proton-to-neutron (p → n) conversions in CC processes, νl + p → l + + n + X, where l denotes lepton flavor, and also in both CC and NC deep inelastic scattering (DIS) processes due to the larger fraction of energy transferred to the recoiling hadronic system.We incorporated neutron information into the analysis sample definitions to create 10 additional samples, five for sub-GeV events and five for multi-GeV events, as follows: • FC single-ring e-like: Both sub-GeV and multi-GeV events are divided into three samples: Events with one or more decay electron and any number of neutrons are classified as ν e -like.Events with no decay electrons are considered νe -like, and are further separated into two samples based on whether or not there is at least one tagged neutron.
• FC single-ring µ-like: Both sub-GeV and multi-GeV events are divided into two sub-samples: Events with exactly one decay electron and one or more tagged neutrons are considered νµ -like, otherwise they are considered ν µ -like.
More details about the neutron tagging algorithm and the event selection modifications may be found in Refs.[23,24].Figure 6 shows the data and MC comparisons for the SK IV-V neutron tagged samples.Compared to the equivalent anti-neutrino samples shown in Fig. 5, the SK IV-V νe samples requiring at least one tagged neutron have a higher purity of true νe MC events, summarized in Tab.III.

Multi-Ring Boosted Decision Tree
A previous version of this analysis [25] classified multiring events into e-like or µ-like using a likelihood function based on seven input variables: the number of rings, the number of decay electrons, a continuous PID variable of the most energetic ring present in the event, the fraction of energy carried by the most energetic ring, the transverse momentum of the most energetic ring, the maximum distance of any decay electrons from the event vertex, and the visible energy.The selection was performed in two steps, first to separate e-like events from µ-like events, then to further separate e-like events into ν e -like and νe -like events.An additional "other" category was also present to separate NC events from CC events.This analysis replaces the likelihood-based selection with a boosted decision tree (BDT) selection for multiring events, utilizing the same set of variables as in the likelihood-based classification [26].Labeled true multiring MC events are used to train the BDT classifier.Additionally, we optimize the BDT for sensitivity to the neutrino mass ordering by weighting these training events: Adjusting the weights of CC ν e and νe training events relative to CC ν µ and "other" events changes the expected signal purities obtained from the BDT selection, which in turn affects the expected mass ordering sensitivity.We find that the optimized BDT improves the efficiency for selecting CC events while maintaining an equivalent wrong-flavor contamination rate as the likelihood selec- is the maximum distance of any decay electron from the event's primary vertex.pT and pMax.are the transverse momentum and the momentum of the most energetic ring, respectively, and pTot. is the sum of the momenta from all reconstructed rings.
tion.The increased efficiency results in a higher proportion of correctly-classified CC ν e -like, νe -like, and ν µ -like multi-ring events and a decreased proportion of "other" events relative to the previous analysis.
Figure 7 shows the distributions of the BDT input variables from MC events.The largest differences between electron-and muon-flavor neutrino interactions are in the PID of the most energetic ring, the number of decay electrons, and the maximum distance travelled by decay electrons, L Decaye .The difference in L Decaye between µlike and e-like events results from muons produced in ν µ CC interactions having higher momenta and traveling further on average than charged pions which are potentially produced in multi-ring interactions of any flavor.Electron neutrino and anti-neutrino interactions differ in the number of rings, and in the fractional and transverse momentum of the most energetic ring.These differences are a consequence of the different angular and momentum dependencies between neutrino and anti-neutrino cross sections.Figure 8 demonstrates the ν e and νe separation performance of the BDT in the SK IV-V phases.Neutron information from the SK IV-V phases was also considered as an input variable to the BDT, but was found to only result in a marginal improvement to ν e -ν e separation while introducing a large amount of systematic uncertainty.Negative values indicate the event has a single-ring, while positive values indicate the event has multiple rings.In these panels, the CC QE histograms include both 1p1h and 2p2h processes, and CC Other refers to single, non-pion hadron production.All MC distributions are calculated assuming neutrino oscillations and without the effect of any systematic uncertainties.

Expanded Fiducial Volume
Previous SK analyses used a 22.5 kiloton fiducial volume for FC events defined by requiring the event vertex be at least 200 cm away from the ID walls (conventional fiducial volume).This analysis includes events in the 4.7 kiloton region 100-200 cm from the ID walls (additional fiducial volume) for all SK phases, representing a 20 % increase in exposure.To include events in the additional fiducial volume region, both the reconstruction algorithms and background estimations required re-evaluation [27].In 2018, the pre-tabulated expected charge distributions, which are the basis of the ring counting and PID likelihoods used in APFit, were re-computed as a function of a particle's distance to the nearest detector wall.These new charge tables encode the reduced number of hits and increased angular dependence expected from events near the detector walls.Adopting the updated charge tables reduced the e-µ misidentification of single ring events in the additional region by up to 35 %.
Cosmic muons with vertices mis-reconstructed in the fiducial volume also become more prevalent closer to the detector walls.To estimate the size of these backgrounds, events in the additional fiducial volume region were eyescanned by experts.A slightly higher background rate of 0.5 %, compared to 0.1 % in the conventional fiducial volume, was found.The increased background rate in the additional fiducial volume is acceptable, although the background rate rises quickly for events with vertices < 100 cm, prohibiting further expansion.
We note that Ref. [27] included events from the additional fiducial volume region in the context of a proton decay search which was primarily concerned with sub-GeV atmospheric neutrino backgrounds.In contrast, this oscillation analysis includes both sub-GeV and multi-GeV atmospheric neutrino events.Additional studies to those presented in Ref. [27] were needed to confirm the performance of the reconstruction algorithms for multi-GeV events in the additional fiducial volume region.We studied the reconstructed momentum and direction biases and resolutions using MC events, and confirmed that these quantities were similar between events in the conentional and additional fiducial volume regions.We also studied the data versus MC agreement between the ring PID and ring counting likelihoods for events in the additional fiducial volume region, as shown in Fig. 9.The figure compares the data and MC distributions for sub-GeV and multi-GeV events in both fiducial volume regions.The level of agreement in the likelihood distributions using the updated reconstruction algorithms is found to be equivalent between events in the conventional and expanded fiducial volume regions at both sub-GeV and multi-GeV energies.This analysis is the first SK atmospheric neutrino oscillation analysis to include FC events from the additional fiducial volume region, resulting in a total fiducial volume of 27.2 kiloton for all SK phases.The total exposure for FC events in this analysis is 484.2 kiloton-years, a 48 % increase over the previous published analysis.

III. SIMULATION
We produce simulated Monte Carlo (MC) events to provide a prediction of atmospheric neutrino data.The SK MC consists of simulated neutrino interactions generated according to the flux model of Honda et al. [28] and the cross section models of the neut simulation software [29,30].neut also propagates intermediate particles within the nuclear medium, and produces the finalstate particles which exit the nucleus.These final-state particles are stepped through a geant3-based [31] simulation of the SK detector which models particle propagation in the detector water, Cherenkov radiation emission, and the detection of Cherenkov radiation by PMTs [18].The SK detector simulation also implements a datadriven model of PMT electronics, dark noise, and disabled PMTs.

A. Neutrino Interaction Model
Because atmospheric neutrinos span several orders of magnitude in energy, multiple interaction processes are relevant across the different SK neutrino samples.Fully contained events, which are the lowest-energy atmospheric neutrinos in this analysis, have leading contributions from charged-current quasi-elastic (CCQE) and single-pion production processes.Partially contained and Up-µ events consist of higher-energy neutrinos where deep inelastic scattering (DIS) becomes dominant.The contribution to the event rate from each cross section process is shown in Fig. 10.
This analysis uses neut version 5.4.0 [30] for the nominal neutrino interaction models, which is an update from version 5.3.6 used in the previous analysis.Notable changes in this version include the replacement of the nominal one-particle-one-hole (1p1h) model, responsible for the CCQE and NCQE processes, from the Smith and Moniz relativistic Fermi gas (RFG) model [32] to the local Fermi gas (LFG) model by Nieves et al. [33].The nominal value of the axial mass parameter M QE A was decreased from 1.21 GeV/c 2 to 1.05 GeV/c 2 .This change results in an overall decrease of CCQE events in the nominal MC by ∼ 20 %, although the previous M QE A value is still within the 1σ uncertainty, see Sec.IV B. The LFG model also includes a correction to the cross section at low four-momentum (q 2 ) transfers due to weak charge screening calculated via the random phase approximation (RPA) technique [34].The BBBA05 [35] vector form factors are used in the nominal 1p1h model, which is unchanged from the previous analysis.
neut implements the Rein-Sehgal resonant single-pion production model [36], which is unchanged from the previous analysis.The coherent pion production cross section calculation, which was previously based on the Rein-Sehgal model, has been updated to use the Berger-Sehgal model [37] for neutrino events with E ν < 10 GeV.This was done to improve the agreement in total cross section .Simulated neutrino energy spectra by interaction modes for the different SK event categories: fully contained (FC), partially contained (PC) and upward-going muon (Upµ).The simulated event rates are for an oscillated MC, with oscillation parameters set to the Particle Data Group best-fit values [16], except for δCP which is set to 0. "CC Other" modes refer to the production of single, non-pion hadrons.
and angular distribution of the outgoing pion with recent scattering experiments [38].Deep inelastic scattering cross sections in neut are calculated using the GRV98 parton distribution functions (PDFs) [39], with corrections to the low-q 2 regime from Bodek & Yang [40].neut simulates the production of multiple hadrons using two separate models, selected based on the invariant mass of the hadronic system, W .For W < 2 GeV, a custom multi-pion generator is used, while for W > 2 GeV, neut uses pythia v5.72 [41].The DIS hadron production models are only used to produce final states with multiple hadrons to avoid overlap with the single-hadron models.
Final state interactions (FSIs) are processes which modify particles exiting the nucleus due to intra-nuclear effects.Secondary interactions (SIs) are equivalent processes which occur in the detector medium instead of within the nucleus.neut implements four FSI+SI processes: quasi-elastic scattering, charge exchange, pion absorption, and hadron production.Six parameters adjust the probabilities of these processes.neut 5.4.0 updated the default values of these parameters [42], which increased the pion absorption probability by 27 %, and decreased the probability of charge exchange for pions with momentum < 400 MeV/c by 30 % compared to version 5.3.6.
The cross section for the two-particle-two-hole (2p2h) process, in which no pions are produced but a pair of nucleons is ejected from the nucleus, is calculated using the Valencia model of Nieves et al. [43].This is unchanged from the previous version of neut.

IV. ATMOSPHERIC NEUTRINO OSCILLATION ANALYSIS
Atmospheric neutrino data are analyzed via fit of binned MC counts to data.We bin atmospheric neutrino MC and data events using two-dimensional bins of reconstructed cosine zenith angle and momentum.The bin definitions for the 29 samples used in this analysis are listed in Tab.III.The zenith angle bin definitions were updated for this analysis, and are discussed in Sec.IV A.
Systematic uncertainties are incorporated in the analysis as additional nuisance parameters which modify the nominal MC prediction in the fit.We estimate the effect of each systematic uncertainty by quantifying the change induced in each analysis bin due to +1σ and −1σ deviations from its nominal value.The systematic uncertainties considered in this analysis are described in Sec.IV B.

A. Zenith Angle Binning
In the previous analysis, FC and PC events were binned into 10 evenly spaced cosine zenith angle bins FIG.11.Updated zenith angle bin edges for this analysis (black, solid lines) compared to equally-spaced bins (red, dashed lines).The updated bin edges are symmetric about zero and align with the matter-enhanced νµ → νe resonance regions present assuming the PREM.The oscillation probabilities and corresponding color scale are the same as in Fig. 1, for neutrinos in the normal ordering.
on the interval [-1,1].This separated the expected oscillation resonance region across two bins, reducing the signal-to-background ratio in samples sensitive to the neutrino mass ordering.In this analysis, the zenith angle bins for FC and PC events have been updated to more precisely cover the resonance region, demonstrated in Fig. 11.The updated bins are defined by the edges −1, −0.839, −0.644, −0.448, −0.224, 0, 0.224, 0.448, 0.644, 0.839, and 1.With the present statistics, adopting the updated bins results in a negligible effect on the expected sensitivity of this analysis to the mass ordering.However, the updated bins reduce the ambiguity as to whether or not events in the signal region occur in the expected zenith angle range or slightly outside of it.

B. Systematic Uncertainties
The analysis includes 193 independent systematic uncertainty sources: 48 describe atmospheric neutrino flux and cross section effects common to all phases, and the remaining 29×5 describe reconstruction efficiencies which are separately estimated for each data taking phase.Additional details on the formulation of the systematic uncertainties may be found in Ref. [44].

Flux and Cross Section Uncertainties
The atmospheric neutrino flux uncertainties in this analysis are unchanged from the previous analysis [5].Two energy-dependent uncertainties scale the normalization of the flux above and below 1 GeV.Additionally, there are three energy-dependent uncertainties which modify the ratios of muon-to-electron, electron-to-antielectron, and muon-to-anti-muon flavor neutrinos present in the atmospheric neutrino flux.An uncertainty based on the ratio of the Honda flux calculation with a modified kaon-to-pion ratio over the nominal model is also included [45].
The effects of uncertainties on parameters in the CCQE and single-pion cross section models are computed by re-weighting MC events by the ratio of the double-differential cross section after +1σ and −1σ changes in each parameter.For CCQE events, M QE A is taken to be (1.05 ± 0.16) GeV/c 2 .There are three parameters which control the single pion production cross section in the Rein-Sehgal model: the axial mass, M Res A = (0.95 ± 0.15) GeV/c 2 , the axial form factor coefficient, C 5 A = 1.01 ± 0.12, and the isospin-1 2 background contribution scaling parameter, I 1 2 = 1.30 ± 0.20.In addition to the M QE A uncertainty, we place additional uncertainty on CCQE events due to differences between the LFG and RFG models.There are five uncertainties where the 1σ effect is computed by forming ratios between the predictions of the two models: the absolute event rate (both sub-GeV and multi-GeV), the shape of the E ν dependence, and in the ratios of ν µ /ν e and νµ /ν e .These uncertainties are computed as a function of neutrino energy.The uncertainty on the 2p2h contribution to CCQE interactions is set to 100 %, due to a lack of direct measurements, and is unchanged from the previous analysis.
An uncertainty on FSIs and SIs is implemented by recomputing the final states for MC events using multiple sets of the six neut FSI parameters.Each set consists of variations of one or more FSI parameters by their ±1σ uncertainties as determined by a fit to external pion scattering from Ref. [46].The set which produces the largest change in classification outcome is taken as a conservative estimate of the FSI+SI uncertainty.
Two neutron-related systematic uncertainties are included in this analysis to account for the dependence on neutron production models in the neutron-based event selection used for SK IV and SK V data.Variations in these systematic uncertainties move events between samples with no tagged neutrons and samples with one or more tagged neutrons.The largest of the two uncertainties originates from measurements of neutron production as a function of transverse momentum by T2K [47].An additional neutron-related systematic uncertainty accounts for differences in neutron multiplicity predicted by the different neutrino interaction generators neut and genie [48].
In addition to model parameter uncertainties, we assign a 20 % uncertainty on the ratio of NC to CC events and a 25 % uncertainty on the tau neutrino CC cross section [49].The tau neutrino cross section uncertainty was estimated from a comparison between neut and the cross section model of Hagiwara et al. [50].

Reconstruction Uncertainties
Ring reconstruction systematic uncertainties include efficiencies of ring-based particle identification (PID), ring counting, and identification of π 0 decays.The estimation of these systematic uncertainties is unchanged from the previous analysis, and is performed using a "scale-and-shift" procedure: First, MC events are labeled as either signal or background, where the particular la-bel differs for each systematic uncertainty source.For example, the ring counting systematic uncertainty estimation considers events with a single charged particle with true momentum sufficient to produce a Cherenkov ring as signal, and events with multiple particles capable of producing rings as background.Similarly, the PID uncertainty estimation considers true e-like events as signal and true µ-like events as background.Once labeled, the signal and background distributions of each reconstruction quantity, x, e.g., the ring-counting or PID likelihoods, are scaled and shifted by a linear function, x ′ = β 0 +β 1 x, and fit to a subset of atmospheric neutrino data.The signal and background distributions have independent β parameters.The fitted β parameters are then fluctuated according to their fitted uncertainties to generate toy data sets with random amounts of signal and background events.The maximum fractional change in signal purity observed in the toy data sets is taken as a conservative estimate of the uncertainty.
The scale-and-shift procedure is used to separately evaluate ring-related systematic uncertainties for events in the conventional and additional fiducial volume regions during each data-taking phase.The ring counting and PID systematic uncertainties are separately computed for each combination of e-like and µ-like, and sub-GeV and multi-GeV events.As in the previous analysis, the scaleand-shift procedure is used to place an additional uncertainty on the normalization of ν µ contamination present in the e-like samples, estimated from the fitted fraction of ν µ events classified as e-like.We also use the scaleand-shift procedure to estimate the uncertainty associated with the multi-ring BDT: Two sources of systematic uncertainty are considered for multi-ring events based on the data-MC agreement in the input distributions used to train the BDT, and in the output BDT scores themselves.The changes in the event rate in each multi-ring sample obtained through fluctuating the input and output distributions according to their fitted uncertainty are taken as the 1σ effects.
As in the previous analysis, this analysis takes the maximum data-MC disagreement between the decay electron, π 0 mass, and stopping muon calibration sources during each SK phase, shown in Fig. 3, as an estimate of the overall energy scale uncertainty.Variations in this uncertainty increase or decrease the reconstructed momenta of all MC events, causing these events to move between momentum bins.The energy scale uncertainty was newly estimated for the SK V phase and is 1.8 % in the conventional fiducial volume and 2.0 % in the additional fiducial volume.
This analysis introduces a systematic uncertainty for the neutron tagging efficiency and false detection rate during the SK IV and SK V phases, which is factored into two components: detector conditions, and neutron travel distance.The contribution to the uncertainty due to detector conditions was estimated from comparisons between data from an AmBe source [51] and MC events generated with variations in PMT gain, PMT dark rate, 12.Estimated fractional uncertainty on the ratio of upward-going, cos θz < 0, versus downward-going, cos θz > 0, events in the multi-GeV e-like samples (both single-ring and multi-ring) from statistical and systematic uncertainties.The fractional uncertainties are estimated by randomly fluctuating the nominal MC counts either by Poisson statistics for the "Statistics" entry, or by Gaussian fluctuations of one or more systematic uncertainties.Uncertainties on the up/down ratio in the multi-GeV e-like samples have the largest impact on the sensitivity to the neutrino mass ordering.and water transparency within the ranges observed during the SK IV-V phases.The neutron travel distance contribution to the uncertainty was estimated from an atmospheric neutrino data-MC comparison of the distance between the primary neutrino event vertex and reconstructed neutron capture vertex, convolved with the estimated capture efficiency as a function of the distance.The quadrature sum of these effects sets the overall neutron tagging uncertainty.Similarly to the neutron production uncertainties, variations in the tagging efficiency uncertainty move events between samples requiring no tagged neutrons and samples requiring one or more tagged neutron.
The systematic uncertainties with the largest effect on the analysis' sensitivity to the neutrino mass ordering are those which can change the ratio of upwardgoing, cos θ z < 0, versus downward-going, cos θ z > 0, events in the multi-GeV e-like samples.This is because the mass ordering signal region occurs in the upwardgoing bins of these samples, and downward-going neutrino events act as an un-oscillated data set which constrains uncertainties without zenith angle dependence independent of oscillation effects.Figure 12 shows the estimated largest sources of uncertainty which affect the ratio of upward-versus-downward going events in the multi-GeV e-like samples.The largest source of uncertainty is statistics, indicating the mass ordering sensitivity is statistics-limited.The ν τ cross section uncertainty is the next-largest: Tau neutrino interactions at SK are almost exclusively upward-going since they are the result of ν µ → ν τ oscillations, so there are no downward-going tau neutrino events to constrain the ν τ cross section uncertainty.Additionally, tau neutrino interactions tend to be reconstructed as multi-GeV, multi-ring e-like events, which places them in the mass ordering signal region [52].
In contrast, uncertainties on the overall flux and cross section normalizations are less important for the mass ordering analysis, since these modify the predicted number of events independent of zenith angle.

C. Fitting Method
This work presents two fits of atmospheric neutrino data: First, we perform an "atmospheric-only" fit which measures ∆m 2  32,31 , sin 2 θ 23 , δ CP , sin 2 θ 13 , and the neutrino mass ordering using the SK atmospheric neutrino data with no external constraints.Next, we analyze the same data including a constraint on sin 2 θ 13 from reactor neutrino experiments.
In both fits, we evaluate data-MC agreement by computing a χ 2 statistic at each point on a fixed grid of neutrino oscillation parameters, defined for each fit in Tab.IV.Oscillation probabilities are applied to the MC events using the oscillation parameters at each point in the grid, and systematic uncertainty nuisance parameters are fit to determine the lowest χ 2 value.The best-fit neutrino oscillation parameters are taken to be the grid point with the lowest χ 2 value.
The χ 2 statistic is computed via a summation over n bins assuming Poisson statistics, and with systematic uncertainty pull terms, ϵ i , which have units of σ.Each ϵ i scales the effect of the i th systematic uncertainty, and is included in the χ 2 calculation as an additional penalty term: In Eq. (10) effect of systematic uncertainties.The effect of the i th systematic uncertainty is calculated as ϵ i times the fractional change induced in the n th bin by a 1σ change, f i,n : The constraint ∂χ 2 /∂ϵ i = 0 at the minimum χ 2 value yields a system of equations which may be solved to find the configuration of ϵ i s that produce the smallest χ 2 for a given MC prediction.

D. Mass Ordering Sensitivity with Neutron Tagging
We studied the impact of using neutron captures on hydrogen with a tagging efficiency of 26 % on the sensitivity to the mass ordering.As discussed in Sec.II B 2, tagged neutrons provide an additional handle during event reconstruction which improves the statistical separation of neutrinos and anti-neutrinos in SK versus only considering the number of decay electrons.This corresponds to an increased purity of correct-sign ν e and νe events in samples where sensitivity to the the mass ordering is expected.The increased purity can be seen by comparing the FC multi-GeV single ring e-like samples listed in Tab.III: The SK I-III multi-GeV νe -like sample requiring no decay electron has a MC purity of 34 % true νe CC interactions, while the SK IV-V multi-GeV νe -like sam-ple requiring no decay electrons and one tagged neutron has a MC purity of 46 % true νe CC interactions.
Figure 13 shows a comparison of the sensitivities obtained using the standard 19 samples discussed in Sec.II B for all SK phases versus the present configuration, which separates out the FC single-ring samples from the SK IV and SK V phases (approximately 57 % of the total exposure) into additional samples which use neutron information.The figure demonstrates that using the neutron tagged samples improves the expected sensitivity to the mass ordering and δ CP by producing higher ∆χ 2 value [see Eq. ( 10)] away from the true oscillation parameters.

Atmospheric-only Results
Results in this section and the next are presented as ∆χ 2 contours of one or two oscillation parameters taken with respect to the global best-fit point, with and without constraints on θ 13 .The contours are profiled: We draw the minimum ∆χ 2 values among all other combinations of oscillation parameters scanned with one or two oscillation parameters fixed.
Figure 14 shows the one-dimensional (1D) ∆χ 2 contours in the atmospheric-only analyses for the fitted neutrino oscillation parameters δ CP , sin 2 θ 13 , ∆m 2  32,31 , and sin 2 θ 23 , with respect to the best-fit point across both mass orderings.The normal ordering is preferred: The difference between the minimum χ 2 in the inverted and normal orderings is ∆χ Sensitivity to δ CP in the SK data originates from both sub-GeV and multi-GeV e-like samples, where values of δ CP near −π/2 indicate increased ν e appearance and decreased νe appearance relative to δ CP = 0.The bestfit value of δ CP is −1.89 in both orderings, signalling increased ν e appearance.The constraints on δ CP and sin 2 θ 13 assuming the inverted ordering are weaker, although consistent with the best-fit values assuming the normal ordering.These weaker constraints are an expected consequence of the freedom to adjust sin 2 θ 13 and δ CP simultaneously, combined with the reduced antineutrino statistics relative to neutrino statistics in the atmospheric neutrino data.
The preferred values of sin 2 θ 13 are 0.020 and 0.010 in the normal ordering and inverted ordering respectively.Both results are consistent with the reactor-preferred value of sin 2 θ 13 = 0.0220 at the 90 % confidence level, and sin 2 θ 13 = 0 is disfavored in the normal ordering at the 99 % level.The data favor the magnitude of the fitted squared mass difference |∆m straints on this parameter, especially evident in the inverted ordering fit, is a consequence of rapidly varying oscillation probabilities in the sub-GeV samples.Finally, the atmospheric neutrino data place the best-fit value of sin 2 θ 23 in the lower octant, sin 2 θ 23 = 0.45, although values in each octant are allowed at the 68 % level.
As discussed in Sec.I A, the combination of nonzero sin 2 θ 13 and a normal neutrino mass ordering leads to electron neutrino appearance for upward-going multi-GeV events.We observe excess electron-flavor upwardgoing multi-GeV, single ring and multi-GeV, multi-ring events in the SK data. Figure 15 shows a projection of the multi-GeV e-like samples as an up-down asymmetry: where "Up" is the number of upward-going, cos θ z < −0.4,events, and "Down" is the number of downwardgoing, cos θ z > 0.4 events.The figure plots the asymmetry for these data as a function of reconstructed energy, and the expected asymmetry for the normal and inverted ordering scenarios assuming the best-fit oscillation parameters from the fit to all atmospheric neutrino data.
The ν e -enhanced samples, multi-GeV ν e -like and multiring ν e -like, have the largest excesses relative to either ordering, and drive the preference for the normal mass ordering in the analysis.

Results with Reactor Constraints on sin 2 θ13
Figure 16 shows the 1D ∆χ 2 profiles for the fitted neutrino oscillation parameters assuming the constraint sin 2 θ 13 = 0.0220 ± 0.0007 from reactor antineutrino disappearance experiments [16].The constraint on sin 2 θ 13 is incorporated by introducing an additional systematic uncertainty for this fit, where the 1σ effect is defined as the change induced by varying sin 2 θ 13 by its measured 1σ uncertainty.
The best-fit value of δ CP in both the normal and in- Up-down asymmetry for multi-GeV e-like events.The y-axis is the asymmetry parameter, the ratio between the difference and sum of upward-going, cos θz < 0.4 and downward going, cos θz > 0.4 events.The x-axis is the reconstructed neutrino energy: For single ring events, the reconstructed energy is the visible energy of the ring assuming the reconstructed ring is an electron, while for multi-ring events, it is the total visible energy of the event.All error bars are statistical.MC lines for the normal and inverted orderings are drawn assuming the best-fit oscillation parameters of the SK + sin 2 θ13-constrained analysis.SK IV-V multi-GeV single-ring events are selected using the number of tagged neutrons, and so are separated from the SK I-III multi-GeV single-ring samples.and sin 2 θ23 for the normal mass ordering.Contours are drawn for a 90 % critical χ 2 value assuming 2 degrees of freedom, with the ∆χ 2 computed for each experiment with respect to the best-fit point in the normal mass ordering.The Super-K contour shows the result of this analysis, and other contours are adapted from publications by MINOS+ [53], NOvA [56], T2K [54], and IceCube [55].Bestfit points are indicated with markers for each experiment.
verted orderings for the fit with sin 2 θ 13 constrained is −1.75, which is consistent with the atmospheric-only analysis at the 1σ level.This fit also finds improved constraints on δ CP in the inverted ordering for values near π/2: The constraint on sin 2 θ 13 fixes the effect size of the mass ordering, such that the separate modifications to ν e appearance from δ CP are more readily resolved.
In this fit, the preference for the normal ordering increases to ∆χ 2 I.O.−N.O.= 5.69.This improvement is consistent with the observed preference for smaller value of sin 2 θ 13 in the inverted ordering fit with sin 2 θ 13 free: The χ 2 value in the inverted ordering increases with the added constraint, while the χ 2 value in the normal ordering remains similar to the result without the constraint.
Figure 17 shows the comparison of the results of the θ 13 -constrained analysis of SK atmospheric neutrino data with other contemporary results from MI-NOS/MINOS+ [53], NOvA [4], T2K [54], and Ice-Cube [55].As can be seen from the contours, SK atmospheric neutrino data are consistent with the other experiments at the 90 % level.While the atmospheric neutrino data find a best-fit value of sin 2 θ 23 in the lower octant, we note that the previous publication found a best-fit value in the upper octant [5], and that value of sin 2 θ 23 in both octants are allowed at the 1σ level.

V. INTERPRETATION
Table V summarizes the fit results of the analyses presented in this work.In both analyses, the normal ordering is preferred, and the best-fit oscillation parameters predict weaker sensitivities to the neutrino mass ordering than the observed ∆χ 2 I.O.−N.O. .To quantify the significance of the mass ordering preference from the fit results, we generated ensembles of toy data sets to produce distribution of the ∆χ 2 I.O.−N.O.statistic3 .Each toy data set consists of fluctuated counts according to each bin's statistical uncertainty which are scaled by randomly sampling the systematic uncertainty coefficients.Ensembles were generated assuming both the normal and inverted mass orderings and with oscillation parameters fixed at the best-fit points.We fit each toy data set in each ordering with ∆m 2  32 , sin 2 θ 23 , and δ CP as free parameters to compute ∆χ 2 I.O.−N.O. .Figure 18 shows the distribution of ∆χ 2 I.O.−N.O.compared with the data fit result for the atmospheric analysis with sin 2 θ 13 constrained.The probability of observing a more extreme result than the data (the p-value) is given by the area to the right of the data line in the normal ordering scenario, and by the area to the left of the data line in the inverted ordering scenario.While the figure shows the p-value determined from simulated data sets for the inverted mass ordering is 0.0091, we note that with the present SK statistics, the expected sensitivity remains weak for rejecting either ordering.Indeed, the p-value for the data result within the normal ordering, 0.88, is not especially likely either.For the situation in which the data must select between two mutually-exclusive hypotheses, the CL s method [58] provides an estimate of the p-value that considers simultaneous agreement from both hypotheses: where p N.O.and p I.O.refer to the p-values in the normal or inverted ordering.This prescription decreases the significance for rejecting the inverted hypothesis proportional to the simultaneous significance of accepting the normal ordering hypothesis.The CL s for the atmospheric analysis with sin 2 θ 13 constrained is 0.077, corresponding to a rejection of the inverted mass ordering at the 92.3 % confidence level.This number is similar to the previous SK result, CL s = 0.070 [5], despite originating from a larger ∆χ 2 , 5.69 versus 4.33.While the mass ordering sensitivity and data result both increased for this analysis, the probability of obtaining the data result simultaneously decreased in both orderings, resulting in a similar CL s value.The p-value obtained from toy data sets depends on the choice of oscillation parameters.While the atmospheric analyses places sin 2 θ 23 in the lower octant, values of sin 2 θ 23 spanning both octants are allowed at the 1σ level.Larger values of sin 2 θ 23 and δ CP values near −π/2 increase the sensitivity of SK for rejecting the incorrect mass ordering since they enhance the ν µ → ν e signal.Accordingly, the mass ordering is more difficult to resolve for smaller values of sin 2 θ 23 and values of δ CP near π/2.To demonstrate the dependence of CL s outcomes on the choice of oscillation parameters, we repeated the generation of toy data sets for configurations of oscillation parameters which maximize and minimize the expected sensitivity to rejecting the incorrect mass ordering and are allowed at the 90 % confidence level.The range of CL s values obtained in the atmospheric fit with sin 2 θ 13 constrained span 0.033-0.220.We observe that upperoctant values of sin 2 θ 23 predict larger∆χ 2 I.O.−N.O.values which are closer to the observed data result.We anticipate that better constraints on the sin 2 θ 23 octant will reduce the difference between the ∆χ 2 I.O.−N.O.values expected from MC and obtained from data.

VI. CONCLUSION
We analyzed 6511.3 live-days of atmospheric neutrino data collected with the Super-Kamiokande experiment operating with pure water and an expanded fiducial volume.An event selection using tagged neutron information was used to enhance the statistical separation of neutrino and anti-neutrino data, thereby increasing the sensitivity to the neutrino mass ordering.An analysis of SK data with constraints on sin 2 θ 13 measures the oscillation parameters to be ∆m 2 32 = 2.40 +0.07 −0.09 eV 2 , sin 2 θ 23 = 0.45 +0.06 −0.03 , and δ CP = −1.75+0.76 −1.25 .The analysis prefers the normal ordering over the inverted ordering at the 92.3 % confidence level.We anticipate improvements to the mass ordering sensitivity in future analyses of SK atmospheric data which include gadolinium-enhanced neutron tagging for enhanced neutrino and anti-neutrino separation.
This work is accompanied by a data release [59].The data release contains the 1D and 2D contours of fitted oscillation parameters at the 68 % and 90 % confidence levels from the analyses described in Sec.IV, and listings of the data and MC counts in the 930 atmospheric neutrino bins used throughout this work.Summary statistics of the MC neutrino energies, directions, and flavors are provided for each bin.

FIG. 3 .FIG. 4 .
FIG.3.Performance of energy reconstruction for various sources across all SK periods.Solid points indicate measurements made within the conventional fiducial volume (vertex is greater than 200 cm from the detector walls) while dashed points correspond to measurements in the additional fiducial volume (vertex is between 100 cm and 200 cm from the detector walls).Michel-e refers to the stopping muon decay electron spectrum calibration.

FIG. 6 .
FIG.6.Zenith angle distributions for the SK IV & SK V FC sub-GeV and multi-GeV single-ring samples using the number of tagged neutrons.The MC configuration and the meaning of the colors are identical to Fig.5.

FIG. 7 .
FIG.7.Monte-Carlo distributions of input variables for the FC multi-GeV, multi-ring BDT.Distributions are area-normalized.LDecay e. is the maximum distance of any decay electron from the event's primary vertex.pT and pMax.are the transverse momentum and the momentum of the most energetic ring, respectively, and pTot. is the sum of the momenta from all reconstructed rings.

FIG. 8 .
FIG.8.Data and MC distributions of the multi-ring BDT classifier scores for multi-ring νe and νe events.The BDT makes a final classification decision for each event based on the class with the highest score: Negative values correspond to νe-like and positive values correspond to νe-like.Distributions are shown for the SK IV-V phases, including events in the expanded fiducial volume region.MC events are shown with oscillations applied, and without the effect of any fitted systematic uncertainty parameters.The "MC total" distribution includes contributions from true νµ, ντ , and NC events.

FIG. 9 .
FIG.9.SK IV-V data and MC distributions of the likelihood variables used to classify events into the different analysis samples.The top row shows events in the conventional fiducial volume region, while the bottom row shows events in the additional fiducial volume region.The first and third columns show sub-GeV events and the second and fourth columns show multi-GeV events.The left four panels show the PID likelihood for the most energetic ring in each event: Negative values indicate the ring is e-like while positive values indicate the ring is µ-like.The right four panels show the ring-counting likelihood: Negative values indicate the event has a single-ring, while positive values indicate the event has multiple rings.In these panels, the CC QE histograms include both 1p1h and 2p2h processes, and CC Other refers to single, non-pion hadron production.All MC distributions are calculated assuming neutrino oscillations and without the effect of any systematic uncertainties.
FIG.10.Simulated neutrino energy spectra by interaction modes for the different SK event categories: fully contained (FC), partially contained (PC) and upward-going muon (Upµ).The simulated event rates are for an oscillated MC, with oscillation parameters set to the Particle Data Group best-fit values[16], except for δCP which is set to 0. "CC Other" modes refer to the production of single, non-pion hadrons.
FIG.12.Estimated fractional uncertainty on the ratio of upward-going, cos θz < 0, versus downward-going, cos θz > 0, events in the multi-GeV e-like samples (both single-ring and multi-ring) from statistical and systematic uncertainties.The fractional uncertainties are estimated by randomly fluctuating the nominal MC counts either by Poisson statistics for the "Statistics" entry, or by Gaussian fluctuations of one or more systematic uncertainties.Uncertainties on the up/down ratio in the multi-GeV e-like samples have the largest impact on the sensitivity to the neutrino mass ordering.
2 I.O.−N.O.= 5.23.The MC expectation for the mass ordering at the best-fit oscillation parameters is ∆χ 2 I.O.−N.O.= 1.53, indicated by dashed lines in Fig. 14.The difference between the data result and the MC expectation is discussed in Sec.V.

2 32 ,FIG. 14 .
FIG. 14. 1D ∆χ2 profiles of oscillation parameters in the SK-only analysis with sin 2 θ13 treated as a free parameter.Solid lines correspond to the data fit result, while dashed lines correspond to the MC expectation at the data best-fit oscillation parameters, cf.Tab.IV.Dashed lines show critical values of the χ 2 distribution for 1 degree of freedom with lowest to highest corresponding to 68 %, 90 %, 95 %, and 99 % probabilities.
FIG.15.Up-down asymmetry for multi-GeV e-like events.The y-axis is the asymmetry parameter, the ratio between the difference and sum of upward-going, cos θz < 0.4 and downward going, cos θz > 0.4 events.The x-axis is the reconstructed neutrino energy: For single ring events, the reconstructed energy is the visible energy of the ring assuming the reconstructed ring is an electron, while for multi-ring events, it is the total visible energy of the event.All error bars are statistical.MC lines for the normal and inverted orderings are drawn assuming the best-fit oscillation parameters of the SK + sin 2 θ13-constrained analysis.SK IV-V multi-GeV single-ring events are selected using the number of tagged neutrons, and so are separated from the SK I-III multi-GeV single-ring samples.

FIG. 16 .
FIG.16.1D ∆χ 2 profiles of oscillation parameters in the SK-only analysis with sin 2 θ13 constrained.Solid lines correspond to the data fit result, while dashed lines correspond to the MC expectation at the data best-fit oscillation parameters, cf.Tab.IV.Dashed lines show critical values of the χ 2 distribution for 1 degree of freedom corresponding to 68 %, 90 %, 95 %, and 99 % probabilities.

FIG. 18 .
FIG. 18. Distribution of the mass ordering preference statistic, ∆χ 2 I.O.−N.O., for ensembles of simulated data sets, assuming either the normal or inverted mass orderings.The data result from the atmospheric analysis with sin 2 θ13-constrained analysis is shown as the vertical black line.The blue and orange histograms indicate the distribution of this statistic for toy data sets assuming the normal and inverted ordering respectively.The filled areas to the left of the data result for inverted toy data sets and to the right of the data result for normal toy data sets indicate the p-values.

TABLE I .
[14]rino propagation layers and corresponding densities used for calculating neutrino oscillation probabilities in this analysis, based on a simplified PREM[14].

TABLE II .
Super-Kamiokande data-taking phases.An electronics upgrade at the start of SK IV enabled neutron tagging on hydrogen (H), utilized in the SK IV and SK V phases.During 2020, gadolinium (Gd) was added to the detector's water to increase the neutron-tagging efficiency.At the time of this writing, SK Gd is ongoing and data from the SK Gd phase are not included in this analysis.

TABLE III .
Monte Carlo CC and NC purities by sample and data event counts used in this analysis.Purities and MC counts are shown with oscillation probabilities applied and without the effects of systematic pulls.The "ντ CC" column shows the purity of both ντ and ντ CC events.The Up-µ data counts are shown after background subtraction.

TABLE IV .
The oscillation parameter grid definitions used in the three analyses presented in this work.The grids are evenly spaced, and include the minimum and maximum values.The grids are the same for both the normal and inverted mass orderings in all analyses.All three analyses treat ∆m221 and sin 2 θ12 as constrained parameters.
, n indexes each bin and O n are the observed counts in the n th bin.The expected counts E n are calculated from a nominal MC prediction E 0 n scaled by the FIG.13. 1D ∆χ 2 sensitivity profile of δCP for the normal and inverted mass ordering scenarios, with (solid lines) and without (dashed lines) neutron-based event classification for SK IV and SK V data.The sensitivity is computed assuming the true neutrino oscillation parameters are the global bestfit parameters from [16] and the normal mass ordering.The parameters ∆m 2 32 , sin 2 θ23, and δCP are free parameters in the fit, while the other oscillation parameters are fixed.

TABLE V .
Best-fit neutrino oscillation parameters from the analyses presented in this work.The uncertainties on each oscillation parameter are the ±1σ allowed regions assuming a χ 2 distribution with one degree of freedom.The second-to-last column shows the total χ 2 .Both analyses have 930 bins.