Measurements of neutrino oscillation in appearance and disappearance channels by the T2K experiment with 6.6E20 protons on target

We report on measurements of neutrino oscillation using data from the T2K long-baseline neutrino experiment collected between 2010 and 2013. In an analysis of muon neutrino disappearance alone, we find the following estimates and 68% confidence intervals for the two possible mass hierarchies:

Normal hierarchy: $\sin^2\theta_{23}=0.514^{+0.055}_{-0.056}$ and $\Delta m^2_{32}=(2.51\pm0.10)\times 10^{-3}$ eV$^2$/c$^4$;

Inverted hierarchy: $\sin^2\theta_{23}=0.511\pm0.055$ and $\Delta m^2_{13}=(2.48\pm0.10)\times 10^{-3}$ eV$^2$/c$^4$.

The analysis accounts for multi-nucleon mechanisms in neutrino interactions, which were found to introduce negligible bias. We describe our first analyses that combine measurements of muon neutrino disappearance and electron neutrino appearance to estimate four oscillation parameters and the mass hierarchy. Frequentist and Bayesian intervals are presented for combinations of these parameters, with and without including recent reactor measurements. At the 90% confidence level, and including reactor measurements, we exclude the regions $\delta_{CP}=[0.15,0.83]\pi$ for the normal hierarchy and $\delta_{CP}=[-0.08,1.09]\pi$ for the inverted hierarchy. The T2K and reactor data weakly favor the normal hierarchy with a Bayes factor of 2.2. The most probable values and 68% 1D credible intervals for the other oscillation parameters, when reactor data are included, are: $\sin^2\theta_{23}=0.528^{+0.055}_{-0.038}$ and $|\Delta m^2_{32}|=(2.51\pm0.11)\times 10^{-3}$ eV$^2$/c$^4$.

I. INTRODUCTION
Neutrino oscillation was firmly established in the late 1990s with the observation by the Super-Kamiokande (SK) experiment that muon neutrinos produced by cosmic ray interactions in our atmosphere changed their flavor [1]. Measurements from the Sudbury Neutrino Observatory a few years later, in combination with SK data, revealed that neutrino oscillation was responsible for the apparent deficit of electron neutrinos produced in the Sun [2]. In the most recent major advance, the T2K experiment [3,4] and reactor experiments [5][6][7][8] have established that all three neutrino mass states are mixtures of all three flavor states, which allows the possibility of CP violation in neutrino oscillation. This paper describes our most recent measurements of neutrino oscillation, including our first results from analyses that combine measurements of muon neutrino disappearance and electron neutrino appearance.
The Tokai to Kamioka (T2K) experiment [9] was made possible by the construction of the J-PARC high-intensity proton accelerator at a site that is an appropriate distance from the SK detector for precision measurements of neutrino oscillation. Protons, extracted from the J-PARC main ring, strike a target to produce secondary hadrons, which are focused and subsequently decay in-flight to produce an intense neutrino beam, consisting mostly of muon neutrinos. The neutrino beam axis is directed 2.5 degrees away from the SK detector, in order to produce a narrow-band 600 MeV flux at the detector, the energy that maximizes muon neutrino oscillation at the 295 km baseline. Detectors located 280 m downstream of the production target measure the properties of the neutrino beam, both on-axis (INGRID detector) and off-axis in the direction of SK (ND280 detector).
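The off-axis energy tuning can be checked with a back-of-envelope calculation: in the two-flavor approximation, the muon neutrino survival probability is minimized when $1.267\,\Delta m^2[\mathrm{eV}^2]\,L[\mathrm{km}]/E[\mathrm{GeV}] = \pi/2$. A minimal sketch, using the $|\Delta m^2_{32}|$ value quoted in the abstract:

```python
import math

def first_osc_max_energy(dm2_ev2, baseline_km):
    """Neutrino energy (GeV) at the first oscillation maximum, where
    1.267 * dm2 * L / E = pi/2 in the two-flavor approximation
    (dm2 in eV^2, L in km, E in GeV)."""
    return 1.267 * dm2_ev2 * baseline_km * 2.0 / math.pi

E = first_osc_max_energy(2.51e-3, 295.0)
print(f"{E * 1000:.0f} MeV")  # close to the 600 MeV design value
```

This reproduces why a narrow-band flux peaked near 600 MeV is the right choice for the 295 km baseline.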

T2K began operation in 2010 and was interrupted for one year by the Great East Japan
Earthquake in 2011. The results reported in this paper use data collected through 2013, as summarized in Tab. I. With these data, almost 10% of the total proposed for the experiment, T2K enters the era of precision neutrino oscillation measurements. In 2014, we began to collect our first data in which the current in the magnetic focusing horns is reversed, so as to produce a beam primarily of muon anti-neutrinos. Future publications will report on measurements using that beam configuration.
We begin this paper by describing the neutrino beamline and how we model neutrino production and interactions. We then summarize the near detectors and explain how we use their data to improve model predictions of neutrino interactions at the far detector. This is followed by an overview of the far detector, how neutrino candidate events are selected, and how we model the detector response. Next, we describe the neutrino oscillation model, list the external inputs for the oscillation parameters, summarize the approaches used in the oscillation analyses, and characterize our main sources of systematic uncertainty. The final sections give detailed descriptions and results for the analysis of ν µ disappearance alone [10] and for the joint analyses of ν µ disappearance and ν e appearance.

II. NEUTRINO BEAMLINE
The T2K primary beamline transports and focuses the 30 GeV proton beam extracted from the J-PARC Main Ring onto a 91.4-cm long graphite target. The secondary beamline consists of the target station, decay volume, and beam dump. The apparatus has been described in detail elsewhere [9].
The upstream end of the target station contains a collimator to protect the three downstream magnetic horns.

The secondary beamline is simulated in order to estimate the nominal neutrino flux (in the absence of neutrino oscillations) at the near and far detectors, and the covariance arising from uncertainties in hadron production and the beamline configuration [11]. We use the FLUKA 2008 package [12,13] to model the interactions of the primary beam protons and the subsequently produced pions and kaons in the graphite target. As described below, we tune this simulation using external hadron production data. Particles exiting the target are tracked through the magnetic horns and decay volume in a GEANT3 [14] simulation using the GCALOR [15] package to model the subsequent hadron decays.
In order to precisely predict the neutrino flux, each beam pulse is measured in the primary neutrino beamline. The suite of proton beam monitors consists of five current transformers, which measure the proton beam intensity; 21 electrostatic monitors, which measure the proton beam position; and 19 segmented secondary emission monitors and an optical transition radiation monitor [16], which measure the proton beam profile. The proton beam properties have been stable throughout T2K operation, and their values and uncertainties for the most recent T2K run period, Run 4, are given in Tab. II. The values for other run periods have been published previously [11]. The neutrino beam position and width stability is also monitored by the INGRID detector, and the results are given in Sec. IV A.

TABLE II: Summary of the estimated proton beam properties and their systematic errors at the collimator for the T2K Run 4 period. Shown are the mean position (X, Y), angle (X′, Y′), width (σ), emittance (ε), and Twiss parameter (α) [17].
To improve the modeling of hadron interactions inside and outside the target, we use data from the NA61/SHINE experiment [18,19] collected at 31 GeV/c and several other experiments [20][21][22]. The hadron production data used for the oscillation analyses described here are equivalent to those used in our previous publications [3,11], including the statisticslimited NA61/SHINE dataset taken in 2007 on a thin carbon target. The NA61/SHINE data analyses of the 2009 thin-target and T2K-replica-target data are ongoing, and these additional data will be used in future T2K analyses. We incorporate the external hadron production data by weighting each simulated hadron interaction according to the measured multiplicities and particle production cross sections, using the true initial and final state hadron kinematics, as well as the material in which the interaction took place. The predicted flux at SK from the T2K beam is shown in Fig. 1.
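The reweighting procedure can be sketched as follows. The binning keys and multiplicity tables here are illustrative assumptions, not the actual NA61/SHINE binning:

```python
def hadron_weight(particle, p_bin, theta_bin, material, data_mult, mc_mult):
    """Weight for one simulated hadron interaction: the ratio of measured
    to simulated production multiplicity in the relevant kinematic bin
    and target material (a sketch; the keys and binning are assumptions)."""
    key = (particle, p_bin, theta_bin, material)
    return data_mult[key] / mc_mult[key]

# toy multiplicity tables for a single (pi+, momentum bin, angle bin, carbon) bin
data = {("pi+", 3, 1, "C"): 0.84}
mc = {("pi+", 3, 1, "C"): 0.80}

# a full event weight would be the product over the hadronic ancestry chain
w_event = hadron_weight("pi+", 3, 1, "C", data, mc)
print(f"{w_event:.3f}")  # 1.050
```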

B. Neutrino flux uncertainties
Uncertainty in the neutrino flux prediction arises from the hadron production model, proton beam profile, horn current, horn alignment, and other factors. For each source of uncertainty, we vary the underlying parameters to evaluate the effect on the flux prediction in bins of neutrino energy for each neutrino flavor [11]. Table III shows the breakdown for the unoscillated ν µ and ν e flux prediction at SK near the peak energy, without the use of near detector data.

TABLE III: Contributions to the systematic uncertainties for the unoscillated ν µ and ν e flux prediction at SK, near the peak energy and without the use of near detector data.
The largest uncertainty from beam monitor calibrations arises in the beam current measurement using a current transformer, but its effect on the oscillation analyses is reduced through the use of near detector data. The remaining uncertainties, due to the uncertain position and calibration of the other beam monitors, are significantly smaller. As described in Sec. IV A, the neutrino beam direction is determined with the INGRID detector, and therefore the assigned uncertainty on the off-axis angle comes directly from the INGRID beam profile measurement. To account for drift over time in the horn current measurement and for a possible scale uncertainty, a conservative error of 5 kA is assigned to the horn current. In the flux simulation, the horn magnetic field is assumed to have a 1/r dependence. Deviations from this field, measured using a Hall probe, are used to define the uncertainty of the horn field. Horn and target alignment uncertainties come from survey measurements.
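For the 1/r field model above, the size of the field and the relative effect of the 5 kA error can be illustrated with an ideal toroidal conductor; the 2 cm radius is an arbitrary illustrative choice, not a T2K horn dimension:

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability, T*m/A

def horn_field(current_a, r_m):
    """Toroidal field of an ideal horn: B = mu0 * I / (2 * pi * r)."""
    return MU0 * current_a / (2.0 * math.pi * r_m)

b_nom = horn_field(250e3, 0.02)  # nominal 250 kA current at r = 2 cm
b_err = horn_field(5e3, 0.02)    # conservative 5 kA current error
print(f"B = {b_nom:.2f} T +- {b_err:.2f} T ({100 * b_err / b_nom:.0f}%)")
```

Because B is linear in the current, the 5 kA error translates directly into a 2% field uncertainty at any radius.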
Systematic uncertainties in modeling particle multiplicities from hadronic interactions come from several sources: experimental uncertainties in the external data, the uncertain scaling to different incident particle momenta and target materials, and extrapolation to regions of particle production phase space not covered by external data [11]. The overall uncertainty is described by calculating the covariance of the pion, kaon, and secondary nucleon multiplicities and their interaction lengths.
The systematic errors on the ν µ flux at SK, without applying near detector data, are shown in bins of neutrino energy in Fig. 2. The dominant source of uncertainty is from hadron production.
For analyses of near and far detector data, the uncertainties arising from the beamline configuration and hadron production are propagated using a vector of systematic parameters, b, which scale the nominal flux in bins of neutrino energy, for each neutrino type (ν e , ν µ , ν̄ e , ν̄ µ ) at each detector (ND280 and SK). The energy binning for each neutrino type is shown in Fig. 1. The covariance for these parameters is calculated separately for each T2K run period given in Tab. I, and the POT-weighted average is the flux covariance, V b , used by the near detector and oscillation analyses. We define b n and b s as the sub-vectors of b for ND280 and SK, respectively. It is through the covariance between b n and b s that the near detector measurements of ν µ events constrain the expected unoscillated far detector ν µ and ν e event rates in the oscillation analyses.

III. NEUTRINO INTERACTIONS

This section describes the interaction model in NEUT, the primary neutrino interaction generator used by T2K; explains how we use data from external experiments to provide initial constraints on the model before fitting to T2K data; discusses remaining uncertainties not constrained by external data sources; and discusses uncertainties based on differences between the NEUT model and those found in other interaction generators.

A. Neutrino Interaction Model
The interaction model used in this analysis is NEUT [23] version 5.1.4.2, which models neutrino interactions on various nuclear targets over a range of energies from ∼100 MeV to ∼100 TeV. NEUT simulates seven types of charged current (CC) and neutral current (NC) interactions: (quasi-)elastic scattering, single pion production, single photon production, single kaon production, single eta production, deep inelastic scattering (DIS), and coherent pion production. Interactions not modeled in this version of NEUT include, but are not limited to, multi-nucleon interactions in the nucleus [24,25], and neutrino-electron scattering processes.
The Llewellyn Smith model [26] is used as the basis to describe charged current quasielastic (CCQE) and neutral current elastic scattering (NCEL) interactions. To take into account the fact that the target nucleon is in a nucleus, the Relativistic Fermi Gas (RFG) model by Smith and Moniz [27,28] is used.

The Rein and Sehgal model [30] is used to simulate neutrino-induced single pion production. The model splits the interaction into two steps: $\nu + N \rightarrow \ell + N^*$, followed by $N^* \rightarrow \pi + N'$, where $N$ and $N'$ are nucleons, $\ell$ is the outgoing neutrino or charged lepton, and $N^*$ is the resonance. For the initial cross section calculation, the amplitude of each resonance production is multiplied by the branching fraction of the resonance into a pion and nucleon. Interference between 18 resonances with masses below 2 GeV/c$^2$ is included in the calculation. To avoid double counting processes that produce a single pion through either resonance production or DIS, the invariant hadronic mass $W$ is restricted to be less than 2 GeV/c$^2$ in the total cross section calculation. The model assigns a 20% branching fraction to the additional delta decay channel that can occur in the nuclear medium, $\Delta + N \rightarrow N + N$, which we refer to as pion-less delta decay (PDD). Since the Rein and Sehgal model provides the amplitudes of the neutrino resonance production, we adjust the NEUT predictions for the cross sections of single photon, kaon, and eta production by changing the branching fractions of the various resonances.
The coherent pion production model is described in [31]. The interaction is described as $\nu + A \rightarrow \ell + \pi + X$, where $A$ is the target nucleus, $\ell$ is the outgoing lepton, $\pi$ is the outgoing pion, and $X$ is the remaining nucleus. The CC component of the model takes into account the lepton mass correction provided by the same authors [32].
The DIS cross section is calculated over the range $W > 1.3$ GeV/c$^2$. The structure functions are taken from the GRV98 parton distribution functions [33], with corrections proposed by Bodek and Yang [34] to improve agreement with experiments in the low-$Q^2$ region.
To avoid double counting single pion production with the resonance production described above, in the region $W \leq 2$ GeV/c$^2$ the model includes only the probability to produce more than one pion. For $W > 2$ GeV/c$^2$, NEUT uses PYTHIA/JetSet [35].

Interactions of hadrons inside the nucleus are simulated with a stepping cascade model in which a formation zone is taken into account. The hadron is moved a small distance and the interaction probabilities for that step are calculated. The interaction types include charge exchange, inelastic scattering, particle production, and absorption. If an interaction has occurred, the kinematics of the particle are changed, as well as the particle type if needed. The process is repeated until all particles are either absorbed or escape the nucleus.
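The stepping logic of such a cascade can be sketched with a toy model. The step size, nuclear radius, and constant interaction probability below are illustrative assumptions, not NEUT's actual position- and momentum-dependent values, and only absorption versus escape is tracked:

```python
import random

def toy_cascade(radius_fm=3.0, step_fm=0.2, p_int_per_fm=0.1, seed=1):
    """Toy hadron cascade: step the particle outward in small increments,
    testing an interaction probability at each step, until it is either
    absorbed or escapes the nucleus (illustrative only)."""
    rng = random.Random(seed)
    r = 0.0
    while r < radius_fm:
        if rng.random() < p_int_per_fm * step_fm:
            return "absorbed"
        r += step_fm
    return "escaped"

counts = {"absorbed": 0, "escaped": 0}
for trial in range(1000):
    counts[toy_cascade(seed=trial)] += 1
print(counts)  # most toy pions escape with these parameters
```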

B. Constraints From External Experiments
To establish prior values and errors for the neutrino-interaction systematic parameters x, and to constrain the subset to which ND280 observables are insensitive, we use neutrino-nucleus scattering data from external experiments.
The datasets external to T2K come from two basic sources: pion-nucleus and neutrino-nucleus scattering experiments. To constrain pion-nucleus cross section parameters in the NEUT FSI model, pion-nucleus scattering data on a range of nuclear targets are used. The most important external source of neutrino data for our interaction model parameter constraints is the MiniBooNE experiment [37]. The MiniBooNE flux [38] covers an energy range similar to that of T2K, and, as a 4π detector like SK, MiniBooNE has a similar phase space acceptance; as a result, NEUT is tested over a broader range of $Q^2$ than in current ND280 analyses.

Constraints From Pion-Nucleus Scattering Experiments
To evaluate the uncertainty in the pion transport model in the nucleus, we consider the effects of varying the pion-nucleus interaction probabilities via six scale factors. These scale factors affect the following processes in the cascade model: absorption ($x_{FSABS}$), low energy QE scattering including single charge exchange ($x_{FSQE}$), low energy single charge exchange (SCX) ($x_{FSCX}$), high energy QE scattering ($x_{FSQEH}$), high energy SCX ($x_{FSCXH}$), and pion production ($x_{FSINEL}$). The low (high) energy parameters are used for events with pion momenta below (above) 500 MeV/c. The simulation used to perform this study is similar to the one in [39]. The model is fit to a range of energy-dependent cross sections on nuclear targets from carbon to lead. The best-fit scale factors for these parameters are shown in Tab. IV, as well as the maximum and minimum values for each parameter, taken from 16 points on the 1σ surface of the 6-dimensional parameter space.
The parameter sets are used for assessing systematic uncertainty in secondary hadronic interactions in the near and far detectors, as discussed in Secs. VB and VIC, respectively.

Constraints From MiniBooNE CCQE Measurements
To constrain parameters related to the CCQE model and its overall normalization, we fit the 2D cross-section data from MiniBooNE [67], binned in the outgoing muon kinetic energy, $T_\mu$, and angle with respect to the neutrino beam direction, $\theta_\mu$. The NEUT interactions selected for the fit are all true CCQE interactions. Our fit procedure follows that described by Juszczak et al. [68], with the $\chi^2$ defined as

$\chi^2 = \sum_i \frac{\left(p_i^d - \lambda\, p_i^p\right)^2}{\left(\Delta p_i\right)^2} + \left(\frac{\lambda - 1}{\Delta\lambda}\right)^2,$

where the index $i$ runs over the bins of the $(T_\mu, \cos\theta_\mu)$ distribution, $p_i^{d(p)}$ is the measured (predicted) differential cross section, $\Delta p_i$ is its uncertainty, $\lambda$ is the CCQE normalization, and $\Delta\lambda$ is the normalization uncertainty, set at 10.7% by MiniBooNE measurements. The main difference from the procedure in [68] is that we include $(T_\mu, \cos\theta_\mu)$ bins where a large percentage of the events have 4-momentum transfers that are not allowed in the RFG model.
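Assuming the standard form of this penalized χ² (bin-by-bin residuals plus a pull term on the flux normalization), a minimal sketch of the minimization over λ with toy data, not the MiniBooNE measurements:

```python
import numpy as np

def chi2(data, pred, err, lam, dlam=0.107):
    """Penalized chi-square: bin residuals with a floating normalization
    lam, constrained by its fractional uncertainty dlam (10.7% here)."""
    resid = (data - lam * pred) / err
    return np.sum(resid**2) + ((lam - 1.0) / dlam) ** 2

# toy 3-bin example (illustrative numbers only)
d = np.array([10.0, 8.0, 5.0])   # "measured" differential cross sections
p = np.array([11.0, 9.0, 6.0])   # "predicted" values
e = np.array([1.0, 1.0, 1.0])    # bin uncertainties
lams = np.linspace(0.7, 1.1, 401)
best = min(lams, key=lambda l: chi2(d, p, e, l))
print(f"best-fit normalization: {best:.3f}")
```

The penalty term keeps λ from drifting arbitrarily far from unity; with these toy numbers the fit pulls the normalization down to compensate for the overpredicting model.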
We find $M_A^{QE} = 1.64 \pm 0.03$ GeV/c$^2$ and $\lambda = 0.88 \pm 0.02$ with $\chi^2_{min}$/DOF = 26.9/135. It should be noted that MiniBooNE does not report correlations, and without this information assessing the goodness-of-fit is not possible. To take this into account, we assign the uncertainty to be the difference between the fit result and the nominal value, plus the uncertainty on the fit result. The $M_A^{QE}$ uncertainty is set to 0.45 GeV/c$^2$, which covers (at 1 standard deviation) the point estimates from our fit to the MiniBooNE data, the K2K result [69], and the world deuterium average, 1.03 GeV/c$^2$ [70]. The normalization uncertainty for neutrinos with $E_\nu < 1.5$ GeV, $x^{QE}_1$, is set to 11%, the MiniBooNE flux normalization uncertainty, since most of the MiniBooNE neutrinos are produced in this energy range.

Constraints From MiniBooNE Inclusive π Measurements
To constrain single pion production parameter errors, we use published MiniBooNE differential cross-section datasets for CC single π$^0$ production (CC1π$^0$) [71], CC single π$^+$ production (CC1π$^+$) [72], and NC single π$^0$ production (NC1π$^0$) [73]. Because these modes are described by a set of common parameters in NEUT, we perform a joint fit to all three datasets.
The selection of NEUT simulated events follows the signal definition in each of the MiniBooNE measurements. For the (CC1π$^0$, CC1π$^+$, NC1π$^0$) selections, the signals are defined as (ν µ , ν µ , ν) interactions with (1, 1, 0) µ$^-$ and exactly one (π$^0$, π$^+$, π$^0$) exiting the target nucleus, with no additional leptons or mesons exiting. In all cases, there is no constraint on the number of nucleons or photons exiting the nucleus.
We consider a range of models by adjusting the 9 parameters shown in Tab. V. $M_A^{RES}$ is the axial vector mass for resonant interactions, which affects both the rate and the $Q^2$ shape of interactions. The "W shape" parameter is an empirical parameter that we introduce in order to improve agreement with the NC1π$^0$ $|p_{\pi^0}|$ data. The weighting function used is a Breit-Wigner function with a phase space term:

$f(W; S) = \alpha \, P(W; m_\pi, m_N) \, \frac{S}{(W - W_0)^2 + S^2/4},$

where $S$ is the "W shape" parameter, $W_0 = 1218$ MeV/c$^2$, $P(W; m_\pi, m_N)$ is the phase space for a two-body decay of a particle with mass $W$ into particles with masses $m_\pi$ and $m_N$, and $\alpha$ is a normalization factor calculated to leave the total nucleon-level cross section unchanged as $S$ is varied. The nominal values of $S$ and $W_0$ come from averages of fits to two $W$ distributions of NEUT interactions, one with a resonance decaying to a neutron and π$^+$ and the other with it decaying to a proton and π$^0$. The "CCOther shape" parameter, $x_{CCOth}$, modifies the neutrino energy dependence of the cross section for a combination of CC modes, as described in Sec. III C; the remaining parameters are normalizations applied to the NEUT interaction modes. Simulated events modified by $x_{CCOth}$ constitute a small fraction of the selected samples. As a result, the data have minimal power to constrain this parameter, and likewise for the NC1π$^\pm$, NC coherent pion, and NCOther normalization parameters, $x_{NC1\pi^\pm}$, $x_{NCcoh\pi}$, and $x_{NCOth}$, respectively. The T2K oscillation analyses are insensitive to these poorly determined parameters, and an arbitrary constraint is applied to stabilize the fits. In our external data analysis the NC coherent normalization cannot be constrained independently of the NC1π$^0$ normalization, $x_{NC1\pi^0}$, because there is no difference in the $|p_{\pi^0}|$ spectrum between the two components. The errors given in Tab. V also include the variance observed when refitting with the 16 sets of pion scattering parameters described above.
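A sketch of this reweighting, assuming the Breit-Wigner takes its usual form with S playing the role of the width and the phase-space factor given by the two-body decay momentum; the S values, grid, and normalization scheme are illustrative assumptions:

```python
import numpy as np

def two_body_p(W, m_pi=0.1396, m_N=0.9396):
    """Momentum of a two-body decay W -> pi N in the W rest frame (GeV/c);
    zero below threshold. Used here as the phase-space factor P."""
    if W < m_pi + m_N:
        return 0.0
    return np.sqrt((W**2 - (m_pi + m_N) ** 2) * (W**2 - (m_pi - m_N) ** 2)) / (2.0 * W)

def w_weight(W, S, W0=1.218):
    """Breit-Wigner of width S times two-body phase space, the assumed
    form of the empirical 'W shape' weight (normalization alpha omitted)."""
    return two_body_p(W) * S / ((W - W0) ** 2 + S**2 / 4.0)

# alpha is chosen so the integrated rate (total cross section) is unchanged
W_grid = np.linspace(1.08, 2.0, 500)
nominal = np.array([w_weight(W, 0.117) for W in W_grid])
varied = np.array([w_weight(W, 0.150) for W in W_grid])
alpha = nominal.sum() / varied.sum()
print(f"alpha = {alpha:.3f}")
```

Because a Breit-Wigner integrates to a value nearly independent of its width, α stays close to unity for moderate changes in S.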
The CC coherent pion cross section is assigned an error of 100% because only 90% confidence-level upper limits existed for sub-GeV neutrino energies at the time of this analysis. In addition, when included in the MiniBooNE pion production fits, the data are consistent with the nominal NEUT model at 1σ and with zero cross section at 2σ. The NC coherent pion production data [76] differ from NEUT by 15%, within the measurement uncertainty of 20%. To account for the difference and the uncertainty, we conservatively assign a 30% overall uncertainty to the NC coherent pion normalization.

The anti-neutrino/neutrino cross section ratios are assigned an uncertainty of 40%. This conservative estimate is derived by doubling the maximum deviation, 20%, between the energy-dependent MiniBooNE CCQE neutrino cross section and the RFG model with an axial mass of $M_A^{QE} = 1.03$ GeV/c$^2$.
For ν e CCQE interactions, there may be effects that are not accounted for in the NEUT model, such as the existence of second class currents, as motivated in Ref. [77]. The dominant source of uncertainty is the vector component, which may be as large as 3% at the T2K beam peak, and thus is assigned as an additional error on ν e CCQE interactions relative to ν µ CCQE interactions.

For the ND280 data and the ν e analyses at SK, resonant production of a nucleon and charged pion is also included in the NCOther definition, though it is kept separate in other analyses. NCOther interactions are assigned a 30% normalization error, which is applied to the parameters $x_{NCOth}$ and $x_{NC1\pi^\pm}$.

D. Alternative Models
As mentioned above, NEUT's default model for CCQE assumes an RFG for the nuclear potential and the momentum distribution of the nucleons. An alternative model, referred to as the "spectral function" (SF) [79], agrees better with electron scattering data. SF is a generic term for a function that describes the momentum and energy distributions of nucleons in a nucleus. In the model employed in [79], the SF consists of a mean-field term for single particles and a term for correlated pairs of nucleons, which leads to long tails in the momentum and binding energy distributions. It also includes the nuclear shell structure of oxygen, the main target nucleus in the T2K far detector. The difference between the RFG and SF models is treated with an additional systematic parameter.
At the time of this analysis, the SF model had not been implemented in NEUT, so the NuWro generator [80] was used for generating SF interactions with the assumption that a NEUT implementation of SF would produce similar results. The SF and RFG distributions were produced by NuWro and NEUT, respectively, for ν µ and ν e interactions on both carbon and oxygen, while using the same vector and axial form factors.
The weight applied to each NEUT CCQE event is the ratio of the SF and RFG cross sections in NuWro, evaluated at the true lepton momentum, angle, and neutrino energy of the interaction. Overall, this weighting would change the predicted total cross section by 10%. Since the oscillation analysis already includes an uncertainty on the total CCQE cross section, the NuWro cross section is scaled so that it agrees with the NEUT CCQE cross section at $E_\nu = 1$ GeV.
A parameter x SF is included to allow the cross section model to be linearly adjusted between the extremes of the RFG (x SF =0) and SF (x SF =1) models. The nominal value for x SF is taken to be zero, and the prior distribution for x SF is assumed to be a standard Gaussian (mean zero and standard deviation one) but truncated outside the range [0,1].
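The interpolation and the truncated-Gaussian prior can be sketched as follows; the SF/RFG weight ratio of 0.9 is an arbitrary illustrative value, and rejection sampling stands in for whatever sampler the analysis actually uses:

```python
import random

def sf_weight(sf_over_rfg, x_sf):
    """Linear interpolation of the per-event weight between the RFG model
    (x_sf = 0, weight 1) and the SF model (x_sf = 1, weight = SF/RFG ratio)."""
    return 1.0 + x_sf * (sf_over_rfg - 1.0)

def sample_x_sf(rng):
    """Draw x_SF from a standard Gaussian truncated to [0, 1]
    (simple rejection sampling)."""
    while True:
        x = rng.gauss(0.0, 1.0)
        if 0.0 <= x <= 1.0:
            return x

rng = random.Random(42)
draws = [sample_x_sf(rng) for _ in range(5)]
print(sf_weight(0.9, 0.0), sf_weight(0.9, 1.0))  # 1.0 at RFG, ~0.9 at SF
```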

E. Summary of cross section systematic parameters
All the cross section parameters, x, are summarized in Tab. VII, including the errors prior to the analysis of near detector data. We define x n to be the set of cross section systematic parameters that are constrained by ND280 data (category 1), to distinguish them from the remaining parameters x s (categories 2 and 3).

IV. NEAR DETECTORS
Precision neutrino oscillation measurements require good understanding of the neutrino beam properties and of neutrino interactions. The two previous sections describe how we model these aspects for the T2K experiment and how we use external data to reduce model uncertainty. However, if only external data were used, the resulting systematic uncertainty would limit the precision for oscillation analyses.
In order to reduce systematic uncertainty below the statistical uncertainty for the experiment, an underground hall was constructed 280 m downstream of the production target for near detectors to directly measure the neutrino beam properties and neutrino interactions.
The hall contains the on-axis INGRID detector, a set of modules with sufficient target mass and transverse extent to continuously monitor the interaction rate, beam direction, and profile, and the off-axis ND280 detector, a sophisticated set of sub-detectors that measure neutrino interaction products in detail.
This section describes the INGRID and ND280 detectors and the methods used to select high purity samples of neutrino interactions. The observed neutrino interaction rates and distributions are compared to the predictions using the beamline and interaction models, with nominal values for the systematic parameters. Section V describes how ND280 data are used to improve the systematic parameter estimates and compares the adjusted model predictions with the ND280 measurements.
A. INGRID

INGRID detector
The main purpose of INGRID is to monitor the neutrino beam rate, profile, and center.
In order to sufficiently cover the neutrino beam profile, INGRID is designed to sample the beam in a transverse section of 10 m × 10 m, with 14 identical modules arranged in two identical groups along the horizontal and vertical axes, as shown in Fig. 3. Each module consists of nine iron target plates and eleven tracking scintillator planes, each made of two layers of scintillator bars (X and Y layers). They are surrounded by veto scintillator planes to reject charged particles coming from outside of the modules. Scintillation light from each bar is collected and transported to a photo-detector with a wavelength-shifting fiber.

This analysis [83] significantly improves upon the original method established in 2010 [82].
The new track reconstruction algorithm has a higher track reconstruction efficiency and is less susceptible to MPPC dark noise. Event pileup, defined as more than one neutrino interaction occurring in a module in the same beam pulse, occurs in as many as

Corrections
Corrections for individual iron target masses and the background are applied in the same way as the previous INGRID analysis [82]. In addition, we apply corrections for dead channels and event pileup which can cause events to be lost. There are 18 dead channels out of 8360 channels in the 14 standard modules and the correction factor for the dead channels is estimated from a Monte Carlo simulation. The correction factor for the event pileup is estimated as a linear function of the beam intensity, since the event-pileup effect is proportional to the beam intensity. The slope of the linear function is estimated from the beam data by combining events to simulate event pileup [83]. The inefficiency due to pileup is less than 1% for all running periods.
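Assuming the loss fraction is linear in beam intensity as described, the pileup correction can be sketched as follows; all numbers are hypothetical, not the measured INGRID slope:

```python
def pileup_correction(rate_obs, intensity, slope):
    """Correct the observed event rate for pileup losses, with the loss
    fraction modeled as linear in beam intensity:
    rate_true = rate_obs / (1 - slope * intensity)."""
    loss = slope * intensity
    return rate_obs / (1.0 - loss)

# e.g. a hypothetical 0.5% loss at nominal (unit) intensity
corrected = pileup_correction(100.0, intensity=1.0, slope=0.005)
print(f"{corrected:.3f}")
```

In this form the slope is what the text describes as being estimated from data by overlaying beam events; since the inefficiency is below 1%, the correction is always small.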

Systematic error
Simulation and control samples are used to study potential sources of systematic error and to assign systematic uncertainties. The sources include target mass, MPPC dark noise and efficiency, event pileup, beam-induced and cosmic background, and those associated with the event selection criteria.
The total systematic error for the selection efficiency, calculated from the quadratic sum of all the systematic errors, is 0.91%. This is about a quarter of the 3.73% error from the previous analysis method [82]. The reduction of the systematic error results from the analysis being less sensitive to MPPC dark noise and event pileup, the improved track reconstruction efficiency, and more realistic evaluations of systematic errors that had been conservatively estimated in the previous analysis.

Figure 4 shows the daily rates of the neutrino events normalized by POT. When the horn current was reduced to 205 kA due to a power supply problem, the on-axis neutrino flux decreased because the forward focusing of the charged pions by the horns became weaker.

Results of the beam measurement
The event rate increased by 2% between Run 1 and Run 2 and decreased by 1% during Run 4. However, for all run periods with the horns operated at 250 kA, the neutrino event rate is stable within 2%, and the RMS/mean of the event rate is 0.7%.
A Monte Carlo (MC) simulation that implements the beamline and neutrino interaction models described earlier, along with the INGRID detector simulation, is used to predict the neutrino event rate with the horns operating at 250 kA and 205 kA. The ratios of observed to predicted event rates, using the nominal values for the beamline and neutrino interaction systematic parameters, are given for the 250 kA and 205 kA periods, respectively. The neutrino flux uncertainty arising from possible incorrect modeling of the beam direction is evaluated from this result. This uncertainty, when evaluated without ND280 data, is significantly reduced compared to the previous analysis, as shown in Fig. 6.
The horizontal and vertical beam width measurements are given by the standard deviations of Gaussians fit to the observed profiles. Figure 7 shows the history of the horizontal and vertical beam widths with the horns operating at 250 kA, which are found to be stable within the statistical errors. The ratios of observed to predicted widths, using nominal values for the systematic parameters, are given for the horizontal and vertical directions, respectively.

B. ND280

In designing the experiment, it was recognized that detailed measurements of neutrino interactions near the production target and along the direction to the far detector would be necessary to reduce uncertainty in the models of the neutrino beam and of neutrino interactions. To achieve this, the T2K collaboration chose to use a combination of highly segmented scintillator targets and gaseous trackers in a magnetic spectrometer. Segmented active targets allow the neutrino interaction to be localized and the trajectories of the charged particles to be reconstructed, and particles passing through the gaseous trackers have their charge, momentum, and particle type measured. The targets and gaseous trackers are surrounded by a calorimeter to detect photons and assist in particle identification. The refurbished UA1/NOMAD magnet was acquired, and its rectangular inner volume led to a design with rectangular sub-detectors. Spaces within the yoke allowed for the installation of outer muon detectors.
The following sections describe the ND280 detector, its simulation, and the analyses used as input for the T2K oscillation analyses.

ND280 detector
The ND280 detector is illustrated in Fig. 8, where the coordinate convention is also indicated. The x and z axes are in the horizontal plane and the y axis is vertical. The origin is at the center of the magnet and the 0.2 T magnetic field is along the +x direction. The z axis is the direction to the far detector projected onto the horizontal plane.
The analyses presented in this paper use neutrino interactions within the ND280 tracker, composed of two fine-grained scintillator bar detectors (FGDs [84]), used as the neutrino interaction target, sandwiched between three gaseous time projection chambers (TPCs [85]), which provide reconstruction of charged particles that pass through them. The curvature due to the magnetic field provides measurements of particle momenta and charges and, when combined with ionization measurements, allows for particle identification (PID).
Data quality is assessed weekly. Over the entire running period, the ND280 data taking efficiency is 98.5%. For the analyses presented here, only data recorded with all detectors having good status are used, giving an overall efficiency of 91.5%.

ND280 simulation
A detailed simulation is used to interpret the data recorded by ND280. The neutrino flux model described in Sec. II A is combined with the NEUT neutrino interaction model described in Sec. III A and a detailed material and geometrical description of the ND280 detector including the magnet, to produce a simulated sample of neutrino interactions distributed throughout the ND280 detector with the beam time structure. For studies of particles originating outside of the ND280 detector, separate samples are produced using a description of the concrete that forms the near detector hall and the surrounding sand.
The passage of particles through materials and the ND280 detector response are modeled using the GEANT4 toolkit [89]. To simulate the scintillator detectors, including the FGDs, we use custom models of the scintillator photon yield, photon propagation including reflections and attenuation, and electronics response and noise [90]. The gaseous TPC detector simulation includes the gas ionization, transverse and longitudinal diffusion of the electrons, transport of the electrons to the readout plane through the magnetic and electric field, gas amplification, and a parametrization of the electronics response.
Imperfections in the detector response simulation can cause the model to match the detector performance poorly, potentially generating a systematic bias in parameter estimates.
After describing the methods to select neutrino interactions in the following section, we quantify the systematic uncertainty due to such effects with data/simulation comparisons in Sec. IV B 4.

ND280 ν µ Tracker analysis
We select an inclusive sample of νµ CC interactions in the ND280 detector in order to constrain parameters in our flux and cross section model. Our earlier oscillation analyses divided the inclusive sample into two: CCQE-like and the remainder. New to this analysis is the division of the inclusive sample into three sub-samples, defined by the number of final state pions: zero (CC0π-like), one positive pion (CC1π+-like), and any other combination of number and charge (CCOther-like). This division enhances the ability to constrain the CCQE and resonant single pion cross section parameters, which, in turn, decreases the uncertainty they contribute to the oscillation analyses.
The CC-inclusive selection uses the highest momentum negatively charged particle in an event as the µ− candidate, which is required to start inside the FGD1 fiducial volume (FV) and enter the middle TPC (TPC2). The FV begins 58 mm inward from the boundaries of the FGD1 active volume in x and y and 21 mm inward from the upstream boundary of the FGD1 active volume in z, thereby excluding the first two upstream layers. The TPC requirement has the consequence of producing a sample with predominantly forward-going µ−. Additional requirements are included to reduce background in which the start of the µ− candidate is incorrectly assigned inside the FGD1 FV, due to a failure to correctly reconstruct a particle passing through the FGD1 (through-going veto). The µ− candidate is required to be consistent with a muon (muon PID requirement) based on a truncated mean of measurements of energy loss in the TPC gas [85]. A similar PID has been developed for the FGD; it is not used for the muon selection, but is used in secondary particle identification [84]. In the simulation we find that the CC-inclusive sample is composed of 90.7% true νµ CC interactions within the FGD fiducial volume, and 89.8% of the muon candidates are muons (the rest are mainly mis-identified negative pions). Table VIII shows the number of events selected.

ND280 detector systematics
In this section we explain how we use control samples to assess uncertainty in the modeling of FGD and TPC response and of neutrino interactions outside of the fiducial volume of the FGD.
TPC systematic uncertainties are divided into three classes: selection efficiency, momentum resolution, and PID. The efficiency systematic uncertainty arises in the modeling of the ionization, cluster finding (where a cluster is defined as a set of contiguous pads in a row or column with charge above threshold), track finding, and charge assignment. This is assessed by looking for missed track components in control samples with particles that pass through the tracker. The charge mis-identification uncertainty is evaluated by comparing data and simulation of the charge mis-identification probability as a function of momentum. This probability is found to be less than 1% for momenta less than 5 GeV/c. The momentum resolution is studied using particles crossing at least one FGD and two TPCs by evaluating the effect on the reconstructed momenta when the information from one of the TPCs is removed from the analysis. The inverse momentum resolution is found to be better in simulation than in data, typically by 30%, and this difference is not fully understood. A scaling of the difference between true and reconstructed inverse momentum is applied to the simulated data to account for this. Uncertainty in the overall magnetic field strength leads to an uncertainty on the momentum scale of 0.6%, which is confirmed using the range of cosmic ray particles that stop in the FGD.
The TPC measurement of energy loss for PID is evaluated by studying high-purity control samples of electrons, muons and protons. The muon control sample has the highest statistics and is composed of particles from neutrino interactions outside the ND280 detector that pass through the entire tracker. For muons with momenta below 1 GeV/c, the agreement between data and simulation is good, while above 1 GeV/c the resolution is better in simulation than in data. Correction factors are applied to the simulation to take into account this effect.
The performance for track finding in the FGD is studied using a sample of stopping protons going from TPC1 to FGD1. This efficiency is found to be slightly better for data than simulation when cos θµ < 0.9. A correction is applied to the simulation to account for this, and the correction uncertainty is included in the overall detector uncertainty.
The FGD PID performance is evaluated by comparing the energy deposited along the track with the expected energy deposit for a given particle type and reconstructed range in the FGD. We use control samples of muons and protons tagged by TPC1 and stopping in FGD1. The pull distributions (residual divided by standard error) for specific particle hypotheses (proton, muon or pion) for data and simulation are fitted with Gaussian distributions. To account for the differences in the means and widths of the distributions between data and simulation, corrections are applied to simulation and the correction uncertainty is included in the overall detector uncertainty. The Michel electron tagging efficiency is studied using a sample of cosmic rays that stop in FGD1 for which the delayed electron is detected. The Michel electron tagging efficiency is found to be (61.1 ± 1.9)% for simulation and (58.6 ± 0.4)% for data. A correction is applied to simulation and the correction uncertainty is included in the overall detector uncertainty.
The uncertainty on the mass of the FGD, computed using the uncertainties in the size and density of the individual components, is 0.67% [84].
There is systematic uncertainty in the modeling of the interactions of pions traveling through the FGD. This is evaluated from differences between external pion interaction data [40][41][42][43][44][45][46][47][48][49][50][51] and the underlying GEANT4 simulation. The external data do not cover the whole momentum range of T2K, so some extrapolation is necessary. Incorrect modeling can migrate events between the three sub-samples, and for some ranges of momentum this produces the largest detector systematic uncertainty.
An out-of-fiducial volume (OOFV) systematic is calculated by studying nine different categories of events that contribute to this background. Examples of these categories are: a high energy neutron that creates a π− inside the FGD that is mis-identified as a muon, a backwards-going π+ from the barrel-ECal that is mis-reconstructed as a forward-going muon, and a through-going muon passing completely through the FGD for which the TPC-FGD matching failed in such a way that mimics a FV event. Each of these categories is assigned a rate uncertainty (of 0 or 20%) and a reconstruction-related uncertainty. The reconstruction-related uncertainty is below 40% for all categories but one: we assign a reconstruction-related uncertainty of 150% to the high-angle tracks category, in which matching sometimes fails to include some hits that are outside the FGD FV.
An analysis of the events originating from neutrino interactions outside the ND280 detector (pit walls and surrounding sand) is performed using a dedicated simulation (sand muon simulation). The data/simulation discrepancy is about 10% and is included as a systematic uncertainty on the predicted number of sand muon events in the CC-inclusive sample.
Pileup corrections are applied to account for the inefficiency due to sand muons crossing the tracker volume in coincidence with a FV event. The correction is evaluated for each dataset separately and is always below 1.3%; the systematic uncertainty arising from this correction is always below 0.16%.

Table XI shows the full list of base detector systematic effects considered and the way each one is treated within the simulated samples to propagate the uncertainty. Normalization systematics are treated by a single weight applied to all events. Efficiency systematics are treated by applying a weight that depends on one or more observables. Finally, several systematics are treated by adjusting the observables and re-applying the selection.
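The three treatments can be illustrated with a short numerical sketch. The event sample, efficiency correction, and selection cut below are toy assumptions, not T2K values; only the 0.67% FGD mass and 0.6% momentum-scale figures quoted earlier are taken from the text:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy simulated events: muon candidate momentum in MeV/c (hypothetical).
p_mu = rng.uniform(200.0, 3000.0, size=10000)
weights = np.ones_like(p_mu)

# 1) Normalization systematic: one global weight for all events
#    (e.g. the 0.67% FGD mass uncertainty, shifted by +1 sigma).
weights *= 1.0067

# 2) Efficiency systematic: a weight depending on an observable
#    (a hypothetical momentum-dependent efficiency correction).
weights *= np.where(p_mu < 1000.0, 1.01, 0.99)

# 3) Observable-variation systematic: adjust the observable and
#    re-apply the selection (toy momentum cut at 250 MeV/c,
#    with the 0.6% momentum-scale shift applied).
p_scale = 1.006
selected = (p_mu * p_scale) > 250.0

n_pred = weights[selected].sum()
print(f"predicted events after systematic shifts: {n_pred:.1f}")
```

Repeating this for many parameter throws yields the spread of the predicted event counts, which is how such effects propagate into the covariance used by the fits.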

V. NEAR DETECTOR ANALYSIS
In this section we explain how we use the large and detailed samples from ND280 in conjunction with models for the beam, neutrino interactions, and the ND280 detector to improve our predictions of the flux at SK and some cross section parameters. The systematic parameters for the beam model (b), binned in energy as shown in Fig. 1, the cross section model (x), listed in Tab. VII, and the detector model (d), illustrated in Fig. 13, are used to describe the systematic uncertainties in the analysis. We use the three νµ CC samples described in Sec. IV B and external data discussed in Sec. III B and summarize our knowledge of the neutrino cross section parameters and unoscillated neutrino flux parameters with a covariance matrix, assuming that a multivariate Gaussian is an appropriate description.

A. ND280 Likelihood
The three νµ CC samples are binned in the kinematic variables pµ and cos θµ, as shown in Fig. 13, and the observed and predicted numbers of events in the bins are used to define the likelihood:
$$\ln L = \sum_i \ln P\!\left(N^d_i \,\middle|\, N^p_i(b,x,d)\right) = \sum_i \left[ N^d_i \ln N^p_i(b,x,d) - N^p_i(b,x,d) \right] + c,$$
where $N^p_i$ is the number of unoscillated MC predicted events and $N^d_i$ is the number of data events in the $i$-th bin of the CC samples, the second equality assumes the Poisson distribution, and $c$ is a constant. The number of MC predicted events, $N^p_i(b,x,d)$, is a function of the underlying beam flux $b$, cross section $x$, and detector $d$ parameters, and these parameters are constrained by external data as described in the previous sections. We model these constraints as multivariate Gaussian likelihood functions and use the product of the above-defined likelihood and the constraining likelihood functions as the total likelihood for the near detector analysis. This total likelihood is maximized to estimate the systematic parameters and evaluate their covariance. In practice, the quantity $-2\ln L_{\rm total}$ is minimized. Explicitly, this quantity is:
$$-2\ln L_{\rm total} = 2\sum_i \left[ N^p_i(b,x,d) - N^d_i + N^d_i \ln\frac{N^d_i}{N^p_i(b,x,d)} \right] + (b-b_0)^T V_b^{-1} (b-b_0) + (x-x_0)^T V_x^{-1} (x-x_0) + (d-d_0)^T V_d^{-1} (d-d_0),$$
where $b_0$, $x_0$, and $d_0$ are the nominal values (best estimates prior to the ND280 analysis) and $V_b$, $V_x$, and $V_d$ are the covariance matrices of the beam, cross section, and detector systematic parameters.
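As a numerical sketch, $-2\ln L_{\rm total}$ is a binned Poisson term plus multivariate-Gaussian penalty terms for the constrained parameters. The bin contents and parameter covariance below are toy values:

```python
import numpy as np

def neg2_log_likelihood(n_pred, n_data, params, params_nominal, cov):
    """-2 ln L_total: binned Poisson term plus a Gaussian penalty
    term for the constrained systematic parameters."""
    # Poisson part: 2 * sum_i [ N^p_i - N^d_i + N^d_i ln(N^d_i / N^p_i) ]
    log_ratio = np.where(n_data > 0, np.log(n_data / n_pred), 0.0)
    poisson = 2.0 * np.sum(n_pred - n_data + n_data * log_ratio)
    # Gaussian constraint: (p - p0)^T V^{-1} (p - p0)
    d = params - params_nominal
    penalty = d @ np.linalg.inv(cov) @ d
    return poisson + penalty

# Toy inputs: three bins and two constrained parameters.
n_data = np.array([25.0, 40.0, 12.0])
n_pred = np.array([24.0, 42.0, 10.0])
p, p0 = np.array([1.02, 0.97]), np.array([1.0, 1.0])
cov = np.diag([0.05**2, 0.08**2])
print(neg2_log_likelihood(n_pred, n_data, p, p0, cov))
```

By construction the quantity vanishes when the prediction matches the data exactly and all parameters sit at their nominal values, and it grows as either term is pulled away.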

B. Fitting methods
A reference Monte Carlo sample of ND280 events is generated using the models described in the previous sections. For the detector parameters, the reconstructed momentum and angle of the muon candidate are used. For cross section scaling parameters (e.g., $x^{QE}_1$), weights are applied according to the true interaction mode and true energy. For other cross section parameters (e.g., $M_A^{QE}$), including the FSI parameters, the ratio of the adjusted cross section to the nominal cross section (calculated as a function of the true energy, interaction type, and lepton kinematics) is used to weight the event. The FSI parameters are constrained by a covariance matrix constructed by using representative points on the 1-σ surface for the parameters in Table IV. The fit is performed by minimizing $-2\ln L_{\rm total}$ using MINUIT [91]. Parameters not of interest to the oscillation analyses (e.g. ND280 detector systematic uncertainties) are treated as nuisance parameters. Figure 17 shows the values of the νµ flux and cross section parameters that are constrained by the near detector analysis for the oscillation analyses; Table XII lists the flux parameters and Table XIII lists the values of the cross section parameters. The uncertainties of the flux and cross section parameters constrained by ND280 are 7-10%. The estimators of these flux and cross section parameters have a strong negative correlation, however, because they use the rate measurements in the near detector. As a result, their contribution to the SK event rate uncertainty is less than 3%, significantly smaller than the individual flux and cross section parameter uncertainties.
A cross-check to this analysis is performed by studying a selection of electron neutrino interactions in ND280 [92], which finds that the ratio of the selected electron neutrino event rate to that predicted by MC using the best-fit parameter values from this analysis is R(νe) = 1.01 ± 0.10.

The energy of the incoming neutrino can be calculated assuming the kinematics of a CCQE interaction and neglecting Fermi motion:
$$E^{\rm rec}_\nu = \frac{m_p^2 - (m_n - E_b)^2 - m_\mu^2 + 2(m_n - E_b)E_\mu}{2\left(m_n - E_b - E_\mu + p_\mu\cos\theta_\mu\right)},$$
where $E^{\rm rec}_\nu$ is the reconstructed neutrino energy, $m_p$, $m_n$, and $m_\mu$ are the proton, neutron, and muon masses, $E_b$ is the binding energy, and $E_\mu$, $p_\mu$, and $\theta_\mu$ are the energy, momentum, and angle of the muon.

We select νµ CC candidate events using the selection criteria shown in Tab. XV. The momentum cut rejects charged pions and misidentified electrons from the decay of unobserved muons and pions. We require fewer than two Michel electrons to reject events with additional unseen muons or pions. After all cuts are applied, 120 events remain in the νµ CC candidate sample. Figure 20 shows the candidate event spectra for the appearance (νe) and disappearance (νµ) channels. We monitor the vertex distributions of the candidate event samples for signs of bias that might suggest background contamination. Figure 21 shows the vertex distribution of the νe CC candidate events in the SK tank coordinate system. We observe no unexpected clustering, and combined KS tests for uniformity in r² and z yield a p-value of 0.6.
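A minimal sketch of this reconstruction follows, using standard particle masses; the binding energy value is illustrative only, since the value used by the analysis is not quoted here:

```python
import math

# Masses in MeV/c^2; E_B is an assumed binding energy (illustrative).
M_P, M_N, M_MU, E_B = 938.272, 939.565, 105.658, 27.0

def e_rec(p_mu, cos_theta):
    """Reconstructed neutrino energy (MeV) from CCQE kinematics,
    neglecting Fermi motion; p_mu is the muon momentum in MeV/c."""
    e_mu = math.sqrt(p_mu**2 + M_MU**2)
    num = M_P**2 - (M_N - E_B)**2 - M_MU**2 + 2.0 * (M_N - E_B) * e_mu
    den = 2.0 * (M_N - E_B - e_mu + p_mu * cos_theta)
    return num / den

# A forward-going muon of 600 MeV/c gives a neutrino energy near the
# T2K oscillation maximum (~0.6 GeV).
print(f"{e_rec(600.0, 0.95):.0f} MeV")
```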

B. π 0 Rejection with the New Event Reconstruction Algorithm
As mentioned in the previous section, in order to select ν e CC events, we require that only one electron-like ring is reconstructed. The ν e CC selection criteria 1-5 in Tab. XIV are based on the information provided by SK event reconstruction software which has been used at SK for atmospheric neutrino and nucleon decay analyses [1] and, as shown in the table, we reject most of the background events by these selection cuts. The ν e appearance signal purity is 59.3% and the selection efficiency for the signal is 71.8%. The remaining backgrounds are predominantly NC single π 0 events, as one of the two decay γs from a π 0 is occasionally missed and the other γ forms an electron-like ring.
In order to reject such π0 events, we employ a new event reconstruction algorithm which is based on the methods developed by MiniBooNE [96]. The new algorithm adopts a maximum likelihood method to reconstruct particle kinematics in the SK detector. For a given event, we construct a likelihood function which uses the observed charge and time information from the PMTs:
$$L(\mathbf{x}) = \prod_{j}^{\rm unhit} P_j({\rm unhit}\,|\,\mathbf{x}) \prod_{i}^{\rm hit} \left[1 - P_i({\rm unhit}\,|\,\mathbf{x})\right] f_q(q_i|\mathbf{x})\, f_t(t_i|\mathbf{x}).$$
In the equation, $\mathbf{x}$ represents particle track parameters such as the vertex, direction, and momentum which are to be estimated. The first index $j$ runs over the PMTs which do not register a hit, and for each such PMT the conditional probability $P_j({\rm unhit}|\mathbf{x})$ of not registering a hit given $\mathbf{x}$ is evaluated. For each PMT which does register a hit, in addition to the hit probability, we calculate the probability density $f_q(q_i|\mathbf{x})$ of observing charge $q_i$ as well as the probability density $f_t(t_i|\mathbf{x})$ of the hit occurring at time $t_i$. The estimated track parameters, $\mathbf{x}$, are those that maximize the likelihood function. For every event we construct and maximize the likelihood assuming several different particle hypotheses, and particle identification is done using ratios of the maximum likelihoods for the different hypotheses.
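A toy version of this likelihood is sketched below, with Gaussian stand-ins for the charge and time probability densities; in the real algorithm these densities are predicted from the detector model for each track hypothesis:

```python
import numpy as np

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def log_likelihood(hit, q, t, p_unhit, q_mu, t_mu):
    """Toy PMT likelihood for one track hypothesis: unhit PMTs contribute
    P(unhit|x); hit PMTs contribute [1 - P(unhit|x)] f_q(q|x) f_t(t|x)."""
    ll = np.log(p_unhit[~hit]).sum()
    ll += np.log(1.0 - p_unhit[hit]).sum()
    ll += np.log(gauss(q[hit], q_mu, 1.0)).sum()  # charge pdf (toy Gaussian)
    ll += np.log(gauss(t[hit], t_mu, 2.0)).sum()  # time pdf (toy Gaussian)
    return ll

# Four toy PMTs: hit flags, observed charges (p.e.) and times (ns).
hit = np.array([True, True, False, True])
q = np.array([3.1, 2.8, 0.0, 3.3])
t = np.array([10.2, 9.8, 0.0, 10.5])
p_unhit = np.array([0.2, 0.3, 0.9, 0.25])

# Two hypotheses predicting different mean charge and time:
ll_a = log_likelihood(hit, q, t, p_unhit, q_mu=3.0, t_mu=10.0)
ll_b = log_likelihood(hit, q, t, p_unhit, q_mu=1.0, t_mu=15.0)
print(ll_a > ll_b)  # hypothesis A matches the observed hits better
```

Particle identification then compares the maximized likelihoods of the competing hypotheses, exactly as the likelihood-ratio comparison above.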
In this analysis, we use a single electron hypothesis and a π0 hypothesis for π0 rejection. The single electron hypothesis has seven parameters which are the initial vertex position, time, direction, and momentum. Since a π0 decays into two γs and produces two electron-like Cherenkov rings, the π0 hypothesis is constructed by combining the charge and time contributions from two electron tracks which point back to a common vertex. In addition to the common vertex and the directions and momenta of the two γ tracks, each track has an additional free parameter which shifts its origin along its direction in order to account for photon conversion points. The π0 hypothesis therefore has twelve parameters.
In order to distinguish signal ν e CC events from π 0 background events, we use the maximum likelihood values of the electron hypothesis L e and the π 0 hypothesis L π 0 as well as the reconstructed invariant mass m γγ obtained from the π 0 hypothesis. Figure 22 shows the 2D distributions of the logarithm of the likelihood ratio ln(L π 0 /L e ) vs. m γγ for signal ν e CCQE and background NC π 0 events which satisfy the ν e selection criteria 1-5, produced by MC.
We see a clear separation between the two event types, and we accept an event as a νe CC candidate if it satisfies ln(Lπ0/Le) < 175 − 0.875 × mγγ [MeV/c²], which is indicated by the diagonal line in the plots. As shown in Tab. XIV, the remaining NC background is reduced by roughly a factor of nine by introducing the π0 rejection cut, which further improves the purity of the sample.

In earlier published T2K νe appearance analysis results [3,97], we used a π0 rejection method which is different from what is described above [98]. To demonstrate the improvement over the previous method, Fig. 23 shows the efficiency for rejecting NC π0 events for the two methods, plotted as a function of the energy of the less energetic γ. In calculating the efficiencies, only the events which satisfy the νe selection criteria 1-5 are included. As the figure indicates, the rejection efficiency of the new method remains high even in cases where the energy of one of the two γs is low. By employing the new method, we have reduced the π0 background remaining in the final νe CC candidate event sample by 69% relative to the previous method.
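The cut itself is straightforward to express; the two example events below use illustrative values on either side of the diagonal line:

```python
def is_nue_candidate(ln_lr, m_gg):
    """pi0 rejection cut from the text: accept an event as a nu_e CC
    candidate if ln(L_pi0 / L_e) < 175 - 0.875 * m_gg (m_gg in MeV/c^2)."""
    return ln_lr < 175.0 - 0.875 * m_gg

# Electron-like event: small likelihood ratio, low reconstructed mass.
print(is_nue_candidate(ln_lr=20.0, m_gg=60.0))    # accepted (True)
# pi0-like event: likelihood favors pi0 and m_gg near the pi0 mass.
print(is_nue_candidate(ln_lr=150.0, m_gg=135.0))  # rejected (False)
```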

C. Systematic uncertainty
This section describes the studies and treatment of uncertainty in modeling the SK detector that lead to systematic uncertainty in estimating the selection efficiency and background for the oscillation samples. We use SKDETSIM [3,9], a GEANT3-derived simulation of the SK detector, to model the propagation of particles produced by neutrino interactions. The GCALOR physics package is used to simulate hadronic interactions in water owing to its ability to reproduce pion interaction data around 1 GeV/c. However, for pions with momentum below 500 MeV/c, custom routines are employed based on the cascade model used by NEUT to simulate interactions of final state hadrons. SKDETSIM incorporates the propagation of photons in water, subject to absorption, Rayleigh scattering, and Mie scattering.
The simulation of these processes is tuned using laser calibration sources in situ [93].
Control samples that are not related to the T2K beam spills are used to assess systematic uncertainty, including muons and neutrinos produced from cosmic ray interactions with the atmosphere (cosmic ray muons and atmospheric neutrinos) and combinations of simulated and cosmic ray data (hybrid-π 0 sample). As described below, cosmic ray muons are used to evaluate the systematic uncertainty due to the fully-contained (FC), fiducial-volume, and decay-electron requirements. Atmospheric neutrinos are used to assess uncertainty from the ring counting, particle identification, and π 0 rejection. The hybrid-π 0 sample is used to study the SK response to π 0 's. The uncertainties due to energy scale, modeling of pion final state interactions (FSI) and secondary interactions (SI) are evaluated separately.
Cosmic ray muon samples are used to estimate uncertainties related to the FC, fiducialvolume and decay-electron requirements, for the selections of both ν e and ν µ CC candidates.
The error from the initial FC event selection is 1% and is dominated by the event-by-event flasher rejection cut. The uncertainty in the fiducial volume is estimated to be 1% using the vertex distribution of cosmic ray muons which have been independently determined to have stopped inside the ID. The uncertainty due to the Michel electron tagging efficiency is estimated by comparing cosmic ray stopped-muon data and MC. This uncertainty is applied based on the fraction of events with true Michel electrons in the T2K beam MC. The rate of falsely identified Michel electrons is estimated from MC and 100% uncertainty in that rate is assumed. Overall, the event rate uncertainty related to the decay-electron requirements is small. For the ν e CC candidate sample, it is 0.2% for ν e CC events and 0.4% for ν µ CC and NC events. For the ν µ CC candidate sample it is 1.0%.
Other studies of systematic uncertainty in SK modeling divide simulated events into categories according to their final state (FS) topologies, with the criteria shown in Tab. XVI.
These topologies do not correspond exactly with true interaction modes due to subsequent interactions within the nucleus or with neighboring nuclei or because one or more particles are produced below Cherenkov threshold.
Atmospheric neutrino data are used to assess possible mismodeling of the ring counting (RC), particle identification, and π0 rejection for the first four FS topologies shown in Tab. XVI. The criteria defining the topologies are:

CC 1µ: νµ CC and Nπ0 = 0 and Nπ± = 0 and NP = 0
CC µ other: νµ CC and Nπ0 = 0
CC µ π0 other: νµ CC and Nπ0 > 0
NC 1π0: NC and not NC 1γ and Nπ0 = 1 and Nπ± = 0 and NP = 0
NC π0 other: NC and not NC 1γ and Nπ0 ≥ 1 and not NC 1π0
NC 1γ: NEUT truth
NC 1π±: NC and not NC 1γ and Nπ0 = 0 and Nπ± = 1 and NP = 0
NC other: NC and not NC 1γ and not NC 1π0 and not NC 1π± and not NC π0 other

In order to adjust the modeling of ring counting, particle identification, and π0 rejection, a set of parameters is defined to alter the cut values for these three classifiers. Separate parameters are used for the first four FS topologies in Tab. XVI and for each visible energy bin within those topologies. By adjusting these parameters, simulated events, generated according to models of the atmospheric neutrino flux, migrate between the branches in Tab. XVII, thus changing the efficiency for true CC 1e and CC e other (CC 1µ and CC µ other) events in the νe (νµ) core samples. Using the observed numbers of core and tail data events in each sample, the additional uncertainty is found to be no more than 1%. This error is added in quadrature to the uncertainty for the CC 1e FS topology described above to give the total uncertainty on the NC 1γ background in the νe CC candidate sample.
Muon decay-in-flight events make up a small background in the ν e CC candidate sample.
Such events can be misidentified because the decay electron is boosted in the direction of the parent muon and thus their Cherenkov rings can overlap. MC studies indicate that such events make up 19% of the background from νµ interactions, and their rate is assigned a 16% selection uncertainty. The remaining background from νµ interactions is assigned a conservative 150% error. A conservative 100% uncorrelated error is assigned to the NC 1π± and NC other categories.
For the ν µ CC candidate sample, the dominant NC backgrounds are NC 1π ± and events with just a single proton (NC other). The relative uncertainty in this background, due to systematic uncertainties in ring counting and particle identification, is found to be 59%. The background from ν e interactions in the ν µ CC candidate sample is assigned a conservative 100% error.
All aspects of SK detector simulation that can affect the modeling of the SK candidate event samples are included in the systematic uncertainty evaluation.

Systematic uncertainties in pion interactions in the target nucleus (FSI uncertainties) and in the SK detector (SI uncertainties) are evaluated by varying pion interaction probabilities in the NEUT cascade model. In the NEUT sample we store the information necessary to recompute the pion cascade using modified interaction probabilities to weight each event.
Altered CC sample distributions are produced using 16 representative points $x^{\rm FSI}_k$ on the 1-σ surface for the parameters. The covariance matrix $V$, which describes the variations in the number of events in the binned observables ($N_i$) due to the variation in $x^{\rm FSI}$, is given by
$$V_{ij} = \frac{1}{16}\sum_{k=1}^{16}\left[N_i(x^{\rm FSI}_k) - N_i^{\rm nom}\right]\left[N_j(x^{\rm FSI}_k) - N_j^{\rm nom}\right],$$
where $N_i^{\rm nom}$ is the nominal number of events in bin $i$. The binning of this matrix is chosen to match that of the detector error covariance matrix shown in Table XVIII. A simulation of photo-nuclear (PN) interactions is incorporated into the SK MC.
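The construction of $V$ from the representative points can be sketched as follows, with toy bin contents and random offsets standing in for the altered predictions:

```python
import numpy as np

rng = np.random.default_rng(0)

n_bins, n_points = 5, 16
n_nominal = np.array([100.0, 80.0, 60.0, 40.0, 20.0])
# Hypothetical altered predictions at the 16 representative 1-sigma points.
n_altered = n_nominal + rng.normal(0.0, 3.0, size=(n_points, n_bins))

# V_ij = (1/16) sum_k (N_i(x_k) - N_i^nom) (N_j(x_k) - N_j^nom)
diff = n_altered - n_nominal
cov = diff.T @ diff / n_points

print(cov.shape)
```

The resulting matrix is symmetric with non-negative diagonal entries by construction, and can be added to the detector error covariance since the binning matches.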

VII. OSCILLATION MODEL AND PARAMETER ESTIMATION
The previous sections have described the T2K experiment, the models of all elements of the experiment and of the neutrino interactions that are necessary to interpret our data, and how we use internal and external data to improve those models. In this section, we turn our attention to general aspects of estimating neutrino oscillation parameters from our data. The oscillation model is given along with the predictions for the probability of muon neutrino disappearance and electron neutrino appearance, the key observables for our experiment. We explain how we use external data for some of the oscillation parameters and the general approaches we use to estimate the remaining parameters. Finally, we characterize the importance of the different sources of systematic uncertainty. Sections VIII-X describe the individual analyses and their results in detail.

A. Oscillation model
The Pontecorvo-Maki-Nakagawa-Sakata (PMNS) matrix, $U$, defines the mixture of the mass eigenstates (ν1, ν2, and ν3) that make up each flavor state:
$$|\nu_\alpha\rangle = \sum_{i=1}^{3} U^*_{\alpha i}\, |\nu_i\rangle, \quad \alpha = e, \mu, \tau,$$
and it has become standard to parametrize this matrix, ignoring the Majorana phases, as:
$$U = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_{23} & s_{23} \\ 0 & -s_{23} & c_{23} \end{pmatrix} \begin{pmatrix} c_{13} & 0 & s_{13}e^{-i\delta} \\ 0 & 1 & 0 \\ -s_{13}e^{i\delta} & 0 & c_{13} \end{pmatrix} \begin{pmatrix} c_{12} & s_{12} & 0 \\ -s_{12} & c_{12} & 0 \\ 0 & 0 & 1 \end{pmatrix},$$
where $s_{ij} = \sin\theta_{ij}$, $c_{ij} = \cos\theta_{ij}$, and $\delta = \delta_{CP}$ is the CP-violating phase.
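The parametrization can be checked numerically; the mixing-angle values below are taken from the results quoted in this paper, and unitarity of the product follows by construction:

```python
import numpy as np

def pmns(theta12, theta13, theta23, delta_cp):
    """Standard PMNS parametrization (Majorana phases ignored)."""
    s12, c12 = np.sin(theta12), np.cos(theta12)
    s13, c13 = np.sin(theta13), np.cos(theta13)
    s23, c23 = np.sin(theta23), np.cos(theta23)
    e = np.exp(-1j * delta_cp)  # e^{-i delta}
    u23 = np.array([[1, 0, 0], [0, c23, s23], [0, -s23, c23]], complex)
    u13 = np.array([[c13, 0, s13 * e], [0, 1, 0],
                    [-s13 * np.conj(e), 0, c13]], complex)
    u12 = np.array([[c12, s12, 0], [-s12, c12, 0], [0, 0, 1]], complex)
    return u23 @ u13 @ u12

# sin^2 values from this paper: theta12 = 0.306, theta13 = 0.0243,
# theta23 = 0.514; delta_CP set to zero for illustration.
U = pmns(np.arcsin(np.sqrt(0.306)), np.arcsin(np.sqrt(0.0243)),
         np.arcsin(np.sqrt(0.514)), 0.0)
print(np.allclose(U @ U.conj().T, np.eye(3)))  # unitarity check: True
```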
The νµ-survival probability for a neutrino with energy $E$ traveling a distance $L$ is:
$$P(\nu_\mu \to \nu_\mu) = 1 - 4\sum_{i>j} |U_{\mu i}|^2 |U_{\mu j}|^2 \sin^2\phi_{ij}, \quad (16)$$
where
$$\phi_{ij} = \frac{\Delta m^2_{ij} L}{4E} \quad (17)$$
in natural units and $\Delta m^2_{ij} = m^2_i - m^2_j$ is the difference in the squares of the masses of the eigenstates. The νe-appearance probability, to first order approximation in matter effects, can be written as:
$$\begin{aligned} P(\nu_\mu \to \nu_e) \simeq\ & 4 c_{13}^2 s_{13}^2 s_{23}^2 \sin^2\phi_{31} \\ &+ 8 c_{13}^2 s_{12} s_{13} s_{23} \left(c_{12} c_{23} \cos\delta - s_{12} s_{13} s_{23}\right) \cos\phi_{32} \sin\phi_{31} \sin\phi_{21} \\ &- 8 c_{13}^2 c_{12} c_{23} s_{12} s_{13} s_{23} \sin\delta \sin\phi_{32} \sin\phi_{31} \sin\phi_{21} \\ &+ 4 s_{12}^2 c_{13}^2 \left(c_{12}^2 c_{23}^2 + s_{12}^2 s_{23}^2 s_{13}^2 - 2 c_{12} c_{23} s_{12} s_{23} s_{13} \cos\delta\right) \sin^2\phi_{21} \\ &- 8 c_{13}^2 s_{13}^2 s_{23}^2 \frac{aL}{4E} \left(1 - 2 s_{13}^2\right) \cos\phi_{32} \sin\phi_{31} \\ &+ 8 c_{13}^2 s_{13}^2 s_{23}^2 \frac{a}{\Delta m^2_{31}} \left(1 - 2 s_{13}^2\right) \sin^2\phi_{31}, \end{aligned}$$
where $a = 2\sqrt{2}\, G_F n_e E$ accounts for matter effects, with $G_F$ the Fermi constant and $n_e$ the electron density.

Since the neutrino mass hierarchy (MH) is not yet known, we parametrize the large mass splitting by $|\Delta m^2| = \Delta m^2_{32}$ for normal hierarchy (NH, where $m_3$ is the largest mass) and $|\Delta m^2| = \Delta m^2_{13}$ for inverted hierarchy (IH, where $m_3$ is the smallest mass). It is not possible to estimate all of the oscillation parameters using only our measurements of νµ-disappearance and νe-appearance. Instead, we estimate the four oscillation parameters $|\Delta m^2|$, $\sin^2\theta_{23}$, $\sin^2\theta_{13}$, and $\delta_{CP}$, together with the mass hierarchy, and use external measurements for the solar oscillation parameters, $\sin^2\theta_{12}$ and $\Delta m^2_{21}$, as we have negligible sensitivity to those. Figure 25 illustrates how our key observables depend on the two parameters, $\sin^2\theta_{23}$ and $\delta_{CP}$, for the two mass hierarchies. In this figure the neutrino energy is at the oscillation maximum (0.6 GeV), and the other oscillation parameters are fixed (solar parameters as established in Sec. VII B and $\sin^2\theta_{13} = 0.0243$). To a good approximation, with our current dataset, νµ-disappearance can be treated on its own to estimate $\theta_{23}$. The oscillation parameter dependence of νe-appearance cannot be factorized, however. In order to estimate the full set of oscillation parameters and properly account for all uncertainties, it is necessary to do a joint analysis of νµ-disappearance and νe-appearance.
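For intuition, a two-flavor simplification of the survival probability (a common approximation, not the full expression used in the analysis) shows why the T2K beam energy sits near the oscillation maximum:

```python
import numpy as np

def survival_two_flavor(e_gev, l_km, dm2_ev2, sin2_2theta):
    """Two-flavor approximation of nu_mu survival:
    P = 1 - sin^2(2 theta) sin^2(1.267 dm2 L / E),
    where 1.267 converts eV^2 km / GeV to the dimensionless phase."""
    phi = 1.267 * dm2_ev2 * l_km / e_gev
    return 1.0 - sin2_2theta * np.sin(phi) ** 2

# T2K baseline L = 295 km, E ~ 0.6 GeV, parameter values from this paper:
# |dm^2| = 2.51e-3 eV^2, sin^2(theta_23) = 0.514.
p = survival_two_flavor(0.6, 295.0, 2.51e-3, 4 * 0.514 * (1 - 0.514))
print(f"{p:.3f}")  # nearly complete disappearance at the maximum
```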

B. External input for oscillation parameters
Since our experiment is insensitive to the solar oscillation parameters, we fix them to the values sin 2 θ 12 = 0.306 and ∆m 2 21 = 7.5 × 10 −5 eV 2 /c 4 from [99]. As a check, the Bayesian analysis presented in Sec. X applies Gaussian priors with standard deviations (0.017 and 0.2 × 10 −5 eV 2 /c 4 ) and finds that the uncertainties in these parameters do not affect the intervals of the other oscillation parameters.
When combining the results for the T2K joint oscillation analyses in Secs. IX and X with the results from the reactor experiments, we use the weighted average of the results from the three reactor experiments Daya Bay, RENO, and Double Chooz which is: (sin 2 2θ 13 ) reactor = 0.095±0.01 [100]. In terms of the parametrization that we use in this paper, (sin 2 θ 13 ) reactor = 0.0243 ± 0.0026.
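The conversion between the two parametrizations follows from $\sin^2 2\theta = 4\sin^2\theta\,(1-\sin^2\theta)$, solved for the smaller root:

```python
import math

def sin2_theta13(sin2_2theta13):
    """Convert sin^2(2 theta_13) to sin^2(theta_13):
    sin^2(2t) = 4 s^2 (1 - s^2)  =>  s^2 = (1 - sqrt(1 - sin^2 2t)) / 2."""
    return 0.5 * (1.0 - math.sqrt(1.0 - sin2_2theta13))

print(f"{sin2_theta13(0.095):.4f}")  # -> 0.0243
```

This reproduces the quoted reactor constraint, (sin²θ13)reactor = 0.0243.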

C. Oscillation parameter estimation
Sections VIII-X describe analyses which use T2K and external data to estimate oscillation parameters and provide frequentist confidence intervals or Bayesian credible intervals. Using the disappearance channel alone, the atmospheric oscillation parameters are studied using frequentist approaches. The disappearance and appearance channels are used in combination to study a larger set of oscillation parameters, using frequentist and Bayesian approaches.
This section describes general methods that are applied in these analyses.
These measurements were performed within the framework of the PMNS oscillation model described in Section VII A and provided best-fit estimates and frequentist confidence intervals for the values of the mixing parameter $\sin^2\theta_{23}$ and the mass-squared splitting $\Delta m^2_{32}$ ($\Delta m^2_{13}$) in the case of the normal (inverted) hierarchy. Each successive measurement analyzed a larger dataset, and the most recent measurement provides the world's strongest constraint on $\sin^2\theta_{23}$ [10]. This section gives a more detailed description of that analysis and the study of multi-nucleon effects. Reducing the uncertainty on the values of these two parameters is important for measuring CP violation in neutrino oscillations by T2K and other current and future experiments. Furthermore, precise measurements of $\sin^2\theta_{23}$ could constrain models of neutrino mass generation [107][108][109][110][111][112].

A. Method
The $\nu_\mu$-disappearance analysis is performed by comparing the rate and spectrum of reconstructed neutrino energies, Eq. (11), in the $\nu_\mu$ CC candidate event sample with predictions calculated from Monte Carlo simulation. The predicted spectrum is calculated by applying the survival probability in Eq. (16) to a prediction for the unoscillated rate and spectrum.
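As an illustration of this procedure, the sketch below applies the leading two-flavor approximation of the survival probability, $P(\nu_\mu\to\nu_\mu)\approx 1-\sin^2 2\theta_{23}\,\sin^2(1.267\,\Delta m^2 L/E)$, bin by bin to an unoscillated prediction. The exact three-flavor expression of Eq. (16) and the detector response are simplified away here; the parameter values are the best-fit numbers quoted in this paper.

```python
import math

L_KM = 295.0                   # T2K baseline (J-PARC to Super-Kamiokande), km
S2_23, DM2 = 0.514, 2.51e-3    # sin^2(theta23) and |Delta m^2| in eV^2/c^4

def survival_prob(e_gev):
    """Two-flavor nu_mu survival probability (leading approximation)."""
    sin2_2t23 = 4.0 * S2_23 * (1.0 - S2_23)
    phase = 1.267 * DM2 * L_KM / e_gev   # 1.267 carries the unit conversion
    return 1.0 - sin2_2t23 * math.sin(phase) ** 2

def oscillated_spectrum(energies, unosc_counts):
    """Apply the survival probability bin by bin to an unoscillated prediction."""
    return [n * survival_prob(e) for e, n in zip(energies, unosc_counts)]

# near the first oscillation maximum (~0.6 GeV) the flux is almost fully suppressed
print(round(survival_prob(0.6), 3))  # 0.001
```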
These predictions are derived from our models of the total expected neutrino flux at the detector (explained in Sec. II) and the cross section predictions for neutrino-nucleus interactions on water (described in Sec. III), which are constrained by near detector data (described in Sec. V), and a GEANT3 model of particle interactions and transport in the SK detector. The models of the flux, interaction physics, and detector include systematic parameters, whose uncertainties are accounted for in the analysis by using their corresponding covariance matrices.
The oscillation parameters are estimated using two independent maximum likelihood fits to the reconstructed energy spectrum. The fits use different likelihoods and software in order to serve as cross-checks of each other. One analysis uses an extended unbinned likelihood (M1), while the other uses a binned likelihood (M2). Ignoring constant terms, the log-likelihood definitions are:

• M1 likelihood: $\ln L_{M1} = \sum_{i=1}^{N^d} \ln f(E^{\rm rec}_{\nu,i}\,|\,\vec{\theta},\vec{g},\vec{x}_s,\vec{s}) - N^p(\vec{\theta},\vec{g},\vec{x}_s,\vec{s})$

• M2 likelihood: $\ln L_{M2} = \sum_{j} \left[ N^d_j \ln N^p_j(\vec{\theta},\vec{g},\vec{x}_s,\vec{s}) - N^p_j(\vec{\theta},\vec{g},\vec{x}_s,\vec{s}) \right]$

For the M1 likelihood, $f(E^{\rm rec}_{\nu,i}\,|\,\vec{\theta},\vec{g},\vec{x}_s,\vec{s})$ is the probability density of observing an event with reconstructed energy $E^{\rm rec}_{\nu,i}$, given values for the oscillation and systematic parameters. The value of $f$ is calculated with a linear interpolation between the bins of a histogram of the normalized energy spectrum. For the M2 likelihood, the numbers of data and predicted events in the $j$th reconstructed energy bin, $N^d_j$ and $N^p_j$ respectively, are used instead.
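A minimal numerical sketch of the M2-style binned Poisson likelihood (dropping the constant terms and the systematic penalty terms, which are simplified away here):

```python
import math

def neg_log_likelihood_binned(n_data, n_pred):
    """-ln L for Poisson-distributed bins, dropping the constant ln(n!) terms."""
    return sum(np_j - nd_j * math.log(np_j)
               for nd_j, np_j in zip(n_data, n_pred))

# toy spectra: a prediction that matches the data bin by bin minimizes -ln L
data = [12, 30, 25, 8]
assert neg_log_likelihood_binned(data, [12.0, 30.0, 25.0, 8.0]) < \
       neg_log_likelihood_binned(data, [18.0, 24.0, 20.0, 14.0])
```

In the fit, `n_pred` would be recomputed from the oscillation and systematic parameters at every step of the minimization.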
Both $\nu_\mu$-disappearance fits consider a total of 48 parameters: 6 oscillation parameters, 16 flux parameters, 20 neutrino interaction parameters, and 6 parameters related to the response of SK. In order to find the best-fit values and confidence intervals for $\sin^2\theta_{23}$ and $|\Delta m^2|$, the profiled likelihood is maximized. Separate fits are performed for the two neutrino mass hierarchy assumptions.

B. Determining Confidence Intervals
As explained in Sec. VII C, the Neyman method with the approach recommended by Feldman and Cousins (FC) was used to calculate confidence intervals for the two oscillation parameters, $\sin^2\theta_{23}$ and $\Delta m^2_{32}$ ($\Delta m^2_{13}$) for the normal (inverted) hierarchy. The constant-$\Delta\chi^2$ method does not provide correct coverage because of the physical boundary near $\sin^2 2\theta_{23}=1$ and because of the non-linear parametrization. Critical values of the FC statistic were determined on a fine grid of the two oscillation parameters of interest using 10,000 toy datasets at each point. Each toy dataset had a set of values of the systematic parameters sampled from a multi-dimensional Gaussian with means at the nominal values and covariance $V$. The oscillation parameters $\sin^2\theta_{12}$, $\Delta m^2_{21}$, and $\sin^2\theta_{13}$ were each sampled from a Gaussian with the mean and standard deviation listed in Sec. VII B, and $\delta_{CP}$ was sampled uniformly between $-\pi$ and $+\pi$. The systematic parameters and these additional oscillation parameters are removed from the likelihood function by profiling. In order to calculate an interval for just one oscillation parameter ($\sin^2\theta_{23}$ or $|\Delta m^2|$), we determine the critical values by marginalizing over the second oscillation parameter, assuming that the probability is proportional to the likelihood given the T2K data.
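The Feldman–Cousins construction can be illustrated with a one-parameter Poisson counting toy. The real analysis profiles many systematic and oscillation parameters at each grid point; this sketch omits all of that and keeps only the core recipe (throw toys at the tested point, take the desired quantile of the $\Delta\chi^2$ distribution as the critical value).

```python
import numpy as np

rng = np.random.default_rng(42)

def m2lnl(n, mu):
    # Poisson -2 ln L, dropping constant terms
    return 2.0 * (mu - n * np.log(mu))

def delta_chi2(n_obs, mu_true, mu_grid):
    # test statistic: -2 ln L at the tested point minus its minimum over the grid
    return m2lnl(n_obs, mu_true) - np.min(m2lnl(n_obs, mu_grid))

def fc_critical_value(mu_true, mu_grid, n_toys=10000, cl=0.90):
    # throw toy datasets at mu_true and take the CL quantile of the statistic
    toys = rng.poisson(mu_true, size=n_toys)
    stats = np.array([delta_chi2(n, mu_true, mu_grid) for n in toys])
    return np.quantile(stats, cl)

grid = np.linspace(0.1, 40.0, 400)
crit = fc_critical_value(10.0, grid)  # comparable to the asymptotic chi^2(1) value of 2.71
```

Repeating this at every grid point, and comparing the data's $\Delta\chi^2$ against the local critical value, yields intervals with correct coverage even near physical boundaries.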

C. Results
Both the M1 and M2 analyses find the point estimates $\sin^2\theta_{23}=0.514$ and $\Delta m^2_{32}=2.51\times 10^{-3}$ eV$^2$/c$^4$ when assuming the normal mass hierarchy, and $\sin^2\theta_{23}=0.511$ and $\Delta m^2_{13}=2.48\times 10^{-3}$ eV$^2$/c$^4$ when assuming the inverted mass hierarchy. Figure 27 shows the best-fit values of the oscillation parameters, the 2D confidence intervals calculated using the Feldman–Cousins method, assuming normal and inverted hierarchy, and the sensitivity at the current exposure. The confidence interval found by the fit to the data is smaller than the sensitivity.
This arises because the best-fit point is at the physical boundary corresponding to the maximum disappearance probability. The amount by which the region is smaller is not unusual in an ensemble of toy MC experiments produced under the assumption of maximal disappearance. The best-fit spectrum from the normal hierarchy fit is compared to the observed spectrum in Figure 28, which also shows the ratio of the number of observed events to the number of events predicted with $\sin^2\theta_{23}=0$. The observed oscillation dip is significant and well described by the simulation. The calculated 1D Feldman–Cousins confidence intervals are given in Table XXII. Figure 29 shows the $-2\Delta\ln L$ distributions for $\sin^2\theta_{23}$ and $|\Delta m^2|$ from the data, along with the 90% CL critical values.

D. Multi-Nucleon Effects Study
Recently, experimental [67,113-115] and theoretical [24,25,116-129] results have suggested that the charged-current neutrino-nucleus scattering cross section at T2K energies could contain a significant multi-nucleon component. Such processes are known to be important.

This section describes the joint 3-flavor oscillation analysis performed by combining the $\nu_\mu$ disappearance and $\nu_e$ appearance channels using a frequentist approach. The oscillation parameters $\vec{\theta} = (|\Delta m^2|, \sin^2\theta_{23}, \sin^2\theta_{13}, \delta_{CP})$, described in Sec. VII A, are determined simultaneously. This is done by comparing the reconstructed energy spectra of the $\nu_\mu$ CC and $\nu_e$ CC candidate events observed at SK, selected as described in Sec. VI, with the predicted reconstructed energy spectra. Point estimates of the oscillation parameters are found by minimizing the negative log-likelihood

$-\ln L = \sum_i \left[ N^p_{\mu,i}(\vec{\theta},\vec{g},\vec{x}_s,\vec{s}) - N^d_{\mu,i} \ln N^p_{\mu,i}(\vec{\theta},\vec{g},\vec{x}_s,\vec{s}) \right] + \sum_i \left[ N^p_{e,i}(\vec{\theta},\vec{g},\vec{x}_s,\vec{s}) - N^d_{e,i} \ln N^p_{e,i}(\vec{\theta},\vec{g},\vec{x}_s,\vec{s}) \right]$

where $N^d_{\mu,i}$ ($N^d_{e,i}$) is the observed number of $\nu_\mu$ CC ($\nu_e$ CC) candidate events in the $i$th reconstructed energy bin, and $N^p_{\mu,i}$ ($N^p_{e,i}$) is the corresponding predicted number of events, calculated as a function of the oscillation parameters $\vec{\theta}$ and the vectors of systematic parameters $\vec{g}$, $\vec{x}_s$, $\vec{s}$, as described for Eq. (20).
The negative log-likelihood function is minimized using MINUIT. As explained in Sec. VII, the solar oscillation parameters are kept fixed for this analysis. To combine our measurement with the reactor measurements, we add the term

$\frac{1}{2}\left(\frac{\sin^2\theta_{13} - (\sin^2\theta_{13})_{\rm reactor}}{\sigma_{\rm reactor}}\right)^2$

where $(\sin^2\theta_{13})_{\rm reactor}$ and $\sigma_{\rm reactor}$ are given in Sec. VII B.
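The effect of such a Gaussian penalty term can be sketched with a toy one-parameter likelihood. The "T2K" term below is a stand-in parabola with invented center and width, chosen only to illustrate how the much more precise reactor constraint pulls the combined best fit; it is not the real likelihood.

```python
S13_REACTOR, SIG_REACTOR = 0.0243, 0.0026  # values from Sec. VII B

def nll_t2k(s13):
    # stand-in for the profiled T2K likelihood in sin^2(theta13)
    # (hypothetical center and width, for illustration only)
    return 0.5 * ((s13 - 0.040) / 0.012) ** 2

def nll_total(s13):
    # the reactor measurement enters as an additive Gaussian penalty term
    return nll_t2k(s13) + 0.5 * ((s13 - S13_REACTOR) / SIG_REACTOR) ** 2

# crude grid-scan minimization: the combined best fit sits close to the
# precision-weighted mean of the two inputs, near the reactor value
grid = [i * 1e-5 for i in range(8001)]
best = min(grid, key=nll_total)
```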
When maximizing the likelihood, the systematic parameters are allowed to vary, constrained by their prior covariances.

A. Results
Point estimates for the oscillation parameters and the expected number of events are summarized in Tab. XXIII. Notably, the value obtained for $\sin^2\theta_{13}$ by T2K is larger than the value found by the reactor experiments, the best-fit value of $\sin^2\theta_{23}$ is consistent with maximal disappearance, and the difference in $\Delta\chi^2$ between the solutions for the two mass hierarchies is negligible.
The profiled $\Delta\chi^2$ of each oscillation parameter was obtained by minimizing the negative log-likelihood with respect to the systematic parameters and the other three oscillation parameters using MINUIT. Figure 31 presents the profiled $\Delta\chi^2$ of each oscillation parameter, with and without the reactor data and for the two mass hierarchy assumptions, where $\Delta\chi^2$ is computed with respect to the overall best-fit point, that of the inverted hierarchy. The parameter $|\Delta m^2|$ represents $\Delta m^2_{32}$ or $\Delta m^2_{13}$ for the normal and inverted mass hierarchy assumptions, respectively. The lower left plot shows 1D confidence intervals in $\sin^2\theta_{13}$ for different values of $\delta_{CP}$.
where the appearance probability is largest, as shown in Fig. 25.
The profiled $\Delta\chi^2$ as a function of each oscillation parameter is presented in Fig. 33, whereas the result from SK has $\sin^2\theta_{13}$ fixed to the previous reactor value in [99]. In the three analyses, $\delta_{CP}$ was removed by profiling.

In order to thoroughly cross-check the analysis described above, an alternate frequentist joint fit analysis was performed which differs in the treatment of the systematic errors. It originated as part of an effort to simplify and reduce the computing power needed for the analysis and to perform a study of the future sensitivity of the experiment [133].

The posterior distribution, produced using Bayes' theorem, is too difficult to compute analytically. We use two numerical methods to perform the high-dimensional integration necessary when computing the posterior distribution: a Markov Chain Monte Carlo (MCMC), in Sec. X A, and a sampling method, in Sec. X B, which is used as a cross-check.
A. Joint Near-Far Markov Chain Monte Carlo Analysis

Point estimates
To extract a point estimate of the oscillation parameters from the posterior distribution generated by the MCMC, the density of points in the 4-dimensional parameter space was estimated using a kernel density estimator (KDE) [134,135]. A KDE estimates a PDF by smearing the discrete points of an MCMC in the 4 dimensions of interest. The Gaussian width of the smearing was variable, inversely proportional to the local density of MCMC points; this technique counters potential under-smoothing in low-density regions and potential over-smoothing in high-density regions. The PDF produced by the KDE was then maximized using MINUIT to find the most probable value. When using only T2K data there is little sensitivity to $\delta_{CP}$, and so a line of most probable values was created by finding the 3-dimensional density of the MCMC at a series of values of $\delta_{CP}$.
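A one-dimensional sketch of an adaptive-bandwidth KDE is given below. The analysis uses a 4-dimensional KDE, and the exact bandwidth rule is not reproduced here; this sketch uses Abramson's square-root scaling (a common choice, assumed for illustration), which likewise widens kernels where the chain is sparse and narrows them where it is dense.

```python
import numpy as np

rng = np.random.default_rng(1)
points = rng.normal(0.5, 0.1, size=2000)  # stand-in for MCMC samples of one parameter

def adaptive_kde(x, pts, pilot_bw=0.05):
    # pilot (fixed-bandwidth) density estimate at every sample point
    diffs = (pts[None, :] - pts[:, None]) / pilot_bw
    pilot = np.exp(-0.5 * diffs**2).mean(axis=1) / (pilot_bw * np.sqrt(2 * np.pi))
    # per-point bandwidths: wider kernels where the chain is sparse,
    # narrower where it is dense (counters under-/over-smoothing)
    bw = pilot_bw * np.sqrt(np.exp(np.mean(np.log(pilot))) / pilot)
    k = np.exp(-0.5 * ((x - pts) / bw) ** 2) / (bw * np.sqrt(2 * np.pi))
    return k.mean()

# the estimated density near the true mode exceeds the density in the tail
print(adaptive_kde(0.5, points) > adaptive_kde(0.9, points))  # True
```

In the analysis the resulting smooth density, rather than the raw histogram of chain points, is handed to MINUIT for maximization.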

Samples
Unlike the frequentist analyses described above, the joint near-far analysis does not use the covariance matrix produced by the ND280 analysis described in Sec. V. Instead, this analysis fits the three ND280 $\nu_\mu$ CC samples simultaneously with the SK $\nu_\mu$ CC and $\nu_e$ CC samples. By fitting all samples simultaneously, this analysis avoids any error from neglecting non-linear dependencies of the systematic parameters constrained by the ND280 analysis on the oscillation parameters.
The systematic uncertainties used for the ND280 samples are nearly identical to those in Sec. V, with the following exceptions: the uncertainties on the cross section ratios $\sigma_{\nu_e}/\sigma_{\nu_\mu}$ and $\sigma_{\bar\nu}/\sigma_{\nu}$ are applied, and the NC normalization uncertainties are divided into NC1$\pi^0$, NC1$\pi^\pm$, NC coherent, and NC other for all samples. Additionally, the number of bins in the ND280 detector systematic covariance matrix is reduced to 105 in order to reduce the total number of parameters. There are no differences in the systematic uncertainties for the SK samples. Ignoring constant terms, the negative log of the posterior probability is given by the sum of the Poisson likelihood terms for the ND280 and SK samples and the negative logs of the prior densities of the oscillation and systematic parameters. The vector $\vec{\theta}_{sr}$ contains the solar oscillation parameters and, for combined fits with reactor data, $\sin^2 2\theta_{13}$, with priors described in Sec. VII B. The priors on the other oscillation parameters of interest are uniform: in $\sin^2\theta_{13}$ between 0 and 1, in $\sin^2\theta_{23}$ between 0 and 1, in $|\Delta m^2_{32}|$ between 0.001 and 0.005 eV$^2$/c$^4$, and in $\delta_{CP}$ between $-\pi$ and $\pi$. Additionally, the prior probabilities of the normal and inverted hierarchies are each 0.5. The priors for the systematic parameters are the multivariate Gaussian terms described above, with the exception of the cross section spectral function parameters, which are given a uniform prior between 0 and 1.
In this analysis, both the ND280 and SK MC sample events are weighted individually for all parameters in the analysis, meaning that each PDF is rebuilt from the MC at every iteration of the MCMC. This has the advantage of retaining shape information within each bin of the PDF, which is especially desirable for the oscillation parameters, and it allows a more natural treatment of certain parameters, such as the SK energy scale uncertainty, which may cause events to migrate between bins. The increase in computational load was offset by performing certain calculations on GPUs, including the event-by-event calculation of the oscillation probability [136].
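The MCMC machinery itself can be sketched with a Metropolis algorithm on a toy one-parameter posterior. The real chain explores dozens of oscillation and systematic parameters and rebuilds the event-weighted PDFs at every step; here the posterior is a stand-in Gaussian, chosen only to show the accept/reject logic.

```python
import math
import random

random.seed(7)

def neg_log_posterior(theta):
    # toy stand-in: Gaussian posterior centered at 0.5 with width 0.05
    return 0.5 * ((theta - 0.5) / 0.05) ** 2

def metropolis(n_steps, start=0.3, step=0.02):
    chain, cur = [], start
    cur_nlp = neg_log_posterior(cur)
    for _ in range(n_steps):
        prop = cur + random.gauss(0.0, step)
        prop_nlp = neg_log_posterior(prop)
        # accept with probability min(1, exp(cur_nlp - prop_nlp))
        if random.random() < math.exp(min(0.0, cur_nlp - prop_nlp)):
            cur, cur_nlp = prop, prop_nlp
        chain.append(cur)
    return chain

chain = metropolis(20000)
post_burn = chain[5000:]           # discard burn-in
mean = sum(post_burn) / len(post_burn)   # ~0.5, the posterior mode
```

After burn-in, histograms of the chain approximate the marginal posterior of each parameter directly; no separate integration step is needed.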

Results
The MCMC was run with $5.6\times 10^7$ steps using only T2K data. The resulting credible intervals are marginalized over the mass hierarchy; in particular, the most probable value line appears to be offset from the center of the credible region. This is because the most probable value line is for the preferred inverted hierarchy, while the credible intervals are marginalized over the hierarchy. Fig. 40 shows the posterior probability for $\delta_{CP}$ with 68% and 90% credible intervals for the T2K+reactor combined analysis. Figure 41 shows comparisons of the SK $\nu_\mu$ CC and $\nu_e$ CC candidate events with the best-fit spectra produced from the T2K-only and T2K+reactor combined analyses. Each best-fit spectrum is formed by calculating the most probable value for the predicted number of events in each energy bin, using all of the MCMC points from the corresponding analysis. The fit spectrum for $\nu_\mu$ CC events does not change appreciably when the reactor prior is included, but the $\nu_e$ CC fit spectrum shows a noticeable reduction in the number of events. Another useful feature of this analysis is that it provides a natural way to study the preference of the data for the normal versus inverted hierarchy and the lower versus upper octant of $\theta_{23}$: one simply compares the total probability (that is, the number of MCMC steps) in each region of interest. Table XXVII shows the probabilities for the various cases for the T2K-only analysis. Note that the inverted hierarchy is preferred in the T2K-only analysis.
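Counting MCMC steps per region is all the hierarchy/octant comparison requires. A sketch with a hypothetical chain of (sin²θ₂₃, hierarchy) samples; the chain fragment and names are invented for illustration:

```python
def model_probabilities(samples):
    """Posterior probability of each hierarchy/octant combination,
    estimated as the fraction of MCMC steps falling in each region."""
    counts = {"NH, lower": 0, "NH, upper": 0, "IH, lower": 0, "IH, upper": 0}
    for s23, is_normal in samples:
        key = ("NH" if is_normal else "IH") + (", lower" if s23 < 0.5 else ", upper")
        counts[key] += 1
    total = len(samples)
    return {k: c / total for k, c in counts.items()}

# hypothetical chain fragment, for illustration only
toy_chain = [(0.45, True), (0.55, False), (0.52, False), (0.48, False)]
probs = model_probabilities(toy_chain)
print(probs["IH, upper"])  # 0.5
```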

B. Cross-check analysis
A second Bayesian joint analysis (JB2) is used to cross-check the results of the analysis described above (JB1). Like the frequentist analyses, JB2 uses the output of the ND280 analysis described in Sec. V to constrain some of the systematic uncertainties, applying it as prior probability densities. Also, JB2 does not use (by default) the reconstructed energy spectrum of the $\nu_e$ candidate events, but instead the 2D distribution of the momentum and angle with respect to the beam direction, $(p_e,\theta_e)$, of the particle reconstructed as an electron in those events. This is similar to what was used in the previously reported electron neutrino appearance observation [4]. JB2 can also use the shape of the reconstructed energy spectrum of the $\nu_e$ candidate events, so that the results of the two analyses can be compared in both configurations. On a technical level, JB2 does not use an MCMC to marginalize over the nuisance parameters; the integration is done numerically by averaging the posterior probability over 10,000 throws of those parameters, sampled from their prior distributions. A second technical difference is that in JB2 the weighting is done per $(p_e,\theta_e)$ bin rather than event by event.
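The throw-based marginalization used by JB2 can be sketched as follows, with a toy likelihood in which a single nuisance parameter shifts the prediction. All specifics (shapes, widths, the 0.1 coupling) are invented for illustration.

```python
import math
import random

random.seed(3)

def likelihood(theta, nuisance):
    # toy likelihood: the nuisance parameter shifts the predicted peak
    return math.exp(-0.5 * ((theta - (0.5 + 0.1 * nuisance)) / 0.05) ** 2)

def marginal_likelihood(theta, n_throws=10000):
    # average over throws of the nuisance drawn from its (standard normal) prior
    throws = (random.gauss(0.0, 1.0) for _ in range(n_throws))
    return sum(likelihood(theta, t) for t in throws) / n_throws

def profile_likelihood(theta):
    # maximize over the nuisance: here it can always center the peak on theta
    return 1.0

# marginalization averages in throws that mismatch the data, so at the peak
# the marginal likelihood sits below the profile likelihood
print(marginal_likelihood(0.5) < profile_likelihood(0.5))  # True
```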
C. Comparison of analyses

Comparison of Bayesian joint analyses
The results obtained with the two joint Bayesian analyses are very similar, both in terms of the posterior probabilities of the different models and of the credible intervals for the oscillation parameters. The comparison of the posterior probability for $\delta_{CP}$ is shown in Fig. 44: the posterior probabilities obtained by the two analyses are similar, and most of the difference comes from JB2 using the $(p_e,\theta_e)$ spectrum shape for the $\nu_e$ candidate events where JB1 uses the reconstructed energy spectrum shape. This also shows that, at the current statistics, fitting the near and far detector samples simultaneously and using the output of the separate near detector analysis described in Sec. V are equivalent.

Treatment of the systematic uncertainties
We also compare, using JB2, the marginalization and profiling approaches described in Sec. VII C to reduce the dimensionality of the likelihood. In the case of $\delta_{CP}$, the marginal likelihood (obtained by integrating the product of the likelihood and priors over the nuisance parameters) and the profile likelihood (obtained by maximizing the likelihood with respect to those parameters) give similar results.

XI.

CONCLUSIONS
With the data collected between 2010 and 2013, we have analyzed $\nu_\mu$ disappearance to estimate the two oscillation parameters $|\Delta m^2|$ and $\sin^2\theta_{23}$. For the first time, we have used a combined analysis of $\nu_\mu$ disappearance and $\nu_e$ appearance to advance our knowledge of the oscillation parameters $|\Delta m^2|$, $\sin^2\theta_{23}$, $\sin^2\theta_{13}$, $\delta_{CP}$, and the mass hierarchy.
The systematic uncertainties in these analyses have been carefully assessed, and their effect is small compared to the statistical errors. Our understanding of neutrino oscillation will continue to improve as we collect more data in the coming years, in both neutrino and anti-neutrino mode [133]. The general approach followed in this paper, which couples the separate analyses of the beamline, neutrino interactions, near detectors, and far detector through sets of systematic parameters and their covariances, will be extended to incorporate additional information from anti-neutrino data and from additional selections of the near detector data.

ACKNOWLEDGMENTS
We thank the J-PARC staff for superb accelerator performance and the CERN NA61/SHINE collaboration for providing valuable particle production data. We acknowledge the support