First T2K measurement of transverse kinematic imbalance in the muon-neutrino charged-current single-$\pi^+$ production channel containing at least one proton

This paper reports the first T2K measurement of the transverse kinematic imbalance in the single-$\pi^+$ production channel of neutrino interactions. We measure the differential cross sections in the muon-neutrino charged-current interaction on hydrocarbon with a single $\pi^+$ and at least one proton in the final state, at the ND280 off-axis near detector of the T2K experiment. The extracted cross sections are compared to the predictions from different neutrino-nucleus interaction event generators. Overall, the results show a preference for models which have a more realistic treatment of nuclear medium effects including the initial nuclear state and final-state interactions.

(Dated: February 8, 2021) This paper reports the first T2K measurement of the transverse kinematic imbalance in the singleπ + production channel of neutrino interactions. We measure the differential cross sections in the muon-neutrino charged-current interaction on hydrocarbon with a single π + and at least one proton in the final state, at the ND280 off-axis near detector of the T2K experiment. The extracted cross sections are compared to the predictions from different neutrino-nucleus interaction event generators. Overall, the results show a preference for models which have a more realistic treatment of nuclear medium effects including the initial nuclear state and final-state interactions.

I. INTRODUCTION
In recent years, neutrino oscillation measurements have reached unprecedented precision [1][2][3][4][5][6][7]. The next generation of long-baseline (LBL) neutrino oscillation experiments, such as DUNE [8] and Hyper-Kamiokande [9], aim to measure important neutrino properties such as the CP-violating phase and mass ordering [10,11]. This requires unprecedented constraints on the neutrino flux, neutrino cross sections and interaction model, and detector response. Amongst all the systematic uncertainties, the limited knowledge of neutrino-nucleus interactions, especially those related to nuclear medium effects, is particularly concerning because it can cause biases in event classification and neutrino energy reconstruction. In the latest T2K oscillation analysis [12], the uncertainty in nucleon removal energy in charged current quasielastic (CCQE) interactions is the dominant systematic component. In order to reduce its value, a more refined analysis is necessary to avoid potential biases in the next measurement of ∆m 2 32 . In the range of energies of current LBL experiments, neutrinos interact predominantly with nucleons. The initial state nucleon can be described by Fermi motion together with nucleon-nucleon correlations in a mean field potential. After a neutrino interacts with a nucleon, the residual nucleus may be left in a simple oneparticle-one-hole (1p1h) excited state, or collective 1p1h excitations described by random phase approximations (RPA) [13][14][15][16][17]. It is also possible to have two-particletwo-hole (2p2h) excitations due to meson-exchange currents (MEC) or short-range correlations [17][18][19][20][21][22][23]. However, in most generators, these correlations are only implemented in the quasielastic (QE) channel, not for the resonant production (RES) or deep inelastic scattering * also at INFN-Laboratori Nazionali di Legnaro † also at J-PARC, Tokai, Japan ‡ affiliated member at Kavli IPMU (WPI), the University of Tokyo, Japan § also at National Research Nuclear University "MEPhI" and Moscow Institute of Physics and Technology, Moscow, Russia ¶ also at the Graduate University of Science and Technology, Vietnam Academy of Science and Technology * * also at JINR, Dubna, Russia † † also at Nambu Yoichiro Institute of Theoretical and Experimental Physics (NITEP) ‡ ‡ also at BMCC/CUNY, Science Department, New York, New York, U.S.A.
Moreover, after the primary neutrino-nucleon interaction, the outgoing hadrons have to propagate through the nuclear remnant before they can be detected. Finalstate interactions (FSI) may cause energy dissipation and hadron absorption, or conversely induce the emission of additional hadrons. As a result, FSI can change the finalstate topology of a neutrino-nucleon interaction, making the identification of primary neutrino-nucleon interaction and the measurement of primary hadronic kinematics difficult. Neutrino cross sections are often measured in terms of experimentally accessible final-state topologies, e.g. in charged-current (CC) interactions, the CC0π topology has only one charged lepton, any number of nucleons and no other particles; the CC1π + topology has only one charged lepton, one π + , any number of nucleons and no other particles.
To achieve the designed sensitivity of future LBL experiments, nuclear effects have to be modelled accurately and consistently amongst all interaction channels. Experimental studies probing nuclear effects in carbon, through the measurement of transverse kinematic imbalance (TKI) in CC interactions [24,25], have been performed in T2K [26] and MINERνA [27][28][29]. TKI explores the lepton-hadron correlations on the plane that is transverse to the initial neutrino direction and helps precisely identify intranuclear dynamics [25][26][27][28][29][30][31][32][33][34][35][36], or the absence thereof [24,[37][38][39][40], in neutrino-nucleus interactions. These measurements, in particular, either focused on final-state topologies without any pions, or final-state topologies with at least one neutral pion. These studies suggest that modelling nuclear effects with Fermi gas initial state models is insufficient, but more data is needed to draw solid conclusions.
In this paper, we describe the first measurement of the ν µ cross section on hydrocarbon as a function of TKI variables in CC production of exactly one π + and no other mesons, and at least one proton. We introduce TKI in Section II and the T2K experiment in Section III. The event simulation and event selection of the analysis are described in Section IV and Section V respectively. Then, the analysis procedure is explained in Section VI, followed by the interpretation of results in Section VII. We conclude in Section VIII.

II. OBSERVABLES
In a ν µ CC RES π + interaction on a free proton p, ν µ + p → µ − + π + + p, (1) a ν µ interacts with an initial-state p to produce a finalstate µ − , π + and p. This is the most important channel that produces π + with the T2K neutrino beam which is narrowly peaked at 0.6 GeV. However, in most neutrino experiments, the target involves some nucleus, A, heavier than hydrogen. In general, a ν µ CC1π + interaction with at least one proton in the final-state can be written as where A is the final-state hadronic system consisting of the nuclear remnant and other possible knocked-out nucleons. Apart from the RES interaction in Eq. (1), this topology also includes DIS interactions where multiple pions are produced and some are subsequently absorbed through FSI, leaving only one π + visible in the detector. Alternatively, CCQE interactions can be included in this topology when an additional π + is produced through FSI. The kinematics of the µ − , π + and p tracks are used to construct the TKI. If there is more than one proton observed in the final state, only the highest momentum one is considered. The set of three TKI variables, δp T T , p N and δα T , were first introduced in Refs. [24,25,30,33]. These observables are designed to characterize the nuclear effects that are most relevant to oscillation experiments: the initial nuclear state, such as the Fermi motion of initial state nucleon and the nucleon removal energy, and the FSI of outgoing hadrons. The term "transverse" refers to the fact that all these observables are closely related to the transverse momentum component p i T (with respect to the incoming neutrino direction) of the final-state particle i. In this analysis, the relevant transverse momenta are the transverse momenta of the muon, p µ T , pion, p π T , and proton, p p T . The first observable δp T T is the double-transverse momentum imbalance [24], illustrated in Fig. 1a. A doubletransverse axis is defined aŝ and the pion and proton momenta are projected onto this axis: The imbalance is defined as In the absence of nuclear effects, δp T T = 0 is expected due to momentum conservation. Inside a nuclear FIG. 1. Schematic illustration of the TKI variables. The total momentum of particle i is given by pi, while its transverse component with respect to the neutrino direction is represented by p i T . In (b), the black circle represents the initial nucleon; the gray plane shows the transverse plane; the orange circles and dashed lines indicate possible FSI experienced by the outgoing hadrons. Figures adapted from Refs. [25,29]. medium, an imbalance is caused by the initial state of the bound nucleon and the FSI experienced by the outgoing pion and proton.
The second observable p N is the initial nucleon momentum. Assuming the target nucleus is at rest and there are no FSI, p N can be computed following the steps in Ref. [33]. The transverse component of p N is equal to δ p T which is the sum of the transverse momenta [25] ( Fig. 1b): The longitudinal component of p N is given by [30] where p i L and E i are the longitudinal momentum and the energy of the final-state particles. The target nucleus mass M A and the residual nucleus mass M A are related by where M p is the proton mass, and p = 26.1 MeV [41] is the proton mean excitation energy for carbon. The total initial nucleon momentum p N is given by [30] which probes the Fermi motion inside the nucleus. Smearing by FSI can shift the peak position of p N , and cause a long tail in the region of large imbalance (similarly for δp T T ). The third observable δα T is the transverse boosting angle [25]: as illustrated in Fig. 1b. This observable quantifies whether the hadronic system is accelerated or decelerated by nuclear effects. Without FSI, the isotropic Fermi motion of the initial-state nucleon would produce a rather flat δα T distribution. However, FSI usually slows down the outgoing hadrons, making δα T > 90 o . Therefore, the strength of FSI can be inferred from the shape of δα T . In the case where there are multiple nucleons emitted, these nucleons are not included in the above calculation and very likely result in a large imbalance in all the TKI variables.

III. THE T2K EXPERIMENT
The Tokai-to-Kamioka (T2K) experiment [42] is a LBL accelerator-based neutrino experiment measuring oscillations with a ν µ (ν µ ) beam. The neutrino beam is produced at the Japan Proton Accelerator Research Complex (J-PARC) which is located on the East coast of Japan in Tōkai, Ibaraki. The neutrino beam is discussed in more detail in Section III A. J-PARC is also home to a suite of near detectors used to measure the properties of the unoscillated beam.
The near detector complex is located at 280 m from the neutrino beam production target and consists of several detectors. INGRID [43] is an on-axis detector consisting of an array of 16 iron/scintillator modules, which precisely measures the beam direction and intensity. The detector of primary interest for this analysis is the Near Detector at 280 m (ND280) which is placed 2.5 • away from the beam axis and measures neutrino interactions for the off-axis flux. It is discussed in more detail in Section III B. The WAGASCI [44] and BabyMIND [45] detectors are located in the same near detector complex but are situated 1.5 • off-axis.
The far detector Super-Kamiokande [46] is a 50 kt water Cherenkov detector located at a distance of 295 km away from the J-PARC facility on the West coast of Japan in Hida, Gifu. Super-Kamiokande is on the same off-axis angle as ND280. Neutrino CC interaction events can be classified into ν µ and ν e like, according to the shape of Cherenkov rings of the outgoing leptons.

A. Neutrino Beam
The J-PARC facility utilizes a 30 GeV proton beam as the primary beamline. A proton spill consists of eight bunches spaced 580 ns apart and is produced every 2.48 s. The beam power has increased over time and reached 520 kW during the latest data-taking period in 2019. To produce a neutrino beam, the proton beam is collided with a 91.4 cm graphite target to produce a secondary beam which is primarily composed of pions and kaons. Three magnetic horns are used to focus positively (negatively) charged hadrons which then decay to produce a beam dominated by ν µ (ν µ ). The magnetic horns are operated with a current of 250 kA (−250 kA) to produce a ν µ (ν µ ) beam. The data taken while producing a ν µ (ν µ ) beam is qualified as neutrino-mode (antineutrino-mode). The focused beam of hadrons then enters a helium-filled, 96 m long decay volume where they decay to produce neutrinos. At the end of the decay volume there is a beam dump and, behind this, a muon monitor [47,48] which is used to monitor the stability of the secondary beam.
The neutrino beams are made up of ν µ ,ν µ , ν e andν e components. The neutrino flux predictions and the different flavour components at ND280 are shown in Fig. 2 [49]. The off-axis configuration allows a narrow energy spectrum with a peak energy of around 0.6 GeV.

B. The off-axis Near Detector
In this analysis, we measure the ν µ differential cross sections as a function of TKI variables at the off-axis detector ND280. As shown in Fig. 3, ND280 is composed of an upstream π 0 detector (PØD) [50], followed by a central tracker region, all surrounded by an electromagnetic calorimeter (ECal) [51]. The outermost component is the former UA1/NOMAD magnet, which provides a 0.2 T dipole field, and contains scintillator modules in the air gaps acting as the side muon range detector (SMRD) [52]. The central tracker region contains two fine grained detectors (FGD1 and FGD2) [53] and three time projection chambers (TPCs) [54]. The FGDs are instrumented with finely segmented scintillator bars which act as both the target mass and particle tracker. The scintillator bars are made of 86.1% carbon, 7.4% hydrogen and 3.7% oxy-gen by mass. The bars are oriented alternately along the two detector coordinate axes (XY axes) transverse to the incoming neutrino beam (Z axis), and allow 3D tracking of charged particles. The most upstream FGD (FGD1) is composed of 15 XY planes of scintillator with each plane having 2 × 192 bars. The downstream FGD (FGD2) has seven XY planes of scintillator with six 2.54 cm thick layers of water in between, which allows cross section measurements to be made on water. This study focuses on carbon interactions and only events occurring in FGD1 are analyzed. For charged particles entering the TPCs, the curvature of the particle's track and thus its momentum can be determined in the presence of the magnetic field with a resolution of 10% at 1 GeV. In combination with the measurement of energy loss per unit distance, TPCs provide high quality particle identification (PID) for charged particles.
The ECal is a sampling calorimeter consisting of three key parts: the PØD ECal which surrounds the PØD; the Barrel ECal which surrounds the FGDs and TPCs; and the Downstream ECal which is located downstream of the FGDs and TPCs. The Barrel ECal and Downstream ECal together are referred to as the tracker-ECal. All ECals use layers of plastic scintillator bonded to lead sheets, and each alternating scintillator layer is rotated by 90 • to give 3D reconstruction. The tracker-ECal is designed to complement the FGDs and TPCs by giving detailed reconstruction of electromagnetic showers and a secondary PID, with an energy resolution of 10% at 1 GeV.

IV. EVENT SIMULATION
For all T2K analyses, we need a reference Monte Carlo (MC) simulation to get a prediction based on the nominal neutrino flux, neutrino interaction model and detector effects. Data are then compared to MC to extract the physics quantities of interest and estimate the systematic uncertainties.
The modelling of the T2K neutrino flux [49] starts with the modelling of interactions of protons with the graphite target, which is done using the FLUKA 2011 package [55,56]. Outside the target, the simulation of hadronic interactions and decays is done using the GEANT3 [57] and GCALOR [58] software packages. Hadronic interactions are further tuned with the recent measurements of π ± yields performed by NA61/SHINE experiment using a T2K replica target [59]. The conditions of the proton beam, magnetic horn current and neutrino beam axis direction are continuously monitored and incorporated into the simulation. This data-driven strategy helps to reduce the neutrino flux uncertainty near the flux peak (0.5 -0.6 GeV) to 5%. This results in a significant improvement with respect to previous T2K cross-section analyses [60,61] where the uncertainty was around 8.5% [62]. A comparison of the flux uncertainty used in this analysis and the flux uncertainty used in previous T2K analyses is shown in Fig. 4. Neutrino-nucleus interactions and FSI of the outgoing particles are simulated using the neutrino event generator NEUT version 5.4.0 [63,64]. NEUT uses the spectral function (SF) in Ref. [65] to describe the CCQE cross section. The modelling of 2p2h interactions is based on the model from Nieves et al. [66]. The RES pion production process is described by the Rein-Sehgal model [67] with updated nucleon form-factors [68] and an axial mass (M RES A ) of 1.07 GeV/c 2 . The model contains contributions from non-resonant, I 1/2 pion-production channels. The nuclear model used for RES is a relativistic global Fermi gas (RFG) [69], without a removal energy and with a Fermi momentum of 217 MeV/c. DIS interactions are modelled using the GRV98 [70] parton distribution functions with corrections from Bodek and Yang [71]. In the low invariant hadronic mass, W, region (1.3 < W ≤ 2.0 GeV/c 2 ) a custom hadronisation model [72] is used with suppressed single pion production to avoid double counting RES interactions. For W > 2 GeV/c 2 , PYTHIA/JETSET [73] is used as the hadronization model. The FSI, describing the transport of hadrons produced in elementary neutrino interaction through the nucleus, are simulated using a semi-classical intranuclear cascade model. The NEUT cascade model has been tuned to external pion-scattering data, which is described in Ref. [74].
Outside the nucleus, final-state particles are then propagated through the detector material using GEANT4 version 4.9.4 [75]. The physics list [76] QGSP BERT is used for the hadronic physics, emstandard opt3 for the electromagnetic physics and G4DecayPhysics for the particle decays. The pion secondary interactions are handled by the cascade model in NEUT. The detector readout is simulated with a custom electronics simulation [42].

V. DATA AND EVENT SELECTION
In this analysis, the neutrino-mode data collected between 2010 and 2017 is used, which corresponds to 11.6 × 10 20 protons on target (POT) and an integrated muon neutrino flux of 2.2 × 10 13 /cm 2 . Events are required to have an interaction vertex in the FGD1 fiducial volume (FV), which includes all the XY planes of scintillator except for the most upstream one, and excludes the outermost five bars on either end of each layer. This leaves the FV with 2 × 182 × 14 bars, and a total mass of approximately 973 kg. The MC contains simulated data equivalent to 195.1 × 10 20 POT.

A. Signal definition
The goal of this analysis is to characterize nuclear effects in ν µ CC1π + interactions on carbon using neutrino interactions inside FGD1, which is a hydrocarbon (CH) target. Since the CC1π + production on carbon and on hydrogen cannot be clearly separated, the combined cross section on CH is measured, with the TKI variables on hydrogen calculated in the same way as on carbon: for hydrogen signal events, in which there are no nuclear effects, it is expected that δp T T = 0 and p N ≈ 26 MeV/c. δα T is undefined for interactions on hydrogen because δp T = 0. A flat distribution across 0-180 o is assigned because it resembles the real δα T distribution due to the small but non-vanishing isotropic Fermi motion of a free proton.
To ensure the cross section results are not dependent on the signal model used in the reference T2K simulation, extensive precautions are taken in the analysis. A crucial one is to have the signal definition only be reliant on observables experimentally accessible to ND280. Therefore, the signal is defined as any event with one µ − , one π + and no other mesons, and at least one proton in the final state, so that there is need to account for the pion and proton FSI. Hereafter, the signal topology is denoted as CC1π + Xp, where X≥1. In order to mitigate model dependence in the efficiency correction, phase-space restrictions are applied in the signal definition to restrict the measurement to the regions of kinematic phase space ND280 is sensitive to. These restrictions are defined in Table I. However, the consideration of three-particle final states in this analysis necessitates the inclusion of a high dimensional kinematic phase space over which the efficiency cannot be kept entirely flat with simple phasespace constraints. This leads to a potential source of bias from the input neutrino interaction model predictions.
To alleviate this concern, additional model uncertainties are added (discussed in Section VI B) to allow a variation of the input simulation in regions of the underlying particle kinematics where the efficiency is not flat. The size of this uncertainty roughly double the largest variation in the efficiency seen from a wide variety of different generator predictions (broadly spanning those shown in Section VII A). TABLE I. CC1π + Xp signal phase-space restrictions for the post-FSI final-state particles. The angle θ is relative to the neutrino direction. For events with multiple protons, only the highest momentum proton is considered, and other protons are ignored.

Particle
Momentum We select one signal sample for the events of interest, and four control samples to constrain the number of background events in the signal sample. The five samples are shown schematically in Fig. 5.

B. Signal sample selection
The signal sample contains neutrino events with exactly one µ − track, one π + track, and at least one proton track, maximizing the number of signal events selected with minimal background.
The selection starts by searching for a good quality µ − track. Events within a 120 ns time window around one of the eight bunch centers per 5 µs spill structure of the beam are considered. The highest momentum, negatively charged track originating from the FGD1 FV and making a long track through the downstream TPC is chosen to be a µ − candidate. Other detector activities in or around FGD1 are used as a veto to ensure the µ − track is not a broken segment of another track from outside the FV. Then a muon PID cut is applied based on the energy loss and momentum measurement in the TPC as in Ref. [77]. After this step a ν µ CC sample of 90.3% purity is obtained.
Next, all other tracks originating from the FGD1 FV with a long segment in the TPC are classified by the TPC PID. For positively charged tracks, three particle hypotheses are considered: π + , e + and proton; for negatively charged tracks, only two particle hypotheses are considered: π − and e − . Events with exactly one π + track, and at least one proton track are selected. Those with π − or e ± are rejected because they are likely to be the products of DIS or other background interactions.
Additional pions are identified in FGD1 and the tracker-ECal. Tracks fully contained inside FGD1 are classified into pions or protons if the energy deposition and range are consistent with the corresponding particle hypotheses. Michel electrons [78] are also identified by looking for a time-delayed FGD1 hit cluster, and are regarded as products of the pion-muon-electron decay chain. The tracker-ECal is employed to identify isolated objects that are consistent with a photon shower, and tags these as products of π 0 → 2γ decay. Events with additional charged pions in FGD1 or π 0 in the ECal are rejected.
In the final step, events with additional tracks in FGD1 (either the fully contained tracks that are not classified, or the non-fully contained tracks without TPC PID) are rejected to reduce the low energy pion backgrounds that are missed by the pion selection processes. Then we require the µ − , π + and p tracks to have their starting positions to be within a box of 50 mm×50 mm×30 mm in the XY and Z planes. This ensures the tracks are coming from the same interaction vertex. Events that are not reconstructed to have matched the kinematic requirements in Table II are put into an out-of-phase-space (OOPS) bin. Compared to the signal definition in Table I, the kinematic cuts have slightly larger ranges in momenta to compensate for the finite momentum resolution. The extremely good angular resolution (about 1 • ) allows us to use the same angular restriction. TABLE II. Kinematic cuts for the reconstructed particles in the analysis samples. The particle type and kinematics are the reconstructed quantities. The angle θ is relative to the neutrino direction. For events with multiple reconstructed protons, only the highest momentum proton is considered, and other protons are ignored.

Particle
Momentum Following the signal sample identification, the selected events (except the OOPS bin) are binned in one of the reconstructed TKI variables and the reconstructed highest proton momentum, p p . The binning in TKI variables is the same as that used in the cross section extraction in Section VI. The binning in p p helps to correct for the bias in estimating selection efficiencies. The binning in p p is chosen over other kinematic variables because nucleon emission from neutrino interactions is less understood than pion or muon emission. In addition, the TPC proton detection threshold is around 400 MeV, which might significantly affect the efficiency. Table III summarizes the signal sample binning. The CC1π + Xp cross sections are measured as a function of a single TKI variable only, thus the number of reconstructed bins is much more than the number of cross-section bins. For example, in the δp T T measurement, there are six p p bins for each of the five δp T T bins in the signal sample. In total there are 6 × 5 = 30 signal sample bins to extract the differential cross sections in five bins of δp T T . Fig. 6 shows the distribution of the reconstructed TKI variables and p p in the signal sample (without the OOPS bin). A total of 366 events are observed in data. The overall signal selection efficiency is around 14%. When broken down by final-state topology, the total CC1π + 1p (one proton) and CC1π + Np (multiple protons) signal purity is 61.1%. The four categories of CC-other events with multiple pions in the final-state, CC1π + 1π − , CC1π + Xπ 0 , CC-other-Xπ 0 and CC-other-  0π 0 , are mostly produced by DIS interactions and are the dominant backgrounds. Details on how to constrain these backgrounds are described in Sec.V C. There are also small amounts of neutral current (NC) andν µ /ν e /ν e events where a π − /e − is misidentified as a µ − . In most cases the misidentification comes from NC interactions. The contribution from out of fiducial volume (OOFV) events is almost negligible. The OOPS background in Fig. 6 refers to CC1π + Xp events which do not satisfy the phase-space restrictions in Table I, and the separated OOPS bin is used to constrain this background.

C. Control sample selection
To better constrain the CC-other background in the signal sample, dedicated control samples (on the right of Fig. 5) are selected based on the number of charged and neutral pions identified in the events. Following the FGD1-TPC µ − , π + and p tracks selection described in Section V B, the control samples require the identification of additional π ± tracks in the FGD/TPC or the identification of a π 0 in the tracker-ECal. These events are then classified into four samples according to the additional identified pions: (i) CC1π + 1π − enriched sample -events with one π − candidate from FGD1 or the TPC; (ii) CC1π + Xπ 0 enriched sample -events with π 0 candidates from the ECal; (iii) CC-other-Xπ 0 enriched sample -events with charged pion candidates from FGD1 or the TPC, and π 0 candidates from the ECal; (iv) CC-other-0π 0 enriched sample -events with charged pion candidates from FGD1 or the TPC, excluding the case of single π − candidate.
The four separate samples allow for better characterization of the pion emission model and detector responses to different particles compared to a single CC-other sample. The same kinematic cuts in Table II are applied to the µ − , highest momentum π + and p tracks, and the TKI variables are calculated using only these tracks. The selected events are binned in the reconstructed TKI variable only, using the same binning in Table III. Fig. 7 shows the reconstructed TKI variable distributions for the four control samples. The nominal MC shows a deficit of events and also some shape discrepancies with respect to data, indicating the need for background correction.

A. Binned likelihood fitting
The analysis is performed using an unregularized binned likelihood fit as in Refs. [26,60,61,77,79], with control samples to constrain the background, to unfold the detector smearing and extract the number of selected signal events from the signal sample. Compared to previous cross-section analyses, significant improvements have been achieved in the analysis framework, including the use of principle component analysis to reduce the dimensionality of the fit, and the proper treatment of MC statistical uncertainties. An unregularized fit means that there is no prior constraint on the shape of TKI from the input signal model, thus reducing model bias on the fitted cross sections. The numbers of signal events (and thus cross sections) as a function of the three TKI variables are fitted independently in this study.
The input MC is varied by a set of fit parameters, and the set of parameters which best describes the observed data is extracted together with its associated errors. The fit parameters of primary interest are the "signal template parameters", c i , which scale the number of signal events in the truth TKI variable bin i without prior constraints. The remaining parameters are the nuisance parameters which describe plausible systematic variations of the flux, detector response and neutrino interaction model. The effect of these parameters is propagated to the number of selected events in the reconstructed bins.
The best-fit parameters are found by minimizing the following negative log-likelihood (χ 2 ): where and Eq. (12) is the modified Poisson likelihood ratio which includes the statistical uncertainty of finite MC statistics using the Barlow-Beeston method [80,81]. bin j, for MC and data respectively. β j is the Barlow-Beeston scaling parameter given by and σ 2 j is the relative variance of N M C j . In the limit of infinite MC statistics, σ j → 0 and β j → 1 which gives the standard Poisson likelihood ratio. Eq. (13) describes how well the nuisance parameters a syst agree with their prior values a syst prior , where V syst cov is the covariance matrix describing the confidence in the prior values as well as correlations between parameters.
The MC prediction N MC j in the signal and control samples is composed of both the signal and background events, which can be written as where N sig i,j and N bkg i,j are the number of signal and background events in the truth bin i, contributing to the reconstructed bin j, predicted by the T2K MC; w sig i,j and w bkg i,j are the event weights coming from the same set of systematic variations and thus are correlated.

B. Sources of systematic uncertainties
Three sources of systematic uncertainties are considered in this analysis.
Neutrino flux uncertainty: This is parametrized as scale factors in bins of true neutrino energy (same binning as in Fig. 4). Such scale factors are constrained by their prior uncertainty, encoded in a covariance matrix. At the same energy, identical event weights are applied on the signal and background events.
Detector uncertainty: The detector response (efficiency and resolution) is not perfectly modelled in the simulation. Dedicated and independent control samples are used to evaluate each possible uncertainty based on the data-MC agreement. The overall detector uncertainty is parametrized as a covariance matrix that describes the rate uncertainty and correlation between each reconstructed bin. The uncertainty related to the modelling of the pion secondary interactions, one of the largest detector systematics in previous T2K analyses, has been reduced by around 40% using external data and the cascade model implemented in NEUT [74]. In the signal sample and control samples without reconstructed π 0 , the biggest uncertainty comes from the modelling of proton secondary interactions which causes a 5% uncertainty on the event rate. On the other hand, π 0 -tagging uncertainty is dominant (around 10%) in the control samples with reconstructed π 0 . (1.07±0.15 GeV/c 2 ), the value of the axial form factor at zero transferred 4-momentum C A 5 (0.96±0.15), and the normalization of the isospin non-resonant component I 1/2 (0.96±0.40) predicted in the Rein-Sehgal model. Initial central values and uncertainties for these parameters are obtained in a fit to low energy neutrino-deuterium single pion production data from ANL [82,83] and BNL [84][85][86] (flux-corrected data in Ref. [87] is used), and carbon-like data from MiniBooNE [88]. One additional parameter varying the ∆ ++ decay width with 50% uncertainty, and ad hoc scale parameters binned in signal particle momenta and angles with a 20% uncertainty, are included to give extra freedom to the efficiency correction. The ad hoc variations are chosen to cover the efficiency's dependency on the initial state nuclear medium effects, which is not otherwise parametrized. In the DIS channel, a CC-other shape parameter x CC-Other with a 40% uncertainty is used, which scales the cross section by (1 + x CC-Other /E ν ) and gives greater flexibility at low E ν . Four normalization parameters with a 50% uncertainty, with the same categorization as the four CC-other topologies, are introduced to better parametrize multiple pion production. The neutral current and electron (anti)neutrino interactions, which are not constrained by the control samples, are given a normalization uncertainty of 30% and 3% respectively. Finally, there are parameters varying the pion and proton FSI. The tunable pion interactions in the nucleus are charge exchange, where the charge of the pion changes; absorption, where the pion is absorbed through two-or three-body processes; elastic scattering, where the pion only exchanges momentum and energy; and inelastic scattering, where additional pions are produced. Their prior is given by Ref. [74]. For proton FSI, there is a single parameter scaling the overall interaction probability inside the cascade with a 50% uncertainty, without tuning specific processes. It is verified that with such comprehensive list of parameters, the fit can cover the bias in signal efficiency and background subtraction under extreme model variations as discussed in Section VI C.

C. Cross section extraction, error propagation and validation
After the number of signal events is extracted from the fit, the differential cross section as a function of the true TKI variable is calculated by the following formula: where N signal i is the measured number of signal events in the i-th bin, for all CC1π + Xp events on hydrocarbon satisfying the kinematic phase restrictions in Table I. Interactions on other elements are estimated by MC and subtracted. Since the fraction of non-hydrocarbon events is small, the potential bias due to cross-section or detector mismodelling is insignificant. i is the selection efficiency in the i-th bin, contributed by both the signal and control samples. Φ is the overall flux integral, evaluated at the best-fit flux parameter values, and N FV nucleons is the number of target nucleons (only hydrocarbon) in the fiducial volume. x i is one of the TKI variables and ∆x i is the bin width.
We use a similar method as in Refs. [26,89] to numerically propagate the uncertainty of the fit to the cross section result, assuming the uncertainties of the fit parameters and cross sections are part of a Gaussian distribution. The covariance matrix of the fit parameters is Cholesky decomposed and multiplied by a vector of Gaussian random numbers to generate a set of random parameters. These random parameters are added to the best-fit parameters to create 2000 sets of variations ("toys") of parameters. This effectively samples the likelihood space encoded in the covariance matrix, and represents the spread of the plausible parameters according to the statistical and systematic uncertainties from the fit. For each toy, all variables in Eq. (16) (except ∆x i ), and thus dσ dxi , are re-evaluated with the toy parameters. The flux integral and selection efficiency are changed by the toy parameters. The resultant uncertainty of the flux integral is around 5%, and Fig. 8 shows the mean values and uncertainties of the efficiency extracted from toys. The number of target nucleons N FV nucleons is sampled independently with a mean value of 5.5×10 29 and an uncertainty of 0.67% [53]. Finally, a covariance matrix V of dσ dxi is built from such toys. This method is different from the one used in previous analyses [60,61], where the uncertainty was estimated by repeating the fit many times with toys of input MC.
To ensure our results are not biased, a plethora of mock data studies with alternative neutrino event generators, nuclear ground state models, background models and altered flux models have been performed. It has been verified that even in the case of extreme deviations from the input MC model, such as doubling the signal/background interactions or completely turning off the FSI, the cross section extraction method employed can always recover the truth values to within a 1σ-uncertainty. The fit performance for every mock data study has been quantified  11)). The p-value for each mock data study has been computed from this distribution and an acceptance threshold of 5% has been chosen to quantify good fitter performances. All the mock data studies performed (without applying statistical fluctuations) have a p-value around 90%, showing that the model differences are well covered by the conservative systematic uncertainties. On the other hand, the agreement on the measured cross sections is quantified by the χ 2 tot statistic: where σ meas is the measured cross section, and σ truth is the truth cross section in the mock data. All mock data fits return a χ 2 tot /ndof less than 0.4, where ndof is the number of degrees of freedom, and a p-value greater 80%, showing the robustness of the cross section extraction method employed for this analysis.

VII. RESULTS
Figs. 9 to 11 show the distributions of the reconstructed events in the signal and control samples, together with the prediction from the pre-fit and post-fit MC. Overall, the fit is able to reproduce the observed distributions, with a p-value greater than 10% for all the TKI variable fits, and is qualified to have a good data-MC agreement in the presence of statistical fluctuations. All nuisance parameters are fitted within their prior uncertainties. The normalization difference in control samples before the fit is well covered by the nuisance parameters, mostly through the CC-other normalization parameters. In the signal sample, there are few bins of reconstructed p p where the post-fit χ 2 stat is worse than the pre-fit one. This indicates there might not be enough freedom in the shape of the signal particle kinematics. However, from the mock data studies, it is concluded that the potential bias is much smaller than the statistical uncertainty and has little impact on this analysis. Fig. 12 estimates the uncertainties of the cross sections as a function of the TKI variables, together with the correlation between bins. Contributions from each kind of systematic uncertainties are estimated by running the fit with only the relevant nuisance parameters. As expected, the statistical error is much larger than the individual or combined systematic uncertainties. The largest systematic uncertainties are those related to the neutrino interaction model, which affect both the signal selection efficiency and background estimation. The bin-by-bin correlation in δα T is larger than that in δp T T and p N because the cross section on hydrogen is uniform across all bins of δα T .

A. Comparisons to models
In the following, the measured cross sections are compared to different neutrino interaction models. The agreement is quantified by the χ 2 tot statistic in Eq. (17), with σ truth replaced by the model prediction σ model .
On the other hand, the overall normalization uncertainty, which is fully correlated between bins, may constitute a relatively large fraction of the uncertainty. Therefore, the χ 2 tot statistics may suffer from "Peelle's pertinent puzzle" [90,91], in which the assumption in Eq. (17) that the variance is distributed as a multivariate Gaussian may not be valid for highly correlated results. To mitigate this problem, the shape only χ 2 shape is also provided: where σ model int and σ meas int are the total integrated cross sections per nucleon estimated from the model and data respectively. The shape only covariance matrix W is built by the same method as described in Section VI C but on the shape variable dσ meas dxi 1 σ meas int instead. It is important to notice that the ndof is one less for χ 2 shape compared to χ 2 tot since the sum of the shape variables is equal to one by construction, reducing the number of independent dimensions.
To compare the measured cross sections with model predictions, a sufficiently large number of events are generated on hydrocarbon from each model using the T2K flux. Events satisfying the CC1π + Xp signal definition in Table I are selected to calculate the cross sections per target nucleon. The number of target nucleons for each CH is equal to 13 which includes all seven protons and six neutrons. The following models are considered.
(i) NEUT version 5.4.0: models implemented in this event generator are described in Section IV. RFG is used as the nuclear ground state for pion production.
(iii) GiBUU [104] version 2019: it uses an LFG-based nuclear ground state to describe all neutrino interaction modes and FSI consistently. In the RES channel, 13 resonances are included and the nonresonant contribution is described by a phenomenological model. Rather than a simple cascade model, GiBUU models FSI by solving the dynamical evolution of the particle phase space density in the nuclear mean field potential.
(iv) NuWro [105] version 19.02: four different nuclear ground state models are implemented. These include three Fermi gas models: LFG, RFG, and BR-RFG; and an effective approximation of a spectral function (ESF) [106]. The Adler-Rarita-Schwinger single ∆ model [107,108] is used for RES, and the FSI cascade model is based on the Metropolis algorithm [109]. Fig. 13 shows the comparisons between the measured cross sections and model predictions. The full χ 2 tot and shape only χ 2 shape are summarized in Table IV. It is observed that χ 2 shape is usually much smaller than χ 2 tot , implying a large part of the model separation power in this analysis comes from normalization differences.

B. Discussion
Amongst all the models compared, GiBUU shows marginally better agreement with data. Both χ 2 tot and χ 2 shape are smaller than the corresponding ndof for each TKI variable. It is explained in Ref. [110] that GiBUU uses a density-and momentum-dependent mean field to model the nucleon-nucleus potential and prepare the nuclear ground state. Since the same potential is used in all interaction channels, it may provide a more accurate prediction than other generators which often treat QE and pion production processes differently. Also GiBUU's modelling of FSI in the transport theory is a more complete approach than the commonly used cascade models, which might be a contributing factor to the overall agreement. The nice shape agreement at low p N suggests that the nuclear ground state modelling in GiBUU is better than other generators. In the tail all models have similar predictions, meaning that we are not sensitive to the FSI differences there.
Within the NuWro models, ESF and BRRFG have better agreement than LFG and RFG. In the pion channel, these nuclear models affect properties like the removal energy and nucleon momentum distribution. This suggests that ESF and BRRFG may provide a more realistic nuclear ground state description. From the p N result, the characteristic nucleon momentum peak at the Fermi surface (∼220 MeV/c) in RFG is strongly disfavored. On the contrary, LFG predicts a large number of events with p N < 120 MeV/c which is also incompatible with data.
NEUT RFG, GENIE BRRFG and GENIE LFG use the same types of Fermi gas nuclear ground state models as in NuWro. The choices of pion production and FSI models make a difference in their predictions, but in general the same nuclear ground state model shows similar features across generators in the small imbalance regions of δp T T and p N . This indicates these observables are a good probe of the nuclear ground state models.
In general the model separation in δp T T and p N is better than that in δα T , with most of the sensitivity coming from the central bin of δp T T and the first two bins in p N where the imbalance is small. While δα T is rather insensitive to the initial nuclear state, the hardening of δα T towards 180 • is strongly affected by FSI which usually slow down the final-state hadrons but not the lepton. However, with the present signal phase space restrictions, in particular the high proton momentum threshold of 450 MeV/c, many of the CC1π + Xp events that undergo FSI are lost, making the measurement less effective. With improved detector acceptance in the coming ND280 upgrade [111], δα T will be an extremely useful and independent probe of FSI.
If only χ 2 shape is considered, most models have a χ 2 shape less than or roughly equal to the ndof. The worst case is the p N prediction from NuWro RFG which has a pvalue of 15%. Nevertheless, the large normalization discrepancy exhibited by the RFG and LFG nuclear ground state models cannot be simply explained by flux or other normalization uncertainties. Thus one should be careful in interpreting the model agreement when the difference between χ 2 shape and χ 2 tot is large. It is not straight-forward to compare this study to the T2K CC0π [26] and MINERνA [27,29] TKI results, because of the different signal definition and, more importantly, a significant contribution from free nucleon targets (hydrogen) in this measurement. For example, Fig. 14 shows the interaction target and channel breakdown of the GiBUU prediction. As explained in Section V A, the hydrogen cross section is a Dirac delta function at δp T T = 0 and p N ≈ 26 MeV/c, and is flat in δα T . The cross section on hydrogen is related to the carbon component via the common neutrino-nucleon cross section modelling; both components will scale similarly when the neutrino-nucleon cross section is changed. However the ratio between the hydrogen and carbon components is highly dependent on the modelling of the nuclear medium effects. Qualitatively, almost all models are compatible with the p N tail in both T2K and MINERνA data, but have an over-prediction in the peak region. However, there are not sufficient statistics to measure the peak of p N more precisely. MINERνA also reported a mild asymmetry in δp T T , and attributed it to the interference between ∆ and non-resonant amplitudes [35], but such an asymmetry is not observed within errors in this study. The tight phase space restrictions used in this study reduces our sensitivity to FSI modelling. The rather flat distribution of δα T compared to MINERνA results can be attributed to the difference in phase space restrictions, where MINERνA applied no phase space restriction on the final-state π 0 . The more energetic (∼3 GeV) neutrino beam of MINERνA also produces more energetic final-state particles and a more curved δα T .
While GiBUU has a good agreement with this CC1π + Xp and MINERνA CCπ 0 measurements, it shows an incompatibility with our CC0π TKI results [26,31]. This incompatibility is not in the δp T (Eq. (6)) tail or normalisation, suggesting this might be related to the nuclear ground state. In our previous CC0π cross section measurements as a function of outgoing muon kinematics [60,61], the GiBUU prediction also shows a large discrepancy, mainly in the most forward bin where the nuclear physics governing low energy and momentum transfer interactions is the most important.

VIII. CONCLUSION
In this paper, the CC1π + Xp muon neutrino differential cross sections on hydrocarbon as a function of the three TKI variables, δp T T , p N and δα T , have been measured independently in the ND280 tracker. δp T T and p N are most sensitive to the initial nuclear ground state, and δα T is an independent probe of FSI. The analysis is performed with a joint fit between the signal and control samples to minimize the uncertainties on background estimation, and a maximum likelihood fit is used to unfold the detector smearing effect and extract cross sections in the truth space. The reduced flux uncertainty and better detector modelling allow to have a reduced systematic uncertainty with respect to previous T2K cross section analyses. Due to the complex and multifaceted nature of this analysis, exceptional care has been taken in mit- igating sources of potential model bias in the extracted results. An extensive comparison of the extracted results to state-of-the-art neutrino interaction models shows a slight preference for GiBUU, which uses a more realistic nuclear ground state to handle all interaction channels consistently. Our results are statistically limited and a large part of the model separation power comes from normalization differences. In general the simple Fermi gas models (RFG and LFG) show a large disagreement in p N with χ 2 tot /ndof > 2, which indicates a mis-modelling of the nucleon Fermi motion. The similar data-MC comparison to the MINERνA CCπ 0 results [29] seems to confirm that the mis-modelling is general in pion production channels. While the tight phase space restrictions limit our sensitivity to FSI modelling, the relatively flat δα T in T2K results is in strong contrast to MINERνA results, indicating a possible energy dependence of hadronic FSI.
Future analyses will aim to unfold cross sections in multiple TKI variables simultaneously and obtain their correlations which can then be used to separate effects due to the initial nuclear state and FSI. The upcoming ND280 upgrade is going to expand the measurable phase space, especially in the low energy and high angle regions. Thus the ND280 upgrade is expected to increase our statistics and model sensitivity significantly. Another possible extension is to isolate hydrogen interactions from carbon ones by selecting events with small δp T T and p N . With better detector resolution, this technique could better identify and separate interactions on hydrogen on an event-by-event basis, and provide the first "free nucleon data" since the hydrogen bubble chamber experiments [24,[37][38][39][40].
The data release for the results presented in this analysis is posted in Ref. [112]. It contains the analysis binning, the differential cross section best-fit values, and associated covariance matrices.