First observation and amplitude analysis of the $B^{-}\to D^{+}K^{-}\pi^{-}$ decay

The $B^{-}\to D^{+}K^{-}\pi^{-}$ decay is observed in a data sample corresponding to $3.0~\rm{fb}^{-1}$ of $pp$ collision data recorded by the LHCb experiment during 2011 and 2012. Its branching fraction is measured to be ${\cal B}(B^{-}\to D^{+}K^{-}\pi^{-}) = (7.31 \pm 0.19 \pm 0.22 \pm 0.39) \times 10^{-5}$ where the uncertainties are statistical, systematic and from the branching fraction of the normalisation channel $B^{-}\to D^{+}\pi^{-}\pi^{-}$, respectively. An amplitude analysis of the resonant structure of the $B^{-}\to D^{+}K^{-}\pi^{-}$ decay is used to measure the contributions from quasi-two-body $B^{-}\to D_{0}^{*}(2400)^{0}K^{-}$, $B^{-}\to D_{2}^{*}(2460)^{0}K^{-}$, and $B^{-}\to D_{J}^{*}(2760)^{0}K^{-}$ decays, as well as from nonresonant sources. The $D_{J}^{*}(2760)^{0}$ resonance is determined to have spin~1.


Introduction
Excited charmed mesons are of great theoretical and experimental interest as they allow detailed studies of QCD in an interesting energy regime. Good progress has been achieved in identifying and measuring the parameters of the orbitally excited states, notably from Dalitz plot (DP) analyses of three-body B decays. Relevant examples include the studies of B − → D + π − π − [1,2] and B 0 → D 0 π + π − [3] decays, which provide information on excited neutral and charged charmed mesons (collectively referred to as D * * states), respectively. First results on excited charm-strange mesons have also recently been obtained with the DP analysis technique [4][5][6]. Studies of prompt charm resonance production in e + e − and pp collisions [7,8] have revealed a number of additional high mass states. Most of these higher mass states are not yet confirmed by independent analyses, and their spectroscopic identification is unclear. Analyses of resonances produced directly from e + e − and pp collisions do not allow determination of the quantum numbers of the produced states, but can distinguish whether or not they have natural spin parity (i.e. J P in the series 0 + , 1 − , 2 + , ...). The current experimental knowledge of the neutral D * * states is summarised in Table 1 (here and throughout the paper, natural units with ħ = c = 1 are used). The D * 0 (2400) 0 , D 1 (2420) 0 , D 1 (2430) 0 and D * 2 (2460) 0 mesons are generally understood to be the four orbitally excited (1P) states. The experimental situation as well as the spectroscopic identification of the heavier states is less clear.
The B − → D + K − π − decay can be used to study neutral D * * states. The D + K − π − final state is expected to exhibit resonant structure only in the D + π − channel, and unlike the Cabibbo-favoured D + π − π − final state does not contain any pair of identical particles. This simplifies the analysis of the contributing excited charm states, since partial wave analysis can be used to help determine the resonances that contribute.
One further motivation to study B − → D + K − π − decays is related to the measurement of the angle γ of the unitarity triangle defined as γ ≡ arg [−V ud V * ub /(V cd V * cb )], where V xy are elements of the Cabibbo-Kobayashi-Maskawa (CKM) quark mixing matrix [10,11]. One of the most powerful methods to determine γ uses B − → DK − decays, with the neutral D meson decaying to CP eigenstates [12,13]. The sensitivity to γ arises due to the interference of amplitudes proportional to the CKM matrix elements V ub and V cb , associated with $\bar{D}^0$ and $D^0$ production respectively. However, a challenge for such methods is to determine the ratio of magnitudes of the two amplitudes, r B , that must be known to extract γ. This is usually handled by including D meson decays to additional final states in the analysis. By contrast, in B − → D * * K − decays the efficiency-corrected ratio of yields of B − → D * * K − → D − π + K − and B − → D * * K − → D + π − K − decays gives $r_B^2$ directly [14]. The decay B − → D * * K − → Dπ 0 K − where the D meson is reconstructed in CP eigenstates can be used to search for CP violation driven by γ. Measurement of the first two of these processes would therefore provide knowledge of r B in B − → D * * K − decays, indicating whether or not a competitive measurement of γ can be made with this approach.

Table 1: Measured properties of neutral D * * states. Where more than one uncertainty is given, the first is statistical and the others systematic.
In this paper, the B − → D + K − π − decay is studied for the first time, with the D + meson reconstructed through the K − π + π + decay mode. The inclusion of charge conjugate processes is implied. The topologically similar B − → D + π − π − decay is used as a control channel and for normalisation of the branching fraction measurement. A large B − → D + K − π − signal yield is found, corresponding to a clear first observation of the decay, and allowing investigation of the DP structure of the decay. The amplitude analysis allows studies of known resonances, searches for higher mass states and measurement of the properties, including the quantum numbers, of any resonances that are observed. The analysis is based on a data sample corresponding to an integrated luminosity of 3.0 fb −1 of pp collision data collected with the LHCb detector, approximately one third of which was collected during 2011 when the collision centre-of-mass energy was √ s = 7 TeV and the rest during 2012 with √ s = 8 TeV. The paper is organised as follows. A brief description of the LHCb detector as well as reconstruction and simulation software is given in Sec. 2. The selection of signal candidates is described in Sec. 3, and the branching fraction measurement is presented in Sec. 4. Studies of the backgrounds and the fit to the B candidate invariant mass distribution are in Sec. 4.1, with studies of the signal efficiency and a definition of the square Dalitz plot (SDP) in Sec. 4.2. Systematic uncertainties on, and the results for, the branching fraction are discussed in Secs. 4.3 and 4.4 respectively. A study of the angular moments of B − → D + K − π − decays is given in Sec. 5, with results used to guide the Dalitz plot analysis that follows. An overview of the Dalitz plot analysis formalism is given in Sec. 6, and details of the implementation of the amplitude analysis are presented in Sec. 7. The evaluation of systematic uncertainties is described in Sec. 8. 
The results and a summary are given in Sec. 9.

LHCb detector
The LHCb detector [15,16] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector [17] surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes [18] placed downstream of the magnet. The polarity of the dipole magnet is reversed periodically throughout data-taking. The tracking system provides a measurement of momentum, p, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV. The minimum distance of a track to a primary vertex, the impact parameter (IP), is measured with a resolution of (15 + 29/p T ) µm, where p T is the component of the momentum transverse to the beam, in GeV. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors [19]. Photon, electron and hadron candidates are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [20].
The trigger [21] consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, in which all tracks with p T > 500 (300) MeV are reconstructed for data collected in 2011 (2012). The software trigger line used in the analysis reported in this paper requires a two-, three- or four-track secondary vertex with significant displacement from the primary pp interaction vertices (PVs). At least one charged particle must have p T > 1.7 GeV and be inconsistent with originating from the PV. A multivariate algorithm [22] is used for the identification of secondary vertices consistent with the decay of a b hadron.
In the offline selection, the objects that fired the trigger are associated with reconstructed particles. Selection requirements can therefore be made not only on the trigger line that fired, but on whether the decision was due to the signal candidate, other particles produced in the pp collision, or a combination of both. Signal candidates are accepted offline if one of the final state particles created a cluster in the hadronic calorimeter with sufficient transverse energy to fire the hardware trigger. These candidates are referred to as "triggered on signal" or TOS. Events that are triggered at the hardware level by another particle in the event, referred to as "triggered independent of signal" or TIS, are also retained. After all selection requirements are imposed, 57 % of events in the sample were triggered by the decay products of the signal candidate (TOS), while the remainder were triggered only by another particle in the event (TIS-only).
Simulated events are used to characterise the detector response to signal and certain types of background events. In the simulation, pp collisions are generated using Pythia [23] with a specific LHCb configuration [24]. Decays of hadronic particles are described by EvtGen [25], in which final state radiation is generated using Photos [26]. The interaction of the generated particles with the detector and its response are implemented using the Geant4 toolkit [27] as described in Ref. [28].

Selection requirements
Most selection requirements are optimised using the B − → D + π − π − control channel. Loose initial selection requirements on the quality of the tracks combined to form the B candidate, as well as on their p, p T and χ 2 IP , are applied to obtain a visible peak in the invariant mass distribution. The χ 2 IP is the difference between the χ 2 of the PV reconstruction with and without the considered particle. Only candidates with invariant mass in the range 1770 < m(K − π + π + ) < 1968 MeV are retained. Further requirements are imposed on the vertex quality (χ 2 vtx ) and flight distance from the associated PV of the B and D candidates. The B candidate must also satisfy requirements on its invariant mass and on the cosine of the angle between the momentum vector and the line joining the PV under consideration to the B vertex (cos θ dir ). The initial selection requirements are found to be about 90 % efficient on simulated signal decays.
Two neural networks [29] are used to further separate signal from background. The first is designed to separate candidates that contain real D + → K − π + π + decays from those that do not; the second separates B − → D + π − π − signal decays from background combinations. Both networks are trained using the D + π − π − control channel, where the sPlot technique [30] is used to statistically separate B − → D + π − π − signal decays from background combinations using the D (B) candidate mass as the discriminating variable for the first (second) network. The first network takes as input properties of the D candidate and its daughter tracks, including information about kinematics, track and vertex quality. The second uses a total of 27 input variables. They include the χ 2 IP of the two "bachelor" pions (i.e. pions that originate directly from the B decay) and properties of the D candidate including its χ 2 IP , χ 2 vtx , cos θ dir , the output of the D neural network and the square of the flight distance divided by its uncertainty squared (χ 2 flight ). Variables associated to the B candidate are also used, including p T , χ 2 IP , χ 2 vtx , χ 2 flight and cos θ dir . The p T asymmetry and track multiplicity in a cone with half-angle of 1.5 units of the plane of pseudorapidity and azimuthal angle (measured in radians) around the B candidate flight direction [31], which contain information about the isolation of the B candidate from the rest of the event, are also used in the network. The neural network input quantities depend only weakly on the kinematics of the B decay. A requirement is imposed on the second neural network output that reduces the combinatorial background by an order of magnitude while retaining about 75 % of the signal.
The selection criteria for the B − → D + K − π − and B − → D + π − π − candidates are identical except for the particle identification (PID) requirement on the bachelor track that differs between the two modes. All five final state particles for each decay mode have PID criteria applied to preferentially select either pions or kaons. Tight requirements are placed on the higher momentum pion from the D + decay and on the bachelor kaon in B − → D + K − π − to suppress backgrounds from D + s → K − K + π + and B − → D + π − π − decays, respectively. The combined efficiency of the PID requirements on the five final state tracks is around 70 % for B − → D + π − π − decays and around 40 % for B − → D + K − π − decays. The PID efficiency depends on the kinematics of the tracks, as described in detail in Sec. 4.2, and is determined using samples of D 0 → K − π + decays selected in data by exploiting the kinematics of the D * + → D 0 π + decay chain to obtain clean samples without using the PID information.
To improve the B candidate invariant mass resolution, track momenta are scaled [32,33] with calibration parameters determined by matching the measured peak of the J/ψ → µ + µ − decay to the known J/ψ mass [9]. Furthermore, a fit to the kinematics and topology of the decay chain [34] is used to adjust the four-momenta of the tracks from the D candidate so that their combined invariant mass matches the world average value for the D + meson [9]. An additional B mass constraint is applied in the calculation of the variables that are used in the Dalitz plot fit.
To remove potential background from misreconstructed Λ + c decays, candidates are rejected if the invariant mass of the D candidate lies in the range 2280-2300 MeV when the proton mass hypothesis is applied to the low momentum pion track. Possible backgrounds from B − meson decays without an intermediate charm meson are suppressed by the requirement on the output value from the first neural network, and any surviving background of this type is removed by requiring that the D candidate vertex is displaced by at least 1 mm from the B decay vertex. The efficiency of this requirement is about 85 %.
Signal candidates are retained for further analysis if they have an invariant mass in the range 5100-5800 MeV. After all selection requirements are applied, fewer than 1 % of events with one candidate also contain a second candidate. Such multiple candidates are retained and treated in the same manner as other candidates; the associated systematic uncertainty is negligible.

Branching fraction determination
The ratio of branching fractions is calculated from the signal yields with event-by-event efficiency corrections applied as a function of square Dalitz plot position. The calculation is
$$\frac{\mathcal{B}(B^{-}\to D^{+}K^{-}\pi^{-})}{\mathcal{B}(B^{-}\to D^{+}\pi^{-}\pi^{-})} = \frac{N^{\rm corr}(B^{-}\to D^{+}K^{-}\pi^{-})}{N^{\rm corr}(B^{-}\to D^{+}\pi^{-}\pi^{-})}, \tag{1}$$
where $N^{\rm corr} = \sum_i W_i/\epsilon_i$ is the efficiency-corrected yield. The index i sums over all candidates in the data sample and $W_i$ is the signal weight for each candidate, which is determined from the fits described in Sec. 4.1 and shown in Figs. 1 and 2, using the sPlot technique [30]. Each fit is performed simultaneously to decays in the TOS and TIS-only categories. The efficiency of candidate i, $\epsilon_i$, is obtained separately for each trigger subsample as described in Sec. 4.2.
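As a rough illustration of the efficiency-corrected yield, the sketch below sums per-candidate signal weights divided by per-candidate efficiencies and takes the ratio of the two modes. The function names and the toy numbers are hypothetical and are not taken from the analysis; real inputs would be sPlot weights and efficiencies read from the SDP efficiency maps.

```python
def corrected_yield(weights, efficiencies):
    """Return N_corr = sum_i W_i / eps_i for one decay mode."""
    if len(weights) != len(efficiencies):
        raise ValueError("need exactly one efficiency per candidate")
    return sum(w / eps for w, eps in zip(weights, efficiencies))


def branching_fraction_ratio(signal, normalisation):
    """Ratio of corrected yields; each argument is a (weights, efficiencies) pair."""
    return corrected_yield(*signal) / corrected_yield(*normalisation)


# Toy inputs (hypothetical, not LHCb data). Signal weights may be negative
# for background-like candidates, which is expected for sPlot weights.
toy_signal = ([1.0, 0.9, -0.1], [0.010, 0.020, 0.010])
toy_normalisation = ([1.0, 1.0, 1.0, 1.0], [0.020, 0.020, 0.020, 0.020])
toy_ratio = branching_fraction_ratio(toy_signal, toy_normalisation)
```

Dividing each weight by its own efficiency, rather than dividing the total yield by an average efficiency, avoids having to assume a Dalitz plot model when correcting the yield.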

Determination of signal and background yields
The candidates that survive the selection requirements comprise signal decays and various categories of background. Combinatorial background arises from random combinations of tracks (possibly including a real D + → K − π + π + decay). Partially reconstructed backgrounds originate from b hadron decays with additional particles that are not part of the reconstructed decay chain. Misidentified decays also originate from b hadron decays, but where one of the final state particles has been incorrectly identified (e.g. a pion as a kaon). The signal (normalisation channel) and background yields are obtained from unbinned maximum likelihood fits to the D + K − π − (D + π − π − ) invariant mass distributions.
Both the B − → D + K − π − and B − → D + π − π − signal shapes are modelled by the sum of two Crystal Ball (CB) functions [35] with a common mean and tails on opposite sides, where the high-mass tail accounts for non-Gaussian reconstruction effects. The ratio of widths of the CB shapes and the relative normalisation of the narrower CB shape are constrained within their uncertainties to the values found in fits to simulated signal samples. The tail parameters of the CB shapes are also fixed to those found in simulation.
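The signal shape described above can be sketched as follows. This is a minimal, unnormalised illustration: the function names are hypothetical, the assignment of the low-mass tail to the narrower component is an assumption, and in a real fit each component would be normalised before the fraction is applied.

```python
import math


def crystal_ball(x, mean, sigma, alpha, n, tail="low"):
    """Unnormalised Crystal Ball shape: Gaussian core with a power-law tail.

    alpha (> 0) is the transition point in units of sigma and n is the tail
    exponent; tail="high" mirrors the shape so the tail lies above the mean.
    """
    t = (x - mean) / sigma
    if tail == "high":
        t = -t
    if t > -alpha:
        return math.exp(-0.5 * t * t)
    # Power-law tail, matched to the Gaussian core at t = -alpha.
    a = (n / alpha) ** n * math.exp(-0.5 * alpha * alpha)
    b = n / alpha - alpha
    return a * (b - t) ** (-n)


def double_crystal_ball(x, mean, sigma_narrow, width_ratio, frac_narrow,
                        alpha_low, n_low, alpha_high, n_high):
    """Sum of two CB shapes with a common mean and tails on opposite sides."""
    narrow = crystal_ball(x, mean, sigma_narrow, alpha_low, n_low, "low")
    wide = crystal_ball(x, mean, sigma_narrow * width_ratio,
                        alpha_high, n_high, "high")
    return frac_narrow * narrow + (1.0 - frac_narrow) * wide
```

The parameterisation mirrors the fit description: one peak position, the width of the narrower component, the ratio of widths and the fraction in the narrower component, with the tail parameters supplied externally.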
The combinatorial backgrounds in both D + K − π − and D + π − π − samples are modelled with linear functions; the slope of this function is allowed to differ between the two trigger subsamples. The decay B − → D * + K − π − is a partially reconstructed background for D + K − π − candidates, where the D * + decays to either D + γ or D + π 0 and the neutral particle is not reconstructed. Similarly the decay B − → D * + π − π − forms a partially reconstructed background to the D + π − π − final state. These are modelled with non-parametric shapes determined from simulated samples. The shapes are characterised by a sharp edge around 100 MeV below the B peak, where the exact position of the edge depends on properties of the decay including the D * + polarisation. The fit quality improves when the shape is allowed to be offset by a small shift that is determined from the data.
Most potential sources of misidentified backgrounds have broad B candidate invariant mass distributions, and hence are absorbed in the combinatorial background component in the fit. The decays B − → D ( * )+ π − π − and B − → D + s K − π − , however, give distinctive shapes in the mass distribution of D + K − π − candidates. For D + π − π − candidates the only significant misidentified background contribution is from B − → D ( * )+ K − π − decays. The misidentified background shapes are also modelled with non-parametric shapes determined from simulated samples.
The simulated samples used to obtain signal and background shapes are generated with flat distributions in the phase space of their SDPs. For B − → D + π − π − and B − → D * + π − π − decays, accurate models of the distributions across the SDP are known [1,2], so the simulated samples are reweighted using the B − → D + π − π − data sample; this affects the shape of the misidentified background component in the fit to the D + K − π − sample. Additionally, the D + and D * + portions of this background are combined according to their known branching fractions. All of the shapes, except for that of the combinatorial background, are common between the two trigger subsamples in each fit, but the signal and background yields in the subsamples are independent. In total there are 15 free parameters in the fit to the D + π − π − sample: yields in each subsample for signal, combinatorial, B − → D ( * )+ K − π − and B − → D * + π − π − backgrounds; the combinatorial slope in each subsample; the double CB peak position, the width of the narrower CB, the ratio of CB widths and the fraction of entries in the narrower CB shape; and the shift parameter of the partially reconstructed background. The result of the D + π − π − fit is shown in Fig. 1 for both trigger subsamples and gives a combined signal yield of approximately 49 000 decays. Component yields are given in Table 2.
There are a total of 17 free parameters in the fit to the D + K − π − sample: yields in each subsample for signal, combinatorial, B − → D ( * )+ π − π − , B − → D + s K − π − and B − → D * + K − π − backgrounds; the combinatorial slope in each subsample; the same signal shape parameters as for the D + π − π − fit; and the shift parameter of the partially reconstructed background. Figure 2 shows the result of the D + K − π − fit for the two trigger subsamples that yield a total of approximately 2000 B − → D + K − π − decays. The yields of all fit components are shown in Table 3. The statistical signal significance, estimated in the conventional way from the change in negative log-likelihood from the fit when the signal component is removed, is in excess of 60 standard deviations (σ).
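A common convention for converting a change in negative log-likelihood into a significance is $\sqrt{2\,\Delta{\rm NLL}}$, which assumes the asymptotic (Wilks) regime with one degree of freedom. The sketch below assumes that convention; the function name is hypothetical and the text does not spell out the exact procedure used.

```python
import math


def significance_from_nll(nll_without_signal, nll_with_signal):
    """Approximate significance in standard deviations from two fit minima.

    Assumes the sqrt(2 * delta NLL) convention with one degree of freedom.
    """
    delta = nll_without_signal - nll_with_signal
    if delta < 0.0:
        raise ValueError("the fit with the signal component should not be worse")
    return math.sqrt(2.0 * delta)
```

Under this convention a significance above 60σ corresponds to a change in negative log-likelihood of more than about 1800 units.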

Signal efficiency
Since both B − → D + K − π − and B − → D + π − π − decays have non-trivial DP distributions, it is necessary to understand the variation of the efficiency across the phase space. Since, moreover, the efficiency variation tends to be strongest close to the kinematic boundaries of the conventional Dalitz plot, it is convenient to model these effects in terms of the SDP defined by variables $m'$ and $\theta'$ which are valid in the range 0 to 1 and are given for the D + K − π − case by
$$m' \equiv \frac{1}{\pi}\arccos\left(2\,\frac{m(D^{+}\pi^{-}) - m^{\rm min}_{D^{+}\pi^{-}}}{m^{\rm max}_{D^{+}\pi^{-}} - m^{\rm min}_{D^{+}\pi^{-}}} - 1\right), \qquad \theta' \equiv \frac{1}{\pi}\,\theta(D^{+}\pi^{-}), \tag{2}$$
where $m^{\rm max}_{D^{+}\pi^{-}} = m_{B^{-}} - m_{K^{-}}$ and $m^{\rm min}_{D^{+}\pi^{-}} = m_{D^{+}} + m_{\pi^{-}}$ are the kinematic boundaries of $m(D^{+}\pi^{-})$ and $\theta(D^{+}\pi^{-})$ is the helicity angle of the D + π − system (the angle between the K − and the D + meson momenta in the D + π − rest frame). For the D + π − π − case, $m'$ and $\theta'$ are defined in terms of the π − π − mass and helicity angle, respectively, since with this choice only the region of the SDP with $\theta'(\pi^{-}\pi^{-}) < 0.5$ is populated due to the symmetry of the two pions in the final state.
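The mapping to the square Dalitz plot can be sketched in a few lines. The code below assumes the usual SDP convention, in which the D + π − mass is mapped through an arccosine onto the unit interval and the helicity angle is divided by π; the function name is hypothetical and the masses are rounded values in MeV used only for illustration.

```python
import math

# Approximate particle masses in MeV (rounded; illustrative assumptions).
M_B, M_D, M_K, M_PI = 5279.3, 1869.6, 493.7, 139.6


def square_dalitz(m_dpi, cos_theta):
    """Map (m(D+ pi-), cos of the D+ pi- helicity angle) to (m', theta')."""
    m_min = M_D + M_PI   # lower kinematic limit of m(D+ pi-)
    m_max = M_B - M_K    # upper kinematic limit of m(D+ pi-)
    arg = 2.0 * (m_dpi - m_min) / (m_max - m_min) - 1.0
    arg = max(-1.0, min(1.0, arg))  # guard against rounding at the edges
    m_prime = math.acos(arg) / math.pi
    theta_prime = math.acos(max(-1.0, min(1.0, cos_theta))) / math.pi
    return m_prime, theta_prime
```

The arccosine stretches the regions close to the kinematic limits, which is where the efficiency is said to vary most rapidly, so that a uniform binning in the SDP resolves them better than a uniform binning in the conventional Dalitz plot.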
Efficiency variation across the SDP is caused by the detector acceptance and by trigger, selection and PID requirements. The efficiency variation is evaluated for both D + K − π − and D + π − π − final states with simulated samples generated uniformly over the SDP. Data-driven corrections are applied to correct for known differences between data and simulation in the tracking, trigger and PID efficiencies, using identical methods to those described in Ref. [5]. The efficiency functions are fitted with two-dimensional cubic splines to smooth out statistical fluctuations due to limited sample size. The efficiency is studied separately for the TOS and TIS-only categories. The efficiency maps for each trigger subsample are shown for B − → D + K − π − decays in Fig. 3. Regions of relatively high efficiency are seen where all decay products have comparable momentum in the B rest frame; the efficiency drops sharply in regions with a low momentum bachelor track due to geometrical effects. The efficiency maps are used to calculate the ratio of branching fractions and also as inputs to the D + K − π − Dalitz plot fit.

Systematic uncertainties
Table 4 summarises the systematic uncertainties on the measurement of the ratio of branching fractions. Selection effects cancel in the ratio of branching fractions, except for inefficiency due to the Λ + c veto. The invariant mass fits are repeated both with a wider veto (2270-2310 MeV) and with no veto, and changes in the yields are used to assign a relative systematic uncertainty of 0.2 %.
To estimate the uncertainty arising from the choice of invariant mass fit model, the D + K − π − mass fit is varied by replacing the signal shape with the sum of two bifurcated Gaussian functions, removing the smoothing of the non-parametric functions, using exponential and second-order polynomial functions to describe the combinatorial background, varying fixed parameters within their uncertainties and varying the binning of histograms used to reweight the simulated background samples. For the D + π − π − fit the same variations are made. The relative changes in the yields are summed in quadrature to give a relative systematic uncertainty on the ratio of branching fractions of 2.0 %.
The systematic uncertainty due to PID is estimated by accounting for three sources: the intrinsic uncertainty of the calibration (1.0 %); possible differences in the kinematics of tracks in simulated samples, used to reweight the calibration data samples, to those in the data (1.7 %); the granularity of the binning in the reweighting procedure (0.7 %). Combining these in quadrature, the total relative systematic uncertainty from PID is 2.1 %.
The bins of the efficiency maps are varied within uncertainties to make 100 new efficiency maps, for both D + K − π − and D + π − π − modes. The efficiency-corrected yields are evaluated for each new map and their distributions are fitted with Gaussian functions. The widths of these are used to assign a relative systematic uncertainty on the ratio of branching fractions of 0.8 %.
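The quadrature sums quoted in this section can be checked with a short calculation. The helper name below is hypothetical; the inputs are the relative uncertainties (in percent) quoted in the text, and combining all four sources in quadrature into a single total is an assumption about how they are combined.

```python
import math


def quadrature_sum(*components):
    """Combine independent uncertainties in quadrature."""
    return math.sqrt(sum(c * c for c in components))


# PID: calibration, kinematic differences, binning granularity (percent).
pid_total = quadrature_sum(1.0, 1.7, 0.7)

# Lambda_c veto, mass fit model, PID and efficiency maps (percent).
overall = quadrature_sum(0.2, 2.0, 2.1, 0.8)

print(f"PID: {pid_total:.1f} %, overall: {overall:.1f} %")
```

The PID combination reproduces the quoted 2.1 %, and the overall value of about 3.0 % is consistent with the relative size of the systematic uncertainty on the branching fraction given in the abstract (0.22 out of 7.31).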
A number of additional cross-checks are performed to test the branching fraction result. The neural network and PID requirements are both tightened and loosened. The data sample is divided by dipole magnet polarity and year of data taking. The branching fraction is also calculated separately for TOS and TIS-only events. All cross-checks give consistent results.

Results
The ratio of branching fractions is found to be where the first uncertainty is statistical and the second systematic. The statistical uncertainty includes contributions from the event weighting used in Eq. (1) and from the shape parameters that are allowed to vary in the fit [36]. The world average value of B(B − → D + π − π − ) = (1.07 ± 0.05) × 10 −3 [9] assumes that B + B − and B 0 B 0 are produced equally in the decay of the Υ(4S) resonance.
This allows the branching fraction of B − → D + K − π − decays to be determined as
$$\mathcal{B}(B^{-}\to D^{+}K^{-}\pi^{-}) = (7.31 \pm 0.19 \pm 0.22 \pm 0.39) \times 10^{-5},$$
where the third uncertainty is from B(B − → D + π − π − ). This measurement represents the first observation of the B − → D + K − π − decay.

Study of angular moments
To investigate which amplitudes should be included in the DP analysis of B − → D + K − π − decays, a study of its angular moments is performed. Such an analysis is particularly useful for B − → D + K − π − decays because resonant contributions are only expected to appear in the D + π − combination, and therefore the distributions should be free of effects from reflections that make them more difficult to interpret. The analysis is performed by calculating moments from the Legendre polynomials P L of order up to 2J max , where J max is the maximum spin of the resonances considered. Each candidate is weighted according to its value of P L (cos θ(D + π − )) with an efficiency correction applied, and background contributions subtracted. The results for J max = 3 are shown in Fig. 4 for the D + π − invariant mass range 2.0-3.0 GeV. The distributions of the moments $\langle P_5\rangle$ and $\langle P_6\rangle$ are compatible with being flat, which implies that there are no significant spin 3 contributions. Considering only contributions up to spin 2, the following expressions are used to interpret Fig. 4:
$$\begin{aligned}
\langle P_0\rangle &= |h_0|^2 + |h_1|^2 + |h_2|^2,\\
\langle P_1\rangle &= \frac{2}{\sqrt{3}}\,|h_0||h_1|\cos(\delta_0-\delta_1) + \frac{4}{\sqrt{15}}\,|h_1||h_2|\cos(\delta_1-\delta_2),\\
\langle P_2\rangle &= \frac{2}{\sqrt{5}}\,|h_0||h_2|\cos(\delta_0-\delta_2) + \frac{2}{5}\,|h_1|^2 + \frac{2}{7}\,|h_2|^2,\\
\langle P_3\rangle &= \frac{6}{7}\sqrt{\frac{3}{5}}\,|h_1||h_2|\cos(\delta_1-\delta_2),\\
\langle P_4\rangle &= \frac{2}{7}\,|h_2|^2,
\end{aligned}$$
where S-, P- and D-wave contributions are denoted by amplitudes $h_j e^{i\delta_j}$ (j = 0, 1, 2 respectively). The D * 2 (2460) 0 resonance is clearly seen in the $\langle P_4\rangle$ distribution of Fig. 4(e). The distribution of $\langle P_3\rangle$ shows interference between spin 1 and 2 contributions, indicating the presence of a broad, possibly nonresonant, spin 1 contribution at low m(D + π − ). The difference in shape between $\langle P_1\rangle$ and $\langle P_3\rangle$ shows interference between spin 1 and 0 indicating that a broad spin 0 component is similarly needed.

Dalitz plot analysis formalism
A Dalitz plot [37] is a representation of the phase-space for a three-body decay in terms of two of the three possible two-body invariant mass squared combinations. In B − → D + K − π − decays, resonances are expected in the m 2 (D + π − ) combination, therefore this and m 2 (D + K − ) are chosen to define the DP axes. For a fixed B − mass, all other relevant kinematic quantities can be calculated from these two invariant mass squared combinations.
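To illustrate how the remaining kinematic quantities follow from the two Dalitz plot coordinates, the sketch below derives the third invariant mass squared, the allowed range of m 2 (D + K − ) at fixed m 2 (D + π − ), and the cosine of the D + π − helicity angle from standard three-body kinematics. The function names are hypothetical and the masses are rounded values in MeV.

```python
import math

# Approximate particle masses in MeV (rounded; illustrative assumptions).
M_B, M_D, M_K, M_PI = 5279.3, 1869.6, 493.7, 139.6


def m2_kpi(m2_dpi, m2_dk):
    """Third invariant mass squared; the three always sum to a constant."""
    return M_B**2 + M_D**2 + M_K**2 + M_PI**2 - m2_dpi - m2_dk


def m2_dk_limits(m2_dpi):
    """Minimum and maximum m^2(D+ K-) allowed at a given m^2(D+ pi-)."""
    m_dpi = math.sqrt(m2_dpi)
    # Energies and momenta of the D+ and K- in the D+ pi- rest frame.
    e_d = (m2_dpi - M_PI**2 + M_D**2) / (2.0 * m_dpi)
    e_k = (M_B**2 - m2_dpi - M_K**2) / (2.0 * m_dpi)
    p_d = math.sqrt(e_d**2 - M_D**2)
    p_k = math.sqrt(e_k**2 - M_K**2)
    e_sum_sq = (e_d + e_k) ** 2
    return e_sum_sq - (p_d + p_k) ** 2, e_sum_sq - (p_d - p_k) ** 2


def cos_helicity(m2_dpi, m2_dk):
    """Cosine of the angle between K- and D+ momenta in the D+ pi- frame."""
    lo, hi = m2_dk_limits(m2_dpi)
    return (lo + hi - 2.0 * m2_dk) / (hi - lo)
```

At fixed m 2 (D + π − ) the helicity-angle cosine is linear in m 2 (D + K − ), which is why angular structure in the D + π − system appears directly as structure along the other Dalitz plot axis.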
The complex decay amplitude is described using the isobar approach [38][39][40], where the total amplitude is calculated as a coherent sum of amplitudes from resonant and nonresonant intermediate processes. The total amplitude is then given by
$$\mathcal{A}\left(m^2(D^{+}\pi^{-}), m^2(D^{+}K^{-})\right) = \sum_{j=1}^{N} c_j\, F_j\left(m^2(D^{+}\pi^{-}), m^2(D^{+}K^{-})\right), \tag{8}$$
where $c_j$ are complex coefficients giving the relative contribution of each intermediate process.
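The $F_j\left(m^2(D^{+}\pi^{-}), m^2(D^{+}K^{-})\right)$ terms contain the resonance dynamics, which are composed of several terms and are normalised such that the integral of the squared magnitude over the DP is unity for each term. For a D + π − resonance
$$F\left(m^2(D^{+}\pi^{-}), m^2(D^{+}K^{-})\right) = R\left(m(D^{+}\pi^{-})\right) \times X(|\vec{p}|\,r_{\rm BW}) \times X(|\vec{q}|\,r_{\rm BW}) \times T(\vec{p},\vec{q}), \tag{9}$$
where the functions R, X and T are described below, and $\vec{p}$ and $\vec{q}$ are the bachelor particle momentum and the momentum of one of the resonance daughters, respectively, both evaluated in the D + π − rest frame. The X(z) terms, where $z = |\vec{q}|\,r_{\rm BW}$ or $|\vec{p}|\,r_{\rm BW}$, are Blatt-Weisskopf barrier factors [41] with barrier radius $r_{\rm BW}$, and are given by
$$\begin{aligned}
L = 0:&\quad X(z) = 1,\\
L = 1:&\quad X(z) = \sqrt{\frac{1+z_0^2}{1+z^2}},\\
L = 2:&\quad X(z) = \sqrt{\frac{z_0^4+3z_0^2+9}{z^4+3z^2+9}},\\
L = 3:&\quad X(z) = \sqrt{\frac{z_0^6+6z_0^4+45z_0^2+225}{z^6+6z^4+45z^2+225}},
\end{aligned} \tag{10}$$
where $z_0$ is the value of z when the invariant mass is equal to the pole mass of the resonance and L is the spin of the resonance. For a D + π − resonance, since the B − meson has zero spin, L is also the orbital angular momentum between the resonance and the kaon. The barrier radius, $r_{\rm BW}$, is taken to be 4.0 GeV −1 ≈ 0.8 fm [5,42] for all resonances. The terms $T(\vec{p},\vec{q})$ describe the angular probability distribution and are given in the Zemach tensor formalism [43,44] by
$$\begin{aligned}
L = 0:&\quad T(\vec{p},\vec{q}) = 1,\\
L = 1:&\quad T(\vec{p},\vec{q}) = -2\,\vec{p}\cdot\vec{q},\\
L = 2:&\quad T(\vec{p},\vec{q}) = \frac{4}{3}\left[3(\vec{p}\cdot\vec{q})^2 - (|\vec{p}||\vec{q}|)^2\right],\\
L = 3:&\quad T(\vec{p},\vec{q}) = -\frac{24}{15}\left[5(\vec{p}\cdot\vec{q})^3 - 3(\vec{p}\cdot\vec{q})(|\vec{p}||\vec{q}|)^2\right],
\end{aligned} \tag{11}$$
which are proportional to the Legendre polynomials, $P_L(x)$, where x is the cosine of the angle between $\vec{p}$ and $\vec{q}$ (referred to as the helicity angle). The function $R\left(m(D^{+}\pi^{-})\right)$ of Eq. (9) is the mass lineshape. The resonant contributions considered in the DP model are described by the relativistic Breit-Wigner (RBW) function
$$R(m) = \frac{1}{(m_0^2 - m^2) - i\,m_0\,\Gamma(m)}, \tag{12}$$
where the mass-dependent decay width is
$$\Gamma(m) = \Gamma_0 \left(\frac{q}{q_0}\right)^{2L+1} \left(\frac{m_0}{m}\right) X^2(q\,r_{\rm BW}), \tag{13}$$
where $q_0$ is the value of $q = |\vec{q}|$ for $m = m_0$. Virtual contributions, from resonances with pole masses outside the kinematically accessible region of the phase space, can also be modelled by this shape with one modification: the pole mass $m_0$ is replaced with $m_0^{\rm eff}$, a mass in the kinematically allowed region, in the calculation of the parameter $q_0$.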
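A minimal sketch of the barrier factor and relativistic Breit-Wigner lineshape is given below, assuming the conventional forms for these functions (ratios of barrier polynomials evaluated at z and at the pole value z0, and a width that scales with the daughter momentum to the power 2L+1). The function names are hypothetical, and the resonance parameters used in the example call are rounded, illustrative values in GeV rather than inputs of the analysis.

```python
import math

R_BW = 4.0  # barrier radius in GeV^-1, as quoted in the text


def barrier_polynomial(z, spin):
    """Denominator polynomial of the Blatt-Weisskopf factor for a given spin."""
    if spin == 0:
        return 1.0
    if spin == 1:
        return 1.0 + z**2
    if spin == 2:
        return z**4 + 3.0 * z**2 + 9.0
    if spin == 3:
        return z**6 + 6.0 * z**4 + 45.0 * z**2 + 225.0
    raise ValueError("only spins 0 to 3 are implemented in this sketch")


def blatt_weisskopf(z, z0, spin):
    """Barrier factor X(z), normalised to 1 at the pole (z = z0)."""
    return math.sqrt(barrier_polynomial(z0, spin) / barrier_polynomial(z, spin))


def daughter_momentum(m, m_a, m_b):
    """Momentum of either daughter in the rest frame of a parent of mass m."""
    return math.sqrt((m**2 - (m_a + m_b) ** 2) * (m**2 - (m_a - m_b) ** 2)) / (2.0 * m)


def relativistic_breit_wigner(m, m0, gamma0, spin, m_a, m_b):
    """Complex RBW amplitude with a mass-dependent width."""
    q = daughter_momentum(m, m_a, m_b)
    q0 = daughter_momentum(m0, m_a, m_b)
    x = blatt_weisskopf(q * R_BW, q0 * R_BW, spin)
    gamma = gamma0 * (q / q0) ** (2 * spin + 1) * (m0 / m) * x**2
    return 1.0 / complex(m0**2 - m**2, -m0 * gamma)


# Illustrative spin-2 state near 2.46 GeV decaying to D+ pi- (rounded values).
amp = relativistic_breit_wigner(2.40, 2.4626, 0.049, 2, 1.8696, 0.1396)
```

At the pole mass the amplitude is purely imaginary with magnitude 1/(m0 Γ0), which gives a quick sanity check of any implementation.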
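This effective mass is defined by the ad hoc formula [5]
$$m_0^{\rm eff}(m_0) = m^{\rm min} + (m^{\rm max} - m^{\rm min})\left[1 + \tanh\left(\frac{m_0 - \frac{m^{\rm min}+m^{\rm max}}{2}}{m^{\rm max} - m^{\rm min}}\right)\right], \tag{14}$$
where $m^{\rm max}$ and $m^{\rm min}$ are the upper and lower limits of the kinematically allowed range, respectively. For virtual contributions, only the tail of the RBW function enters the Dalitz plot.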
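Given the large available phase-space in the B decay, it is possible to have nonresonant amplitudes (i.e. contributions that are not from any known resonance, including virtual states) that vary across the Dalitz plot. A model that has been found to describe well nonresonant contributions in several B decay DP analyses is an exponential form factor (EFF) [45],
$$R(m) = e^{-\alpha m^2}, \tag{15}$$
where m is a two-body (in this case Dπ) invariant mass and α is a shape parameter that must be determined from the data. Neglecting reconstruction effects, the DP probability density function would be
$$\mathcal{P}_{\rm phys}\left(m^2(D^{+}\pi^{-}), m^2(D^{+}K^{-})\right) = \frac{|\mathcal{A}\left(m^2(D^{+}\pi^{-}), m^2(D^{+}K^{-})\right)|^2}{\iint_{\rm DP} |\mathcal{A}|^2\, dm^2(D^{+}\pi^{-})\, dm^2(D^{+}K^{-})}, \tag{16}$$
where the dependence of $\mathcal{A}$ on the DP position has been suppressed in the denominator for brevity. The complex coefficients, given by $c_j$ in Eq. (8), are the primary results of most Dalitz plot analyses. However, these depend on the choice of normalisation, phase convention and amplitude formalism in each analysis. Fit fractions and interference fit fractions are also reported as these provide a convention-independent method to allow meaningful comparisons of results. The fit fraction is defined as the integral of the amplitude for a single component squared divided by that of the coherent matrix element squared for the complete Dalitz plot,
$$FF_j = \frac{\iint_{\rm DP} \left|c_j F_j\left(m^2(D^{+}\pi^{-}), m^2(D^{+}K^{-})\right)\right|^2 dm^2(D^{+}\pi^{-})\, dm^2(D^{+}K^{-})}{\iint_{\rm DP} |\mathcal{A}|^2\, dm^2(D^{+}\pi^{-})\, dm^2(D^{+}K^{-})}. \tag{17}$$
The fit fractions do not necessarily sum to unity due to the potential presence of net constructive or destructive interference, described by interference fit fractions defined for i < j only by
$$FF_{ij} = \frac{\iint_{\rm DP} 2\,{\rm Re}\left[c_i c_j^{*} F_i F_j^{*}\right] dm^2(D^{+}\pi^{-})\, dm^2(D^{+}K^{-})}{\iint_{\rm DP} |\mathcal{A}|^2\, dm^2(D^{+}\pi^{-})\, dm^2(D^{+}K^{-})}, \tag{18}$$
where the dependence of $F_i^{(*)}$ and $\mathcal{A}$ on the DP position has been omitted.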
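The relationship between fit fractions and interference fit fractions can be illustrated numerically. The sketch below replaces the Dalitz plot integrals with sums over a set of equal-area grid points, which is an assumption made for brevity; the function name and the toy amplitudes are hypothetical.

```python
def fit_fractions(coeffs, amplitudes):
    """Return (fit fractions, interference fit fractions) on a common grid.

    coeffs[j] is the complex coefficient c_j and amplitudes[j][p] is F_j
    evaluated at grid point p. Equal-area grid cells are assumed, so the
    integrals reduce to plain sums.
    """
    n_points = len(amplitudes[0])
    terms = [[c * f for f in amp] for c, amp in zip(coeffs, amplitudes)]
    total = sum(abs(sum(t[p] for t in terms)) ** 2 for p in range(n_points))
    fractions = [sum(abs(v) ** 2 for v in t) / total for t in terms]
    interference = {}
    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):
            cross = sum(2.0 * (terms[i][p] * terms[j][p].conjugate()).real
                        for p in range(n_points))
            interference[(i, j)] = cross / total
    return fractions, interference


# Toy model with two components on three grid points (hypothetical values).
toy_ff, toy_iff = fit_fractions(
    [1.0 + 0.0j, 0.8 - 0.3j],
    [[1.0, 1.0j, 0.5], [0.5, 0.5, 0.5]],
)
```

By construction the fit fractions and the interference fit fractions together sum to unity, so a sum of fit fractions away from one signals net interference between components.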

Dalitz plot fit
The Laura++ [46] package is used to perform the Dalitz plot fit, with the two trigger subsamples fitted simultaneously using the Jfit method [47]. The two subsamples have separate signal and background yields, efficiency maps and background SDP distributions, but all parameters of the signal model are common. The likelihood function that is used is
$$\mathcal{L} = \prod_{i}^{N_c} \left[\sum_{k} N_k\, \mathcal{P}_k\left(m_i^2(D^{+}\pi^{-}), m_i^2(D^{+}K^{-})\right)\right], \tag{19}$$
where the index i runs over $N_c$ candidates, while k distinguishes the signal and background components with $N_k$ the yield in each component. The probability density function for signal events, $\mathcal{P}_{\rm sig}$, is given by Eq. (16) where the $|\mathcal{A}\left(m^2(D^{+}\pi^{-}), m^2(D^{+}K^{-})\right)|^2$ terms are multiplied by the efficiency function described in Sec. 4.2. The mass resolution is approximately 2.4 MeV, which is much lower than the width of the narrowest contribution to the Dalitz plot (∼ 50 MeV); therefore, this has negligible effect on the likelihood and is not considered further. The signal and background yields that enter the Dalitz plot fit are taken from the mass fit described in Sec. 4.1. Only candidates in the signal region, defined as ±2.5σ around the B signal peak, where σ is the width of the peak, are used in the Dalitz plot fit. Within this region, in the TOS subsample the result of the B candidate invariant mass fit corresponds to yields of 1060 ± 35, 37 ± 6, 26 ± 8 and 16 ± 4 in the signal, combinatorial background, D ( * )+ π − π − and D + s K − π − components, respectively. The equivalent yields in the TIS-only subsample are 849 ± 30, 39 ± 6, 5 ± 5 and 9 ± 3 candidates. The contribution from D * + K − π − decays is negligible in the signal window. The distributions of the candidates in the signal region over the DP and SDP are shown in Fig. 5.
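The structure of such a likelihood can be sketched as a negative log-likelihood in which each candidate contributes the yield-weighted sum of the component densities. This is only a schematic, assuming a product over candidates of sums over components; it is not the Laura++ implementation, the function and component names are hypothetical, and flat toy densities stand in for the real signal and background models.

```python
import math


def negative_log_likelihood(candidates, yields, pdfs):
    """-ln L for a product over candidates of sum_k N_k * P_k(point)."""
    nll = 0.0
    for point in candidates:
        density = sum(yields[name] * pdfs[name](point) for name in yields)
        nll -= math.log(density)
    return nll


# Central values of the TOS yields in the signal region quoted in the text.
tos_yields = {"signal": 1060.0, "combinatorial": 37.0,
              "D(*)pipi": 26.0, "DsKpi": 16.0}

# Toy densities: flat over the unit square (placeholders for real models).
flat_pdfs = {name: (lambda point: 1.0) for name in tos_yields}

toy_candidates = [(0.2, 0.3), (0.7, 0.1)]  # hypothetical (m', theta') points
toy_nll = negative_log_likelihood(toy_candidates, tos_yields, flat_pdfs)

# Signal purity implied by the central values of the quoted TOS yields.
tos_purity = tos_yields["signal"] / sum(tos_yields.values())
```

With the quoted central values the TOS signal region is about 93 % signal, which is why the background components can be described with histograms taken from simulation and sidebands.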
The SDP distributions of the D ( * )+ π − π − and D + s K − π − background sources are obtained from simulated samples using the same procedures as described for their invariant mass distributions in Sec. 4.1. The distribution of combinatorial background events is modelled by considering D + K − π − candidates in the high-mass sideband 5500-5800 MeV, with contributions from D ( * )+ π − π − decays in this region subtracted. The dependence of the SDP distribution on B candidate mass was investigated and found to be negligible. The SDP distributions of these backgrounds are shown in Fig. 6. These histograms are used to model the background contributions in the Dalitz plot fit.
Table 5: Signal contributions to the fit model, where parameters and uncertainties are taken from Ref. [9]. States labelled with subscript v are virtual contributions. (Columns: resonance, spin, DP axis, model, parameters; some parameters are determined from data, see Table 6.)
Using the results of the moments analysis of Sec. 5 as a guide, the nominal Dalitz plot fit model for B − → D + K − π − decays is determined by considering several resonant, virtual and nonresonant amplitudes. The resulting model consists of the seven amplitudes listed in Table 5: three resonances, two virtual resonances and two nonresonant terms. Parts of the model are known to be approximations. In particular both S- and P-waves in the Dπ system are modelled with overlapping broad structures. The nominal model gives a better description of the data than any of the alternative models considered; alternative models are used to assign systematic uncertainties as discussed in Sec. 8. The free parameters in the fit are the c j terms introduced in Eq. (8), with the real and imaginary parts of these complex coefficients determined for each amplitude in the fit model. The D * 2 (2460) 0 component, as the reference amplitude, is the exception with real and imaginary parts fixed to 1 and 0, respectively. Fit fractions and interference fit fractions are derived from these free parameters, as are the magnitudes and phases of the complex coefficients. Statistical uncertainties for the derived parameters are calculated using large samples of simulated pseudoexperiments to ensure that non-trivial correlations are accounted for. Several other parameters are also determined from the fit as described below.
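The need for an ensemble-based treatment of derived quantities can be illustrated with a simplified stand-in. Rather than generating and refitting full pseudoexperiments, the sketch below samples the real and imaginary parts of one complex coefficient from an assumed correlated Gaussian and reads off the spread of the derived magnitude and phase. The Gaussian assumption, the input values and the function name `magnitude_phase_spread` are hypothetical.

```python
import cmath
import math
import random
import statistics

def magnitude_phase_spread(re_mean, im_mean, re_sigma, im_sigma, rho,
                           n_toys=20000, seed=1):
    """Spread of |c| and arg(c) when (Re c, Im c) follow a correlated Gaussian."""
    rng = random.Random(seed)
    ref_phase = cmath.phase(complex(re_mean, im_mean))
    mags, phases = [], []
    for _ in range(n_toys):
        # Build a correlated pair from two independent unit normals.
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        re = re_mean + re_sigma * u
        im = im_mean + im_sigma * (rho * u + math.sqrt(1.0 - rho * rho) * v)
        c = complex(re, im)
        mags.append(abs(c))
        # Phase relative to the central value, avoiding 2*pi wrap-around.
        phases.append(cmath.phase(c * cmath.exp(-1j * ref_phase)))
    return statistics.stdev(mags), statistics.stdev(phases)

# Toy usage: a coefficient near 1 + 0i with small uncorrelated uncertainties.
s_mag, s_phase = magnitude_phase_spread(1.0, 0.0, 0.01, 0.01, 0.0)
```

Changing `rho` in this toy alters the spreads of the derived quantities, which is the kind of correlation effect that the pseudoexperiment ensembles are designed to capture.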
In Dalitz plot fits it is common for the minimisation procedure to find local minima of the likelihood function. To find the global minimum, the fit is performed many times using randomised starting values for the complex coefficients. In addition to the global minimum of the likelihood, corresponding to the results reported below, several additional minima are found. Two of these have negative log-likelihood (NLL) values close to that of the global minimum. The main differences between secondary minima and the global minimum are the interference patterns in the Dπ S- and P-waves, as shown in App. A.
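The strategy of repeating the fit from randomised starting values can be sketched on a toy problem. The example below assumes a one-dimensional function with one global and one local minimum, and a plain gradient descent in place of the real minimiser; `toy_nll`, `minimise` and `multi_start` are hypothetical names and nothing here reproduces the actual fit.

```python
import random

def toy_nll(x):
    """Toy function with a global minimum near x = -1 and a local one near x = +1."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def minimise(f, x0, step=0.01, n_iter=5000, eps=1e-6):
    """Plain gradient descent with a numerical derivative (illustration only)."""
    x = x0
    for _ in range(n_iter):
        grad = (f(x + eps) - f(x - eps)) / (2.0 * eps)
        x -= step * grad
    return x

def multi_start(f, n_starts=20, lo=-2.0, hi=2.0, seed=7):
    """Minimise from random starting points; return the solutions, best first."""
    rng = random.Random(seed)
    results = [minimise(f, rng.uniform(lo, hi)) for _ in range(n_starts)]
    results.sort(key=f)
    return results

solutions = multi_start(toy_nll)
best = solutions[0]  # candidate global minimum; later entries may be local minima
```

Sorting all solutions by their function value also makes it easy to see how close the secondary minima lie to the best one.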
The shape parameters, defined in Eq. (15), for the nonresonant components are determined from the fit to data to be 0.36 ± 0.03 GeV −2 and 0.36 ± 0.04 GeV −2 for the S-wave and P-wave, respectively, where the uncertainties are statistical only. The mass and width of the D * 2 (2460) 0 resonance are determined from the fit to improve the fit quality. Since the mass and width of the D * J (2760) 0 state have not been precisely determined by previous experiments, these parameters are also allowed to vary in the fit. The masses and widths of the D * 2 (2460) 0 and D * J (2760) 0 are reported in Table 6. The spin of the D * J (2760) 0 state has not been determined previously. Fits are performed with all spin values up to 3, and spin 1 is found to be preferred, with changes relative to the spin 0, 2 and 3 hypotheses of 2∆NLL = 37.3, 49.5 and 48.2 units, respectively. For comparison, the value of 2∆NLL obtained from a fit with the D * 1 (2760) 0 state excluded is 75.0 units. The alternative models discussed in Sec. 8 give very similar values and therefore do not affect the conclusion that the D * J (2760) 0 state has spin 1. The values of the complex coefficients and fit fractions returned by the fit are shown in Table 7. Results for the interference fit fractions are given in App. B. The total fit fraction exceeds unity mostly due to interference between the D * 0 (2400) 0 and S-wave nonresonant contributions.
The consistency of the fit model and the data is evaluated in several ways. Numerous one-dimensional projections (including several shown below and those shown in Sec. 5) show good agreement. A two-dimensional χ 2 value is determined by comparing the data and the fit model in 100 equally populated bins across the SDP. The pull, i.e. the difference between the data and fit model divided by the uncertainty, is shown with this SDP binning in Fig. 7. The χ 2 value obtained is found to be within the bulk of the distribution expected from simulated pseudoexperiments. Other unbinned fit quality tests [48] also show acceptable agreement between the data and the fit model. Figure 8 shows projections of the nominal fit model and the data onto m(Dπ), m(DK) and m(Kπ). Zooms are provided around the resonant structures on m(Dπ) in Fig. 9. Projections of the cosine of the helicity angle of the Dπ system are shown in Fig. 10. Good agreement is seen between the data and the fit model.
The effect of imperfect knowledge of the background distributions over the SDP is tested by varying the histograms used to model the shapes within their statistical uncertainties. For D ( * )+ π − π − decays the ratio of the D * + and D + contributions is varied. Where applicable, the reweighting of the SDP distribution of the simulated samples is removed.
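The binned pull and χ 2 comparison used earlier in this section to assess fit quality can be sketched as follows. The sketch assumes that the per-bin uncertainty is the Poisson uncertainty on the expected count; the text does not specify the uncertainty used, so this choice and the function name `pulls_and_chi2` are assumptions.

```python
import math

def pulls_and_chi2(observed, expected):
    """Per-bin pulls and their sum of squares.

    observed : data counts per bin
    expected : fit-model counts per bin (must be positive)
    The uncertainty is assumed to be sqrt(expected) in each bin.
    """
    pulls = [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]
    chi2 = sum(p * p for p in pulls)
    return pulls, chi2

# Toy usage with three bins.
toy_pulls, toy_chi2 = pulls_and_chi2([20, 16, 25], [16.0, 16.0, 25.0])
```

With equally populated bins the expected counts are similar in every bin, so the pulls are directly comparable across the SDP.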
The uncertainty related to the knowledge of the variation of efficiency across the SDP is determined by varying the efficiency histograms before the spline fit is performed. The central bin in each cell of 3 × 3 bins is varied by its statistical uncertainty and the surrounding bins in the cell are varied by interpolation. This procedure accounts for possible correlations between the bins, since a systematic effect on a given bin is likely also to affect neighbouring bins. The effects on the DP fit results are assigned as systematic uncertainties. An additional systematic uncertainty is assigned by varying the binning scheme of the control sample used to determine the PID efficiencies. Systematic uncertainties related to possible intrinsic fit bias are investigated using an ensemble of pseudoexperiments. Differences between the input and fitted values from the ensemble for the fit parameters are found to be small. Systematic uncertainties are assigned as the sum in quadrature of the difference between the input and output values and the uncertainty on the mean of the output value determined from a fit to the ensemble.
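The fit-bias prescription described above, the sum in quadrature of the input-output difference and the uncertainty on the mean, can be sketched as follows. The sketch uses the standard error of the mean of the ensemble in place of the uncertainty obtained from a fit to the ensemble, which is an assumption, as is the function name `fit_bias_systematic`.

```python
import math
import statistics

def fit_bias_systematic(input_value, fitted_values):
    """Quadrature sum of the mean bias and the uncertainty on the mean.

    input_value   : value used to generate the pseudoexperiments
    fitted_values : fitted results, one per pseudoexperiment
    """
    mean = statistics.fmean(fitted_values)
    # Standard error of the mean, standing in for the fitted uncertainty.
    sem = statistics.stdev(fitted_values) / math.sqrt(len(fitted_values))
    return math.sqrt((mean - input_value) ** 2 + sem ** 2)

# Toy usage with four pseudoexperiments.
toy_bias = fit_bias_systematic(1.0, [1.1, 1.3, 1.2, 1.2])
```

Including the uncertainty on the mean prevents a small ensemble from yielding an artificially small systematic uncertainty when the observed bias happens to be close to zero.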
Systematic uncertainties due to fixed parameters in the fit model are determined by varying the parameters within their uncertainties and repeating the fit. The fixed parameters considered are the mass and width of the D * 0 (2400) 0 resonance and the Blatt-Weisskopf barrier radius, r BW . The mass and width are varied by the uncertainties shown in Table 5 and the barrier radius is varied between 3 and 5 GeV −1 [5]. For each fit parameter, the difference compared to the nominal fit model is assigned as a systematic uncertainty for each source.
The other parameters are assigned as the systematic uncertainties. A Dalitz plot analysis of B 0 s → D 0 K + π − decays revealed that a structure at m(D 0 K + ) ∼ 2.86 GeV has both spin 1 and spin 3 components [4,5]. Although there is no evidence for a spin 3 resonance in this analysis, the excess at m(D + π − ) ∼ 2.76 GeV could have a similar composition. A putative D * 3 (2760) resonance is added to the fit model, and the effect on the other parameters is used to assign systematic uncertainties.
The EFF lineshapes used to model the nonresonant S- and P-wave contributions are replaced by a power-law model and the change in the fit parameters is used as a systematic uncertainty. The dependence of the results on the effective pole mass description of Eq. (14) that is used for the virtual resonance contributions is found by using a fixed width in Eq. (12), removing the dependency on m eff 0 . The total experimental and model systematic uncertainties for fit fractions and complex coefficients are summarised in Tables 8 and 9, respectively. The contributions for the fit fractions, masses and widths are broken down in Tables 10 and 11. Similar tables summarising the systematic uncertainties on the interference fit fractions are given in App. B. The largest source of experimental systematic uncertainty on the fit fractions is  In general, the model uncertainties are larger than the experimental systematic uncertainties for the fit fractions and the masses and widths.
Several cross-checks are performed to confirm the stability of the results. The data sample is divided into two parts depending on the charge of the B candidate, the polarity of the magnet and the year of data taking. Selection effects are also checked by varying the requirement on the neural network output variable and the PID criteria applied to the bachelor kaon. A fit is performed for each of the subsamples individually and each is seen to be consistent with the default fit results, although in some cases one of the secondary minima described in App. A becomes the preferred solution. To cross-check the amplitude model, the fit is repeated many times with an extra resonance with fixed mass, width and spin included in the model. All possible mass and width values, and spin up to 3, are considered. None of the additional resonances are found to contribute significantly.

Results and summary
The results for the complex coefficients are reported in Tables 12 and 13 in terms of real and imaginary parts and of magnitudes and phases, respectively. The results for the fit fractions are given in Table 14 Table 15; they cannot be converted into absolute branching fractions because the branching fractions for
The measurement of B(B − → D + K − π − ) corresponds to the first observation of this decay mode. Therefore, the resonant contributions to the decay are also first observations. The significance of the B − → D * 1 (2760) 0 K − observation is investigated by removing the corresponding resonance from the DP model. A fit without the D * 1 (2760) 0 component increases the value of 2∆NLL by 75.0 units, corresponding to a high statistical significance. Only the systematic effects due to uncertainties in the DP model could in principle significantly change the conclusion regarding the need for this resonance. However, in alternative DP models where a Dπ resonance with spin 3 is added and where the B * v contribution is removed, the shift in 2∆NLL remains above 50 units. The alternative models also do not significantly impact the level at which the D * 1 (2760) 0 state is preferred to be spin 1. Therefore, these results represent the first observation of the B − → D * 1 (2760) 0 K − decay and a measurement of the spin of the D * 1 (2760) 0 resonance.
In summary, the B − → D + K − π − decay has been observed in a data sample corresponding to 3.0 fb −1 of pp collision data recorded by the LHCb experiment. An amplitude analysis of its Dalitz plot distribution has been performed, in which a model containing resonant contributions from the D * 0 (2400) 0 , D * 2 (2460) 0 and D * 1 (2760) 0 states, in addition to both S-wave and P-wave nonresonant amplitudes and components due to virtual D * v (2007) 0 and B * 0 v resonances, was found to give a good description of the data. The B − → D * 2 (2460) 0 K − decay may in future be used to determine the angle γ of the CKM unitarity triangle. The results provide insight into the spectroscopy of charm mesons, and demonstrate that further progress may be obtained with Dalitz plot analyses of larger data samples.

A Secondary minima
The results, in terms of fit fractions and complex coefficients, corresponding to the two secondary minima discussed in Sec. 7 are compared to those of the global minimum in Table 16. The main difference between the global and second minima is in the interference pattern in the Dπ P-waves, while the third minimum exhibits a different interference pattern in the Dπ S-wave than the global minimum and has a very large total fit fraction due to strong destructive interference.

B Results for interference fit fractions
The central values and statistical uncertainties for the interference fit fractions are shown in Table 17. The experimental systematic and model uncertainties are given in Tables 18 and 19. The interference fit fractions are common to both trigger subsamples.