First observation of the decay $B^0 \rightarrow D^0 \overline{D}{}^0 K^+ \pi^-$

The first observation of the decay $B^0 \rightarrow D^0 \overline{D}{}^0 K^+ \pi^-$ is reported using proton-proton collision data corresponding to an integrated luminosity of 4.7 $\mathrm{fb}^{-1}$ collected by the LHCb experiment in 2011, 2012 and 2016. The measurement is performed in the full kinematically allowed range of the decay outside of the $D^{*-}$ region. The ratio of the branching fraction relative to that of the control channel $B^0 \rightarrow D^{*-} D^0 K^+$ is measured to be $\mathcal{R} = (14.2 \pm 1.1 \pm 1.0)\%$, where the first uncertainty is statistical and the second is systematic. The absolute branching fraction of $B^0 \rightarrow D^0 \overline{D}{}^0 K^+ \pi^-$ decays is thus determined to be $\mathcal{B}(B^0 \rightarrow D^0 \overline{D}{}^0 K^+ \pi^-) = (3.50 \pm 0.27 \pm 0.26 \pm 0.30) \times 10^{-4}$, where the third uncertainty is due to the branching fraction of the control channel.

The family of B → D ( * ) D ( * ) K and B → D ( * ) D ( * ) Kπ decays, each with two charm hadrons and a kaon in the final state, proceed at quark level through Cabibbo-Kobayashi-Maskawa favoured b → ccs transitions. These transitions occur with either an external or internal W emission process, as shown in Fig. 1, offering the opportunity to search for new cs or cc states. In addition, measurements of the amplitude structure of the D ( * ) D ( * ) system in these processes can provide important information to calculations of the cc contribution above the open-charm threshold in b → s + − decays [1]. There is considerable debate whether the theoretical uncertainties associated with these longdistance contributions [2][3][4][5] could alleviate the tensions in a wide range of measurements involving b → s + − transitions [6][7][8][9][10][11][12][13][14][15][16] with Standard Model predictions. Therefore, measurements that can provide input to these calculations are of the utmost importance.
Although measurements involving B → D ( * ) D ( * ) K decays have been performed by the ALEPH, BaBar, Belle and LHCb collaborations [17][18][19][20][21][22], no measurements involving B → D ( * ) D ( * ) Kπ transitions have been performed to date. The B 0 → D 0 D 0 K + π − branching fraction, based on considerations of similar decay modes, is expected to be O(10 −4 ), but the product of the branching fractions including the D 0 → K − π + charm meson decays is much smaller, at the level of O(10 −7 ). This Letter presents the first observation of the B 0 → D 0 D 0 K + π − decay, excluding contributions from B 0 → D * − D 0 K + transitions, with D * − → D 0 π − decays. 1 The branching fraction of this decay is measured in the full kinematically allowed range of the decay outside of the D * − region, relative to the control mode B 0 → D * − D 0 K + . After the decay of the D * − meson via the strong interaction, signal and control modes present the same final-state particles D 0 D 0 K + π − . The measurement is performed using data collected with the LHCb detector in proton-proton collisions at centre-of-mass energies of 7 TeV and 8 TeV during Run 1, and 13 TeV during 2016. The corresponding integrated luminosities for the years 2011, 2012 and 2016 are 1.0, 2.0 and 1.7 fb −1 , respectively.
The LHCb detector [23,24] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector elements that are particularly relevant to this analysis are: a silicon-strip vertex detector surrounding the pp interaction region [25] that allows c and b hadrons to be identified from their characteristically long flight distance; a tracking system that provides a measurement of the momentum, p, of charged particles [26,27]; and two ring-imaging Cherenkov detectors that are able to discriminate between different species of charged hadrons [28]. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic and a hadronic for each trigger category. The training and testing is performed using the k-fold cross validation technique with k = 10 [43]. Simulated samples are used as a signal proxy and data from the sidebands of the D 0 or B 0 candidate invariant-mass distributions as the background proxy. Specifically, these are candidates outside of a ±40 MeV/c 2 window around the known D 0 -meson mass [39] for the NN D classifier and candidates satisfying The NN D classifier is trained using fourteen variables including PID information as well as kinematic properties and the decay topology of the tracks and D 0 meson candidate. Fourteen variables are also used to train the NN B classifier, including the output of the two NN D classifiers and other observables describing the topology and kinematics of the B 0 meson decay. As the NN D classifier is an input to the NN B classifier, a requirement is only placed on the output of the NN B classifier. This threshold is optimised by maximising the figure of merit N S √ N S +N B separately in each of the two trigger categories and two data taking periods. Here N S is the expected signal yield calculated using the signal efficiency from the simulation and the estimated branching fraction based on branching fraction ratios of similar decays and the known branching fraction B(B + → D 0 D 0 K + ) [39]. The background yield N B is extrapolated from fits to the sidebands of the B 0 candidate invariant-mass distribution. The classifiers are found to be independent of the m(D 0 D 0 K + π − ) distribution.
The family of decays H b → D 0( * ) D 0( * ) H ( * ) , where H b is a beauty hadron and H ( * ) any one-or two-body collection of light or strange hadrons, are examined to search for possible background contributions. These are referred to as peaking backgrounds. Of these, four decay modes ( are found to have substantial contributions to the signal channel. The B + → D 0 D 0 K + decays are removed using requirements on the three-and four-body invariant masses 5220 < m(D 0 D 0 K + ) < 5340 MeV/c 2 for candidates with m(D 0 D 0 K + π − ) > 5380 MeV/c 2 . The corresponding partially reconstructed decay B + → D 0 D * 0 K + is similarly removed with the requirement 5050 < m(D 0 D 0 K + ) < 5200 MeV/c 2 . Contributions from B 0 s → D 0 D 0 φ decays are suppressed using tighter PID requirements in the invariant-mass window 5321 < m(D 0 D 0 K + K − ) < 5411 MeV/c 2 , where the π − candidate is reconstructed under the K − mass hypothesis. Similarly, Λ 0 b → D 0 D 0p K + candidates are removed using PID requirements for candidates satisfying 5575 < m(D 0 D 0 K +p ) < 5665 MeV/c 2 , with the π − candidate reconstructed using the p mass hypothesis. All of these backgrounds are reduced to negligible levels, and only the B + → D 0 D * 0 K + veto induces a sizeable signal loss with an efficiency of 93%.
A particularly challenging source of background are the modes B 0 → D 0 K + π − K + π − , B 0 → D 0 K − π + K + π − and B 0 → K − π + K + π − K + π − , so called single-charm and charmless backgrounds, respectively. Contributions from these decays are reduced by the flight distance criterion for the D 0 mesons, but must be estimated carefully because they peak at the known B 0 meson mass. The residual backgrounds are estimated from the sidebands of the D 0 invariant-mass distributions to be 10 ± 7 candidates. These candidates are subtracted from the yields during the fitting procedure described below.
The efficiency of the selections applied to the signal and control modes are calculated from simulated samples. The selection efficiencies include the geometrical acceptance of the LHCb detector, the online trigger and event reconstruction, offline selections and the neural network classifiers. For the signal mode, a single total efficiency is calculated and the resulting dependence on this efficiency model is considered as a systematic uncertainty.
For the control mode, efficiency variations are seen over the phase space. Therefore, an efficiency is calculated for each candidate that depends on the two-dimensional Dalitz plot of the control mode decay.
Extended unbinned maximum-likelihood fits are performed to the B 0 candidate invariant-mass distributions of the signal and control channels in the range 5235 < m(D 0 D 0 K + π − ) < 5600 MeV/c 2 . The resolution of the m(D 0 D 0 K + π − ) distribution means that the contribution from partially reconstructed B → D * 0 D 0 K + π − and B → D * 0 D * 0 K + π − decays is negligible in this fit range [44].
The fit to the control mode is performed separately in the four data samples, corresponding to the two trigger categories and two data taking periods. The fit to the signal channel is performed simultaneously to these four categories. The invariant-mass distributions for signal and control mode are modelled with a double-sided Crystal Ball function [45]. The parameters describing the tails of these distributions are fixed from fits to simulation separately for each of the four data samples. For the control mode, the mean and width of the mass distribution are determined directly from fits to the data subsamples. The resulting values are compared to those obtained on a fit to simulation to derive correction factors, which are subsequently used in the fits to the mass distribution of the signal channel. For the signal and control mode fits, the combinatorial background in each data sample is modelled with an exponential function with a slope allowed to vary in the fit. In the signal mode, the selections against the peaking backgrounds smoothly modify the shape of the mass distribution of the combinatorial background. This is accounted for by modulating the exponential function by an empirical correction from simulation. In the subsequent fits to the mass distribution of the signal candidates the ratio of branching fractions between the signal and control modes, , is expressed in terms of the signal yield in each of the four data samples as where N sig and N con are the yields of the signal and control modes, respectively, and ε sig , ε con are the corresponding efficiencies. The R parameter is determined from the simultaneous fit to the four data samples. The yield N con and its uncertainty are propagated from the fit to the control mode with a Gaussian constraint. Invariant-mass distributions and fit projections of the B 0 candidates, summed over the trigger and data taking period subsamples, are shown in Fig. 2. In total 297 ± 14 signal and 1697 ± 42 control mode decays are found with a ratio of branching fractions R = (14.2 ± 1.1)%, where the uncertainties are statistical only. Figure 3 shows the background-subtracted [36] invariant-mass distributions of m(D 0 D 0 ), m(D 0 K + ) and m(K + π − ) overlaid with a simple phase-space distribution, including efficiency effects derived from simulation. There are hints of structures visible at the masses of the ψ(3770), D * s2 (2573) + and D * s(1,3) (2860) + , and K * (892) 0 states in the m(D 0 D 0 ), m(D 0 K + ) and m(K + π − ) distributions, respectively. Care should be taken with any interpretation of these projections because structures may be caused by reflections. Further analysis of these structures is left for future studies.
Several sources of systematic uncertainty are taken into account. The impact of using an averaged efficiency in the signal mode is considered by comparing the results using samples of B 0 → D 0 D 0 K * 0 simulated events. An event-by-event correction to the 5300 5400 5500 5600  efficiency is also considered, based on various three-dimensional parameterisations of the full five-dimensional phase space. The fit model uncertainty is calculated by comparing the nominal background model to a polynomial form, and varying the signal shape parameters by sampling multivariate Gaussian distributions to account for the variance in the fit to simulation. The overall fit procedure is tested using pseudoexperiments and no bias is observed. The limited simulation sample size introduces a systematic uncertainty related to the spread in results obtained by varying the overall selection efficiencies within statistical uncertainties. Additionally, the weighting algorithm used to correct the simulation, as well as the data-driven method correcting the PID variables, introduce an associated statistical uncertainty. An uncertainty is also assigned to the estimation of single-charm and charmless background yields, by varying this contribution during the simultaneous fit to data. A correction is applied to the NN B neural network classifier to account for possible mismodelling between data and simulation, and this uncertainty is calculated from the resulting difference in selection efficiencies. A small uncertainty is introduced due to the difference in the efficiency of selections applied to reconstruct candidates in signal and control modes. The systematic uncertainties are summarised in Table 1; they are summed in quadrature to give an overall relative systematic uncertainty on the ratio of branching fractions of 7.3%. In summary, the decay B 0 → D 0 D 0 K + π − is observed for the first time, and its branching ratio relative to B 0 → D * − D 0 K + is measured to be where the first uncertainty is statistical, and the second systematic. This measurement uses the full kinematically-allowed range of B 0 → D 0 D 0 K + π − outside of the D * − region, including the entire K + π − mass range, encompassing the K * (892) 0 resonance and the broad K + π − S-wave. The most precise measurement of the branching fraction of B 0 → D * − D 0 K + decays, performed by the BaBar collaboration [21], is B(B 0 → D * − D 0 K + ) = (2.47 ± 0.21) × 10 −3 [21]. Substituting in this value gives where the third uncertainty comes from the uncertainty on the branching fraction B(B 0 → D * − D 0 K + ). Recently, the LHCb collaboration performed a measurement of the ratio of branching fractions [22]. However, the current precision on the branching fraction of the decay B 0 → D 0 D − K + [39] does not yet allow for a more precise measurement of the decay rate B(B 0 → D * − D 0 K + ). The results in this Letter provide a crucial first step towards studying the rich resonant structure of these decays. An amplitude analysis will provide insights to both the spectroscopy of cs and cc states, and charm-loop contributions to b → s + − decays.

Supplemental Material
This supplemental material includes additional information to that already provided in the main Letter. Section 1 details the correction applied to the selection efficiency of the control mode. Section 2 shows the invariant-mass distributions and fit projections separately for the data subsamples.

Efficiency correction
To account for variations across the Dalitz plot, m 2 (D * − K + ) and m 2 (D 0 K + ), in the control mode, the efficiency is calculated, in each category, as Here w i is the weight derived from the sPlot technique, with the B 0 candidate invariant mass distribution as the discriminating variable. The symbol ε i refers to the efficiency as a function of the Dalitz plot position for candidate i in the control mode data sample. The efficiency variation across the Dalitz plot is displayed in Fig. S1. The average uncertainty across the bins of the Dalitz plane ranges from 1−8% for the four data samples. The two exclusive trigger categories are henceforth referred to as Trigger on Signal (TOS), for events triggered by the signal decay, and Trigger Independent of Signal (TIS), for events triggered by other particles in the event and that do not fall into the TOS category.     ii

Mass fits
The invariant-mass distributions and fit projections of the fit model results for each run period and trigger category are shown in Fig. S2 and Fig. S3 for the signal and control mode, respectively. Figure S4 shows the invariant-mass distributions for each category in an extended range to include the partially reconstructed backgrounds B 0 → D 0( * ) D 0( * ) K + π − for illustrative purposes. The partially reconstructed peaks are modelled with Crystal Ball functions [45]. Further systematic uncertainties would be required to account for the effects of modelling these additional peaks, motivating the nominal fit range 5235 < m(D 0 D 0 K + π − ) < 5600 MeV/c 2 , where the partially reconstructed contributions are negligible. As a cross check, performing the fit in this extended range results in a branching fraction ratio R that is in good agreement with the measurement from the nominal fit.