Observation of the decay $B_s^0 \to \overline{D}^0 K^+ K^-$

The first observation of the $B_s^0 \to \overline{D}^0 K^+ K^-$ decay is reported, together with the most precise branching fraction measurement of the mode $B^0 \to \overline{D}^0 K^+ K^-$. The results are obtained from an analysis of $pp$ collision data corresponding to an integrated luminosity of $3.0~\textrm{fb}^{-1}$. The data were collected with the LHCb detector at centre-of-mass energies of $7$ and $8$ TeV. The branching fraction of the $B^0 \to \overline{D}^0 K^+ K^-$ decay is measured relative to that of the decay $B^0 \to \overline{D}^0 \pi^+ \pi^-$ to be $$\frac{\mathcal{B}(B^0 \to \overline{D}^0 K^+ K^-)}{\mathcal{B}(B^0 \to \overline{D}^0 \pi^+ \pi^-)} = (6.9 \pm 0.4 \pm 0.3)\%,$$ where the first uncertainty is statistical and the second is systematic. The measured branching fraction of the $B_s^0 \to \overline{D}^0 K^+ K^-$ decay mode relative to that of the corresponding $B^0$ decay is $$\frac{\mathcal{B}(B_s^0 \to \overline{D}^0 K^+ K^-)}{\mathcal{B}(B^0 \to \overline{D}^0 K^+ K^-)} = (93.0 \pm 8.9 \pm 6.9)\%.$$ Using the known branching fraction of ${B^0 \to \overline{D}^0 \pi^+ \pi^-}$, the values of ${{\mathcal B}(B^0 \to \overline{D}^0 K^+ K^- )=(6.1 \pm 0.4 \pm 0.3 \pm 0.3) \times 10^{-5}}$, and ${{\cal B}(B_s^0 \to \overline{D}^0 K^+ K^-)=}$ $(5.7 \pm 0.5 \pm 0.4 \pm 0.5) \times 10^{-5}$ are obtained, where the third uncertainties arise from the branching fraction of the decay modes ${B^0 \to \overline{D}^0 \pi^+ \pi^-}$ and $B^0 \to \overline{D}^0 K^+ K^-$, respectively.


Introduction
The precise measurement of the angle γ of the Cabibbo-Kobayashi-Maskawa (CKM) Unitarity Triangle [1,2] is a central topic in flavour physics experiments. Its determination at the subdegree level in tree-level open-charm b-hadron decays is theoretically clean [3,4] and provides a standard candle for measurements sensitive to new physics effects [5]. In addition to the results from the B factories [6], various measurements from LHCb [7][8][9] allow the angle γ to be determined with an uncertainty of around 5 • . However, no single measurement dominates the world average, as the most accurate measurements have an accuracy of about 10 • to 20 • [10, 11]. Alternative methods are therefore important to improve the precision. Among them, an analysis of the decay B 0 s → D 0 φ has the potential to make a significant impact [12][13][14][15]. Moreover, a Dalitz plot analysis of B 0 s → D 0 K + K − decays can further improve the determination of γ due to the increased sensitivity to interference effects, as well as allowing the CP -violating phase φ s to be determined in B 0 s − B 0 s mixing with minimal theoretical uncertainties [16]. The mode B 0 s → D 0 φ has been previously observed by the LHCb collaboration with a data sample corresponding to an integrated luminosity of 1.0 fb −1 [17]. The observation of B 0 → D 0 K + K − and evidence for B 0 s → D 0 K + K − have also been reported by the LHCb collaboration using a data sample corresponding to 0.62 fb −1 [18]. These decays are mediated by decay processes such as those shown in Fig. 1.
In this paper an improved measurement of the branching fraction of the decay B 0 → D 0 K + K − and the first observation of the decay B 0 s → D 0 K + K − are presented. 1 The branching fractions are measured relative to that of the topologically similar and abundant decay B 0 → D 0 π + π − . The analysis is based on a data sample corresponding to an integrated luminosity of 3.0 fb −1 of pp collisions collected with the LHCb detector. 1 The inclusion of charge conjugate modes is implied throughout this paper. Approximately one third of the data was obtained during 2011, when the collision centreof-mass energy was √ s = 7 TeV, and the rest during 2012 with √ s = 8 TeV. Compared to the previous analysis [18], a revisited selection and a more sophisticated treatment of the various background sources are employed, as well as improvements in the handling of reconstruction and trigger efficiencies, leading to an overall reduction of systematic uncertainties. The present analysis benefits from the improved knowledge of the decays B 0 (s) → D 0 K − π + [19], Λ 0 b → D 0 ph − , where h − stands for a π − or a K − meson [20], which contribute to the background, and of the normalisation decay mode B 0 → D 0 π + π − [21].
This analysis sets the foundation for the study of the B 0 (s) → D ( * )0 φ decays, which are presented in a separate publication [22]. The current data set does not yet allow a Dalitz plot analysis of the B 0 (s) → D 0 K + K − decays to be performed, but these modes could provide interesting input to excited D + s meson spectroscopy, in particular because the decay diagrams are different from those of the B 0 s → D 0 K − π + decay [23] (i.e. different resonances can be favoured in each decay mode).
This paper is structured as follows. A brief description of the LHCb detector, as well as the reconstruction and simulation software, is given in Sect. 2. Signal selection and background suppression strategies are summarised in Sect. 3. The characterisation of the various remaining backgrounds and their modelling is described in Sect. 4 and the fit to the B 0 → D 0 π + π − and B 0 (s) → D 0 K + K − invariant-mass distributions to determine the signal yields is presented in Sect. 5. The computation of the efficiencies needed to derive the branching fractions is explained in Sect. 6 and the evaluation of systematic uncertainties is described in Sect. 7. The results on the branching fractions and a discussion of the Dalitz plot distributions are reported in Sect. 8.

Detector and simulation
The LHCb detector [24,25] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a siliconstrip vertex detector surrounding the pp interaction region [26], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes [27] placed downstream of the magnet. The tracking system provides a measurement of momentum, p, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/c. The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of (15 + 29/p T ) µm, where p T is the component of the momentum transverse to the beam, in GeV/c. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov (RICH) detectors [28]. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [29].
The online event selection is performed by a trigger, which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. At the hardware trigger stage, events are required to have a muon with high p T or a hadron, photon or electron with high transverse energy in the calorimeters. For hadrons, the transverse energy threshold is 3.5 GeV. A global hardware trigger decision is ascribed to the reconstructed candidate, the rest of the event or a combination of both; events triggered as such are defined respectively as triggered on signal (TOS), triggered independently of signal (TIS), and triggered on both. The software trigger requires a two-, three-or four-track secondary vertex with a significant displacement from the primary pp interaction vertices. At least one charged particle must have a transverse momentum p T > 1.7 GeV/c and be inconsistent with originating from a PV. A multivariate algorithm [30] is used for the identification of secondary vertices consistent with the decay of a b hadron.
Candidates that are consistent with the decay chain B 0 (s) → D 0 K + K − , D 0 → K + π − are selected. In order to reduce systematic uncertainties in the measurement, the topologically similar decay B 0 → D 0 π + π − , which has previously been studied precisely [21,31], is used as a normalisation channel. Tracks are required to be consistent with either the kaon or pion hypothesis, as appropriate, based on particle identification (PID) information from the RICH detectors. All other selection criteria are tuned on the B 0 → D 0 π + π − channel. The large yields available in the normalisation sample allow the selection to be based on data. Simulated samples are used to evaluate efficiencies and characterise the detector response for signal and background decays. In the simulation, pp collisions are generated using Pythia [32] with a specific LHCb configuration [33]. Decays of hadronic particles are described by EvtGen [34], in which final-state radiation is generated using Photos [35]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [36] as described in Ref. [37].

Initial selection
Signal B 0 (s) candidates are formed by combining D 0 candidates, reconstructed in the decay channel K + π − , with two additional tracks of opposite charge. After the trigger, an initial selection, based on kinematic and topological variables, is applied to reduce the combinatorial background by more than two orders of magnitude. This selection is designed using simulated B 0 → D 0 π + π − decays as a proxy for signal and data B 0 → D 0 π + π − candidates lying in the upper-mass sideband [5400, 5600] MeV/c 2 as a background sample. The combinatorial background arises from random combinations of tracks that do not come from a single decay. For the B 0 → D 0 π + π − mode, no b-hadron decay contribution is expected in the upper sideband [5320, 6000] MeV/c 2 , i.e. no B 0 s contribution is expected [38]. The reconstructed tracks are required to be inconsistent with originating from any PV. The D 0 decay products are required to originate from a common vertex with an invariant mass within ±25 MeV/c 2 of the known D 0 mass [39]. The invariant-mass resolution of the reconstructed D 0 mesons is about 8 MeV/c 2 and the chosen invariant-mass range allows most of the background from the D 0 → K + K − and D 0 → π + π − decays to be rejected. The D 0 candidates and the two additional tracks are required to form a vertex. The reconstructed D 0 and B 0 vertices must be significantly displaced from the associated PV, defined, in case of more than one PV in the event, as that which has the smallest χ 2 IP with respect to the B candidate. The χ 2 IP is defined as the difference in the vertex-fit quality χ 2 of a given PV reconstructed with and without the particle under consideration. The reconstructed D 0 vertex is required to be displaced downstream from the reconstructed B 0 (s) vertex, along the beam axis direction. This requirement reduces the background from charmless B decays, corresponding to genuine B 0 → K + π − h + h − decays, for instance from B 0 → K + π − ρ 0 or B 0 → K * 0 φ decays, to a negligible level. This requirement also suppresses background from prompt charm production, as well as fake reconstructed D 0 coming from the PV. The B 0 (s) momentum vector and the vector connecting the PV to the B 0 (s) vertex are requested to be aligned. Unless stated otherwise, a kinematic fit [40] is used to improve the invariant-mass resolution of the B 0 (s) candidate. In this fit, the B 0 (s) momentum is constrained to point back to the PV and the D 0 -candidate invariant mass to be equal to its known value [39], and the charged tracks are assigned the K or π mass hypothesis as appropriate.
within the range [5115, 6000] MeV/c 2 are then considered. This range allows the B 0 (s) signal regions to be studied, while retaining a sufficiently large upper sideband to accurately determine the invariant-mass shape of the surviving combinatorial background. The lower-mass limit removes a large part of the complicated partially reconstructed backgrounds and has a negligible impact on the determination of the signal yields.
The world-average value of the branching fraction B(B 0 → D 0 π + π − ) is equal to (8.8 ± 0.5) × 10 −4 [39] and is mainly driven by the Belle [31] and LHCb [21] measurements. This value is used as a reference for the measurement of the branching fractions of the decays B 0 (s) → D 0 K + K − . The large contribution from the exclusive decay chain B 0 → D * (2010) − π + , D * (2010) − → D 0 π − , with a branching fraction of (1.85 ± 0.09) × 10 −3 [39], is not included in the above value. Thus, a D * (2010) − veto is applied. The veto consists of rejecting candidates with m D 0 π − − m D 0 within ±4.8 MeV/c 2 of the expected mass difference [39], which corresponds to ±6 times the LHCb detector resolution on this quantity. Due to its high production rate and possible misidentification of its decay products, the decay B 0 → D * (2010) − (→ D 0 π − )π + could also contribute as a background to the B 0 (s) → D 0 K + K − channel. Therefore, the same veto criterion is applied to B 0 (s) → D 0 K + K − candidates as for the B 0 → D 0 π + π − normalisation mode, where the invariant mass difference m D 0 π − − m D 0 is computed after assigning the pion mass to each kaon in turn.
Only kaon and pion candidates within the kinematic region corresponding to the fiducial acceptance of the RICH detectors [28] are kept for further analysis. This selection is more than 90% efficient for the B 0 → D 0 π + π − signal, as estimated from simulation. Although the D 0 candidates are selected in a narrow mass range, studies on simulated samples show a small fraction of D 0 → K + K − (∼ 4.5 × 10 −5 ) and D 0 → π + π − (∼ 3.0 × 10 −4 ) decays, with respect to the genuine D 0 → K + π − signal, are still selected. Therefore, loose PID requirements are applied in order to further suppress D 0 → K + K − and D 0 → π + π − decays. In the doubly Cabibbo-suppressed D 0 → K + π − decay both the kaon and the pion are correctly identified and reconstructed, but the D 0 flavour is misidentified. This is expected to occur in less than R D = (0.348 +0. 004 −0.003 )% [7] of D 0 → K + π − signal decays. However, such an effect does not impact the measurements of the ratio of branching fractions , as the resulting dilution is the same for the numerator and the denominator.

Multivariate selection
Once the initial selections are implemented, a multivariate analysis (MVA) is applied to further discriminate between signal and combinatorial background. The implementation of the MVA is performed with the TMVA package [41,42], using the B 0 → D 0 π + π − normalisation channel to optimise the selection. For this purpose only, a loose PID criterion on the pions of the π + π − pair is set to reject the kaon and proton hypotheses. The sPlot technique [43] is used to statistically separate signal and background in data, with the B 0 candidate invariant mass used as the discriminating variable. The sPlot weights (sWeights) obtained from this procedure are applied to the candidates to obtain signal and background distributions that are then used to train the discriminant.
To compute the sWeights, the signal-and combinatorial-background yields are determined using an unbinned extended maximum-likelihood fit to the invariant-mass distribution of B 0 candidates. The fit uses the sum of a Crystal Ball (CB) function [44] and a Gaussian function for the signal distribution and an exponential function for the combinatorial background distribution. The fit is first performed in the invariant-mass range m D 0 π + π − ∈ [5240, 5420] MeV/c 2 , to compute the sWeights, and is repeated within the signal region [5240, 5320] MeV/c 2 with all the parameters fixed to the result of the initial fit, except the signal and the background yields, which are found to be 44 690 ± 540 and 81 710 ± 570, respectively. The training samples are produced by applying the necessary signal and background sW eights, with half of the data used and randomly chosen for training and the other half for validation.
Several sets of discriminating variables, as well as various linear and non-linear MVA methods, are tested. These variables contain information about the topology and the kinematic properties of the event, vertex quality, χ 2 IP and p T of the tracks, track multiplicity in cones around the B 0 candidate, relative flight distances between the B 0 and D 0 vertices and from the PV. All of the discriminating variables have weak correlations (< 1.6%) with the invariant mass m D 0 π + π − of the B 0 candidates. Very similar separation performance is seen for all the tested discriminants. Therefore, a Fisher discriminant [45] with the minimal set of the five most discriminating variables is adopted as the default MVA configuration. This option is insensitive to overtraining effects. These five variables are: the smallest values of χ 2 IP and p T for the tracks of the π + π − pair, the B flight distance χ 2 , the D χ 2 IP , and the signed minimum cosine of the angle between the direction of one of the pions from the B decay and the D 0 meson, as projected in the plane perpendicular to the beam axis. Figure 2 shows the distributions of the Fisher discriminant for the sWeighted training samples (signal and background) and their sum, compared to the data set of preselected B 0 → D 0 π + π − candidates. These distributions correspond to candidates in the invariantmass signal region, and agree well within the statistical uncertainties, demonstrating that no overtraining is observed. Based on the fitted numbers of signal and background candidates, the statistical figure of merit Q = N S / √ N S + N B is defined to find an optimal operation point, where N S and N B are the numbers of selected signal and background candidates above a given value x F of the Fisher discriminant. The value of x F that maximises Q is found to be −0.06, as shown in Fig. 2 and at this working point the signal efficiency is (82.4 ± 0.4)% and the fraction of rejected background is (89.2 ± 1.0)%. In Fig. 2 the distribution of simulated B 0 → D 0 K + K − signal decays is also shown to be in good agreement with the sWeighted B 0 → D 0 π + π − data training sample.

Particle identification of h + h − pairs
After the selections, specific PID requirements are set to identify the tracks of the B 0 (s) decays to distinguish the normalisation channel B 0 → D 0 π + π − and the B 0 (s) → D 0 K + K − signal modes. For the B 0 → D 0 π + π − normalisation channel, the π ± candidates must each satisfy the same PID requirements to identify them as pions, while the kaon and proton hypotheses are rejected. These criteria are tuned by comparing a simulated sample of B 0 → D 0 π + π − signal and a combination of simulated samples that model the misidentified backgrounds. The combination of backgrounds contains all sources expected to give the largest contributions, namely the The same tuning procedure is repeated for the two B 0 (s) → D 0 K + K − signal modes, where the model for the misidentified background is composed of the main contributing background decays: The K ± candidates are required to be positively identified as kaons and the pion and proton hypotheses are excluded. Loose PID requirements are chosen in order to favour the highest signal efficiencies and to limit possible systematic uncertainties due to data and simulation discrepancies, which arise when computing signal efficiencies related to PID (see Sect. 6).

Multiple candidates
Given the selection described above, 1.2% and 0.8% of the events contain more than one candidate in the B 0 → D 0 π + π − normalisation and the B 0 (s) → D 0 K + K − signal modes, respectively. There are two types of multiple candidates to consider. In the first type, for which two or more good B or D decay vertices are present, the candidate with the smallest sum of the B 0 (s) and D 0 vertex χ 2 is then kept. In the second type, which occurs if a swap of the mass hypotheses of the D decay products leads to a good candidate, the PID requirements for the two options K + π − and π + K − are compared and the candidate corresponding to the configuration with the highest PID probability is kept. In order no to bias the m D 0 h + h − invariant-mass distribution with the choice of the best candidate, it is checked with simulation that the variables used for selection are uncorrelated with the invariant mass, m D 0 h + h − . It is also computed with simulation that differences between the efficiencies while choosing the best candidate for B 0 → D 0 π + π − and B 0 (s) → D 0 K + K − decays are negligible [46].

Fit components and modelling 4.1 Background characterisation
The B 0 (s) → D 0 h + h − selected candidates consist of signal and various background contributions: combinatorial, misidentified, and partially reconstructed b-hadron decays.
The misidentified background originates from real b-hadron decays, where at least one final-state particle is incorrectly identified in the decay chain. For the B 0 → D 0 π + π − normalisation channel, three decays requiring a dedicated modelling are identified: Here as well, the contribution from B 0 → D 0 π + π − is negligible, due to the positive identification of both kaons. Using the simulation and recent measurements for the various branching fractions [18-21, 39, 47] and for the fragmentation factors f s /f d [48] and [49], an estimation of the relative yields with respect to those of the simulated signals is computed over the whole invariant-mass range, The values are listed in Table 1. The expected yields of the backgrounds related to decays of Λ 0 b baryons cannot be predicted accurately due the limited knowledge of their branching fractions and of the relative production rate [49]. The partially reconstructed background corresponds to real b-hadron decays, where a neutral particle is not reconstructed and possibly one of the other particles is misidentified.
where the photon or the neutral pion is not reconstructed. This type of background populates the lowmass region m D 0 h + h − < 5240 MeV/c 2 . For the fit of the B 0 → D 0 π + π − invariant-mass distribution, the main contributions that need special treatment are B 0 s → D * 0 K − π + and B 0 → D * 0 [D 0 γ]π + π − , for which the branching fractions are poorly known [50]. For the Using simulation and the available information on the branching fractions [39], and by making the assumption that B(B 0 s → D * 0 K − π + ) and B(B 0 s → D 0 K − π + ) are equal (this is approximately the case for B 0 → D * 0 π + π − and B 0 → D 0 π + π − decays), Table 1: Relative yields, in percent, of the various exclusive b-hadron decay backgrounds with respect to that of the B 0 → D 0 π + π and B 0 (s) → D 0 K + K − signal modes. These relative contributions are estimated with simulation in the range an estimate of the relative yields with respect to those of the simulated signals is computed over the whole invariant-mass range, The values are given in Table 1. The contributions from these backgrounds are somewhat larger than those of the misidentified background, but are mainly located in the mass region

Signal modelling
The invariant-mass distribution for each of the signal B 0 (s) → D 0 h + h − modes is parametrised with a probability density function (PDF) that is the sum of two CB functions with a common mean, The parameters α 1,2 and n 1,2 describing the tails of the CB functions are fixed to the values fitted on simulated samples generated uniformly (phase space) over the B 0 (s) → D 0 h + h − Dalitz plot. The mean value m 0 , the resolutions σ 1 and σ 2 , and the fraction f CB between the two CB functions are free to vary in the fit to the B 0 → D 0 π + π − normalisation channel. For the fit to B 0 (s) → D 0 K + K − data, the resolutions σ 1 and σ 2 are fixed to those obtained with the normalisation channel, while the mean value m 0 and the relative fraction f CB of the two CB functions are left free.

Combinatorial background modelling
For all channels, the combinatorial background contributes to the full invariant-mass range. It is modelled with an exponential function where the slope a comb. and the normalisation parameter N comb. is free to vary in the fit. The invariant-mass range extends up to 6000 MeV/c 2 to include the region dominated by combinatorial background. This helps to constrain the combinatorial background yield and slope.

Misidentified and partially reconstructed background modelling
The shape of misidentified and partially reconstructed components is modelled by nonparametric PDFs built from large simulation samples. These shapes are determined using the kernel estimation technique [51]. The normalisation of each component is free in the fits. For the normalisation channel B 0 → D 0 π + π − , a component for the decay B 0 → D * 0 [D 0 π 0 ]π + π − is added and modelled by a Gaussian distribution. This PDF also accounts for a possible contribution from the B + → D 0 π + π + π − decay, which has a similar shape. In the case of the B 0 (s) → D 0 K + K − signal channels, the low-mass background also includes a Gaussian distribution to model the decay B 0 → D * 0 K + K − . To account for differences between data and simulation, these PDFs are modified to match the width and mean of the m D 0 π + π − distribution seen in the data. The normalisation parameter, N Low−m , of these partially reconstructed backgrounds is free to vary in the fit.

Specific treatment of the
Although their branching fractions have been recently measured [20], the broadness of these backgrounds impacts the determination of both the B 0 → D 0 h + h − and the B 0 The yields of these modes can be determined in data by assigning the proton mass to the h − track of the B 0 (s) → D 0 h + h − decay, where the charge of h ± is chosen such that it corresponds to the Cabibbo-favoured D 0 mode in the The invariant-mass distribution of Λ 0 b → D 0 pπ − is obtained from the B 0 → D 0 π + π − candidates. A Gaussian distribution is used to model the Λ 0 b → D 0 pπ − signal, while an exponential distribution is used for the combinatorial background. The validity of the background modelling is checked by assigning the proton mass hypothesis to the pion of opposite charge to that expected in the B 0 decay. Different fit regions are tested, as well as an alternative fit, where the resolution of the Gaussian PDF that models the Λ 0 b → D 0 pπ − mass distribution is fixed to that of B 0 → D 0 π + π − . The relative variations of the various configurations are compatible within their uncertainties; the largest deviations are used as the systematic uncertainties. Finally, the obtained yield for Λ 0 b → D 0 pπ − is 1101 ± 144,including the previously estimated systematic uncertainties. This yield is then used as a Gaussian constraint in the fit to the m D 0 π + π − invariant-mass distribution presented in Sect. 5.2 and the fit results are presented in Table 2.
The corresponding m D 0 pK − and m D 0 pπ − distributions are determined using the B 0 (s) → D 0 K − K + data set. Five components are used to describe the data and to fit the two distributions simultaneously: , and combinatorial background. A small contribution from the Ξ 0 b → D 0 pK − decay is observed and is included in the default B 0 (s) → D 0 K + K − fit, where its nonparametric PDF is obtained from simulation. The Λ 0 b → D 0 pπ − distribution is contaminated by the misidentified backgrounds Λ 0 b → D 0 pK − , Ξ 0 b → D 0 pK − , and  baryons is fixed to its known value [39]. The effect of the latter constraint is minimal and is not associated with any systematic uncertainty. The combinatorial background is modelled with an exponential function, while other misidentified backgrounds are modelled by non-parametric PDFs obtained from simulation. As for the previous case with B 0 → D 0 π + π − candidates, alternative fits are applied, leading to consistent results where the largest variations are used to assign systematic uncertainties for the determination of the yields of the various components. A test is performed to include a specific cross-feed contribution from the channel B 0 s → D 0 K + K − . No noticeable effect is observed, except on the yield of the B 0 s → D 0 K − π + contribution. The outcome of this test is nevertheless included in the systematic uncertainty. The obtained yields for the Λ 0 , 64 ± 21, and 74 ± 32, respectively, where the systematic uncertainties are included. These yields and their uncertainties, listed in Table 2, are used as Gaussian constraints in the fit to the B 0 (s) → D 0 K + K − invariant-mass distribution presented in Sect. 5.2.

Likelihood function for the B 0 (s) → D 0 h + h − invariant-mass fit
The total probability density function P tot θ (m D 0 h + h − ) of the fitted parameters θ, is used in the extended likelihood function where m i,D 0 h + h − is the invariant mass of candidate i, v is the sum of the yields and n the number of candidates observed in the sample. The likelihood function N j,bkg × P j,bkg (m D 0 π + π − ), (3) Table 3: Parameters from the default fit to B 0 → D 0 π + π − and B 0 (s) → D 0 K + K − data samples in the invariant-mass range m D 0 h + h − ∈ [5115, 6000] MeV/c 2 . The quantity χ 2 /ndf corresponds to the reduced χ 2 of the fit for the corresponding number of degrees of freedom, ndf, while the p-value is the probability value associated with the fit and is computed with the method of least squares [39].
The PDFs used to model the signals P

Default fit and robustness tests
The default fit to the data is performed, using the MINUIT/MINOS [52] and the RooFit [53] software packages, in the mass-range m   Figure 3: Fit to the m D 0 π + π − invariant-mass distribution with the associated pull plot.
given in Table 3.
An unconstrained fit to the m D 0 π + π − distribution returns a negative B 0 s → D * 0 K − π + yield, which is consistent with zero within statistical uncertainties (−2167 ± 1514), while the expected yield is around 1.8% that of the signal yield, or 540 (see Table 1). The B 0 s → D * 0 K − π + contribution lies in the lower mass region, where background contributions are complicated, but have little effect on the signal yield determination. In the fit results listed in Table 3, this contribution is fixed to be 540. The difference in the signal yield with and without this constraint amounts to 77, which is included as a systematic uncertainty. The results obtained for the other backgrounds are consistent with the estimated relative yields computed in Sect. 4.1. The fit uses Gaussian constraints in the fitted likelihood function for the yields of the modes The fitted signal yields are N B 0 →D 0 π + π − = 29 943 ± 243, N B 0 →D 0 K + K − = 1918 ± 74, and N B 0 s →D 0 K + K − = 473 ± 33, and the ratio r B 0   For the B 0 (s) → D 0 K + K − channels, the fitted contributions for the B 0 s → D 0 K − π − and B 0 → D 0 K + π + decays are compatible with zero. These components are removed one-by-one in the default fit. The results of these tests are compatible with the output of the default fit. Therefore, no systematic uncertainty is applied.
Pseudoexperiments are generated using the default fit parameters with their uncertainties (see Table 3), to build 500 (1000) samples of B 0 → D 0 π + π − (B 0 (s) → D 0 K + K − ) candidates according to the yields determined in data. The fit is then repeated on these samples to compute the three most important observables N B 0 →D 0 π + π − , N B 0 →D 0 K + K − , and r B 0 s /B 0 . No bias is seen in the three considered quantities. A coverage test is performed based on the associated pull distributions yields Gaussian distributions, with the expected mean and standard deviation. This test demonstrates that the statistical uncertainties on the yields obtained from the fit are well estimated.

Calculation of efficiencies and branching fraction ratios
The ratios of branching fractions are calculated as and where the yields are obtained from the fits described in Sect. 5 and the fragmentation factor ratio f s /f d is taken from Ref. [48]. The efficiencies ε account for effects related to reconstruction, triggering, PID and selection of the B 0 (s) → D 0 h + h − decays. These efficiencies vary over the Dalitz plot of the B decays. The total efficiency factorises as where ε X|Y is the efficiency of X relative to Y. The contribution ε geom is determined from the simulation, and corresponds to the fraction of simulated decays which can be fully reconstructed within the LHCb detector acceptance. The term ε sel|geom accounts for the software part of the trigger system, the pre-filtering, the initial selection, the Fisher discriminant selection efficiencies, and for the effects related to the reconstruction of the charged tracks. It is computed with simulation, but the part related to the tracking includes corrections obtained from data control samples. The PID selection efficiency ε PID|sel & geom is determined from the simulation corrected using pure and abundant D * (2010) + → D 0 π + and Λ → pπ − calibration samples, selected using kinematic criteria only. Finally, ε HW Trig|PID & sel & geom is related to the effects due to the hardware part of the trigger system. Its computation is described in the next section. As ratios of branching fractions are measured, only the ratios of efficiencies are of interest. Since the multiplicities of all the final states are the same, and the kinematic distributions of the decay products are similar, the uncertainties in the efficiencies largely cancel in the ratios of branching fractions. The main difference comes from the PID criteria for the B 0 → D 0 π + π − and B 0 → D 0 K + K − final states.

Trigger efficiency
The software trigger performance is well described in simulation and is included in ε sel|geom . The efficiency of the hardware trigger depends on data-taking conditions and is determined from calibration data samples. The candidates are of type TOS or TIS, and both types (see Sect. 2). the efficiency ε HW Trig|PID & sel & geom can be written as The quantity N ref is the number of signal decays that pass all the selection criteria, and N TOS&!TIS is the number of candidates only triggered by TOS (i.e. not by TIS). Using Eq. 8, the hardware trigger efficiency is calculated from three observables: ε TIS , f , and ε TOS .
The quantities ε TIS and f are effectively related to the TIS efficiency only. Therefore they are assumed to be the same for the three channels B 0 (s) → D 0 h + h − and are obtained from data. The value f = (69 ± 1)% is computed using the number of signal candidates in the B 0 → D 0 π + π − sample obtained from a fit to data for each trigger requirement. The independence of this quantity with respect to the decay channel is checked both in simulation and in the data with the two B 0 (s) → D 0 K + K − modes. Similarly, the value of ε TIS is found to be (42.2 ± 0.7)%.
The efficiency ε TOS is computed for each of the three decay modes B 0 (s) → D 0 h + h − from phase-space simulated samples corrected with a calibration data set of D * + → D 0 [K − π + ]π + decays. Studies of the trigger performance [54,55] provide a mapping for these corrections as a function of the type of the charged particle (kaon or pion), its electric charge, p T , the region of the calorimeter region it impacts, the magnet polarity (up or down), and the time period of data taking (year 2011 or 2012). The value of ε TOS for each of the three signals is listed in Table 4.

Total efficiency
The simulated samples used to obtain the total selection efficiency ε B 0 (s) →D 0 h + h − are generated with phase-space models for the three-body B 0 (s) → D 0 h + h − decays. The threebody distributions in data are, however, significantly nonuniform (see Sect. 8). Therefore corrections on ε B 0 (s) →D 0 h + h − are derived to account for the Dalitz plot structures in the considered decays. The relative selection efficiency as a function of the D 0 h + and the D 0 h − squared invariant masses, ε(m 2 D 0 h + , m 2 D 0 h − ), is determined from simulation and parametrised with a polynomial function of fourth order. The function ε(m 2 Table 4: Total efficiencies ε B 0 (s) →D 0 h + h − and their contributions (before and after accounting for three-body decay kinematic properties) for the each three modes B 0 → D 0 π + π − , B 0 → D 0 K + K − , and B 0 s → D 0 K + K − . Uncertainties smaller than 0.1 are displayed as 0.1, but are accounted with their nominal values in the efficiency calculations.
is normalised such that its integral is unity over the kinematically allowed phase space. The total efficiency correctionε DP corr. factor is calculated, accounting for the position of each candidate across the Dalitz plot, as where m 2 i,D 0 h + and m 2 i,D 0 h − are the squared invariant masses of the D 0 h + and D 0 h − combinations for the i th candidate in data, and ω i is its signal sW eight obtained from the default fit to the B 0 (s) → D 0 h + h − invariant-mass distribution (m B 0 (s) ∈ [5115, 6000] MeV/c 2 ). The statistical uncertainties on the efficiency corrections is evaluated with 1000 pseudoexperiments for each decay mode. The computation of the average efficiency is validated with an alternative procedure in which the phase space is divided into 100 bins for the B 0 → D 0 π + π − normalisation channel and 20 bins for the B 0 (s) → D 0 K + K − signal modes. This binning is obtained according to the efficiency map of each decay, where areas with similar efficiencies are grouped together. The total average efficiency is then computed as a function of the efficiency and the number of candidates in each bin. The two methods give compatible results within the uncertainties. The values ofε DP corr. for each of the three signals are listed in Table 4. Table 4 shows the value of the total efficiency ε B 0 (s) →D 0 h + h − and its contributions. The relative values of ε TIS B 0 (s) →D 0 h + h − and ε TOS B 0 (s) →D 0 h + h − , for TIS and TOS triggered candidates, are also given. The total efficiency is obtained as (see Eq. 8) where f = (69 ± 1)%. The total efficiencies for the three B 0 (s) → D 0 h + h − modes are compatible within their uncertainties.

Systematic uncertainties
Many sources of systematic uncertainty cancel in the ratios of branching fractions. Other sources are described below.

Trigger
The calculation of the hardware trigger efficiency is described in Sect. 6.1. To determine ε HW Trig|PID & sel & geom , a data-driven method is exploited. It is based on ε TOS , as described in Refs. [55] and [56], and on the quantities f and ε TIS , determined on the data normalisation channel B 0 → D 0 π + π − (see Eq. 8). The latter two quantities depend on the TIS efficiency of the hardware trigger and are assumed to be the same for all three modes.
The values of f and ε TIS are consistent for the B 0 → D 0 K + K − and the B 0 s → D 0 K + K − channels; no systematic uncertainty is assigned for this assumption. Simulation studies show that these values are consistent for B 0 → D 0 K + K − and B 0 → D 0 π + π − channels. A 2.0% systematic uncertainty, corresponding to the maximum observed deviation with simulation, is assigned on the ratio of their relative ε HW Trig|PID & sel & geom efficiencies.

PID
A systematic uncertainty is associated to the efficiency ε PID|sel & geom when final states of the signal and normalisation channels are different. For each track which differs in the signal channel B 0 → D 0 K + K − and the normalisation channel B 0 → D 0 π + π − , an uncertainty of 0.5% per track due to the kaon or pion identification requirement is applied (e.g. see Refs. [19,57]). As the same PID requirements are used for D 0 decay products for all modes, the charged tracks from those decay products do not need to be considered. The relevant systematic uncertainties are added linearly to account for correlations in these uncertainties. An overall PID systematic uncertainty of 2.0% on the ratio B(B 0 → D 0 K + K − )/B(B 0 → D 0 π + π − ) is assigned.

Signal and background modelling
Systematic effects due to the imperfect modelling of both the signal and background distributions in the fit to m D 0 h + h − are studied. Additional components are considered for each fit on m Dπ + π − and m D 0 K + K − . Moreover the impact of backgrounds with a negative yield, or compatible with zero at one standard deviation is evaluated. The various sources of systematic uncertainties discussed in this section are given in Table 5. The main sources are related to resolution effects and to the modelling of the signal and background PDFs.
A systematic uncertainty is assigned for the modelling of the PDF P sig , defined in Eq. 1. The value of the tail parameters α 1,2 and n 1,2 are fixed to those obtained from simulation. To test the validity of this constraint, new sets of tail parameters, compatible with the covariance matrix obtained from a fit to simulated signal decays, are generated and used as new fixed values. The variance of the new fitted yields is 1.0% of the yield N B 0 →D 0 π + π − , which is taken as the associated systematic uncertainty. For the fit to the B 0 (s) → D 0 K + K − candidates, the above changes to the tail parameters correspond to a 1.4% relative effect on the yield N B 0 →D 0 K + K − and 0.4% on the ratio r B 0 s /B 0 . Another systematic uncertainty is linked to the relative resolution of the B 0 s → D 0 K + K − mass Table 5: Relative systematic uncertainties, in percent, on N B 0 →D 0 π + π − , N B 0 →D 0 K + K − and the ratio r B 0 s /B 0 , due to PDFs modelling in the m D 0 π + π − and m D 0 K + K − fits. The uncertainties are uncorrelated and summed in quadrature.
Source peak with respect to that of the B 0 → D 0 K + K − signal. In the default fit, the resolutions of these two modes are fixed to be the same. Alternatively, the relative difference of the resolution for the two modes can be taken to be proportional to the kinetic energy released in the decay, where m X indicates the known mass of the X meson, so that the resolution of the B 0 signal stays unchanged, while that of the B 0 s distribution is multiplied by Q s /Q d = 1.02. The latter effect results in a small change of 0.2% on N B 0 →D 0 K + K − , as expected, and a larger variation of 1.7% on r B 0 s /B 0 . A third systematic uncertainty on B 0 (s) → D 0 K + K − signal modelling is computed to account for the mass difference ∆m B which is fixed in this fit (see Sect. 4.2). When left free in the fit, the measured mass difference ∆m B = 88.29 ± 1.23 MeV/c 2 is consistent with the value fixed in the default fit, which creates a relative change of 1.6% on N B 0 →D 0 K + K − and a larger one of 3.8% on r B 0 s /B 0 . These three sources of systematic uncertainty on the B 0 (s) → D 0 K + K − invariant-mass modelling are considered as uncorrelated, and are added in quadrature to obtain a global relative systematic uncertainty of 2.1% on the yield N B 0 →D 0 K + K − and 4.2% on the ratio r B 0 s /B 0 . For the default fit on m D 0 π + π − (see Table 3), the B 0 → D * 0 [D 0 γ]π + π − and B 0 → D 0 K + π − components are the main peaking backgrounds and the contribution from B 0 s → D * 0 K − π + is fixed to the expected value from simulation. The B 0 → D * 0 [D 0 γ]π + π − background is modelled in the default fit with a nonparametric PDF determined on a phase-space simulated sample of B 0 → D * 0 [D 0 γ]π + π − decays. In an alternative approach, the modelling of that background is replaced by nonparametric PDFs determined from simulated samples of B 0 → D * 0 [D 0 γ]ρ 0 decays with various polarisations. Two values for the longitudinal polarisation fraction are tried, one from the colour-suppressed mode B 0 → D * 0 ω, f L = (66.5 ± 4.7 ± 1.5)% [58] (this result is consistent with the result presented in Ref. [59]) and the other from the colour-allowed mode B 0 → D * − ρ + , f L = (88.5±1.6±1.2)% [60]. A systematic uncertainty of 1.6% for the B 0 → D * 0 [D 0 γ]π + π − modelling, corresponding to the largest deviation from the nominal result, is assigned. A different model of simulation for the generation of the background B 0 → D 0 K + π − decays is used to define the nonparametric PDF used in the invariant-mass fit. The first is a phase-space model where the generated signals decays are uniformly distributed over a regular-Dalitz plot, while the other is uniformly distributed over the square version of the Dalitz plot. The definition of the square-Dalitz plots is given in Ref. [21]. The difference between these two PDFs for the B 0 → D 0 K + π − background corresponds to a 0.3% relative effect. The component B 0 s → D * 0 K − π + is found to be initially negative (and compatible with zero) and then fixed in the default fit, resulting in a relative systematic uncertainty of 0.4%.
The main background channels in the fit to The nonparametric PDF for B 0 s → D * 0 K + K − decays is computed from an alternative simulated sample, where the nominal phase-space simulation is replaced by that computed with a square-Dalitz plot generation of the simulated decays. The measured difference between the two models results in relative systematic uncertainties on N B 0 →D 0 K + K − and r B 0 s /B 0 of 0.5% and 1.3%, respectively. The component B 0 s → D * 0 K − π + is modelled with a nonparametric PDF from the square-Dalitz plot simulation. Alternatively, the PDF of the B 0 s → D * 0 K − π + background is modelled with a nonparametric PDF determined from a simulated sample of B 0 s → D * 0 K * 0 decays, with polarisation taken from the similar mode B + → D * 0 K * + , f L = (86 ± 6 ± 3)% [61]. The difference obtained for these two PDF models for the B 0 s → D * 0 K − π + background gives relative systematic uncertainties on N B 0 →D 0 K + K − and r Bs/B d equal to 1.4% and 0.4%.
Systematic uncertainties for the constrained Λ 0 b → D 0 pK − or Λ 0 b → D 0 pπ − and Ξ 0 b → D 0 pπ − decay yields are discussed in Sect. 4.5 and are already taken into account when fitting the B 0 (s) → D 0 h + h − invariant-mass distributions. Finally, the impact of the simulation tuning that is described in Sect. 4.4 is evaluated by performing the default fit without modifying the PDFs of the various backgrounds to match the width and mean invariant masses seen in data. The resulting discrepancies give a relative effect of 0.5% on N (B 0 → D 0 π + π − ), 0.1% on N (B 0 → D 0 K + K − ), and 0.9% on r B 0 s /B 0 .

Summary of systematic uncertainties
The systematic uncertainties contributing to the ratio of branching fractions Eq. 5) and for the ra- Table 6. All sources of systematic uncertainties are uncorrelated and are therefore summed in Table 6: Relative systematic uncertainties, in percent, on the ratio of branching fractions The uncertainties are uncorrelated and summed in quadrature. Source

Results
The ratios of branching fractions are measured to be B(B 0 → D 0 K + K − ) B(B 0 → D 0 π + π − ) = (6.9 ± 0.4 ± 0.3)% (11) and = (93.0 ± 8.9 ± 6.9)%, where the first uncertainties are statistical and the second are systematic. Using the branching fraction B(B 0 → D 0 π + π − ) = (8.8 ± 0.5) × 10 −4 [39], the branching fraction of the B 0 → D 0 K + K − decay is measured to be where the third uncertainty is due to the limited knowledge of B(B 0 → D 0 π + π − ). The branching ratio of the decay B 0 s → D 0 K + K − is measured to be where the third uncertainty is due to the limited knowledge of B(B 0 → D 0 K + K − ). These results are compatible with and more precise than the previous LHCb results [18] for the same decays, i.e. B B 0 → D 0 K + K − = (4.7 ± 0.9 ± 0.6 ± 0.5) × 10 −5 and B B 0 s → D 0 K + K − = (4.2 ± 1.3 ± 0.9 ± 1.1) × 10 −5 , which were based on a subset of the current data set. The measurement of the branching ratios B(B 0 (s) → D 0 K + K − ) is the first step towards a Dalitz plot analysis of these modes using the LHC Run-2 data sample. Nonetheless, an inspection of the Dalitz plot is performed and several structures are visible in the B 0 → D 0 K + K − and B 0 s → D 0 K + K − decays. The Dalitz plot (m 2 D 0 K − , m 2 K − K + ) distribution of B 0 → D 0 K + K − candidates populating the B 0 signal mass range, m D 0 K + K − ∈ [5240, 5320] MeV/c 2 (i.e. ±40 MeV/c 2 around the B 0 mass), is displayed in Fig. 6. Several resonances are clearly visible. In the K + K − system, some unknown combination of the resonances a 0 (980) and f 0 (980) seem to be dominant. The search for the rare B 0 → D 0 φ decay using the same data sample is described in a separate publication [22]. For the D 0 K − system, the first band below 6 GeV 2 /c 4 corresponds to the partially reconstructed decay B 0 s → D s1 (2536) − K + /π + , with D s1 (2536) − → D * 0 K − (i.e. a background component due to the decay B 0 s → D * 0 K − K + or B 0 s → D * 0 K − π + , with the pion misidentified). The decay D s1 (2536) − → D 0 K − is forbidden by the conservation of parity in strong interactions and cannot explain the observed feature. The second band around 6.6 GeV 2 /c 4 is related to the mode the B 0 s mass), is shown in Fig. 7. Again, several resonances can be clearly identified. In the K + K − system, the φ resonance is observed and the study of the corresponding decay is presented in a separate publication [22]. There is some possible accumulation of candidates in a broad structure around 1.7 GeV/c 2 , which may correspond to the φ(1680) state. In addition, in the D 0 K − system, the D * s2 (2573) − resonance is identifiable. An analysis with additional LHCb data will enable the study of D * * s spectroscopy, particularly those resonances that are natural spin-parity members of the 1D and 1F families. The differences between the B 0 and B 0 s modes are also interesting. In addition, different resonances can contribute strongly with respect to B 0 s → D 0 K − π + decays [23,62].  [9] LHCb collaboration, Update of the LHCb combination of the CKM angle γ, LHCb-CONF-2018-002.