Observation of $B_{c}^{+} \to J/\psi D^{(*)} K^{(*)}$ decays

A search for the decays $B_c^+ \to J/\psi D^{(*)0} K^+$ and $B_c^+ \to J/\psi D^{(*)+} K^{*0}$ is performed with data collected at the LHCb experiment corresponding to an integrated luminosity of 3 fb$^{-1}$. The decays $B_c^+ \to J/\psi D^0 K^+$ and $B_c^+ \to J/\psi D^{*0} K^+$ are observed for the first time, while first evidence is reported for the $B_c^+ \to J/\psi D^{*+} K^{*0}$ and $B_c^+ \to J/\psi D^+ K^{*0}$ decays. The branching fractions of these decays are determined relative to the $B_c^+ \to J/\psi \pi^+$ decay. The $B_c^+$ mass is measured, using the $J/\psi D^0 K^+$ final state, to be $6274.28 \pm 1.40 (stat) \pm 0.32 (syst)$ MeV/$c^2$. This is the most precise single measurement of the $B_c^+$ mass to date.


Introduction
Composed of two heavy quarks of different flavour, the $B_c^+$ meson is the least understood member of the pseudoscalar bottom-meson family. The high centre-of-mass energies at the Large Hadron Collider enable the LHCb experiment to study the production, properties and decays of the $B_c^+$ meson [1-14]. As for the $B_c^+ \to J/\psi D_s^{(*)+}$ decays [10], the $B_c^+ \to J/\psi D^{(*)} K^{(*)}$ decays are expected to proceed mainly through spectator diagrams. In contrast to decays of other beauty hadrons, the weak annihilation topology is not suppressed and can contribute significantly to the decay amplitude (Fig. 1).
The $B_c^+ \to J/\psi D^{(*)} K^{(*)}$ decays offer a unique opportunity to study $D_s^+$ spectroscopy in the $D^{(*)} K^{(*)}$ system [15,16]. Given a large enough sample size, the quantum numbers of possible excited $D_{sJ}^+$ states can be determined, complementary to inclusive searches [17,18] and Dalitz analyses of other $B$ meson decays [19,20]. The complex structure of the $B_c^+ \to J/\psi D^{(*)} K^{(*)}$ decay also allows the search for exotic charmonium states in the $J/\psi D^{(*)}$ combination. A measurement of the relative branching fraction $\mathcal{B}(B_c^+ \to J/\psi D^{(*)} K^{*})/\mathcal{B}(B_c^+ \to J/\psi D^{(*)} K)$ provides information on the branching fraction of the as yet unobserved $B \to D^{*} D^{(*)} K^{*}$ decay, in which exotic charmonia close to the $D^{*} D^{(*)}$ threshold can be studied. The search for $B_c^+ \to J/\psi D^{(*)} K^{(*)}$ decays in this paper is a first step towards such spectroscopy studies.
The current world average of the $B_c^+$ mass measurements [21] is dominated by the LHCb results using $J/\psi \pi^+$ [1], $J/\psi D_s^+$ [10] and $J/\psi p\overline{p}\pi^+$ [13] decays. The $J/\psi \pi^+$ measurement benefits from a large yield, while the latter two have smaller systematic uncertainties because of their reduced Q-values. With a Q-value even smaller than that of the $B_c^+ \to J/\psi D_s^+$ or $J/\psi p\overline{p}\pi^+$ channels, the $B_c^+ \to J/\psi D^0 K^+$ decay enables another precise $B_c^+$ mass measurement.
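The Q-value argument above can be illustrated with a short calculation. This is a minimal sketch that takes the Q-value to be the parent mass minus the summed masses of the decay products (the usual definition); the daughter masses are rounded world-average values supplied here as assumptions, not numbers quoted in this paper.

```python
# Approximate masses in MeV/c^2. Only the Bc+ value comes from this paper;
# the others are rounded world-average values assumed for illustration.
MASS = {
    "Bc+": 6274.28,
    "J/psi": 3096.90,
    "D0": 1864.84,
    "Ds+": 1968.30,
    "K+": 493.68,
    "pi+": 139.57,
    "p": 938.27,  # the antiproton has the same mass
}

def q_value(parent, daughters):
    """Parent mass minus the summed masses of the decay products."""
    return MASS[parent] - sum(MASS[d] for d in daughters)

q_values = {
    "J/psi pi+": q_value("Bc+", ["J/psi", "pi+"]),
    "J/psi Ds+": q_value("Bc+", ["J/psi", "Ds+"]),
    "J/psi p pbar pi+": q_value("Bc+", ["J/psi", "p", "p", "pi+"]),
    "J/psi D0 K+": q_value("Bc+", ["J/psi", "D0", "K+"]),
}

# Print the channels ordered from smallest to largest Q-value.
for channel, q in sorted(q_values.items(), key=lambda item: item[1]):
    print(f"{channel:>18s}: Q = {q:7.1f} MeV/c^2")
```

With these inputs the $J/\psi D^0 K^+$ channel has a Q-value of roughly 820 MeV/$c^2$, compared with roughly 1160 and 1210 MeV/$c^2$ for the $J/\psi p\overline{p}\pi^+$ and $J/\psi D_s^+$ channels.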
The purpose of this analysis is to search for the $B_c^+$ meson decaying into the final states $J/\psi D^0 K^+$, $J/\psi D^{*0} K^+$, $J/\psi D^+ K^{*0}$ and $J/\psi D^{*+} K^{*0}$. The $D^0$ meson is reconstructed in both the $K^-\pi^+$ and $K^-\pi^+\pi^-\pi^+$ final states in the search for the $B_c^+ \to J/\psi D^{(*)0} K^+$ decays, and only in the $K^-\pi^+$ final state for the other decays. The $D^+$ meson is reconstructed in the $K^-\pi^+\pi^+$ final state. The decays $D^{*0} \to D^0\gamma$, $D^{*0} \to D^0\pi^0$ and $D^{*+} \to D^0\pi^+$ are partially reconstructed, retaining only the $D^0$ while neglecting the photon or pion. The $J/\psi$ is reconstructed in the $\mu^+\mu^-$ final state. The branching fraction of the $B_c^+ \to J/\psi D^0 K^+$ decay is measured relative to that of the $B_c^+ \to J/\psi \pi^+$ decay, while the other channels are normalised to the $B_c^+ \to J/\psi D^0 K^+$ decay. The determination of the $B_c^+$ mass is performed with the $B_c^+ \to J/\psi D^0(\to K^-\pi^+)K^+$ final state only.

Detector and dataset
This analysis uses $pp$ collision data collected at the LHCb experiment corresponding to an integrated luminosity of 1.0 fb$^{-1}$ at a centre-of-mass energy of 7 TeV and 2.0 fb$^{-1}$ at 8 TeV. The LHCb detector [22,23] is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, designed for the study of particles containing $b$ or $c$ quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the $pp$ interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. The polarity of the dipole magnet is reversed periodically throughout data taking. The tracking system provides a measurement of momentum, $p$, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/$c$. The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of $(15 + 29/p_{\mathrm{T}})$ µm, where $p_{\mathrm{T}}$ is the component of the momentum transverse to the beam, in GeV/$c$. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers.
The online event selection is performed by a trigger, which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. For all decays considered in this paper, a trigger is used that enriches events with $J/\psi$ decays into the two-muon final state.
At the hardware trigger level the signal candidates are required to contain at least one muon with $p_{\mathrm{T}} > 1.48$ GeV/$c$ ($> 1.76$ GeV/$c$) in the 7 TeV (8 TeV) data, or a muon pair where the product of the $p_{\mathrm{T}}$ values of the muons is greater than (1.3 GeV/$c$)$^2$ and (1.6 GeV/$c$)$^2$ in the 7 TeV and 8 TeV data, respectively. In the first step of the software trigger a single muon candidate with $p_{\mathrm{T}} > 1.0$ GeV/$c$ is required, or a pair of oppositely charged muons, each with $p_{\mathrm{T}} > 500$ MeV/$c$, with a combined invariant mass $M_{\mu\mu} > 2.7$ GeV/$c^2$. Finally, a $J/\psi$ candidate is required to be formed from a muon pair, and to have a mass within $\pm 120$ MeV/$c^2$ of the known $J/\psi$ mass [21] and a vertex position displaced from its associated PV with a significance of at least three standard deviations ($\sigma$).
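The hardware-level muon requirement described above can be written as a simple decision function. This is an illustrative sketch only: the function name is hypothetical, and forming the dimuon product from the two highest-$p_{\mathrm{T}}$ muons is an assumption made here, not a statement from the text.

```python
def passes_hardware_muon_trigger(muon_pts_gev, sqrt_s_tev):
    """Return True if the muon pT values (GeV/c) satisfy the single-muon
    or the dimuon requirement quoted for the given centre-of-mass energy."""
    if sqrt_s_tev == 7:
        single_cut, pair_product_cut = 1.48, 1.3 ** 2
    elif sqrt_s_tev == 8:
        single_cut, pair_product_cut = 1.76, 1.6 ** 2
    else:
        raise ValueError("thresholds are quoted only for 7 and 8 TeV data")

    pts = sorted(muon_pts_gev, reverse=True)
    # Single-muon requirement: at least one muon above the pT threshold.
    if pts and pts[0] > single_cut:
        return True
    # Dimuon requirement: product of the two largest pT values
    # (choice of the two hardest muons is an assumption of this sketch).
    if len(pts) >= 2 and pts[0] * pts[1] > pair_product_cut:
        return True
    return False
```

For example, a pair of muons with $p_{\mathrm{T}}$ of 1.4 and 1.3 GeV/$c$ fails the single-muon threshold in both data sets but passes the dimuon requirement in the 7 TeV configuration only.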
Simulated samples of the signal and the normalisation channel are used to optimise the selection criteria and to estimate the efficiencies. The simulation of $B_c^+$ production in $pp$ collisions is modelled with the BcVegPy generator [24,25], interfaced to Pythia 6 [26] with a specific LHCb configuration [27]. Decays of hadronic particles are described by EvtGen [28], in which final-state radiation is generated using Photos [29]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [30] as described in Ref. [31].

Event selection
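The offline selection starts with a loose preselection and is followed by a multivariate selection using a boosted decision tree (BDT) [32,33]. This is done independently for each of the final states considered:
• $J/\psi D^0(\to K^-\pi^+)K^+$;
• $J/\psi D^0(\to K^-\pi^+\pi^-\pi^+)K^+$;
• $J/\psi D^0(\to K^-\pi^+)K^{*0}$;
• $J/\psi D^+(\to K^-\pi^+\pi^+)K^{*0}$;
• $J/\psi \pi^+$ (normalisation channel).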
In the offline selection, trigger decisions are associated with reconstructed particles. To establish whether a significant signal is observed, no requirements are placed on whether the trigger decision is due to the signal candidate itself or to other particles in the event. In the branching fraction and mass measurements, the trigger decision is required to be due to the signal candidate (denoted TOS, Trigger-On-Signal), which allows a better determination of the trigger efficiency.
In the preselection each $J/\psi$ candidate is formed from a pair of muons, each with a good-quality track fit, $p_{\mathrm{T}}$ in excess of 550 MeV/$c$, and minimum $\chi^2_{\mathrm{IP}}$ with respect to any reconstructed PV greater than 4, where $\chi^2_{\mathrm{IP}}$ is the difference between the vertex-fit $\chi^2$ of a given PV reconstructed with and without the considered track. The $\chi^2_{\mathrm{IP}}$ requirement rejects tracks that come from the associated PV rather than from $B_c^+$ decays, where the associated PV is the primary vertex with respect to which the $B_c^+$ candidate has the smallest $\chi^2_{\mathrm{IP}}$. The muons are required to be positively identified with neural-network-based particle identification (PID) variables using information from different sub-detectors. The muon pair is required to form a vertex of good quality and have an invariant mass in the range 3040-3150 MeV/$c^2$.
The $J/\psi$ candidate is then combined with hadron tracks to form a $B_c^+$ candidate. All hadronic tracks are required to have a good-quality track fit, $p_{\mathrm{T}}$ in excess of 100 MeV/$c$, and minimum $\chi^2_{\mathrm{IP}}$ with respect to any PV greater than 4. Loose PID requirements are applied to pions and kaons for the $J/\psi D^0(\to K^-\pi^+)K^+$ final state, while tighter selections on kaons are applied at a later stage. For other final states, tighter PID selections are imposed in the preselection. The $D^0$ and $D^+$ candidates are required to have a good-quality vertex and a mass within $\pm 30$ MeV/$c^2$ of the known masses, where the size of the window corresponds to approximately $\pm 4$ times the mass resolution. The $K^{*0}$ meson is defined as a $K^+\pi^-$ combination within the mass range 792-992 MeV/$c^2$, roughly four times the $K^*(892)^0$ natural width [21]. The $B_c^+$ candidate is required to have a good-quality vertex and a mass within a wide window of $\pm 700$ MeV/$c^2$ around the world average $B_c^+$ mass [21].
A BDT discriminator is trained for each of the signal final states to further suppress the combinatorial background, except that the partially reconstructed $J/\psi D^{*0} K^+$ decay shares the same BDT as the fully reconstructed $J/\psi D^0 K^+$ decay. The training uses simulated samples as signal, and background events from data containing $K^{(*)}$ candidates of opposite strangeness to those in the respective signal decays (for example, $J/\psi D^0 K^-$ for the $J/\psi D^0 K^+$ signal, or $J/\psi D^+ \overline{K}{}^{*0}$ for the $J/\psi D^+ K^{*0}$ signal, later referred to as "wrong-sign" samples). Taking the $J/\psi D^0(\to K^-\pi^+)K^+$ decay as an example, the variables used in the training fall into the following categories:
• the $p_{\mathrm{T}}$ of the $B_c^+$ candidate and its decay products: $J/\psi$, $D^0$ and $K^+$;
• the vertex-fit $\chi^2$ per degree of freedom ($\chi^2/\mathrm{ndf}$) of the $B_c^+$, $J/\psi$ and $D^0$ mesons, as well as the $\chi^2/\mathrm{ndf}$ from a refit of the $B_c^+$ decay constraining the reconstructed $J/\psi$ and $D^0$ masses to their known values, and the $B_c^+$ momentum to point back to its associated PV;
• variables describing the event geometry: the flight distance significances (FDS) of the $B_c^+$ and $D^0$ candidates with respect to the associated PV, where FDS is the distance between the vertex and the reference point divided by its uncertainty; $\chi^2_{\mathrm{IP}}$ and $\theta$ of the $B_c^+$ meson relative to its associated PV, where $\theta$ is the angle between the $B_c^+$ momentum and the line connecting its production vertex and decay vertex; $\chi^2_{\mathrm{IP}}$ and $\theta$ of the $D^0$ meson relative to the $B_c^+$ decay vertex; and the $D^0$ decay length from the refit with the constraints mentioned above.
For other final states, the variables corresponding to the $D^0$ or $K^+$ mesons are replaced with those corresponding to the $D^+$ or $K^{*0}$ mesons as appropriate.
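Two of the geometric training variables defined above, the flight distance significance and the pointing angle $\theta$, can be sketched as follows. This is a simplified illustration: the function names are hypothetical, and the uncertainty on the distance is passed in as a single number, whereas a real vertex fit would derive it from the vertex covariance matrices.

```python
import math

def flight_distance_significance(vertex, reference, sigma_distance):
    """Distance between a decay vertex and a reference point, divided by
    its uncertainty (taken here as a given scalar, a simplification)."""
    return math.dist(vertex, reference) / sigma_distance

def pointing_angle(momentum, production_vertex, decay_vertex):
    """Angle (radians) between the momentum vector and the line joining
    the production vertex to the decay vertex."""
    flight = [d - p for d, p in zip(decay_vertex, production_vertex)]
    dot = sum(a * b for a, b in zip(momentum, flight))
    norm = math.hypot(*momentum) * math.hypot(*flight)
    # Clamp to [-1, 1] to protect acos against rounding errors.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos_theta)
```

A candidate that genuinely originates from the reference vertex has a small $\theta$, while combinatorial background tends to populate larger angles and smaller flight distance significances.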
The thresholds of the BDT discriminants are chosen to maximise the figure of merit $\varepsilon/(3/2 + \sqrt{N_B})$ [34], aiming for a signal significance of three standard deviations, where $\varepsilon$ is the signal efficiency estimated from simulation and $N_B$ is the number of expected background candidates in the signal region (6263-6289 MeV/$c^2$ for fully reconstructed signals, and 6037-6149 MeV/$c^2$ for the partially reconstructed $B_c^+ \to J/\psi D^{*+} K^{*0}$ decay), extrapolated from the wrong-sign samples. For the $J/\psi D^0(\to K^-\pi^+)K^+$ final state the BDT discriminant output and the PID variables of the kaons are optimised simultaneously, while for the other final states only the BDT discriminant is optimised since tighter PID selections have already been imposed. When there is more than one candidate present in a selected event, the one with the smallest $\chi^2/\mathrm{ndf}$ in the constrained vertex refit is retained.
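The threshold optimisation described above can be sketched as a scan over candidate cut values. The figure of merit is the one quoted in the text; the scan points (threshold, efficiency, expected background) are invented numbers used purely for illustration.

```python
import math

def figure_of_merit(efficiency, n_background, n_sigma=3.0):
    """Figure of merit eps / (n_sigma/2 + sqrt(N_B)); n_sigma = 3 gives
    the 3/2 term quoted in the text."""
    return efficiency / (n_sigma / 2.0 + math.sqrt(n_background))

def best_threshold(scan):
    """Return the threshold with the largest figure of merit.
    `scan` is an iterable of (threshold, efficiency, n_background)."""
    return max(scan, key=lambda row: figure_of_merit(row[1], row[2]))[0]

# Hypothetical scan points, not taken from the analysis.
scan = [
    (0.0, 0.90, 400.0),
    (0.2, 0.80, 100.0),
    (0.4, 0.65, 25.0),
    (0.6, 0.50, 9.0),
    (0.8, 0.20, 1.0),
]
chosen = best_threshold(scan)
```

Because the figure of merit depends only on the signal efficiency and not on the signal yield, it can be optimised without assuming a branching fraction for the unobserved decay.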
For the normalisation channel $B_c^+ \to J/\psi \pi^+$, the training variables are similar to those of the signal channels, except for the absence of variables related to the $D^0$ meson, and the addition of the pion $p_{\mathrm{T}}$ and $\chi^2_{\mathrm{IP}}$. Simulated signal decays are used in the training, while the background sample is taken from signal candidates in the upper sideband. The BDT threshold is optimised using a figure of merit based on $N_S$ and $N_S + N_B$, where $N_S$ is the expected signal yield, and $N_S + N_B$ is the total number of candidates in the region 6241-6312 MeV/$c^2$, corresponding to $\pm 3$ times the mass resolution around the $B_c^+$ mass.

Signal yields
The invariant mass spectrum of the selected $J/\psi D^0 K^+$ candidates is shown in Fig. 2(a), where both $D^0 \to K^-\pi^+$ and $D^0 \to K^-\pi^+\pi^-\pi^+$ samples are combined. The result of an extended unbinned maximum likelihood fit is also shown. The sharp peak at the $B_c^+$ mass is the fully reconstructed $B_c^+ \to J/\psi D^0 K^+$ signal, which is fitted with the sum of a Gaussian function and a double-sided Crystal Ball function (DSCB), a modified Gaussian distribution with power-law tails on both sides, whose tail parameters are fixed from simulation. The Gaussian and the DSCB functions are constrained to have the same mean. The width of the Gaussian component is free to vary in the fit, while the ratio of the DSCB core width over the Gaussian width is fixed to the value expected from simulation.
Figure 2: The invariant mass distribution of $J/\psi D^0 K^+$ candidates: (a) $D^0 \to K^-\pi^+$ and $D^0 \to K^-\pi^+\pi^-\pi^+$ combined; (b) $D^0 \to K^-\pi^+$ only, with the events required to be TOS.
The wider peaking structure at lower mass is due to partially reconstructed $B_c^+ \to J/\psi D^{*0} K^+$ signal, which is modelled using a nonparametric shape obtained from simulated $D^{*0} \to D^0\gamma$ and $D^0\pi^0$ decays, combined according to their relative branching fractions [35]. The combinatorial background is fitted with an exponential function. The signal yields of $B_c^+ \to J/\psi D^0 K^+$ and $B_c^+ \to J/\psi D^{*0} K^+$ decays are $26 \pm 7$ and $102 \pm 13$, respectively. The signal significance, $S$, is estimated using the change in the fit likelihood from a background-only hypothesis to a signal-plus-background hypothesis, $S = \sqrt{-2\ln(\mathcal{L}_B/\mathcal{L}_{S+B})}$ [36]. Taking into account the systematic effects discussed in Sec. 5, the significance of the $B_c^+ \to J/\psi D^0 K^+$ signal is $6.3\sigma$ and the significance of the partially reconstructed $B_c^+ \to J/\psi D^{*0} K^+$ signal is $10.3\sigma$. Both are observed for the first time. An alternative method gives a compatible significance estimation. In this method pseudoexperiments are generated using the background-only hypothesis, which are then fitted using the signal-plus-background hypothesis to obtain a cumulative probability distribution $P(N \geq N_S)$ as a function of the fitted signal yield $N_S$. Given the actual yield from data, the p-value and signal significance can be derived.
The invariant mass distributions of the $J/\psi D^0$, $D^0 K^+$ and $J/\psi K^+$ combinations are shown in Fig. 3 for the $B_c^+ \to J/\psi D^0 K^+$ and $J/\psi D^{*0} K^+$ signal events. The background is subtracted using the sPlot technique [37], with $M(J/\psi D^0 K^+)$ as the discriminating variable. The distributions from simulation using a phase-space decay model are shown for comparison. The simulation shows comparatively poor agreement with data for the $D^0 K^+$ invariant mass. This distribution, sensitive to possible intermediate resonances, should be studied further with more data.
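The two significance estimates described above, the likelihood-ratio formula and the pseudoexperiment method, can be sketched with the Python standard library. The function names are hypothetical, the conversion from p-value to significance assumes a one-sided Gaussian convention, and the numerical inputs in the usage lines are made up.

```python
import math
from statistics import NormalDist

def significance_from_likelihoods(ln_l_bkg, ln_l_sig_bkg):
    """S = sqrt(-2 ln(L_B / L_{S+B})), computed from log-likelihoods."""
    return math.sqrt(max(0.0, 2.0 * (ln_l_sig_bkg - ln_l_bkg)))

def p_value_from_pseudoexperiments(fitted_yields, observed_yield):
    """Fraction of background-only pseudoexperiments whose fitted signal
    yield is at least as large as the yield observed in data."""
    n_above = sum(1 for n in fitted_yields if n >= observed_yield)
    return n_above / len(fitted_yields)

def significance_from_p_value(p_value):
    """One-sided Gaussian significance; p_value must lie in (0, 1)."""
    return -NormalDist().inv_cdf(p_value)

# Made-up inputs for illustration only.
s_lr = significance_from_likelihoods(-120.0, -100.0)       # sqrt(40)
p_toy = p_value_from_pseudoexperiments([0, 1, 2, 3, 4], 3)  # 2 of 5 toys
```

A practical limitation of the pseudoexperiment method is that a significance above $5\sigma$ requires several million background-only samples before any of them fluctuates to the observed yield.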
The invariant mass distributions of the final states containing $K^{*0}$ candidates are shown in Fig. 4. The $B_c^+ \to J/\psi D^{*+} K^{*0}$ decay is partially reconstructed, neglecting the pion in the $D^{*+} \to D^0\pi^+$ decay (Fig. 4(a,c)). The shape of the signal distribution is fixed from simulation and the background is modelled with an exponential function. The $B_c^+ \to J/\psi D^+ K^{*0}$ decay is fully reconstructed and modelled with a DSCB function, while the background is described by an exponential function (Fig. 4(b,d)). Without TOS requirements the yields of the $B_c^+ \to J/\psi D^{*+} K^{*0}$ and $B_c^+ \to J/\psi D^+ K^{*0}$ decays are $11 \pm 4$ and $7.4 \pm 2.9$ events, and the significances are $4.0\sigma$ and $4.4\sigma$, respectively, including systematic effects. With TOS requirements applied, their yields are $7.8 \pm 3.2$ and $3.9 \pm 2.1$, where the uncertainties are statistical only.
The $J/\psi \pi^+$ mass distribution of the normalisation channel is shown in Fig. 5 with TOS requirements applied. The signal is modelled with the sum of a DSCB and a Gaussian function, the combinatorial background with an exponential function, and the misidentified background from the $B_c^+ \to J/\psi K^+$ decay is modelled with a DSCB whose parameters are fixed to those that describe the simulated data. The signal yield is $3616 \pm 73$ events.
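The signal model used in the fits above, a Gaussian plus a DSCB sharing a common mean, can be sketched as follows. This is an unnormalised illustration using one common parameterisation of the DSCB; the function names and all parameter values are assumptions, and a real fit would normalise each component over the fitted mass range before combining them.

```python
import math

def dscb(x, mu, sigma, alpha_l, n_l, alpha_r, n_r):
    """Unnormalised double-sided Crystal Ball: Gaussian core with
    power-law tails below mu - alpha_l*sigma and above mu + alpha_r*sigma."""
    t = (x - mu) / sigma
    if t < -alpha_l:
        a = (n_l / alpha_l) ** n_l * math.exp(-0.5 * alpha_l ** 2)
        b = n_l / alpha_l - alpha_l
        return a * (b - t) ** (-n_l)
    if t > alpha_r:
        a = (n_r / alpha_r) ** n_r * math.exp(-0.5 * alpha_r ** 2)
        b = n_r / alpha_r - alpha_r
        return a * (b + t) ** (-n_r)
    return math.exp(-0.5 * t * t)

def signal_shape(x, mu, sigma_gauss, width_ratio, frac_dscb, tails):
    """Gaussian plus DSCB with a shared mean. The DSCB core width is
    width_ratio * sigma_gauss, mimicking a ratio fixed from simulation.
    `tails` is (alpha_l, n_l, alpha_r, n_r)."""
    gauss = math.exp(-0.5 * ((x - mu) / sigma_gauss) ** 2)
    cb = dscb(x, mu, width_ratio * sigma_gauss, *tails)
    return frac_dscb * cb + (1.0 - frac_dscb) * gauss
```

The tail coefficients are chosen so that the function is continuous at the points where the Gaussian core joins the power-law tails.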

Branching fraction measurement
After correction for detection efficiencies, the signal yields obtained in Sec. 4 are used to determine relative branching fractions. The choice of the fit model is a significant source of systematic uncertainty on the signal yield and therefore also on the branching fraction. Alternative models are used for the signal (including a single DSCB function, a Gaussian function, and a nonparametric shape from simulation), and the combinatorial background (including first- and second-order polynomial functions). For the $J/\psi D^0 K^+$ final state, the feed-down from higher excited intermediate states is considered, such as $J/\psi D_0^*(2400)^0 K^+$, $J/\psi D_1(2420)^0 K^+$ and $\chi_{c1}(\to J/\psi\gamma)D^0 K^+$. If these contributions, with shapes estimated by simulation, are included in the fit, the branching fractions change by no more than 0.5%. The shape of the partially reconstructed $B_c^+ \to J/\psi D^{*0} K^+$ signal depends on the polarization and intermediate resonances in the decay. Extreme cases of helicity amplitude configurations are generated for the decay $B_c^+ \to J/\psi D_{s1}(2536)^+(\to D^{*0} K^+)$ and it is found that the unknown polarization and decay structure can change the signal yield by up to 5.2%. A dedicated simulation study shows that the possible peaking background from the charmless $B_c^+ \to J/\psi K^+ K^- \pi^+$ decay [11] is negligible. Additionally, the fits are repeated in different mass ranges. In the $B_c^+ \to J/\psi D^{*+} K^{*0}$ sample, the background level is slightly high around 6450 MeV/$c^2$, but consistent with a statistical fluctuation. A fit in a narrower range excluding this region gives a compatible result. The total uncertainties due to fit modelling are given in Table 1 for each of the channels.
The total efficiencies are given by the product of three factors: the geometric detector acceptance, the reconstruction and selection efficiencies, and the trigger efficiency. They are generally estimated using simulated samples, corrected to match the data when the simulation is known to be imperfect. In the simulation the $B_c^+$ meson is generated with a lifetime of 450 fs taken from an early world average with a large uncertainty [35]. For the efficiency estimation the simulated events are therefore weighted to obtain the same lifetime ($\tau = 511.4$ fs) as the recent and more precise LHCb measurements [4,5]. The lifetime is varied by one standard deviation (9.3 fs) to study the corresponding systematic effect, which is found to be negligible. The simulation assumes a phase-space decay of the $B_c^+ \to J/\psi D^{(*)} K^{(*)}$ averaged over all possible polarization configurations, and without any intermediate decay structure. The efficiency dependence on the invariant mass of the $DK^{(*)}$ system is studied and the efficiencies of selected candidates are corrected event-by-event according to the $M(DK^{(*)})$ value. The distributions of variables used in the BDT training are compared between simulation and background-subtracted data, and show good agreement.
The tracking and PID efficiencies are determined in bins of track momentum, pseudorapidity and event multiplicity using a data-driven method [38]. The tracking efficiency uncertainty is estimated to be 0.4% per muon or hadron track, while for each hadron track an additional uncertainty of 1.4% is assigned due to the imperfect knowledge of the interaction with the detector material. Alternative binning schemes of track momentum, pseudorapidity and event multiplicity are applied to estimate the uncertainty on the PID efficiencies. The systematic uncertainty on the trigger efficiency is determined to be 1.1% from a comparison between data and simulation using a large $J/\psi$ sample [10, 13].
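The lifetime reweighting described above can be sketched as a per-event weight equal to the ratio of two exponential decay-time distributions. The two lifetimes are those quoted in the text; the function names are hypothetical, and applying the weight to the generated (true) decay time of each simulated event is an assumption of this sketch.

```python
import math

TAU_GENERATED_FS = 450.0  # lifetime used when generating the simulation
TAU_TARGET_FS = 511.4     # lifetime the weighted sample should reproduce

def lifetime_weight(t_fs, tau_gen=TAU_GENERATED_FS, tau_target=TAU_TARGET_FS):
    """Ratio of the target to the generated exponential decay-time PDF,
    (1/tau_target) exp(-t/tau_target) / ((1/tau_gen) exp(-t/tau_gen))."""
    return (tau_gen / tau_target) * math.exp(t_fs / tau_gen - t_fs / tau_target)

def weighted_efficiency(decay_times_fs, passed_flags):
    """Selection efficiency as a ratio of summed weights: selected events
    over all generated events."""
    weights = [lifetime_weight(t) for t in decay_times_fs]
    selected = sum(w for w, ok in zip(weights, passed_flags) if ok)
    return selected / sum(weights)
```

Since the target lifetime is longer than the generated one, events with long decay times receive weights above one. Selections that favour displaced vertices therefore tend to have a higher efficiency after reweighting.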
The limited data size of the simulation samples introduces systematic uncertainties of less than 1%. The uncertainties of intermediate $D^{(*)}$ decay branching fractions [35] are propagated into the final results. Cross-checks have been performed to ensure the robustness of the results, such as confirming that the BDT output is not correlated with the $B_c^+$ candidate mass. The relative branching fractions of the $B_c^+$ decays are measured to be

where the first uncertainty is statistical and the second is systematic. The systematic uncertainties are summarised in Table 1.

Mass measurement
The $B_c^+$ mass is determined from the fit to the $B_c^+ \to J/\psi D^0(\to K^-\pi^+)K^+$ signal as shown in Fig. 2(b). The summary of systematic uncertainties is given in Table 2. The dominant term is the momentum scale calibration. For a mass measurement, the momenta of the final-state particles need to be measured precisely. In previous studies a large sample of $B^+ \to J/\psi K^+$, $J/\psi \to \mu^+\mu^-$ decays was used to calibrate the track momentum, and the uncertainty on the momentum scale calibration was determined to be 0.03% [39]. This causes a change in the central value of the $B_c^+$ mass by up to 0.26 MeV/$c^2$. Using the same procedure as described in Sec. 5, the choice of the model is estimated to introduce an uncertainty of 0.18 MeV/$c^2$. The effect of soft photon emission via final-state radiation is minimised by constraining the reconstructed $J/\psi$ and $D^0$ masses to their nominal values. Any remaining bias is investigated using a large sample of simulated pseudoexperiments, which results in a correction of $+0.08$ MeV/$c^2$ to the central value, with an uncertainty of 0.01 MeV/$c^2$. The uncertainties associated with the $J/\psi$ (0.006 MeV/$c^2$) and $D^0$ (0.05 MeV/$c^2$) masses [21] are propagated to the $B_c^+$ mass. The effect of an imperfect energy loss correction has been studied in the previous $b$-hadron mass measurements [40] by varying the amount of detector material. The corresponding uncertainty is 0.05 MeV/$c^2$ for the $B_c^+$ mass measurement.
The $B_c^+$ mass is determined to be $6274.28 \pm 1.40 \pm 0.32$ MeV/$c^2$, consistent with previous LHCb results [1, 10, 13] and the world average [21]. This is the most precise single measurement of the $B_c^+$ mass. Including this result, the new LHCb average is $6274.6 \pm 1.0$ MeV/$c^2$, where the correlated systematic uncertainties between the measurements, including those due to momentum scale and energy loss corrections, are fully accounted for.
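The individual systematic uncertainties quoted above can be combined as a consistency check on the total. This sketch assumes the sources are uncorrelated and added in quadrature; that assumption is made here for illustration, with the dictionary keys being descriptive labels rather than entries copied from Table 2.

```python
import math

# Systematic uncertainties on the Bc+ mass in MeV/c^2, as quoted in the text.
systematics_mev = {
    "momentum scale": 0.26,
    "fit model": 0.18,
    "final-state radiation": 0.01,
    "J/psi mass": 0.006,
    "D0 mass": 0.05,
    "energy loss": 0.05,
}

# Quadrature sum, assuming uncorrelated sources.
total_syst = math.sqrt(sum(v * v for v in systematics_mev.values()))

# Combine with the statistical uncertainty for the total uncertainty.
stat_mev = 1.40
total_unc = math.hypot(stat_mev, total_syst)

print(f"systematic: {total_syst:.2f} MeV/c^2, total: {total_unc:.2f} MeV/c^2")
```

The quadrature sum reproduces the quoted systematic uncertainty of 0.32 MeV/$c^2$, and the total uncertainty of about 1.44 MeV/$c^2$ is dominated by the statistical term.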

Conclusion
The decays $B_c^+ \to J/\psi D^0 K^+$ and $B_c^+ \to J/\psi D^{*0} K^+$ are observed for the first time with $pp$ collision data corresponding to an integrated luminosity of 3 fb$^{-1}$, collected by the LHCb experiment at centre-of-mass energies of 7 and 8 TeV. First evidence is reported for the $B_c^+ \to J/\psi D^{*+} K^{*0}$ and $J/\psi D^+ K^{*0}$ decays. The $B_c^+ \to J/\psi D^0 K^+$ branching fraction is measured relative to the $B_c^+ \to J/\psi \pi^+$ decay, and all the other signal channels are measured relative to the $B_c^+ \to J/\psi D^0 K^+$ decay. The $B_c^+ \to J/\psi D^{(*)} K^+$ decay has significant potential for studies of excited $D_s^+$ states when more data are recorded. The $B_c^+$ mass is measured to be $6274.28 \pm 1.40 \pm 0.32$ MeV/$c^2$, which is the most precise single measurement and is in good agreement with the world average and the previous LHCb results. In combination with previous results from the LHCb experiment [1, 10, 13], the $B_c^+$ mass is determined to be $6274.6 \pm 1.0$ MeV/$c^2$.

[6] LHCb collaboration, R. Aaij et al., First observation of the decay $B_c^+ \to J/\psi \pi^+\pi^-\pi^+$, Phys. Rev. Lett. 108 (2012) 251802, arXiv:1204.0079.