Observation of the decay $B_c^+ \to \psi(2S)\pi^+$

The decay $B_c^+ \to \psi(2S)\pi^+$ with $\psi(2S) \to \mu^+\mu^-$ is observed with a significance of $5.2\,\sigma$ using $pp$ collision data corresponding to an integrated luminosity of $1.0\,\text{fb}^{-1}$ collected by the LHCb experiment. The branching fraction of $B_c^+ \to \psi(2S)\pi^+$ decays relative to that of the $B_c^+ \to J/\psi\pi^+$ mode is measured to be \begin{equation*} \frac{\mathcal{B}(B_c^+ \to \psi(2S)\pi^+)}{\mathcal{B}(B_c^+ \to J/\psi\pi^+)} = 0.250 \pm 0.068\,(\text{stat}) \pm 0.014\,(\text{syst}) \pm 0.006\,(\mathcal{B}). \end{equation*} The last term is the uncertainty on the ratio $\mathcal{B}(\psi(2S) \to \mu^+\mu^-)/\mathcal{B}(J/\psi \to \mu^+\mu^-)$.

The $B_c^+$ meson is the only known meson composed of two different flavours of heavy quarks, charm and beauty. Either quark can decay via the weak interaction while the other acts as a spectator, so a wide range of decay channels is possible. However, only a few of these channels have been observed experimentally [1][2][3][4]. The LHC opens a new era for $B_c^+$ physics, with an expected production cross-section of $\sim 0.4\,\mu$b at centre-of-mass energy $\sqrt{s} = 7$ TeV for the $B_c^+$ meson [5,6]. The LHCb experiment has observed the decay $B_c^+ \to J/\psi\pi^+$ [7], and new channels such as $B_c^+ \to J/\psi\pi^+\pi^+\pi^-$ [8] have started to emerge. We report here the first observation of the decay $B_c^+ \to \psi(2S)\pi^+$ with $\psi(2S) \to \mu^+\mu^-$ and the measurement of the ratio of branching fractions $\mathcal{B}(B_c^+ \to \psi(2S)\pi^+)/\mathcal{B}(B_c^+ \to J/\psi\pi^+)$. The inclusion of charge conjugate modes is implied throughout the paper. The relativistic quark model [9] and several other models [10][11][12][13] make various theoretical predictions for this ratio of branching fractions. As a two-body decay, $B_c^+ \to \psi(2S)\pi^+$ is under better theoretical control than $B_c^+ \to J/\psi\pi^+\pi^+\pi^-$, and this measurement is therefore particularly useful for testing models of $B_c^+$ decays. The $B_c^+ \to J/\psi\pi^+$ decay mode is chosen as the normalisation channel because of its identical final state and similar event topology. Both channels take advantage of the large trigger efficiency due to the two muons in the final state.
The analysis is based on $pp$ collision data corresponding to an integrated luminosity of $1.0\,\text{fb}^{-1}$ at $\sqrt{s} = 7$ TeV collected with the LHCb detector in 2011. The detector [14] is a single-arm forward spectrometer covering the pseudorapidity range $2 < \eta < 5$, designed for the study of particles containing $b$ or $c$ quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the $pp$ interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream. The combined tracking system has a momentum resolution $\Delta p/p$ that varies from 0.4% at 5 GeV/$c$ to 0.6% at 100 GeV/$c$, and an impact parameter (IP) resolution of 20 $\mu$m for tracks with high transverse momentum ($p_{\mathrm{T}}$). Charged hadrons are identified using two ring-imaging Cherenkov detectors, and good kaon-pion separation is achieved for tracks with momentum between 5 GeV/$c$ and 100 GeV/$c$. Photon, electron and hadron candidates are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers. The trigger system [15] consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software trigger that applies a full event reconstruction and reduces the event rate from 1 MHz to around 3 kHz. Candidate events are selected by requiring a single muon or dimuon with high $p_{\mathrm{T}}$ in the hardware trigger.
In the software trigger, a charged particle is required to have $p_{\mathrm{T}} > 1.7$ GeV/$c$, or $p_{\mathrm{T}} > 1$ GeV/$c$ if identified as a muon; alternatively a dimuon trigger requires two oppositely charged muons with $p_{\mathrm{T}} > 500$ MeV/$c$, an invariant mass of the muon pair $M_{\mu^+\mu^-} > 2.95$ GeV/$c^2$, and a decay length significance of the muon track pair with respect to the primary vertex greater than 5.
Further offline selections require both muons to have $p_{\mathrm{T}} > 550$ MeV/$c$ and a track fit $\chi^2_{\mathrm{tr}}$ per degree of freedom ($\chi^2_{\mathrm{tr}}/\text{ndf}$) of less than 5. The mass of the $\psi$ candidate is required to be within a window of 100 MeV/$c^2$ centred around the known $\psi$ mass (3686 MeV/$c^2$ for $\psi(2S)$ and 3097 MeV/$c^2$ for $J/\psi$) [16]. The $\psi$ vertex fit $\chi^2_{\mathrm{vtx}}/\text{ndf}$ is required to be less than 20, and the $\psi$ decay length significance larger than 5.
The $B_c^+$ candidate is reconstructed from the $\psi$ and a bachelor pion. The pion is required to have $p_{\mathrm{T}} > 500$ MeV/$c$, a track fit $\chi^2_{\mathrm{tr}}/\text{ndf} < 10$ and an IP $\chi^2_{\mathrm{IP}}$ with respect to the primary interaction greater than 4. The IP $\chi^2_{\mathrm{IP}}$ is defined as the difference between the $\chi^2$ of the primary vertex reconstructed with and without the considered track. The $B_c^+$ candidate is required to have a mass within 0.5 GeV/$c^2$ of the world average value [16] and a vertex fit $\chi^2_{\mathrm{vtx}}/\text{ndf} < 16$. A boosted decision tree (BDT) [17], trained on data and simulation, is used to perform further background suppression. The $pp$ collisions are simulated using Pythia 6.4 [18] with a specific LHCb configuration [19]. The $B_c^+$ mesons are generated through the dominant hard subprocess $gg \to B_c^+ + b + \bar{c}$ with the dedicated generator Bcvegpy [20,21]. Decays of hadronic particles are described by EvtGen [22], in which final state radiation is generated using Photos [23]. The interaction of the generated particles with the detector and its response are implemented using Geant4 [24] as described in Ref. [25].
The choice of the variables used to train the BDT is based on two considerations: their power to separate signal and background, and the similarity of their distributions for the $B_c^+ \to J/\psi\pi^+$ and $B_c^+ \to \psi(2S)\pi^+$ candidates, which causes the systematic uncertainties in the selections to cancel when the ratio of branching fractions is determined. The BDT input variables are: the $\pi^+$ IP $\chi^2_{\mathrm{IP}}$; the $B_c^+$ vertex fit $\chi^2_{\mathrm{vtx}}/\text{ndf}$; the $B_c^+$ IP $\chi^2_{\mathrm{IP}}$; the $\chi^2$ of the distance between the $B_c^+$ vertex and the associated primary vertex; the $p_{\mathrm{T}}$ of the $B_c^+$ candidate; and the $\chi^2$ from a refit of the $B_c^+$ decay vertex [26] using a $J/\psi$ or $\psi(2S)$ mass constraint and a constraint that the $B_c^+$ candidate points to the primary vertex. The BDT is trained using a $B_c^+ \to J/\psi\pi^+$ simulation sample for the signal and sidebands from the $B_c^+ \to J/\psi\pi^+$ mass spectrum ($6164 < M_{J/\psi\pi} < 6206$ MeV/$c^2$ or $6346 < M_{J/\psi\pi} < 6388$ MeV/$c^2$) for the background. The trained BDT is then applied to the data, and a signal estimator is calculated for each candidate; a large value indicates a signal-like candidate. The cut on the estimator is optimised to maximise the $B_c^+ \to \psi(2S)\pi^+$ signal significance. The BDT selection efficiencies, estimated from simulation, for $B_c^+ \to \psi(2S)\pi^+$ and $B_c^+ \to J/\psi\pi^+$ candidates are 35.8% and 37.2% respectively, and the fraction of accepted background is $4.8 \times 10^{-4}$ as estimated from the sideband data.
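The optimisation of the cut on the BDT estimator can be illustrated with a minimal sketch. The text does not state which figure of merit is used, so $S/\sqrt{S+B}$ is an assumption here, and the function name `best_cut`, the toy scores and the expected yields are all hypothetical values chosen only to make the example run.

```python
import math


def best_cut(signal_scores, background_scores, n_sig, n_bkg, cuts):
    """Scan candidate thresholds and return (cut, figure of merit).

    Assumption: the figure of merit is S / sqrt(S + B), where S and B are
    the expected signal and background yields after requiring score > cut.
    The text only says the signal significance is maximised.
    """
    best_value, best_fom = None, 0.0
    for cut in cuts:
        # Efficiencies are the fractions of each sample passing the cut.
        eff_s = sum(s > cut for s in signal_scores) / len(signal_scores)
        eff_b = sum(b > cut for b in background_scores) / len(background_scores)
        s = n_sig * eff_s
        b = n_bkg * eff_b
        if s + b == 0:
            continue  # nothing survives this cut
        fom = s / math.sqrt(s + b)
        if fom > best_fom:
            best_value, best_fom = cut, fom
    return best_value, best_fom


# Toy inputs (hypothetical, not taken from the analysis).
toy_signal = [0.6, 0.7, 0.8, 0.9]
toy_background = [0.1, 0.2, 0.3, 0.65]
cut, fom = best_cut(toy_signal, toy_background, n_sig=20.0, n_bkg=1000.0,
                    cuts=[0.0, 0.5, 0.75])
print(cut, fom)
```

In this toy setup the tightest threshold wins because it removes all background while keeping half of the signal; with real score distributions the optimum depends on the assumed yields before the cut.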
After the BDT selection, it is further required that the unconstrained dimuon invariant mass is in the range $3030 < M_{\mu\mu} < 3170$ MeV/$c^2$ for $J/\psi$ and $3620 < M_{\mu\mu} < 3760$ MeV/$c^2$ for $\psi(2S)$. Information on particle identification for pions and kaons is also used to suppress the reflection background due to $B_c^+ \to J/\psi K^+$ decays. Figure 1 shows the invariant mass distributions of the $B_c^+ \to J/\psi\pi^+$ and $B_c^+ \to \psi(2S)\pi^+$ candidates. The relative branching fraction is calculated using \begin{equation*} \frac{\mathcal{B}(B_c^+ \to \psi(2S)\pi^+)}{\mathcal{B}(B_c^+ \to J/\psi\pi^+)} = \frac{N(B_c^+ \to \psi(2S)\pi^+)}{N(B_c^+ \to J/\psi\pi^+)} \times \frac{\varepsilon(B_c^+ \to J/\psi\pi^+)}{\varepsilon(B_c^+ \to \psi(2S)\pi^+)} \times \frac{\mathcal{B}(J/\psi \to \mu^+\mu^-)}{\mathcal{B}(\psi(2S) \to \mu^+\mu^-)}, \end{equation*} where $N$ is the number of selected signal events and $\varepsilon$ is the total efficiency.
The signal yields are obtained by performing an extended maximum likelihood fit to the $B_c^+$ mass spectra in Fig. 1. The signal is modelled with a double-sided Crystal Ball function [27] with the tail parameters on both sides determined from simulation. The main background component for both channels is combinatorial and is modelled using an exponential function. At the lower end of the mass spectrum, the contribution from the partially reconstructed background is modelled by an ARGUS function [28] convolved with a Gaussian distribution. For the $B_c^+ \to J/\psi\pi^+$ decay, the Cabibbo-suppressed channel $B_c^+ \to J/\psi K^+$ also contributes, and is fitted with a double-sided Crystal Ball function with all parameters fixed to values obtained from simulation. The observed signal yields are $595 \pm 29$ for $B_c^+ \to J/\psi\pi^+$ and $20 \pm 5$ for $B_c^+ \to \psi(2S)\pi^+$. The ratio of yields, $N(B_c^+ \to \psi(2S)\pi^+)/N(B_c^+ \to J/\psi\pi^+)$, follows from these values.
The total efficiency is the product of the detector acceptance and the trigger, reconstruction and selection efficiencies. Each contribution has been determined using simulated events for the two channels, and the ratio of the total efficiencies has been evaluated with an uncertainty that is due to the limited size of the simulated sample. Several sources of systematic uncertainty have been considered.
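As a rough cross-check of the yield ratio that enters the branching-fraction calculation, the quoted yields can be divided directly. The sketch assumes the two yield uncertainties are uncorrelated and uses the rounded yields from the text, so it need not reproduce the value obtained from the fit itself; the helper name `ratio_with_uncertainty` is hypothetical.

```python
import math


def ratio_with_uncertainty(num, num_err, den, den_err):
    """Return num/den and its uncertainty.

    Assumption: the two inputs are uncorrelated, so their relative
    uncertainties are added in quadrature.
    """
    ratio = num / den
    relative = math.hypot(num_err / num, den_err / den)
    return ratio, ratio * relative


# Yields quoted in the text: 20 +/- 5 (psi(2S) pi) and 595 +/- 29 (J/psi pi).
ratio, err = ratio_with_uncertainty(20.0, 5.0, 595.0, 29.0)
print(f"N(psi(2S) pi) / N(J/psi pi) = {ratio:.4f} +/- {err:.4f}")
```

The relative uncertainty is dominated by the 25% uncertainty on the $B_c^+ \to \psi(2S)\pi^+$ yield, which is consistent with the statistical uncertainty being the largest term in the final result.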
The measured ratio of signal yields is expected to be independent of the BDT selection, given that the distributions of training variables are very similar for the two channels. The ratio of signal yields is measured for different cuts on the BDT response, and is constant within the statistical uncertainties. The average of these ratios differs from the nominal value by 4.5%, which is taken as the systematic uncertainty due to the BDT selection. The $B_c^+ \to \psi(2S)\pi^+$ signal is fitted with a double-sided Crystal Ball function. Alternatively we determine the signal shape directly from the simulation using kernel estimation [29], and convolve it with a Gaussian function to take into account the detector resolution while allowing the mean of the mass to vary. This results in a 1.7% difference with respect to the nominal ratio, which is taken as the uncertainty due to the signal shape.
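For readers unfamiliar with the nominal signal model, the following is a minimal sketch of an unnormalised double-sided Crystal Ball shape: a Gaussian core joined continuously to a power-law tail on each side. This is one common parametrisation and not necessarily the exact form of Ref. [27]; the function name and all parameter values are illustrative assumptions, not the fitted values of this analysis.

```python
import math


def double_sided_crystal_ball(x, mean, sigma, alpha_lo, n_lo, alpha_hi, n_hi):
    """Unnormalised double-sided Crystal Ball shape (peak value 1 at x = mean).

    alpha_lo and alpha_hi (both > 0) give the transition points in units of
    sigma below and above the mean; n_lo and n_hi are the tail powers.
    """
    t = (x - mean) / sigma
    if t < -alpha_lo:
        # Low-mass power-law tail, matched to the Gaussian at t = -alpha_lo.
        a = (n_lo / alpha_lo) ** n_lo * math.exp(-0.5 * alpha_lo ** 2)
        b = n_lo / alpha_lo - alpha_lo
        return a * (b - t) ** (-n_lo)
    if t > alpha_hi:
        # High-mass power-law tail, matched to the Gaussian at t = +alpha_hi.
        a = (n_hi / alpha_hi) ** n_hi * math.exp(-0.5 * alpha_hi ** 2)
        b = n_hi / alpha_hi - alpha_hi
        return a * (b + t) ** (-n_hi)
    return math.exp(-0.5 * t * t)  # Gaussian core


# Illustrative parameters only (mass in MeV/c^2); not taken from the fit.
params = dict(mean=6275.0, sigma=13.0, alpha_lo=1.5, n_lo=3.0,
              alpha_hi=2.0, n_hi=4.0)
for mass in (6200.0, 6275.0, 6350.0):
    print(mass, double_sided_crystal_ball(mass, **params))
```

In a fit such as the one described, the tail parameters would be fixed from simulation while the mean and width may float; a normalisation over the fit range would also be needed to use this shape as a probability density.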
To consider the contribution from partially reconstructed background, the background is fitted with an exponential function within a narrower range ($6164 < M_{\psi\pi} < 6500$ MeV/$c^2$). This results in a 2.9% change with respect to the nominal fit, and is assigned as a systematic uncertainty.
The statistical uncertainty on the simulation when estimating the ratio of efficiencies leads to an uncertainty of 0.9% on the ratio of branching fractions. The difference between data and simulation introduces a systematic uncertainty, especially from variables used as input for the BDT. The distributions of these variables in simulation and data are compared, after the background is subtracted from the data using the sPlot technique [30]. The difference is found to be negligible compared to the statistical fluctuation.
A summary of systematic uncertainties is given in Table 1. The total systematic uncertainty is 5.7%, with the most significant contribution coming from the BDT selection. Taking the systematic uncertainty into account and using the likelihood ratio test $-2\log(\mathcal{L}_B/\mathcal{L}_{S+B})$ [31], the significance of the $B_c^+ \to \psi(2S)\pi^+$ decay is estimated to be $5.2\,\sigma$, where $\mathcal{L}_B$ and $\mathcal{L}_{S+B}$ represent the likelihoods of the background-only hypothesis and the signal-plus-background hypothesis respectively.
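The quoted total of 5.7% can be reproduced by adding the individual relative uncertainties given in the text in quadrature, which assumes the sources are independent. The dictionary labels below are paraphrased from the text rather than copied from Table 1.

```python
import math

# Relative systematic uncertainties (in percent) quoted in the text.
# Labels are paraphrased; treating the sources as independent is an assumption.
systematics_percent = {
    "BDT selection": 4.5,
    "signal shape": 1.7,
    "background fit range": 2.9,
    "simulation sample size": 0.9,
}


def quadrature_sum(values):
    """Combine independent uncertainties in quadrature."""
    return math.sqrt(sum(v * v for v in values))


total_percent = quadrature_sum(systematics_percent.values())
# Absolute systematic uncertainty on the measured ratio of 0.250.
absolute = 0.250 * total_percent / 100.0
print(f"total = {total_percent:.1f}%, absolute = {absolute:.3f}")
```

The result rounds to 5.7%, and applying it to the central value of 0.250 gives the 0.014 systematic uncertainty quoted for the ratio of branching fractions.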