Observation of a resonance in $B^+ \to K^+ \mu^+\mu^-$ decays at low recoil

A broad peaking structure is observed in the dimuon spectrum of $B^+ \to K^+ \mu^+\mu^-$ decays in the kinematic region where the kaon has a low recoil against the dimuon system. The structure is consistent with interference between the $B^+ \to K^+ \mu^+\mu^-$ decay and a resonance and has a statistical significance exceeding six standard deviations. The mean and width of the resonance are measured to be $4191^{+9}_{-8}\mathrm{\,Me\kern -0.1em V/}c^2$ and $65^{+22}_{-16}\mathrm{\,Me\kern -0.1em V/}c^2$, respectively, where the uncertainties include statistical and systematic contributions. These measurements are compatible with the properties of the $\psi(4160)$ meson. First observations of both the decay $B^+ \to \psi(4160) K^+$ and the subsequent decay $\psi(4160) \to \mu^+\mu^-$ are reported. The resonant decay and the interference contribution make up 20\,% of the yield in the low recoil region, which is larger than theoretical estimates.

The decay of the B + meson to the final state K + µ + µ − receives contributions from tree level decays and decays mediated through virtual quantum loop processes. The tree level decays proceed through the decay of a B + meson to a vector cc resonance and a K + meson, followed by the decay of the resonance to a pair of muons. Decays mediated by flavour changing neutral current (FCNC) loop processes give rise to pairs of muons with a non-resonant mass distribution. To probe contributions to the FCNC decay from physics beyond the Standard Model (SM), it is essential that the tree level decays are properly accounted for. In all analyses of the B + → K + µ + µ − decay, from discovery [1] to the latest most accurate measurement [2], this has been done by placing a veto on the regions of dimuon mass, m µ + µ − , dominated by the J/ψ and ψ(2S) resonances. In the low recoil region, corresponding to a dimuon mass above the open charm threshold, theoretical predictions of the decay rate can be obtained with an operator product expansion (OPE) [3] in which the cc contribution and other hadronic effects are treated as effective interactions.
Nearly all available information about the J P C = 1 −− charmonium resonances above the open charm threshold, where the resonances are wide as decays to D ( * ) D ( * ) are allowed, comes from measurements of the crosssection ratio of e + e − → hadrons relative to e + e − → µ + µ − . Among these analyses, only that of the BES collaboration in Ref. [4] takes interference and strong phase differences between the different resonances into account. The broad and overlapping nature of these resonances means that they cannot be excluded by vetoes on the dimuon mass in an efficient way, and a more sophisticated treatment is required.
This Letter describes a measurement of a broad peaking structure in the low recoil re-gion of the B + → K + µ + µ − decay, based on data corresponding to an integrated luminosity of 3 fb −1 taken with the LHCb detector at a centre-of-mass energy of 7 TeV in 2011 and 8 TeV in 2012. Fits to the dimuon mass spectrum are performed, where one or several resonances are allowed to interfere with the nonresonant B + → K + µ + µ − signal, and their parameters determined. The inclusion of charge conjugated processes is implied throughout this Letter.
The LHCb detector [5] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region, a largearea silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream. The combined tracking system provides a momentum measurement with relative uncertainty that varies from 0.4 % at 5 GeV/c to 0.6 % at 100 GeV/c, and impact parameter resolution of 20 µm for tracks with high transverse momentum. Charged hadrons are identified using two ring-imaging Cherenkov detectors. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers. Simulated events used in this analysis are produced using the software described in Refs. [6][7][8][9][10][11].
Candidates are required to pass a two stage trigger system [12]. In the initial hardware stage, candidate events are selected with at least one muon with transverse momentum, p T > 1.48 (1.76) GeV/c in 2011 (2012). In the subsequent software stage, at least one of the final state particles is required to have both p T > 1.0 GeV/c and impact parameter larger than 100 µm with respect to all of the primary pp interaction vertices (PVs) in the event. Finally, a multivariate algorithm [13] is used for the identification of secondary vertices consistent with the decay of a b hadron with muons in the final state.
The selection of the K + µ + µ − final state is made in two steps. Candidates are required to pass an initial selection, which reduces the data sample to a manageable level, followed by a multivariate selection. The dominant background is of a combinatorial nature, where two correctly identified muons from different heavy flavour hadron decays are combined with a kaon from either of those decays. This category of background has no peaking structure in either the dimuon mass or the K + µ + µ − mass. The signal region is defined as 5240 < m K + µ + µ − < 5320 MeV/c 2 and the sideband region as 5350 < m K + µ + µ − < 5500 MeV/c 2 . The sideband below the B + mass is not used as it contains backgrounds from partially reconstructed decays, which do not contaminate the signal region.
The initial selection requires: χ 2 IP > 9 for all final state particles, where χ 2 IP is defined as the minimum change in χ 2 when the particle is included in a vertex fit to any of the PVs in the event; that the muons are positively identified in the muon system; and that the dimuon vertex has a vertex fit χ 2 < 9. In addition, based on the lowest χ 2 IP of the B + candidate, an associated PV is chosen. For this PV it is required that: the B + candidate has χ 2 IP < 16; the vertex fit χ 2 must increase by more than 121 when including the B + candidate daughters; and the angle between the B + candidate momentum and the direction from the PV to the decay vertex should be below 14 mrad. Finally, the B + candidate is required to have a vertex fit χ 2 < 24 (with three degrees of freedom).
The multivariate selection is based on a boosted decision tree (BDT) [14] with the Ada-Boost algorithm [15] to separate signal from background. It is trained with a signal sample from simulation and a background sample consisting of 10 % of the data from the sideband region. The multivariate selection uses geometric and kinematic variables, where the most discriminating variables are the χ 2 IP of the final state particles and the vertex quality of the B + candidate. The selection with the BDT has an efficiency of 90 % on signal surviving the initial selection while retaining 6 % of the background. The overall efficiency for the reconstruction, trigger and selection, normalised to the total number of B + → K + µ + µ − decays produced at the LHCb interaction point, is 2 %. As the branching fraction measurements are normalised to the B + → J/ψ K + decay, only relative efficiencies are used. The yields in the K + µ + µ − final state from B + → J/ψ K + and B + → ψ(2S)K + decays are 9.6 × 10 5 and 8 × 10 4 events, respectively.
In addition to the combinatorial background, there are several small sources of potential background that form a peak in either or both of the m K + µ + µ − and m µ + µ − distributions. The largest of these backgrounds are the decays B + → J/ψ K + and B + → ψ(2S)K + , where the kaon and one of the muons have been interchanged. The decays B + → K + π − π + and B + → D 0 π + followed by D 0 → K + π − , with the two pions identified as muons are also considered. To reduce these backgrounds to a negligible level, tight particle identification criteria and vetoes on µ − K + combinations compatible with J/ψ , ψ(2S), or D 0 meson decays are applied. These vetoes are 99% efficient on signal.
A kinematic fit [16] is performed for all selected candidates. In the fit the K + µ + µ − mass is constrained to the nominal B + mass and the candidate is required to originate from its associated PV. For B + → ψ(2S)K + decays, this improves the resolution in m µ + µ − from 15 MeV/c 2 to 5 MeV/c 2 . Given the widths of the resonances that are subsequently analysed, resolution effects are neglected. While the ψ(2S) state is narrow, the large branching fraction means that its non-Gaussian tail is significant and hard to model. The ψ(2S) contamination is reduced to a negligible level by requiring m µ + µ − > 3770 MeV/c 2 . This dimuon mass range is defined as the low recoil region used in this analysis.
In order to estimate the amount of background present in the m µ + µ − spectrum, an unbinned extended maximum likelihood fit is performed to the K + µ + µ − mass distribution without the B + mass constraint. The signal shape is taken from a mass fit to the B + → ψ(2S)K + mode in data with the shape parameterised as the sum of two Crystal Ball functions [17], with common tail parameters, but different widths. The Gaussian width of the two components is increased by 5 % for the fit to the low recoil region as determined from simulation. The low recoil region contains 1830 candidates in the signal mass window, with a signal to background ratio of 7.8.
The dimuon mass distribution in the low recoil region is shown in Fig. 1. Two peaks are visible, one at the low edge corresponding to the expected decay ψ(3770) → µ + µ − and a wide peak at a higher mass. In all fits, a vector resonance component corresponding to this decay is included. Several fits are made to the distribution. The first introduces a vector resonance with unknown parameters. Subsequent fits look at the compatibility of the data with the hypothesis that the peaking structure is due to known resonances.
The non-resonant part of the mass fits contains a vector and axial vector component. Of these, only the vector component will interfere with the resonance. The probability density function (PDF) of the signal component is given as where A V nr and A AV nr are the vector and axial vector amplitudes of the non-resonant decay. The shape of the non-resonant signal in m µ + µ − is driven by phase space, P (m µ + µ − ), and the form factor, f (m 2 µ + µ − ). The parametrisation of Ref. [18] is used to describe the dimuon mass dependence of the form factor. This form factor parametrisation is consistent with recent lattice calculations [19]. In the SM at low recoil, the ratio of the vector and axial vector contributions to the non-resonant component is expected to have negligible dependence on the dimuon mass. The vector component accounts for (45 ± 6) % of the differential branching fraction in the SM (see, for example, Ref. [20]). This estimate of the vector component is assumed in the fit.
The total vector amplitude is formed by sum-ming the vector amplitude of the non-resonant signal with a number of Breit-Wigner amplitudes, A k r , which depend on m µ + µ − . Each Breit-Wigner amplitude is rotated by a phase, δ k , which represents the strong phase difference between the non-resonant vector component and the resonance with index k. Such phase differences are expected [18]. The ψ(3770) resonance, visible at the lower edge of the dimuon mass distribution, is included in the fit as a Breit-Wigner component whose mass and width are constrained to the world average values [21].
The background PDF for the dimuon mass distribution is taken from a fit to data in the K + µ + µ − sideband. The uncertainties on the background amount and shape are included as Gaussian constraints to the fit in the signal region.
The signal PDF is multiplied by the relative efficiency as a function of dimuon mass with respect to the B + → J/ψ K + decay. As in previous analyses of the same final state [22], this efficiency is determined from simulation after the simulation is made to match data by: degrading by ∼ 20 % the impact parameter resolution of the tracks, reweighting events to match the kinematic properties of the B + candidates and the track multiplicity of the event, and adjusting the particle identification variables based on calibration samples from data. In the region from the J/ψ mass to 4600 MeV/c 2 the relative efficiency drops by around 20 %. From there to the kinematic endpoint it drops sharply, predominantly due to the χ 2 IP cut on the kaon as in this region its direction is aligned with the B + candidate and therefore also with the PV.
Initially, a fit with a single resonance in addition to the ψ(3770) and non-resonant terms is performed. This additional resonance has its phase, mean, and width left free. The parameters of the resonance returned by the fit are a mass of 4191 +9 −8 MeV/c 2 and a width of 65 +22 −16 MeV/c 2 . Branching fractions are determined by integrating the square of the Breit-Wigner amplitude returned by the fit, normalising to the B + → J/ψ K + yield, and multiplying with the product of branching fractions, B(B + → J/ψ K + ) × B(J/ψ → µ + µ − ) [21]. The product B(B + → XK + ) × B(X → µ + µ − ) for the additional resonance, X, is determined to be (3.9 +0.7 −0.6 ) × 10 −9 . The uncertainty on this product is calculated using the profile likelihood. The data are not sensitive to the vector fraction of the non-resonant component as the branching fraction of the resonance will vary to compensate. For example, if the vector fraction is lowered to 30%, the central value of the branching fraction increases to 4.6 × 10 −9 . This reflects the lower amount of interference allowed between the resonant and non-resonant components.
The significance of the resonance is obtained by simulating pseudo-experiments that include the non-resonant, ψ(3770) and background components. The log likelihood ratios between fits that include and exclude a resonant component for 6 × 10 5 such samples are compared to the difference observed in fits to the data. None of the samples have a higher ratio than observed in data and an extrapolation gives a significance of the signal above six standard deviations.
The properties of the resonance are compatible with the mass and width of the ψ(4160) resonance as measured in Ref. [4]. To test the hypothesis that ψ resonances well above the open charm threshold are observed, another fit including the ψ(4040) and ψ(4160) resonances is performed. The mass and width of the two are constrained to the measurements from Ref. [4]. The data have no sensitivity to a ψ(4415) contribution. The fit describes the data well and the parameters of the ψ(4160) meson are almost unchanged with respect to  Fig. 1 and Table 1 reports the fit parameters. The resulting profile likelihood ratio compared to the best fit as a function of branching fraction can be seen in Fig. 2. In the fit with the three ψ resonances, the ψ(4160) meson is visible with B(B + → ψ(4160)K + ) × B(ψ(4160) → µ + µ − ) = (3.5 +0. 9 −0.8 ) × 10 −9 but for the ψ(4040) meson, no significant signal is seen, and an upper limit is set. The limit B(B + → ψ(4040)K + )×B(ψ(4040) → µ + µ − ) < 1.3 (1.5) × 10 −9 at 90 (95) % confidence level is obtained by integrating the likelihood ratio compared to the best fit and assuming a flat prior for any positive branching fraction.
In Fig. 3 the likelihood scan of the fit with a single extra resonance is shown as a function of the mass and width of the resonance. The fit is compatible with the ψ(4160) resonance, while a hypothesis where the resonance corresponds to the decay Y (4260) → µ + µ − is disfavoured by more than four standard deviations.
Systematic uncertainties associated with the normalisation procedure are negligible as the decay B + → J/ψ K + has the same final state as the signal and similar kinematics. Uncertainties due to the resolution and mass scale are insignificant. The systematic uncertainty associ-ated to the form factor parameterisation in the fit model is taken from Ref. [20]. Finally, the uncertainty on the vector fraction of the nonresonant amplitude is obtained using the EOS tool described in Ref. [20] and is dominated by the uncertainty from short distance contributions. All systematic uncertainties are included in the fit as Gaussian constraints. From comparing the difference in the uncertainties on masses, widths and branching fractions for fits with and without these systematic constraints, it can be seen that the systematic uncertainties are about 20 % the size of the statistical uncertainties and thus contribute less than 2% to the total uncertainty.
In summary, a resonance has been observed in the dimuon spectrum of B + → K + µ + µ − decays with a significance of above six standard deviations. The resonance can be explained by the contribution of the ψ(4160), via the decays B + → ψ(4160)K + and ψ(4160) → µ + µ − . It constitutes first observations of both decays. The ψ(4160) is known to decay to electrons with a branching fraction of (6.9±4.0)×10 −6 [4]. Assuming lepton universality, the branching fraction of the decay B + → ψ(4160)K + is   At each point all other fit parameters are reoptimised. The three ellipses are (red-solid) the best fit and previous measurements of (grey-dashed) the ψ(4160) [4] and (black-dotted) the Y (4260) [21] states. measured to be (5.1 +1.3 −1.2 ± 3.0) × 10 −4 , where the second uncertainty corresponds to the uncertainty on the ψ(4160) → e + e − branching fraction. The corresponding limit for B + → ψ(4040)K + is calculated to be 1.3 (1.7) × 10 −4 at a 90 (95) % confidence level. The absence of the decay B + → ψ(4040)K + at a similar level is interesting, and suggests future studies of B + → K + µ + µ − decays based on larger datasets may reveal new insights into cc spectroscopy.
The contribution of the ψ(4160) resonance in the low recoil region, taking into account interference with the non-resonant B + → K + µ + µ − decay, is about 20 % of the total signal. This value is larger than theoretical estimates, where the cc contribution is ∼10% of the vector amplitude, with a small correction from quarkhadron duality violation [23]. Results presented in this Letter will play an important role in controlling charmonium effects in future inclusive and exclusive b → sµ + µ − measurements.