Search for the rare decays $B^{0}\to J/\psi \gamma$ and $B^{0}_{s} \to J/\psi \gamma$

A search for the rare decay of a $B^{0}$ or $B^{0}_{s}$ meson into the final state $J/\psi\gamma$ is performed, using data collected by the LHCb experiment in $pp$ collisions at $\sqrt{s}=7$ and $8$ TeV, corresponding to an integrated luminosity of 3 fb$^{-1}$. The observed number of signal candidates is consistent with a background-only hypothesis. Branching fraction values larger than $1.7\times 10^{-6}$ for the $B^{0}\to J/\psi\gamma$ decay mode are excluded at 90% confidence level. For the $B^{0}_{s}\to J/\psi\gamma$ decay mode, branching fraction values larger than $7.4\times 10^{-6}$ are excluded at 90% confidence level, this is the first branching fraction limit for this decay.


Introduction
Decays of B mesons provide an interesting laboratory to study quantum chromodynamics (QCD). A typical approach for predicting the branching fractions of such decays is to factorize the decay into a short-distance contribution which can be computed with perturbative QCD and a long-distance contribution for which nonperturbative QCD is required. The extent to which this factorization assumption is valid leads to large theoretical uncertainties. Experimental measurements are therefore crucial to test the different calculations of the QCD interactions within these decays, so helping to identify the most appropriate theoretical approaches for predicting observables.
In the SM, the decays B 0 (s) → J/ψ γ proceed through a W boson exchange diagram as shown in Fig. 1, where one quark radiates a photon (the inclusion of charge conjugate processes is implied throughout). Theoretical predictions of the branching fractions of these decays vary significantly depending on the chosen approach for the treatment of QCD interactions in the decay dynamics. For example, in Ref. [1] the branching fraction, evaluated in the framework of QCD factorization [2], is expected to be ∼ 2 × 10 −7 , whereas the calculation in Ref. [3], using perturbative QCD, predicts a branching fraction of 5 × 10 −6 . The process is also sensitive to physics beyond the SM, for example righthanded currents [1]. The decay B 0 → J/ψ γ has been previously searched for by the BABAR collaboration, and a limit on the branching fraction of 1.6 × 10 −6 was set at 90% confidence level (C.L.) [4].
This paper describes a search for the decays B 0 s → J/ψ γ and B 0 → J/ψ γ, performed with proton-proton (pp) collision data collected by the LHCb experiment corresponding to an integrated luminosity of 1.0 (2.0) fb −1 recorded at center-of-mass energies of Event selections are described in Sec. 3. The signal yield is normalized to a set of B → J/ψ γX decays, described in Sec. 4. The relative efficiency between signal and normalization decay modes is calculated using simulated events. This efficiency is crosschecked using the decay B 0 → K * 0 γ. Finally, in Sec. 6, the upper limits on the branching fractions are calculated using the CL S method [5,6].

Detector and simulation
The LHCb detector [7,8] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of siliconstrip detectors and straw drift tubes placed downstream of the magnet. The combined tracking system provides a measurement of momentum, p, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/c. The minimum distance of a track to a primary vertex, the impact parameter (IP), is measured with a resolution of (15 + 29/p T ) µm, where p T is the component of the momentum transverse to the beam, in GeV/c. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic calorimeter (ECAL) and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers. The online event selection is performed by a trigger system [9], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction.
In the simulation, pp collisions are generated using Pythia 6 and Pythia 8 [10] with a specific LHCb configuration [11]. Decays of hadronic particles are described by EvtGen [12], in which final-state radiation is generated using Photos [13]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [14] as described in Ref. [15].

Selection requirements
Candidate events are first required to pass the hardware trigger which requires at least one muon with p T > 1.48 (1.76) GeV/c in the 7 (8) TeV data. In the subsequent software trigger, at least one of the final-state particles is required to have p T > 0.8 GeV/c and IP larger than 100 µm with respect to any of the primary pp interaction vertices (PVs) in the event. Finally, the tracks of two final-state particles are required to form a vertex that is significantly displaced from the PVs.
In the offline selection of signal candidates, J/ψ decays are reconstructed from oppositely charged muon pairs where both muons have p T > 550 MeV/c, good track fit qualities and an IP with respect to any PV significantly different from zero. The muon pair is required to form a good quality decay vertex. In order to suppress background from decays such as B → J/ψ π 0 , where both photons from the π 0 decay are reconstructed as a single cluster in the ECAL, only photons which convert into e + e − pairs are used in the analysis. This reduces the signal efficiency by about a factor of 30 with respect to photons which do not convert, but improves the signal resolution in the reconstructed invariant J/ψ γ mass by a factor of 5. Furthermore, the direction of the photon momentum vector can be checked for consistency with the B decay vertex and used to reject combinatorial background. Photons are required to convert in the material before the second tracking system, which corresponds to about 0.25 radiation lengths [7]. They are reconstructed following a similar strategy to that described in Ref. [16], by combining electron and positron track pairs, which can be associated with electromagnetic clusters in the ECAL and are significantly displaced from the reconstructed B decay vertex. The energy loss of electrons by emission of bremsstrahlung photons is recovered by adding the energies of reconstructed photons associated with the track. The photon candidates are required to have a reconstructed invariant mass less than 100 MeV/c 2 and p T > 1 GeV/c. Candidates are separated into two categories based on where the photon converts in the detector. Conversions which occur early enough for the converted electrons to be reconstructed in the vertex detector, are referred to as long because the tracks pass through the full tracking system, while those which convert late enough such that track segments of the electrons cannot be formed in the vertex detector are referred to as downstream [17]. The J/ψ and γ candidates are combined to form a B candidate, which is required to have an invariant mass in the range 4500 < m(J/ψ γ) < 7000 MeV/c 2 and p T > 5 GeV/c. The momentum vector of the B candidate is required to be aligned with the vector between the associated PV and the decay vertex, in order to suppress combinational background.
A boosted decision tree (BDT) [18,19] is trained to reject combinatorial background, where the J/ψ and photon candidates originate from different decays. The signal is represented in the BDT training with simulated B 0 s → J/ψ γ decays, while selected data events in the high mass sideband, 5500 < m(J/ψ γ) < 6500 MeV/c 2 , are used to represent the background. The input variables used in the training are mostly kinematic and geometric variables, as well as isolation criteria used to reject background containing additional tracks in close proximity to the J/ψ vertex. Separate trainings are performed for events in which the photon conversion is long or downstream. The k-fold crossvalidation method [20], with k = 5, is used to increase the training statistics while avoiding overtraining. The requirement on the BDT response is optimized by maximizing the metric N S / √ N S + N B , where N S is the estimated number of signal events after selection assuming a branching fraction B(B 0 s → J/ψ γ) = 5 × 10 −6 , and N B is the estimated number of background events in the signal region, 5250 < m(J/ψ γ) < 5400, extrapolated from an exponential fit to the data in the high mass sideband. This requirement is 60% efficient for simulated signal candidates and rejects 98% of the combinatorial background.
The decay B 0 → K * 0 γ, where K * 0 → K + π − , is used to validate the selection and to assess systematic uncertainties arising from differences between simulation and data. The same BDT used for the signal selection, with the J/ψ and muon properties replaced by those of the K * 0 and its decay products, is applied to the B 0 → K * 0 γ candidates.

Branching fraction
The branching fraction is determined by performing a fit to the J/ψ γ invariant mass distribution in the range 4500 < m(J/ψ γ) < 7000 MeV/c 2 . In the fit, the signal yield is normalized to the following set of decay modes: B 0 (s) → J/ψ η(η → γγ), B 0 →J/ψ π 0 , B 0 → J/ψ K 0 S (K 0 S → π 0 π 0 ) and B + → J/ψ ρ + (ρ + → π 0 π + ), where only the J/ψ meson and one photon are reconstructed. These decay modes are chosen because they have relatively well measured branching fractions and are expected to contribute in the selected mass range. The normalization procedure is performed by expressing the branching fraction, B, as where i represents a normalization decay mode, N sig and N i norm are the observed number of signal and normalization candidates, f is the relevant production fraction and is the efficiency as determined from the simulation. Systematic uncertainties associated with these quantities are included in the fit as nuisance parameters.
In general, the normalization modes have a lower offline selection efficiency than the signal because the photon has a lower momentum and therefore the electron tracks are more likely to be bent outside the detector acceptance by the magnetic field. For example, the selection efficiency for signal is around 60% whereas that of B 0 → J/ψ π 0 is only 30%.
The dimuon mass is constrained to the known value of the J/ψ meson [21], which improves the m(J/ψ γ) resolution by ∼30%. The B 0 s → J/ψ γ signal shape is obtained by fitting a Gaussian function with a power-law tail to simulation. The m(J/ψ γ) resolution is approximately 90 (70) MeV/c 2 for long (downstream) decays. The search for the B 0 → J/ψ γ signal is performed separately to the B 0 s → J/ψ γ decay, where the same signal shape is used with the peak position adjusted by the difference in masses of the B 0 and B 0 s mesons given in Ref. [21]. The B 0 → J/ψ γ branching fraction is assumed to be zero when fitting for the B 0 s → J/ψ γ signal, and vice versa. The normalization modes form a broad shoulder below the signal peak and their shapes are modeled using dedicated simulation samples. The total normalization yield is allowed to float in the fit, with the contribution from each individual normalization decay mode constrained, taking into account the relative efficiencies and branching fractions between them. For the B 0 s modes, the ratio of fragmentation fractions, f s /f d is used to calculate the relative expected yields of B 0 and B 0 s meson decays. The fragmentation fractions for B + and B 0 decays are assumed to be the same.
Other B → J/ψ γX decays which are missing either a heavy particle or several particles are modeled by an exponential function with the shape obtained from simulated  B + → J/ψ K * + events. The choice of parameterization for these backgrounds is checked using simulation samples and no bias is observed for the signal yield. Finally, combinatorial background is modeled by an exponential function, the slope of which is allowed to float in the fit. The result of the fit, to the combined long and downstream samples, allowing for a B 0 s → J/ψ γ contribution, is shown in Fig. 2, where no significant signal is observed. The result is similar for the B 0 → J/ψ γ case.

Systematic uncertainties
Many systematic uncertainties cancel to a large extent as both signal and normalization modes contain the same reconstructed final-state particles. In particular, systematic uncertainties related to the ratio of efficiencies for the trigger and particle identification requirements are negligible. However, differences can arise as the photon is typically softer for the normalization modes than for the signal. These effects are accounted for using the B 0 → K * 0 γ decay as a control channel. All systematics uncertainties are included in the final likelihood fit as nuisance parameters. Their impact on the branching fraction measurement is summarized in Table. 1.

Source
Uncertainty (%) Normalisation branching fractions ±17 The largest systematic uncertainty comes from the knowledge of the branching fractions of the normalization modes taken from Ref. [21] which have uncertainties of 4%-21%, depending on the decay mode involved. The considerably smaller uncertainty from the J/ψ → µ + µ − branching fraction is neglected. An additional uncertainty originates from the measured value of the ratio of fragmentation fractions, f s /f d = 0.259 ± 0.015, taken from Refs. [22][23][24].
As the difference in the lifetimes between the mass eigenstates of the B 0 s meson, ∆Γ s , is significant, the signal efficiency depends on the admixture of the CP content of the final state [25]. As this is unknown for B 0 s → J/ψ γ, two extreme scenarios are compared, where the decay is either purely CP odd or purely CP even. The lifetimes for the CP eigenstates are taken from Ref. [26] to be 1.379 ± 0.031 (1.656 ± 0.033) ps for the CP -even (-odd) final states. The corresponding difference in efficiency is +8 −4 % compared to the average B 0 s lifetime, and is added as a systematic uncertainty. The shape of the signal is obtained from simulation. Potential mismodeling of this shape is assessed by comparing the signal peak position and signal width of B 0 → K * 0 γ decays in data and simulation. The K * 0 γ invariant mass distributions are fitted separately for long and downstream candidates using the simulation to model the signal shape and using an exponential function to model the combinatorial background. There is no significant difference in the peak position, while the signal resolution in data is (28 ± 14)% and (40 ± 13)% wider with respect to simulation for the long and downstream categories. These factors are used to correct the signal width and are constrained in the fit.
Simulation is relied upon to model any residual kinematic differences between the signal and normalization channels. The ability of the simulation to accurately emulate these differences in reconstruction is assessed by comparing simulation and data for the B 0 → K * 0 γ decay. Any differences are used to recompute the relative signal and normalization efficiency and then assigned as systematic uncertainties. The sPlot technique [27] is used to compare the data and simulation for the transverse momentum of the photon and the cosine of the pointing angle, defined as the angle between the momentum vector of the B 0 candidate and its flight direction. The effect on the relative efficiency of reweighting the simulation to match the data is 4% for long candidates and 2% for downstream candidates and these values are applied as systematic uncertainties.

Results
The CL S method [5,6] is used to determine upper limits on the B 0 s → J/ψ γ and B 0 → J/ψ γ branching fractions. The test statistic used is that described in Eq. 16 of Ref. [28]. For a given hypothesis of the branching fraction, B, q B is defined as the ratio of likelihoods given the hypothesis value and the best-fit value, whereB is the best-fit branching fraction andθ B are the best-fit values of the nuisance parameters given the hypothesis value B. Pseudoexperiments are generated in order to determine the observed and expected exclusion confidence level of the branching fraction value. The exclusion confidence level, CL S , is calculated as the ratio of the fraction of signal and background pseudoexperiments to the fraction of background-only pseudoexperiments, which have a test statistic value larger than that found in data.
The observed and expected CL S exclusions are shown as functions of the hypothesis branching fraction for B 0 s → J/ψ γ and B 0 → J/ψ γ decays in Fig. 3. The branching fraction upper limits are determined to be

Conclusion
A search for the decays B 0 s → J/ψ γ and B 0 → J/ψ γ has been performed with data collected by the LHCb experiment corresponding to an integrated luminosity of 3 fb −1 . These decay modes predominantly occur via a W boson exchange diagram and are sensitive to extensions of the SM. No significant signal is observed and an upper limit on the branching fraction is set at 7.3×10 −6 at 90% C.L. for the B 0 s → J/ψ γ decay mode and 1.5×10 −6 at 90% C.L. for the B 0 → J/ψ γ decay mode. The B 0 → J/ψ γ branching fraction limit is competitive with, and in agreement with, the previous measurement from BABAR [4]. This is the first limit on the decay B 0 s → J/ψ γ and is close to the sensitivity (5 × 10 −6 ) of the calculation of the branching fraction based on perturbative QCD [3].